I appreciate that this interview is more like a conversation, with a focus on problem-solving. In the past, I've had interviews where the tone was passive-aggressive instead of constructive, so it's refreshing to have a more productive experience. It's great when interviews can be a valuable use of time, rather than a frustrating waste.
Even not knowing much about data science, this interview was very helpful in learning how it might be applied to real, known problems, and the mutual feedback at the end was very helpful to learn about the dynamics of interviews, too! Thank you guys for doing this.
I love the part where they pull open a Google Doc to clarify their points and note them down. Excellent problem-solving approach. Focusing on the problem, its scope, its limitations, what's expected behaviour, what isn't, etc. was always something that I rushed past. Jumping to the solution right away is something that I had to unlearn. Also, kudos to Kylie for being super composed. Another superpower, and not easy to do in live situations. Great video!
As someone who just finished their masters and is looking for a job in data science, this interview really boosted my confidence cuz I was able to respond to every question
Thanks for featuring!!
Thank you for this interview. For the spam issue, you can tag on the traditional features that Kylie Ying mentioned, and then use a multi-lingual embedding model to create vectors out of the post content. Use these features + embeddings to train your model.
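A rough sketch of what "features + embeddings" could look like in practice. Everything named here is an assumption for illustration: the account fields (`posts_per_day`, `account_age_days`) are hypothetical, and `toy_embed` is only a stand-in that hashes character trigrams so the example runs without extra libraries. In a real setup you would swap it for an actual multilingual embedding model.

```python
import hashlib
import math

EMBED_DIM = 16  # hypothetical size; real embedding models use far more dimensions


def toy_embed(text, dim=EMBED_DIM):
    """Stand-in for a multilingual embedding model (NOT a real one).

    It hashes character trigrams into a fixed-length, L2-normalised vector,
    which only mimics the interface: text in, float vector out.
    """
    vec = [0.0] * dim
    padded = f"  {text.lower()} "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        digest = hashlib.md5(trigram.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_feature_vector(post, account):
    """Concatenate hand-crafted features with the post embedding."""
    traditional = [
        float(account["posts_per_day"]),     # hypothetical field
        float(account["account_age_days"]),  # hypothetical field
        float(post.count("http")),           # crude link count
    ]
    return traditional + toy_embed(post)


example = build_feature_vector(
    "Buy now http://spam.example",
    {"posts_per_day": 40, "account_age_days": 2},
)
```

The resulting list (3 hand-crafted values followed by `EMBED_DIM` embedding values) is what you would pass to whatever classifier you train.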
Thank you for sharing this video. As someone who is transitioning into the Data Science field (Machine Learning/AI), I was very surprised that I was able to keep up during the interview. I wasn't lost at all. I don't have a technical background, but I've been studying Azure, Python, GitHub, R, SQL, etc. pretty hard for the past few months and doing some labs. I'm feeling pretty confident I can make the move.
Hello, I don't usually comment on YouTube, but this time I just wanna say thank you to the people involved in this video. This really helped me get through my technical interview, and I ended up getting the job I wanted. The introduction and close of the interview were almost a copy of what happened in my interview. Obviously, the technical part was different (in my case they just asked about the technical challenge I had to solve prior to the interview), but the way I approached the questions was very similar. Just THANKS!!
⭐ Contents ⭐
⌨ (0:00:00) Video overview & format
⌨ (0:02:13) Introductory behavioral questions
⌨ (0:07:46) Social media platform bot issue task overview
⌨ (0:15:26) What are some features we should investigate regarding the bot issue?
⌨ (0:25:02) Classification model implementation details (using feature vectors)
⌨ (0:41:38) What would a dataset to train models to detect bots look like? How would you approach collecting this data?
⌨ (0:51:38) Technical implementation details (Python libraries, cloud services, etc.)
⌨ (0:56:01) Any questions for me?
⌨ (1:03:42) Post-interview breakdown & analysis
I really liked that Keith decided to use a Google Doc, because in what I would call first-round 'team fit' and knowledge-gauging interviews, the assumption is you'll just talk over Zoom and not do whiteboarding or use another tool. This was a good reminder to expect the unexpected: you could be asked to do anything, maybe even code :)
I watched this video like a year ago knowing very little. Now, I feel like I can completely answer each question in detail and follow all of the concepts being mentioned.
My knowledge increased a lot by watching this. Please upload more mock interviews like this. I also some technical details in model implementation.
Can we have Data Analyst mock interviews too?
Keith is one fantastic teacher. He took my analytics skills from 0 to 5 very quickly. Great content.
Nice video. I think in the beginning, when she was asked if she had any first thoughts on the issue of spam bots, one thing that could've been added was 'what are the positives of bots?' Too much was said about the negative aspects of bots, and the first impression I had was: if all bots are so negative, just ban bots. But bots do have a role, and many bots are used to automate functionality. So the really important point is to identify bots that are being malicious in some way, and then dive into how to develop metrics that capture the concept of 'malicious'.
My approach is slightly different.
Basic data collection:
1. You have the basic details captured when you create a YT account: name, email, DOB, image, etc.
2. Assume that every time you log into YouTube your activity is recorded as follows: comment made (if any), time, post, ip_address, email, likes, dislikes, report, type of report, followers, activity spent (scrolling, browsing, etc.).
Selection queries:
1. Filter on accounts where comments are made and the activity time in one session is extremely long (say more than 12 hrs), or where activities are very frequent in a time interval (e.g. log in/out 5 times in 1 minute).
Feature engineering side: extract features such as how many comments are made in 1 day, number of links posted in a comment, number of reports per time interval, number of words, number of words which are spam, difference between account followers and those followed, and activity time in minutes/seconds (scrolling 100 videos/minute indicates bot action).
Target variable:
1. Set thresholds based on existing visible patterns, e.g. more than 50 comments a day, more than 100 reports in 1 hr, clicking 100 videos in 1 min, spam words > 10 (an account should satisfy any one of these conditions) -> set to 1, else set to 0.
Classification model side: sklearn (fast and quick to test your ideas and features). Models: XGB, with Logistic/SVM as the baseline. Deploy, see and rework.
This is not perfect, but this is what I was coming up with while watching this video.
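The threshold-based target variable described above could be sketched roughly like this. The dictionary keys (`comments_per_day`, `reports_per_hour`, and so on) are hypothetical names picked for illustration. The numbers come from the comment, and "clicking 100 videos in 1 min" is read here as "100 or more", which is an assumption.

```python
# Thresholds taken from the comment above; key names are hypothetical.
STRICT_THRESHOLDS = {
    "comments_per_day": 50,   # more than 50 comments a day
    "reports_per_hour": 100,  # more than 100 reports in 1 hr
    "spam_word_count": 10,    # spam words > 10
}
# Assumption: "clicking 100 videos in 1 min" means 100 or more clicks.
VIDEO_CLICKS_PER_MINUTE_LIMIT = 100


def weak_label(stats):
    """Return 1 if any one rule fires, else 0. Missing stats count as 0."""
    if stats.get("video_clicks_per_minute", 0) >= VIDEO_CLICKS_PER_MINUTE_LIMIT:
        return 1
    for key, limit in STRICT_THRESHOLDS.items():
        if stats.get(key, 0) > limit:
            return 1
    return 0


accounts = [
    {"comments_per_day": 75, "reports_per_hour": 2},
    {"comments_per_day": 4, "spam_word_count": 1},
]
labels = [weak_label(a) for a in accounts]
```

One thing worth keeping in mind with this design: if the same columns are used both to create the label and as model inputs, the classifier can simply relearn the thresholds, so it may help to hold the rule columns out of the feature set or to validate against a manually labelled sample.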
I kinda surprisingly enjoyed this. I didn't even know when the interview started. Feels like two people having a convo about data science
Thanks to Nick and Kylie, it was very informative. Love the flow of the interview. I wanted to add something about the real issue around spam bots: as this is internal to the platform or system, you can always add a small CAPTCHA routine on the message sender side. For example, the click of the "post" or enter button may pre-calculate the CAPTCHA before the message is posted in a response. It may look heavy, but it can easily be done in a lean way.
This was a really useful video. If interviews are like this, it's love 🎉 One thing we can add to the features is links. Certain spam posts consist of similar links to the same post/profile. Not only bots do it; people also constantly spam their accounts in the comments. So: link frequency, link context and so on. That feature, used with a Bayes classifier, can make the model more robust. Bag of words from NLP can also be used to make these link tasks easier. Just an opinion of mine. 😊
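To make the link + bag-of-words + Bayes idea a bit more concrete, here is a minimal sketch using only the Python standard library. It assumes one possible way of handling links: each link is collapsed into a `LINK:<domain>` token so repeated links to the same site count as the same "word". The training comments are invented toy data, not real results.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://\S+")


def tokenize(comment):
    """Bag-of-words tokens, with each link collapsed to LINK:<domain>."""
    tokens = ["LINK:" + urlparse(url).netloc.lower()
              for url in URL_RE.findall(comment)]
    text = URL_RE.sub(" ", comment).lower()
    tokens.extend(re.findall(r"[a-z0-9']+", text))
    return tokens


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, comments, labels):
        # Assumes both classes (0 and 1) appear in the training labels.
        self.word_counts = {0: Counter(), 1: Counter()}
        self.class_counts = Counter(labels)
        for comment, label in zip(comments, labels):
            self.word_counts[label].update(tokenize(comment))
        self.vocab = set(self.word_counts[0]) | set(self.word_counts[1])
        return self

    def predict(self, comment):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label in (0, 1):
            total_words = sum(self.word_counts[label].values())
            score = math.log(self.class_counts[label] / total_docs)
            for tok in tokenize(comment):
                if tok not in self.vocab:
                    continue  # skip tokens never seen in training
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Invented toy data: 1 = spam, 0 = not spam.
train_comments = [
    "check my profile http://spam.example/a",
    "free followers http://spam.example/b",
    "great interview, thanks for sharing",
    "the google doc part was helpful",
]
train_labels = [1, 1, 0, 0]
model = NaiveBayes().fit(train_comments, train_labels)
```

Calling `model.predict(...)` on a new comment returns 1 or 0. Link frequency per account could then be added as a separate numeric feature on top of this.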