@nicokalkusinski9320

Key takeaways:
1. Talk a lot and explain why you decided to go with the specific solution
2. Don't be afraid to take a pause and rethink the problem
3. Don't get too fixated on one aspect of the problem; always try to approach it from a bird's-eye view
4. Focus first on the easier parts of the problem and then approach the harder ones
5. Remember it's a conversation with the interviewer, not the solo showoff
6. Ask them questions about the work they've done so far and what they've learnt at the company, look for signs of being valued as a person in a team, and ask what the first 6 months on the job would look like

@oumaymaredissi5213

I appreciate that this interview is more like a conversation, with a focus on problem-solving. In the past, I've had interviews where the tone was passive-aggressive instead of constructive, so it's refreshing to have a more productive experience. It's great when interviews can be a valuable use of time, rather than a frustrating waste.

@arsnakehert

Even not knowing much about data science, this interview was very helpful in learning how it might be applied to real, known problems, and the mutual feedback at the end was very helpful to learn about the dynamics of interviews, too! Thank you guys for doing this.

@lifeofadeel

I love the part where they pull up a Google Doc to clarify their points and have them noted down.

Excellent problem-solving approach. 

Focusing on the problem, its scope, its limitations, what's expected behaviour, what isn't, etc. was always something that I rushed past.

Jumping to the solution right away is something that I had to unlearn.

Also, kudos to Kylie for being super composed. Another superpower, and not easy to do in live situations.

Great video!

@duckcluck123

As someone who just finished their masters and is looking for a job in data science, this interview really boosted my confidence cuz I was able to respond to every question

@KylieYYing

Thanks for featuring!!

@temiwale88

Thank you for this interview. For the spam issue, you can tack on the traditional features that Kylie Ying mentioned, and then use a multilingual embedding model to create vectors out of the post content. Use these features + embeddings to train your model.
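A minimal sketch of the "features + embeddings" idea from the comment above. Everything named here is an assumption for illustration: `toy_embedding` is a hypothetical stand-in for a real multilingual embedding model (it only hashes character trigrams), and the three hand-crafted fields (`num_links`, `account_age_days`, `posts_last_hour`) are made-up examples, not the features discussed in the video.

```python
import hashlib
import math


def toy_embedding(text: str, dim: int = 16) -> list[float]:
    """Hypothetical stand-in for a multilingual embedding model.

    Hashes character trigrams into a fixed-size vector and L2-normalises it.
    In practice a real embedding model would be called here instead.
    """
    vec = [0.0] * dim
    padded = f"  {text.lower()} "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        digest = hashlib.sha256(trigram.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_feature_row(post: dict, dim: int = 16) -> list[float]:
    """Concatenate hand-crafted features with the text embedding."""
    traditional = [
        float(post["num_links"]),         # assumed feature
        float(post["account_age_days"]),  # assumed feature
        float(post["posts_last_hour"]),   # assumed feature
    ]
    return traditional + toy_embedding(post["text"], dim)


# One row of the training matrix: 3 hand-crafted values + 16 embedding values.
row = build_feature_row({
    "text": "Gana dinero rápido, haz clic aquí",
    "num_links": 2,
    "account_age_days": 1,
    "posts_last_hour": 40,
})
```

Rows built this way could be stacked into a matrix and passed to any classifier; in a real pipeline the hand-crafted columns would probably need scaling, since they live on a different range than the normalised embedding.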

@ShermSuite

Thank you for sharing this video. As someone who is transitioning into the Data Science field (Machine Learning/AI), I was very surprised that I was able to keep up during the interview. I wasn't lost at all. I don’t have a technical background, but I’ve been studying Azure, Python, GitHub, R, SQL, etc. pretty hard for the past few months and doing some labs, and I’m feeling pretty confident I can make the move.

@agil.eera3010

Hello, I don't usually comment on YouTube but this time I just wanna say thank you to the people involved in this video. This really helped me get through my technical interview and I ended up getting the job I wanted. The introduction and close of the interview were almost a copy of what happened in my interview. Obviously, the technical part was different (in my case they just asked about the technical challenge I had to solve prior to the interview) but the way I approached the questions was very similar. Just THANKS!!

@Fetrah2

⭐ Contents ⭐
⌨ (0:00:00) Video overview & format
⌨ (0:02:13) Introductory Behavioral questions
⌨ (0:07:46) Social media platform bot issue task overview
⌨ (0:15:26) What are some features we should investigate regarding the bot issue?
⌨ (0:25:02) Classification model implementation details (using feature vectors)
⌨ (0:41:38) What would a dataset to train models to detect bots look like? How would you approach collecting this data?
⌨ (0:51:38) Technical implementation details (python libraries, cloud services, etc)
⌨ (0:56:01) Any questions for me?
⌨ (1:03:42) Post-interview breakdown & analysis

@BobCat-n7p

I really liked that Keith decided to use a Google Doc, because in what I would call first-round 'team fit' and knowledge-gauging interviews, the assumption is you'll just talk over Zoom and not do whiteboarding or use another tool. This was a good reminder to expect the unexpected - you could be asked to do anything - maybe even code :)

@CraigThePoet

I watched this video like a year ago knowing very little. Now, I feel like I can completely answer each question in detail and follow all of the concepts being mentioned.

@animexworld6614

My knowledge increased a lot by watching this. Please upload more mock interviews like this. I also some technical details in model implementation.

@rmfalme

Can we have Data Analyst mock interviews too?

@streetthinker1978

Keith is one fantastic teacher. He took my analytics skills from 0 to 5 very quickly. Great content.

@TravisMeyerPhD

Nice video. I think in the beginning when asked if she had any first thoughts on the issue of spam bots, one thing that could've been added was 'what are the positives of bots'. Too much was said about the negative aspects of bots, and the first impression I had was, if all bots are so negative, just ban bots. But bots do have a role, and many bots are used to automate functionality. So the really important point is to identify bots that are being malicious in some way. Then dive into how to develop metrics to identify the concept of 'malicious'.

@mudumbypraveen3308

my approach is slightly different : 

Basic data collection :

1. You have basic details captured when you create a YT account: name, email, DOB, image, etc.
2. Assume that every time you log into YouTube your activity is recorded as follows: comment made (if any), time, post, ip_address, email, likes, dislikes, reports, type of report, followers, activity time spent (scrolling, browsing, etc.).


Selection on queries :

 1. Filter on accounts where comments are made and activity time in one session is extremely long (say more than 12 hrs) or where activity is very frequent within a time interval (e.g. logging in/out 5 times in 1 minute)

On the feature engineering side:

Extract features -> comments made in 1 day, number of links posted per comment, number of reports per time interval, number of words, number of spam words, difference between account followers and accounts followed, activity time in minutes/seconds (scrolling 100 videos/minute indicates bot action).

Target variables: 
1. Set thresholds based on existing visible patterns, e.g. more than 50 comments a day, more than 100 reports in 1 hr, clicking 100 videos in 1 min, spam words > 10 (satisfying any one of these conditions is enough) -> set to 1, else set to 0.

Classification model side : Sklearn (fast and quick to test your ideas and features), Model- XGB, Logistic/SVM (baseline)

Deploy, see and rework

This is not perfect but this is what I came up with while watching this video.
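The thresholding step in the comment above could be sketched roughly as follows. The thresholds are the ones quoted in the comment; the `ActivitySummary` fields, the `SPAM_WORDS` list, and the function names are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass

# Hypothetical spam vocabulary; a real list would come from the platform's data.
SPAM_WORDS = {"free", "winner", "click", "giveaway"}


@dataclass
class ActivitySummary:
    """Per-account aggregates (assumed schema, not from the video)."""
    comments_per_day: int
    reports_per_hour: int
    videos_clicked_per_minute: int
    spam_word_count: int


def count_spam_words(comment: str) -> int:
    """Count tokens that appear in SPAM_WORDS, ignoring case and punctuation."""
    tokens = comment.lower().split()
    return sum(1 for t in tokens if t.strip(".,!?:;") in SPAM_WORDS)


def heuristic_label(a: ActivitySummary) -> int:
    """Return 1 if any threshold from the comment is met, else 0."""
    return int(
        a.comments_per_day > 50
        or a.reports_per_hour > 100
        or a.videos_clicked_per_minute >= 100
        or a.spam_word_count > 10
    )


suspicious = ActivitySummary(comments_per_day=120, reports_per_hour=3,
                             videos_clicked_per_minute=4, spam_word_count=2)
ordinary = ActivitySummary(comments_per_day=3, reports_per_hour=0,
                           videos_clicked_per_minute=2, spam_word_count=1)
labels = [heuristic_label(suspicious), heuristic_label(ordinary)]
```

One caveat with this design: a model trained on labels derived from the same columns it sees as features can mostly relearn the rules, so labels like these are probably best treated as a weak starting point to refine with manually reviewed accounts.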

@DANNYEL20122

I kinda surprisingly enjoyed this. I didn't even know when the interview started. Feels like two people having a convo about data science

@Mr767267

Thanks to Nick and Kylie, it was very informative. Love the flow of the interview. I wanted to add something about the real issue around spam bots: since this is internal to the platform or system, you can always add a small CAPTCHA routine on the message-sender side. For example, clicking the "post" or Enter button could pre-compute the CAPTCHA before the message is posted. It may look heavy, but it can easily be done in a lean way.

@dipankarnandi7708

This was a really useful video. If interviews are like this, it's love 🎉

One thing we can add to the features is links. Certain spam posts consist of similar links to the same post/profile.
It's not only bots that do it; even people constantly spam their account in the comments. So Link Frequency, Link Context and so on.

That feature, used with a Bayes classifier, can be useful to make the model more robust. Bag of Words from NLP can also be used to make these link tasks easier.

Just an opinion of mine. 😊
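A rough sketch of the bag-of-words plus Bayes classifier idea from the comment above, written from scratch as a small multinomial Naive Bayes with add-one smoothing. The choice to collapse every URL into a `link:<domain>` token is an assumption made here to capture link frequency; the class name, the tokenizer, and the four training comments are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(comment: str) -> list[str]:
    """Bag-of-words tokens; each URL is collapsed to 'link:<domain>'."""
    tokens = []
    for raw in comment.lower().split():
        match = re.match(r"https?://([^/\s]+)", raw)
        if match:
            tokens.append(f"link:{match.group(1)}")
        else:
            word = re.sub(r"[^\w]", "", raw)
            if word:
                tokens.append(word)
    return tokens


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, comments, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.class_counts}
        for comment, label in zip(comments, labels):
            self.word_counts[label].update(tokenize(comment))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, comment):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokenize(comment):
                if tok in self.vocab:  # tokens never seen in training are skipped
                    score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up toy data, only to show the mechanics.
train_comments = [
    "check my channel http://spam.example/win",
    "free giveaway here http://spam.example/free",
    "great interview, thanks for sharing",
    "the feature discussion was very helpful",
]
train_labels = ["spam", "spam", "ham", "ham"]
model = NaiveBayes().fit(train_comments, train_labels)
prediction = model.predict("win a giveaway http://spam.example/now")
```

Because both spam comments share the same domain, the `link:spam.example` token ends up with a higher count in the spam class, which is roughly the "link frequency" signal the comment describes.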