@leixun

My takeaways:
1. Distance 9:30
2. k-means algorithm 17:03
- How to choose k 23:57
- Unlucky initial centroids 25:56
- An example 28:58
- Scale data into the same range 37:11
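A minimal sketch tying those takeaways together (not the lecture's code; the scikit-learn calls and toy data are my own): z-scaling puts the features on the same range, and rerunning k-means from several random initializations softens the unlucky-initial-centroids problem.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Toy data: two features on very different scales (e.g. age vs. income).
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(40, 10, 200),          # ages
                     rng.normal(50_000, 15_000, 200)]) # incomes

# Scale each feature to mean 0, std 1 so neither dominates the distance.
X_scaled = StandardScaler().fit_transform(X)

# n_init=10 reruns k-means from 10 random initial centroid sets and keeps
# the best result, which guards against an unlucky initialization.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_scaled)
print(km.labels_[:10], km.inertia_)
```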

@wajihaliaquat3365

The professor gives gifts to the ones who contribute to this lecture. Loved that!

@RaviShankar-vd8en

The explanation in this video is by far the best I have ever watched. Prof. Guttag does a very good job of explaining every concept clearly.

@jorgebjimenez3752

K-means at 16:30: one of the very best algorithms in AI

@MrSrijanb

It just struck me, after all these lecture videos, that Professor Guttag is actually using a classic positive reinforcement technique to make the students more attentive and responsive in class by giving out candy for correct answers. lol! And I am not sure if it's the result of this or something else, but the students seem way too eager to answer questions in this particular lecture video!

@handang9165

I can't believe I am binge-watching MIT lectures. I wish I had had the chance to attend MIT back then.

@dontusehername

I wish I could get the opportunity to sit in on a class at MIT someday! Such brilliant minds.

@newbie8051

Great lecture!
I attended the clustering lecture by Prof. Ayan Seal today (even though I'm not taking the course, Introduction to Data Science). He didn't focus much on code, but had similar things to share about clustering!

@naheliegend5222

Love the prof for 4:35 - that is brilliant.

@matheusbarros8488

When we were clustering the airports, the professor only stopped to think about linkage when he arrived at Denver. Shouldn't we have thought about it from the beginning of the clustering?
If so, we could have gotten (BOS, SF) instead of (BOS, NY) for the first iteration using complete linkage.
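Not the lecture's code, but one way to check this yourself with scipy: run agglomerative clustering over a distance matrix once with single linkage and once with complete linkage and compare the merge orders. The distances below are made-up placeholders, not the real mileages from the slide.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

cities = ["BOS", "NY", "CHI", "DEN", "SF", "SEA"]

# Placeholder pairwise distances (NOT the lecture's mileages),
# as a symmetric matrix with zeros on the diagonal.
D = np.array([
    [   0,  206,  963, 1949, 3095, 2979],
    [ 206,    0,  802, 1771, 2934, 2815],
    [ 963,  802,    0,  966, 2142, 2013],
    [1949, 1771,  966,    0, 1235, 1307],
    [3095, 2934, 2142, 1235,    0,  808],
    [2979, 2815, 2013, 1307,  808,    0],
])

# linkage() expects the condensed (upper-triangle) form of the distance matrix.
condensed = squareform(D)

for method in ("single", "complete"):
    Z = linkage(condensed, method=method)
    i, j = int(Z[0][0]), int(Z[0][1])   # first merge: two original city indices
    print(method, "first merge:", cities[i], cities[j], "at distance", Z[0][2])
```

One note on the question itself: when every cluster is still a single city, both single and complete linkage reduce to the plain pairwise distance, so the very first merge comes out the same either way; the linkage choice only starts to matter once clusters contain more than one city.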

@shaileshrana7165

I wanna attend Professor Guttag's classes mostly for the education but also for the candies.

@adiflorense1477

39:54 I think z-scaling is the same as creating a normally distributed dataset
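A quick check of what z-scaling does (toy data of my own, not from the lecture): it shifts each feature to mean 0 and rescales it to standard deviation 1, but it does not reshape the distribution, so a skewed feature stays skewed rather than becoming normal.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=100_000)   # strongly right-skewed data

# Z-scaling: subtract the mean, divide by the standard deviation.
z = (x - x.mean()) / x.std()

print(round(z.mean(), 3), round(z.std(), 3))   # ~0.0 and ~1.0
print(round(skew(x), 2), round(skew(z), 2))    # skewness unchanged, still far from normal
```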

@BaoTran-se4xi

The guys who downvoted this video must have had nothing better to do. The lecture was nicely paced, and I think he already made the problem as clear as it can get.
Anyway, that was a great lecture. A big thank you to Professor Guttag and the MIT OpenCourseWare team.

@ElVerdaderoAbejorro

This professor is awesome!

@flamingjob2

Thank you MIT! From Singapore. Lots of love.

@Furzgranate666

Professor Guttag: 'Dendrogram... I should write that down.'
Also Professor Guttag: misspells it :D

@mauricesavery

great professor

@marceli1109

What are some methods to evaluate the quality of the clusters if we do not have an outcome variable? In the example they were evaluated in part on whether the subjects in the cluster died at a higher rate. What do I do if I don't have an outcome to look at, only characteristics? For context, I'm creating cognitive style groups based on user data for an insurance company, and these styles will later be used for morphing, churn, etc., but there is no outcome variable per se.
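One common family of answers is internal validity measures that use only the feature matrix, e.g. the silhouette score (how close each point is to its own cluster versus the nearest other cluster). A minimal sketch with scikit-learn, assuming your user data is already a numeric matrix (the data here is a random stand-in):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Stand-in for the (hypothetical) user feature matrix.
rng = np.random.default_rng(0)
X = StandardScaler().fit_transform(rng.normal(size=(500, 6)))

# Score a few candidate k values; a higher silhouette means tighter,
# better-separated clusters.
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print(k, round(silhouette_score(X, labels), 3))
```

Other label-free options include the Calinski-Harabasz and Davies-Bouldin indices, or simply checking how stable the clusters stay under resampling of the data.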

@shivaanyakulkarni4357

At 28:00, can anyone help here? How do we compare this dissimilarity (mentioned in the IF statement) in Python? Badly need this.
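I don't have the slide's exact code, but in this lecture's style the dissimilarity of a clustering is the sum of each cluster's variability (squared distances to its centroid), and the IF statement just keeps whichever candidate clustering has the smaller value across several k-means runs. A rough sketch (the names and toy data are my own, not necessarily the lecture's):

```python
import numpy as np

def variability(points):
    """Sum of squared distances from each point to the cluster's centroid."""
    centroid = points.mean(axis=0)
    return float(((points - centroid) ** 2).sum())

def dissimilarity(clusters):
    """Total badness of a clustering: the sum of every cluster's variability."""
    return sum(variability(c) for c in clusters)

# Two candidate clusterings, each a list of numpy arrays of points.
rng = np.random.default_rng(0)
clustering_a = [rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))]
clustering_b = [rng.normal(0, 3, (50, 2)), rng.normal(5, 3, (50, 2))]

# This comparison is what the IF statement does: keep the tighter clustering.
best = clustering_a if dissimilarity(clustering_a) < dissimilarity(clustering_b) else clustering_b
print(dissimilarity(clustering_a), dissimilarity(clustering_b))
```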

@henrikmanukyan3152

Main issues of k-means: choosing the number of clusters (k) and data scaling. But what if one wants to apply weights to the features (parameters)? Should you just multiply the features by the desired coefficients?
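Pre-multiplying the columns does work, with one subtlety: k-means uses squared Euclidean distance, so multiplying a feature by w makes it count w² times as much. If you want a feature to carry weight w in the squared distance, scale the (already standardized) column by sqrt(w). A hedged sketch with toy data of my own:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))              # toy data with 3 features
X = StandardScaler().fit_transform(X)      # standardize first so the weights are comparable

# Desired weight of each feature in the squared-distance metric.
weights = np.array([1.0, 4.0, 0.25])

# Multiplying a column by sqrt(w) gives it weight w in squared Euclidean distance:
#   sum_j (sqrt(w_j) * (x_j - y_j))**2 == sum_j w_j * (x_j - y_j)**2
X_weighted = X * np.sqrt(weights)

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_weighted)
print(np.bincount(labels))
```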