Excellent high-level overview. Very concise and clear!
Thank you so much for the clear explanation! ☺
Great explanation and high quality video. Thanks!
During the LLM training phase, the vocabulary or text corpus gets tokenized and then each token gets embedded into its own vector. When the cosine similarity comparison is made during the retrieval phase, is the query embedded as one vector or is each token in the query embedded into its own vector? If the latter, is vector addition then used to combine them all into one vector? Cosine similarity only compares two vectors at a time, right? If so, I'm curious as to how queries (and docs), each comprising multiple tokens, are prepared for similarity search. I hope my question makes sense.
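For what it's worth, here is a minimal numpy sketch of how that pooling step often works, assuming mean pooling (one common choice; some models instead use a special [CLS] token or a dedicated sentence encoder). The vectors below are random placeholders, not real embeddings:

```python
import numpy as np

def embed_text(token_vectors: np.ndarray) -> np.ndarray:
    """Collapse per-token embeddings (n_tokens x d) into one d-dim vector via mean pooling."""
    return token_vectors.mean(axis=0)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
query_tokens = rng.normal(size=(5, 768))   # 5 query tokens, 768-dim each (hypothetical sizes)
doc_tokens = rng.normal(size=(40, 768))    # a 40-token document

q = embed_text(query_tokens)
d = embed_text(doc_tokens)
print(cosine_similarity(q, d))             # one score between the two pooled vectors
```

Either way, each text ends up as a single vector, so cosine similarity really does only ever compare two vectors at a time.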
I'm in the final year of my undergrad, doing my dissertation on forecasting the volatility of financial time series. Thanks for all your work. If you could do any videos applicable to this, that would be great! (I'm sure you get requests every second of the day, but yeah, thanks.)
Nicely explained! What do you use for your illustrations?
Thanks for the explanation, Ritvik! Could you please make a video about LDA too?
Well done video
Thanks mate
Thank you for the video. At 6:30 you mention using cosine similarity here. Have you heard of a recent paper titled "Surpassing Cosine Similarity for Multidimensional Comparisons: Dimension Insensitive Euclidean Metric (DIEM)"? It'd be great to hear your opinion on it.
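As a small aside on that comparison: for unit-length embeddings, plain Euclidean distance and cosine similarity are monotonically related, which is part of why that kind of paper digs into behavior in high dimensions. Here is a quick numpy check of that standard identity (this is not the DIEM metric itself, which is defined in the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=256); a /= np.linalg.norm(a)  # normalize to unit length
b = rng.normal(size=256); b /= np.linalg.norm(b)

cos = a @ b
dist_sq = np.sum((a - b) ** 2)
print(np.isclose(dist_sq, 2 * (1 - cos)))  # True: ||a-b||^2 = 2(1 - cos) for unit vectors
```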
Hello Ritvik, I would greatly appreciate it if you could make a video on building an ARIMAX model in Excel. It would be hugely beneficial for me. Please please please see this comment 🙏
What vector databases are widely used for this purpose?
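One hedged sketch of what retrieval against such an index can look like, using FAISS (an open-source similarity-search library; Pinecone, Weaviate, Milvus, Chroma, and pgvector are other widely used options). The dimensions and data here are made up:

```python
import numpy as np
import faiss

d = 384                                   # embedding dimensionality (hypothetical)
docs = np.random.rand(1000, d).astype("float32")
faiss.normalize_L2(docs)                  # normalize so inner product equals cosine similarity

index = faiss.IndexFlatIP(d)              # exact inner-product index
index.add(docs)

query = np.random.rand(1, d).astype("float32")
faiss.normalize_L2(query)
scores, ids = index.search(query, 5)      # top-5 most similar documents
```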
Have you ever thought of making a course on applied stats? Please do!