@SelfSimilarJosh

Imagine creating something so spectacularly elegant and having it rejected because the two straws you drew both came up short. That's the process, and that's the problem. Very well-done video and phenomenal pedagogy.

@jamescamacho3403

As someone actively working on this stuff, this channel has the best explanations on the internet, and the 'tuber actually understands what is going on.

@RexPilger

About peer review: as one comment noted, there could be many more candidate papers submitted than the venue could accommodate. However, as this video argues, the justification given for rejecting this paper is inadequate at best. Some comments ask whether the rejection matters; for academics, the answer is yes, because presentations and publications count toward tenure, promotions, and raises, plus continued funding of the research. Since several comments and the video indicate that the algorithm had already received a lot of publicity, the rejection may not matter much for the project itself, provided it can continue to be funded, especially if commercial implementations are successful.

What is interesting in any case is that the paper exists; in effect it has been published. The authors may not get the desired credit of formal publication, but their work and the reviewer comments are out there now. A couple of decades ago that would not have been the case; most people in the field would have been unaware of the algorithm.

As for peer review in general (outside of AI): in my field, one of the natural sciences, a paper I submitted encountered an editor and two reviewers who were well qualified in the area; after they asked for two rounds of revisions, the third version was rejected. Interestingly, all three scientists had published research that my paper undermined; they might well have lost funding, or even their positions, had my manuscript been published (I speculate here). Peer review cuts both ways. While iterating with the editor and reviewers I continued to expand my research project and made some additional discoveries. After the rejection I wrote a completely different paper that incorporated my initial work plus the new discoveries; happily, it was published a few months ago (in a different journal). I'm formally retired now, but I continue to do research.

To young researchers: never give up. Learn from rejection, refine your work, be humble, exercise integrity and honesty, and take pride in your accomplishments, even if only a few people know about them. Peer review (by humans) is a necessity and will remain one. There is no such thing as a perfect filter, but science and technology would be overwhelmed by irrelevance, dishonesty, and duplication of effort without it. AI may become a useful filtering tool, but science is a human endeavor.

@andreasbeschorner1215

During my Ph.D. years, a paper of mine was rejected at ICASSP for not citing a certain paper (I suspect the reviewer was one of its authors) that had absolutely NOTHING to do with what my paper was about... So yes, a lot of the reviewing process seems to be a) personal and b) "you must do this and that" even when it has nothing to do with your paper. And it has been like this for years...

@jarib3858

One small note on RNNs: reservoir computing is a very high-dimensional random RNN with a linear regression readout; since only the readout is trained, there is no exploding or vanishing gradient. Reservoir computing is currently the standard for non-linear dynamic time series prediction.
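A rough sketch of what I mean, in plain NumPy (a toy echo-state network, not production reservoir-computing code): the recurrent weights stay random and fixed, and only the linear readout is fit, so nothing is ever backpropagated through time.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_res = 1, 300                       # input and reservoir dimensions
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W_res = rng.normal(0, 1, (n_res, n_res))
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))   # spectral radius < 1

def run_reservoir(u):
    """Drive the fixed random reservoir with input sequence u of shape (T, n_in)."""
    h = np.zeros(n_res)
    states = []
    for u_t in u:
        h = np.tanh(W_in @ u_t + W_res @ h)   # recurrent weights are never trained
        states.append(h)
    return np.stack(states)                   # (T, n_res)

# Toy task: one-step-ahead prediction of a sine wave.
t = np.linspace(0, 60, 3000)
u = np.sin(t)[:, None]
H = run_reservoir(u[:-1])
y = u[1:, 0]

# Linear readout via least squares -- the only trained part.
W_out, *_ = np.linalg.lstsq(H, y, rcond=None)
pred = H @ W_out
print("train MSE:", np.mean((pred - y) ** 2))
```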

@jawadmansoor6064

Wow, you've made some difficult, I mean extremely difficult, algorithms look easy. Thank you.

@shirenlu5260

Wow this is a great video. I've been having a lot of trouble understanding and getting an intuition of how Mamba works, and this video just made it make sense. The visuals were a massive help and the explanations are super simple and easy to understand.

@rikkathemejo

Nice video! I just wanted to point out that the parallel scan algorithm can also be implemented with O(n) total work (instead of the O(n log n) version presented in the video), and this is the version that Mamba uses.
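For illustration, here is a rough NumPy sketch of the work-efficient (Blelloch-style) scan I mean; it is not the actual Mamba kernel, and it assumes a power-of-two length, but it shows where the O(n) total-work bound comes from (an up-sweep plus a down-sweep, each touching about n elements):

```python
import numpy as np

def blelloch_scan(x, op=np.add, identity=0.0):
    """Exclusive scan with O(n) total work via an up-sweep and a down-sweep."""
    a = np.array(x, dtype=float)
    n = len(a)
    assert n & (n - 1) == 0, "power-of-two length assumed for simplicity"

    # Up-sweep (reduce): build partial sums in place along an implicit tree.
    step = 1
    while step < n:
        a[2 * step - 1::2 * step] = op(a[step - 1::2 * step],
                                       a[2 * step - 1::2 * step])
        step *= 2

    # Down-sweep: distribute prefixes back down the tree.
    a[-1] = identity
    step = n // 2
    while step >= 1:
        left = a[step - 1::2 * step].copy()
        a[step - 1::2 * step] = a[2 * step - 1::2 * step]
        a[2 * step - 1::2 * step] = op(left, a[2 * step - 1::2 * step])
        step //= 2
    return a

# Exclusive prefix sums of 1..8: 0 1 3 6 10 15 21 28
print(blelloch_scan([1, 2, 3, 4, 5, 6, 7, 8]))
```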

@anrilombard1121

Currently testing it on molecular generation; excited to see where its strengths hold and where they falter :)

@Paraxite7

I finally understand Mamba! I've been trying to get my head around it for months, and now I see that approaching it the way the original paper presents it wasn't the best way. Thank you.

@kamdynshaeffer9491

Absolutely amazing vid. Just subbed after this channel got recommended to me. Never stop making videos dude <3

@EkShunya

Please open your community tab.
Your content is incredible.

@timseguine2

Thanks for the clear explanation. This gives me enough understanding to not only implement it myself, but to also have some ideas for sensible architecture modifications.

@Levy1111

I do hope you'll soon reach at least a six-figure subscriber count. The quality of your videos (both in terms of education and presentation) is top notch; people need you to become popular (at least within our small tech bubble).

@mehnot8193

Extremely noob question, but at 13:52, why aren't the input vectors x multiplied by P^-1 instead of P? Don't you need to convert them to the eigenbasis before applying the D transformation (or, equivalently, taking the Hadamard product with the diag(D) vector)?
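For concreteness, the convention I have in mind is (my own notation, not necessarily the one used in the video):

```latex
\[
  A = P D P^{-1}
  \quad\Longrightarrow\quad
  A^{n} x = P\, D^{n} \left( P^{-1} x \right),
\]
```

i.e. P^-1 maps x into the eigenbasis, D^n scales each coordinate by an eigenvalue power, and P maps the result back; if the video instead defines A = P^-1 D P, the roles of P and P^-1 simply swap.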

@logician1234

Now that 10 months have passed, do you still think Mamba is better than Transformers? Have there been any new updates and improvements?

@tianlechen

Peer reviews are heavily shaped by reviewers protecting their existing work, which extends previously state-of-the-art methodologies. If you have a genuinely new innovation that goes against the grain, you need to publish it regardless of whether or not the venue is highly regarded.

@blutwurst9000

Love the video, but I have a question: shouldn't the approximation at 17:00 be something like n*w^(n-1)*0.001*x, i.e. isn't there an n missing? Or how was the approximation done?
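For concreteness, the first-order expansion I have in mind (treating the 0.001 as a small perturbation ε of w; this is my reading, not necessarily how the video derives it):

```latex
\[
  (w + \epsilon)^{n} x \;\approx\; w^{n} x + n\, w^{\,n-1}\,\epsilon\, x
  \qquad \text{for small } \epsilon \ (\text{here } \epsilon = 0.001),
\]
```

which is where the extra factor of n would come from.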

@InfiniteQuest86

I like how we now call 1 billion parameters small.

@yqisq6966

Peer review is broken nowadays because people have little time to actually read through a manuscript with attention to detail, given the pressure to publish their own papers. When there are more papers out there than people have time to review, you get low-quality peer review.