In this video 📝 we are going to take a look at the new Meta-Transformer model for multimodal inputs. Meta-Transformer is a unified framework for multimodal learning that can take images, video, text, audio, sensor data, and more as input. We are going to go through the project page and GitHub repo, and look at the model architecture and results.
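The core idea — modality-specific tokenizers that map very different inputs into one shared token space, which a single shared encoder then processes — can be sketched in a few lines. This is a toy illustration only: the tokenizers, the pooling "encoder", and all dimensions here are made up for the example and are not the actual Meta-Transformer code.

```python
import numpy as np

D = 8  # shared embedding dimension (illustrative)

def tokenize_text(words):
    # toy text tokenizer: deterministically map each word to a D-dim vector
    def embed(w):
        rng = np.random.default_rng(abs(hash(w)) % (2**32))
        return rng.standard_normal(D)
    return np.stack([embed(w) for w in words])

def tokenize_image(img):
    # toy image tokenizer: split into 2x2 patches, project each patch to D dims
    h, w = img.shape
    patches = [img[i:i+2, j:j+2].ravel()
               for i in range(0, h, 2) for j in range(0, w, 2)]
    proj = np.random.default_rng(0).standard_normal((4, D))
    return np.asarray(patches) @ proj

def shared_encoder(tokens):
    # stand-in for the shared frozen encoder: just mean-pool the tokens
    return tokens.mean(axis=0)

# Both modalities end up as (D,)-shaped embeddings from the SAME encoder.
text_emb = shared_encoder(tokenize_text(["a", "cat"]))
image_emb = shared_encoder(tokenize_image(np.ones((4, 4))))
print(text_emb.shape, image_emb.shape)
```

The design point this mirrors is that only the tokenizers differ per modality; everything downstream of tokenization is shared.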
If you’re looking for courses and to extend your knowledge even more, check out this link here:
👉 nicolai-nielsen-s-school.teachable.com/courses
If you enjoyed this video, be sure to press the 👍 button so that I know what content you guys like to see.
_______________________________________________________________
🧑🏻‍💻 My AI and Computer Vision Courses⭐:
📚 Research Paper Implementation Course: nicolai-nielsen-s-school.teachable.com/p/research-…
📕 Object Detection Course: nicolai-nielsen-s-school.teac/…
📗 OpenCV GPU Course: nicolai-nielsen-s-school.teac/...
📘 SegFormer Course: nicolai-nielsen-s-school.teac/...
📙 Object Tracking Course: nicolai-nielsen-s-school.teachable.com/p/yolov8-ob…
🦾 Online Courses with Job Guarantee on Springboard (Save $1000 with: "NICOLAINIELSEN") www.springboard.com/landing/influencer/nicolainiel…
_______________________________________________________________
📞 Connect with Me:
🌍 My Website: www.nicolai-nielsen.com/
🤖 GitHub: github.com/niconielsen32
👉 LinkedIn: www.linkedin.com/in/nicolai-h…
🐦 Twitter: twitter.com/NielsenCV_AI
_______________________________________________________________
tags:
#transformer #metatransformer #multimodal