Introducing SAM 3D: Powerful 3D Reconstruction for Physical World Images | Weiyao Wang & Xitong Yang
The AI Talks
38:37
[S5E3] Scaling Beyond Autoregression: Order scaling as a new path to AGI | Jinjie Ni | NUS
59:28
[S5E2] Video Models Are Zero-Shot Learners and Reasoners | Thaddäus Wiedemer | Google Deepmind
44:53
[S5E1] The Tolman–Sherrington Metamorphosis of Intelligence | Hokin Deng
1:25:30
【S4E8】Guardian of Trust in Language Models: Automatic Jailbreak and Systematic Defense
38:02
【S4E7】Towards democratising robot learning for all
37:23
【S4E6】Learning Humanoid Robots
1:04:15
【S4E5】Understanding and Mitigating the Pre-training Noise on Downstream Tasks
28:14
【S4E4】Video Creation with Diffusion Models
45:29
【S4E3】Distilling Vision-Language Models on Millions of Videos
38:21
【S4E2】Towards Learning a Driving Simulator from the Real World
43:08
【S4E1】InstantID: Zero-shot Identity-Preserving Generation in Seconds
31:26
【S3E10】Long video understanding with minimal supervision
46:31
【S3E9】3D Human Modelling from Image and Text Guidance
29:40
【S3E8】Learning visual language models for video understanding
43:31
【S4E7】Inductive Biases for Learning Long-Horizon Manipulation Skills
50:59
【S3E6】Generalist Embodied AI in an Open World
36:39
【S3E5】3D Structured Generative Models
46:18
【S3E4】Learning to Edit 3D Objects and Scenes
49:45
【S3E3】Multimodal Representation Learning with Deep Generative Models
36:29
【S2E8】Customizing Large-Scale Generative Models
31:49
【S3E2】Collecting and Leveraging Data without Crowd Workers
31:16
【S3E1】Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models
29:49
【S2E11】Learning from Language Models for Visual Intelligence
44:54
【S2E4】Adaptive and trustworthy NLP with retrieval for information access for everyone
45:32
【S2E10】Vision-and-Language Alignment - Towards Universal Multimodal AI
34:27
【S2E9】Advancing Semi-Supervised Learning: Methods and Benchmarks
40:18
【S2E7】The case for reasoning beyond recognition
45:20
【S2E6】On the Gauge Transformation of Neural Fields
32:43
【S2E5】Depth Estimation from Unstabilized Mobile Photography
1:18:19
【S2E3】Unknown-aware learning for object detection and beyond
37:39
【S2E2】MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
34:27
【S2E1】Personalizing Text-to-image Generation
47:34
【EP11】Improving Robustness to Distribution Shifts: Methods and Benchmarks
31:31
【EP10】StyleGAN-Based Portrait Image and Video Style Transfer
37:09
【EP9】Principled solutions for efficient artificial neural networks
47:22
【EP8】Prompting-based Continual Learning
44:42
【EP7】Finetuning Vision Models: Improving Robustness and Accuracy
40:29
【EP6】Architectures and Training for Visual Understanding
1:04:05
【EP5】Bit Diffusion: Generating Discrete Data using Diffusion Models with Analog Bits
1:03:54
【EP4】MMAI: Close the loop for Medical AI application
31:19
【EP3】Large-Scale Visual Representation Learning with Vision Transformers
1:03:21
【EP2】Using AI to Diagnose and Assess Parkinson's Disease: Challenges, Algorithms, and Applications
1:11:43
【EP1】A Vision-and-Language Approach to Computer Vision in the Wild: Modeling and Benchmark
1:05:49