May 17, 2023 34 mins

All links are available in the blog post.

In this episode of the Talking Papers Podcast, I hosted Jiahao Zhang to chat about our CVPR 2023 paper "Aligning Step-by-Step Instructional Diagrams to Video Demonstrations".

The paper tackles the task of aligning in-the-wild video demonstrations with the corresponding steps of a furniture assembly diagram. To do that, we collected and annotated a brand new dataset, "IKEA Assembly in the Wild", in which we aligned YouTube videos with IKEA's instruction manuals. Our approach proposes several supervised contrastive losses that contrast video against diagram, video against manual, and images within the same manual.
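
To make the idea of contrasting video against diagram concrete, here is a minimal sketch of a generic InfoNCE-style contrastive loss in plain Python. It is not the paper's implementation: the function names, the toy 2-D embeddings, and the temperature value are all assumptions chosen for illustration, and the actual losses in the paper may differ in form.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(video_embs, diagram_embs, temperature=0.1):
    """Mean InfoNCE loss, assuming video i is paired with diagram i.

    Every other diagram in the list acts as a negative for video i.
    The temperature of 0.1 is an illustrative assumption.
    """
    losses = []
    for i, video in enumerate(video_embs):
        logits = [cosine(video, d) / temperature for d in diagram_embs]
        # Numerically stable log-sum-exp over all candidate diagrams.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of picking the matching diagram.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Hypothetical toy embeddings: two video clips and two diagram steps.
videos = [[1.0, 0.0], [0.0, 1.0]]
diagrams_aligned = [[0.9, 0.1], [0.1, 0.9]]   # step i resembles clip i
diagrams_swapped = [[0.1, 0.9], [0.9, 0.1]]   # pairs deliberately mismatched

aligned_loss = info_nce(videos, diagrams_aligned)
swapped_loss = info_nce(videos, diagrams_swapped)
print(f"aligned: {aligned_loss:.4f}  swapped: {swapped_loss:.4f}")
```

In this sketch the loss is lower when each clip's embedding sits close to its own diagram step and far from the others, which is the behaviour a contrastive objective is meant to encourage.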

Jiahao is currently a PhD student at the Australian National University. His research focuses on human action recognition and multi-modal representation alignment. We first met (virtually) when Jiahao did his Honours project, where he developed an amazing (and super useful) video annotation tool, ViDaT. His strong software engineering and web development background gives him a clear advantage in his research projects. Even though we have never met in person (yet), we are actively collaborating and I already know what he is cooking up next. I hope to share it with the world soon.

AUTHORS
Jiahao Zhang, Anoop Cherian, Yanbin Liu, Yizhak Ben-Shabat, Cristian Rodriguez, Stephen Gould

RELATED PAPERS
📚IKEA ASM Dataset
📚CLIP
📚SlowFast

LINKS AND RESOURCES
📚Paper
💻Project page
💻Dataset page
💻Code

SPONSOR
This episode was sponsored by YOOM. YOOM is an Israeli startup dedicated to volumetric video creation. It was voted the best start-up to work for in 2022 by Dun’s 100.
Join their team that works on geometric deep learning research, implicit representations of 3D humans, NeRFs, and 3D/4D generative models.

Visit YOOM

For job opportunities with YOOM visit https://www.yoom.com/careers/

CONTACT

If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com

This episode was recorded on May 1st, 2023.

#talkingpapers #CVPR2023 #IAWDataset #ComputerVision #AI #ActionRecognition #DeepLearning #MachineLearning #research #artificialintelligence #podcasts

🎧Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikbs.com

📧Subscribe to our mailing list: http://eepurl.com/hRznqb

🐦Follow us on Twitter: https://twitter.com/talking_papers

🎥YouTube Channel: https://bit.ly/3eQOgwP
