Show Notes
(02:07) Emeli shared her educational background getting degrees in Applied Mathematics and Informatics from the Peoples’ Friendship University of Russia in the early 2010s.
(04:24) Emeli went over her experience getting a Master’s Degree at Yandex School of Data Analysis.
(07:06) Emeli reflected on lessons learned from her first job out of university working as a Software Developer at Rambler, one of the biggest Russian web portals.
(09:33) Emeli walked through her first year as a Data Scientist developing e-commerce recommendation systems at Yandex.
(13:38) Emeli discussed core projects accomplished as the Chief Data Scientist at Yandex Data Factory, Yandex’s end-to-end data platform.
(17:52) Emeli shared her learnings transitioning from an IC to a manager role.
(19:21) Emeli mentioned key components of success for industrial AI, given her time as the co-founder and Chief Data Scientist at Mechanica AI.
(22:40) Emeli dissected the makings of her Coursera specializations: “Machine Learning and Data Analysis” and “Big Data Essentials.”
(26:14) Emeli discussed her teaching activities at Moscow Institute of Physics and Technology, Yandex School of Data Analysis, Harbour.Space, and Graduate School of Management, St. Petersburg State University.
(30:12) Emeli shared the story behind the founding of Evidently AI, which is building a human interface to machine learning, so that companies can trust, monitor, and improve the performance of their AI solutions.
(32:32) Emeli explained the concept of model monitoring and exposed the monitoring gap in the enterprise (read Part 1 and Part 2 of the Monitoring series).
(34:13) Emeli looked at possible data quality and integrity issues while proposing how to track them (read Part 3, Part 4, and Part 5 of the Monitoring series).
(36:47) Emeli revealed the pros and cons of building an open-source product.
(39:13) Emeli talked about prioritizing the product roadmap for Evidently AI.
(41:24) Emeli described the data community in Moscow.
(42:03) Closing segment.
Emeli’s Contact Info
LinkedIn
Twitter
Coursera
GitHub
Medium
Evidently AI’s Resources
Website
Twitter
LinkedIn
GitHub
Documentation
Blog Posts
ML Monitoring, Part 1: What Is It and How It Differs? (Aug 2020)
ML Monitoring, Part 2: Who Should Care and What We Are Missing? (Aug 2020)
ML Monitoring, Part 3: What Can Go Wrong With Your Data? (Sep 2020)
ML Monitoring, Part 4: How To Track Data Quality and Data Integrity? (Oct 2020)
ML Monitoring, Part 5: Why Should You Care About Data And Concept Drift? (Nov 2020)
ML Monitoring, Part 6: Can You Build a Machine Learning Model to Monitor Another Model? (April 2021)
Courses
“Machine Learning and Data Analysis”
“Big Data Essentials”
People
Yann LeCun (Professor at NYU, Chief AI Scientist at Facebook)
Tomas Mikolov (the creator of Word2Vec, ex-scientist at Google and Facebook)
Andrew Ng (Professor at Stanford, Co-Founder of Google Brain, Coursera, and Landing AI, Ex-Chief Scientist at Baidu)
Book
“The Elements of Statistical Learning” (by Trevor Hastie, Robert Tibshirani, and Jerome Friedman)
Since the podcast was recorded, a lot has happened at Evidently! You can use this open-source tool (https://github.com/evidentlyai/evidently) to generate a variety of interactive reports on the ML model performance and integrate it into your pipelines using JSON profiles.
This monitoring tutorial is a great showcase of what can go wrong with your models in production and how to keep an eye on them: https://evidentlyai.com/blog/tutorial-1-model-analytics-in-production.
About The Show
Datacast features long-form conversations with practitioners and researchers in the data community to walk through their professional journeys and unpack the lessons learned along the way. I invite guests from a wide range of career paths, from scientists and analysts to founders and investors, to analyze the case for using data in the real world and extract their mental models (“the WHY”) behind their pursuits. Hopefully, these conversations can serve as valuable tools for early-stage data professionals as they navigate their own careers in the exciting data universe.
Datacast is produced and edited by James Le. Get in touch with feedback or guest suggestions by emailing firstname.lastname@example.org.
Subscribe by searching for Datacast wherever you get podcasts or click one of the links below:
Listen on Spotify
Listen on Apple Podcasts
Listen on Google Podcasts
If you're new, see the podcast homepage for the most recent episodes to listen to, or browse the full guest list.