NOTE: The event will be held in Central Daylight Time (CDT), UTC-5.

Open Source Summit + Embedded Linux Conference North America 2020
Wednesday, July 1 • 4:05pm - 4:55pm
Horovod: Distributed Deep Learning for Reliable MLOps at Uber - Travis Addair, Uber Technologies, Inc.

At Uber, deep learning powers a growing number of business-critical use cases, including ETA prediction, dynamic pricing, fraud detection, and autonomous vehicles. New data coming in every day is used to build bigger and more accurate models, but at the cost of longer training times on a single machine. Data processing tools like Spark are used to scale data extraction, transformation, and feature engineering, but efficiently training a deep neural network requires specialized hardware and cannot be arbitrarily parallelized.

In this talk, you will learn how Uber is solving these challenges using Horovod, the open source framework created to make distributed training of deep neural networks fast and easy for TensorFlow, PyTorch, and MXNet models. You’ll see how Horovod Spark Estimators are used to seamlessly insert deep learning into Spark pipelines, dynamically blending CPU and GPU resources with Spark 3’s resource-aware scheduling.
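
As a rough illustration of the data-parallel idea behind this kind of distributed training, the sketch below simulates two workers in plain Python. Each worker computes a gradient on its own shard of the data, the gradients are averaged (the role an allreduce plays), and every worker applies the same update so the model replicas stay in sync. This is not Horovod's API and not code from the talk: the function names, the toy linear model, and the data are all assumptions made for illustration.

```python
# Illustrative simulation only: plain Python, no Horovod import.
# All names (local_gradient, allreduce_average, train) are hypothetical.

def local_gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def allreduce_average(values):
    """Stand-in for an allreduce: every worker receives the same average."""
    avg = sum(values) / len(values)
    return [avg] * len(values)


def train(shards, steps=200, lr=0.01):
    # Each simulated worker holds its own replica of the single weight,
    # and all replicas start from the same value.
    weights = [0.0] * len(shards)
    for _ in range(steps):
        # 1. Every worker computes a gradient on its local shard only.
        grads = [local_gradient(w, s) for w, s in zip(weights, shards)]
        # 2. Gradients are averaged across workers.
        synced = allreduce_average(grads)
        # 3. Every worker applies the identical averaged update.
        weights = [w - lr * g for w, g in zip(weights, synced)]
    return weights


# Toy data generated from y = 3x, split evenly across two simulated workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[0::2], data[1::2]]
weights = train(shards)
print(weights)
```

Because both shards are the same size, the averaged gradient equals the full-batch gradient, so the replicas follow the same trajectory a single machine would while each one touches only half of the data per step.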

This talk is for machine learning engineers and data scientists looking to speed up their model training lifecycle, and MLOps engineers interested in adding deep learning as a first-class citizen to their production ML pipelines.

Travis Addair

Senior Software Engineer II, Uber Technologies
Travis Addair is a software engineer at Uber working on the Michelangelo machine learning platform. He leads the Horovod project and chairs its Technical Steering Committee within the Linux Foundation. In the past, he’s worked on scaling machine learning systems at Google and...

Wednesday July 1, 2020 4:05pm - 4:55pm CDT
AI/ML/DL Theater