Introduction to Koalas: pandas API on Apache Spark and MLflow
Time: 3-4:30PM EST
Workshop Leaders:
Rexwell Minnis, Director of Software Engineering, Capital One
Andrew Gertsog, Sr Lead ML Engineer, Capital One
Pramod Lahoti, Master Software Engineer, Capital One
Arunadevi Inamdar, Senior Data Engineer, Capital One
This workshop is intended to demystify machine learning with Apache Spark. The biggest barrier people face is Spark’s distributed nature, which scares many of them away. We’ll show that there’s nothing to be afraid of and that the transition to machine learning at scale can be smooth, thanks to a set of well-known APIs that are considered industry standards. Koalas enables data engineers and data scientists who do not work in Scala or Java to become productive very quickly. Combining Spark’s big data capabilities with its machine learning features lets data scientists run algorithms on larger datasets and process data closer to real time, supporting real-time analysis.
All workshop attendees, please create a Databricks Community Edition account before attending the workshop.
This session is sponsored by Capital One.
There is no need to register for this individual session. To register for Datapalooza, go here.