https://github.com/jason-dai/cvpr2020
- Host: GitHub
- URL: https://github.com/jason-dai/cvpr2020
- Owner: jason-dai
- License: apache-2.0
- Created: 2020-03-22T12:48:15.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2020-07-07T13:41:59.000Z (over 4 years ago)
- Last Synced: 2024-11-16T15:05:52.793Z (2 months ago)
- Language: HTML
- Size: 8.07 MB
- Stars: 6
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
## Automated ML Workflow for Distributed Big Data Using Analytics Zoo
___

## Speaker
[Jason Dai](https://jason-dai.github.io/)

## Schedule
_2-5PM (Pacific Time), June 19, 2020_

## Description
Applying machine learning (ML) techniques to distributed big data analytics plays a central role in today’s intelligent applications and systems. These problem settings have pushed the field to address issues of data scale that were almost inconceivable even a decade ago for AI researchers. In addition, building machine learning applications for these big data problems can also be a laborious and knowledge-intensive process for ML engineers.

To address these challenges, we have open sourced [Analytics Zoo](https://github.com/intel-analytics/analytics-zoo), which helps users build and productionize end-to-end ML workflows for distributed big data in an automated fashion. Using Analytics Zoo, users can simply build conventional Python notebooks on their laptops (with possible AutoML support), which can then automatically scale out to large clusters and process large amounts of data in a distributed fashion.

This tutorial will present how to implement the automated ML workflow for big data (with a focus on supporting computer vision models and pipelines), by seamlessly integrating different technologies including deep learning frameworks (e.g., TensorFlow, Keras, PyTorch), distributed analytics frameworks (e.g., Apache Spark, Apache Flink, Apache Kafka, Ray), and AutoML techniques (such as hyperparameter optimization). In addition, it will also share real-world experience and "war stories" of users who have adopted Analytics Zoo to address their challenges when applying ML techniques to distributed big data analytics.
## Tutorial
* SlideShare ([link](https://www.slideshare.net/jason-dai/automated-ml-workflow-for-distributed-big-data-using-analytics-zoo-cvpr2020-tutorial))
* Slides ([pdf](slides/AIonBigData_cvpr20.pdf))

## Link
* Related [tutorial](https://jason-dai.github.io/cvpr2018/) at [CVPR 2018](http://cvpr2018.thecvf.com) (YouTube video available)
* Related [tutorial](https://jason-dai.github.io/aaai2019) at [AAAI 2019](https://aaai.org/Conferences/AAAI-19/aaai19tutorials/#sp2)