https://github.com/omarhimada/floyo-ml-scala
Distributed ML for eCommerce platforms (recommendations, churn prediction, segmentation) written in Scala, using Spark MLlib, Elasticsearch and AWS SDK
- Host: GitHub
- URL: https://github.com/omarhimada/floyo-ml-scala
- Owner: omarhimada
- License: apache-2.0
- Created: 2020-06-28T14:45:01.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2020-07-11T03:54:08.000Z (over 4 years ago)
- Last Synced: 2024-12-17T01:32:53.750Z (2 months ago)
- Topics: aws, ml, scala, spark
- Language: Scala
- Homepage:
- Size: 91.8 KB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# floyo-ML-scala
Distributed ML for eCommerce platforms (recommendations, churn prediction, segmentation) written in Scala, using Spark MLlib, Kafka, Elasticsearch (elastic4s), and AWS.

**Note:** *ML models are trained via a batch download process from S3. Predictions are then made by applying the trained, persisted models to transactions streamed in real time via Kafka.*
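The split described in the note (train in batch, persist the model, then score streamed transactions) might look roughly like the sketch below. It uses plain Scala instead of Spark, Kafka, or S3 so that it runs on its own. `Transaction`, `ChurnModel`, `ChurnModelStore`, `StreamScoring`, the two features, and the comma-separated storage format are all hypothetical names and choices made for illustration, not taken from the repository. The weights are assumed to come from an earlier batch training run.

```scala
// Hypothetical stand-ins: none of these names come from the repository.
final case class Transaction(customerId: String, daysSinceLastOrder: Double, basketTotal: Double)

// A tiny logistic-regression style model: two weights and a bias,
// assumed to have been produced by a batch training job.
final case class ChurnModel(wDays: Double, wBasket: Double, bias: Double) {
  // Churn probability via the logistic (sigmoid) function.
  def predict(t: Transaction): Double = {
    val z = wDays * t.daysSinceLastOrder + wBasket * t.basketTotal + bias
    1.0 / (1.0 + math.exp(-z))
  }

  // Persist as a comma-separated string (a stand-in for saving a model to storage).
  def serialize: String = s"$wDays,$wBasket,$bias"
}

object ChurnModelStore {
  // Reload a persisted model; fails fast on malformed input.
  def load(s: String): ChurnModel = s.split(",").map(_.trim.toDouble) match {
    case Array(wDays, wBasket, bias) => ChurnModel(wDays, wBasket, bias)
    case other =>
      throw new IllegalArgumentException(s"expected 3 fields, got ${other.length}")
  }
}

object StreamScoring {
  // Score transactions lazily as they arrive.
  // The Iterator stands in for a Kafka consumer.
  def score(model: ChurnModel, stream: Iterator[Transaction]): Iterator[(String, Double)] =
    stream.map(t => t.customerId -> model.predict(t))
}
```

The point of the design is that the scoring path only needs the reloaded model, so training can run on its own schedule without touching the streaming side.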
## Process:
1. Drop your transactional/sales data into S3 to train the models
2. Trigger the training process
   - e.g.:
     - *AWS SNS trigger on S3*
     - *cron*
     - *SBT CLI*
3. Write new transactions to Kafka streams to make predictions in real time
   - Persisted ML models are used to make predictions from transactions read from the stream
4. Search/scroll the data written to Elasticsearch to automate a business process
   - e.g.:
     - *automate emails to key customer segments such as most loyal, small baskets, infrequent shoppers, etc.*
     - *recommend products intelligently on your eCommerce website*
     - *automate a push notification with a special offer to customers who are at risk of churning*

![Diagram](https://floyalty-ca.s3.ca-central-1.amazonaws.com/floyomlscala-diagram.svg)
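As a rough illustration of step 4, the sketch below turns prediction records into automated actions. It assumes the records have already been read back from Elasticsearch. The `Prediction` shape, the `Action` types, the `Automation` object, and the `churnThreshold` default of 0.7 are hypothetical and do not come from the project. The segment names are the ones used in the examples above.

```scala
// Hypothetical record shape; the documents actually written to Elasticsearch may differ.
final case class Prediction(customerId: String, segment: String, churnProbability: Double)

// Hypothetical actions a downstream automation could take.
sealed trait Action
final case class SegmentEmail(customerId: String, segment: String) extends Action
final case class ChurnOfferPush(customerId: String) extends Action

object Automation {
  // Segments that receive a targeted email.
  val emailSegments: Set[String] = Set("most loyal", "small baskets", "infrequent shoppers")

  // Map each prediction to zero or more actions, preserving input order.
  // churnThreshold is an assumed cut-off, not a value from the project.
  def plan(predictions: Seq[Prediction], churnThreshold: Double = 0.7): Seq[Action] =
    predictions.flatMap { p =>
      val email: Seq[Action] =
        if (emailSegments.contains(p.segment)) Seq(SegmentEmail(p.customerId, p.segment))
        else Seq.empty
      val push: Seq[Action] =
        if (p.churnProbability >= churnThreshold) Seq(ChurnOfferPush(p.customerId))
        else Seq.empty
      email ++ push
    }
}
```

Keeping the planning step as a pure function makes it easy to test separately from whatever sends the emails or push notifications.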
| Feature | Progress |
|-----------------------------------------------------------------------------------------|------------|
| Customer segmentation via K-Means++ | 0.1 |
| Churn prediction via logistic regression | In progress|
| Product recommendations via matrix factorization (collaborative filtering) | 0.1 |
| Email automation | To-do |
| Elasticsearch integration | 0.1 |
| S3 integration | 0.1 |
| Kibana dashboard & visualizations | To-do |
| Read transactions from stream in realtime to make predictions | 0.1 |
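
For readers unfamiliar with the "++" in K-Means++, it refers to how the initial cluster centers are chosen: the first center is picked uniformly at random, and each later center is picked with probability proportional to its squared distance from the nearest center chosen so far. The sketch below shows that seeding step in plain Scala. It is not the project's code, which relies on Spark MLlib (whose `KMeans` defaults to a parallel variant of this seeding, `k-means||`). The object and method names are hypothetical.

```scala
import scala.util.Random

// Hypothetical helper illustrating k-means++ seeding; not part of the repository.
object KMeansPlusPlus {
  type Point = Vector[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def sqDist(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Pick k initial centers from `points` using k-means++ seeding.
  def seed(points: Vector[Point], k: Int, rng: Random): Vector[Point] = {
    require(k >= 1 && k <= points.size, "k must be between 1 and the number of points")
    // First center: uniformly at random.
    var centers = Vector(points(rng.nextInt(points.size)))
    while (centers.size < k) {
      // Weight of each point = squared distance to its nearest chosen center.
      val weights = points.map(p => centers.map(c => sqDist(p, c)).min)
      val total = weights.sum
      val next =
        if (total == 0.0) {
          // Every point coincides with a chosen center; fall back to a uniform pick.
          points(rng.nextInt(points.size))
        } else {
          // Sample an index with probability proportional to its weight.
          val target = rng.nextDouble() * total
          val cumulative = weights.scanLeft(0.0)(_ + _).tail
          val idx = cumulative.indexWhere(_ > target)
          // Guard against floating-point rounding pushing target up to total.
          points(if (idx >= 0) idx else weights.lastIndexWhere(_ > 0.0))
        }
      centers = centers :+ next
    }
    centers
  }
}
```

A call such as `KMeansPlusPlus.seed(points, 3, new Random(42))` returns three starting centers drawn from `points`. A point that is already a center has weight zero and is never drawn again while any other point has a positive weight.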