Projects in Awesome Lists by lupusruber
A curated list of projects in awesome lists by lupusruber.
https://github.com/lupusruber/protein-function-annotation-project
This project implements models for Protein-Protein Interaction (PPI) prediction, focusing on graph-based methods such as the state-of-the-art GNN model called GIPA.
Last synced: 04 Apr 2025
https://github.com/lupusruber/rnmp_homework3
A project based on Spark Streaming and Kafka for processing health data in real time. Includes a machine learning pipeline for predictions, Dockerized infrastructure, and scripts for data ingestion, model training, and streaming pipelines.
docker-compose kafka model-selection-and-evaluation spark-streaming
Last synced: 04 Apr 2025
https://github.com/lupusruber/rnmp_homework2
A recommendation system project that trains and evaluates Spark MLlib's ALS model on the MovieLens dataset. Includes a Dockerized setup, hyperparameter tuning, and evaluation metrics (RMSE, Precision@K, Recall@K, NDCG) for performance insights.
docker mllib recommender-system spark
Last synced: 04 Apr 2025
https://github.com/lupusruber/rnmp_homework1
This project simulates message production and consumption using Kafka, with real-time data transformations via Flink, all running within a Docker environment. Requires: Docker, Git, and Python.
data-engineering docker flink kafka python
Last synced: 04 Apr 2025
https://github.com/lupusruber/crypto_stats
A project that provides a cloud-native solution for ingesting, transforming, and visualizing cryptocurrency data, utilizing modern tools and workflows for scalability and automation.
data-engineering data-streaming etl-pipeline gcp terraform
Last synced: 04 Apr 2025
https://github.com/lupusruber/music_analytics
This project processes real-time music event data using Kafka and Apache Spark on Google Cloud Dataproc, then stores the transformed data in BigQuery for analytics. The pipeline is orchestrated by Airflow and managed with Terraform.
bigquery data-proc dimensional-modeling gcp-project kafka spark-structured-streaming
Last synced: 28 Mar 2025
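
For readers unfamiliar with the evaluation metrics named in the rnmp_homework2 entry (RMSE, Precision@K, Recall@K, NDCG), the following is a minimal plain-Python sketch of how such metrics are commonly defined. It is not taken from the repository and does not use Spark. The function names, the binary-relevance assumption (an item is either relevant or not), and the sample data are all hypothetical choices made for illustration.

```python
import math
from typing import Sequence, Set


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k recommended items that are relevant."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant items that appear in the top-k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Normalized discounted cumulative gain with binary relevance (assumed)."""
    # DCG: each relevant item at 0-based position i contributes 1 / log2(i + 2).
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, item in enumerate(ranked[:k])
        if item in relevant
    )
    # Ideal DCG: all relevant items (up to k) placed at the top of the ranking.
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: four recommended items, three truly relevant ones.
ranked_items = ["a", "b", "c", "d"]
relevant_items = {"a", "c", "e"}

print(precision_at_k(ranked_items, relevant_items, 3))
print(recall_at_k(ranked_items, relevant_items, 3))
print(ndcg_at_k(ranked_items, relevant_items, 3))
print(rmse([3.0, 4.0], [1.0, 4.0]))
```

In this sketch Precision@K divides by K and Recall@K divides by the number of relevant items. Other conventions exist, so results may differ from what a given library reports.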