Projects in Awesome Lists by lupusruber

A curated list of projects in awesome lists by lupusruber.

https://github.com/lupusruber/protein-function-annotation-project

This project implements models for Protein-Protein Interaction (PPI) prediction, focusing on graph-based methods such as the state-of-the-art GNN model called GIPA.

dgl gnns torchgeometric

Last synced: 04 Apr 2025

https://github.com/lupusruber/quantum

Last synced: 14 Mar 2025

https://github.com/lupusruber/rnmp_homework3

A Spark Streaming and Kafka-based project for processing health data in real time. Includes a machine learning pipeline for predictions, Dockerized infrastructure, and scripts for data ingestion, model training, and streaming pipelines.

docker-compose kafka model-selection-and-evaluation spark-streaming

Last synced: 04 Apr 2025

https://github.com/lupusruber/rnmp_homework2

A recommendation system project that trains and evaluates Spark MLlib's ALS model on the MovieLens dataset. Includes a Dockerized setup, hyperparameter tuning, and evaluation metrics (RMSE, Precision@K, Recall@K, NDCG) for performance insights.

docker mllib recommender-system spark

Last synced: 04 Apr 2025

https://github.com/lupusruber/rnmp_homework1

This project simulates message production and consumption using Kafka, with real-time data transformations via Flink, all running within a Docker environment. Requires: Docker, Git, and Python.

data-engineering docker flink kafka python

Last synced: 04 Apr 2025

https://github.com/lupusruber/crypto_stats

A cloud-native solution for ingesting, transforming, and visualizing cryptocurrency data, using modern tools and workflows for scalability and automation.

data-engineering data-streaming etl-pipeline gcp terraform

Last synced: 04 Apr 2025

https://github.com/lupusruber/music_analytics

This project processes real-time music event data using Kafka and Apache Spark on Google Cloud Dataproc, then stores the transformed data in BigQuery for analytics, all orchestrated by Airflow and managed with Terraform.

bigquery data-proc dimensional-modeling gcp-project kafka spark-structured-streaming

Last synced: 28 Mar 2025

https://github.com/lupusruber/ai2023

Last synced: 04 Apr 2025