Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Data Science
Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
- GitHub: https://github.com/topics/data-science
- Wikipedia: https://en.wikipedia.org/wiki/Data_science
- Related Topics: data-analysis, data-mining, machine-learning, big-data, data-visualization
- Aliases: datasciences, data-science-project, data-science-algorithm
- Last updated: 2024-07-29 13:36:33 UTC
https://github.com/red-data-tools/unicode_plot.rb
Plot your data by Unicode characters
data-science data-visualization ruby
Last synced: 02 Aug 2024
https://github.com/analysiscenter/cardio
CardIO is a library for data science research of heart signals
data-science deep-learning deep-neural-networks healthcare machine-learning python
Last synced: 02 Aug 2024
https://github.com/asad70/reddit-sentiment-analysis
This program goes through Reddit, finds the most-mentioned tickers, and uses the VADER SentimentIntensityAnalyzer to calculate each ticker's compound value.
algotrading data-science data-science-projects data-visualization mentioned-tickers reddit reddit-sentiment-analysis sentiment sentiment-analysis stocks ticker-compound trading vader vader-sentiment-analysis vader-sentimentintensityanalyzer wallstreetbets
Last synced: 01 Aug 2024
https://github.com/Griperis/BlenderDataVis
Data visualisation addon for Blender
blender blender-addon chart data-science data-visualisation
Last synced: 03 Aug 2024
https://github.com/PKU-DAIR/Hetu
A high-performance distributed deep learning system targeting large-scale and automated distributed training.
artificial-intelligence autograd data-science deep-learning deep-neural-networks distributed-systems distributed-training embeddings gpu high-dimensional machine-learning python state-of-the-art
Last synced: 31 Jul 2024
https://github.com/justmarkham/trump-lies
Tutorial: Web scraping in Python with Beautiful Soup
beautiful-soup data-science dataset pandas python requests tutorial web-scraping
Last synced: 02 Aug 2024
https://github.com/jphall663/GWU_data_mining
Materials for GWU DNSC 6279 and DNSC 6290.
data-mining data-science data-visualization h2o image-processing image-recognition machine-learning python r sas text-mining
Last synced: 04 Aug 2024
https://github.com/samueldobbie/markup
A web-based document annotation tool, powered by GPT-4 🚀
active-learning annotation-tool data-labeling data-science gpt-4 machine-learning named-entity-recognition natural-language-processing ner nlp sequence-to-sequence text-annotation text-annotation-tool
Last synced: 31 Jul 2024
https://github.com/laresbernardo/lares
Analytics & Machine Learning R Sidekick
analytics api automation automl data-science descriptive-statistics h2o machine-learning marketing mmm predictive-modeling puzzle r r-package rlanguage robyn rstats visualization
Last synced: 05 Aug 2024
https://github.com/graphia-app/graphia
A visualisation tool for the creation and analysis of graphs
analysis data data-analysis data-science data-visualization graphs interpretation networks visualisation visualization
Last synced: 01 Aug 2024
https://github.com/dialnd/imbalanced-algorithms
Python-based implementations of algorithms for learning on imbalanced data.
data-science imbalanced-data machine-learning notre-dame python
Last synced: 01 Aug 2024
https://github.com/Yu-Group/covid19-severity-prediction
Extensive and accessible COVID-19 data + forecasting for counties and hospitals. 📈
coronavirus coronavirus-tracking county-health-data county-level covid-19 covid-19-data covid-19-data-analysis data-analysis data-science epidemic-model forecasting outbreak outbreak-severity python3 response4life risk-assessment risk-modelling statistics ventilator visualization
Last synced: 08 Aug 2024
https://github.com/Benjamin-Lee/deep-rules
Ten Quick Tips for Deep Learning in Biology
bioinformatics biology computational-biology data-science deep-learning genomics machine-learning manubot manuscript
Last synced: 02 Aug 2024
https://github.com/lgalke/vec4ir
Word Embeddings for Information Retrieval
data-science embedding-models embeddings evaluation information-retrieval natural-language-processing nlp retrieval-model similarity-scoring word-embeddings
Last synced: 02 Aug 2024
https://github.com/voxel51/voxelgpt
AI assistant that can query visual datasets, search the FiftyOne docs, and answer general computer vision questions
artificial-intelligence chatgpt computer-vision data-science deep-learning fiftyone langchain llm machine-learning openai python
Last synced: 02 Aug 2024
https://github.com/modal-labs/modal-client
Python client library for Modal
cloud data-science distributed machine-learning modal python serverless
Last synced: 31 Jul 2024
https://github.com/Minyus/pipelinex
PipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more
data-engineering data-science deep-learning experimentation machine-learning pipeline
Last synced: 31 Jul 2024
https://github.com/bgruening/docker-galaxy-stable
🐳📊📚 Docker images tracking the stable Galaxy releases.
data-science docker-image galaxy galaxyproject science
Last synced: 01 Aug 2024
https://github.com/koalaverse/homlr
Supplementary material for Hands-On Machine Learning with R, an applied book covering the fundamentals of machine learning with R.
data-science machine-learning r supervised-learning unsupervised-learning
Last synced: 31 Jul 2024
https://github.com/nickslevine/zebras
Data analysis library for JavaScript built with Ramda
data-analysis data-science functional-programming javascript pandas ramda
Last synced: 01 Aug 2024
https://github.com/analysiscenter/radio
RadIO is a library for data science research of computed tomography imaging
computed-tomography data-science deep-learning machine-learning medical-imaging neural-networks tensorflow
Last synced: 07 Aug 2024
https://github.com/Alex-Lekov/AutoML_Alex
State-of-the-art automated machine learning Python library for tabular data
auto-ml automatic-machine-learning automl cross-validation data-science data-science-projects hyperparameter-optimization hyperparameter-tuning machine-learning machine-learning-library machine-learning-models ml model-selection optimisation python sklearn stacking stacking-ensemble xgboost
Last synced: 05 Aug 2024
https://github.com/maxpumperla/learning_ray
Notebooks for the O'Reilly book "Learning Ray"
data-science deep-learning distributed-computing machine-learning notebook python ray
Last synced: 03 Aug 2024
https://github.com/vertica/VerticaPy
VerticaPy is a Python library that exposes scikit-like functionality for conducting data science projects on data stored in Vertica, taking advantage of Vertica’s speed and built-in analytics and machine learning capabilities.
big-data data-science data-visualization machine-learning preparation python python-library vertica
Last synced: 02 Aug 2024
https://github.com/project-codeflare/codeflare
Simplifying the definition, execution, scaling, and deployment of pipelines on the cloud.
automl data-science hyperparameter-optimization machine-learning pipelines ray sklearn workflows
Last synced: 02 Aug 2024
https://github.com/anki-code/xonsh-cheatsheet
Cheat sheet for the xonsh shell with copy-pastable examples. The best doc for new users.
awesome awesome-cheatsheet cheat-sheet cheat-sheets cheatsheet cheatsheets console data-science devops devops-scripts hacking shell terminal xonsh xontrib
Last synced: 31 Jul 2024
https://github.com/mukeshmithrakumar/Book_List
Python, Machine Learning, Deep Learning and Data Science Books
algorithms books data-science deep-learning free machine-learning python
Last synced: 02 Aug 2024
https://github.com/neurodata/hyppo
Python package for multivariate hypothesis testing
data-science hacktoberfest hypothesis-testing independence ksample-testing python
Last synced: 02 Aug 2024
https://github.com/zeno-ml/zeno
AI Data Management & Evaluation Platform
ai data-science evaluation evaluation-framework machine-learning python
Last synced: 01 Aug 2024
https://github.com/fastverse/fastverse
An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and Data Manipulation in R
c cpp data-aggregation data-manipulation data-science data-transformation high-performance low-dependency panel-data r rstats statistical-computing time-series weights
Last synced: 05 Aug 2024
https://github.com/ocademy-ai/machine-learning
Learn AI together, for free. AI learning and teaching resources for everyone.
ai data-engineering data-science deep-learning jupyter jupyter-notebook machine-learning ml mlops python scikit-learn visualization
Last synced: 01 Aug 2024
https://github.com/shaildeliwala/delbot
It understands your voice commands, searches news and knowledge sources, and summarizes and reads out content to you.
ai bot bots chatbot data-science flask natural-language-processing python
Last synced: 04 Aug 2024
https://github.com/google-aai/sc17
SuperComputing 2017 Deep Learning Tutorial
data-science deep-learning google-cloud-platform machine-learning tutorial
Last synced: 07 Aug 2024
https://github.com/data-dot-all/dataall
A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
aws aws-glue aws-lake-formation aws-s3 data data-science etl-framework lakeformation lakehouse redshift
Last synced: 13 Aug 2024
https://github.com/saimadhu-polamuri/DataAspirant_codes
Complete machine learning model codes
data-mining data-science machine-learning python
Last synced: 05 Aug 2024
https://github.com/xlang-ai/DS-1000
[ICML 2023] Data and code release for the paper "DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation".
benchmark code-generation data-science large-language-models semantic-parsing
Last synced: 09 Aug 2024
https://github.com/Fixy-TR/fixy
Our goal is to build an open source spelling assistant/checker that tackles many different problems in the Turkish NLP literature at once, proposes unique approaches, and fills the gaps left by existing work. It aims to correct spelling mistakes in users' texts with a deep learning approach while also performing semantic analysis of those texts, so that errors arising in that context can be detected and corrected as well.
acikhack2 ai artificial-intelligence bert data-science deep-learning deeplearning keras natural-language-processing neural-network neural-networks nlp python
Last synced: 02 Aug 2024
https://github.com/dataplane-app/dataplane
Dataplane is an Airflow-inspired unified data platform with additional data mesh and RPA capabilities to automate, schedule, and design data pipelines and workflows. Dataplane is written in Golang with a React front end.
airflow data data-analysis data-engineering data-integration data-pipelines data-science dataplane datawarehouse etl finance golang kubernetes pipelines robotics-process-automation rpa scheduler workflow workflow-automation workflows
Last synced: 02 Aug 2024
https://github.com/jgoerner/data-science-stack-cookiecutter
🐳📊🤓 Cookiecutter template to launch an awesome dockerized Data Science toolstack (incl. Jupyter, Superset, Postgres, Minio, AirFlow & API Star)
airflow apistar cookiecutter data-science docker docker-image jupyter minio postgres python superset
Last synced: 31 Jul 2024
https://github.com/A3Data/hermione
ML made simple
data-science hermione machine-learning python
Last synced: 31 Jul 2024
https://github.com/blobcity/python-for-data-science
A collection of Jupyter Notebooks for learning Python for Data Science.
data-science jupyter jupyter-notebook jupyter-notebooks learn-python python
Last synced: 02 Aug 2024
https://github.com/Laurae2/Laurae
Advanced High Performance Data Science Toolbox for R by Laurae
data-science laurae machine-learning r supervised-learning xgboost
Last synced: 07 Aug 2024
https://github.com/Speedml/speedml
Speedml is a Python package to speed start machine learning projects.
data-science machine-learning python
Last synced: 03 Aug 2024
https://github.com/danaugrs/go-tsne
t-Distributed Stochastic Neighbor Embedding (t-SNE) in Go
3d data-science dimensionality-reduction go machine-learning tsne unsupervised-learning visualization
Last synced: 02 Aug 2024
https://github.com/dylan-profiler/visions
Type System for Data Analysis in Python
data-analysis data-science hacktoberfest numpy pandas python spark type-inference type-system
Last synced: 03 Aug 2024
https://github.com/nteract/bookstore
📚 Notebook storage and publishing workflows for the masses
data-science notebook nteract scheduling storage versioned-buckets
Last synced: 01 Aug 2024
https://github.com/mvlearn/mvlearn
Python package for multi-view machine learning
data-science machine-learning multiview-learning python
Last synced: 02 Aug 2024
https://github.com/benedekrozemberczki/DANMF
A sparsity aware implementation of "Deep Autoencoder-like Nonnegative Matrix Factorization for Community Detection" (CIKM 2018).
autoencoder cikm clustering community-detection coordinate-descent danmf data-science deep-learning deepwalk dimensionality-reduction embedding gemsec machine-learning mnmf nmf node-embedding node2vec sklearn unsupervised-learning word2vec
Last synced: 31 Jul 2024
https://lge-arc-advancedai.github.io/auptimizer/
An automatic ML model optimization tool.
automated-machine-learning automl data-engineering data-science deep-learning hpo hyperparameter-optimization hyperparameter-tuning machine-learning neural-networks
Last synced: 01 Aug 2024
https://github.com/flyteorg/flytekit
Extensible Python SDK for developing Flyte tasks and workflows. Simple to get started and learn and highly extensible.
automation data data-science extensible flyte flyte-tasks hacktoberfest mlops pypi python sdk spark workflows
Last synced: 02 Aug 2024
https://github.com/h2oai/nitro
Create apps 10x quicker, without Javascript/HTML/CSS.
app apps data-analysis data-science developer-tools devtools graphics h2o-nitro low-code python ui ui-components user-interface web-application webapp widget-library widgets
Last synced: 01 Aug 2024
https://github.com/milaan9/DataScience_Interview_Questions
My Solutions to 120 commonly asked data science interview questions.
data-analysis data-science interview-preparation interview-questions machine-learning predective-modeling probability product-metrics python-jupyter-notebooks python-tutorial-github python4datascience statistical-inference tutor-milaan9
Last synced: 02 Aug 2024
https://github.com/agilescientific/striplog
Lithology and stratigraphic logs for wells or outcrop.
data-mining data-science geology petrophysics sedimentology swung-stack
Last synced: 30 Jul 2024
https://github.com/storieswithsiva/Data-Science-Resources
👨🏽‍🏫 You can learn about what data science is and why it's important in today's modern world. Are you interested in data science? 🔋
artificial-intelligence artificial-neural-networks data data-analysis data-analytics data-mining data-science data-science-resource data-science-resources data-scientist data-scientists data-visualization data-world datascience dataset learning learning-kit machine-learning python repository
Last synced: 01 Aug 2024
https://github.com/LGE-ARC-AdvancedAI/auptimizer
An automatic ML model optimization tool.
automated-machine-learning automl data-engineering data-science deep-learning hpo hyperparameter-optimization hyperparameter-tuning machine-learning neural-networks
Last synced: 03 Aug 2024
https://github.com/ideonate/cdsdashboards
JupyterHub extension for ContainDS Dashboards
bokeh data-science jupyter jupyterhub panel plotly-dash rshiny streamlit visualization
Last synced: 01 Aug 2024
https://github.com/PecanProject/pecan
The Predictive Ecosystem Analyzer (PEcAn) is an integrated ecological bioinformatics toolbox.
bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants r
Last synced: 03 Aug 2024
https://github.com/Esri/awesome-arcgis-developer
A curated list of resources to help you with ArcGIS development, APIs, SDKs, tools, and location services
arcgis arcgis-apis awesome awesome-list data-science developer developer-experience developer-tools developers gis location-intelligence location-services mapping productivity samples spatial-analysis web-development web-mapping
Last synced: 07 Aug 2024
https://github.com/yogeshhk/TeachingDataScience
Course notes for Data Science related topics, prepared in LaTeX
course-materials data-science deep-learning jupyter-notebooks latex machine-learning natural-language-processing open-source python
Last synced: 31 Jul 2024
https://github.com/aws/amazon-redshift-python-driver
Redshift Python Connector. It supports Python Database API Specification v2.0.
amazon-redshift aws-redshift data-analysis data-science
Last synced: 01 Aug 2024
https://github.com/AtomScott/SportsLabKit
A Python package for turning sports video into CSV files
computer-vision data-science football multi-object-tracking multiobject-tracking python soccer sports sports-analytics tracking
Last synced: 02 Aug 2024
https://github.com/ivnvxd/pyquest
Python everything Cheatsheet and a Journey to the land of Python programming
algorithms architecture cheatsheet concurrency data-science data-structures data-types database fundamentals jupyter-notebook learn oop python standard-library tutorial web-development
Last synced: 01 Aug 2024
https://github.com/analysiscenter/batchflow
BatchFlow helps you conveniently work with random or sequential batches of your data and define data processing and machine learning workflows even for datasets that do not fit into memory.
data-science machine-learning pipeline pipeline-framework python python3 workflow workflow-engine
Last synced: 01 Aug 2024
https://github.com/alan-turing-institute/skpro
A unified framework for tabular probabilistic regression and probability distributions in Python
ai data-science framework machine-learning prediction probabilistic-models probability-distributions python regression sklearn
Last synced: 29 Jul 2024
https://github.com/Toloka/crowd-kit
Control the quality of your labeled data with the Python tools you already know.
aggregations annotation crowd crowdsourcing data-mining data-science labeling python quality-control toloka truth-inference
Last synced: 31 Jul 2024
https://github.com/launchflow/buildflow
BuildFlow is an open source framework for building large-scale systems using Python. All you need to do is describe where your input is coming from and where your output should be written, and BuildFlow handles the rest. No configuration outside of the code is required.
batch data-science pipeline python streaming
Last synced: 06 Aug 2024
https://github.com/ActivitySim/activitysim
An Open Platform for Activity-Based Travel Modeling
activitysim bsd-3-clause data-science microsimulation python travel-modeling
Last synced: 31 Jul 2024
https://github.com/dair-ai/dair-ai.github.io
Home of DAIR.AI
ai data-science education machine-learning nlp
Last synced: 28 Aug 2024
https://github.com/jthomasmock/gtExtras
A Collection of Helper Functions for the gt Package.
data-science data-visualization datascience ggplot2 gt plots r rstats sparkline sparkline-graphs sparklines tables
Last synced: 13 Aug 2024
https://github.com/explosion/jupyterlab-prodigy
🧬 A JupyterLab extension for annotating data with Prodigy
active-learning annotation annotation-tool artificial-intelligence computer-vision data-annotation data-science jupyter jupyterlab labeling-tool machine-learning machine-teaching natural-language-processing nlp prodigy spacy
Last synced: 07 Aug 2024
https://github.com/plotly/dash-oil-and-gas-demo
Dash Demo App - New York Oil and Gas
dash data-science data-visualization energy plotly python technical-computing
Last synced: 05 Aug 2024
https://github.com/tonybeltramelli/Deep-Spying
Spying using Smartwatch and Deep Learning
data-science deep-learning neural-networks privacy recurrent-neural-networks security wearable-devices
Last synced: 07 Aug 2024
https://github.com/TMiguelT/PandasSchema
A validation library for Pandas data frames using user-friendly schemas
data-science pandas schema validation
Last synced: 07 Aug 2024
https://github.com/blue-season/pywarm
A cleaner way to build neural networks for PyTorch.
clean-code data-science deep-learning keras machine-learning neural-network neural-networks python3 pytorch
Last synced: 03 Aug 2024
https://github.com/seg/2016-ml-contest
Machine learning contest - October 2016 TLE
contest data-science fun geophysics geoscience machine-learning
Last synced: 07 Aug 2024
https://github.com/multimeric/PandasSchema
A validation library for Pandas data frames using user-friendly schemas
data-science pandas schema validation
Last synced: 02 Aug 2024
https://github.com/nshiab/simple-data-analysis.js
Easy-to-use and high-performance JavaScript library for data analysis.
data data-analysis data-science duckdb javascript nodejs typescript
Last synced: 12 Aug 2024
https://github.com/robmarkcole/HASS-data-detective
Explore and analyse your Home Assistant data
data data-science home home-assistant home-automation
Last synced: 01 Aug 2024
https://github.com/nshiab/simple-data-analysis
Easy-to-use and high-performance JavaScript library for data analysis.
data data-analysis data-science duckdb javascript nodejs typescript
Last synced: 31 Jul 2024
https://github.com/microsoft/finnts
Microsoft Finance Time Series Forecasting Framework (FinnTS) is a forecasting package that utilizes cutting-edge time series forecasting and parallelization on the cloud to produce accurate forecasts for financial data.
business data-science feature-selection finance finnts forecasting machine-learning microsoft r r-package rstats time-series
Last synced: 13 Aug 2024
https://github.com/swoop-inc/spark-alchemy
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
data-engineering data-science scala spark
Last synced: 06 Aug 2024
https://github.com/d5555/TagEditor
🏖TagEditor - Annotation tool for spaCy
annotation annotation-tool coreference-resolution data-science labeling-tool machine-learning named-entities named-entity-recognition natural-language-processing neural-networks neuralcoref nlp spacy spacy-visualizer tagging-tool text-annotation text-tagging training-data
Last synced: 03 Aug 2024
https://github.com/eurostat/gridviz
A package for visualizing gridded data 🌐
cartography csv d3 data data-analysis data-science data-visualization datascience geospatial gis gridded-statistics grids gridviz map map-making mapping mapping-tools maps visualization webgl
Last synced: 04 Aug 2024
https://github.com/coqui-ai/Trainer
🐸 - A general purpose model trainer, as flexible as it gets
ai data-science deep-learning machine-learning pytorch
Last synced: 07 Aug 2024
https://kevinheavey.github.io/modern-polars/
Code and data for the Modern Polars book
data-analytics data-engineering data-science dataengineering pandas polars python
Last synced: 04 Aug 2024
https://github.com/SETL-Framework/setl
A simple Spark-powered ETL framework that just works 🍺
big-data data-analysis data-engineering data-science data-transformation dataset etl etl-pipeline framework machine-learning modularization pipeline scala setl spark
Last synced: 01 Aug 2024
https://github.com/Azure/DataScienceVM
Tools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)
ai azure big-data data-analysis data-science deep-learning dsvm machine-learning ml python r sqlserver
Last synced: 08 Aug 2024
https://github.com/capeprivacy/cape-python
Privacy transformations on Spark and Pandas dataframes backed by a simple policy language.
collaboration data-science hacktoberfest machine-learning pandas policy privacy python spark
Last synced: 03 Aug 2024
https://github.com/Oxen-AI/Oxen
Oxen.ai's core Rust library, server, and CLI
artificial-intelligence data-science database machine-learning version-control
Last synced: 17 Aug 2024
https://github.com/fedora-infra/fedmsg
Federated Messaging with ZeroMQ
data-science fedora-project message-bus python zeromq
Last synced: 20 Aug 2024
https://github.com/kdr-aus/ogma
Scripting language focused on processing tabular data.
data-science language rust scripting-language table-data
Last synced: 31 Jul 2024
https://github.com/maxheld83/ghactions
GitHub actions for R and accompanying R package
cicd continous-delivery continous-integration data-science devops github github-actions rstats setup
Last synced: 05 Aug 2024
https://github.com/kevin-hanselman/dud
A lightweight CLI tool for versioning data alongside source code and building data pipelines.
data-engineering data-pipelines data-science dataset dvcs machine-learning mlops
Last synced: 31 Jul 2024
https://github.com/dlab-berkeley/Python-Fundamentals-Legacy
D-Lab's 12-hour introduction to Python. Learn how to create variables and functions, use control flow structures, use libraries, import data, and more, using Python and Jupyter Notebooks.
data-science introduction-to-python jupyter python
Last synced: 02 Aug 2024
https://github.com/google/starthinker
Reference framework for building data workflows provided by Google. Accelerates authentication, logging, scheduling, and deployment of solutions using GCP. To borrow a tagline: "The framework for professionals with deadlines."
airflow app-engine automation bigquery cloud-functions cm360 colab-notebook data-science django dv360 google-ads google-analytics logger python scheduler ui workflows
Last synced: 04 Aug 2024
https://github.com/Automunge/AutoMunge
Tabular feature encoding pipelines for machine learning with options for string parsing, missing data infill, and stochastic perturbations.
Last synced: 31 Jul 2024
https://learnbyexample.github.io/py_resources/
Collection of Python learning resources
curated-list data-science learning machine-learning python resources scientific-computing
Last synced: 01 Aug 2024
https://github.com/unnati-xyz/scalable-data-science-platform
Content for architecting a data science platform for products using Luigi, Spark & Flask.
data-engineer data-pipeline data-science luigi machine-learning rest-api spark
Last synced: 07 Aug 2024
https://github.com/jgoerner/beyond-jupyter
🐍💻📊 All material from the PyCon.DE 2018 Talk "Beyond Jupyter Notebooks - Building your own data science platform with Python & Docker" (incl. Slides, Video, Udemy MOOC & other References)
airflow apache apistar data-science docker docker-compose jupyter jupyter-notebook minio postgres superset
Last synced: 31 Jul 2024
https://github.com/curiousily/Machine-Learning-from-Scratch
Succinct Machine Learning algorithm implementations from scratch in Python, solving real-world problems (Notebooks and Book). Examples of Logistic Regression, Linear Regression, Decision Trees, K-means clustering, Sentiment Analysis, Recommender Systems, Neural Networks and Reinforcement Learning.
artificial-intelligence book classification data-science machine-learning machine-learning-algorithms neural-networks notebook recommender-systems regression reinforcement-learning sentiment-analysis
Last synced: 08 Aug 2024