Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Data Science
Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
- GitHub: https://github.com/topics/data-science
- Wikipedia: https://en.wikipedia.org/wiki/Data_science
- Related Topics: data-analysis, data-mining, machine-learning, big-data, data-visualization
- Aliases: datasciences, data-science-project, data-science-algorithm
- Last updated: 2024-11-12 00:06:57 UTC
https://github.com/andrea-ballatore/open-geo-data-education
Open Geospatial Datasets for GIS Education: This is a repository of open geospatial datasets to be used in an educational context. I created these files over years of teaching Geographic Data Science and GIS. All original datasets are freely available online with open data licenses (see the dataset attribution for details). All the datasets in this repository have been selected, cleaned, harmonised, and repackaged for GIS exercises in a higher-education context. This is a pretty time-intensive process that other educators can hopefully avoid by using these versions.
data-science geojson geospatial-data geospatial-datasets gis gis-data gis-education tsv
Last synced: 27 Oct 2024
https://github.com/hemansnation/data-analyst-roadmap
Data-Analyst-Roadmap for Professionals. This roadmap contains 8 Chapters that can be completed in 8 weeks, whether you are a fresher in the field or an experienced professional who wants to transition into Data Analysis.
analytics data-analysis data-analysis-python data-analytics data-science numpy predictive-analytics project-based-learning python statistics tableau
Last synced: 08 Nov 2024
https://github.com/ropensci/gittargets
Data version control for reproducible analysis pipelines in R with {targets}.
data-science data-version-control data-versioning r r-package reproducibility reproducible-research rstats targets workflow
Last synced: 31 Oct 2024
https://github.com/mahmoudparsian/pyspark-algorithms
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
algorithms big-data data data-abstractions data-science dataframe distributed-computing graphframes mapreduce monoid nosql partitioning pyspark pyspark-algorithms python rdd spark transformations
Last synced: 06 Nov 2024
https://github.com/ekramasif/basic-machine-learning
This is a repo of the basic Machine Learning that I have learned. More to come...
ann artficial-neural-network artificial-intelligence bert-embeddings bert-model blstm collaborate data-science deep-learning embeddings keras lstm machine-learning natural-language-processing neural-network nlp pandas python seaborn tensorflow
Last synced: 26 Oct 2024
https://github.com/imdeepmind/neuralpy
NeuralPy: A Keras-like deep learning library that works on top of PyTorch
data-science deep-learning keras library machine-learning neural-network neuralpy neuralpy-torch python pytorch
Last synced: 13 Nov 2024
https://github.com/produvia/ai-platform
An open-source platform for automating tasks using machine learning models
artificial-intelligence automation data-science deep-learning java keras-models machine-learning model-zoo neural-networks python pytorch-models r task tasks tensorflow-models
Last synced: 26 Sep 2024
https://github.com/FlyRanch/figurefirst
A layout-first approach to figure making
data-science inkscape inkscape-extensions matplotlib plotting python svg
Last synced: 03 Aug 2024
https://github.com/OpenSTEF/openstef
Automated Machine Learning pipelines. Builds the Open Short Term Energy Forecasting package.
data-science energy energy-forecasting forecasting machine-learning python time-series
Last synced: 03 Aug 2024
https://github.com/datakitchen/data-observability-installer
Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team is the first to know and the first to solve with visibility across and down your data estate. Save time with simple, fast data quality test generation and execution. Trust your data, tools, and systems end to end.
data data-engineering data-observability data-profiling data-quality data-reliability data-science datachecker datacleaner datacleaning dataops dataquality datatesting datavalidation mssql pipeline-tests postgresql redshift self-hosted snowflake
Last synced: 12 Oct 2024
https://github.com/woz-u/DS-Student-Resources
Data Science Student Companion Notebooks and Data Lake
data-analysis data-science data-visualization machine-learning nosql python r sql statistics
Last synced: 08 Aug 2024
https://github.com/trademaster-ntu/fintech-literature
Fintech literature, including journals, conferences, books, and useful links
artificial-intelligence data-science machine-learning natural-language-processing quantitative-finance reinforcement-learning
Last synced: 10 Nov 2024
https://github.com/5agado/conversation-analyzer
Analyzer and statistics generator for text-based conversations. Includes Facebook scraper and parser
data-science facebook quantified-self scraper
Last synced: 08 Nov 2024
https://github.com/Invictify/Jupter-Notebook-REST-API
Run your jupyter notebooks as a REST API endpoint. This isn't a jupyter server but rather just a way to run your notebooks as a REST API Endpoint.
data-science data-science-pipelines docker dockerfile fastapi jupyter python rest-api
Last synced: 26 Oct 2024
https://github.com/zjuearthdata/geochemistrypi
An open-source, highly automated machine learning Python framework for data-driven geochemistry discovery
dash data-science fastapi flaml geochemistry mlflow nodejs ray reactjs scikit-learn typer
Last synced: 10 Oct 2024
https://github.com/weiji14/zen3geo
The 🌏 data science library you've been waiting for~
analysis-ready-data cloud-native cloud-optimized-geotiff composition data-science datapipe earth-observation foss4g geospatial machine-learning-ready-data stac torch torchdata zarr zen
Last synced: 31 Oct 2024
https://github.com/ECSIM/pem-dataset1
Proton Exchange Membrane (PEM) Fuel Cell Dataset
activation-procedure chemistry data data-science dataset electrochemistry energy fuel-cell impedance mea nafion open-science open-source pem physics polarization power proton-exchange-membrane science science-research
Last synced: 03 Aug 2024
https://github.com/dominodatalab/domino-research
Projects developed by Domino's R&D team
data-science mlflow mlops python sagemaker
Last synced: 13 Aug 2024
https://github.com/ibm/kafka-streaming-click-analysis
Use Kafka and Apache Spark streaming to perform click stream analytics
apache-spark clickstream data-science ibm-data-science-experience ibmcode jupyter-notebook kafka spark structured-streaming
Last synced: 28 Sep 2024
https://github.com/Erfaniaa/crypto-trading-strategy-backtester
Easy-to-use cryptocurrency trading strategy simulator and backtester
backtesting backtesting-trading-strategies binance bitcoin crypto cryptocurrency data-science dataset dataset-generation machine-learning python quantitative-finance quantitative-trading simulation time-series trading trading-strategies
Last synced: 09 Nov 2024
https://github.com/psyplot/psyplot
Python package for interactive data visualization
cartopy climate data-science earth-science earth-system-model interactive matplotlib models netcdf python regression visualization
Last synced: 08 Nov 2024
https://github.com/rodrigo-arenas/pyworkforce
Standard tools for workforce management, queuing, scheduling, rostering and optimization problems.
begginer-friendly data-science erlangc investigation-of-operation investigations-search looking-for-contributors operations-research optimization ortools python schedule scheduling-algorithms up-for-grabs workforce workforce-management
Last synced: 13 Nov 2024
https://github.com/jonrau1/SyntheticSun
SyntheticSun is a defense-in-depth security automation and monitoring framework which utilizes threat intelligence, machine learning, managed AWS security services, and serverless technologies to continuously prevent, detect, and respond to threats.
anomaly-detection automation aws aws-security aws-serverless data-science data-visualization elasticsearch geolocation guardduty incident-response kibana machine-learning misp sagemaker security-automation security-tools serverless threat-detection threat-intelligence
Last synced: 04 Aug 2024
https://github.com/dataprofessor/python
Python code from tutorials on the Data Professor YouTube channel
data-professor data-science dataprofessor datascience machine-learning machinelearning machinelearning-python python python-tutorial
Last synced: 11 Nov 2024
https://github.com/aws-samples/aws-fargate-with-rstudio-open-source
This project delivers AWS CDK Python code to provision serverless infrastructure in AWS Cloud to run Open Source RStudio Server and Shiny.
amazon-athena amazon-ecr amazon-ecs amazon-efs amazon-route53 amazon-s3 amazon-ses amazon-vpc aws-cdk aws-codepipeline aws-datasync aws-fargate-application aws-kms aws-lambda aws-secrets-manager aws-wafv2 data-science rstudio-server shiny-apps
Last synced: 13 Aug 2024
https://github.com/janishar/data-analytics-project-template
A python project starter template for data-analytics and data-science.
ai anaconda conda data-analysis data-analytics data-science jupyter-notebook keras matplotlib notebook numpy pandas project-starter-kit python python3 tensorflow
Last synced: 02 Nov 2024
https://github.com/tirthajyoti/synthetic-data-gen
Various methods for generating synthetic data for data science and ML
classification data data-science machine-learning python regression symbolic-computation time-series
Last synced: 22 Oct 2024
https://github.com/PetoLau/petolau.github.io
Blog about time series data mining in R.
artificial-intelligence blog data-analysis data-mining data-science data-visualization forecasting machine-learning r time-series time-series-analysis time-series-clustering time-series-data-mining time-series-forecasting time-series-prediction
Last synced: 11 Nov 2024
https://github.com/TomasBeuzen/python-programming-for-data-science
Content from the University of British Columbia's Master of Data Science course DSCI 511.
data-manipulation data-science numpy pandas programming python teaching
Last synced: 07 Aug 2024
https://github.com/nbarrowman/vtree
An R package for calculating and drawing variable trees
data-science data-visualization exploratory-data-analysis r statistics
Last synced: 31 Oct 2024
https://github.com/microsoft/coml
Interactive coding assistant for data scientists and machine learning developers, empowered by large language models.
automated-machine automl copilot data-science hyperparameter-optimization jupyter jupyter-lab large-language-models llm machine-learning
Last synced: 07 Oct 2024
https://github.com/kianweelee/Edator
A python package that performs exploratory data analysis for users. Additionally, it generates 3 types of output files (cleaned CSV, plots and a text report).
data-analysis data-science exploratory-data-analysis
Last synced: 03 Aug 2024
https://github.com/capitalone/dataCompareR
dataCompareR is an R package that allows users to compare two datasets and view a report on the similarities and differences.
compare-data data data-analysis data-science r
Last synced: 13 Aug 2024
https://github.com/glemaitre/pyparis-2018-sklearn
PyParis tutorial on machine learning using scikit-learn
data-science machine-learn pandas scikit-learn
Last synced: 01 Nov 2024
https://github.com/MLMI2-CSSI/foundry
Simplifying the discovery and usage of machine-learning ready datasets in materials science and chemistry
chemistry data-science datasets machine-learning materials-science
Last synced: 05 Aug 2024
https://github.com/felipenoris/math-server-docker
The ideal multi-user Data Science server with Jupyterhub and RStudio, ready for Python, R and Julia languages.
data-science docker julia julia-language jupyter jupyter-kernels jupyterhub jupyterlab latex python rstudio-servers shiny-server
Last synced: 28 Oct 2024
https://github.com/manumerous/vpselector
Visual Pandas Selector: Visualize and interactively select time-series data
data-science data-visualization pandas python selector
Last synced: 29 Oct 2024
https://github.com/Thomas-George-T/Thomas-George-T
Readme for my :octocat: Profile
data-engineer data-science github github-profile icons machine-learning profile-readme readme svg svg-icons
Last synced: 26 Oct 2024
https://github.com/grailbio/bio
Bioinformatic infrastructure libraries
bioinformatics data-science golang
Last synced: 09 Nov 2024
https://github.com/siddhujetty/Product-analytics-insights-collection
My Solutions to "A Collection of Data Science Take-Home Challenges" by Giulio Palombo.
data-science machine-learning r-programming solutions take-home-test
Last synced: 13 Aug 2024
https://github.com/polymathorg/dataframe
DataFrame in Pharo - tabular data structures for data analysis
data-analysis data-frame data-science data-visualization gsoc hacktoberfest pharo pharo-smalltalk smalltalk statistics tabular-data
Last synced: 09 Oct 2024
https://github.com/bcgov/bcmaps
An R package of map layers for British Columbia
data-science env r r-package rstats
Last synced: 05 Aug 2024
https://github.com/bramvanroy/spacy_conll
Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Doc and its sentences and tokens. Can also be used as a command-line tool.
conll conll-u data-science machine-learning natural-language-processing nlp pandas parser python spacy spacy-extension spacy-pipeline stanford-machine-learning stanford-nlp stanza udpipe
Last synced: 13 Nov 2024
https://github.com/xiaodaigh/jlboost.jl
A 100%-Julia implementation of Gradient-Boosting Regression Tree algorithms
catboost data-science gbdt gbrt lightgbm machine-learning tree tree-boosting-algorithms xgboost
Last synced: 08 Nov 2024
https://github.com/uc-r/Advanced-R
Advanced Analytics with R training material delivered in a 2-day format
data-science educational-materials r training-materials workshop-materials
Last synced: 13 Nov 2024
https://github.com/balavenkatesh3322/model_deployment
A collection of model deployment libraries and techniques.
aws azure caffe data-science deep-learning keras machine-learning model model-deployment model-server model-serving mxnet neural-network pytorch serving serving-pytorch-models serving-recommendation serving-tensors tensorflow
Last synced: 10 Nov 2024
https://github.com/verynifty/RolodETH
A Rolodex for popular Ethereum chain addresses.
data-science ethereum ethereum-blockchain
Last synced: 03 Aug 2024
https://github.com/piquette/qtrn
A cli tool to streamline financial markets data analysis :wrench:
cli data data-science finance go golang options quotes scraper stock stock-analysis stock-market
Last synced: 04 Nov 2024
https://github.com/imteekay/machine-learning-research
✨ ML/AI Research
data-science deep-learning machine-learning python science
Last synced: 31 Oct 2024
https://github.com/data-centric-ai/dcbench
A benchmark of data-centric tasks from across the machine learning lifecycle.
Last synced: 30 Oct 2024
https://github.com/visgl/deck.gl-data
Data for the data visualization library deck.gl examples (https://uber.github.io/deck.gl/#/)
data data-science data-visualization uber
Last synced: 07 Aug 2024
https://github.com/maartengr/projects
Data Science Portfolio
data-science jupyter-notebook machine-learning nlp portfolio python pytorch reinforcement-learning
Last synced: 28 Oct 2024
https://github.com/stanfordnlp/edu-convokit
Edu-ConvoKit: An Open-Source Framework for Education Conversation Data
data data-analysis data-science education language natural-language-processing
Last synced: 08 Nov 2024
https://github.com/robertmartin8/udemyml
Templates, code and notes for Kirill Eremenko's Machine Learning course
data-science machine-learning python r tutorial udemy udemy-machine-learning
Last synced: 22 Oct 2024
https://github.com/shenxiangzhuang/pythondataanalysis
The data and code used in my book.
data-science python3 webcrawler
Last synced: 31 Oct 2024
https://github.com/devsgnr/breadroll
breadroll 🥟 is a simple, lightweight library for data processing operations written in TypeScript and powered by Bun.
bun csv csv-parser data-engineering data-science data-transformation eda exploratory-data-analysis tsv tsv-parser
Last synced: 17 Aug 2024
https://github.com/jbramburger/DataDrivenDynSyst
Scripts and notebooks to accompany the book Data-Driven Methods for Dynamic Systems
autoencoder autoencoder-neural-network autoencoders conservation-laws data-science dynamic-mode-decomposition dynamical-systems extended-dynamic-mode-decomposition forecasting kernel-methods machine-learning neural-network physics-informed-learning physics-informed-neural-networks poincare-map sindy sindy-algorithm stroboscopic-map time-delay universal-approximation-theorem
Last synced: 12 Nov 2024
https://github.com/argilla-io/biome-text
Custom Natural Language Processing with big and small models 🌲🌱
allennlp data-science natural-language-processing nlp pytorch
Last synced: 30 Sep 2024
https://github.com/fneum/data-science-for-esm
data-science energy energy-data energy-system-modelling
Last synced: 31 Oct 2024
https://github.com/localcascadeensemble/lce
Random Forest or XGBoost? It is Time to Explore LCE
classification data-science machine-learning python regression scikit-learn-api
Last synced: 31 Oct 2024
https://github.com/hsbc/tslumen
A library for Time Series EDA (exploratory data analysis)
analysis data-analysis data-science data-visualization eda exploratory-data-analysis exploratory-data-visualizations pandas profiling python time-series time-series-analysis time-series-eda time-series-profiling timeseries timeseries-analysis timeseries-eda
Last synced: 31 Oct 2024
https://github.com/frjnn/bhtsne
Parallel Barnes-Hut t-SNE implementation written in Rust.
barnes-hut bhtsne data-science data-visualization dimensionality-reduction machine-learning rust similarity-measures
Last synced: 31 Oct 2024
https://github.com/cloud-cv/evalai-starters
How to create a challenge on EvalAI?
agent ai cv data-science data-science-competition environments evalai get-started getting-started ml reinforcement-learning rl
Last synced: 09 Nov 2024
https://github.com/aiwithqasim/Free-Artificial-Intelligence-Resources
Welcome to this open source repository of free artificial intelligence resources. Benefit from the free resources mentioned here, and kindly star and fork the repository so that it gains visibility and everyone can take advantage of it.
ai article artificial-intelligence artificial-neural-networks blog data-science datascientist deep-learning freeresources hacktoberfest hecktoberfest2021 jobs machine-learning machine-learning-algorithms natural-language-processing nlp project python3 youtube
Last synced: 02 Nov 2024
https://github.com/charmve/paperweeklyai
📚「@MaiweiAI」Studying papers in the fields of computer vision, NLP, and machine learning algorithms every week.
advanced applied-machine-learning computer-vision data-mining data-science deep-learning machine-learning machine-learning-algorithms nlp paper-with-code papers study-papers tutorials
Last synced: 28 Oct 2024
https://github.com/jianzhnie/autotabular
Automatic machine learning for tabular data. ⚡🔥⚡
automl catboost data-science deep-learning feature-engineering hpo lightgbm machine-learning pytorch-lightning scikit-learn structured-data tabular-data xgboost
Last synced: 27 Oct 2024
https://github.com/brubinstein/diffpriv
Easy differential privacy in R
data-science differential-privacy diffpriv machine-learning r r-package statistics
Last synced: 02 Aug 2024
https://github.com/gitonthescene/csv-reconcile
A reconciliation service for OpenRefine serving data from a given CSV file.
Last synced: 06 Nov 2024
https://github.com/montanaz0r/bayesian-statistics-the-fun-way
Solutions and workflow for the Bayesian Statistics The Fun Way book in Python
bayesian-data-analysis bayesian-statistics data-science jupyter-notebook numpy pandas probability python scipy statistics
Last synced: 07 Nov 2024
https://github.com/tomasonjo/graphs-network-science
Accompanying repository for my book about Graph Data Science
algorithms data-science graph graph-algorithms machine-learning
Last synced: 09 Nov 2024
https://github.com/tpvasconcelos/ridgeplot
Beautiful ridgeline plots in Python
data-analysis data-science data-visualization distplot ggridges graphing joyplot plot plotly plotting python ridgeline visualization
Last synced: 05 Nov 2024
https://github.com/tirendazacademy/awesome-data-science-resources
Resources about data science, machine learning, deep learning, data engineering, and SQL.
ai artificial-intelligence data-analysis data-engineering data-science dataengineering datascience deep-learning deeplearning machine-learning machinelearning machinelearning-python sql
Last synced: 08 Nov 2024
https://github.com/LaihoE/did-it-spill
Check if you have training samples in your test set
computer-vision data-science deep-learning pytorch semantic-similarity time-series
Last synced: 12 Nov 2024
https://github.com/apple/ml-symphony
Symphony: Interactive Data Widgets (CHI 2022)
computational-notebooks data-science data-visualization machine-learning
Last synced: 07 Oct 2024
https://github.com/anna-geller/prefect-deployment-patterns
Code examples showing flow deployment to various types of infrastructure
automation aws data data-engineering data-engineering-infrastructure data-engineering-pipeline data-engineering-team data-products data-science dataflow dataflow-ops orchestration pipeline prefect python serverless serverless-framework
Last synced: 28 Oct 2024
https://github.com/ndleah/ibm-data-analyst-professional
Capstone projects of the IBM Data Analyst Professional
analyzing-data data-analysis data-analyst data-manipulation data-science data-visualization data-visualizations ibm-datascience-certification pandas python
Last synced: 13 Nov 2024
https://github.com/dayyass/text-classification-baseline
Pipeline for quickly building TF-IDF + LogReg text classification baselines.
baseline classification data-science fast hacktoberfest logistic-regression machine-learning natural-language-processing nlp python text text-classification tf-idf
Last synced: 07 Nov 2024
https://github.com/omarsar/mri-analysis-pytorch
MRI analysis using PyTorch and MedicalTorch
data-science deep-learning health healthcare medicine neural-network pytorch
Last synced: 28 Oct 2024
https://github.com/provectus/sak-kubeflow
🚀 Deploy Kubeflow on AWS EKS with Terraform 🤖
ai argocd artificial-intelligence automation aws cluster data-science deep-learning devops eks gitops iac infrastructure infrastructure-as-code kubeflow machine-learning ml open-source terraform
Last synced: 08 Nov 2024
https://github.com/vatshayan/final-year-disease-prediction-project
Final-year project: a disease prediction system using machine learning, with code and documents.
btech btech-project btechfinalyear btechproject college-project data-science disease disease-prediction final final-project final-year-project finalyearproject finalyearprojects machine-learning machine-learning-algorithms machinelearning prediction python sem8
Last synced: 28 Oct 2024
https://github.com/nishkarshraj/automation-using-shell-scripts
Development Automation using Shell Scripting.
anacron at automation automation-framework backup bash-script cron crontab data-science data-structures development linux scenarios scheduler shell shell-scripts sorting-algorithms
Last synced: 16 Oct 2024
https://github.com/ahammadmejbah/artificial-intelligence-important-documents-collections
AI technology is significant because it allows software to do human functions—understanding, reasoning, planning, communication, and perception—increasingly effectively, efficiently, and affordably.
ai algorithms big-data computer-science computer-vision data-analyst data-engineering data-mining data-science deep-learning machine-learning mathematics python
Last synced: 11 Nov 2024
https://github.com/DARIAH-DE/Topics
A Python library for topic modeling and visualization
data-science digital-humanities lda machine-learning natural-language-processing python3 text-mining topic-modeling
Last synced: 13 Nov 2024
https://github.com/dayyass/qaner
Unofficial implementation of QaNER: Prompting Question Answering Models for Few-shot Named Entity Recognition.
data-science machine-learning named-entity-recognition natural-language-processing ner nlp python python3 question-answering
Last synced: 07 Nov 2024
https://github.com/renumics/sliceguard
A library for detecting problematic data segments in structured and unstructured data with a few lines of code.
data-analysis data-cleaning data-curation data-exploration data-science data-visualization deep-learning eda exploratory-data-analysis machine-learning python visualization
Last synced: 27 Oct 2024
https://github.com/Desbordante/desbordante-core
Desbordante is a high-performance data profiler that is capable of discovering many different patterns in data using various algorithms. It also allows running data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.
anomaly-detection correlations data-analytics data-cleaning data-cleansing data-engineering data-exploration data-mining data-mining-algorithms data-preprocessing data-profiling data-science data-wrangling exploratory-data-analysis feature-engineering feature-extraction feature-selection knowledge-discovery spreadsheets tabular-data
Last synced: 04 Nov 2024
https://github.com/bnosac/crfsuite
Labelling Sequential Data in Natural Language Processing with R - using CRFsuite
chunking conditional-random-fields crf crfsuite data-science intent-classification natural-language-processing ner nlp r r-package
Last synced: 11 Nov 2024
https://github.com/polyaxon/hypertune
A library for performing hyperparameter optimization
data-science deep-learning hyperparameter-optimization hyperparameter-tuning machine-learning mlops numpy scikit-learn workflow
Last synced: 10 Oct 2024
https://github.com/dask-contrib/dask-awkward
Native Dask collection for awkward arrays, and the library to use it.
columnar-format dask data-analysis data-science data-structure jagged-array python ragged-array
Last synced: 11 Nov 2024
https://github.com/hannansatopay/roughviz
A Python visualization library for creating sketchy/hand-drawn styled charts.
charts data-science hacktoberfest jupyter-notebook python-visualization roughviz vizualisation
Last synced: 08 Nov 2024
https://github.com/tcsvn/activity-assistant
Activity Assistant provides a platform for logging, evaluating and predicting Activities of Daily Living for Home Assistant.
activities-of-daily-living activity-assistant adls data-mining data-science django django-rest-framework home-assistant home-assistant-addons home-automation homeassistant human-activity-recognition machine-learning smart-home smarthome visualization
Last synced: 06 Nov 2024
https://jaeyk.github.io/comp_thinking_social_science/
Computational Thinking for Social Scientists book project
computational-social-science data-science digital-humanities machine-learning python r social-sciences visualization web-scraping
Last synced: 27 Oct 2024
https://github.com/tirthajyoti/covid-19-analysis
Analysis with Covid-19 data
analytics coronavirus covid-19 covid-data covid19-data data-science epidemiology machine-learning modeling numpy object-oriented-programming pandemic python visualization
Last synced: 09 Nov 2024
https://github.com/mitre/menelaus
Online and batch-based concept and data drift detection algorithms to monitor and maintain ML performance.
concept-drift data-drift data-science drift-detection machine-learning statistics
Last synced: 09 Nov 2024
https://github.com/maxent-ai/zeroshot_topics
Topic Inference with Zeroshot models
bert data-science huggingface hypernymy-extraction keybert keyword-extraction knowledge-graph labelled-data labelling linguistics machine-learning nli nlp taxonomy text text-classification transformers weak-supervision weakly-supervised-learning zeroshot-learning
Last synced: 07 Nov 2024