An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with datamining

A curated list of projects in awesome lists tagged with datamining.

https://github.com/openrefine/openrefine

OpenRefine is a free, open source power tool for working with messy data and improving it

data-analysis data-science data-wrangling datacleaning datacleansing datajournalism datamining java journalism opendata reconciliation wikidata

Last synced: 11 Jan 2026

https://github.com/discord-datamining/discord-datamining

Datamining Discord changes from the JS files

builds canary datamining discord javascript

Last synced: 14 May 2025

https://github.com/foochane/books

A collection of books covering C and C++, Git, Java, Keras, Linux, NLP, Python, Scala, TensorFlow, big data, recommender systems, databases, data mining, machine learning, deep learning, algorithms, and more.

big-data c cpp database datamining dl git java keras ml nlp python scala tensorflow

Last synced: 29 Mar 2025

https://github.com/sirkon/ldetool

Code generator for fast log file parsers

bigdata datamining log-parsing logs-analysis logs-parsing parsing parsing-csv

Last synced: 07 Apr 2025

https://github.com/YingtongDou/CARE-GNN

Code for CIKM 2020 paper Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged Fraudsters

datamining deep-learning fraud-detection fraud-prevention graphneuralnetwork machine-learning reinforcement-learning security

Last synced: 11 May 2025

https://github.com/chuanyuxue/cikm-2019-analyticup

1st Solution for 2019-CIKM-Analyticup: Efficient and Novel Item Retrieval for Large-scale Online Shopping Recommendation

cikm datamining kaggle-competition recommender-system tianchi

Last synced: 09 Apr 2025

https://github.com/pavlovtech/WebReaper

Web scraper, crawler and parser in C#. Designed as simple, declarative and scalable web scraping solution.

crawler datamining parser parsing scraper scraping scraping-api scraping-data scraping-tool scraping-web scraping-websites webcrawler webscraping

Last synced: 08 Apr 2025

https://github.com/wmpscc/dataminingnotesandpractice

Notes from my data mining studies and clever tricks I have come across, continuously updated.

datamining kaggle machinelearning tianchi

Last synced: 16 May 2025

https://github.com/denadai2/real-estate-neighborhood-prediction

Code to repeat the experiments of "The economic value of neighborhoods: Predicting real estate prices from the urban environment"

datamining real-estate urban urban-planning

Last synced: 15 Mar 2025

https://github.com/cvjena/libmaxdiv

Implementation of the Maximally Divergent Intervals algorithm for Anomaly Detection in multivariate spatio-temporal time-series.

anomalydetection anomalydiscovery data-analysis data-mining datamining machine-learning machine-learning-library machinelearning time-series timeseries

Last synced: 28 Jan 2026

https://github.com/xhyrom/discord-datamining

Repository for datamining and analyzing web builds, host builds, articles, blog posts, policies, jobs, the @discord GitHub, and their domain(s)

articles blog datamining discord discord-datamining discord-experiments experiments github posts scripts

Last synced: 16 Oct 2025

https://github.com/wchill/acnhautocataloger

Automatically records what's in your Animal Crossing: New Horizons catalog

animal-crossing animal-crossing-new-horizons computer-vision datamining hardware

Last synced: 22 Mar 2025

https://github.com/lingbai-kong/CausalFormer

PyTorch Implementation of CausalFormer: An Interpretable Transformer for Temporal Causal Discovery

causal-discovery datamining interpretability pytorch-implementation time-series

Last synced: 17 Sep 2025

https://github.com/triforcely/octave.net

📈 More than cross-platform Octave process wrapper 🔬

analysis csharp datamining dotnet-standard matlab octave

Last synced: 12 May 2025

https://github.com/chuanyuxue/bdci-2020-timeseries-prediction

12th Solution: Matrix Factorization for High-Dimensional and Sparse Time Series Prediction

datamining kaggle-competition machine-learning time-series

Last synced: 29 Jul 2025

https://github.com/PrimozGodec/ImageColorization

Image and video colorizer is a package for automatic image and video colorization. Models are already trained.

artificial-intelligence colorization datamining image-colorization machine-learning neural-network

Last synced: 07 May 2025

https://github.com/ohmybahgosh/FONTS_DOT_COM_RIPPER

Script to extract entire font families from Fonts.com, rips them as woff2 and final output includes woff2 and ttf files

bash bash-script curl datamining download-fonts font fonts scrape scrape-websites scraper sed shell-script typography woff2 woff2-files xidel

Last synced: 27 Mar 2025

https://github.com/wxxshirley/cikm2023direc

Codes, data, and baselines for CIKM 2023 Long Paper "Dual Intents Graph Modeling for User-centric Group Discovery"

datamining gnn graphneuralnetwork

Last synced: 15 Oct 2025

https://github.com/mhahsler/streammoa

Interface for data stream clustering algorithms implemented in the MOA (Massive Online Analysis) framework.

clustering datamining datastream

Last synced: 15 Apr 2025

https://github.com/joemar25/cs-elect-2-mid-term

Data Mining Mid Term Project is a case study focused on data analysis. Using R, we present the data in a more meaningful and impactful way that could improve financial, e-commerce, and market decision making, among others. The dataset is scraped from https://www.flipkart.com/

case-study datamining nlp rstudio

Last synced: 10 Jul 2025

https://github.com/dimitryzub/ecommerce-scraper-py

Scrape ecommerce websites such as Amazon, eBay, Walmart, Home Depot, Google Shopping from a single module in Python🐍

data datamining ecommerce ecommerce-website python python3 selectolax selenium serpapi webscraper webscraping

Last synced: 03 Sep 2025

https://github.com/shulhan/tabula

A Go library for working with rows, columns, or matrix (deprecated, see https://github.com/shuLhan/share/tree/master/lib/tabula).

datamining dataset go golang matrix slice

Last synced: 15 Dec 2025

https://github.com/gibbed/gibbed.borderlandsenhanced.datamining

Datamining tools & code for use with Borderlands Enhanced.

datamining game-dev modding

Last synced: 14 Jun 2025

https://github.com/dfederschmidt/pyliwc

LIWC (Linguistic Inquiry and Word Count) in Python

datamining liwc nlp python3

Last synced: 22 Apr 2025

https://github.com/im-n1/rug

Library for fetching various stock data from the internet (official and unofficial APIs).

datamining scraping stocks

Last synced: 17 Jan 2026

https://github.com/gtzinos/bigdata-graph-analysis

Probably the first scalable, open source per-edge triangle count, written in Scala and Spark for any big dataset. (Louvain)

big-data bigdata community-detection data-mining data-science datamining intellij louvain louvain-algorithm louvain-community-detection louvain-method sbt scala spark triangle-counting

Last synced: 04 Jul 2025

https://github.com/dsdanielpark/arxiv2text

Converting PDF files to text, mainly with a focus on arXiv papers.

crawling data datamining translation

Last synced: 05 Sep 2025

https://github.com/gibbed/gibbed.borderlands2.datamining

Datamining tools & code for use with Borderlands 2.

datamining game-dev modding

Last synced: 14 Apr 2025

https://github.com/prateekiiest/student-group-activity-recognition

CNERG IIT Kgp Internship Project - Group Activity Recognition Data Mining Project on StudentLife Data Set

activity-recognition datamining group-actions hacktoberfest machine-learning student-database

Last synced: 04 May 2025

https://github.com/enriquefynn/ethereum-partitioning-experiments

Experiments on sharding ethereum

datamining ethereum sharding

Last synced: 04 Mar 2026

https://github.com/ppalmes/juliaworkshop

Collections of Tutorials and Demos for Learning Julia in Data Science, Simulation/Modeling, and High Performance Computing

caret datamining julia machine-learning scikitlearn-machine-learning visualization

Last synced: 11 Apr 2025

https://github.com/gpsyrou/twitter_sentiment_analysis

Exploration of the Twitter API and sentiment & topic analysis on tweets relevant to COVID-19

datamining nltk python sentiment-analysis textmining topic-modeling tweepy twitter twitter-api

Last synced: 29 Oct 2025

https://github.com/leonism/sample-superstore

A Python analysis of the legendary Sample Superstore dataset using Pandas

data-analysis datamining datascience dataset eda jupyter-notebook machine-learning python

Last synced: 09 Oct 2025

https://github.com/gibbed/gibbed.borderlandsoz.datamining

Datamining tools & code for use with Borderlands: The Pre-Sequel!

datamining game-dev modding

Last synced: 14 Apr 2025

https://github.com/peterk/pimmer

Exploratory code for PDF image mining

code4lib datamining humanities image-analysis image-mining opencv

Last synced: 12 Apr 2025

https://github.com/beiyuouo/data-mining-hw

🚀 Data mining homework for HNU (Hainan University data warehousing and data mining coursework)

data-mining datamining python

Last synced: 07 Sep 2025

https://github.com/renien/doc-diff

🐍 Support app to get diff results from two documents 📗

comparison-reports csv datamining datascience doc-diff python

Last synced: 14 Dec 2025

https://github.com/carnivuth/datamining

Notes on Professor Sartori's data mining course

alma-mater-studiorum appunti bologna datamining magistrale note unibo

Last synced: 27 Apr 2025

https://github.com/epocdotfr/rwrs

Player statistics, server list and more for the Running With Rifles (RWR) game as well as its Pacific and Edelweiss DLCs

charts datamining discord-bot edelweiss flask geolite2 pacific player-statistics players python rest-api running-with-rifles rwr servers stats steam-api webapp

Last synced: 24 Oct 2025

https://github.com/dimitryzub/py-google-scholar-organic-cite-to-csv-sqlite

Scrape historic Google Scholar Organic and Cite results to CSV and SQLite using Python and SerpApi.

csv data dataextraction datamining datascience datascraping dataset google googlescholar python scraper serpapi sqlite webscraper webscraping

Last synced: 14 Aug 2025

https://github.com/omarsar/friendly_data_science

Material and resources for the "Friendly Data Science" YouTube series.

analytics data-science datamining deep-learning natural-language-processing neural-networks text-mining

Last synced: 07 Sep 2025

https://github.com/krazete/sgmprocessor

The preprocessing scripts used to generate assets for my other Skullgirls Mobile projects.

apk datamining game mobile pillow skullgirls unity

Last synced: 12 Aug 2025

https://github.com/omarsar/data_mining_lab

Material for Data Mining Lab Session (Fall Semester @ NTHU)

data datamining datapreprocessing datavisualization

Last synced: 14 Jul 2025

https://github.com/shervinnd/btc_close_price_predict_ml

Predicting Bitcoin closing prices with machine learning: testing linear models and using a linear regression model.

bitcoin cryptocurrency data data-science datamining finance linear-regression linerregression machine-learning machine-learning-algorithms machinelearning ml numpy pandas predictive-modeling python regression sklearn

Last synced: 24 Oct 2025

https://github.com/ritreshgirdhar/apriori-algorithm

Data mining to show frequently bought related items, for forecasting shelf items to sustain retail growth

apriori-algorithm apriori-algorithm-python data-science datamining

Last synced: 08 Apr 2025

https://github.com/code2k13/nlphosegui

This tool allows you to create Natural Language Processing pipelines for use with nlphose using a Blockly based GUI editor in any browser. As you create a pipeline it shows you the corresponding nlphose command which will execute the pipeline.

blockly data-science datamining drag-and-drop gui machine-learning natural-language-processing nlp no-code

Last synced: 11 Jun 2025

https://github.com/cutetenshii/wplace-datamining

Datamining for the Wplace.live website

dataminer datamining wplace

Last synced: 02 Mar 2026

https://github.com/datadrivenconstruction/importexceltorevit

Add-in that imports parameter values from an Excel database created with the DataDrivenConstruction Excel Add-in or converters

api autodesk datamining excel import revit rvt

Last synced: 04 Apr 2026

https://github.com/zeinhasan/introduction-to-data-mining-course-material

Introduction to Data Mining Course Lecturer and Laboratory Assistant Teaching Materials

datamining

Last synced: 24 Apr 2025

https://github.com/nmrr/datamining-pshitt

Software for visualizing (with D3.js and Linkurious.js) and analyzing data obtained from a honeypot: pshitt (a fake SSH server)

d3js datamining datavisualisation dataviz honeypot

Last synced: 17 Mar 2025

https://github.com/timm/ish

(Some (useful (ish)) LISP)

data-science datamining fun libraries lisp simple teaching

Last synced: 28 Feb 2026

https://github.com/faizanmohd5/web-scraping-iphone-11-reviews

This is a web scraping project that extracts customer reviews for the iPhone 11 from Flipkart.com using Python and BeautifulSoup. The extracted data is saved in a CSV file for further analysis. Use it as a starting point for your own web scraping projects or for analyzing customer reviews of the iPhone 11.

beautifulsoup csv data-visualization dataanalysis dataextraction datainsights datamining datapreprocessing ecommerce-website ipython-notebook jupyter-notebook python reviews reviewscrapper webscraping

Last synced: 01 Feb 2026

https://github.com/omochikaeri15/nyanko-scripts

Standalone pure-Rust binaries made to be lightweight and perform very specific functions for The Battle Cats.

asset-extraction battle-cats binaries cli command-line datamining game-tools reverse-engineering rust

Last synced: 08 Jun 2026

https://github.com/wchill/acnhitemtextureexporter

Automatically decompresses, decodes and extracts Layout textures from Animal Crossing New Horizons

animal-crossing animal-crossing-new-horizons computer-graphics datamining

Last synced: 20 Jul 2025

https://github.com/panastasiadis/data-mining-operations

This repository contains three KNIME workflows that analyze the Air Traffic Passenger Statistics dataset from the San Francisco International Airport. The workflows include tasks such as classification comparison, regression analysis, and outlier detection using various machine learning techniques.

ai airtrafficpassengerstatistics bigdata classificationcomparison dataanalysis datamining datapreprocessing datascience datavisualization knime machinelearning outlierdetection regressionanalysis sanfranciscointernationalairport

Last synced: 03 May 2026

https://github.com/jlgarridol/tfg-smartbeds

Data mining applied to the detection of epileptic seizures - GII18.13

bed datamining ensemble epileptic-seizures manifold medical-informatics oneclasssvm pca rotation-forest scikit-learn weka

Last synced: 30 Apr 2026

https://github.com/timm/dart

optimizing = cluster + contrast

cocomo datamining lua optimzation

Last synced: 08 Jul 2025

https://github.com/mengyaohuang/data-mining-isir

Data mining based on Introduction to Statistical learning

datamining machine-learning-algorithms rlanguage statistics

Last synced: 07 Jan 2026

https://github.com/9akashnp8/opensea-web-scraper

Scrape OpenSea for NFT data mining

datamining nft-marketplace opensea webscraping

Last synced: 20 Mar 2025

https://github.com/arfazrll/data-mining-competition

This repository contains my participation in the ADIKARA 2024 Data Mining Competition. It covers developing a Food Price Index prediction model using a spatiotemporal dataset.

dataanalysis datamining kaggle-competition machine-learning predictive-modeling spatiotemporal-forecasting

Last synced: 24 Jul 2025

https://github.com/divithraju/divith-raju-data-mining

This project focuses on customer segmentation using data mining techniques, specifically K-Means clustering, to classify customers into distinct groups based on their purchasing behaviors. The goal is to analyze customer data and segment them into clusters for targeted marketing strategies and better customer relationship management.

algorthims analytics apache business client connector data dataarchitecture database dataengineering datamining datascience hadoop k-means-clustering mysql project project-repository pyspark python3 spark

Last synced: 06 Mar 2026

https://github.com/nmercadeb/computational-biophysics-project

Code for the final project of the Computational Biophysics course in the Engineering Physics degree (UPC). By Marta Alcalde and Núria Mercadé.

animal-behavior animal-model animal-tracking datamining deep-learning deep-neural-networks deeplabcut stroke

Last synced: 05 May 2025

https://github.com/mreliptik/dmfinalproject

Final project for Data Mining course: using OPTICS on 2 datasets

clustering datamining optics-clustering python

Last synced: 20 Jan 2026

https://github.com/nhviet03/is252_datamining_knn_pla

Implementing 2 basic classification algorithms, K-Nearest Neighbor (KNN) and Perceptron Learning Algorithm (PLA), to predict the likelihood of customers subscribing to term deposits. The implementation progresses from manual calculation based on mathematical formulas to using libraries.

datamining knn-classification perceptron-learning-algorithm

Last synced: 11 Jun 2025