https://jonathanwvd.github.io/awesome-industrial-datasets/
A curated collection of public industrial datasets.
awesome-list dataset industry-40 machine-learning time-series
Last synced: about 1 month ago
- Host: GitHub
- URL: https://jonathanwvd.github.io/awesome-industrial-datasets/
- Owner: jonathanwvd
- License: other
- Created: 2022-10-25T20:00:48.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2025-04-16T16:54:03.000Z (2 months ago)
- Last Synced: 2025-05-06T23:01:59.085Z (about 1 month ago)
- Topics: awesome-list, dataset, industry-40, machine-learning, time-series
- Language: HTML
- Homepage:
- Size: 20.3 MB
- Stars: 151
- Watchers: 6
- Forks: 22
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Awesome Industrial Datasets
**🔗 Check the [HTML version](https://jonathanwvd.github.io/awesome-industrial-datasets/) for better navigation.**
Welcome to the Awesome Industrial Datasets repository! This project aims to simplify access to high-quality industrial datasets across sectors such as chemical, mechanical, and oil and gas. These datasets are invaluable for researchers, engineers, and data scientists working on machine learning models and other analytical tasks that require real-world industrial data.
If you find this repository useful, please consider giving it a ⭐ to show your support!
🤝 If you're interested in contributing, please refer to the [Contribution Guidelines](#contribution-guidelines).
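The columns of the Datasets Table below (Labeled, Dataset Characteristics, Data Source) lend themselves to programmatic filtering. What follows is a minimal Python sketch, assuming the table has been copied out of this README as plain Markdown text. The helper names `parse_markdown_table` and `filter_rows` are hypothetical and not part of this repository, and the sample rows are copied from the table with the tag column shortened.

```python
import re

# Matches a Markdown link such as [3W](markdown/3w.md).
LINK = re.compile(r"\[(?P<name>.*?)\]\((?P<path>.*?)\)")

# Three rows copied from the Datasets Table; the tag column is shortened here.
SAMPLE = """
| Dataset Name | Labeled | Dataset Characteristics | Data Source | Additional Tags |
|:---|:---|:---|:---|:---|
| [3W](markdown/3w.md) | Likely Yes | Multivariate, Time-Series | Likely Both | Oil and Gas; Fault detection |
| [Secom](markdown/secom.md) | Yes | Multivariate | Real | Semi-conductor manufacturing; Feature selection |
| [Pump Sensor Data](markdown/pump_sensor_data.md) | Yes | Multivariate, Time-Series | Real | Predictive maintenance; Water pump monitoring |
"""


def parse_markdown_table(text):
    """Hypothetical helper: turn a pipe-delimited table into a list of dicts."""
    header, rows = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        if line.endswith("|"):
            line = line[:-1]
        cells = [cell.strip() for cell in line[1:].split("|")]
        if header is None:
            header = cells
            continue
        # Skip the alignment row (cells made only of ':' and '-').
        if all(cell and set(cell) <= set(":-") for cell in cells):
            continue
        row = dict(zip(header, cells))
        match = LINK.search(row.get("Dataset Name", ""))
        if match:
            # Split the link into a display name and a relative path.
            row["Dataset Name"] = match.group("name").strip()
            row["Path"] = match.group("path")
        rows.append(row)
    return rows


def filter_rows(rows, labeled=None, characteristic=None):
    """Hypothetical helper: keep rows matching an exact label and a characteristic."""
    selected = []
    for row in rows:
        traits = [t.strip() for t in row.get("Dataset Characteristics", "").split(",")]
        if labeled is not None and row.get("Labeled") != labeled:
            continue
        if characteristic is not None and characteristic not in traits:
            continue
        selected.append(row)
    return selected


matched = filter_rows(parse_markdown_table(SAMPLE), labeled="Yes", characteristic="Time-Series")
print([row["Dataset Name"] for row in matched])  # ['Pump Sensor Data']
```

The label comparison is exact on purpose, so hedged values such as "Likely Yes" are not treated as "Yes"; relax that check if the hedged entries should be included.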
## Datasets Table
| Dataset Name | Labeled | Dataset Characteristics | Data Source | Additional Tags |
|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------|:-----------------------------------------------------------------|:--------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [3D Printer](markdown/3d_printer.md) | Likely | Multivariate | Real | 3D printing; Mechanical engineering; Printer settings; Material strength; Print quality; Regression; Classification |
| [3W](markdown/3w.md) | Likely Yes | Multivariate, Time-Series | Likely Both | Oil and Gas; Fault detection; Multivariate data; Sensor data; Time-series analysis; Oil wells; Machine learning benchmark |
| [Ai4I 2020 Predictive Maintenance Dataset](markdown/ai4i_2020_predictive_maintenance_dataset.md) | Yes | Multivariate, Time-Series | Synthetic | Predictive maintenance; Synthetic data; Machine failure detection; Time-series data; Multivariate data; Classification; Regression |
| [Aitex](markdown/aitex.md) | Information not available | Information not available | Information not available | Information not available |
| [Aps Failure At Scania Trucks Data Set](markdown/aps_failure_at_scania_trucks_data_set.md) | Yes | Multivariate | Real | Heavy trucks; Air Pressure System; Component failure; Fault detection; Anonymized features; Industrial challenge dataset; Scania trucks |
| [Additional Tennessee Eastman Process Simulation Data For Anomaly Detection Evaluation](markdown/additional_tennessee_eastman_process_simulation_data_for_anomaly_detection_evaluation.md) | Yes | Multivariate, Time-Series | Synthetic | Tennessee Eastman Process; Process simulation; Anomaly detection; Fault detection; Chemical process; Machine learning benchmark; Synthetic data |
| [Air Quality](markdown/air_quality.md) | Likely | Multivariate, Time-Series | Real | Air Quality Monitoring; Gas Sensor Data; Multivariate Time-Series; Environmental Sensor; Chemical Sensors; Pollution Measurement; Cross-sensitivity and Sensor Drift |
| [Anemometer Fault Detection](markdown/anemometer_fault_detection.md) | Yes | Multivariate, Time-Series | Real | Anemometer; Fault detection; Wind power industry; Time-series data; Multivariate data; Sensor data; PHM competition |
| [Appliances Energy Prediction](markdown/appliances_energy_prediction.md) | No | Multivariate, Time-Series | Real | Indoor environment monitoring; ZigBee wireless network; Temperature data; Humidity data; Weather integration; Energy consumption; M-bus energy meters; Airport weather station |
| [Asset Failure And Replacement](markdown/asset_failure_and_replacement.md) | Yes | Multivariate, Time-Series | Real | Asset health; Failure prediction; Part replacement; Time-series data; Industrial monitoring; Health score; Prognostics and health management |
| [Athabasca Oil Sands Dataset (McMurray Formation)](markdown/athabasca_oil_sands_dataset__mcmurray_formation_.md)                                                                          | Information not available | Multivariate                                                     | Real                      | Athabasca Oil Sands; McMurray Formation; Wabiskaw Member; Geology; Well log data; Core analyses; Oil and Gas                                                                                                     |
| [Bsdata](markdown/bsdata.md) | Likely | Multivariate | Real | Surface defects; Ball screw drives; Defect classification; Prognostics; Detection; Condition monitoring; Mechanical components |
| [Bearing](markdown/bearing.md) | Likely | Information not available | Real | NASA; Prognostics; Health Management; Aerospace; Time-Series Data; Sensor Data; Diagnostics |
| [Beijing PM2.5 Data](markdown/beijing_pm2_5_data.md)                                                                                                                                       | Yes                       | Multivariate, Time-Series                                        | Real                      | Air pollution; PM2.5; Time-series data; Meteorological data; Beijing; Climate and Environment; Pollution monitoring                                                                                              |
| [Bosch Production Line Performance](markdown/bosch_production_line_performance.md) | Yes | Multivariate, Tabular | Real | Manufacturing; Production line data; Sensor measurements; Binary classification; Quality control; Assembly line monitoring; Matthews correlation coefficient |
| [Brent Oil Prices](markdown/brent_oil_prices.md) | Yes | Time-Series, Univariate | Real | Brent crude oil; Time-series data; Oil price forecasting; Energy economics; Daily historical prices; U.S. Energy Information Administration; Financial data |
| [Bridge Crack Dataset](markdown/bridge_crack_datatset.md)                                                                                                                                  | Likely                    | Image, Multivariate                                              | Real                      | Bridge cracks; Surface defects; Image dataset; Defect detection; Structural health monitoring; Computer vision; Machine learning                                                                                 |
| [Business And Industry Reports](markdown/business_and_industry_reports.md) | Likely | Multivariate, Time-Series | Real | US Census Bureau; Economic reports; Time series data; Business and industry; Multivariate data; Monthly and quarterly data; Economic indicators |
| [C-Mapss Aircraft Engine Simulator Data](markdown/c-mapss_aircraft_engine_simulator_data.md) | Yes | Time-Series, Multivariate | Synthetic | Aircraft engine; Simulator data; Engine performance; Sensor data; Prognostics |
| [Cmapss Jet Engine Simulated Data](markdown/cmapss_jet_engine_simulated_data.md) | Information not available | Information not available | Information not available | Information not available |
| [Cnc Mill Tool Wear](markdown/cnc_mill_tool_wear.md) | Yes | Multivariate, Time-Series | Real | CNC machining; Tool wear detection; Classification; Time series data; Manufacturing; Sensor data; Industrial process monitoring |
| [Cwru Bearing Data](markdown/cwru_bearing_data.md) | Information not available | Information not available | Information not available | Information not available |
| [Car Evaluation](markdown/car_evaluation.md) | Yes | Multivariate | Real | Hierarchical decision model; Automobile evaluation; Categorical features; Car acceptability; Classification dataset; Constructive induction; Attribute structure |
| [Casting Product Image Data For Quality Inspection](markdown/casting_product_image_data_for_quality_inspection.md) | Yes | Image, Multiclass, Classification | Real | Casting manufacturing; Quality inspection; Industrial defect detection; Grayscale images; Image classification; Binary classification; Deep learning dataset |
| [Chemical Composition Of Ceramic Samples](markdown/chemical_composition_of_ceramic_samples.md) | Yes | Multivariate | Real | Energy Dispersive X-ray Fluorescence; Ceramic samples; Chemical composition; Multivariate dataset; Classification; Clustering; Physics and Chemistry |
| [Chemical Production India 2013 To 2020](markdown/chemical_production_india_2013_to_2020.md) | No | Multivariate, Time-Series | Real | Chemical Production; India; Time-Series Data; Industrial Manufacturing; Chemical Industry; Metric Tonnes; Department of Chemicals and Petrochemicals |
| [Chinese Power Line Insulator Dataset](markdown/chinese_power_line_insulator_dataset.md) | Likely | Image, Multivariate | Real | Power line insulator; UAV images; Defect detection; Synthetic data; Image classification; Electrical equipment monitoring; Computer vision |
| [Civil Engineering Cement Manufacturing Dataset](markdown/civil_engineering__cement_manufacturing_dataset.md) | Yes | Multivariate | Real | Civil Engineering; Cement Manufacturing; Concrete Compressive Strength; Regression Dataset; Material Science; Multivariate Data; Real-world Data |
| [Combined Cycle Power Plant](markdown/combined_cycle_power_plant.md) | Yes | Multivariate | Real | Combined Cycle Power Plant; Electrical Energy Output; Ambient Variables; Gas Turbine; Steam Turbine; Regression; Real-valued features |
| [Concrete Compressive Strength](markdown/concrete_compressive_strength.md) | Yes | Multivariate | Real | Civil Engineering; Concrete Strength; Regression; Real-valued Features; No Missing Values; Material Science; Quantitative Data |
| [Concrete Crack Images For Classification](markdown/concrete_crack_images_for_classification.md) | Yes | Image, Multiclass | Real | Concrete; Crack detection; Image classification; RGB images; High-resolution images; Machine learning; Structural health monitoring |
| [Maintenance Of Naval Propulsion Plants](markdown/maintenance_of_naval_propulsion_plants.md) | Likely | Multivariate | Synthetic | Gas Turbine; Naval propulsion; Simulator data; Performance decay; Multivariate data; Regression task; Synthetic data |
| [Condition Based Maintenance Of Naval Propulsion Systems](markdown/condition_based_maintenance_of_naval_propulsion_systems.md) | Yes | Multivariate | Synthetic | Gas Turbine Simulator; Naval Propulsion System; Condition Based Maintenance; Performance Decay; Multivariate Data; Regression Task; Synthetic Data |
| [Condition Monitoring Of Hydraulic Systems](markdown/condition_monitoring_of_hydraulic_systems.md) | Yes | Multivariate, Time-Series | Real | Hydraulic systems; Condition monitoring; Multivariate time-series; Sensor data; Fault diagnosis; Real-world data; Industrial equipment |
| [Control Loop Datasets](markdown/control_loop_datasets.md) | Information not available | Multivariate, Time-Series | Real | Industrial datasets; Control loops; Oil and gas data; Time-series; Oscillation detection; Machine learning; Process control |
| [Data-Driven Prediction Of Battery Cycle Life Before Capacity Degradation](markdown/data-driven_prediction_of_battery_cycle_life_before_capacity_degradation.md) | Yes | Multivariate, Time-Series | Real | Lithium-ion batteries; Battery life prediction; Fast charging; Time-series data; Battery degradation; Multivariate data; Thermocouple temperature measurement |
| [Deep Pcb](markdown/deep_pcb.md) | Likely | Image, Multiclass | Real | Surface defect detection; Printed circuit boards; Industrial inspection; Image classification; Real-world images; Defect localization; Machine learning dataset |
| [Defective Solar Cells Dataset](markdown/defective_solar_cells_dataset.md)                                                                                                                 | Likely                    | Image, Classification                                            | Real                      | Solar cells; Defect detection; Electroluminescence images; Photovoltaic; Computer vision; Anomaly detection; Machine learning                                                                                    |
| [Degradation Measurement Of Robot Arm Position Accuracy](markdown/degradation_measurement_of_robot_arm_position_accuracy.md) | Likely | Multivariate, Time-Series | Real | Robot arm position accuracy; Universal Robot UR5; Prognostics and health management; Positional degradation; Controller level sensing data; Multivariate time-series; Robot system health assessment |
| [Detecting Anomalies In Wafer Manufacturing](markdown/detecting_anomalies_in_wafer_manufacturing.md) | Yes | Imbalanced, High Dimensionality, Classification | Real | Wafer manufacturing; Anomaly detection; Imbalanced dataset; Industrial IoT; High dimensionality; Semiconductor; Machine learning classification |
| [Diesel Engine Faults Features](markdown/diesel_engine_faults_features.md) | Yes | Multivariate, Time-Series | Synthetic | Diesel engine; Fault diagnosis; Predictive maintenance; Pressure curves; Torsional vibration; Thermodynamic model; Synthetic data |
| [Eco Dataset](markdown/eco_dataset.md) | Yes | Multivariate, Time-Series | Real | Electricity consumption; Occupancy detection; Non-intrusive load monitoring; Time-series data; Smart meter; Swiss households; Energy disaggregation |
| [Electrical Grid Stability Simulated Data](markdown/electrical_grid_stability_simulated_data.md) | Yes | Multivariate | Synthetic | Electrical grid stability; Decentralized control; 4-node star system; Synthetic data; Power grid simulation; Physics and Chemistry; Classification and Regression tasks |
| [Electricity Load Diagrams 2011-2014](markdown/electricity_load_diagrams_2011-2014.md) | Likely | Time-Series | Real | Electricity consumption; Time-series data; Energy data; Portuguese local time; Smart grid; Client consumption profiles; Daylight saving time adjustments |
| [Energy Efficiency](markdown/energy_efficiency.md) | Yes | Multivariate | Synthetic | Building energy efficiency; Heating load prediction; Cooling load prediction; Multivariate dataset; Synthetic building data; Regression tasks; Classification tasks |
| [Gc10-Det](markdown/gc10-det.md) | Yes | Image, Multiclass | Real | Metal surface defects; Industrial dataset; Image classification; Object detection; Grayscale images; Manufacturing quality control; Steel sheet defects |
| [Greend](markdown/greend.md) | Likely | Multivariate, Time-Series | Real | Energy consumption; Household power measurements; Austria; Italy; Time-series data; Per device energy profiles; High frequency sampling |
| [Gas Sensor Array Drift At Different Concentrations](markdown/gas_sensor_array_drift_at_different_concentrations.md) | Yes | Multivariate, Time-Series | Real | Chemical sensors; Sensor drift; Gas concentration; Time series data; Multivariate data; Environmental sensing; Pattern recognition |
| [Gas Sensor Array Temperature Modulation](markdown/gas_sensor_array_temperature_modulation.md) | Yes | Multivariate, Time-Series | Real | Gas sensors; Temperature modulation; Metal oxide semiconductor sensors; Carbon monoxide; Humidity control; Time series data; Multivariate sensor data |
| [Gas Sensor Array Under Dynamic Gas Mixtures](markdown/gas_sensor_array_under_dynamic_gas_mixtures.md) | Yes | Multivariate, Time-Series | Real | Chemical sensors; Gas mixture analysis; Time series data; Multivariate sensor data; Continuous acquisition; Sensor array; Artificial intelligence research |
| [Gearbox Fault Detection](markdown/gearbox_fault_detection.md) | Likely | Multivariate | Real | Gearbox fault detection; Accelerometer data; Bearing geometry; PHM Data Challenge 2009; Fault magnitude estimation; Prognostics and health management; Machine learning benchmark |
| [Genesis Pick-And-Place Demonstrator Dataset](markdown/genesis_pick-and-place_demonstrator_dataset.md) | Yes | Multivariate, Time-Series | Real | Pick-and-place; Industrial automation; Pneumatic linear drive; Anomaly detection; Time-series sensor data; Predictive maintenance; Labeled anomalies |
| [Global Power Plant](markdown/global_power_plant.md) | No | Multivariate, Global, Geospatial | Real | Power plants; Global dataset; Energy production; Primary fuel type; Yearly generation data; Open source; Geospatial data |
| [Green House Gas Produce By Different Industry](markdown/green_house_gas_produce_by_different_industry.md) | Likely | Multivariate, Time-Series, Environmental Data | Real | Greenhouse Gas Emissions; Environmental Data; Industry Emissions; Multivariate; Time-Series; Carbon Footprint; ISO Measurement |
| [HCI Industrial Optical Inspection](markdown/hci_industrial_optical_inspection.md)                                                                                                         | Information not available | Information not available                                        | Information not available | Information not available                                                                                                                                                                                        |
| [High Storage System Anomaly Detection](markdown/high_storage_system_anomaly_detection.md) | Yes | Multivariate, Time-Series, Anomaly Detection | Real | High Storage System; Energy Optimization; Anomaly Detection; Industry 4.0; Timed Automata; Conveyor Belt Sensors; Real-world Industrial Data |
| [High Storage System Data For Energy Optimization](markdown/high_storage_system_data_for_energy_optimization.md) | Yes | Multivariate, Time-Series | Real | High storage system; Conveyor belts; Energy optimization; Anomaly detection; Time-series data; Industrial IoT; Sensor data |
| [Hill-Valley](markdown/hill-valley.md) | Yes | Sequential | Information not available | Terrain data; Hill and valley classification; No noise and noise variations; Sequential data; Binary classification; Real-valued features; Creative Commons licensed |
| [Isdb - International Stiction Data Base](markdown/isdb_-_international_stiction_data_base.md) | Likely | Multivariate, Time-Series, Control Systems | Real | Control loops; Valve stiction; Nonlinearities in control systems; Fault diagnosis; Oscillation detection; MATLAB software; Process industries |
| [Iv2V And Iv2I Industrial Datasets](markdown/iv2v_and_iv2i__industrial_datasets.md) | Information not available | Multivariate | Real | Industrial wireless datasets; Vehicle-to-vehicle communication; Vehicle-to-infrastructure communication; AI4Mobile project; Industrial communication systems; Wireless sensor data; Machine learning support data |
| [Individual Household Electric Power Consumption](markdown/individual_household_electric_power_consumption.md) | Likely Yes | Multivariate, Time-Series | Real | Electric power consumption; Time series; Household energy data; Multivariate data; Missing values present; Minute sampling rate; Sub-metering |
| [Industrial Safety And Health Analytics Database](markdown/industrial_safety_and_health_analytics_database.md) | Likely | Multivariate | Real | Industrial accidents; Workplace safety; Manufacturing plants; Occupational health; Accident severity levels; Multicountry data; Real-world data |
| [Kolektor Surface-Defect Dataset (KolektorSDD)](markdown/kolektor_surface-defect_dataset__kolektorsdd_.md)                                                                                 | Yes                       | Image data, Defect detection                                     | Real                      | Surface-defect detection; Industrial images; Defect annotations; Machine vision; Deep learning; Image segmentation; Controlled environment                                                                       |
| [Kylberg Texture Dataset](markdown/kylberg_texture_dataset.md) | Yes | Multiclass, Image, Texture | Real | Texture classification; Image patches; Multiclass dataset; Normalized images; Texture analysis; Computer vision; Image dataset |
| [Large Scale Image Dataset Of Wood Surface Defects](markdown/large_scale_image_dataset_of_wood_surface_defects.md) | Yes | Image, Multiclass Classification | Real | Wood surface; Image dataset; Object detection; Defect detection; YOLO annotations; Automated quality control; Industrial inspection |
| [Laser Welding](markdown/laser_welding.md) | Yes | Multivariate | Real | Laser beam welding; Steel-copper lap joints; Welding parameters; Cracking detection; Screening design; Weld depth analysis; Material thickness |
| [Li-Ion Battery Aging Datasets](markdown/li-ion_battery_aging_datasets.md) | Likely | Time-Series, Multivariate, Run-to-Failure | Real | Li-ion batteries; Battery aging; Prognostics testbed; Electrochemical Impedance Spectroscopy; Run-to-Failure data; Remaining Useful Life prediction; NASA PCoE |
| [MVTec Anomaly Detection (MVTec AD)](markdown/mvtec_anomaly_detection__mvtec_ad_.md)                                                                                                       | Yes                       | Image data, Anomaly detection, High-resolution images            | Real                      | Industrial inspection; High-resolution images; Anomaly detection; Pixel-precise annotations; Image dataset; Unsupervised learning; Defect detection                                                              |
| [Magnetic Tile Defect](markdown/magnetic_tile_defect.md)                                                                                                                                   | Information not available | Information not available                                        | Information not available | Information not available                                                                                                                                                                                        |
| [Maintenance Action Recommendation](markdown/maintenance_action_recommendation.md) | Yes | Multivariate, Time-Series, Classification | Real | Maintenance recommendation; Industrial equipment; Event codes; Parameter data; Remote monitoring; Diagnostics; PHM Society data challenge |
| [Maintenance Of Naval Propulsion Plants Data Set](markdown/maintenance_of_naval_propulsion_plants_data_set.md) | Yes | Multivariate | Synthetic | Naval propulsion; Gas Turbine; Simulator data; Performance decay; Regression task; Multivariate data; Synthetic data |
| [Manufacturing Defects](markdown/manufacturing_defects.md) | Yes | Univariate, Time-Series | Real | Manufacturing; Defect detection; Quality control; Time series analysis; Industrial data; Minor defects; Inspection data |
| [Manufacturing Cost](markdown/manufacturing_cost.md) | Likely Yes | Multivariate | Real | Manufacturing; Cost estimation; Economies of scale; Production volume; Regression; Real-world data |
| [Manywells](markdown/manywells.md) | Yes | Multivariate, Time-Series, Large-Scale | Synthetic | multiphase flow; simulation; oil and gas; time-series; large-scale; machine learning; Hugging Face |
| [Mechanic Component Images](markdown/mechanic_component_images.md) | Yes | Image, Multiclass Classification | Real | Mechanical components; Image data; Defect detection; Piston quality recognition; Computer vision; Multiclass classification; Manufacturing quality control |
| [Mechanical Analysis](markdown/mechanical_analysis.md) | Yes | Multivariate | Real | Fault diagnosis; Electromechanical devices; Multivariate data; Classification; Machine components; Real-world data; Mechanical measurements |
| [Mercedes-Benz Greener Manufacturing](markdown/mercedes-benz_greener_manufacturing.md) | Yes | Multivariate | Real | Automobiles; Manufacturing; Regression; Categorical data; Feature permutations; Test bench optimization; Carbon dioxide emissions |
| [Milling](markdown/milling.md) | Information not available | Information not available | Information not available | NASA; Intelligent Systems; Autonomous Systems; Prognostics; Robust Software Engineering; Collaborative Systems; Ames Research Center |
| [Milling Wear](markdown/milling_wear.md) | Information not available | Information not available | Information not available | Information not available |
| [Multi-Stage Continuous-Flow Manufacturing Process](markdown/multi-stage_continuous-flow_manufacturing_process.md) | Likely | Multivariate, Time-Series | Real | Manufacturing Process; Continuous Flow; Multistage; Time-Series Data; Real Production Data; Regression Task; Process Control |
| [Nasa Bearing Dataset](markdown/nasa_bearing_dataset.md) | Yes | Multivariate, Time-Series | Real | Bearing vibration data; Predictive maintenance; Prognostics; Test-to-failure experiments; Time-series sensor data; Accelerometer signals; Rotating machinery |
| [Neu Surface Defect Dataset](markdown/neu_surface_defect_dataset.md) | Information not available | Image, Multiclass | Real | Surface defect detection; Steel strip images; Industrial defect classification; Image dataset; Hot-rolled steel; Machine learning; Computer vision |
| [Oecd Data - Crude Oil Production](markdown/oecd_data_-_crude_oil_production.md) | Likely No | Multivariate, Time-Series | Real | Oil and Gas; Energy; Time-Series; Crude Oil Production; OECD Data; Regression; Country-level Data |
| [Oil Storage Tanks](markdown/oil_storage_tanks.md) | Yes | Image data, Multivariate | Real | Oil storage tanks; Satellite imagery; Object detection; Floating head tanks; Bounding box annotations; Google Earth images; Energy sector |
| [Oil And Gas](markdown/oil_and_gas.md) | Yes | Multivariate, Time-Series | Real | Oil production; Natural gas production; Energy prices; Exports data; Historical data; Time-series; Economic indicators |
| [Oil Well](markdown/oil_well.md) | Yes | Multivariate, Time-Series | Real | Oil well operation; Time-series data; Reservoir pressure; Oil and gas production; Water cut percentage; Dynamic level; Field development monitoring |
| [Degradation Of A Cutting Blade](markdown/degradation_of_a_cutting_blade.md) | No | Multivariate, Time-Series | Real | Industrial component degradation; Cutting blade; Predictive maintenance; Multivariate time series; Manufacturing; Sensor data; Remaining useful life prediction |
| [Open Industrial Data Project (Oil & Gas)](markdown/open_industrial_data_project__oil___gas_.md)                                                                                           | No                        | Multivariate, Time-Series, Industrial                            | Real                      | Oil and Gas; Industrial Data; Time-Series Data; Predictive Maintenance; Condition Monitoring; Cognite Data Fusion; Aker BP; Open Data                                                                            |
| [Oscillation Detection Artificial Dataset](markdown/oscillation_detection_artificial_dataset.md) | Information not available | Information not available | Likely Synthetic | Oscillation detection; Artificial dataset; Machine learning; Time series data; Signal analysis; Control systems; Industrial process monitoring |
| [Phm 2008 Challenge](markdown/phm_2008_challenge.md) | Information not available | Information not available | Information not available | PHM challenge; Predictive maintenance; Prognostics; NASA dataset; Machine learning dataset |
| [Phm Data Challenge](markdown/phm_data_challenge.md) | Yes | Multivariate, Time-Series, Fault Detection | Real | Fault detection; Prognostics; Industrial plant monitoring; Time-series sensor data; Multivariate data; Predictive maintenance; Open competition |
| [Panasonic 18650Pf Li-Ion Battery Data](markdown/panasonic_18650pf_li-ion_battery_data.md) | Likely | Multivariate, Time-Series | Real | Lithium Ion Battery; State of Charge Estimation; Kalman Filtering; Neural Networks; Energy Storage; Electric Vehicles; Battery Testing |
| [Parts Manufacturing](markdown/parts_manufacturing.md) | Yes | Multivariate | Real | Manufacturing; Parts dimensions; Operator performance; Multivariate data; Industrial dataset; Classification; Real-world data |
| [Plant Fault Detection](markdown/plant_fault_detection.md) | Likely | Information not available | Information not available | PHM Society; Prognostics and Health Management; Fault detection; Industrial plant data; Maintenance data challenge; Predictive maintenance; Time-series likely |
| [Plastic Extrusion Defects](markdown/plastic_extrusion_defects.md) | Likely Yes | Multivariate, Time-Series, Tabular | Real | Plastic extrusion; Manufacturing defect detection; Time-series sensor data; Process parameters; Visual defect inspection; Film breakage; Multivariate data |
| [Power Consumption Of Tetuan City](markdown/power_consumption_of_tetuan_city.md) | Likely | Multivariate, Time-Series | Real | Power consumption; Time-series; Energy distribution networks; Morocco; Weather data integration; Multivariate dataset; Regression task |
| [Predicting Manufacturing Defects Dataset](markdown/predicting_manufacturing_defects_dataset.md) | Yes | Tabular, Multivariate | Synthetic | Manufacturing; Defect Prediction; Synthetic Data; Quality Control; Production Metrics; Supply Chain; Classification Dataset |
| [Production Plant Data For Condition Monitoring](markdown/production_plant_data_for_condition_monitoring.md) | Yes | Multivariate | Real | Condition monitoring; Predictive maintenance; Run-to-failure; Production plant; Self-Organizing Map; Degradation prediction; Industrial process data |
| [Productivity Prediction Of Garment Employees](markdown/productivity_prediction_of_garment_employees.md) | Yes | Multivariate, Time-Series | Real | Garment manufacturing; Employee productivity; Labour-intensive industry; Time-series data; Regression task; Classification task; Industry expert validated |
| [Prognostics Data Repository](markdown/prognostics_data_repository.md) | Likely | Time-Series, Multivariate | Both | Prognostics; Time-Series Data; Run-to-Failure Data; NASA Ames Research Center; Battery Data; Engine Degradation; Industrial Equipment Monitoring |
| [Pump Sensor Data](markdown/pump_sensor_data.md) | Yes | Multivariate, Time-Series | Real | Predictive maintenance; Water pump monitoring; Sensor data; Multivariate time series; Classification task; Industrial equipment; Real-world data |
| [Mining Process](markdown/mining_process.md) | Yes | Multivariate, Time-Series | Real | Mining process; Flotation plant; Iron ore; Silica impurity; Time series; Industrial data; Regression task |
| [Quality Prediction In A Mining Process](markdown/quality_prediction_in_a_mining_process.md) | Yes | Multivariate, Time-Series | Real | Mining process; Flotation plant; Iron ore quality; Silica impurity prediction; Process engineering; Time series data; Industrial manufacturing |
| [Railway Surface Defect Detection Dataset](markdown/railway_surface_defect_detection_dataset.md) | Yes | Image, Multivariate | Real | Railway inspection; Surface defect detection; Image dataset; Deep learning; Computer vision; Image defects; Industrial maintenance |
| [Renewable Power Plants](markdown/renewable_power_plants.md) | Likely No | Multivariate, Time-Series | Real | Renewable Energy; Power Plants; Energy Capacity; Time Series; European Countries; Multivariate Data; Energy Infrastructure |
| [Road Surface Cracks Dataset](markdown/road_surface_cracks_dataset.md) | Information not available | Image data | Real | Road surface cracks; Image data; Crack detection; Material defects; Infrastructure monitoring; GitHub dataset; Computer vision |
| [Robot Execution Failures](markdown/robot_execution_failures.md) | Yes | Multivariate, Time-Series | Real | Force and torque measurements; Robot failure detection; Multivariate time-series; Classification tasks; Physical sensor data; Integer features; Short time window data |
| [Sacac](markdown/sacac.md) | Information not available | Multivariate, Industrial Process Data | Real | Industrial process data; Control loop performance; PID control loops; Process industries; Multivariate data; Fault diagnosis; Process control monitoring |
| [Secom](markdown/secom.md) | Yes | Multivariate | Real | Semi-conductor manufacturing; Feature selection; Sensor data; Yield prediction; Multivariate data; Missing values; Classification |
| [Sml2010](markdown/sml2010.md) | Yes | Multivariate, Sequential, Time-Series, Text | Real | Smart home monitoring; Domotic house; Environmental sensors; Time-series data; Multivariate data; Indoor temperature; Carbon dioxide levels |
| [Secure Water Treatment (SWaT) Dataset](markdown/secure_water_treatment__swat__dataset.md)                                                                                                 | Yes                       | Multivariate, Time-Series, Cyber-Physical Systems                | Real                      | Cyber-Physical Systems; Industrial Control Systems; Water Treatment; Anomaly Detection; Time-Series Data; Sensor Data; Cybersecurity                                                                             |
| [Severstal Steel Defect Detection](markdown/severstal_steel_defect_detection.md) | Yes | Multivariate, Image data | Real | Steel manufacturing; Defect detection; Image segmentation; Image classification; Manufacturing quality control; Multiclass defects; Kaggle competition |
| [Solar Power Generation Data](markdown/solar_power_generation_data.md) | Likely | Multivariate, Time-Series | Real | Solar power; Renewable energy; Time-Series; Multivariate; Sensor data; Power generation; India |
| [Steel Dataset](markdown/steel_dataset.md) | Yes | Multivariate, Time-Series | Real | Steel industry; Energy consumption; Time-series data; Electricity usage; Reactive power; Environmental data; Manufacturing data |
| [Steel Industry Datasets](markdown/steel_industry_datasets.md) | Likely | Multivariate, Time-Series | Real | Energy consumption; Steel production; Reactive power; CO2 emissions; Time-Series Data; Power Factor; Renewable energy integration |
| [Steel Plate Faults](markdown/steel_plate_faults.md) | Yes | Multivariate | Real | Steel plates; Fault classification; Multivariate dataset; Integer features; Real features; Pattern recognition; No missing values |
| [Superconductivity Data](markdown/superconductivty_data.md) | Yes | Multivariate | Real | Superconductors; Physics and Chemistry; Critical temperature prediction; Multivariate data; Real-valued features; Chemical formula data; No missing values |
| [Tig Welding](markdown/tig_welding.md) | Yes | Image, Multivariate | Real | TIG welding; Aluminium 5083; HDR camera; Weld defect classification; Neural networks; Image data; Non-destructive testing |
| [Tennessee Eastman Process Simulation Dataset](markdown/tennessee_eastman_process_simulation_dataset.md) | Yes | Multivariate, Time-Series, Synthetic | Synthetic | Chemical process simulation; Fault detection; Anomaly detection; Process monitoring; Time-series data; Multivariate data; Synthetic data |
| [Textures Classification Dataset](markdown/textures_classification_dataset.md) | Likely | Image data, Surface defect inspection | Information not available | Texture classification; Surface defect inspection; Image dataset; Convolutional neural network; Machine learning; Defect detection; Computer vision |
| [Textures Under Varying Illumination](markdown/textures_under_varying_illumination.md) | Information not available | Image, Multivariate, Variations in scale, pose, and illumination | Real | Texture images; Illumination variation; Pose variation; Scale variation; Material recognition; Image database; Computer vision |
| [The Reference Energy Disaggregation Data Set (REDD)](markdown/the_reference_energy_disaggregation_data_set__redd_.md) | Yes | Multivariate, Time-Series | Real | Energy disaggregation; Residential buildings; Power consumption; Multivariate time series; High frequency measurements; Circuit level data; Massachusetts |
| [Tool Path Generation](markdown/tool_path_generation.md) | Likely | Multivariate | Real | 5-axis machining; Tool path optimization; Shape deviation; Cutting conditions; Manufacturing; Regression; Multivariate data |
| [Top Defense Manufacturers](markdown/top_defense_manufacturers.md) | Yes | Tabular, Multivariate | Real | Defense contractors; Military industry; Company revenue data; Business data; International companies; Defense sector; Revenue ranking |
| [Turbofan Engine Degradation Simulation Data Set](markdown/turbofan_engine_degradation_simulation_data_set.md) | Yes | Multivariate, Time-Series | Synthetic | Turbofan engine; Degradation simulation; C-MAPSS; Run-to-failure; Prognostics; Sensor data; Fault modes |
| [Turning Dataset For Chatter Diagnosis](markdown/turning_dataset_for_chatter_diagnosis.md) | Yes | Multivariate, Time-Series | Real | Chatter diagnosis; Machining; Turning; Accelerometer data; Microphone data; Laser tachometer; Time-series sensor data |
| [U S Crude Oil Imports](markdown/u_s__crude_oil_imports.md) | Yes | Multivariate, Time-Series | Real | Crude oil; Energy imports; Time-series data; United States; Oil and Gas; Economic analysis; Supply chain |
| [Uk-Dale Dataset](markdown/uk-dale_dataset.md) | Yes | Multivariate, Time-Series | Real | Energy consumption; Domestic electricity usage; Appliance-level data; High-frequency power data; Time-series data; Multivariate; United Kingdom |
| [Urban Land Cover](markdown/urban_land_cover.md) | Yes | Multivariate | Real | Urban land cover; High resolution aerial imagery; Multivariate data; Classification; Feature selection; Spectral data; Texture analysis |
| [Vehicle Manufacturing Dataset](markdown/vehicle_manufacturing_dataset.md) | Information not available | Information not available | Information not available | |
| [Versatile Production Dataset](markdown/versatile_production_dataset.md) | Likely | Multivariate, Time-Series, Condition Monitoring | Real | Industrial production; Condition monitoring; Predictive maintenance; Manufacturing data; Sensor data; Anomaly detection; Time-series data |
| [Wm811K Wafer Maps](markdown/wm811k_wafer_maps.md) | Likely | Classification | Real | Wafer map; Semiconductor manufacturing; Defect classification; Pattern recognition; Large-scale dataset; Machine learning; Failure analysis |
| [Water Distribution Wadi Dataset](markdown/water_distribution__wadi__dataset.md) | Yes | Multivariate, Time-Series | Real | Cyber-Physical Systems; Water distribution; Critical infrastructure; Attack scenarios; Sensor data; Time-series data; Anomaly detection |
| [Wind Turbine Scada Dataset](markdown/wind_turbine_scada_dataset.md) | Yes | Multivariate, Time-Series | Real | Wind turbine; SCADA data; Renewable energy; Time-series; Multivariate; Power generation; Wind speed and direction |
| [Wine Quality](markdown/wine_quality.md) | Yes | Multivariate | Real | Wine samples; Physicochemical tests; Wine quality; Red and white wine; Portuguese Vinho Verde; Classification; Regression |

## Acknowledgements
This repository was partly inspired by and developed using ideas from the following repositories:
- [Industrial ML Datasets by Nicolas J92](https://github.com/nicolasj92/industrial-ml-datasets)
- [Awesome Industrial Machine Datasets by Makinarocks](https://github.com/makinarocks/awesome-industrial-machine-datasets)
- [Industrial Surface Inspection Datasets by Donrax](https://github.com/donrax/industrial-surface-inspection-datasets)

Their work provided useful reference points during the development of this project.
## LLM Assistance for JSON Generation
To save time and keep the entries consistent, I used a large language model (LLM) to build the JSON files automatically. The LLM reads `datasets.csv`, loads each dataset's webpage, and fills in a predefined JSON template. This approach saves time and also makes it easier to regenerate the JSON files when changes are required.
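
The workflow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the script used in this repository: the CSV column names (`name`, `url`), the keys in `TEMPLATE`, and the `fetch_page` and `ask_llm` callables are all hypothetical placeholders. The real template is whatever the files in the `json` folder use.

```python
import csv
import io
import json
from typing import Callable

# Hypothetical template; the real keys are defined by the files in the `json` folder.
TEMPLATE = {"name": "", "url": "", "summary": ""}


def build_prompt(row: dict, page_text: str) -> str:
    """Build a prompt asking for a JSON object with exactly the template's keys."""
    return (
        "Fill in this JSON template for the dataset described below. "
        "Return only JSON.\n"
        f"Template: {json.dumps(TEMPLATE)}\n"
        f"Dataset name: {row['name']}\n"
        f"Source URL: {row['url']}\n"
        f"Page text: {page_text}"
    )


def generate_entries(
    csv_text: str,
    fetch_page: Callable[[str], str],  # assumed: returns the page text for a URL
    ask_llm: Callable[[str], str],     # assumed: returns the model's reply as a string
) -> list:
    """Turn each CSV row into one JSON entry that matches the template."""
    entries = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        reply = ask_llm(build_prompt(row, fetch_page(row["url"])))
        entry = json.loads(reply)
        # Reject replies whose keys drift from the template.
        if set(entry) != set(TEMPLATE):
            raise ValueError(f"unexpected keys for {row['name']}: {sorted(entry)}")
        entries.append(entry)
    return entries
```

Passing the page loader and the model call in as functions keeps the sketch independent of any particular HTTP client or LLM provider, and lets the key check run offline with stubbed replies.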
## Contribution Guidelines
Thank you for considering contributing to our repository.
### How You Can Contribute
You can contribute in several ways:
- **Suggest a New Dataset**: Propose a new dataset by creating an issue under the "Enhancement" label in the Issues tab.
- **Add a Dataset**: Create a JSON file describing a dataset and submit a pull request to add it to the repository.
- **Suggest Changes**: You can suggest improvements through the Issues tab or directly edit the JSON files and submit your changes via a pull request.

### Adding a Dataset
Before adding a new dataset, please ensure that it is unique and not already included in the repository.

To add a dataset:
1. Create a JSON file that accurately describes the dataset, following the same template as the existing datasets in the `json` folder.
2. Place this JSON file in the `json/manual` folder.

#### Updating Documentation
To update the documentation (Markdown and HTML files) and refresh the README:
1. Run the `generate_documentation.py` script located in the root of the repository. This script will:
   - Generate Markdown files in the `markdown` folder.
   - Generate HTML files in the `html` folder.
   - Update the `README.md` file with the latest datasets table.

### Making a Pull Request
Please adhere to these guidelines when submitting a pull request:
- **Check for Duplicates**: Ensure your contribution is unique and not already included.
- **Submit Separate Pull Requests**: Submit individual pull requests for each suggestion or dataset.
- **Follow the Format**: Use our JSON template for datasets and keep the documentation readable and well structured.
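
The duplicate and format checks above can be partly automated before opening a pull request. The following is a minimal sketch, not a tool shipped with this repository: `check_new_dataset` and `normalized` are hypothetical helper names, and the sketch assumes that every dataset file is a single JSON object and that existing files sit somewhere under the `json` folder. It compares file names after normalization to flag likely duplicates, and compares top-level keys against each existing file instead of hard-coding the template.

```python
import json
from pathlib import Path


def normalized(stem: str) -> str:
    """Lowercase and keep only letters and digits, so 'Steel-Dataset' matches 'steel_dataset'."""
    return "".join(ch for ch in stem.lower() if ch.isalnum())


def check_new_dataset(new_file: Path, json_root: Path) -> list:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    # Assumes the top level of each file is a JSON object.
    new_keys = set(json.loads(new_file.read_text(encoding="utf-8")))
    for existing in sorted(json_root.rglob("*.json")):
        if existing.resolve() == new_file.resolve():
            continue  # do not compare the new file with itself
        if normalized(existing.stem) == normalized(new_file.stem):
            problems.append(f"possible duplicate of {existing.name}")
        existing_keys = set(json.loads(existing.read_text(encoding="utf-8")))
        missing = existing_keys - new_keys
        if missing:
            problems.append(
                f"missing keys compared with {existing.name}: {sorted(missing)}"
            )
    return problems
```

For example, `check_new_dataset(Path("json/manual/my_dataset.json"), Path("json"))` (with `my_dataset.json` as a placeholder file name) returns the list of flagged problems. A name-based comparison only catches obvious duplicates, so a manual look through the datasets table is still worthwhile.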