https://github.com/bchoubert/hadoop-regression-vehicles

polytech-lyon


Host: GitHub
URL: https://github.com/bchoubert/hadoop-regression-vehicles
Owner: bchoubert
Created: 2017-01-15T21:03:43.000Z (over 8 years ago)
Default Branch: master
Last Pushed: 2017-01-15T21:49:09.000Z (over 8 years ago)
Last Synced: 2025-03-11T13:53:00.698Z (3 months ago)
Topics: polytech-lyon
Language: Java
Size: 4.88 KB
Stars: 0
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# hadoop-regression-vehicles

This repo is an example of a Hadoop Map-Combiner-Reduce operation over a CSV file.

Regression is a Java program that reads vehicle data from the `Vehicules.csv` file, which contains a sample of vehicles.

The goal is to compute the covariance between the weight and the consumption of the vehicles, along with the correlation and the linear regression coefficients listed in the Results section.
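
A covariance fits the Map-Combiner-Reduce pattern because it can be derived from a few running sums, and sums can be merged in any order. The sketch below illustrates that idea in plain Java without any Hadoop dependency. The `PartialSums` class, its method names, and the sample records are hypothetical and are not taken from the repository.

```java
/** Partial sums for one slice of the data (hypothetical helper, not the repository's code). */
final class PartialSums {
    double n, sx, sy, sxy;

    /** Map step: turn one (weight, consumption) record into partial sums. */
    static PartialSums of(double x, double y) {
        PartialSums p = new PartialSums();
        p.n = 1;
        p.sx = x;
        p.sy = y;
        p.sxy = x * y;
        return p;
    }

    /** Combine/reduce step: the sums simply add up, so merge order does not matter. */
    PartialSums merge(PartialSums other) {
        PartialSums p = new PartialSums();
        p.n = n + other.n;
        p.sx = sx + other.sx;
        p.sy = sy + other.sy;
        p.sxy = sxy + other.sxy;
        return p;
    }

    /** Population covariance: mean(x * y) - mean(x) * mean(y). */
    double covariance() {
        return sxy / n - (sx / n) * (sy / n);
    }

    public static void main(String[] args) {
        // Made-up records, for illustration only.
        PartialSums left = of(1000, 8.0).merge(of(1200, 9.0)); // what one combiner might emit
        PartialSums right = of(1400, 10.0);                    // output of another mapper
        System.out.println("Covariance: " + left.merge(right).covariance());
    }
}
```

Because `merge` is associative and commutative, the same logic can serve as both combiner and reducer, which is presumably why a combiner is usable for this job.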

## File structure

The `Vehicules.csv` file has two columns:

Weight (kg) | Consumption (liters)
--- | ---
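
As an illustration of how one line of such a file could be turned into a record, here is a minimal sketch. The README does not state the field delimiter or the decimal format, so the delimiter is passed in as a parameter and a dot is assumed as the decimal separator. The `VehicleRecord` class and its field names are hypothetical.

```java
import java.util.regex.Pattern;

/** One parsed line of the CSV file (hypothetical helper, not the repository's code). */
final class VehicleRecord {
    final double weightKg;
    final double consumptionLiters;

    VehicleRecord(double weightKg, double consumptionLiters) {
        this.weightKg = weightKg;
        this.consumptionLiters = consumptionLiters;
    }

    /**
     * Parses "weight<delimiter>consumption".
     * Returns null for a header, an empty line, or any line that is not two numbers.
     */
    static VehicleRecord parse(String line, String delimiter) {
        // Quote the delimiter so characters such as '|' are not treated as regex syntax.
        String[] fields = line.split(Pattern.quote(delimiter));
        if (fields.length < 2) {
            return null;
        }
        try {
            return new VehicleRecord(
                    Double.parseDouble(fields[0].trim()),
                    Double.parseDouble(fields[1].trim()));
        } catch (NumberFormatException e) {
            return null; // non-numeric content, e.g. a header line
        }
    }

    public static void main(String[] args) {
        // Made-up line and delimiter, for illustration only.
        VehicleRecord r = parse("1200;9.5", ";");
        System.out.println(r.weightKg + " kg, " + r.consumptionLiters + " l");
    }
}
```

Returning `null` for unparseable lines lets a mapper skip headers silently instead of failing the whole job.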

## Execute the project

With Hadoop installed, first copy the file to HDFS:

`hadoop fs -put Vehicules.csv /test`

Next, compile the project (for example with Maven: `mvn clean package`), then run the job:

`hadoop jar NameOfYourJar.jar RegressionDriver /test/Vehicules.csv /results`

You can view the results using Hue, for example.

## Results

X = Weight

Y = Consumption

```
n	28.0
Sx	33515.0
Sx²	4.2694125E7
X Variance	92066.67729591834
Sy	254.1
Sy²	2440.5699999999993
Sum X x Y	321404.5
X average	1196.9642857142858
X² average	1524790.1785714286
Y average	9.075
Y² average	87.16321428571426
X x Y average	11478.732142857143
X variance	92066.67729591834
Y variance	4.807589285714272
Covariance	616.28125
Corrélation r	0.9263263981866983
β0	1.0626912269314541
β1	0.006693857844127085
yi = 1.0626912269314541 + 0.006693857844127085 * xi + ei	0.0
```

The correlation between weight and consumption is strong (r ≈ 0.93): in this sample, heavier vehicles consume more.
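
The derived values above appear consistent with the usual population formulas applied to the six sums (`n`, `Sx`, `Sx²`, `Sy`, `Sy²`, `Sum X x Y`). The sketch below recomputes them from those sums as a cross-check. The `RegressionFromSums` class and its `fit` method are hypothetical, and this is not necessarily how the repository's reducer is written.

```java
/** Recomputes the regression statistics from running sums (hypothetical helper). */
final class RegressionFromSums {

    /** Returns {covariance, correlation r, beta0, beta1} using population formulas. */
    static double[] fit(double n, double sx, double sxx, double sy, double syy, double sxy) {
        double meanX = sx / n;
        double meanY = sy / n;
        double varX = sxx / n - meanX * meanX;   // mean(x²) - mean(x)²
        double varY = syy / n - meanY * meanY;   // mean(y²) - mean(y)²
        double cov = sxy / n - meanX * meanY;    // mean(xy) - mean(x) * mean(y)
        double r = cov / Math.sqrt(varX * varY); // Pearson correlation
        double beta1 = cov / varX;               // slope
        double beta0 = meanY - beta1 * meanX;    // intercept
        return new double[] {cov, r, beta0, beta1};
    }

    public static void main(String[] args) {
        // Sums copied from the results listed above.
        double[] f = fit(28.0, 33515.0, 4.2694125E7, 254.1, 2440.5699999999993, 321404.5);
        System.out.println("Covariance  " + f[0]);
        System.out.println("r           " + f[1]);
        System.out.println("beta0       " + f[2]);
        System.out.println("beta1       " + f[3]);
    }
}
```

The fitted line `yi = β0 + β1 * xi` can then be used to estimate consumption from weight, within the weight range covered by the sample.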
