https://github.com/pankajarm/boston-house-pricing-prediction
https://github.com/pankajarm/boston-house-pricing-prediction
Last synced: 3 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/pankajarm/boston-house-pricing-prediction
- Owner: pankajarm
- Created: 2017-01-29T18:32:52.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2017-01-29T18:51:33.000Z (over 8 years ago)
- Last Synced: 2025-03-02T09:39:40.769Z (3 months ago)
- Language: Python
- Size: 5.86 KB
- Stars: 1
- Watchers: 2
- Forks: 1
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Boston-House-Pricing-Prediction
In this nano project, We'll use Multiple linear regression (many Variables) to make prediction of Boston House Pricing from
the given set of variables like number of rooms, square feet size of house, etc.We are using SciKit Learn to Create Linear Regression Model
Read more about SciKit Learn LinearRegression Here http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
In this model we are using Multiple Linear Regression equation
y = m1x1+ m2x2 + m3x3 +.....mnxn +bIn our example (These are given from data):
y is MEDV: Median value of owner-occupied homes in $1000's
x1. CRIM: per capita crime rate by town
x2. ZN: proportion of residential land zoned for lots over 25,000 sq.ft.
x3. INDUS: proportion of non-retail business acres per town
x4. CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
x5. NOX: nitric oxides concentration (parts per 10 million)
x6. RM: average number of rooms per dwelling
x7. AGE: proportion of owner-occupied units built prior to 1940
x8. DIS: weighted distances to five Boston employment centres
x9. RAD: index of accessibility to radial highways
x10. TAX: full-value property-tax rate per $10,000
x11. PTRATIO: pupil-teacher ratio by town
x12. B: 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
x13. LSTAT: % lower status of the population
These will be generated during model fit process
m1 is the slope (relationship factor ) between x1 and y, m2 is the slope between x2 and y, m3 is the slope between x3 and y, so on
b is constant value
Long Story short, They way it works (atleast in my understanding..), is SciKit Learn LinearRegression model will try to create a model of equation type y = m1x1+ m2x2 + m3x3 +.....mnxn +b from the given data which has multiple variables (x1,x2,x3...xn) and House Median Value (y), via fitting all values of x1 in relationship to y by finding slope m and by adding b (a constant value) to it.
Basically, by using SciKit Learn Linear Regression fit method, we are telling it to find correct relationhip for example, between given Data of x1 (CRIME in the Area) with y (Median House Value)
and
x2 (proportion of residential land zoned for lots over 25,000 sq.ft.) with y (Median House Value)
and
x3 (proportion of non-retail business acres per town ) with y (Median House Value)
and so on ...till x13 (% lower status of the population ) with y (Median House Value)
Hope it all make sense to you...
Read more about above varibles relationship to house pricing in Boston here.
https://archive.ics.uci.edu/ml/datasets/Housing