https://github.com/alezoon/movie-revenue-prediction
Sk-learn practice using Linear Regression, ML workflow practice.
https://github.com/alezoon/movie-revenue-prediction
jupyter machine-learning matplotlib-pyplot numpy pandas python scikit-learn
Last synced: about 1 month ago
JSON representation
Sk-learn practice using Linear Regression, ML workflow practice.
- Host: GitHub
- URL: https://github.com/alezoon/movie-revenue-prediction
- Owner: alezoon
- Created: 2025-07-31T00:28:25.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-08-29T03:02:40.000Z (3 months ago)
- Last Synced: 2025-08-29T06:51:33.974Z (3 months ago)
- Topics: jupyter, machine-learning, matplotlib-pyplot, numpy, pandas, python, scikit-learn
- Language: Jupyter Notebook
- Homepage:
- Size: 725 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Movie Revenue Prediction
Machine learning model predicting box office revenue using budget and popularity metrics.
## Project Overview
This project analyzes movie industry data to predict box office revenue using linear regression with logarithmic transformation. The model identifies key financial indicators that drive movie success.
## Key Results
- **Model:** Linear Regression with log transformation
- **R² Score:** 0.65 (65% variance explained)
- **RMSE:** 1.18 (log scale)
- **Features:** Budget (log transformed), Popularity score
- Successfully improved predictions through feature engineering
## Technologies
- Python (pandas, scikit-learn, numpy, matplotlib)
- Linear Regression with log transformation
- Feature engineering and preprocessing
- Modular code architecture
## Project Structure
```
MovieRevenuePrediction/
├── data/ # Dataset storage
│ └── tmdb_movie_dataset.csv
├── src/ # Source code
│ ├── data_processing.py # Data cleaning and preprocessing
│ └── model.py # Model training and evaluation
├── notebooks/ # Jupyter notebooks
│ └── movie_revenue_analysis.ipynb
├── results/ # Model outputs
│ ├── plots/ # Visualization outputs
│ └── metrics.txt # Model performance metrics
├── requirements.txt # Project dependencies
└── README.md # Project documentation
```
## Features
- Automated data preprocessing pipeline
- Log transformation for handling skewed financial data
- Train/test split validation
- Comprehensive model evaluation metrics
- Revenue prediction for new movies
## Example Prediction
```python
# Predict revenue for a $50M budget movie with popularity score of 3
predicted_revenue = revenue_prediction(reg, 50000000, 3)
# Output: $116,991,401
## Usage
```bash
# Install dependencies
pip install -r requirements.txt
# Run the notebook
jupyter notebook movie_revenue_analysis.ipynb
# Or use the modules directly
from src.model import train_model, revenue_prediction
```