https://github.com/snowdogapps/python-data-engineer-intern-recruitment-test
This is a test recruitment task for intern candidates for the Python Developer or Data Engineer position.
- Host: GitHub
- URL: https://github.com/snowdogapps/python-data-engineer-intern-recruitment-test
- Owner: SnowdogApps
- Created: 2020-11-16T08:00:33.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2020-11-16T08:03:51.000Z (over 4 years ago)
- Last Synced: 2025-04-12T21:58:32.897Z (16 days ago)
- Size: 1.95 KB
- Stars: 0
- Watchers: 4
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Python Developer / Data Engineer Intern recruitment test
This is a test recruitment task for intern candidates for the Python Developer or Data Engineer position.

0. Fork this repository.
1. Download the following files from https://www.imdb.com/interfaces/:
- `title.basics.tsv.gz`
- `title.ratings.tsv.gz`
2. Implement a Python script that merges the data about videos from those files on the *tconst* column and saves the result to the `all_videos.tsv` file. The new file should contain the columns from both `title.basics.tsv.gz` and `title.ratings.tsv.gz`.
3. Implement a Python script that extracts from `all_videos.tsv` only the videos whose *titleType* is "movie" and saves them to the `movies.tsv` file.
4. Implement a Python script that calculates the average rating for each *genre* and saves the results to the `movie_rating_by_genres.tsv` file.
5. Repeat point 4 for the average rating per *startYear* column and save the results to the `movie_rating_by_year.tsv` file.
6. Upload all scripts and the `movie_rating_by_genres.tsv` and `movie_rating_by_year.tsv` files to your repository. Use separate commits for points 2-5.
7. Optional: visualize the `movie_rating_by_genres.tsv` and `movie_rating_by_year.tsv` files on charts using any Python library, and upload the script and PNG files to your repository.
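
Points 2 and 3 could be approached along the following lines. This is only a sketch using the standard library, not a reference solution. The helper names (`read_tsv_gz`, `merge_on_tconst`, `only_movies`, `write_tsv`, `build_outputs`) are hypothetical. The task does not say how to treat titles that have no rating, so the inner join used here (such titles are dropped) is an assumption.

```python
import csv
import gzip


def read_tsv_gz(path):
    """Yield each row of a gzipped TSV file as a dict keyed by column name."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        # QUOTE_NONE keeps quote characters inside titles as plain text.
        yield from csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


def merge_on_tconst(basics_rows, ratings_rows):
    """Inner-join two iterables of row dicts on the tconst column."""
    # The ratings are held in memory, keyed by tconst.
    ratings_by_id = {row["tconst"]: row for row in ratings_rows}
    for basics in basics_rows:
        rating = ratings_by_id.get(basics["tconst"])
        if rating is not None:  # assumption: titles without a rating are dropped
            yield {**basics, **rating}


def only_movies(rows):
    """Keep only the rows whose titleType is exactly "movie"."""
    return (row for row in rows if row["titleType"] == "movie")


def write_tsv(path, rows):
    """Write row dicts to a TSV file, taking the header from the first row."""
    header = None
    with open(path, "w", encoding="utf-8", newline="") as handle:
        for row in rows:
            if header is None:
                header = list(row)
                handle.write("\t".join(header) + "\n")
            handle.write("\t".join(row[name] for name in header) + "\n")


def build_outputs(basics_path, ratings_path):
    """Points 2 and 3: write all_videos.tsv, then derive movies.tsv from it."""
    merged = merge_on_tconst(read_tsv_gz(basics_path), read_tsv_gz(ratings_path))
    write_tsv("all_videos.tsv", merged)
    with open("all_videos.tsv", encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        write_tsv("movies.tsv", only_movies(reader))
```

With both archives downloaded into the working directory, calling `build_outputs("title.basics.tsv.gz", "title.ratings.tsv.gz")` would produce the two files. Only the ratings are kept in memory; the larger basics file is streamed row by row.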
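
For points 4 and 5, one possible aggregation is sketched below. It rests on several assumptions: the ratings live in an `averageRating` column, genres are stored as a comma-separated list in a `genres` column, and missing values are written as `\N` (this matches the IMDb dataset description as far as I know, but check it against the downloaded files). The average is a plain mean over titles, not weighted by vote count, which is one of several reasonable readings of the task. The function names and the `average_rating` output column are hypothetical.

```python
import csv
from collections import defaultdict

MISSING = r"\N"  # assumed marker for missing values in the IMDb files


def average_rating_by(rows, key_column, split_values=False):
    """Return {key: mean averageRating} grouped by key_column.

    With split_values=True the key column is treated as a comma-separated
    list, so a title counts once for each of its genres.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of ratings, count]
    for row in rows:
        raw_key = row[key_column]
        rating = row["averageRating"]
        if raw_key == MISSING or rating == MISSING:
            continue  # skip rows with no key or no rating
        keys = raw_key.split(",") if split_values else [raw_key]
        for key in keys:
            totals[key][0] += float(rating)
            totals[key][1] += 1
    return {key: total / count for key, (total, count) in sorted(totals.items())}


def write_averages(path, key_column, averages):
    """Write the averages as a two-column TSV file."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(f"{key_column}\taverage_rating\n")
        for key, average in averages.items():
            handle.write(f"{key}\t{average:.2f}\n")


def read_tsv(path):
    """Load a TSV file into a list of row dicts."""
    with open(path, encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE))
```

Assuming `movies.tsv` is the intended input, `write_averages("movie_rating_by_genres.tsv", "genres", average_rating_by(read_tsv("movies.tsv"), "genres", split_values=True))` would cover point 4, and the same call with `"startYear"` and `split_values=False` would cover point 5.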