https://github.com/ereh11/when-was-the-golden-age-of-video-games
Using data manipulation skills in SQL to find hidden insights in a dataset and answer the question "When was the golden age of video games?"
- Host: GitHub
- URL: https://github.com/ereh11/when-was-the-golden-age-of-video-games
- Owner: Ereh11
- Created: 2023-10-24T21:12:13.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2023-10-24T22:31:33.000Z (about 1 year ago)
- Last Synced: 2023-10-24T23:25:17.329Z (about 1 year ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 0 Bytes
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# :video_game: When-Was-the-Golden-Age-of-Video-Games :video_game:
The global gaming market is projected to be worth more than $300 billion by 2027 according to Mordor Intelligence. With so much money at stake, the major game publishers are hugely incentivized to create the next big hit. But are games getting better, or has the golden age of video games already passed?
I'll explore the top 400 best-selling video games created between 1977 and 2020. I'll compare a dataset on game sales with critic scores and user reviews to determine whether or not video games have improved as the gaming market has grown.
The database contains two tables:

`game_sales`

| column | type | meaning |
|---|---|---|
| game | varchar | Name of the video game |
| platform | varchar | Gaming platform |
| publisher | varchar | Game publisher |
| developer | varchar | Game developer |
| games_sold | float | Number of copies sold (millions) |
| year | int | Release year |

`reviews`

| column | type | meaning |
|---|---|---|
| game | varchar | Name of the video game |
| critic_score | float | Critic score according to Metacritic |
| user_score | float | User score according to Metacritic |

At the beginning, I looked at some of the top-selling video games.
## 1. The ten best-selling video games
```sql
SELECT *
FROM game_sales
ORDER BY games_sold DESC
LIMIT 10;
```
| game | platform | publisher | developer | games_sold | year |
|---|---|---|---|---|---|
| Wii Sports for Wii | Wii | Nintendo | Nintendo EAD | 82.90 | 2006 |
| Super Mario Bros. for NES | NES | Nintendo | Nintendo EAD | 40.24 | 1985 |
| Counter-Strike: Global Offensive for PC | PC | Valve | Valve Corporation | 40.00 | 2012 |
| Mario Kart Wii for Wii | Wii | Nintendo | Nintendo EAD | 37.32 | 2008 |
| PLAYERUNKNOWN'S BATTLEGROUNDS for PC | PC | PUBG Corporation | PUBG Corporation | 36.60 | 2017 |
| Minecraft for PC | PC | Mojang | Mojang AB | 33.15 | 2010 |
| Wii Sports Resort for Wii | Wii | Nintendo | Nintendo EAD | 33.13 | 2009 |
| Pokemon Red / Green / Blue Version for GB | GB | Nintendo | Game Freak | 31.38 | 1998 |
| New Super Mario Bros. for DS | DS | Nintendo | Nintendo EAD | 30.80 | 2006 |
| New Super Mario Bros. Wii for Wii | Wii | Nintendo | Nintendo EAD | 30.30 | 2009 |
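
To experiment with this query outside the original notebook, here is a minimal sketch that loads a few rows into an in-memory SQLite database through Python's built-in `sqlite3` module. The column names follow the `game_sales` schema described earlier. The use of SQLite and the four-row sample (copied from the result table above) are assumptions for illustration, not the project's actual setup.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the project's real database.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE game_sales (
        game TEXT, platform TEXT, publisher TEXT,
        developer TEXT, games_sold REAL, year INTEGER
    )
""")

# Four rows copied from the result table above, inserted out of order on purpose.
sample_rows = [
    ("Minecraft for PC", "PC", "Mojang", "Mojang AB", 33.15, 2010),
    ("Wii Sports for Wii", "Wii", "Nintendo", "Nintendo EAD", 82.90, 2006),
    ("Mario Kart Wii for Wii", "Wii", "Nintendo", "Nintendo EAD", 37.32, 2008),
    ("Super Mario Bros. for NES", "NES", "Nintendo", "Nintendo EAD", 40.24, 1985),
]
conn.executemany("INSERT INTO game_sales VALUES (?, ?, ?, ?, ?, ?)", sample_rows)

# Same pattern as the query above: sort by copies sold, keep the top rows.
top_sellers = conn.execute("""
    SELECT game, games_sold
    FROM game_sales
    ORDER BY games_sold DESC
    LIMIT 3
""").fetchall()

for game, sold in top_sellers:
    print(f"{game}: {sold} million copies")
```

Because `ORDER BY games_sold DESC` runs before `LIMIT`, the insertion order of the rows does not affect which games are returned.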
## 2. Missing review scores
Wow, the best-selling video games were released between 1985 and 2017! That's quite a range; we'll have to use data from the `reviews` table to gain more insight on the best years for video games.

First, it's important to explore the limitations of our database. One big shortcoming is that there is no `reviews` data for some of the games in the `game_sales` table.

```sql
SELECT COUNT(game_sales.game)
FROM game_sales
LEFT JOIN reviews
USING(game)
WHERE critic_score IS NULL AND user_score IS NULL;
```
| count |
|---|
| 31 |
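
A `LEFT JOIN` is what makes this count possible: it keeps every `game_sales` row even when no `reviews` row matches, and fills the score columns with `NULL`. The sketch below shows the pattern in Python's `sqlite3` with made-up rows (the names `Game A` to `Game C` and their values are hypothetical). It then computes the share that a count of 31 represents, assuming the table holds the 400 games mentioned in the introduction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE game_sales (game TEXT, games_sold REAL, year INTEGER);
    CREATE TABLE reviews (game TEXT, critic_score REAL, user_score REAL);
    -- Hypothetical rows: 'Game C' has no entry in reviews.
    INSERT INTO game_sales VALUES
        ('Game A', 10.0, 1998), ('Game B', 8.0, 2008), ('Game C', 5.0, 1982);
    INSERT INTO reviews VALUES
        ('Game A', 9.5, 9.0), ('Game B', 8.5, 8.8);
""")

# Unmatched game_sales rows survive the LEFT JOIN with NULL scores.
missing = conn.execute("""
    SELECT COUNT(game_sales.game)
    FROM game_sales
    LEFT JOIN reviews
    USING(game)
    WHERE critic_score IS NULL AND user_score IS NULL
""").fetchone()[0]
print(missing)

# Assumption: game_sales holds the 400 games mentioned in the introduction.
share_missing = 31 / 400
print(f"{share_missing:.2%}")
```

With an `INNER JOIN` instead, the unmatched row would be dropped before the `WHERE` clause runs, and the count would always be zero.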
## 3. Years that video game critics loved
It looks like a little less than ten percent of the games in the `game_sales` table don't have any reviews data. That's a small enough percentage that we can continue our exploration, but the missing reviews data is a good thing to keep in mind as we move on to evaluating results from more sophisticated queries.

There are lots of ways to measure the best years for video games! Let's start with what the critics think.

```sql
SELECT year,
ROUND(AVG(critic_score), 2) AS avg_critic_score
FROM game_sales
INNER JOIN reviews
USING(game)
GROUP BY year
ORDER BY avg_critic_score DESC
LIMIT 10;
```
| year | avg_critic_score |
|---|---|
| 1990 | 9.80 |
| 1992 | 9.67 |
| 1998 | 9.32 |
| 2020 | 9.20 |
| 1993 | 9.10 |
| 1995 | 9.07 |
| 2004 | 9.03 |
| 1982 | 9.00 |
| 2002 | 8.99 |
| 1999 | 8.93 |
## 4. Was 1982 really that great?
The range of great years according to critic reviews goes from 1982 until 2020: we are no closer to finding the golden age of video games!
Hang on, though. Some of those `avg_critic_score` values look like suspiciously round numbers for averages. The value for 1982 looks especially fishy. Maybe there weren't a lot of video games in our dataset that were released in certain years.

Let's update our query and find out whether 1982 really was such a great year for video games.

```sql
SELECT COUNT(game_sales.game) AS num_games,
year,
ROUND(AVG(critic_score), 2) AS avg_critic_score
FROM game_sales
INNER JOIN reviews
USING(game)
GROUP BY year
HAVING COUNT(game_sales.game) > 4
ORDER BY avg_critic_score DESC
LIMIT 10;
```
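| num_games | year | avg_critic_score |
|---|---|---|
| 10 | 1998 | 9.32 |
| 11 | 2004 | 9.03 |
| 9 | 2002 | 8.99 |
| 11 | 1999 | 8.93 |
| 13 | 2001 | 8.82 |
| 26 | 2011 | 8.76 |
| 13 | 2016 | 8.67 |
| 18 | 2013 | 8.66 |
| 20 | 2008 | 8.63 |
| 12 | 2012 | 8.62 |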
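
The effect of the `HAVING` clause is easiest to see on a toy dataset. In this sketch (Python's `sqlite3`, with hypothetical games and scores), one year has a single highly rated game and tops the unfiltered ranking, but it disappears once a minimum group size is required. The threshold is lowered to one game to keep the sample small; the query above uses four.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE game_sales (game TEXT, year INTEGER);
    CREATE TABLE reviews (game TEXT, critic_score REAL);
    -- Hypothetical data: 1982 has one game, 1998 has three.
    INSERT INTO game_sales VALUES
        ('Solo Hit', 1982), ('Game A', 1998), ('Game B', 1998), ('Game C', 1998);
    INSERT INTO reviews VALUES
        ('Solo Hit', 9.0), ('Game A', 9.5), ('Game B', 8.0), ('Game C', 8.9);
""")

# The {having} placeholder lets the same query run with and without the filter.
base_query = """
    SELECT year,
           COUNT(game_sales.game) AS num_games,
           ROUND(AVG(critic_score), 2) AS avg_critic_score
    FROM game_sales
    INNER JOIN reviews USING(game)
    GROUP BY year
    {having}
    ORDER BY avg_critic_score DESC
"""

unfiltered = conn.execute(base_query.format(having="")).fetchall()
filtered = conn.execute(
    base_query.format(having="HAVING COUNT(game_sales.game) > 1")
).fetchall()

print(unfiltered)  # the single-game year ranks first
print(filtered)    # only the year with several reviewed games remains
```

`HAVING` is used rather than `WHERE` because the condition applies to each group's count, which only exists after `GROUP BY` has run.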
## 5. Years that dropped off the critics' favorites list
That looks better! The `num_games` column convinces us that our new list of the critics' top games reflects years that had quite a few well-reviewed games rather than just one or two hits. But which years dropped off the list due to having four or fewer reviewed games? Let's identify them so that someday we can track down more game reviews for those years and determine whether they might rightfully be considered as excellent years for video game releases!

To get started, we've created tables with the results of our previous two queries:

`top_critic_years`

| column | type | meaning |
|---|---|---|
| year | int | Year of video game release |
| avg_critic_score | float | Average of all critic scores for games released in that year |

`top_critic_years_more_than_four_games`

| column | type | meaning |
|---|---|---|
| year | int | Year of video game release |
| num_games | int | Count of the number of video games released in that year |
| avg_critic_score | float | Average of all critic scores for games released in that year |

```sql
SELECT year,
       avg_critic_score
FROM top_critic_years
EXCEPT
SELECT year,
       avg_critic_score
FROM top_critic_years_more_than_four_games
ORDER BY avg_critic_score DESC;
```
| year | avg_critic_score |
|---|---|
| 1990 | 9.80 |
| 1992 | 9.67 |
| 2020 | 9.20 |
| 1993 | 9.10 |
| 1995 | 9.07 |
| 1982 | 9.00 |
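
`EXCEPT` returns the rows of the first `SELECT` that do not appear in the second, comparing whole rows (here the `year` and `avg_critic_score` pair). The sketch below replays that set difference in Python's `sqlite3`, using the values shown in the two earlier result tables. Loading them by hand into an in-memory SQLite database is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE top_critic_years (year INTEGER, avg_critic_score REAL);
    CREATE TABLE top_critic_years_more_than_four_games
        (year INTEGER, num_games INTEGER, avg_critic_score REAL);
""")

# Values copied from the result tables in sections 3 and 4.
top_critic = [
    (1990, 9.80), (1992, 9.67), (1998, 9.32), (2020, 9.20), (1993, 9.10),
    (1995, 9.07), (2004, 9.03), (1982, 9.00), (2002, 8.99), (1999, 8.93),
]
more_than_four = [
    (1998, 10, 9.32), (2004, 11, 9.03), (2002, 9, 8.99), (1999, 11, 8.93),
    (2001, 13, 8.82), (2011, 26, 8.76), (2016, 13, 8.67), (2013, 18, 8.66),
    (2008, 20, 8.63), (2012, 12, 8.62),
]
conn.executemany("INSERT INTO top_critic_years VALUES (?, ?)", top_critic)
conn.executemany(
    "INSERT INTO top_critic_years_more_than_four_games VALUES (?, ?, ?)",
    more_than_four,
)

# Rows in the first list whose (year, score) pair is absent from the second.
dropped = conn.execute("""
    SELECT year, avg_critic_score
    FROM top_critic_years
    EXCEPT
    SELECT year, avg_critic_score
    FROM top_critic_years_more_than_four_games
    ORDER BY avg_critic_score DESC
""").fetchall()

print([year for year, _ in dropped])
```

Both `SELECT` statements must return the same number of columns, which is why `num_games` is left out of the second one.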
## 6. Years video game players loved
Based on our work in the task above, it looks like the early 1990s might merit consideration as the golden age of video games based on `critic_score` alone, but we'd need to gather more games and reviews data to do further analysis.

Let's move on to looking at the opinions of another important group of people: players! To begin, let's create a query very similar to the one we used in Task Four, except this one will look at `user_score` averages by year rather than `critic_score` averages.

```sql
SELECT COUNT(gs.game) AS num_games,
gs.year,
ROUND(AVG(r.user_score), 2) AS avg_user_score
FROM game_sales AS gs
INNER JOIN reviews AS r
USING(game)
GROUP BY gs.year
HAVING COUNT(gs.game) > 4
ORDER BY avg_user_score DESC
LIMIT 10;
```
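| num_games | year | avg_user_score |
|---|---|---|
| 8 | 1997 | 9.50 |
| 10 | 1998 | 9.40 |
| 23 | 2010 | 9.24 |
| 20 | 2009 | 9.18 |
| 20 | 2008 | 9.03 |
| 5 | 1996 | 9.00 |
| 13 | 2005 | 8.95 |
| 16 | 2006 | 8.95 |
| 8 | 2000 | 8.80 |
| 11 | 1999 | 8.80 |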
## 7. Years that both players and critics loved
Alright, we've got a list of the top ten years according to both critic reviews and user reviews. Are there any years that showed up on both tables? If so, those years would certainly be excellent ones!
Recall that we have access to the `top_critic_years_more_than_four_games` table, which stores the results of our top critic years query from Task 4:

`top_critic_years_more_than_four_games`

| column | type | meaning |
|---|---|---|
| year | int | Year of video game release |
| num_games | int | Count of the number of video games released in that year |
| avg_critic_score | float | Average of all critic scores for games released in that year |

We've also saved the results of our top user years query from the previous task into a table:

`top_user_years_more_than_four_games`

| column | type | meaning |
|---|---|---|
| year | int | Year of video game release |
| num_games | int | Count of the number of video games released in that year |
| avg_user_score | float | Average of all user scores for games released in that year |

```sql
SELECT year
FROM top_critic_years_more_than_four_games
INNER JOIN top_user_years_more_than_four_games
USING(year);
```
| year |
|---|
| 1998 |
| 2008 |
| 2002 |
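
An `INNER JOIN ... USING(year)` keeps only the years present in both tables, which is the same idea as a set intersection. As a small sketch in Python's `sqlite3`, the example below runs the join next to an equivalent `INTERSECT` on hypothetical years (2001 to 2004 are chosen only to illustrate the overlap and are not the project's data).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE top_critic_years_more_than_four_games (year INTEGER);
    CREATE TABLE top_user_years_more_than_four_games (year INTEGER);
    -- Hypothetical years, chosen only to illustrate the overlap.
    INSERT INTO top_critic_years_more_than_four_games VALUES (2001), (2002), (2003);
    INSERT INTO top_user_years_more_than_four_games VALUES (2002), (2003), (2004);
""")

# Join form, as in the query above.
joined = conn.execute("""
    SELECT year
    FROM top_critic_years_more_than_four_games
    INNER JOIN top_user_years_more_than_four_games
    USING(year)
    ORDER BY year
""").fetchall()

# Set-operation form of the same question.
intersected = conn.execute("""
    SELECT year FROM top_critic_years_more_than_four_games
    INTERSECT
    SELECT year FROM top_user_years_more_than_four_games
    ORDER BY year
""").fetchall()

print(joined)
print(intersected)
```

The two forms agree here because each year appears at most once per table; with duplicate years, the join would repeat rows while `INTERSECT` would not.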
## 8. Sales in the best video game years
Looks like we've got three years that both users and critics agreed were in the top ten! There are many other ways of measuring what the best years for video games are, but let's stick with these years for now. We know that critics and players liked these years, but what about video game makers? Were sales good? Let's find out.
This time, we haven't saved the results from the previous task in a table for you. Instead, we'll use the query from the previous task as a subquery in this one! This is a great skill to have, as we don't always have write permissions on the database we are querying.
```sql
SELECT year,
SUM(games_sold) AS total_games_sold
FROM game_sales
WHERE year IN
(
SELECT year
FROM top_critic_years_more_than_four_games
INNER JOIN top_user_years_more_than_four_games
USING(year)
)
GROUP BY year
ORDER BY total_games_sold DESC;
```
| year | total_games_sold |
|---|---|
| 2008 | 105.07 |
| 1998 | 101.52 |
| 2002 | 58.67 |
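
To see how the subquery feeds the outer query, here is a minimal sketch in Python's `sqlite3`. All rows are hypothetical (game names, sales figures, and years are made up), and the in-memory SQLite database is an assumption. The inner `SELECT` produces the list of shared years, and the outer query sums sales only for those years.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE game_sales (game TEXT, games_sold REAL, year INTEGER);
    CREATE TABLE top_critic_years_more_than_four_games (year INTEGER);
    CREATE TABLE top_user_years_more_than_four_games (year INTEGER);
    -- Hypothetical rows: only 2002 and 2003 appear in both "top years" tables.
    INSERT INTO game_sales VALUES
        ('Game A', 10.0, 2002), ('Game B', 5.5, 2002),
        ('Game C', 20.0, 2003), ('Game D', 7.0, 2004);
    INSERT INTO top_critic_years_more_than_four_games VALUES (2001), (2002), (2003);
    INSERT INTO top_user_years_more_than_four_games VALUES (2002), (2003), (2004);
""")

# The subquery runs first; its years become the filter list for IN.
totals = conn.execute("""
    SELECT year,
           SUM(games_sold) AS total_games_sold
    FROM game_sales
    WHERE year IN (
        SELECT year
        FROM top_critic_years_more_than_four_games
        INNER JOIN top_user_years_more_than_four_games
        USING(year)
    )
    GROUP BY year
    ORDER BY total_games_sold DESC
""").fetchall()

print(totals)
```

The 2004 game is excluded even though it has sales, because 2004 is missing from the critics' table and so never reaches the `IN` list.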