{"id":15002399,"url":"https://github.com/divinemoments/project-2-titanic-sql-analysis","last_synced_at":"2025-03-12T02:36:51.610Z","repository":{"id":254842796,"uuid":"847752356","full_name":"divinemoments/Project-2-Titanic-SQL-Analysis","owner":"divinemoments","description":"The objective of Project 2 is to perform SQL analysis to extract insights about Titanic passenger demographics, survival rates, and other relevant statistics.","archived":false,"fork":false,"pushed_at":"2024-08-26T14:27:16.000Z","size":9,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-18T12:15:51.662Z","etag":null,"topics":["sql","sql-server","titanic"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/divinemoments.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-26T13:33:28.000Z","updated_at":"2024-08-26T23:53:38.000Z","dependencies_parsed_at":null,"dependency_job_id":"52e65032-e1d7-477d-91b2-8d870695a71c","html_url":"https://github.com/divinemoments/Project-2-Titanic-SQL-Analysis","commit_stats":{"total_commits":4,"total_committers":1,"mean_commits":4.0,"dds":0.0,"last_synced_commit":"938f8ee3c458f8404e48a75707ea3f373054f89d"},"previous_names":["divinemoments/project-2"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/divinemoments%2FProject-2-Titanic-SQL-Analysis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/di
vinemoments%2FProject-2-Titanic-SQL-Analysis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/divinemoments%2FProject-2-Titanic-SQL-Analysis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/divinemoments%2FProject-2-Titanic-SQL-Analysis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/divinemoments","download_url":"https://codeload.github.com/divinemoments/Project-2-Titanic-SQL-Analysis/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243146536,"owners_count":20243737,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["sql","sql-server","titanic"],"created_at":"2024-09-24T18:49:58.931Z","updated_at":"2025-03-12T02:36:51.591Z","avatar_url":"https://github.com/divinemoments.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# Project 2: Titanic SQL Analysis\n## Skill: SQL\n## Difficulty: ★☆☆☆☆\n\n\n### Dataset:\nTitanic Dataset\n\n### Objective:\nPerform SQL analysis to extract insights about passenger demographics, survival rates, and other relevant statistics.\n\n### Description:\nOne slow Saturday morning, I decided to sharpen my SQL skills by diving into the Titanic dataset. For an hour or so, I worked on analyzing the data to uncover meaningful insights. 
This project showcases my proficiency in SQL through tasks such as data manipulation, aggregation, and reporting.\n\n### SQL Tool:\nMicrosoft SQL Server\n\n#\n\n### Clean the data\nRows with NULL values are not deleted; instead, the coded columns are given more descriptive labels.\n\nSTEP 1: Add Columns\n\n```sql\nALTER TABLE titanic\nADD PassengerClass VARCHAR(10),\n    SurvivalStatus VARCHAR(15),\n    EmbarkedLocation VARCHAR(15);\n```\n\nSTEP 2: Map the pclass, survived, and embarked codes to descriptive labels\n\n\n```sql\nUPDATE titanic\nSET PassengerClass = CASE\n    WHEN pclass = 1 THEN 'First'\n    WHEN pclass = 2 THEN 'Second'\n    WHEN pclass = 3 THEN 'Third'\nEND;\n\nUPDATE titanic\nSET SurvivalStatus = CASE\n    WHEN survived = 'True' THEN 'Survived'\n    WHEN survived = 'False' THEN 'Did not Survive'\nEND;\n\nUPDATE titanic\nSET EmbarkedLocation = CASE\n    WHEN embarked = 'C' THEN 'Cherbourg'\n    WHEN embarked = 'Q' THEN 'Queenstown'\n    WHEN embarked = 'S' THEN 'Southampton'\nEND;\n```\n\n### 1. Find the Total Number of Passengers.\n\n```sql\nSELECT COUNT(*)\nFROM titanic;\n```\n\n### 2. List the Names and Ages of Passengers Who Were Younger Than 18.\n\n```sql\nSELECT name, age\nFROM titanic\nWHERE age\u003c18;\n```\n\n### 3. Count the Number of Passengers Who Embarked at Each Port (Embarked).\n\n```sql\nSELECT EmbarkedLocation, COUNT(*) AS PassengerCount\nFROM titanic\nWHERE EmbarkedLocation IS NOT NULL\nGROUP BY EmbarkedLocation\nORDER BY PassengerCount DESC;\n```\n\n### 4. List the Names of All Passengers Who Survived.\n\n```sql\nSELECT name\nFROM titanic\nWHERE survived='True';\n```\n\n### 5. Get the Maximum Age of Passengers in Each Class (Pclass).\n\n```sql\nSELECT PassengerClass, MAX(age) AS MaxAge\nFROM titanic\nGROUP BY PassengerClass\nORDER BY PassengerClass;\n```\n\n### 6. Find the Passenger Who Had the Highest Fare and List Their Details.\n\n```sql\nSELECT TOP 1 *\nFROM titanic\nORDER BY fare DESC;\n```\n\n### 7. 
Calculate the Average Age of Passengers.\n\n```sql\nSELECT ROUND(AVG(age),2) AS AvgAge\nFROM titanic;\n```\n\n### 8. Find the Most Common Embarked Port.\n\n```sql\nSELECT TOP 1 EmbarkedLocation, COUNT(*) AS EmbarkedCount\nFROM titanic\nGROUP BY EmbarkedLocation\nORDER BY EmbarkedCount DESC;\n```\n\n### 9. List the Names and Ticket Numbers of All Passengers Who Had a Ticket Number Starting with 'A'.\n\n```sql\nSELECT name, ticket\nFROM titanic\nWHERE ticket LIKE 'A%';\n```\n\n### 10. Get the Distribution of Passengers by Class and Gender.\n\n```sql\nSELECT\n\tPassengerClass,\n\tSex,\n\tCOUNT(*) AS PassengerCount\nFROM titanic\nGROUP BY\n\tPassengerClass,\n\tSex\nORDER BY\n\tPassengerClass,\n\tSex;\n```\n\n### 11. Find the Number of Passengers Who Had a Missing Age Value.\n\n```sql\nSELECT COUNT(*) AS PassengerCount\nFROM titanic\nWHERE Age IS NULL;\n```\n\n### 12. Determine the Survival Rate (Percentage) for Each Class (Pclass).\nSTEP 1. Define the table: pclass, survived, total\n\n```sql\nSELECT\n\tPassengerClass,\n\tSUM(CASE WHEN survived = 'True' THEN 1 ELSE 0 END) AS SurvivedPassenger,\n\tCOUNT(*) AS TotalPassenger\nFROM titanic\nGROUP BY PassengerClass;\n```\n\nSTEP 2. Calculate the survival rate\n\n```sql\nWITH SurvivalStat AS(\nSELECT\n\tPassengerClass,\n\tSUM(CASE WHEN survived = 'True' THEN 1 ELSE 0 END) AS SurvivedPassenger,\n\tCOUNT(*) AS TotalPassenger\nFROM titanic\nGROUP BY PassengerClass\n)\n\nSELECT\n\tPassengerClass,\n\tROUND((CAST(SurvivedPassenger AS FLOAT) / TotalPassenger) * 100,2) AS SurvivalRate\nFROM SurvivalStat;\n```\n\n### 13. Find All the Passengers with the Same Name and Count How Many There Are.\n\n```sql\nSELECT\n\tname,\n\tCOUNT(*) AS PassengerCount\nFROM titanic\nGROUP BY name\nHAVING Count(*) \u003e 1\nORDER BY PassengerCount DESC;\n```\n\n### 14. 
List the Names and Ages of Passengers Who Were in the Same Cabin (Cabin) as Someone Who Did Not Survive.\nSelf-join Approach\n\n```sql\nSELECT DISTINCT t1.name, t1.age\nFROM titanic t1\nJOIN titanic t2 ON t1.cabin = t2.cabin\nWHERE t2.survived = 0\n  AND t1.survived = 1;\n```\n\nCTE (Common Table Expression) Query Approach.\n*I am more comfortable with this, though it is longer. I need to practice self-joins more.*\n\nSTEP 1. List the cabins with at least one passenger who did not survive\n\n```sql\nSELECT DISTINCT cabin\nFROM titanic\nWHERE\n\tsurvived=0\n\tAND cabin IS NOT NULL;\n```\n\nSTEP 2. Get the surviving passengers in the cabins from STEP 1\n\n```sql\nWITH CabinDidnotSurvive AS(\n\tSELECT DISTINCT cabin\n\tFROM titanic\n\tWHERE\n\t\tsurvived=0\n\t\tAND cabin IS NOT NULL\n)\n\nSELECT\n\tname,\n\tage\nFROM titanic\nWHERE\n\tcabin IN(SELECT cabin FROM CabinDidnotSurvive)\n\tAND survived=1;\n```\n\n### 15. Calculate the Average Fare Paid by Passengers in Each Port (Embarked).\n\n```sql\nSELECT EmbarkedLocation, ROUND(AVG(fare),2) AS AvgFare\nFROM titanic\nWHERE EmbarkedLocation IS NOT NULL\nGROUP BY EmbarkedLocation;\n```\n\n### 16. Find the Number of Passengers Who Survived and Had Siblings or Spouses on Board (SibSp \u003e 0).\n\n```sql\nSELECT COUNT(*) AS PassengerCount\nFROM titanic\nWHERE\n\tsibsp \u003e 0\n\tAND survived = 1;\n```\n\n### 17. List the Top 5 Most Expensive Tickets and List the Details of the Passengers Who Purchased Them.\n\n```sql\nSELECT TOP 5 *\nFROM titanic\nORDER BY fare DESC;\n```\n\n### 18. Get the Distribution of Survivors by Gender (Sex).\n\n```sql\nSELECT sex, COUNT(*) AS PassengerCount\nFROM titanic\nWHERE survived = 'True'\nGROUP BY sex;\n```\n\n### 19. Find the Oldest Passenger.\n\n```sql\nSELECT TOP 1 name, age\nFROM titanic\nORDER BY age DESC;\n```\n\n### 20. 
List the Details of the Oldest and Youngest Passengers.\n\n```sql\nSELECT *\nFROM Titanic\nWHERE\n\tAge = (SELECT MIN(age) FROM titanic)\n\tOR Age = (SELECT MAX(age) FROM titanic);\n```\n\n\n### End\n*What a rewarding way to spend a Saturday morning! This SQL exercise not only allowed me to refine my skills but also gave me a chance to delve deep into the Titanic dataset, uncovering valuable insights along the way. It's amazing how much you can learn and achieve in just a couple of hours when you're passionate about data analysis. I hope you find this project as enjoyable and insightful as I did!*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdivinemoments%2Fproject-2-titanic-sql-analysis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdivinemoments%2Fproject-2-titanic-sql-analysis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdivinemoments%2Fproject-2-titanic-sql-analysis/lists"}