{"id":25246945,"url":"https://github.com/adzkykhairany/sql_job_analysis","last_synced_at":"2026-05-05T10:38:26.748Z","repository":{"id":249664616,"uuid":"831372091","full_name":"adzkykhairany/SQL_Job_Analysis","owner":"adzkykhairany","description":"2023 Data Job Analysis","archived":false,"fork":false,"pushed_at":"2024-07-26T17:22:06.000Z","size":162,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"adzkykhairany.github.io","last_synced_at":"2025-06-05T21:50:59.079Z","etag":null,"topics":["jupyter-notebook","postgresql","sql","sqlite"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/adzkykhairany.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-20T11:08:26.000Z","updated_at":"2024-07-26T17:22:09.000Z","dependencies_parsed_at":"2025-04-05T22:15:09.754Z","dependency_job_id":null,"html_url":"https://github.com/adzkykhairany/SQL_Job_Analysis","commit_stats":null,"previous_names":["adzkykhairany/sql_job_analysis"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/adzkykhairany/SQL_Job_Analysis","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adzkykhairany%2FSQL_Job_Analysis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adzkykhairany%2FSQL_Job_Analysis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adzkykhairany%2FSQL_Job_Analysis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repo
sitories/adzkykhairany%2FSQL_Job_Analysis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/adzkykhairany","download_url":"https://codeload.github.com/adzkykhairany/SQL_Job_Analysis/tar.gz/refs/heads/adzkykhairany.github.io","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adzkykhairany%2FSQL_Job_Analysis/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265212671,"owners_count":23728574,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["jupyter-notebook","postgresql","sql","sqlite"],"created_at":"2025-02-12T02:56:35.337Z","updated_at":"2026-05-05T10:38:21.719Z","avatar_url":"https://github.com/adzkykhairany.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 2023 Data Job Analysis\nThis project explores the 2023 data job market, **focusing on data analyst roles**. It reviews top-paying positions, essential in-demand skills, and where high demand intersects with high salaries in data analytics. 
Additionally, the repository showcases various techniques and queries for proficiently managing and manipulating data within PostgreSQL databases.\n\n## Tools and Technologies\n- **SQL:** The primary language used for data manipulation and analysis.\n- **PostgreSQL:** The database management system used for running SQL queries.\n- **pgAdmin:** A tool for managing PostgreSQL databases and running queries.\n- **Git:** Version control system for managing project changes.\n- **GitHub:** Platform for hosting and sharing the project repository.\n\n## The Analysis\n### 1. Top Paying Data Analyst Jobs\n```sql\nSELECT\n    job_id,\n    job_title,\n    name AS company_name,\n    job_location,\n    job_schedule_type,\n    salary_year_avg,\n    job_posted_date\nFROM\n    job_postings_fact a\nLEFT JOIN company_dim b ON a.company_id = b.company_id\nWHERE\n    job_title_short = 'Data Analyst'\n    AND job_location = 'Anywhere'\n    AND salary_year_avg IS NOT NULL\nORDER BY\n    salary_year_avg DESC\nLIMIT 10\n```\n\n![Top Paying Jobs](assets/top_paying_jobs.png)\n*Visualization using Python*\n\n**Insights:**\n- **Advanced Skill Requirements:** Top-paying data analyst roles require a mix of advanced skills and considerable experience.\n- **High-Paying Positions:** Roles such as `Data Scientist`, `Senior Data Analyst`, and `Analytics Manager` stand out for their high earning potential, reflecting the increasing value placed on advanced analytical skills and leadership in data-driven decision-making.\n\n### 2. 
Skills for Top Paying Data Analyst Jobs\n```sql\nWITH top_paying_job AS (\n    SELECT\n        job_id,\n        job_title,\n        salary_year_avg,\n        name AS company_name\n    FROM\n        job_postings_fact a\n    LEFT JOIN company_dim b ON a.company_id = b.company_id\n    WHERE\n        job_title_short = 'Data Analyst'\n        AND job_location = 'Anywhere'\n        AND salary_year_avg IS NOT NULL\n    ORDER BY\n        salary_year_avg DESC\n    LIMIT 10    \n)\nSELECT \n    c.*,\n    skills\nFROM top_paying_job c\nINNER JOIN skills_job_dim d ON c.job_id = d.job_id\nINNER JOIN skills_dim e ON d.skill_id = e.skill_id \nORDER BY\n    salary_year_avg DESC\n```\n![Top Paying Job Skills](assets/top_paying_job_skills.png)\n*Visualization using Python*\n\n**Insights:**\n- **High-Paying Skills:** The chart emphasizes skills most commonly linked to high-paying data analyst roles.\n- **Crucial Skills:** Advanced `SQL`, machine learning expertise, and proficiency in data visualization tools (e.g., `Tableau`, `Power BI`) are vital for securing top-paying positions.\n- **Specialized Knowledge:** Skills in cloud computing (`AWS`, `Azure`) and statistical analysis are highly valued and contribute significantly to higher salaries.\n\n### 3. 
In-Demand Skills for Data Analyst\n```sql\nSELECT \n    skills,\n    COUNT(b.job_id) AS demand_count\nFROM \n    job_postings_fact a\nINNER JOIN \n    skills_job_dim b ON a.job_id = b.job_id\nINNER JOIN \n    skills_dim c ON b.skill_id = c.skill_id \nWHERE\n    job_title_short = 'Data Analyst'\n    AND job_work_from_home = TRUE\nGROUP BY\n    skills\nORDER BY \n    demand_count DESC\nLIMIT 5\n```\n\n| Skill     | Demand Count |\n|-----------|--------------|\n| SQL       | 7,291        |\n| Excel     | 4,611        |\n| Python    | 4,330        |\n| Tableau   | 3,745        |\n| Power BI  | 2,609        |\n\n**Insights:**\n- **SQL Dominance:** `SQL` continues to be the most in-demand skill, highlighting its essential role in data manipulation and querying across diverse industries.\n- **Excel’s Enduring Importance:** `Excel` remains a key tool for data analysis, particularly in financial and administrative contexts.\n- **Growing Python Popularity:** `Python` is increasingly valued for its versatility and use in data science and machine learning.\n- **Top Visualization Tools:** `Tableau` and `Power BI` are highly sought after for their powerful data visualization capabilities, crucial for effective reporting and decision-making.\n\n### 4. 
Highest Earning Skills for Data Analyst\n```sql\nSELECT \n    skills,\n    ROUND(AVG(salary_year_avg), 0) AS salary_avg\nFROM \n    job_postings_fact a\nINNER JOIN \n    skills_job_dim b ON a.job_id = b.job_id\nINNER JOIN \n    skills_dim c ON b.skill_id = c.skill_id \nWHERE\n    job_title_short = 'Data Analyst'\n    AND salary_year_avg IS NOT NULL\n    AND job_work_from_home = TRUE\nGROUP BY\n    skills\nORDER BY \n    salary_avg DESC\nLIMIT 25\n```\n\n|Skill          | Average Salary ($) |\n|---------------|--------------------|\n|PySpark        | 208,172            |\n|Bitbucket\t    | 189,155            |\n|Couchbase\t    | 160,515            |\n|Watson\t        | 160,515            |\n|DataRobot\t    | 155,486            |\n|GitLab\t        | 154,500            |\n|Swift\t        | 153,750            |\n|Jupyter\t    | 152,777            |\n|Pandas\t        | 151,821            |\n|Elasticsearch  | 145,000            |\n\n**Insights:**\n- **Big Data and Advanced Analytics:** Skills in `PySpark`, `DataRobot`, and `Databricks` are crucial for handling big data, leveraging advanced machine learning platforms, and performing unified analytics.\n- **DevOps and Automation:** Proficiency in `Bitbucket`, `Jenkins`, and `Kubernetes` underscores the significance of collaboration, version control, automated deployment, and managing scalable containerized applications.\n- **Core Programming and Data Science Libraries:** Expertise in `Pandas`, `Numpy`, and `Scikit-learn` is essential for data manipulation, numerical computing, and machine learning, forming the foundation of data science.\n\n### 5. 
Most Effective Skills for Career Growth\n```sql\nSELECT\n    c.skills,\n    COUNT(a.job_id) AS demand_count,\n    ROUND(AVG(salary_year_avg), 0) AS salary_avg\nFROM job_postings_fact a\nINNER JOIN skills_job_dim b ON a.job_id = b.job_id\nINNER JOIN skills_dim c ON b.skill_id = c.skill_id \nWHERE\n    job_title_short = 'Data Analyst'\n    AND salary_year_avg IS NOT NULL\n    AND job_work_from_home = TRUE\nGROUP BY c.skill_id\nHAVING COUNT(a.job_id) \u003e 10\nORDER BY\n    salary_avg DESC,\n    demand_count DESC \nLIMIT 25\n```\n\n|Skill\t    | Demand Count | Average Salary ($)|\n|-----------|--------------|-------------------|\n|Go\t        | 27\t       | 115,320           |    \n|Confluence | 11\t       | 114,210           |    \n|Hadoop\t    | 22\t       | 113,193           |    \n|Snowflake\t| 37\t       | 112,948           |    \n|Azure\t    | 34\t       | 111,225           |    \n|BigQuery\t| 13\t       | 109,654           |    \n|AWS\t    | 32\t       | 108,317           |    \n|Java\t    | 17\t       | 106,906           |    \n|SSIS\t    | 12\t       | 106,683           |    \n|Jira\t    | 20\t       | 104,918           |\n\n**Insights:**\n- **Top Paying Skill:** `Go` offers the highest average salary at $115,320, indicating its strong value in the data analyst job market.\n- **High Value Cloud Technologies:** Skills in `Azure`, `Snowflake`, and `AWS` are among the highest paying, reflecting the premium on cloud expertise.\n- **Competitive Data Tools:** Proficiency in tools like `Hadoop` and `BigQuery` also leads to substantial salaries, highlighting their importance in data management.\n\n## Conclusion\nMerged conclusion from the analysis insights:\n- **Top Paying Data Analyst Jobs:** High-paying data analyst roles like Data Scientist and Analytics Manager require advanced skills and experience.\n- **Skills for Top Paying Data Analyst Jobs:** Advanced SQL, machine learning, and data visualization skills are key for high-paying data analyst positions.\n- 
**In-Demand Skills for Data Analyst:** SQL is the most sought-after skill, followed by Excel and Python; Tableau and Power BI lead among data visualization tools.\n- **Highest Earning Skills for Data Analyst:** Skills in big data tools (e.g., PySpark) and advanced analytics (e.g., DataRobot) are associated with the highest average salaries.\n- **Most Effective Skills for Career Growth:** Among skills appearing in more than 10 postings, Go, Snowflake, and Azure are among the highest paying (average salaries above $110,000), making them strong candidates for career advancement in data analysis.\n\n## Acknowledgements\nSpecial thanks to Luke Barousse for the data insights provided in [his YouTube video](https://youtu.be/7mz73uXD9DA?si=TJWDsG3Eb68o0hoJ). ","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadzkykhairany%2Fsql_job_analysis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fadzkykhairany%2Fsql_job_analysis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadzkykhairany%2Fsql_job_analysis/lists"}