{"id":13723840,"url":"https://github.com/benfred/github-analysis","last_synced_at":"2026-03-05T01:31:18.367Z","repository":{"id":28638381,"uuid":"118937586","full_name":"benfred/github-analysis","owner":"benfred","description":"Trending Programming Languages ranked by GitHub Users","archived":false,"fork":false,"pushed_at":"2021-12-31T20:16:50.000Z","size":182,"stargazers_count":309,"open_issues_count":3,"forks_count":40,"subscribers_count":18,"default_branch":"master","last_synced_at":"2025-10-21T07:43:18.712Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/benfred.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-01-25T16:24:05.000Z","updated_at":"2025-08-23T14:10:49.000Z","dependencies_parsed_at":"2022-08-07T14:00:27.517Z","dependency_job_id":null,"html_url":"https://github.com/benfred/github-analysis","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/benfred/github-analysis","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benfred%2Fgithub-analysis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benfred%2Fgithub-analysis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benfred%2Fgithub-analysis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benfred%2Fgithub-analysis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/benfred","download_url":"https://codeload.github.com/benfred/github-analysis/tar.gz/refs/heads/master","s
bom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benfred%2Fgithub-analysis/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30104281,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-05T01:06:53.091Z","status":"ssl_error","status_checked_at":"2026-03-05T01:02:35.679Z","response_time":59,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-03T01:01:46.245Z","updated_at":"2026-03-05T01:31:18.318Z","avatar_url":"https://github.com/benfred.png","language":"Go","funding_links":[],"categories":["Go"],"sub_categories":[],"readme":"\u003c!--This File was Autogenerated from README_template.md on April 10 2018. Do not modify this file --\u003e\n\nREADME\n======\n\nThis project finds trends in popularity amongst programming languages, by analyzing over 1.25 billion events from the public GitHub timeline and figuring\nout how many users each language has. 
See [this blog post](http://www.benfrederickson.com/ranking-programming-languages-by-github-users/) for an overview and discussion of the trends.\n\nOn April 10 2018 the rankings of each language by the number of active users on GitHub are:\n\n\n\u003ctable\u003e\n\u003cthead\u003e\u003ctr\u003e\u003cth\u003eRank\u003c/th\u003e\u003cth\u003eLanguage\u003c/th\u003e\u003cth align=\"center\"\u003eMAU\u003c/th\u003e\u003cth align=\"center\"\u003eTrend\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\n\u003ctr\u003e\u003ctd\u003e1\u003c/td\u003e\u003ctd\u003eJavaScript\u003c/td\u003e\u003ctd align=\"right\"\u003e21.41%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/JavaScript_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e2\u003c/td\u003e\u003ctd\u003ePython\u003c/td\u003e\u003ctd align=\"right\"\u003e14.69%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/Python_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e3\u003c/td\u003e\u003ctd\u003eJava\u003c/td\u003e\u003ctd align=\"right\"\u003e12.65%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/Java_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e4\u003c/td\u003e\u003ctd\u003eC++\u003c/td\u003e\u003ctd align=\"right\"\u003e8.30%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/C%2B%2B_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e5\u003c/td\u003e\u003ctd\u003eC\u003c/td\u003e\u003ctd align=\"right\"\u003e5.84%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/C_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e6\u003c/td\u003e\u003ctd\u003ePHP\u003c/td\u003e\u003ctd align=\"right\"\u003e5.31%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg 
src=\"./images/PHP_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e7\u003c/td\u003e\u003ctd\u003eC#\u003c/td\u003e\u003ctd align=\"right\"\u003e4.79%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/C%23_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e8\u003c/td\u003e\u003ctd\u003eShell\u003c/td\u003e\u003ctd align=\"right\"\u003e4.78%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/Shell_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e9\u003c/td\u003e\u003ctd\u003eGo\u003c/td\u003e\u003ctd align=\"right\"\u003e4.36%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/Go_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e10\u003c/td\u003e\u003ctd\u003eTypeScript\u003c/td\u003e\u003ctd align=\"right\"\u003e3.82%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/TypeScript_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e11\u003c/td\u003e\u003ctd\u003eRuby\u003c/td\u003e\u003ctd align=\"right\"\u003e2.94%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/Ruby_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e12\u003c/td\u003e\u003ctd\u003eJupyter Notebook\u003c/td\u003e\u003ctd align=\"right\"\u003e2.66%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/Jupyter%20Notebook_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e13\u003c/td\u003e\u003ctd\u003eObjective-C\u003c/td\u003e\u003ctd align=\"right\"\u003e1.77%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/Objective-C_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e14\u003c/td\u003e\u003ctd\u003eSwift\u003c/td\u003e\u003ctd align=\"right\"\u003e1.67%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg 
src=\"./images/Swift_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e15\u003c/td\u003e\u003ctd\u003eKotlin\u003c/td\u003e\u003ctd align=\"right\"\u003e1.11%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/Kotlin_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e16\u003c/td\u003e\u003ctd\u003eRust\u003c/td\u003e\u003ctd align=\"right\"\u003e0.86%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/Rust_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e17\u003c/td\u003e\u003ctd\u003eR\u003c/td\u003e\u003ctd align=\"right\"\u003e0.78%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/R_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e18\u003c/td\u003e\u003ctd\u003eScala\u003c/td\u003e\u003ctd align=\"right\"\u003e0.74%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/Scala_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e19\u003c/td\u003e\u003ctd\u003eLua\u003c/td\u003e\u003ctd align=\"right\"\u003e0.68%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/Lua_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e20\u003c/td\u003e\u003ctd\u003ePowerShell\u003c/td\u003e\u003ctd align=\"right\"\u003e0.53%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/PowerShell_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e21\u003c/td\u003e\u003ctd\u003eMatlab\u003c/td\u003e\u003ctd align=\"right\"\u003e0.47%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/Matlab_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e22\u003c/td\u003e\u003ctd\u003eCoffeeScript\u003c/td\u003e\u003ctd align=\"right\"\u003e0.44%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg 
src=\"./images/CoffeeScript_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e23\u003c/td\u003e\u003ctd\u003ePerl\u003c/td\u003e\u003ctd align=\"right\"\u003e0.43%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/Perl_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e24\u003c/td\u003e\u003ctd\u003eGroovy\u003c/td\u003e\u003ctd align=\"right\"\u003e0.37%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/Groovy_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e25\u003c/td\u003e\u003ctd\u003eHaskell\u003c/td\u003e\u003ctd align=\"right\"\u003e0.37%\u003c/td\u003e\u003ctd align=\"center\"\u003e\u003cimg src=\"./images/Haskell_sparkline.svg\"\u003e\u003c/td\u003e\u003c/tr\u003e\n\u003c/tbody\u003e\u003c/table\u003e\n\n\n### Data Sources\n\nThe main data sources for this project are:\n\n * The [GitHub Archive](https://www.githubarchive.org/) project, which has been recording every public event on GitHub since early 2011. Overall there are more than 1.25 billion events stored there, totaling more than\n 400GB gzipped.\n * The [GHTorrent](http://ghtorrent.org/) project, which also monitors the GitHub public event timeline and retrieves extra information from the GitHub API for each event seen.\n * A custom scraper I wrote here, which backfilled missing repo information from the GitHub API.\n\n### Inferring Languages\n\nAnalyzing programming language trends requires figuring out the language for each repository.\n\nThere are two ways to get language information out of the GitHub API. The first is to query for the repository using the [GET /repos/:owner/:repo](https://developer.github.com/v3/repos/#get) API.\nThis returns the dominant language of the repo. You can also get the byte breakdown of all the languages using the [GET /repos/:owner/:repo/languages](https://developer.github.com/v3/repos/#list-languages) endpoint. 
This will include the number of bytes used by each language in the repo.\n\nWe're using the single dominant language for the repo for this analysis. While this loses some information, there are multiple benefits that make this much more practical:\n\n * All GitHub Archive events from 2012/03/10/ to 2014/12/31 included the repo language in the repository.language JSON field.\n * All PullRequestEvent events have this information in the payload.pull_request.base.repo JSON field.\n * The GHTorrent project has 40 million repos with languages in the projects table, but only has the detailed language breakdown for 27.6 million repos in the project_languages table.\n\nSo the plan to get the language for each repo is to aggregate all these sources of information: the language scraped from the GitHub REST API, the language from the projects table\nof the GHTorrent project, the language included with certain GitHub Archive events, and finally the language inferred from fork events (forks are assumed to have the same\nlanguage as the repo they are forking).\n\n| Source         |  Repo Count   |\n| -------------  |:-------------:|\n| GHTorrent      | 42.8M         |\n| scraper        | 22.5M         |\n| GitHub-Archive | 13.5M         |\n| forks          | 37M           |\n\nThere is significant overlap between all these sources of information, but once aggregated and\ndeduplicated, we ended up with language information for 62.8 million repos. This includes every repo that has had more than 1 user interact with it and all repos with only 1 user that have more than 5 total events ever.\nI'm still crawling the remaining repos.\n\n### Inferring the Date\n\nThe dates here are the dates corresponding to the events in GitHub Archive. 
This means that we are analyzing when something was pushed to GitHub rather than the commit date (which could be considerably earlier).\nThe reason behind this is that the commit date can be inaccurate: the dates are supplied by the developers, and it's not uncommon to see dates that are implausibly far back in the past (Jan 1, 1970) or, even more\n implausibly, in the future.\n\n### Running the Code\n\nThis code requires both Go and Python to run properly.  Additionally, this requires around 1TB of free disk space to run, and I would recommend at least 16GB of RAM.\n\nTo configure your system to run this code:\n  * Install all the Python dependencies by running ```pip install -r requirements.txt``` and the Go dependencies by running ```go get ./...``` from this directory.\n  * Install Postgres onto your system, and create the database tables from the schema.sql file: ```psql github \u003c schema.sql```\n  * Copy the config_template.toml file to config.toml and fill out the required fields.\n\nThere are multiple different components to this code.\n\nThe main programs written in Go are:\n\n * ```gha-download-files```: Downloads new files from the GitHub Archive so that they can be analyzed locally.\n * ```gha-parse-events```: Parses the JSON events from the GitHub Archive and converts them to normalized TSV files.  
The JSON event schema has changed several times over the last 7 years, and normalizing to a consistent TSV schema makes it much easier to analyze.\n * ```gha-scraper```: Crawls repo information from the GitHub API and inserts it into Postgres.\n\n There are also several small bash scripts that do the actual analysis:\n\n * ```scripts/calculate_language_mau.sh```: Joins the repo languages against the parsed events, and figures out the MAU for each language in every month.\n * ```scripts/calculate_repo_languages.sh```: Merges information from Postgres, GHTorrent, extracted GitHub Archive events, and fork events to get a single repo:language mapping.\n * ```scripts/calculate_top_repos.sh```: Ranks each repository by the number of users. The output of this is passed to gha-scraper to crawl repositories.\n\n Finally, plotting is done with Python by running ```python scripts/plot.py```. This will also update the graphs in this README.\n\nA future goal of this project is to simplify the steps needed to run this code; it's unnecessarily convoluted right now.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenfred%2Fgithub-analysis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbenfred%2Fgithub-analysis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenfred%2Fgithub-analysis/lists"}