{"id":26323644,"url":"https://github.com/mzfr/gsoc-data","last_synced_at":"2025-03-15T17:18:47.271Z","repository":{"id":119015037,"uuid":"145124996","full_name":"mzfr/GSoC-Data","owner":"mzfr","description":"GSoC Data from 2005 to 2018 in JSON format. ","archived":false,"fork":false,"pushed_at":"2020-03-18T15:54:00.000Z","size":2868,"stargazers_count":35,"open_issues_count":5,"forks_count":8,"subscribers_count":7,"default_branch":"master","last_synced_at":"2024-05-01T17:45:27.716Z","etag":null,"topics":["gsoc","json","open-data"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mzfr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2018-08-17T13:39:09.000Z","updated_at":"2021-05-24T15:41:59.000Z","dependencies_parsed_at":null,"dependency_job_id":"3c2ec286-8bd2-487c-a897-4dc440bfe73e","html_url":"https://github.com/mzfr/GSoC-Data","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mzfr%2FGSoC-Data","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mzfr%2FGSoC-Data/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mzfr%2FGSoC-Data/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mzfr%2FGSoC-Data/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mzfr","download_url":"https://codeload.github.com/mzfr/GSoC-Data/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":24
3762227,"owners_count":20343979,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gsoc","json","open-data"],"created_at":"2025-03-15T17:18:46.241Z","updated_at":"2025-03-15T17:18:47.259Z","avatar_url":"https://github.com/mzfr.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# GSoC Data\n\nAll the data from [GSoC-archive](https://developers.google.com/open-source/gsoc/past-summers) in JSON format.\n\n\n__NOTE__\nTo run the scrapers, you must install the following dependencies:\n* asyncio\n* aiohttp\n\nYou can do that by running: `pip install asyncio aiohttp`\n\n# Directories\n\n* `Data/`\n    + `orgs/` - all orgs that have been a part of GSoC from 2005 to 2017\n\n    + `projects/` - all projects completed under the GSoC program from 2005 to 2017\n\n* `Scrapers/`\n    - Contains all the scrapers used for scraping the data\n\n# Data\n\n### `orgs/`\n\n* `2005.json` - `2008.json`\n    - `link`: URL of the org\n    - `name`: Name of the org\n\n* `2009-2013.json`\n    -  `about`: What the org does\n    -  `link`: URL of the org\n    -  `mail`: Mailing list of the org\n    -  `name`: Name of the org\n    -  `page`: Idea page of the org\n\n* `2014-2015.json`\n    - `link`: URL of the org\n    - `mail`: Mailing list of the org\n    - `page`: Idea page of the org\n    - `name`: Name of the org selected\n\n* `2016-2017.json`\n    - `about`: Info about the organization\n    - `link`: URL of the org\n    - `name`: Name of the org\n\n### `projects/`\n\n* `2005.json` - `2008.json`\n    - `Mentor`: Name of the mentor of the project\n    - 
`project`: Name of the project\n    - `student`: Name of the student\n\n* `2009-2013.json` \u0026 `2014-2015.json`\n    - `Organization`: Name of the organization\n    - `detail`: Detail about the project\n    - `link`: Link to the project\n    - `student`: Name of the student selected\n    - `title`: Name of the project\n\n* `2016-2017.json`\n    - `Organization`: Name of the organization\n    - `link`: Link to the project\n    - `mentors`: Names of the mentors\n    - `student`: Name of the student\n    - `title`: Name of the project\n\n\n# What can be done with the data?\n\nThis data will be used for improving the functionality of [Soccer](http://github.com/dufferzafar/Soccer/).\n\nIt can also be used to generate various stats and plots, or to answer data-related questions like:\n\n- Who did the most GSoCs, and under which org?\n- Which org has the highest student-to-mentor conversion rate? (students who first did GSoC under the org, and then became mentors)\n- Run some magic on the descriptions of projects over the years to find out if there is a trend of ML-related projects.\n\netc. etc.\n\n---\n\nFeel free to open issues to discuss any more ideas!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmzfr%2Fgsoc-data","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmzfr%2Fgsoc-data","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmzfr%2Fgsoc-data/lists"}