https://github.com/littleark/githut
Visualization of data from github archive.
- Host: GitHub
- URL: https://github.com/littleark/githut
- Owner: littleark
- License: mit
- Created: 2014-08-19T12:19:02.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2022-02-02T05:15:53.000Z (almost 3 years ago)
- Last Synced: 2024-07-31T08:18:45.746Z (6 months ago)
- Language: HTML
- Size: 8.55 MB
- Stars: 1,280
- Watchers: 41
- Forks: 102
- Open Issues: 14
Metadata Files:
- Readme: README.md
README
## GitHut
GitHut (http://githut.info) is an attempt to visualize and explore the complexity of the universe of programming languages used across the repositories hosted on GitHub.
Programming languages are not simply tools that developers use to create programs or express algorithms; they are also instruments to code and decode creativity. By observing the history of languages we can enjoy humankind's quest for a better way to solve problems, to facilitate collaboration between people, and to reuse the effort of others.
GitHub is the largest code host in the world, with 3.5 million users. It's the place where the open-source development community offers access to most of its projects. By analyzing how languages are used on GitHub, it is possible to understand the popularity of programming languages among developers and also to discover the unique characteristics of each language.
GitHut combines two types of visualization: a Parallel Coordinates chart and a Small Multiples visualization.
The data comes from GitHub Archive (http://www.githubarchive.org/).
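The two chart types consume differently shaped data, which a small example may make clearer. The Python sketch below is only an illustration and is not GitHut's own code. It assumes rows that use the column names from the queries in the Queries section (`repository_language`, `type`, `events`, `active_repos_by_url`, `year`, `quarter`). The function names `to_parallel_coordinates` and `to_small_multiples` are hypothetical, and the sample numbers are invented.

```python
from collections import defaultdict

# Hypothetical rows: one per (language, event type) for a single quarter.
# The numbers are invented for illustration only.
PARALLEL_ROWS = [
    {"repository_language": "JavaScript", "type": "PushEvent", "events": 900},
    {"repository_language": "JavaScript", "type": "ForkEvent", "events": 120},
    {"repository_language": "Python", "type": "PushEvent", "events": 400},
    {"repository_language": "Python", "type": "WatchEvent", "events": 75},
]

# Hypothetical rows: one per (language, year, quarter).
SMALL_MULTIPLES_ROWS = [
    {"repository_language": "JavaScript", "year": 2014, "quarter": 1,
     "active_repos_by_url": 300},
    {"repository_language": "JavaScript", "year": 2013, "quarter": 4,
     "active_repos_by_url": 250},
    {"repository_language": "Python", "year": 2014, "quarter": 1,
     "active_repos_by_url": 150},
]


def to_parallel_coordinates(rows):
    """Fold rows into one record per language, keyed by event type.

    Each language becomes one line on the chart and each event type
    becomes one axis.
    """
    records = defaultdict(dict)
    for row in rows:
        records[row["repository_language"]][row["type"]] = row["events"]
    return dict(records)


def to_small_multiples(rows):
    """Group rows into one chronological series per language.

    Each language gets its own small chart of
    (year, quarter, active repositories) points, oldest first.
    """
    series = defaultdict(list)
    for row in rows:
        series[row["repository_language"]].append(
            (row["year"], row["quarter"], row["active_repos_by_url"])
        )
    return {language: sorted(points) for language, points in series.items()}
```

Calling `to_parallel_coordinates(PARALLEL_ROWS)` gives one dictionary of event counts per language, while `to_small_multiples(SMALL_MULTIPLES_ROWS)` gives one time-ordered list per language. An event type with no rows for a language is simply absent from that language's record, so a real chart would need to decide how to draw the missing axis value.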
### Web Site
GitHut is published at **http://githut.info**
### Queries
GitHub Archive data is also available on Google BigQuery. Below are the two queries used to collect the data for the Parallel Coordinates and Small Multiples visualizations:
#### Parallel Coordinates
Several metrics grouped by language and event type for a given quarter
```sql
-- BigQuery legacy SQL: active repositories and event counts per language
-- and event type for a single quarter (here Q1 2014).
SELECT
  repository_language,
  type,
  COUNT(distinct(repository_url)) AS active_repos_by_url,
  COUNT(repository_language) AS events,
  YEAR(created_at) AS year,
  QUARTER(created_at) AS quarter
FROM [githubarchive:github.timeline]
WHERE
  (
    type = 'PushEvent'
    OR type = 'ForkEvent'
    OR (type = 'IssuesEvent' AND (payload_action = "opened" OR payload_action = "reopened"))
    OR (type = 'CreateEvent' AND payload_ref_type = "repository")
    OR type = 'WatchEvent'
  )
  AND repository_language != ''
  AND repository_url != ''
  AND YEAR(created_at) = 2014
  AND QUARTER(created_at) = 1
GROUP BY
  repository_language,
  type,
  year,
  quarter
```
#### Small Multiples
Count of active repositories by quarter
```sql
-- BigQuery legacy SQL: repositories with at least one push, per language
-- and quarter, newest quarter first.
SELECT
  repository_language,
  COUNT(distinct(repository_url)) AS active_repos_by_url,
  YEAR(created_at) AS year,
  QUARTER(created_at) AS quarter
FROM [githubarchive:github.timeline]
WHERE
  type = "PushEvent"
GROUP BY
  repository_language,
  year,
  quarter
ORDER BY
  repository_language,
  year DESC,
  quarter DESC
```
### License
The content of this project itself is licensed under the [Creative Commons Attribution 4.0 license](http://creativecommons.org/licenses/by-nc-nd/4.0/), and the underlying source code used to format and display that content is licensed under the [MIT license](http://opensource.org/licenses/mit-license.php).