{"id":28629780,"url":"https://github.com/codebox/wordvis","last_synced_at":"2025-06-12T12:13:34.984Z","repository":{"id":140515340,"uuid":"59813073","full_name":"codebox/wordvis","owner":"codebox","description":"This is a Python script to generate Sunburst Charts that visualise the structure of English words.","archived":false,"fork":false,"pushed_at":"2019-03-06T08:08:50.000Z","size":14,"stargazers_count":15,"open_issues_count":1,"forks_count":2,"subscribers_count":3,"default_branch":"master","last_synced_at":"2023-10-20T18:59:17.504Z","etag":null,"topics":["english-word","visualisation"],"latest_commit_sha":null,"homepage":"https://codebox.net/pages/common-english-words-visualisation","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/codebox.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2016-05-27T07:16:07.000Z","updated_at":"2023-10-20T18:59:18.034Z","dependencies_parsed_at":null,"dependency_job_id":"b3f19099-9700-425e-8cbb-cd68c982207f","html_url":"https://github.com/codebox/wordvis","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"purl":"pkg:github/codebox/wordvis","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codebox%2Fwordvis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codebox%2Fwordvis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codebox%2Fwordvis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codebox%2Fwordvis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/codebox"
,"download_url":"https://codeload.github.com/codebox/wordvis/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codebox%2Fwordvis/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259462578,"owners_count":22861514,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["english-word","visualisation"],"created_at":"2025-06-12T12:13:34.288Z","updated_at":"2025-06-12T12:13:34.966Z","avatar_url":"https://github.com/codebox.png","language":"Python","readme":"This is a Python script to generate \u003ca href=\"https://en.wikipedia.org/wiki/Pie_chart#Ring_chart_.2F_Sunburst_chart_.2F_Multilevel_pie_chart\"\u003eSunburst Charts\u003c/a\u003e that visualise the structure of English words.\r\n\r\nAn example chart, generated using the 100,000 most common words in the Google Books English Corpus, is shown below:\r\n\r\n\u003cimg src=\"https://codebox.net/assets/images/common-english-words-visualisation/wordvis_100000_small.png\" height=\"320px\" width=\"320px\" alt=\"Sunburst Chart of Common English Words, small\" /\u003e\r\n\r\nThe charts consist of a series of concentric rings, with each ring divided into segments.\r\n\r\nThe rings represent letter positions within words: the innermost ring corresponds to first letters, the next ring to\r\nsecond letters, and so on.\r\n\r\nEach segment within a ring represents a particular letter occurring at that position within a word and following\r\nthe letter adjacent to it on the previous ring. 
The size of each segment represents how often that letter appears\r\nin that position within the corpus. For example, by looking at the innermost ring we can see that the most common\r\nletter at the start of a word is \u003cstrong\u003e'T'\u003c/strong\u003e:\r\n\r\n\u003cimg src=\"https://codebox.net/assets/images/common-english-words-visualisation/wordvis_100000_zoom.png\" height=\"600px\" width=\"600px\" class=\"fancyimage\" style=\"border: 1px solid grey\" alt=\"Sunburst Chart of Common English Words, zoomed\" /\u003e\r\n\r\nMany of the common words found in the corpus can be seen on the chart by starting at the inner ring and reading\r\nradially outwards. For example, the word \u003cstrong\u003e'THE'\u003c/strong\u003e can be seen in the diagram above, at the 10 o'clock position.\r\n\r\n\u003ch3\u003eUsage\u003c/h3\u003e\r\nBefore running the script, you must prepare a correctly formatted text file containing a list of word frequency counts,\r\n\u003ca href=\"http://norvig.com/google-books-common-words.txt\"\u003esuch as this one\u003c/a\u003e.\r\n\r\nRun the script with two command-line arguments: the location of the word file and the desired output file name. For example:\r\n\r\n\u003cpre\u003e\r\npython wordvis.py google-books-common-words.txt words.svg\r\n\u003c/pre\u003e\r\n\r\nThe charts are generated in SVG format, and the resulting files are large.\r\nThe \u003ca href=\"https://codebox.net/assets/images/common-english-words-visualisation/wordvis_100000.png\"\u003eSVG file generated using the Google Books data\u003c/a\u003e\r\nwas around 42 MB in size.\r\n\r\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodebox%2Fwordvis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcodebox%2Fwordvis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodebox%2Fwordvis/lists"}