{"id":13990509,"url":"https://github.com/calebwin/frequent","last_synced_at":"2025-04-09T16:34:10.163Z","repository":{"id":100568101,"uuid":"147986036","full_name":"calebwin/frequent","owner":"calebwin","description":"A utility for crawling websites and building frequency lists of words","archived":false,"fork":false,"pushed_at":"2024-02-07T06:56:29.000Z","size":10,"stargazers_count":26,"open_issues_count":2,"forks_count":12,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-23T18:50:53.094Z","etag":null,"topics":["frequency-lists","python","web-crawler","web-crawler-python","word-frequency"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/calebwin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-09-09T02:20:13.000Z","updated_at":"2024-12-12T11:35:19.000Z","dependencies_parsed_at":"2024-08-09T13:21:21.045Z","dependency_job_id":null,"html_url":"https://github.com/calebwin/frequent","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/calebwin%2Ffrequent","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/calebwin%2Ffrequent/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/calebwin%2Ffrequent/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/calebwin%2Ffrequent/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cale
bwin","download_url":"https://codeload.github.com/calebwin/frequent/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248067770,"owners_count":21042352,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["frequency-lists","python","web-crawler","web-crawler-python","word-frequency"],"created_at":"2024-08-09T13:02:50.803Z","updated_at":"2025-04-09T16:34:10.114Z","avatar_url":"https://github.com/calebwin.png","language":"Python","readme":"# Frequent\nfrequent is a utility for crawling websites and building word frequency lists. I mainly made it because I wanted to find the top n most common words on different websites, but I imagine there might be more useful applications. Or not.\n\n```python\nimport frequent\n\n# get word frequencies from the w3schools website\n# limit crawl depth to 25\nword_frequencies = frequent.word_frequencies(\"https://www.w3schools.com\", 25)\n\n# get the top 50 words\ntop_words = word_frequencies.most_common(50)\n\n# print the top 50 most frequent words\nprint(top_words)\n```\n","funding_links":[],"categories":["Python"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcalebwin%2Ffrequent","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcalebwin%2Ffrequent","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcalebwin%2Ffrequent/lists"}