{"id":15757921,"url":"https://github.com/mre/freq","last_synced_at":"2025-03-13T17:34:30.945Z","repository":{"id":53764538,"uuid":"344489739","full_name":"mre/freq","owner":"mre","description":"🗼 A CLI term frequency analyzer. Counts the number of occurrences of each word in an input and creates formatted output or a histogram.","archived":false,"fork":false,"pushed_at":"2021-03-15T14:52:16.000Z","size":189,"stargazers_count":3,"open_issues_count":2,"forks_count":2,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-02-26T20:23:29.196Z","etag":null,"topics":["frequency","histogram","occurences","words"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mre.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE-APACHE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-03-04T13:48:28.000Z","updated_at":"2021-03-15T14:59:09.000Z","dependencies_parsed_at":"2022-09-04T02:11:35.231Z","dependency_job_id":null,"html_url":"https://github.com/mre/freq","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mre%2Ffreq","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mre%2Ffreq/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mre%2Ffreq/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mre%2Ffreq/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mre","download_url":"https://codeload.github.com/mre/freq/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com"
,"kind":"github","repositories_count":243448353,"owners_count":20292594,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["frequency","histogram","occurences","words"],"created_at":"2024-10-04T09:40:55.344Z","updated_at":"2025-03-13T17:34:30.605Z","avatar_url":"https://github.com/mre.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# freq\n\nA commandline tool that counts the number of word occurrences in an input.\n\n[![James Munns on Twitter](assets/tweet.png)](https://twitter.com/bitshiftmask/status/1367451210987544580)\n\nThis is just a placeholder repository for now.\nPlease create issues for feature requests and collaboration.\n\n## Usage\n\n### Commandline\n\n```sh\necho \"b a n a n a\" | freq\n\n0.16666667 - 1 - b\n0.33333334 - 2 - n\n0.5 - 3 - a\n```\n\n### Library\n\n```rust\nuse std::error::Error;\n\nfn main() -\u003e Result\u003c(), Box\u003cdyn Error\u003e\u003e {\n    let frequencies = freq::count(\"fixtures/sample.txt\")?;\n    println!(\"{:?}\", frequencies);\n    Ok(())\n}\n```\n\n## Features\n\n- [x] Ignore words ([regex pattern](https://docs.rs/regex/latest/regex/struct.RegexSet.html)) [[issue 5](https://github.com/mre/freq/issues/5)]\n- [x] Different output formats (plaintext, JSON)\n- [x] freq.toml configuration file\n- [x] Filter stopwords (similar to NLTK's stopwords)\n- [ ] Performance (SIMD support, async execution)\n- [ ] Recursion support\n- [ ] Allow skipping files\n- [ ] Allow specifying ignored words in a separate file\n- [ ] Generate \"heat bars\" for words like shell-hist does\n- [ ] Split report by 
file/folder (sort of like `sloc` does for code)\n- [ ] Choose language for stopwords (`--lang fr`)\n- [ ] Format output (e.g. justify counts a la `uniq -c`)\n- [ ] Interactive mode (shows stats while running) (`--interactive`)\n- [ ] Calculate TF-IDF score in a multi-file scenario\n- [ ] Limit the output to the top N words (e.g. `--top 3`)\n- [ ] Ignore hidden files (names beginning with `.`)\n- [ ] Minimize number of allocations\n- [ ] No-std support?\n- [ ] Ignore \"words\" only consisting of special characters, e.g. `///`\n- [ ] Multiple files as inputs\n- [ ] Glob input patterns\n- [ ] If directory is given, walk contents of folder recursively (walker)\n- [ ] Verbose output (show currently analyzed file etc)\n- [ ] Library usage\n- [ ] https://github.com/jonhoo/evmap\n- [ ] Automated abstract generation with Luhn's algorithm [Issue #1](https://github.com/mre/freq/issues/1)\n\nIdea contributors:\n\n- [@jamesmunns](https://github.com/jamesmunns)\n- [@M3t0r](https://github.com/M3t0r)\n- [@themihel](https://github.com/themihel)\n- [@AlexanderThaller](https://github.com/AlexanderThaller)\n- [@pizzamig](https://github.com/pizzamig)\n- Want to see your name here? Create an issue!\n\n## Similar tools\n\n**tot-up**\n\nSimilar tool written in Rust with nice graphical output\nhttps://github.com/payload/tot-up\n\n**uniq**\n\nA basic version would be\n\n```sh,ignore\ncurl -L 'https://github.com/mre/freq/raw/main/README.md' | tr -cs '[:alnum:]' \"\\n\" | grep -vEx 'and|or|for|a|of|to|an|in' | sort | uniq -c | sort\n```\n\nThis works, but it's not very extensible by normal users.\nIt would also lack most of the features listed above.\n\n**Lucene**\n\nHas all the bells and whistles, but there is no official CLI and it requires a full Java installation.\n\n**wordcount**\n\n`freqword \u003ctab\u003e freq`\n\nNice and simple. 
Doesn't exclude stopwords and has no regex support, though.\nhttps://github.com/juditacs/wordcount\n\n**word-frequency**\n\nHaskell-based approach: Includes features like min length for words, or min occurrences of words in a text.\nhttps://github.com/cbzehner/word-frequency\n\n**What else?**\n\nThere must be more tools out there. Can you help me find them?\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmre%2Ffreq","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmre%2Ffreq","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmre%2Ffreq/lists"}