{"id":29288814,"url":"https://github.com/ebertti/language-resource","last_synced_at":"2025-07-06T03:12:52.668Z","repository":{"id":11417862,"uuid":"13868752","full_name":"ebertti/language-resource","owner":"ebertti","description":"Collection of stopwords, frequent words and other things.","archived":false,"fork":false,"pushed_at":"2013-10-25T20:55:58.000Z","size":192,"stargazers_count":5,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-25T12:07:30.319Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"unlicense","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ebertti.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-10-25T18:28:55.000Z","updated_at":"2023-07-11T12:44:11.000Z","dependencies_parsed_at":"2022-07-12T23:40:35.920Z","dependency_job_id":null,"html_url":"https://github.com/ebertti/language-resource","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ebertti/language-resource","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ebertti%2Flanguage-resource","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ebertti%2Flanguage-resource/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ebertti%2Flanguage-resource/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ebertti%2Flanguage-resource/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ebertti","download_url":"https://codeload.github.com/ebertti/language-resource/tar.gz
/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ebertti%2Flanguage-resource/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263841797,"owners_count":23518497,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-07-06T03:12:42.879Z","updated_at":"2025-07-06T03:12:52.653Z","avatar_url":"https://github.com/ebertti.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"Language Resource\n=================\n\nCollection of stopwords, frequent words and other things.\n\nTo help build an application with [**NLP (Natural Language Processing)**](http://en.wikipedia.org/wiki/Natural_language_processing) like:\n* Stemming\n* Text simplification\n* Text-to-speech\n* Text-proofing\n* Natural language search\n* Query expansion\n* Automated essay scoring\n* Truecasing\n\nor **Search Engines** like:\n* [Lucene](http://lucene.apache.org/core/)\n* [Elastic Search](http://www.elasticsearch.org/)\n* [Whoosh](https://bitbucket.org/mchaput/whoosh/wiki/Home)\n* [Solr](http://lucene.apache.org/solr/)\n* [Xapian](http://xapian.org/)\n\nLanguages\n---------\n| Language [ISO 639-1](http://en.wikipedia.org/wiki/ISO_639-1) | Name       | Stopwords | Frequent Words | Obs   |\n| ------------------- | ---------- |:---------:|:--------------:| ----- |\n| bg                  | Bulgarian  | Yes       | No             | UTF-8 |\n| cz                  | Czech      | Yes       | No             | UTF-8 |\n| de                  | German     | Yes       | Yes            |       |\n| en   
               | English    | Yes       | Yes            |       |\n| es                  | Spanish    | Yes +     | Yes            |       |\n| fi                  | Finnish    | Yes       | Yes            |       |\n| fr                  | French     | Yes       | Yes            |       |\n| hu                  | Hungarian  | Yes       | No             | UTF-8 |\n| it                  | Italian    | Yes       | Yes            | UTF-8 |\n| pl                  | Polish     | Yes       | No             | UTF-8 |\n| pt                  | Portuguese | Yes +     | No             |       |\n| ru                  | Russian    | Yes       | No             | UTF-8 |\n| sv                  | Swedish    | Yes       | Yes            |       |\n\nReference\n---------\nAlmost everything was extracted from http://members.unine.ch/jacques.savoy/clef/\n\nContributing\n------------\n\nFork the repository, make your changes, and open a pull request.\n\nPlease also update this README file to reflect your changes!\n\nThanks for your help!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Febertti%2Flanguage-resource","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Febertti%2Flanguage-resource","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Febertti%2Flanguage-resource/lists"}