{"id":13836904,"url":"https://github.com/brendano/ark-tweet-nlp","last_synced_at":"2025-04-09T07:03:10.955Z","repository":{"id":45535529,"uuid":"2051457","full_name":"brendano/ark-tweet-nlp","owner":"brendano","description":"CMU ARK Twitter Part-of-Speech Tagger","archived":false,"fork":false,"pushed_at":"2023-12-17T15:42:12.000Z","size":61660,"stargazers_count":575,"open_issues_count":26,"forks_count":199,"subscribers_count":64,"default_branch":"master","last_synced_at":"2025-04-02T05:16:17.900Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"http://www.ark.cs.cmu.edu/TweetNLP/","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/brendano.png","metadata":{"files":{"readme":"README.txt","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2011-07-15T04:50:16.000Z","updated_at":"2024-12-09T10:46:18.000Z","dependencies_parsed_at":"2024-01-11T17:47:23.240Z","dependency_job_id":"becbc5aa-d587-4e66-ad7e-48eeca72e277","html_url":"https://github.com/brendano/ark-tweet-nlp","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brendano%2Fark-tweet-nlp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brendano%2Fark-tweet-nlp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brendano%2Fark-tweet-nlp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brendano%2Fark-tweet-nlp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/brendano","download_url":"https://codeload.github.com/brendano/ark-tweet-nlp/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247994120,"owners_count":21030050,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-04T15:00:56.930Z","updated_at":"2025-04-09T07:03:10.938Z","avatar_url":"https://github.com/brendano.png","language":"Java","funding_links":[],"categories":["Java","II. Databases, search engines, big data and machine learning","人工智能"],"sub_categories":["8. Machine Learning"],"readme":"CMU ARK Twitter Part-of-Speech Tagger v0.3.2\nhttp://www.ark.cs.cmu.edu/TweetNLP/\n\nBasic usage for released version\n================================\n\nRequires Java 6.  To run the tagger on example data, try:\n\n    java -Xmx500m -jar ark-tweet-nlp-0.3.2.jar examples/example_tweets.txt\n\nwhere the jar file is the one included in the release download.\nThe tagger outputs tokens, predicted part-of-speech tags, and confidences.\nUse the \"--help\" flag for more information.  On Unix systems, \"./runTagger.sh\"\ninvokes the tagger; e.g.\n\n    ./runTagger.sh examples/example_tweets.txt\n    ./runTagger.sh --help\n\nWe also include a script that invokes just the tokenizer:\n\n    ./twokenize.sh examples/example_tweets.txt\n\nYou may have to adjust the parameters to \"java\" depending on your system.\n\nIf instead you are using a source checkout, see docs/hacking.txt for info.\n\nInformation\n===========\n\nVersion 0.3 of the tagger is much faster and more accurate.  Please see the\ntech report on the website for details.\n\nFor the Java API, see src/cmu/arktweetnlp; especially Tagger.java.\nSee also documentation in docs/ and src/cmu/arktweetnlp/package.html.\n\nThis tagger is described in the following two papers, available at the website.\nPlease cite these if you write a research paper using this software.\n\nPart-of-Speech Tagging for Twitter: Annotation, Features, and Experiments\nKevin Gimpel, Nathan Schneider, Brendan O'Connor, Dipanjan Das, Daniel Mills,\n  Jacob Eisenstein, Michael Heilman, Dani Yogatama, Jeffrey Flanigan, and \n  Noah A. Smith\nIn Proceedings of the Annual Meeting of the Association\n  for Computational Linguistics, companion volume, Portland, OR, June 2011.\nhttp://www.ark.cs.cmu.edu/TweetNLP/gimpel+etal.acl11.pdf\n\nPart-of-Speech Tagging for Twitter: Word Clusters and Other Advances\nOlutobi Owoputi, Brendan O'Connor, Chris Dyer, Kevin Gimpel, and\n  Nathan Schneider.\nTechnical Report, Machine Learning Department. CMU-ML-12-107. September 2012.\n\nContact\n=======\n\nPlease contact Brendan O'Connor (brenocon@cs.cmu.edu) and Kevin Gimpel\n(kgimpel@cs.cmu.edu) if you encounter any problems.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrendano%2Fark-tweet-nlp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbrendano%2Fark-tweet-nlp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrendano%2Fark-tweet-nlp/lists"}