{"id":17794233,"url":"https://github.com/pyk/sayoeti-core","last_synced_at":"2025-03-16T20:31:21.452Z","repository":{"id":71958232,"uuid":"47571100","full_name":"pyk/sayoeti-core","owner":"pyk","description":"Sayoeti is the PR's assistant of Komisi Pemberantasan Korupsi (KPK) that  powered by SVM. He helps KPK by watching all mass media in  Indonesia and provide sentiment analysis of the mass media.","archived":false,"fork":false,"pushed_at":"2015-12-12T04:31:11.000Z","size":485,"stargazers_count":7,"open_issues_count":0,"forks_count":2,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-02-27T13:21:26.362Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pyk.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-12-07T18:29:18.000Z","updated_at":"2018-11-07T12:43:39.000Z","dependencies_parsed_at":"2023-03-15T00:32:55.772Z","dependency_job_id":null,"html_url":"https://github.com/pyk/sayoeti-core","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pyk%2Fsayoeti-core","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pyk%2Fsayoeti-core/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pyk%2Fsayoeti-core/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pyk%2Fsayoeti-core/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pyk","download_url":"https://codeload.github.com/pyk/sayoeti-core/tar.gz/ref
s/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243830912,"owners_count":20354848,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-27T11:15:36.542Z","updated_at":"2025-03-16T20:31:20.610Z","avatar_url":"https://github.com/pyk.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Sayoeti Core\nSayoeti is the PR assistant of the [Komisi Pemberantasan Korupsi][KPK] (KPK), \npowered by Artificial Intelligence. He helps the KPK by watching all mass media in \nIndonesia and providing sentiment analysis of that coverage.\n\n[KPK]: http://www.kpk.go.id/splash/\n\nThis is one of the components that Sayoeti is built on. Basically, `sayoeti-core` is \ncreated to answer the following question:\n\n    Human: Given a document, is this document about corruption news in Indonesia? (Yes/No)\n    Sayoeti: Yes, it is.\n\nLearn more at [https://sayoeti.xyz](https://sayoeti.xyz) (Indonesian).\n\n## Requirements\nWe use a supervised learning method here. Sayoeti needs to learn from a corpus\nof corruption news first. An example corpus from Kompas.com and Liputan6.com\ncan be found [here][corpus].\n\nTo decrease bias, we need to remove commonly used Indonesian words\nlike `dan` \u0026 `di` that carry no meaning in our context. To do this, use a list of\n[Indonesian stopwords][sw].\n\nSayoeti has only been tested on `x86_64`. 
It is compiled using `gcc`.\n\n[corpus]: https://www.dropbox.com/s/vuziwj3wcwfrter/example-corpus.tar.gz?dl=0\n[sw]: https://sites.google.com/site/kevinbouge/stopwords-lists\n\n## Setup\n\nClone the repository\n\n    git clone https://github.com/pyk/sayoeti-ai\n    cd sayoeti-ai\n\nInstall dependencies\n\n    make libmill\n\nBuild Sayoeti\n\n    make\n\nRun Sayoeti\n\n    LD_LIBRARY_PATH=/usr/local/lib ./sayoeti -c /path/to/corpusdir -s /path/to/stopwords/file\n\nSayoeti listens on port `9090` by default.\n\n## Example\nRunning Sayoeti\n\n    $ LD_LIBRARY_PATH=/usr/local/lib ./sayoeti -c corpus -s stopwords_id.txt \n    sayoeti: Create stop words dictionary from stopwords_id.txt\n    sayoeti: stop words dictionary from stopwords_id.txt is created.\n    sayoeti: Create index vocabulary from corpus corpus\n    sayoeti: Index vocabulary from corpus corpus created.\n    sayoeti: compute global IDF for each term in index vocabulary\n    sayoeti: create a problem\n    *\n    optimization finished, #iter = 9\n    obj = 278.605220, rho = 23.603209\n    nSV = 24, nBSV = 23\n    sayoeti: listening on port :9090\n\nSend a document for Sayoeti to read\n\n    $ telnet localhost 9090\n    Trying 127.0.0.1...\n    Connected to localhost.\n    Escape character is '^]'.\n    202 OK sayoeti ready\n\n## License\n\n    Copyright 2015 Bayu Aldi Yansyah \u003cbayualdiyansyah@gmail.com\u003e\n\n    Licensed under the Apache License, Version 2.0 (the \"License\");\n    you may not use this file except in compliance with the License.\n    You may obtain a copy of the License at\n\n        http://www.apache.org/licenses/LICENSE-2.0\n\n    Unless required by applicable law or agreed to in writing, software\n    distributed under the License is distributed on an \"AS IS\" BASIS,\n    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n    See the License for the specific language governing permissions and\n    limitations under the 
License.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpyk%2Fsayoeti-core","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpyk%2Fsayoeti-core","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpyk%2Fsayoeti-core/lists"}