{"id":19572714,"url":"https://github.com/kbariotis/documer","last_synced_at":"2025-04-27T04:32:28.388Z","repository":{"id":57005061,"uuid":"9913516","full_name":"kbariotis/documer","owner":"kbariotis","description":"Bayes algorithm implementation with PHP","archived":false,"fork":false,"pushed_at":"2018-03-19T12:08:18.000Z","size":46,"stargazers_count":77,"open_issues_count":1,"forks_count":5,"subscribers_count":8,"default_branch":"master","last_synced_at":"2024-04-14T13:10:02.399Z","etag":null,"topics":["bayes-algorithm","php"],"latest_commit_sha":null,"homepage":"","language":"PHP","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kbariotis.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-05-07T14:18:02.000Z","updated_at":"2024-01-29T11:15:27.000Z","dependencies_parsed_at":"2022-08-21T12:10:50.460Z","dependency_job_id":null,"html_url":"https://github.com/kbariotis/documer","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kbariotis%2Fdocumer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kbariotis%2Fdocumer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kbariotis%2Fdocumer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kbariotis%2Fdocumer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kbariotis","download_url":"https://codeload.github.com/kbariotis/documer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":22
4059248,"owners_count":17248762,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bayes-algorithm","php"],"created_at":"2024-11-11T06:28:23.024Z","updated_at":"2024-11-11T06:28:24.334Z","avatar_url":"https://github.com/kbariotis.png","language":"PHP","readme":"Documer\n==============\nBayes algorithm implementation in PHP for automatic document classification.\n\nConcept\n-----------------------------\n\n_every document has key words e.g. *Margaret Thatcher*_\n\n_every document has a label e.g. *Politics*_\n\nSuppose that every document contains *key words, all starting with an uppercase letter*. We store these words in our DB, and every time we need to score a document against a particular *label*, we use the Bayes algorithm.\n\nLet's clarify:\n\n**Training:**\n\nFirst, we tokenize the document and keep only our key words (all words starting with an uppercase letter) in an array. We store that array in our DB.\n\n**Guessing:**\n\nThis is very simple. Again, we parse the document we want to classify and create an array of its key words. 
Here is the pseudo code:\n\n\tfor every label in DB\n\t\tfor every key word in document\n\t\t\tP(label/word) = P(word/label)P(label) /\t( P(word/label)P(label) + (1 - P(word/label))(1 - P(label)) )\n\nUsage\n------------\n**Install through Composer**\n\n```json\n\"require\": {\n    \"kbariotis/documer\": \"dev-master\"\n  },\n```\n\n**Instantiate**\n\nPass a Storage Adapter object to the Documer constructor.\n\n```php\n\n$documer = new Documer\\Documer(new \\Documer\\Storage\\Memory());\n```\n\n**Train**\n\n```php\n$documer-\u003etrain('politics', 'This is text about Politics and more');\n$documer-\u003etrain('philosophy', 'Socrates is an ancient Greek philosopher');\n$documer-\u003etrain('athletic', 'Have no idea about athletics. Sorry.');\n$documer-\u003etrain('athletic', 'Not a clue.');\n$documer-\u003etrain('athletic', 'It is just not my thing.');\n```\n\n**Guess**\n\n```php\n$scores = $documer-\u003eguess('What do we know about Socrates?');\n```\n\n`$scores` will hold an array with all the labels in your system and the probability that the document belongs to\neach label.\n\n**Storage Adapters**\nImplement [Documer\\Storage\\Adapter](src/Storage/Adapter.php) to create your own Storage Adapter.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkbariotis%2Fdocumer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkbariotis%2Fdocumer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkbariotis%2Fdocumer/lists"}