{"id":19825895,"url":"https://github.com/octachron/locc","last_synced_at":"2025-07-01T14:06:03.420Z","repository":{"id":77493254,"uuid":"43022908","full_name":"Octachron/locc","owner":"Octachron","description":"Linguistic Ocaml Comment Classifier","archived":false,"fork":false,"pushed_at":"2015-09-23T19:46:06.000Z","size":124,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-02-28T20:47:11.447Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"OCaml","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Octachron.png","metadata":{"files":{"readme":"Readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-09-23T19:44:05.000Z","updated_at":"2015-09-23T19:46:07.000Z","dependencies_parsed_at":null,"dependency_job_id":"90533f8d-f729-4e5c-88fe-64cb0910be61","html_url":"https://github.com/Octachron/locc","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Octachron/locc","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Octachron%2Flocc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Octachron%2Flocc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Octachron%2Flocc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Octachron%2Flocc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Octachron","download_url":"https://codeload.github.com/Octachron/locc/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Octachron%2Flocc/sbom","
host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262978551,"owners_count":23394008,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T11:08:59.522Z","updated_at":"2025-07-01T14:06:03.399Z","avatar_url":"https://github.com/Octachron.png","language":"OCaml","readme":"Locc is a helper tool to extract comments from OCaml source code and classify\nthem according to their primary language. The classification is done using\naspell dictionaries coupled to a maximum-likelihood estimator based on an utterly \nsimplistic statistical model.\n\n##Model\n\nThe statistical model used assumes that\n\n  * There is no correlation between words, i.e. sentences are i.i.d. sequences\nof words\n\n  * For a given primary language, every word existing in a language has the same probability\n\nFor instance, the default model is \n\n\n```\n     \\    french   english  unknown      secondary\n\n  french   0.8      0.1        0.1\n\n  english  0.1      0.8        0.1\n\n  primary\n\n\n```\n\nIn this model, we assume that within a text whose primary language is French, each word has a \n10% probability of being English and a 10% probability of being of unknown origin.\nConversely, for a text whose primary language is English, the model considers that\neach word has a 10% probability of being French and a 10% probability of being unknown.\n\n\n\n##Usage\n\n```sh\nlocc -m model -o logs target\n```\nWith this invocation, locc will analyze all the OCaml source files \n(i.e. \".ml{,i,y,l}\") present in `target`. 
If `target` is a directory, all the \nfiles and sub-directories contained in `target` will be analyzed.\n\nLocc will then print to standard output a report listing the number of comments detected \nunder each subclass of the `model`. The detailed log of the analysis will be \nwritten in the `logs` directory.\n\nIf the option `model` is not provided, the default model is\n\n```\n fr 0.8 0.1 0.1\n%en 0.1 0.8 0.1\n\n```\nA model is a '%'-separated list of entries, each made of a \"primary language name\" followed by the \nlist of language probabilities within a text of that primary language. Note that\nthe primary language name must be an aspell dictionary name.\n\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foctachron%2Flocc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Foctachron%2Flocc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foctachron%2Flocc/lists"}