{"id":23395699,"url":"https://github.com/randogoth/lyagushka","last_synced_at":"2025-04-08T17:19:22.237Z","repository":{"id":225404468,"uuid":"765356903","full_name":"randogoth/lyagushka","owner":"randogoth","description":"The algorithm identifies clusters and gaps in integer datasets, calculates their Z-scores based on mean density and distance, and outputs the results as JSON.","archived":false,"fork":false,"pushed_at":"2025-01-09T18:22:03.000Z","size":104,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-14T13:36:38.269Z","etag":null,"topics":["attractors","clustering","clusters","gaps","statistical-analysis","voids","z-score"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/randogoth.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-29T19:05:09.000Z","updated_at":"2025-01-09T17:38:51.000Z","dependencies_parsed_at":"2024-04-09T05:28:55.463Z","dependency_job_id":"c0fde095-01eb-442d-804e-c8a41a1d4e9a","html_url":"https://github.com/randogoth/lyagushka","commit_stats":null,"previous_names":["randogoth/lyagushka"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/randogoth%2Flyagushka","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/randogoth%2Flyagushka/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/randogoth%2Flyagushka/releases","manifests_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/randogoth%2Flyagushka/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/randogoth","download_url":"https://codeload.github.com/randogoth/lyagushka/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247888568,"owners_count":21013002,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["attractors","clustering","clusters","gaps","statistical-analysis","voids","z-score"],"created_at":"2024-12-22T07:16:57.229Z","updated_at":"2025-04-08T17:19:22.220Z","avatar_url":"https://github.com/randogoth.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# lyagushka\n\n(Russian лягушка [lʲɪˈɡuʂkə]: frog)\n\nLyagushka is a Rust command-line tool inspired by Fatum Project's ['Zhaba' algorithm](https://gist.github.com/randogoth/ab5ab9e8665303be176f16241e7b26b5) (Russian 'жаба': toad) and expands upon it for more versatility.\n\nIt is an algorithm that analyzes a one-dimensional dataset of integers to identify clusters of closely grouped \"attractor\" points and significant \"void\" gaps between these clusters. It calculates z-scores for each cluster or gap to measure their statistical significance relative to the dataset's mean density and distance between points. 
The analysis results, including attractors, voids, and their z-scores, are output as a JSON string.\n\n## Building\n\nWith a Rust and Cargo environment set up, simply run:\n\n```sh\ncargo build --release\n```\n\nTo also compile a Python wheel, you need Maturin set up:\n\n```sh\npipenv install\npipenv shell\nmaturin build --release\npip install target/wheels/lyagushka-1.1.0*.whl\n```\n\n## Usage\n\n### Parameters\n\n*  `filename.txt` (optional): A file containing a newline-separated list of integers to analyze. If not provided, the program expects input from stdin.\n*  `factor`: A floating-point value by which the mean density/span is multiplied to form the threshold for attractor and void detection.\n*  `min_cluster_size`: An integer specifying the minimum number of contiguous points required to be considered a cluster.\n\n### Output\n\nThe tool outputs a JSON string that includes details about the identified attractors and voids, along with their respective z-scores. Here's an example of the JSON output format:\n\n```json\n\n[\n  //...\n  {\n    \"elements\": [ 722, 722, 722, 725, 725, 726, 726, 726],\n    \"start\": 722,\n    \"end\": 726,\n    \"span_length\": 4,\n    \"num_elements\": 8,\n    \"centroid\": 724.0,\n    \"z_score\": 1.19528\n  },\n  {\n    \"elements\": [],\n    \"start\": 732,\n    \"end\": 740,\n    \"span_length\": 8,\n    \"num_elements\": 0,\n    \"centroid\": 736.0,\n    \"z_score\": -1.13359\n  },\n  //...\n]\n```\n\n### From a File\n\nTo analyze a dataset from a file, provide the filename as an argument, followed by the factor and minimum cluster size parameters:\n\n```sh\nlyagushka random_values.txt 1.5 6\n```\n(= '*Attractor clusters need to have at least 6 numbers with 1.5 times the mean density, void gaps need to be at least 1.5 times the mean gap size wide*')\n\n### From Stdin\n\nAlternatively, you can pipe a list of integers into the tool, followed by the factor and minimum cluster size.\n\n```sh\ncat random_values.txt | lyagushka 0.5 2\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frandogoth%2Flyagushka","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frandogoth%2Flyagushka","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frandogoth%2Flyagushka/lists"}