{"id":43617517,"url":"https://github.com/bjpop/gurita","last_synced_at":"2026-02-04T12:38:06.777Z","repository":{"id":43730346,"uuid":"229207902","full_name":"bjpop/gurita","owner":"bjpop","description":"A convenient and expressive tool for data analytics and plotting on the command line","archived":false,"fork":false,"pushed_at":"2024-01-11T07:19:59.000Z","size":20792,"stargazers_count":4,"open_issues_count":24,"forks_count":3,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-01-12T16:51:59.107Z","etag":null,"topics":["command-line","data-analysis","data-science","pandas","plotting","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bjpop.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"docs/contributing.html","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2019-12-20T06:54:10.000Z","updated_at":"2023-11-23T11:36:40.000Z","dependencies_parsed_at":"2024-01-08T14:14:15.952Z","dependency_job_id":null,"html_url":"https://github.com/bjpop/gurita","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"purl":"pkg:github/bjpop/gurita","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bjpop%2Fgurita","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bjpop%2Fgurita/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bjpop%2Fgurita/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bjpop%2Fgurita/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bjpop","download_url":"https
://codeload.github.com/bjpop/gurita/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bjpop%2Fgurita/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29084418,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-04T03:31:03.593Z","status":"ssl_error","status_checked_at":"2026-02-04T03:29:50.742Z","response_time":62,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["command-line","data-analysis","data-science","pandas","plotting","python"],"created_at":"2026-02-04T12:38:05.355Z","updated_at":"2026-02-04T12:38:06.760Z","avatar_url":"https://github.com/bjpop.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/_images/gurita_using_computer.png\" width=\"250\" alt=\"fun image of octopus using a computer\"\u003e\n\u003c/p\u003e\n\n# Gurita: a command line data analytics and plotting tool \n\n\nGurita is a command line tool for analysing and visualising tabular data in CSV or TSV format.\n\nAt its core Gurita provides a suite of commands, each of which carries out a common data analytics or plotting task.\n\n**A unique and powerful feature of Gurita** is that commands can be chained together into flexible analysis pipelines. 
See the advanced example below.\n\nIt is designed to be fast and convenient, and is particularly suited to data exploration tasks. Input files with large numbers of rows (\u003e millions) are readily supported.\n\nGurita commands are highly customisable; however, sensible defaults are applied. Therefore simple tasks are easy to express\nand complex tasks are possible.\n\nGurita is implemented in [Python](http://www.python.org/) and makes extensive use of the [Pandas](https://pandas.pydata.org/), [Seaborn](https://seaborn.pydata.org/), and [Scikit-learn](https://scikit-learn.org/) libraries for data processing and plot generation.\n\n# Documentation\n\nPlease consult the [Gurita Documentation](https://bjpop.github.io/gurita/index.html) for detailed information about installation and usage.\n\n# Examples\n\n### Simple example\n\nBox plot of `sepal_length` for each species in the classic [iris dataset](https://github.com/mwaskom/seaborn-data/blob/master/iris.csv/):\n\n```bash\ncat iris.csv | gurita box -x species -y sepal_length\n```\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/_images/box.species.sepal_length.png\" width=\"400\" alt=\"example box plot of sepal_length for each species in the classic iris dataset\"\u003e\n\u003c/p\u003e\n\n### Advanced example \n\nThe following example illustrates Gurita's ability to chain commands together. \n\nCommands in a chain are separated by the plus sign (+) and data flows from left to right in the chain.\n\n```bash\ncat iris.csv | gurita filter 'species != \"virginica\"' \\\n                      + sample 0.9 \\\n                      + pca \\\n                      + scatter -x pc1 -y pc2 --hue species\n```\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/_images/scatter.pc1.pc2.species.png\" width=\"500\" alt=\"Scatter plot comparing principal components pc1 and pc2 from a filtered iris dataset\"\u003e\n\u003c/p\u003e\n\nIn this example there are 4 commands that are executed in the following order:\n\n1. 
The ``filter`` command selects all rows where ``species`` is not equal to ``virginica``.\n2. The filtered rows are then passed to the ``sample`` command which randomly selects 90% of the remaining rows.\n3. The sampled rows are then passed to the ``pca`` command which performs principal component analysis (PCA) as a data reduction step, yielding two extra columns in the data called ``pc1`` and ``pc2``.\n4. Finally the PCA-transformed data is passed to the ``scatter`` command which generates a scatter plot of ``pc1`` and ``pc2`` (the first two principal components).\n\n# Licence\n\nThis program is released as open source software under the terms of the [MIT License](https://raw.githubusercontent.com/bjpop/gurita/master/LICENSE).\n\n# Authors\n\n * [Bernie Pope](http://www.berniepope.id.au/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbjpop%2Fgurita","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbjpop%2Fgurita","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbjpop%2Fgurita/lists"}