{"id":37073215,"url":"https://github.com/federicotdn/inelastic","last_synced_at":"2026-01-14T08:35:39.547Z","repository":{"id":58113348,"uuid":"144212678","full_name":"federicotdn/inelastic","owner":"federicotdn","description":"Print an Elasticsearch inverted index as a CSV table or JSON object.","archived":true,"fork":false,"pushed_at":"2024-03-20T17:40:51.000Z","size":38,"stargazers_count":11,"open_issues_count":1,"forks_count":4,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-11-27T18:27:22.973Z","etag":null,"topics":["csv","elastic","elasticsearch","index","inverted","json","search"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/federicotdn.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-08-09T23:19:27.000Z","updated_at":"2025-05-18T18:50:34.000Z","dependencies_parsed_at":"2022-09-20T03:54:22.136Z","dependency_job_id":null,"html_url":"https://github.com/federicotdn/inelastic","commit_stats":null,"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"purl":"pkg:github/federicotdn/inelastic","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/federicotdn%2Finelastic","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/federicotdn%2Finelastic/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/federicotdn%2Finelastic/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/federicotdn%2Finelastic/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/federicotdn","download_url":"htt
ps://codeload.github.com/federicotdn/inelastic/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/federicotdn%2Finelastic/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28414514,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-14T08:31:27.429Z","status":"ssl_error","status_checked_at":"2026-01-14T08:31:19.098Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","elastic","elasticsearch","index","inverted","json","search"],"created_at":"2026-01-14T08:35:38.880Z","updated_at":"2026-01-14T08:35:39.538Z","avatar_url":"https://github.com/federicotdn.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# inelastic\n[![Build Status](https://travis-ci.org/federicotdn/inelastic.svg)](https://travis-ci.org/federicotdn/inelastic)\n[![Version](https://img.shields.io/pypi/v/inelastic.svg?style=flat)](https://pypi.python.org/pypi/inelastic)\n![](https://img.shields.io/badge/python-3-blue.svg)\n![](https://img.shields.io/badge/code%20style-black-000000.svg)\n\nPrint an Elasticsearch inverted index as a CSV table or JSON object.\n\n`inelastic` builds an approximation of how an [inverted index](https://www.elastic.co/blog/found-elasticsearch-from-the-bottom-up) would look like for a particular index and document field, using the [Multi termvectors 
API](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-multi-termvectors.html) on all stored documents.\n\n## Installation\nTo install `inelastic`, run the following command:\n```bash\n$ pip3 install --upgrade inelastic\n```\n\n`inelastic` currently only supports Elasticsearch versions 6.X and 7.X.\n\n## Example\n\nHaving the following index:\n```\nPUT /tweets\n{\n    \"mappings\": {\n        \"properties\": {\n            \"content\": {\n                \"type\": \"text\"\n            }\n        }\n    }\n}\n```\n\nwith the following documents:\n```\nPOST /tweets/_bulk\n{ \"index\": { \"_id\": 1 }}\n{ \"content\": \"This is my first tweet.\" }\n{ \"index\": { \"_id\": 2 }}\n{ \"content\": \"Most Elasticsearch examples use tweets.\" }\n{ \"index\": { \"_id\": 3 }}\n{ \"content\": \"This is an example.\" }\n{ \"index\": { \"_id\": 4 }}\n{ \"content\": \"Adding some more tweets.\" }\n{ \"index\": { \"_id\": 5 }}\n{ \"content\": \"Adding more and more tweets.\" }\n```\n\n`inelastic` could be used as follows (combined with the `column` command):\n\n```bash\n$ inelastic -i tweets -f content | column -t -s ,\n```\n\nWhich would output:\n```\nterm           freq  doc_count  d0  d1  d2\nadding         2     2          4   5\nan             1     1          3\nand            1     1          5\nelasticsearch  1     1          2\nexample        1     1          3\nexamples       1     1          2\nfirst          1     1          1\nis             2     2          1   3\nmore           3     2          4   5\nmost           1     1          2\nmy             1     1          1\nsome           1     1          4\nthis           2     2          1   3\ntweet          1     1          1\ntweets         3     3          2   4   5\nuse            1     1          2\n```\n\nThe `freq` field specifies the total amount of times the term appears in all documents, and the `doc_count` field specifies how many documents contain the term at least once. 
The `d0`, `d1`... fields list the IDs for documents containing the term.\n\nThe chosen document field's type must be `text` or `keyword`.\n\n## Usage\nThese are the arguments `inelastic` accepts:\n- `-i` (`--index`): Index name (**required**).\n- `-f` (`--field`): Document field name from which to generate the inverted index (**required**).\n- `-l` (`--id-field`): Document field to use as ID when printing results (*default: _id*).\n- `-o` (`--output`): Output format, `json` or `csv` (*default: csv*).\n- `-p` (`--port`): Elasticsearch host port (*default: 9200*).\n- `-e` (`--host`): Elasticsearch host address (*default: localhost*).\n- `-q` (`--query`): Elasticsearch DSL JSON query to use when fetching documents (*default: None*).\n- `-d` (`--doctype`): Document type (*default: _doc*) (**Elasticsearch 6.X only**).\n- `-v` (`--verbose`): Print debug information (*default: False*).\n- `-h` (`--help`): Show help and exit.\n\n## Scripting\nThe `inelastic` module exposes the `InvertedIndex` class, which can be used in custom Python scripts:\n```python\nfrom inelastic import InvertedIndex\nfrom elasticsearch import Elasticsearch  # Only with ES 7.X\nfrom elasticsearch6 import Elasticsearch # Only with ES 6.X\n\nes = Elasticsearch()\nii = InvertedIndex(search_size=250, scroll_time='10s')\n\nn_docs, errors = ii.read_index(es, 'tweets', 'content')\n\nprint('# docs: {}, # errors: {}'.format(n_docs, errors))\n\nfor entry in ii.to_list():\n    print(entry)\n```\n\nWhen run, the previous script will output:\n```\n# docs: 5, # errors: 0\n('adding', \u003cIndexEntry IDs: ['4', '5']\u003e)\n('an', \u003cIndexEntry IDs: ['3']\u003e)\n('and', \u003cIndexEntry IDs: ['5']\u003e)\n('elasticsearch', \u003cIndexEntry IDs: ['2']\u003e)\n('example', \u003cIndexEntry IDs: ['3']\u003e)\n('examples', \u003cIndexEntry IDs: ['2']\u003e)\n('first', \u003cIndexEntry IDs: ['1']\u003e)\n('is', \u003cIndexEntry IDs: ['1', '3']\u003e)\n('more', \u003cIndexEntry IDs: ['4', '5']\u003e)\n('most', 
\u003cIndexEntry IDs: ['2']\u003e)\n('my', \u003cIndexEntry IDs: ['1']\u003e)\n('some', \u003cIndexEntry IDs: ['4']\u003e)\n('this', \u003cIndexEntry IDs: ['1', '3']\u003e)\n('tweet', \u003cIndexEntry IDs: ['1']\u003e)\n('tweets', \u003cIndexEntry IDs: ['2', '4', '5']\u003e)\n('use', \u003cIndexEntry IDs: ['2']\u003e)\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffedericotdn%2Finelastic","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffedericotdn%2Finelastic","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffedericotdn%2Finelastic/lists"}