{"id":13654248,"url":"https://github.com/miku/esbulk","last_synced_at":"2026-02-20T13:00:54.717Z","repository":{"id":20096901,"uuid":"23366356","full_name":"miku/esbulk","owner":"miku","description":"Bulk indexing command line tool for elasticsearch.","archived":false,"fork":false,"pushed_at":"2025-12-12T17:22:01.000Z","size":9796,"stargazers_count":283,"open_issues_count":10,"forks_count":41,"subscribers_count":11,"default_branch":"master","last_synced_at":"2025-12-13T22:56:30.758Z","etag":null,"topics":["code4lib","elasticsearch","hacktoberfest","indexing"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/miku.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2014-08-26T20:50:22.000Z","updated_at":"2025-12-12T17:22:05.000Z","dependencies_parsed_at":"2024-01-15T09:05:16.133Z","dependency_job_id":"d881ffa5-ea09-479b-859f-a3b4ede5f4c5","html_url":"https://github.com/miku/esbulk","commit_stats":null,"previous_names":[],"tags_count":59,"template":false,"template_full_name":null,"purl":"pkg:github/miku/esbulk","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/miku%2Fesbulk","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/miku%2Fesbulk/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/miku%2Fesbulk/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/miku%2Fe
sbulk/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/miku","download_url":"https://codeload.github.com/miku/esbulk/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/miku%2Fesbulk/sbom","scorecard":{"id":646825,"data":{"date":"2025-08-11","repo":{"name":"github.com/miku/esbulk","commit":"9984133e8f6c49fb1318686c0ff6a61107baab1c"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":2.2,"checks":[{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Code-Review","score":0,"reason":"Found 0/23 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and 
uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Pinned-Dependencies","score":5,"reason":"dependency not pinned by hash detected -- score normalized to 5","details":["Warn: containerImage not pinned by hash: extra/Dockerfile:4","Info:   0 out of   1 containerImage dependencies pinned","Info:   1 out of   1 goCommand dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security 
policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: GNU General Public License v3.0: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Signed-Releases","score":0,"reason":"Project has not signed or included provenance with any releases.","details":["Warn: release artifact v0.7.22 not signed: https://api.github.com/repos/miku/esbulk/releases/205268316","Warn: release artifact v0.7.21 not signed: https://api.github.com/repos/miku/esbulk/releases/196459304","Warn: release artifact v0.7.20 not signed: https://api.github.com/repos/miku/esbulk/releases/170268195","Warn: release artifact v0.7.19 not signed: https://api.github.com/repos/miku/esbulk/releases/154788720","Warn: release artifact v0.7.18 not signed: https://api.github.com/repos/miku/esbulk/releases/146885880","Warn: release artifact v0.7.22 does not have provenance: 
https://api.github.com/repos/miku/esbulk/releases/205268316","Warn: release artifact v0.7.21 does not have provenance: https://api.github.com/repos/miku/esbulk/releases/196459304","Warn: release artifact v0.7.20 does not have provenance: https://api.github.com/repos/miku/esbulk/releases/170268195","Warn: release artifact v0.7.19 does not have provenance: https://api.github.com/repos/miku/esbulk/releases/154788720","Warn: release artifact v0.7.18 does not have provenance: https://api.github.com/repos/miku/esbulk/releases/146885880"],"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 7 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Vulnerabilities","score":4,"reason":"6 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: GO-2025-3829 / GHSA-4vq8-7jfc-9cvp","Warn: Project is vulnerable to: GO-2024-3321 / GHSA-v778-237x-gjrc","Warn: Project is vulnerable to: GO-2025-3487 / GHSA-hcg3-q754-cr77","Warn: Project is vulnerable to: GO-2024-3333","Warn: Project is vulnerable to: GO-2025-3503 / GHSA-qxp5-gwg8-xv66","Warn: Project is vulnerable to: GO-2025-3595 / GHSA-vvgc-356p-c3xw"],"documentation":{"short":"Determines if the project has open, known unfixed 
vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}}]},"last_synced_at":"2025-08-21T12:21:36.586Z","repository_id":20096901,"created_at":"2025-08-21T12:21:36.586Z","updated_at":"2025-08-21T12:21:36.586Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29651964,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-20T09:27:29.698Z","status":"ssl_error","status_checked_at":"2026-02-20T09:26:12.373Z","response_time":59,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["code4lib","elasticsearch","hacktoberfest","indexing"],"created_at":"2024-08-02T02:01:25.636Z","updated_at":"2026-02-20T13:00:54.710Z","avatar_url":"https://github.com/miku.png","language":"Go","funding_links":[],"categories":["Uncategorized","Elasticsearch developer tools and utilities","Go"],"sub_categories":["Uncategorized","Import and Export"],"readme":"esbulk\n======\n\nFast parallel command line [bulk loading](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html) utility for elasticsearch. Data is read from a\n[newline delimited JSON](http://jsonlines.org/) file or stdin and indexed into elasticsearch in bulk\n*and* in parallel. 
The shortest command would be:\n\n```shell\n$ esbulk -index my-index-name \u003c file.ldj\n```\n\nCaveat: If indexing *pressure* on the bulk API is too high (dozens or hundreds of\nparallel workers, large batch sizes, depending on your setup), esbulk will halt\nand report an error:\n\n```shell\n$ esbulk -index my-index-name -w 100 file.ldj\n2017/01/02 16:25:25 error during bulk operation, try less workers (lower -w value) or\n                    increase thread_pool.bulk.queue_size in your nodes\n```\n\nPlease note that, in such a case, some documents are indexed and some are not.\nYour index will be in an inconsistent state, since there is no transactional\nbracket around the indexing process.\n\nHowever, using defaults (parallelism: number of cores) on a single-node setup\nwill just work. For larger clusters, increase the number of workers until you\nsee full CPU utilization. After that, more workers won't buy any more speed.\n\nCurrently, esbulk is [tested against](https://git.io/Jzg2u) elasticsearch\nversions 5, 6, 7 and 8 using\n[testcontainers](https://github.com/testcontainers/testcontainers-go). 
Originally written for [Leipzig University\nLibrary](https://en.wikipedia.org/wiki/Leipzig_University_Library), [project\nfinc](https://finc.info).\n\n[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)\n![GitHub All Releases](https://img.shields.io/github/downloads/miku/esbulk/total.svg)\n\nInstallation\n------------\n\n    $ go install github.com/miku/esbulk/cmd/esbulk@latest\n\nFor `deb` or `rpm` packages, see: https://github.com/miku/esbulk/releases\n\nUsage\n-----\n\n    $ esbulk -h\n\tUsage of esbulk:\n      -0    set the number of replicas to 0 during indexing\n      -c string\n            create index mappings, settings, aliases, https://is.gd/3zszeu\n      -cpuprofile string\n            write cpu profile to file\n      -id string\n            name of field to use as id field, by default ids are autogenerated\n      -index string\n            index name\n      -k    skip insecure certificate verification\n      -mapping string\n            mapping string or filename to apply before indexing\n      -memprofile string\n            write heap profile to file\n      -optype string\n            optype (index - will replace existing data,\n                    create - will only create a new doc,\n                    update - create new or update existing data) (default \"index\")\n      -p string\n            pipeline to use to preprocess documents\n      -purge\n            purge any existing index before indexing\n      -purge-pause duration\n            pause after purge (default 1s)\n      -r string\n            Refresh interval after import (default \"1s\")\n      -seed int\n            seed for random server selection (default: current unix nano)\n      -server value\n            elasticsearch server, this works with https as well\n      -size int\n            bulk batch size (default 1000)\n      -skipbroken\n  
          skip broken json\n      -timeout duration\n            timeout for HTTP requests (default 30s)\n      -type string\n            elasticsearch doc type (deprecated since ES7)\n      -u string\n            http basic auth username:password, like curl -u\n      -v    prints current program version\n      -verbose\n            output basic progress\n      -w int\n            number of workers to use (default 8)\n      -z    unzip gz'd file on the fly\n\n\n![](https://raw.githubusercontent.com/miku/esbulk/master/docs/asciicast.gif)\n\nTo index a JSON file that contains one document\nper line, run:\n\n    $ esbulk -index example file.ldj\n\nWhere `file.ldj` is line-delimited JSON, like:\n\n    {\"name\": \"esbulk\", \"version\": \"0.2.4\"}\n    {\"name\": \"estab\", \"version\": \"0.1.3\"}\n    ...\n\nBy default, `esbulk` will use as many parallel\nworkers as there are cores. To tweak the indexing\nprocess, adjust the `-size` and `-w` parameters.\n\nYou can index from gzipped files as well, using\nthe `-z` flag:\n\n    $ esbulk -z -index example file.ldj.gz\n\nStarting with 0.3.7, the preferred method to set a\nnon-default server hostport is via `-server`, e.g.\n\n    $ esbulk -server https://0.0.0.0:9201\n\nThis way, you can use https as well, which was not\npossible before. 
Options `-host` and `-port` are\ngone as of [esbulk 0.5.0](https://github.com/miku/esbulk/releases/tag/v0.5.0).\n\nReusing IDs\n-----------\n\nSince version 0.3.8: If you want to reuse IDs from your documents in elasticsearch, you\ncan specify the ID field via the `-id` flag:\n\n    $ cat file.json\n    {\"x\": \"doc-1\", \"db\": \"mysql\"}\n    {\"x\": \"doc-2\", \"db\": \"mongo\"}\n\nHere, we would like to reuse the ID from field *x*.\n\n    $ esbulk -id x -index throwaway -verbose file.json\n    ...\n\n    $ curl -s http://localhost:9200/throwaway/_search | jq\n    {\n      \"took\": 2,\n      \"timed_out\": false,\n      \"_shards\": {\n        \"total\": 5,\n        \"successful\": 5,\n        \"failed\": 0\n      },\n      \"hits\": {\n        \"total\": 2,\n        \"max_score\": 1,\n        \"hits\": [\n          {\n            \"_index\": \"throwaway\",\n            \"_type\": \"default\",\n            \"_id\": \"doc-2\",\n            \"_score\": 1,\n            \"_source\": {\n              \"x\": \"doc-2\",\n              \"db\": \"mongo\"\n            }\n          },\n          {\n            \"_index\": \"throwaway\",\n            \"_type\": \"default\",\n            \"_id\": \"doc-1\",\n            \"_score\": 1,\n            \"_source\": {\n              \"x\": \"doc-1\",\n              \"db\": \"mysql\"\n            }\n          }\n        ]\n      }\n    }\n\nNested ID fields\n----------------\n\nVersion 0.4.3 adds support for nested ID fields:\n\n```\n$ cat fixtures/pr-8-1.json\n{\"a\": {\"b\": 1}}\n{\"a\": {\"b\": 2}}\n{\"a\": {\"b\": 3}}\n\n$ esbulk -index throwaway -id a.b \u003c fixtures/pr-8-1.json\n...\n```\n\nConcatenated ID\n---------------\n\nVersion 0.4.3 adds support for IDs that are the concatenation of multiple fields:\n\n```\n$ cat fixtures/pr-8-2.json\n{\"a\": {\"b\": 1}, \"c\": \"a\"}\n{\"a\": {\"b\": 2}, \"c\": \"b\"}\n{\"a\": {\"b\": 3}, \"c\": \"c\"}\n\n$ esbulk -index throwaway -id a.b,c \u003c fixtures/pr-8-2.json\n...\n\n      
{\n        \"_index\": \"xxx\",\n        \"_type\": \"default\",\n        \"_id\": \"1a\",\n        \"_score\": 1,\n        \"_source\": {\n          \"a\": {\n            \"b\": 1\n          },\n          \"c\": \"a\"\n        }\n      },\n```\n\nUsing X-Pack\n------------\n\nSince 0.4.2: support for secured elasticsearch nodes:\n\n```\n$ esbulk -u elastic:changeme -index myindex file.ldj\n```\n\n----\n\nA similar project has been started for solr, called [solrbulk](https://github.com/miku/solrbulk).\n\nContributors\n------------\n\n* [klaubert](https://github.com/klaubert)\n* [sakshambathla](https://github.com/sakshambathla)\n* [mumoshu](https://github.com/mumoshu)\n* [albertpastrana](https://github.com/albertpastrana)\n* [faultlin3](https://github.com/faultlin3)\n* [gransy](https://github.com/gransy)\n* [Christoph Kepper](https://github.com/ckepper)\n* Christian Solomon\n* Mikael Byström\n\nand others.\n\nMeasurements\n------------\n\n```shell\n$ csvlook -I measurements.csv\n| es    | esbulk | docs      | avg_b | nodes | cores | total_heap_gb | t_s   | docs_per_s | repl |\n|-------|--------|-----------|-------|-------|-------|---------------|-------|------------|------|\n| 6.1.2 | 0.4.8  | 138000000 | 2000  | 1     | 32    |  64           |  6420 |  22100     | 1    |\n| 6.1.2 | 0.4.8  | 138000000 | 2000  | 1     |  8    |  30           | 27360 |   5100     | 1    |\n| 6.1.2 | 0.4.8  |   1000000 | 2000  | 1     |  4    |   1           |   300 |   3300     | 1    |\n| 6.1.2 | 0.4.8  |  10000000 |   26  | 1     |  4    |   8           |   122 |  81000     | 1    |\n| 6.1.2 | 0.4.8  |  10000000 |   26  | 1     | 32    |  64           |    32 | 307000     | 1    |\n| 6.2.3 | 0.4.10 | 142944530 | 2000  | 2     | 64    | 128           | 26253 |   5444     | 1    |\n| 6.2.3 | 0.4.10 | 142944530 | 2000  | 2     | 64    | 128           | 11113 |  12831     | 0    |\n| 6.2.3 | 0.4.13 |  15000000 | 6000  | 2     | 64    | 128           |  2460 |   6400     | 0    
|\n```\n\nWhy not add a [row](https://github.com/miku/esbulk/pulls)?\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmiku%2Fesbulk","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmiku%2Fesbulk","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmiku%2Fesbulk/lists"}