{"id":27997652,"url":"https://github.com/itemsapi/elasticbulk","last_synced_at":"2025-10-08T06:49:16.051Z","repository":{"id":26425692,"uuid":"100977652","full_name":"itemsapi/elasticbulk","owner":"itemsapi","description":"Add data in bulk to elasticsearch. It supports data streaming from PostgreSQL or Filesystem","archived":false,"fork":false,"pushed_at":"2025-02-07T18:43:48.000Z","size":149,"stargazers_count":27,"open_issues_count":0,"forks_count":5,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-05-08T22:00:04.400Z","etag":null,"topics":["bulk","elasticsearch","import","importing","json","mysql","postgresql","stream","streaming"],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/itemsapi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2017-08-21T17:45:18.000Z","updated_at":"2025-02-07T18:42:15.000Z","dependencies_parsed_at":"2025-05-08T22:00:06.255Z","dependency_job_id":"cf891976-7c5a-442f-a6f4-154cb828e3c4","html_url":"https://github.com/itemsapi/elasticbulk","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/itemsapi/elasticbulk","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/itemsapi%2Felasticbulk","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/itemsapi%2Felasticbulk/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/itemsapi%2Felasticbulk/releases","manifests_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/itemsapi%2Felasticbulk/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/itemsapi","download_url":"https://codeload.github.com/itemsapi/elasticbulk/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/itemsapi%2Felasticbulk/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278903008,"owners_count":26065786,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-08T02:00:06.501Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bulk","elasticsearch","import","importing","json","mysql","postgresql","stream","streaming"],"created_at":"2025-05-08T21:59:56.733Z","updated_at":"2025-10-08T06:49:16.018Z","avatar_url":"https://github.com/itemsapi.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Elastic Bulk\n\nAdd data in bulk to ElasticSearch. 
It supports data streaming from PostgreSQL, MSSQL, MySQL, MariaDB, SQLite3, Filesystem and CSV.\n\n## Start\n\n```bash\nnpm install elasticbulk --save\n```\n\n```js\nconst elasticbulk = require('elasticbulk');\n```\n\n## Add JSON data to Elasticsearch\n\n```js\nconst elasticbulk = require('elasticbulk');\n// some array data\nvar data = [];\n\nelasticbulk.import(data, {\n  index: 'movies',\n  type: 'movies',\n  host: 'http://localhost:9200'\n})\n.then(function(res) {\n  console.log(res);\n})\n```\n\n## Add data to ItemsAPI from JSON file\n\nThe `movies.json` is a comma delimited JSON file.\n\n```js\nconst fs = require('fs');\nconst JSONStream = require('JSONStream');\nconst elasticbulk = require('elasticbulk');\nconst stream = fs.createReadStream('./movies.json')\n.pipe(JSONStream.parse())\n\nconst config = {\n  \"sorting_fields\": [\"year\", \"rating\", \"votes\", \"reviews_count\"],\n  \"aggregations\": {\n    \"year\": {\n      \"size\": 10,\n      \"conjunction\": true\n    },\n    \"genres\": {\n      \"size\": 10,\n      \"conjunction\": false\n    },\n    \"tags\": {\n      \"size\": 10,\n      \"conjunction\": true\n    },\n    \"actors\": {\n      \"size\": 10,\n      \"conjunction\": true\n    },\n    \"country\": {\n      \"size\": 10,\n      \"conjunction\": true\n    }\n  }\n}\n\nelasticbulk.import(stream, {\n  engine: 'itemsapi',\n  // api_key: '',\n  index_name: 'movies',\n  host: 'http://localhost:9200',\n}, config)\n.then(function(res) {\n  console.log(res);\n})\n```\n\n## Add data to Meilisearch from JSON file\n\nThe `movies.json` is a comma delimited JSON file.\n\n```js\nconst fs = require('fs');\nconst JSONStream = require('JSONStream');\nconst elasticbulk = require('elasticbulk');\nconst stream = fs.createReadStream('./movies.json')\n.pipe(JSONStream.parse())\n\nconst config = {\n  rankingRules: [\n    'typo',\n  ],\n  distinctAttribute: 'id',\n  searchableAttributes: [\n    'name'\n  ],\n  attributesForFaceting: [\n    'director',\n    'genres'\n  ],\n  displayedAttributes: [\n    'name'\n  ],\n  stopWords: [\n  ],\n  synonyms: {\n  }\n}\n\nelasticbulk.import(stream, {\n  
chunk_size: 1000,\n  timeout: 6000,\n  // interval in ms between checks of the internal indexing status\n  interval: 100,\n  primary_key: 'id',\n  engine: 'meilisearch',\n  api_key: 'API_KEY',\n  index_name: 'movies',\n  host: 'http://localhost:9200',\n}, config)\n.then(function(res) {\n  console.log(res);\n})\n```\n\n## Add data to Elasticsearch from JSON file\n\nThe `movies.json` is a comma delimited JSON file.\n\n```js\nconst fs = require('fs');\nconst JSONStream = require('JSONStream');\nconst elasticbulk = require('elasticbulk');\nconst stream = fs.createReadStream('./movies.json')\n.pipe(JSONStream.parse())\n\nelasticbulk.import(stream, {\n  index: 'movies',\n  type: 'movies',\n  host: 'http://localhost:9200',\n})\n.then(function(res) {\n  console.log(res);\n})\n```\n\n## Add data to Elasticsearch from CSV\n\nYou can also use ElasticBulk for importing data from CSV. It has been tested with millions of records.\n\n```js\nconst fs = require('fs');\nconst csv = require('fast-csv');\nconst elasticbulk = require('elasticbulk');\n\nvar stream = fs.createReadStream('questions.csv')\n.pipe(csv({\n  headers: true\n}))\n.transform(function(data){\n  // you can transform your data here\n  return data;\n})\n\nelasticbulk.import(stream, {\n  index: 'questions',\n  type: 'questions',\n  host: 'http://localhost:9200'\n})\n.then(function(res) {\n  console.log(res);\n})\n```\n\n## Add data to Elasticsearch from PostgreSQL\n\n```js\nconst Promise = require('bluebird');\nconst through2 = require('through2');\nconst db = require('knex');\nconst elasticbulk = require('elasticbulk');\n\nvar stream = db.select('*').from('movies')\n.stream()\n.pipe(through2({ objectMode: true, allowHalfOpen: false }, function (chunk, enc, cb) {\n  cb(null, chunk)\n}))\n\nelasticbulk.import(stream, {\n  index: 'movies',\n  type: 'movies',\n  host: 'localhost:9200',\n})\n.then(function(res) {\n  console.log(res);\n})\n```\n\n## Add data to Elasticsearch from MongoDB\n\n```js\nconst elasticbulk = require('elasticbulk');\nconst mongoose = require('mongoose');\nconst Promise = 
require('bluebird');\nmongoose.connect('mongodb://localhost/your_database_name', {\n  useMongoClient: true\n});\n\nmongoose.Promise = Promise;\n\nvar Page = mongoose.model('Page', new mongoose.Schema({\n  title: String,\n  categories: Array\n}), 'your_collection_name');\n\n// stream query \nvar stream = Page.find({\n}, {title: 1, _id: 0, categories: 1}).limit(1500000).skip(0).batchSize(500).stream();\n\nelasticbulk.import(stream, {\n  index: 'my_index_name',\n  type: 'my_type_name',\n  host: 'localhost:9200',\n}, {\n  title: {\n    type: 'string'\n  },\n  categories: {\n    type: 'string',\n    index: 'not_analyzed'\n  }\n})\n.then(function(res) {\n  console.log('Importing finished');\n})\n```\n\n\n## Configuration\n\n```js\nelasticbulk.import(data, {\n  index: 'movies',\n  // optional\n  type: 'movies',\n  // batch size \n  chunk_size: 500,\n  debug: true,\n  host: 'localhost:9200',\n}, {\n  // mapping\n  name: {\n    type: 'string'\n  }\n})\n.then(function(res) {\n  console.log(res);\n})\n```\n\n## Tests\n\n```bash\n# Test ES 1.7\ndocker run -it -d  -p 9200:9200 -p 9300:9300 -v $HOME/elasticsearch1.7/data:/data -v $HOME/elasticsearch1.7/logs:/logs barnybug/elasticsearch:1.7.2\nmocha --exit -t 15000 tests/elasticitemsSpec.js\n\n# Test ES 7.x\ndocker run -it -d -p 9200:9200 -p 9300:9300 -e \"discovery.type=single-node\" docker.elastic.co/elasticsearch/elasticsearch:7.10.1\nmocha --exit -t 15000 tests/elasticitems7xSpec.js\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fitemsapi%2Felasticbulk","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fitemsapi%2Felasticbulk","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fitemsapi%2Felasticbulk/lists"}