{"id":15554153,"url":"https://github.com/sogilis/csv_fast_importer","last_synced_at":"2025-10-12T18:30:22.307Z","repository":{"id":59152611,"uuid":"52618509","full_name":"sogilis/csv_fast_importer","owner":"sogilis","description":"Fast CSV Importer for PostgreSQL and MySQL","archived":false,"fork":false,"pushed_at":"2023-09-26T08:22:54.000Z","size":809,"stargazers_count":4,"open_issues_count":7,"forks_count":1,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-01-16T13:41:47.193Z","etag":null,"topics":["mysql","performance","postgresql","rails","ruby"],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sogilis.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2016-02-26T16:48:10.000Z","updated_at":"2023-02-16T22:10:59.000Z","dependencies_parsed_at":"2022-09-13T11:01:08.662Z","dependency_job_id":"9db914fa-d2bb-4625-9f46-34a6f6223f19","html_url":"https://github.com/sogilis/csv_fast_importer","commit_stats":{"total_commits":103,"total_committers":6,"mean_commits":"17.166666666666668","dds":0.2038834951456311,"last_synced_commit":"587d243718a348d2b61e7c463cc7b84de8eb0eb8"},"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sogilis%2Fcsv_fast_importer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sogilis%2Fcsv_fast_importer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sogilis%2Fcsv_fast_importer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/Git
Hub/repositories/sogilis%2Fcsv_fast_importer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sogilis","download_url":"https://codeload.github.com/sogilis/csv_fast_importer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":236261782,"owners_count":19120773,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["mysql","performance","postgresql","rails","ruby"],"created_at":"2024-10-02T14:50:19.441Z","updated_at":"2025-10-12T18:30:21.769Z","avatar_url":"https://github.com/sogilis.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Gem Version](https://badge.fury.io/rb/csv_fast_importer.svg)](https://badge.fury.io/rb/csv_fast_importer) ![Tests status](https://github.com/sogilis/csv_fast_importer/actions/workflows/tests.yml/badge.svg) [![Codacy Badge](https://app.codacy.com/project/badge/Grade/1ecd555b2ff3414d92bc8674b29c68ea)](https://www.codacy.com/gh/sogilis/csv_fast_importer/dashboard?utm_source=github.com\u0026amp;utm_medium=referral\u0026amp;utm_content=sogilis/csv_fast_importer\u0026amp;utm_campaign=Badge_Grade)\n\n\n# CSV Fast Importer\n\nA gem to import CSV files' content into a PostgreSQL or MySQL database. 
It relies on [PostgreSQL `COPY`](https://wiki.postgresql.org/wiki/COPY) and [MySQL `LOAD DATA INFILE`](https://dev.mysql.com/doc/refman/5.7/en/load-data.html) respectively, both of which are designed to be as fast as possible.\n\n## Why?\n\nImporting CSV files is a common task that more than 6 different gems can handle, but none of them can import **1 million lines in a few seconds** (see benchmark below), hence the creation of this gem.\n\nHere is an indicative benchmark to compare available solutions. It shows the **duration (ms)** needed to import a **10 000-line** CSV file into a local PostgreSQL instance on a laptop running OSX (lower is better):\n\n![Benchmark](benchmark/results.png?raw=true \"Benchmark\")\n\nAs with all benchmarks, some tuning can produce different results, yet this chart gives the big picture. See [benchmark details](benchmark/README.md).\n\n## Requirements\n\n- Rails (ActiveRecord in fact)\n- PostgreSQL or MySQL\n\n## Limitations\n\n- The usual ActiveRecord processing (validations, callbacks, computed fields like `created_at`...) is bypassed. This is the price of performance\n- Custom enclosing field (ex: `\"`) is not supported yet\n- Custom line separator (ex: `\\r\\n` for Windows files) is not supported yet\n- MySQL: encoding is not supported yet\n- MySQL: transaction is not supported yet\n- MySQL: row_index is not supported yet\n- MySQL: the database must have access to the file to import\n\nNote about custom line separator: it might work by opening the file with the `universal_newline` argument (e.g. `file = File.new(path, universal_newline: true)`). Unfortunately, we weren't able to reproduce and test it, so we don't support it \"officially\". 
You can find more information in [this ticket](https://github.com/sogilis/csv_fast_importer/pull/45#issuecomment-326578839) (in French).\n\n## Installation\n\nAdd the dependency to your Gemfile:\n\n```ruby\ngem 'csv_fast_importer'\n```\n\nRun `bundle install`.\n\nYou can also install the gem yourself:\n\n```sh\n$ gem install csv_fast_importer\n```\n\n**For MySQL** :warning: enable `local_infile` for both [client](https://dev.mysql.com/doc/refman/5.7/en/source-configuration-options.html#option_cmake_enabled_local_infile) and [server](https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_local_infile). In a Rails application, just add `local_infile: true` to your database config file `database.yml` to configure the database client. See [Security Issues with LOAD DATA LOCAL](https://dev.mysql.com/doc/refman/5.7/en/load-data-local.html) for more details.\n\n\n## Usage\n\nCSV Fast Importer needs `active_record` to work. Set up your database\nconfiguration as in a usual Rails project. Then, use the `CsvFastImporter`\nclass:\n\n```ruby\nrequire 'csv_fast_importer'\n\nfile = File.new '/path/to/knights.csv'\nimported_lines_count = CsvFastImporter.import(file)\n\nputs imported_lines_count\n```\n\nUnder the hood, CSV Fast Importer deletes data from the `knights` table and\nimports the rows from `knights.csv` by mapping column names to table fields.\nNote: mapping is case-insensitive, so **database field names must be lowercase**.\nFor instance, a `FIRSTNAME` CSV column will be mapped to the `firstname` field.\n\n### Options\n\n| Option key | Purpose | Default value |\n| ------------ | ------------- | ------------- |\n| *encoding* | File encoding. 
*PostgreSQL only* (see [FAQ](doc/faq.md) for more details) | `'UTF-8'` |\n| *col_sep* | Column separator in file | `';'` |\n| *destination* | Destination table | given base filename (without extension) |\n| *mapping* | Column mapping | `{}` |\n| *row_index_column* | Name of the column in which to insert the file row index (not used when `nil`). *PostgreSQL only* | `nil` |\n| *transaction* | Execute DELETE and INSERT in the same transaction. *PostgreSQL only* | `:enabled` |\n| *deletion* | Row deletion method (`:delete` for SQL DELETE, `:truncate` for SQL TRUNCATE or `:none` for no deletion before import) | `:delete` |\n\nIf your CSV file does not use the same encoding as your database, you can specify the encoding when opening the file (see [FAQ](doc/faq.md) for more details):\n\n```ruby\nfile = File.new '/path/to/knights.csv', encoding: 'ISO-8859-1'\n```\n\nYou can specify a different column separator with the `col_sep` option (`;` by\ndefault):\n\n```ruby\nCsvFastImporter.import file, col_sep: '|'\n```\n\nBy default, CSV Fast Importer derives the database table's name from the\n`basename` of the imported file. For instance, considering the imported file\n`/path/to/knights.csv`, the table's name will be `knights`. 
To override\nthis default behaviour, specify the `destination` option:\n\n```ruby\nfile = File.new '/path/to/clients.csv'\nCsvFastImporter.import file, destination: 'knights'\n```\n\nFinally, you can specify a custom mapping between the CSV file's columns and\ndatabase fields with the `mapping` option.\n\nConsidering the following `knights.csv` file:\n\n```csv\nNAME;KNIGHT_EMAIL\nPerceval;perceval@logre.cel\nLancelot;lancelot@logre.cel\n```\n\nTo map the `KNIGHT_EMAIL` column to the `email` database field:\n\n```ruby\nCsvFastImporter.import file, mapping: { knight_email: :email }\n```\n\n## Need help?\n\nSee [FAQ](doc/faq.md).\n\n## How to contribute?\n\nYou can fork the repository and submit a new pull request (with tests and explanations).\nFirst of all, you need to initialize your environment:\n\n```sh\n$ brew install postgresql # on macOS\n$ apt-get install libpq-dev # on Linux\n$ bundle install\n```\n\nThen, start your PostgreSQL database (ex: [Postgres.app](http://postgresapp.com) for the Mac) and set up the database environment:\n\n```sh\n$ bundle exec rake test:db:create\n```\nThis connects to the `localhost` PostgreSQL database without a user (see `config/database.postgres.yml`) and creates a new database dedicated to tests.\n\n*Warning:* the database instance must allow database creation with `UTF-8` encoding.\n\nFinally, you can run all tests with RSpec like this:\n\n```sh\n$ bundle exec rspec\n```\n\nBy default, PostgreSQL is used. You can select another database with environment variables, for example MySQL:\n```sh\n$ DB_TYPE=mysql DB_ROOT_PASSWORD=password DB_USERNAME=username bundle exec rake test:db:create\n$ DB_TYPE=mysql DB_USERNAME=username bundle exec rspec\n```\nThis connects to MySQL as the `root` user (with `password` as password) and creates a database for user `username`.\nUse `DB_TYPE=mysql DB_USERNAME=` (with empty username) for an anonymous account.\n\n*Warning*: MySQL tests require that your local database allows `LOAD DATA LOCAL`. 
Check your MySQL instance with the following command: `SHOW GLOBAL VARIABLES LIKE 'local_infile'` (should be `ON`).\n\n## Versioning\n\n`master` is the development branch and releases are published as tags.\n\nWe follow [Semantic Versioning 2.0.0](http://semver.org/) for our gem\nreleases.\n\nIn a few words:\n\n\u003e Given a version number MAJOR.MINOR.PATCH, increment the:\n\u003e\n\u003e 1. MAJOR version when you make incompatible API changes,\n\u003e 2. MINOR version when you add functionality in a backwards-compatible manner,\n\u003e    and\n\u003e 3. PATCH version when you make backwards-compatible bug fixes.\n\n## Backlog (unordered)\n\n- [ ] Support any column and table case\n- [ ] Support custom enclosing field (ex: `\"`)\n- [ ] Support custom line separator (ex: `\\r\\n` for Windows files)\n- [ ] Support custom type conversion\n- [ ] MySQL: support encoding parameter. See https://dev.mysql.com/doc/refman/5.5/en/charset-charsets.html\n- [ ] MySQL: support transaction parameter\n- [ ] MySQL: support row_index_column parameter\n- [ ] MySQL: run multiple SQL queries in a single statement\n- [ ] Refactor tests (with should-\u003e must / should -\u003e expect / subject...)\n- [ ] Reduce technical debt on db connection (test \u0026 benchmark)\n- [ ] SQLite support\n- [ ] Add link to [activerecord-copy](https://github.com/pganalyze/activerecord-copy)\n- [ ] Ease local tests on multiple databases with [testcontainers](https://github.com/testcontainers/testcontainers-ruby)\n- [ ] Accept CSV headers that contain the column separator\n\n## How to release a new version?\n\nSet up your rubygems.org account:\n\n```bash\ncurl -u {your_gem_account_name} https://rubygems.org/api/v1/api_key.yaml \u003e ~/.gem/credentials\nchmod 0600 ~/.gem/credentials\n```\n\nMake sure you are on the `master` branch and run:\n\n```bash\nbundle exec rake \"release:make[major|minor|patch|x.y.z]\"\n```\nExample: `bundle exec rake \"release:make[minor]\"`\n\nThen, follow the 
instructions.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsogilis%2Fcsv_fast_importer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsogilis%2Fcsv_fast_importer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsogilis%2Fcsv_fast_importer/lists"}