{"id":13483806,"url":"https://github.com/seamusabshere/upsert","last_synced_at":"2025-05-15T08:11:04.228Z","repository":{"id":3582728,"uuid":"4645789","full_name":"seamusabshere/upsert","owner":"seamusabshere","description":"Upsert on MySQL, PostgreSQL, and SQLite3. Transparently creates functions (UDF) for MySQL and PostgreSQL; on SQLite3, uses INSERT OR IGNORE.","archived":false,"fork":false,"pushed_at":"2021-02-20T00:07:42.000Z","size":504,"stargazers_count":650,"open_issues_count":26,"forks_count":78,"subscribers_count":10,"default_branch":"master","last_synced_at":"2025-04-14T14:59:39.910Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/seamusabshere.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2012-06-13T03:15:00.000Z","updated_at":"2025-02-14T15:50:30.000Z","dependencies_parsed_at":"2022-09-03T21:21:15.705Z","dependency_job_id":null,"html_url":"https://github.com/seamusabshere/upsert","commit_stats":null,"previous_names":[],"tags_count":35,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seamusabshere%2Fupsert","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seamusabshere%2Fupsert/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seamusabshere%2Fupsert/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seamusabshere%2Fupsert/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/seamusabshere","download_url":"https://codeload.github.com/seamusabsh
ere/upsert/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254301432,"owners_count":22047904,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T17:01:15.453Z","updated_at":"2025-05-15T08:11:04.183Z","avatar_url":"https://github.com/seamusabshere.png","language":"Ruby","readme":"# Upsert\n\n[![Build Status](https://travis-ci.org/seamusabshere/upsert.svg?branch=master)](https://travis-ci.org/seamusabshere/upsert)\n\nMake it easy to upsert on traditional RDBMS like MySQL, PostgreSQL, and SQLite3\u0026mdash;hey look NoSQL!. Transparently creates (and re-uses) stored procedures/functions when necessary.\n\nYou pass it a bare-metal connection to the database like `Mysql2::Client` (from `mysql2` gem on MRI) or `Java::OrgPostgresqlJdbc4::Jdbc4Connection` (from `jdbc-postgres` on Jruby).\n\nAs databases start to natively support SQL MERGE (which is basically upsert), this library will take advantage (but you won't have to change your code).\n\nDoes **not** depend on ActiveRecord.\n\nDoes **not** use `INSERT ON DUPLICATE KEY UPDATE` on MySQL as this only works if you are very careful about creating unique indexes.\n\n70\u0026ndash;90%+ faster than emulating upsert with ActiveRecord.\n\nSupports MRI and JRuby.\n\n## Usage\n\nYou pass a __selector__ that uniquely identifies a row, whether it exists or not. 
You also pass a __setter__, attributes that should be set on that row.\n\nSyntax inspired by [mongo-ruby-driver's update method](http://api.mongodb.org/ruby/1.6.4/Mongo/Collection.html#update-instance_method).\n\n### Basic\n\n```ruby\nconnection = Mysql2::Client.new([...])\ntable_name = :pets\nupsert = Upsert.new connection, table_name\n# N times...\nupsert.row({:name =\u003e 'Jerry'}, :breed =\u003e 'beagle', :created_at =\u003e Time.now)\n```\n\nThe `created_at` and `created_on` columns are used for inserts, but ignored on updates.\n\nTo reiterate, you've got a __selector__ and a __setter__:\n\n```ruby\nselector = { :name =\u003e 'Jerry' }\nsetter = { :breed =\u003e 'beagle' }\nupsert.row(selector, setter)\n```\n\n### Batch mode\n\nBy organizing your upserts into a batch, we can do work behind the scenes to make them faster.\n\n```ruby\nconnection = Mysql2::Client.new([...])\nUpsert.batch(connection, :pets) do |upsert|\n  # N times...\n  upsert.row({:name =\u003e 'Jerry'}, :breed =\u003e 'beagle')\n  upsert.row({:name =\u003e 'Pierre'}, :breed =\u003e 'tabby')\nend\n```\n\nBatch mode is tested to be about 80% faster on PostgreSQL, MySQL, and SQLite3 than other ways to emulate upsert (see the tests, which fail if they are not faster).\n\n### Native Postgres upsert\n\n`INSERT ... ON CONFLICT DO UPDATE` is used when Postgres 9.5+ is detected and *a unique constraint is in place.*\n\n**Note:** You must have a **unique constraint** on the column(s) you're using as a selector.  A unique index won't work.  
See https://github.com/seamusabshere/upsert/issues/98#issuecomment-295341405 for more information and some ways to check.\n\nIf you don't have unique constraints, it will fall back to the classic Upsert gem user-defined function, which does not require a constraint.\n\n### ActiveRecord helper method\n\n```ruby\nrequire 'upsert/active_record_upsert'\n# N times...\nPet.upsert({:name =\u003e 'Jerry'}, :breed =\u003e 'beagle')\n```\n\n## Wishlist\n\nPull requests for any of these would be greatly appreciated:\n\n1. Cache JDBC PreparedStatement objects.\n1. Sanity check my three benchmarks (four if you include activerecord-import on MySQL). Do they accurately represent optimized alternatives?\n1. Provide `require 'upsert/debug'` that will make sure you are selecting on columns that have unique indexes\n1. Test that `Upsert` instances accept arbitrary columns, even within a batch, which is what people probably expect.\n1. [@antage](https://github.com/antage)'s idea for \"true\" upserting: (from https://github.com/seamusabshere/upsert/issues/17)\n\n```ruby\nselector = { id: 15 }\nupdate_setter = { count: Upsert.sql('count + 1') }\ninsert_setter = { count: 1 }\nupsert.row_with_two_setter(update_setter, insert_setter, selector)\n```\n\n## Real-world usage\n\n\u003cp\u003e\u003ca href=\"http://angel.co/faraday\"\u003e\u003cimg src=\"https://s3.amazonaws.com/photos.angel.co/startups/i/175701-a63ebd1b56a401e905963c64958204d4-medium_jpg.jpg\" alt=\"Faraday logo\"/\u003e\u003c/a\u003e\u003c/p\u003e\n\nWe use `upsert` for [big data at Faraday](http://angel.co/faraday). 
Originally written to speed up the [`data_miner`](https://github.com/seamusabshere/data_miner) data mining library.\n\n## Supported databases/drivers\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth\u003e*\u003c/th\u003e\n    \u003cth\u003eMySQL\u003c/th\u003e\n    \u003cth\u003ePostgreSQL\u003c/th\u003e\n    \u003cth\u003eSQLite3\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003cth\u003eMRI\u003c/th\u003e\n    \u003ctd\u003e\u003ca href=\"https://rubygems.org/gems/mysql2\"\u003emysql2\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"https://rubygems.org/gems/pg\"\u003epg\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"https://rubygems.org/gems/sqlite3\"\u003esqlite3\u003c/a\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003cth\u003eJRuby\u003c/th\u003e\n    \u003ctd\u003e\u003ca href=\"https://rubygems.org/gems/jdbc-mysql\"\u003ejdbc-mysql\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"https://rubygems.org/gems/jdbc-postgres\"\u003ejdbc-postgres\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"https://rubygems.org/gems/jdbc-sqlite3\"\u003ejdbc-sqlite3\u003c/a\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\nSee below for details about what SQL MERGE trick (emulation of upsert) is used, performance, code examples, etc.\n\n### Rails / ActiveRecord\n\n(Assuming that one of the other three supported drivers is being used under the covers).\n\n* Add \"upsert\" to your Gemfile\n* Run `bundle install`\n\n```ruby\nUpsert.new Pet.connection, Pet.table_name\n```\n\n#### Speed\n\nDepends on the driver being used!\n\n#### SQL MERGE trick\n\nDepends on the driver being used!\n\n### MySQL\n\nOn MRI, use the [mysql2](https://rubygems.org/gems/mysql2) driver.\n\n```ruby\nrequire 'mysql2'\nconnection = Mysql2::Client.new(:username =\u003e 'root', :password =\u003e 'password', :database =\u003e 'upsert_test')\ntable_name = :pets\nupsert = Upsert.new(connection, 
table_name)\n```\n\nOn JRuby, use the [jdbc-mysql](https://rubygems.org/gems/jdbc-mysql) driver.\n\n```ruby\nrequire 'jdbc/mysql'\njava.sql.DriverManager.register_driver com.mysql.jdbc.Driver.new\nconnection = java.sql.DriverManager.get_connection \"jdbc:mysql://127.0.0.1/mydatabase?user=root\u0026password=password\"\n```\n\n#### Speed\n\nFrom the tests (updated 11/7/12):\n\n    Upsert was 82% faster than find + new/set/save\n    Upsert was 85% faster than find_or_create + update_attributes\n    Upsert was 90% faster than create + rescue/find/update\n    Upsert was 46% faster than faking upserts with activerecord-import (note: in question as of 3/13/15, need some expert advice)\n\n#### SQL MERGE trick\n\nThanks to [Dennis Hennen's StackOverflow response!](http://stackoverflow.com/questions/11371479/how-to-translate-postgresql-merge-db-aka-upsert-function-into-mysql/)!\n\n```sql\nCREATE PROCEDURE upsert_pets_SEL_name_A_tag_number_SET_name_A_tag_number(`name_sel` varchar(255), `tag_number_sel` int(11), `name_set` varchar(255), `tag_number_set` int(11))\nBEGIN\n  DECLARE done BOOLEAN;\n  REPEAT\n    BEGIN\n      -- If there is a unique key constraint error then\n      -- someone made a concurrent insert. Reset the sentinel\n      -- and try again.\n      DECLARE ER_DUP_UNIQUE CONDITION FOR 23000;\n      DECLARE ER_INTEG CONDITION FOR 1062;\n      DECLARE CONTINUE HANDLER FOR ER_DUP_UNIQUE BEGIN\n        SET done = FALSE;\n      END;\n\n      DECLARE CONTINUE HANDLER FOR ER_INTEG BEGIN\n        SET done = TRUE;\n      END;\n\n      SET done = TRUE;\n      SELECT COUNT(*) INTO @count FROM `pets` WHERE `name` = `name_sel` AND `tag_number` = `tag_number_sel`;\n      -- Race condition here. If a concurrent INSERT is made after\n      -- the SELECT but before the INSERT below we'll get a duplicate\n      -- key error. 
But the handler above will take care of that.\n      IF @count \u003e 0 THEN\n        -- UPDATE table_name SET b = b_SET WHERE a = a_SEL;\n        UPDATE `pets` SET `name` = `name_set`, `tag_number` = `tag_number_set` WHERE `name` = `name_sel` AND `tag_number` = `tag_number_sel`;\n      ELSE\n        -- INSERT INTO table_name (a, b) VALUES (k, data);\n        INSERT INTO `pets` (`name`, `tag_number`) VALUES (`name_set`, `tag_number_set`);\n      END IF;\n    END;\n  UNTIL done END REPEAT;\nEND\n```\n\n### PostgreSQL\n\nOn MRI, use the [pg](https://rubygems.org/gems/pg) driver.\n\n```ruby\nrequire 'pg'\nconnection = PG.connect(:dbname =\u003e 'upsert_test')\ntable_name = :pets\nupsert = Upsert.new(connection, table_name)\n```\n\nOn JRuby, use the [jdbc-postgres](https://rubygems.org/gems/jdbc-postgres) driver.\n\n```ruby\nrequire 'jdbc/postgres'\njava.sql.DriverManager.register_driver org.postgresql.Driver.new\nconnection = java.sql.DriverManager.get_connection \"jdbc:postgresql://127.0.0.1/mydatabase?user=root\u0026password=password\"\n```\n\nIf you want to use HStore, make the `pg-hstore` gem available and pass a Hash in setters:\n\n```ruby\ngem 'pg-hstore'\nrequire 'pg_hstore'\nupsert.row({:name =\u003e 'Bill'}, :mydata =\u003e {:a =\u003e 1, :b =\u003e 2})\n```\n\n#### PostgreSQL notes\n\n- Upsert doesn't do any type casting, so if you attempt to do something like the following:\n    `upsert.row({ :name =\u003e 'A Name' }, :tag_number =\u003e 'bob')`\n    you'll get an error which reads something like:\n    `invalid input syntax for integer: \"bob\"`\n\n\n\n#### Speed\n\nFrom the tests (updated 9/21/12):\n\n    Upsert was 72% faster than find + new/set/save\n    Upsert was 79% faster than find_or_create + update_attributes\n    Upsert was 83% faster than create + rescue/find/update\n    # (can't compare to activerecord-import because you can't fake it on pg)\n\n#### SQL MERGE trick\n\nAdapted from the [canonical PostgreSQL upsert 
example](http://www.postgresql.org/docs/current/interactive/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING):\n\n```sql\nCREATE OR REPLACE FUNCTION upsert_pets_SEL_name_A_tag_number_SET_name_A_tag_number(\"name_sel\" character varying(255), \"tag_number_sel\" integer, \"name_set\" character varying(255), \"tag_number_set\" integer) RETURNS VOID AS\n$$\nDECLARE\n  first_try INTEGER := 1;\nBEGIN\n  LOOP\n    -- first try to update the key\n    UPDATE \"pets\" SET \"name\" = \"name_set\", \"tag_number\" = \"tag_number_set\"\n      WHERE \"name\" = \"name_sel\" AND \"tag_number\" = \"tag_number_sel\";\n    IF found THEN\n      RETURN;\n    END IF;\n    -- not there, so try to insert the key\n    -- if someone else inserts the same key concurrently,\n    -- we could get a unique-key failure\n    BEGIN\n      INSERT INTO \"pets\"(\"name\", \"tag_number\") VALUES (\"name_set\", \"tag_number_set\");\n      RETURN;\n    EXCEPTION WHEN unique_violation THEN\n      -- seamusabshere 9/20/12 only retry once\n      IF (first_try = 1) THEN\n        first_try := 0;\n      ELSE\n        RETURN;\n      END IF;\n      -- Do nothing, and loop to try the UPDATE again.\n    END;\n  END LOOP;\nEND;\n$$\nLANGUAGE plpgsql;\n```\n\nI slightly modified it so that it only retries once - don't want infinite loops.\n\n### Sqlite\n\nOn MRI, use the [sqlite3](https://rubygems.org/gems/sqlite3) driver.\n\n```ruby\nrequire 'sqlite3'\nconnection = SQLite3::Database.open(':memory:')\ntable_name = :pets\nupsert = Upsert.new(connection, table_name)\n```\n\nOn JRuby, use the [jdbc-sqlite3](https://rubygems.org/gems/jdbc-sqlite3) driver.\n\n```ruby\n# TODO somebody please verify\nrequire 'jdbc/sqlite3'\njava.sql.DriverManager.register_driver org.sqlite.Driver.new\nconnection = java.sql.DriverManager.get_connection \"jdbc:sqlite://127.0.0.1/mydatabase?user=root\u0026password=password\"\n```\n\n#### Speed\n\nFrom the tests (updated 9/21/12):\n\n    Upsert was 77% faster than find + new/set/save\n    
Upsert was 80% faster than find_or_create + update_attributes\n    Upsert was 85% faster than create + rescue/find/update\n    # (can't compare to activerecord-import because you can't fake it on sqlite3)\n\n#### SQL MERGE trick\n\nThanks to [@dan04's answer on StackOverflow](http://stackoverflow.com/questions/2717590/sqlite-upsert-on-duplicate-key-update):\n\n**Please note!  This will only work properly on SQLite if one of the columns being used as the \"selector\" is a primary key or has a unique index.**\n\n```sql\nINSERT OR IGNORE INTO visits VALUES ('127.0.0.1', 1);\nUPDATE visits SET visits = 1 WHERE ip LIKE '127.0.0.1';\n```\n\n## Features\n\n### Tested to be fast and portable\n\nIn addition to correctness, the library's tests check that it is\n\n1. Faster than comparable upsert techniques\n2. Compatible with supported databases\n\n### Not dependent on ActiveRecord\n\nAs below, all you need is a raw database connection like a `Mysql2::Client`, `PG::Connection` or a `SQLite3::Database`. These are equivalent:\n\n```ruby\n# with activerecord\nUpsert.new ActiveRecord::Base.connection, :pets\n# with activerecord, prettier\nUpsert.new Pet.connection, Pet.table_name\n# without activerecord\nUpsert.new Mysql2::Client.new([...]), :pets\n```\n\n### For a specific use case, faster and more portable than `activerecord-import`\n\nYou could also use [activerecord-import](https://github.com/zdennis/activerecord-import) to upsert:\n\n```ruby\nPet.import columns, all_values, :timestamps =\u003e false, :on_duplicate_key_update =\u003e columns\n```\n\n`activerecord-import`, however, only works on MySQL and requires ActiveRecord\u0026mdash;and if all you are doing is upserts, `upsert` is tested to be 40% faster. 
And you don't have to put all of the rows to be upserted into a single huge array - you can batch them using `Upsert.batch`.\n\n## Gotchas\n\n### No automatic typecasting beyond what the adapter/driver provides\n\nWe don't have any logic to convert integers into strings, strings into integers, etc. in order to satisfy PostgreSQL/etc.'s strictness on this issue.\n\nSo if you try to upsert a blank string (`''`) into an integer field in PostgreSQL, you will get an error.\n\n### Dates and times are converted to UTC\n\nDatetimes are immediately converted to UTC and sent to the database as ISO8601 strings.\n\nIf you're using MySQL, make sure server/connection timezone is UTC. If you're using Rails and/or ActiveRecord, you might want to check `ActiveRecord::Base.default_timezone`... it should probably be `:utc`.\n\nIn general, run some upserts and make sure datetimes get persisted like you expect.\n\n### Clearing all library-generated functions\n\nPlace the following into a rake task (so you don't globally redefine the `NAME_PREFIX` constant):\n\n```ruby\nUpsert::MergeFunction::NAME_PREFIX = \"upsert\"\n\n# ActiveRecord\nUpsert.clear_database_functions(ActiveRecord::Base.connection)\n\n# Sequel\nDB.synchronize do |conn|\n  Upsert.clear_database_functions(conn)\nend\n```\n\n### Doesn't work with transactional fixtures\n\nPer https://github.com/seamusabshere/upsert/issues/23 you might have issues if you try to use transactional fixtures and this library.\n\n## Testmetrics\n\nhttps://www.testmetrics.app/seamusabshere/upsert\n\n## Copyright\n\nCopyright 2013-2019 Seamus Abshere\nCopyright 2017-2019 Philip Schalm\nPortions Copyright (c) 2019 The JRuby Team\n","funding_links":[],"categories":["Database 
Tools","Ruby"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fseamusabshere%2Fupsert","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fseamusabshere%2Fupsert","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fseamusabshere%2Fupsert/lists"}