{"id":30317920,"url":"https://github.com/PgBulkInsert/PgBulkInsert","last_synced_at":"2025-08-17T20:09:00.661Z","repository":{"id":43218258,"uuid":"50948261","full_name":"PgBulkInsert/PgBulkInsert","owner":"PgBulkInsert","description":"Java library for efficient Bulk Inserts to PostgreSQL using the Binary COPY Protocol.","archived":false,"fork":false,"pushed_at":"2025-05-25T23:40:19.000Z","size":2185,"stargazers_count":156,"open_issues_count":1,"forks_count":35,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-08-17T10:42:12.681Z","etag":null,"topics":["bulk-inserts","postgresql"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PgBulkInsert.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2016-02-02T20:05:59.000Z","updated_at":"2025-08-13T03:42:50.000Z","dependencies_parsed_at":"2024-11-16T03:15:42.251Z","dependency_job_id":"2bba0941-e751-45bf-a3cc-221493bd7875","html_url":"https://github.com/PgBulkInsert/PgBulkInsert","commit_stats":null,"previous_names":["bytefish/pgbulkinsert"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/PgBulkInsert/PgBulkInsert","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PgBulkInsert%2FPgBulkInsert","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PgBulkInsert%2FPgBulkInsert/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PgBulkInsert%2FPgBulkInsert/releases","manifests_url":"https://repos.e
cosyste.ms/api/v1/hosts/GitHub/repositories/PgBulkInsert%2FPgBulkInsert/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PgBulkInsert","download_url":"https://codeload.github.com/PgBulkInsert/PgBulkInsert/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PgBulkInsert%2FPgBulkInsert/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270837450,"owners_count":24654386,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-17T02:00:09.016Z","response_time":129,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bulk-inserts","postgresql"],"created_at":"2025-08-17T20:02:40.243Z","updated_at":"2025-08-17T20:09:00.648Z","avatar_url":"https://github.com/PgBulkInsert.png","language":"Java","funding_links":[],"categories":["数据库开发"],"sub_categories":[],"readme":"# PgBulkInsert #\n\n[MIT License]: https://opensource.org/licenses/MIT\n[COPY command]: http://www.postgresql.org/docs/current/static/sql-copy.html\n[PgBulkInsert]: https://github.com/bytefish/PgBulkInsert\n[Npgsql]: https://github.com/npgsql/npgsql\n\n![](https://maven-badges.herokuapp.com/maven-central/de.bytefish/pgbulkinsert/badge.svg)\n\nPgBulkInsert is a Java library for Bulk Inserts to PostgreSQL using the Binary COPY Protocol. 
\n\nIt provides a wrapper around the PostgreSQL [COPY command]:\n\n\u003e The [COPY command] is a PostgreSQL specific feature, which allows efficient bulk import or export of \n\u003e data to and from a table. This is a much faster way of getting data in and out of a table than using \n\u003e INSERT and SELECT.\n\nThis project wouldn't be possible without the great [Npgsql] library, which has a beautiful implementation of the Postgres protocol.\n\n## Setup ##\n\n[PgBulkInsert] is available in the Central Maven Repository. \n\nYou can add the following dependencies to your pom.xml to include [PgBulkInsert] in your project.\n\n```xml\n\u003cdependency\u003e\n    \u003cgroupId\u003ede.bytefish\u003c/groupId\u003e\n    \u003cartifactId\u003epgbulkinsert\u003c/artifactId\u003e\n    \u003cversion\u003e8.1.4\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n## Supported PostgreSQL Types ##\n\n* [Numeric Types](http://www.postgresql.org/docs/current/static/datatype-numeric.html)\n    * smallint\n    * integer\n    * bigint\n    * real\n    * double precision\n\t* numeric\n* [Date/Time Types](http://www.postgresql.org/docs/current/static/datatype-datetime.html)\n    * timestamp\n    * timestamptz\n    * date\n    * time\n    * interval\n* [Character Types](http://www.postgresql.org/docs/current/static/datatype-character.html)\n    * text\n* [JSON Types](https://www.postgresql.org/docs/current/static/datatype-json.html)\n    * jsonb\n* [Boolean Type](http://www.postgresql.org/docs/current/static/datatype-boolean.html)\n    * boolean\n* [Binary Data Types](http://www.postgresql.org/docs/current/static/datatype-binary.html)\n    * bytea\n* [Network Address Types](http://www.postgresql.org/docs/current/static/datatype-net-types.html)\n    * inet (IPv4, IPv6)\n    * macaddr\n* [UUID Type](http://www.postgresql.org/docs/current/static/datatype-uuid.html)\n    * uuid\n* [Array Type](https://www.postgresql.org/docs/current/static/arrays.html)\n    * One-Dimensional Arrays\n* 
[Range Type](https://www.postgresql.org/docs/current/rangetypes.html)\n    * int4range\n    * int8range\n    * numrange\n    * tsrange\n    * tstzrange\n    * daterange\n* [hstore](https://www.postgresql.org/docs/current/static/hstore.html)\n    * hstore\n* [Geometric Types](https://www.postgresql.org/docs/current/static/datatype-geometric.html)\n    * point\n    * line\n    * lseg\n    * box\n    * path\n    * polygon\n    * circle\n\n## Usage ##\n\nYou can use the [PgBulkInsert] API in two ways. The first is the ``SimpleRowWriter``, for when you don't have \nan explicit Java POJO that matches a table. The second is an ``AbstractMapping\u003cTEntityType\u003e``, which defines a \nmapping between a Java POJO and a PostgreSQL table.\n\nPlease also read the FAQ, which may answer some of your questions.\n\n## Using the SimpleRowWriter ##\n\nUsing the ``SimpleRowWriter`` doesn't require you to define a separate mapping. Instead, you describe the PostgreSQL table structure with \na ``SimpleRowWriter.Table``, which has an optional schema name, a table name and the column names:\n\n```java\n// Schema of the Table:\nString schemaName = \"sample\";\n\n// Name of the Table:\nString tableName = \"row_writer_test\";\n\n// Define the Columns to be inserted:\nString[] columnNames = new String[] {\n        \"value_int\",\n        \"value_text\"\n};\n\n// Create the Table Definition:\nSimpleRowWriter.Table table = new SimpleRowWriter.Table(schemaName, tableName, columnNames);\n```\n\nOnce the table definition exists, you create the ``SimpleRowWriter`` from the ``Table`` and a ``PGConnection``.\n\nTo write a row to PostgreSQL, call the ``startRow`` method. It expects a \n``Consumer\u003cSimpleRow\u003e``, which defines what data to write to the row. 
The call to \n``startRow`` is synchronized, so it is safe to be called from multiple threads.\n\n```java\n// Create the Writer:\ntry(SimpleRowWriter writer = new SimpleRowWriter(table, pgConnection)) {\n\n    // ... write your data rows:\n    for(int rowIdx = 0; rowIdx \u003c 10000; rowIdx++) {\n\n        // ... using startRow and work with the row, see how the order doesn't matter:\n        writer.startRow((row) -\u003e {\n            row.setText(\"value_text\", \"Hi\");\n            row.setInteger(\"value_int\", 1);\n        });\n    }\n}\n```\n\nSo the complete example looks like this:\n\n```java\npublic class SimpleRowWriterTest extends TransactionalTestBase {\n\n    // ...\n    \n    @Test\n    public void rowBasedWriterTest() throws SQLException {\n\n        // Get the underlying PGConnection:\n        PGConnection pgConnection = PostgreSqlUtils.getPGConnection(connection);\n\n        // Schema of the Table:\n        String schemaName = \"sample\";\n        \n        // Name of the Table:\n        String tableName = \"row_writer_test\";\n\n        // Define the Columns to be inserted:\n        String[] columnNames = new String[] {\n                \"value_int\",\n                \"value_text\"\n        };\n\n        // Create the Table Definition:\n        SimpleRowWriter.Table table = new SimpleRowWriter.Table(schemaName, tableName, columnNames);\n\n        // Create the Writer:\n        try(SimpleRowWriter writer = new SimpleRowWriter(table, pgConnection)) {\n\n            // ... write your data rows:\n            for(int rowIdx = 0; rowIdx \u003c 10000; rowIdx++) {\n\n                // ... 
using startRow and work with the row, see how the order doesn't matter:\n                writer.startRow((row) -\u003e {\n                    row.setText(\"value_text\", \"Hi\");\n                    row.setInteger(\"value_int\", 1);\n                });\n\n            }\n        }\n\n        // Now assert, that we have written 10000 entities:\n\n        Assert.assertEquals(10000, getRowCount());\n    }\n}\n```\n\nIf you need to customize the Null Character Handling, then you can use the ``setNullCharacterHandler(Function\u003cString, String\u003e nullCharacterHandler)`` function.\n\n## Using the AbstractMapping ##\n\nThe ``AbstractMapping`` is the second possible way to map a POJO for usage in PgBulkInsert. Imagine we want to bulk insert a large amount of people \ninto a PostgreSQL database. Each ``Person`` has a first name, a last name and a birthdate.\n\n### Database Table ###\n\nThe table in the PostgreSQL database might look like this:\n\n```sql\n CREATE TABLE sample.unit_test\n(\n    first_name text,\n    last_name text,\n    birth_date date\n);\n```\n\n### Domain Model ###\n\nThe domain model in the application might look like this:\n\n```java\nprivate class Person {\n\n    private String firstName;\n\n    private String lastName;\n\n    private LocalDate birthDate;\n\n    public Person() {}\n\n    public String getFirstName() {\n        return firstName;\n    }\n\n    public void setFirstName(String firstName) {\n        this.firstName = firstName;\n    }\n\n    public String getLastName() {\n        return lastName;\n    }\n\n    public void setLastName(String lastName) {\n        this.lastName = lastName;\n    }\n\n    public LocalDate getBirthDate() {\n        return birthDate;\n    }\n\n    public void setBirthDate(LocalDate birthDate) {\n        this.birthDate = birthDate;\n    }\n    \n}\n```\n\n### Bulk Inserter ###\n\nThen you have to implement the ``AbstractMapping\u003cPerson\u003e``, which defines the mapping between the table and the domain 
model.\n\n```java\npublic class PersonMapping extends AbstractMapping\u003cPerson\u003e\n{\n    public PersonMapping() {\n        super(\"sample\", \"unit_test\");\n\n        mapText(\"first_name\", Person::getFirstName);\n        mapText(\"last_name\", Person::getLastName);\n        mapDate(\"birth_date\", Person::getBirthDate);\n    }\n}\n```\n\nThis mapping is used to create the ``PgBulkInsert\u003cPerson\u003e``:\n\n```java\nPgBulkInsert\u003cPerson\u003e bulkInsert = new PgBulkInsert\u003cPerson\u003e(new PersonMapping());\n```\n\n### Using the Bulk Inserter ###\n\n[IntegrationTest.java]: https://github.com/bytefish/PgBulkInsert/blob/master/PgBulkInsert/pgbulkinsert-core/src/test/java/de/bytefish/pgbulkinsert/integration/IntegrationTest.java\n\nFinally, we can write a Unit Test that inserts ``100000`` people into the database. You can find the entire Unit Test on GitHub as [IntegrationTest.java].\n\n```java\n@Test\npublic void bulkInsertPersonDataTest() throws SQLException {\n    // Create a large list of People:\n    List\u003cPerson\u003e personList = getPersonList(100000);\n    // Create the BulkInserter, using the no-argument PersonMapping defined above:\n    PgBulkInsert\u003cPerson\u003e bulkInsert = new PgBulkInsert\u003cPerson\u003e(new PersonMapping());\n    // Now save all entities of a given stream:\n    bulkInsert.saveAll(PostgreSqlUtils.getPGConnection(connection), personList.stream());\n    // And assert all have been written to the database:\n    Assert.assertEquals(100000, getRowCount());\n}\n\nprivate List\u003cPerson\u003e getPersonList(int num) {\n    List\u003cPerson\u003e personList = new ArrayList\u003c\u003e();\n\n    for (int pos = 0; pos \u003c num; pos++) {\n        Person p = new Person();\n\n        p.setFirstName(\"Philipp\");\n        p.setLastName(\"Wagner\");\n        p.setBirthDate(LocalDate.of(1986, 5, 12));\n\n        personList.add(p);\n    }\n\n    return personList;\n}\n```\n\n## FAQ ##\n\n### How can I write Primitive Types (``boolean``, ``float``, ``double``)? 
###\n\nBy default, methods like ``mapBoolean`` map the boxed types, such as ``Boolean``, ``Integer`` and ``Long``. This can be a problem \nif you need to squeeze the last seconds out of a bulk insert, see this issue:\n\n* [https://github.com/PgBulkInsert/PgBulkInsert/issues/93](https://github.com/PgBulkInsert/PgBulkInsert/issues/93)\n\nSo for every data type that also has a primitive type, you can add a \"Primitive\" suffix to the method name, for example:\n\n* ``mapBooleanPrimitive``\n\nThis uses the primitive type and avoids boxing and unboxing of values.\n\n### How can I write a ``java.sql.Timestamp``? ###\n\nYou probably have Java classes with a ``java.sql.Timestamp`` in your application, but the ``AbstractMapping`` and the ``SimpleRowWriter`` expect a ``LocalDateTime``. Here is how to map a ``java.sql.Timestamp``.\n\nImagine you have an ``EMail`` class with a property ``emailCreateTime`` that uses a ``java.sql.Timestamp`` to \nrepresent the time. The column name in Postgres is ``email_create_time`` and its data type is ``timestamp``.\n\nTo map the ``java.sql.Timestamp``, you would call the ``mapTimeStamp`` method like this:\n\n```java\nmapTimeStamp(\"email_create_time\", x -\u003e x.getEmailCreateTime() != null ? x.getEmailCreateTime().toLocalDateTime() : null);\n```\n\nAnd here is the complete example:\n\n```java\npublic class EMail {\n\n    private Timestamp emailCreateTime;\n\n    public Timestamp getEmailCreateTime() {\n        return emailCreateTime;\n    }\n}\n\npublic static class EMailMapping extends AbstractMapping\u003cEMail\u003e\n{\n    public EMailMapping(String schema) {\n        super(schema, \"unit_test\");\n\n        mapTimeStamp(\"email_create_time\", x -\u003e x.getEmailCreateTime() != null ? x.getEmailCreateTime().toLocalDateTime() : null);\n    }\n}\n```\n\n### Handling Null Characters or... 
'invalid byte sequence for encoding \"UTF8\": 0x00' ###\n\nIf you see the error message ``invalid byte sequence for encoding \"UTF8\": 0x00``, your data contains Null Characters. Although ``0x00`` is valid UTF-8, PostgreSQL does not support writing it, because it uses C-style string termination internally.\n\nPgBulkInsert allows you to enable a Null Character handling that replaces every ``0x00`` occurrence with an empty string:\n\n```java\n// Create the Table Definition:\nSimpleRowWriter.Table table = new SimpleRowWriter.Table(schema, tableName, columnNames);\n\n// Create the Writer from the Table and a PGConnection:\nSimpleRowWriter writer = new SimpleRowWriter(table, pgConnection);\n\n// Enable the Null Character Handler:\nwriter.enableNullCharacterHandler();\n```\n\n## Running the Tests ##\n\nRunning the Tests requires a PostgreSQL database. \n\nYou have to configure the test database connection in the file ``db.properties`` of the module ``pgbulkinsert-core``:\n\n```ini\ndb.url=jdbc:postgresql://127.0.0.1:5432/sampledb\ndb.user=philipp\ndb.password=test_pwd\ndb.schema=public\n```\n\nThe tests are transactional, which means any test data will be rolled back once a test finishes. 
But it probably makes \nsense to set up a separate ``db.schema`` for your tests, if you want to avoid polluting the ``public`` schema or have \ndifferent permissions.\n\n## License ##\n\nPgBulkInsert is released under the terms of the [MIT License]:\n\n* [https://github.com/bytefish/PgBulkInsert](https://github.com/bytefish/PgBulkInsert)\n\n\n## Resources ##\n\n* [Npgsql](https://github.com/npgsql/npgsql)\n* [Postgres on the wire - A look at the PostgreSQL wire protocol (PGCon 2014)](https://www.pgcon.org/2014/schedule/attachments/330_postgres-for-the-wire.pdf)\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FPgBulkInsert%2FPgBulkInsert","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FPgBulkInsert%2FPgBulkInsert","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FPgBulkInsert%2FPgBulkInsert/lists"}