{"id":27949156,"url":"https://github.com/digital-preservation/csv-validator","last_synced_at":"2025-05-15T17:03:51.809Z","repository":{"id":7716137,"uuid":"9081590","full_name":"digital-preservation/csv-validator","owner":"digital-preservation","description":"CSV Validation Tool and API (CSV Schema RI)","archived":false,"fork":false,"pushed_at":"2025-04-15T08:46:21.000Z","size":86107,"stargazers_count":213,"open_issues_count":51,"forks_count":56,"subscribers_count":26,"default_branch":"master","last_synced_at":"2025-05-07T15:22:02.373Z","etag":null,"topics":["csv","csv-files","csv-parse","csv-parser","csv-parsing","csv-schema","csv-validator"],"latest_commit_sha":null,"homepage":"http://digital-preservation.github.io/csv-validator","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/digital-preservation.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2013-03-28T16:45:35.000Z","updated_at":"2025-04-15T08:46:18.000Z","dependencies_parsed_at":"2023-10-30T12:36:41.962Z","dependency_job_id":"ea8232f9-cc47-423c-9f42-42ae6d65ee72","html_url":"https://github.com/digital-preservation/csv-validator","commit_stats":{"total_commits":934,"total_committers":24,"mean_commits":"38.916666666666664","dds":0.7366167023554604,"last_synced_commit":"6218b9a84428b34c291ec31aca6e139ebd398eb0"},"previous_names":[],"tags_count":22,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digital-preservation%2Fcsv-validator","tags_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/repositories/digital-preservation%2Fcsv-validator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digital-preservation%2Fcsv-validator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digital-preservation%2Fcsv-validator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/digital-preservation","download_url":"https://codeload.github.com/digital-preservation/csv-validator/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254384982,"owners_count":22062422,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","csv-files","csv-parse","csv-parser","csv-parsing","csv-schema","csv-validator"],"created_at":"2025-05-07T15:20:04.599Z","updated_at":"2025-05-15T17:03:51.789Z","avatar_url":"https://github.com/digital-preservation.png","language":"Scala","funding_links":[],"categories":["SoftwareTools"],"sub_categories":["CSV Validator"],"readme":"CSV Validator\n=============\n\nA Validation Tool and APIs for validating CSV (Comma Separated Value) files by using [CSV Schema](https://github.com/digital-preservation/csv-schema).\n\n[![CI](https://github.com/digital-preservation/csv-validator/workflows/CI/badge.svg)](https://github.com/digital-preservation/csv-validator/actions?query=workflow%3ACI)\n\nReleased under the [Mozilla Public Licence version 2.0](http://www.mozilla.org/MPL/2.0/).\n\nA [comprehensive user guide is available in GitHub pages](http://digital-preservation.github.io/csv-validator/), along with a 
more [complete specification of the CSV Schema language](http://digital-preservation.github.io/csv-schema/csv-schema-1.1.html).\n\n\nTechnology\n----------\nThe Validation tool and APIs are written in Scala 2.13 and may be used as:\n\n* A stand-alone command line tool.\n\n* A desktop tool (we provide a simple Swing GUI).\n\n* A library in your Scala project.\n\n* A library in your Java project (we provide a Java 11 interface to make things simple for Java programmers too).\n\nThe Validation Tool and APIs can be used on any Java Virtual Machine which supports Java 11 or later (**NB Java 6 support was removed in version 1.1**). The source code is\nbuilt using the [Apache Maven](https://maven.apache.org/) build tool:\n\n1. For use in other Java/Scala Applications, build by executing `mvn clean install`.\n2. For the Command Line Interface or Swing GUI, build by executing `mvn clean package`.\n\n\nMaven Artifacts\n===============\nReleased Maven Artifacts can be found in Maven Central under the groupId [`uk.gov.nationalarchives`](http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22uk.gov.nationalarchives%22).\n\n\nJava API\n--------\nIf you wish to use the CSV Validator from your own Java project, we provide a native Java API. The dependency details are:\n```xml\n\u003cdependency\u003e\n    \u003cgroupId\u003euk.gov.nationalarchives\u003c/groupId\u003e\n    \u003cartifactId\u003ecsv-validator-java-api\u003c/artifactId\u003e\n    \u003cversion\u003e1.4.0\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\nThe Javadoc can be found in Maven Central, or you can build it locally by executing `mvn javadoc:javadoc`.\n\nExample Java code using the CSV Validator through the Java API:\n```java\n Charset csvEncoding = Charset.forName(\"UTF-8\"); // default is UTF-8\n boolean validateCsvEncoding = true;\n Charset csvSchemaEncoding = Charset.forName(\"UTF-8\"); // default is UTF-8\n boolean failFast = true; // default is false\n List\u003cSubstitution\u003e 
pathSubstitutions = new ArrayList\u003cSubstitution\u003e(); // default is an empty ArrayList\n boolean enforceCaseSensitivePathChecks = true; // default is false\n boolean trace = false; // default is false\n ProgressCallback progress = null; // default is null\n boolean skipFileChecks = true; // default is false\n int maxCharsPerCell = 8096; // default is 4096\n\n // add a substitution path\n pathSubstitutions.add(new Substitution(\"file://something\", \"/home/xxx\"));\n\n CsvValidator.ValidatorBuilder validateWithStringNames = new CsvValidator.ValidatorBuilder(\n     \"/home/dev/IdeaProjects/csv/csv-validator/csv-validator-core/data.csv\",\n     \"/home/dev/IdeaProjects/csv/csv-validator/csv-validator-core/data-schema.csvs\"\n );\n\n // alternatively, you can pass in Readers for each file\n // (java.io.Reader is abstract, so use a concrete Reader such as FileReader)\n Reader csvReader = new FileReader(\"/home/dev/IdeaProjects/csv/csv-validator/csv-validator-core/data.csv\");\n Reader csvSchemaReader = new FileReader(\"/home/dev/IdeaProjects/csv/csv-validator/csv-validator-core/data-schema.csvs\");\n CsvValidator.ValidatorBuilder validateWithReaders = new CsvValidator.ValidatorBuilder(\n     csvReader, csvSchemaReader\n );\n\n List\u003cFailMessage\u003e messages = validateWithStringNames\n   .usingCsvEncoding(csvEncoding, validateCsvEncoding) // should only be `true` if using UTF-8 encoding, otherwise it will throw an exception\n   .usingCsvSchemaEncoding(csvSchemaEncoding)\n   .usingFailFast(failFast)\n   .usingPathSubstitutions(pathSubstitutions)\n   .usingEnforceCaseSensitivePathChecks(enforceCaseSensitivePathChecks)\n   .usingTrace(trace)\n   .usingProgress(progress)\n   .usingSkipFileChecks(skipFileChecks)\n   .usingMaxCharsPerCell(maxCharsPerCell)\n   .runValidation();\n\n if(messages.isEmpty()) {\n   System.out.println(\"All worked OK\");\n } else {\n   for(FailMessage message : messages) {\n     if(message instanceof WarningMessage) {\n       System.out.println(\"Warning: \" + message.getMessage());\n     } else {\n       System.out.println(\"Error: \" + message.getMessage());\n     }\n   }\n }\n```\n\n\nScala API\n=========\nLikewise, if you wish to use the CSV Validator from your own Scala 
project, the Scala API is part of the core. The dependency details are:\n```xml\n\u003cdependency\u003e\n    \u003cgroupId\u003euk.gov.nationalarchives\u003c/groupId\u003e\n    \u003cartifactId\u003ecsv-validator-core\u003c/artifactId\u003e\n    \u003cversion\u003e1.3.0\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\nThe Scaladoc can be found in Maven Central, or you can build it locally by executing `mvn scala:doc`.\n\nAn example of using the Scala API can be found in the class `uk.gov.nationalarchives.csv.validator.api.java.CsvValidatorJavaBridge` from the\n`csv-validator-java-api` module. The Scala API at present gives much more control over the individual Schema Parsing and Validation Processor\nthan the Java API.\n\nSchema Examples\n===============\nExamples of CSV Schema can be found in the test cases of the `csv-validator-core` module. See the `*.csvs` files in [acceptance/](https://github.com/digital-preservation/csv-validator/tree/master/csv-validator-core/src/test/resources/uk/gov/nationalarchives/csv/validator/acceptance). Schemas used by the Digital Preservation department at The National Archives are also available in the `example-schemas` folder of the [csv-schema](https://github.com/digital-preservation/csv-schema) repository.\n\n\nCurrent Limitations of the CSV Validator Tool\n=============================================\nThe CSV Validator implements almost all of the `CSV Schema 1.2 (Draft)` language. Current limitations and missing functionality are:\n\n* `DateExpr` is not yet fully implemented (may raise Schema check error).\n\n* `PartialDateExpr` is not yet implemented (raises Schema check error).\n\n* At least `MD5`, `SHA-1`, `SHA-2`, `SHA-3`, and `SHA-256` checksum algorithms are supported. 
Many more are probably supported as well, since we defer to Java's `java.security.MessageDigest` class.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdigital-preservation%2Fcsv-validator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdigital-preservation%2Fcsv-validator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdigital-preservation%2Fcsv-validator/lists"}