{"id":27949135,"url":"https://github.com/digital-preservation/utf8-validator","last_synced_at":"2025-05-07T15:20:02.128Z","repository":{"id":11953664,"uuid":"14524594","full_name":"digital-preservation/utf8-validator","owner":"digital-preservation","description":"UTF-8 Validator","archived":false,"fork":false,"pushed_at":"2023-03-24T12:02:22.000Z","size":123,"stargazers_count":19,"open_issues_count":8,"forks_count":7,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-05-07T15:19:56.333Z","etag":null,"topics":["charset","unicode","utf-8","utf8-validator","validator"],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/digital-preservation.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2013-11-19T12:59:25.000Z","updated_at":"2025-01-10T12:44:00.000Z","dependencies_parsed_at":"2024-01-17T07:06:41.979Z","dependency_job_id":"b7ffa937-7ff9-483b-afe8-b49284d4079d","html_url":"https://github.com/digital-preservation/utf8-validator","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digital-preservation%2Futf8-validator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digital-preservation%2Futf8-validator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digital-preservation%2Futf8-validator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digital-preservation%2Futf8-validator/manifests","o
wner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/digital-preservation","download_url":"https://codeload.github.com/digital-preservation/utf8-validator/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252902620,"owners_count":21822263,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["charset","unicode","utf-8","utf8-validator","validator"],"created_at":"2025-05-07T15:20:01.133Z","updated_at":"2025-05-07T15:20:02.121Z","avatar_url":"https://github.com/digital-preservation.png","language":"Java","funding_links":[],"categories":["SoftwareTools"],"sub_categories":["UTF-8 Validator"],"readme":"UTF-8 Validator\n===============\n\nA UTF-8 Validation Tool which may be used as either a command line tool or as a library embedded in your own program.\n\nReleased under the [BSD 3-Clause Licence](http://opensource.org/licenses/BSD-3-Clause).\n\n[![CI](https://github.com/digital-preservation/utf8-validator/workflows/CI/badge.svg)](https://github.com/digital-preservation/utf8-validator/actions?query=workflow%3ACI)\n[![Maven Central](https://maven-badges.herokuapp.com/maven-central/uk.gov.nationalarchives/utf8-validator/badge.svg)](https://search.maven.org/search?q=g:uk.gov.nationalarchives)\n\nUse from the Command Line\n-------------------------\nYou can either download the application from [here](https://search.maven.org/remotecontent?filepath=uk/gov/nationalarchives/utf8-validator/1.2/utf8-validator-1.2-application.zip) or [build from the source code](#building-from-source-code). 
You should extract this ZIP file to the place on your computer where you keep your applications. You can then run either `bin/validate.sh` (Linux/Mac/Unix) or `bin\\validate.bat` (Windows).\n\nFor example, to report all validation errors:\n\n```bash\n$ cd /opt/utf8-validator-1.2\n$ bin/validate.sh /tmp/my-file.txt\n```\n\nFor example, to report the first validation error and exit:\n\n```bash\n$ cd /opt/utf8-validator-1.2\n$ bin/validate.sh --fail-fast /tmp/my-file.txt\n```\n\nCommand Line Exit Codes\n-----------------------\n* **0** Success\n* **1** Invalid Arguments provided to the application\n* **2** File was not UTF-8 Valid\n* **4** IO Error, e.g. could not read file\n\n\nUse as a Library\n----------------\nThe UTF-8 Validator is written in Java and may be easily used from any Java (Scala, Clojure, JVM Language etc) application. We are using the Maven build system, and our artifacts have been published to [Maven Central](http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22uk.gov.nationalarchives%22).\n\nIf you are using Maven, you can simply add this to the dependencies section of your pom.xml:\n\n```xml\n\u003cdependency\u003e\n    \u003cgroupId\u003euk.gov.nationalarchives\u003c/groupId\u003e\n    \u003cartifactId\u003eutf8-validator\u003c/artifactId\u003e\n    \u003cversion\u003e1.2\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\nAlternatively, if you are using Sbt, you can add this to your library dependencies:\n\n```scala\n\"uk.gov.nationalarchives\" % \"utf8-validator\" % \"1.2\"\n```\n\nTo use the Library you need to implement the very simple interface `uk.gov.nationalarchives.utf8.validator.ValidationHandler` (or you could use `uk.gov.nationalarchives.utf8.validator.PrintingValidationHandler` if it suits you). The interface has a single method which is called whenever a validator finds a validation error. You can then instantiate `Utf8Validator` and validate from either a `java.io.File` or `java.io.InputStream`. 
For example:\n\n```java\nValidationHandler handler = new ValidationHandler() {\n\t@Override\n\tpublic void error(final String message, final long byteOffset) throws ValidationException {\n\t\tSystem.err.println(\"[Error][@\" + byteOffset + \"] \" + message);\n\t}\n};\n\nFile f = ... //your file here\n\nnew Utf8Validator(handler).validate(f);\n```\n\nBuilding from Source Code\n--------------------------\n* Git clone the repository from https://github.com/digital-preservation/utf8-validator.git\n* Build using [Maven](http://maven.apache.org) by running `mvn package`. You will then find a ZIP of the compiled application in `target/utf8-validator-1.2-application.zip`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdigital-preservation%2Futf8-validator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdigital-preservation%2Futf8-validator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdigital-preservation%2Futf8-validator/lists"}