{"id":22110755,"url":"https://github.com/unified-doc/unified-doc","last_synced_at":"2025-10-11T13:02:49.227Z","repository":{"id":57386201,"uuid":"266412140","full_name":"unified-doc/unified-doc","owner":"unified-doc","description":"unified document APIs","archived":false,"fork":false,"pushed_at":"2020-09-25T05:07:03.000Z","size":605,"stargazers_count":10,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-10-15T12:00:42.429Z","etag":null,"topics":["annotate","compile","content","document","export","file","hast","highlight","html","io","markdown","nlp","parse","search","text","unifiedjs","unist","vfile"],"latest_commit_sha":null,"homepage":"https://unified-doc.netlify.app/","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/unified-doc.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"license","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-05-23T20:14:39.000Z","updated_at":"2024-09-05T14:59:43.000Z","dependencies_parsed_at":"2022-09-02T01:10:56.519Z","dependency_job_id":null,"html_url":"https://github.com/unified-doc/unified-doc","commit_stats":null,"previous_names":[],"tags_count":136,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/unified-doc%2Funified-doc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/unified-doc%2Funified-doc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/unified-doc%2Funified-doc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/unified-doc%2Funified-doc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/owners/unified-doc","download_url":"https://codeload.github.com/unified-doc/unified-doc/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":227537560,"owners_count":17784564,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["annotate","compile","content","document","export","file","hast","highlight","html","io","markdown","nlp","parse","search","text","unifiedjs","unist","vfile"],"created_at":"2024-12-01T10:23:36.948Z","updated_at":"2025-10-11T13:02:44.192Z","avatar_url":"https://github.com/unified-doc.png","language":"JavaScript","readme":"# unified-doc\nunified document APIs.\n\n---\n\n## Contents\n- [Intro](#intro)\n- [Document Formats](#document-formats)\n- [Spec](#spec)\n- [Packages](#packages)\n- [Development](#development)\n\n## Intro\nVast amounts of human knowledge are stored digitally in different document formats.  It is cheap to create, store, render, and manage content within a single document format, but much harder to perform the same operations on content across different formats.  Some form of [unified][unified] bridge is required to significantly lower the friction when working across different formats, resulting in improved sharing of human knowledge.\n\nInstead of implementing custom programs per format to parse/render/search/annotate/export content, `unified-doc` implements a set of unified document APIs for supported content types.  
This allows existing APIs to be extended to newly introduced content types, and supported content types to benefit from future API methods.\n\nWith `unified-doc`, we can easily:\n- compile and render any content to HTML.\n- format and style the document.\n- mark or annotate the document.\n- search on the document's text content.\n- export the document in a variety of file formats.\n- preserve the semantic structure of the source content.\n- retrieve useful representations of the document (e.g. source, html, text, syntax tree).\n- enrich the document through an ecosystem of plugins.\n- evolve with interoperable web technologies.\n\n### Document formats\n\n`unified-doc` supports the following document formats by implementing parsers associated with the [mime type][mime-type] of the document format:\n\n- [x] most source code supported by syntax highlighting libraries (e.g. `.txt`, `.json`, `.js`, `.css`, `.sh`, `.py`, `.r`, `.cpp`)\n- [x] `.html`\n- [x] `.md`\n- [x] `.csv`\n- [ ] `.docx`\n- [ ] `.epub`\n- [ ] `.pdf`\n- [ ] `.tex`\n- [ ] `.mathml`\n- [ ] `.rtf`\n\n\n## Spec\nPlease refer to the [Spec](./spec.md) documentation for more details on goals, definitions, and implementations in `unified-doc`.\n\n## Packages\nThe following packages are managed under the `unified-doc` project.\n\n### APIs\nUnified document APIs for Node, CLI, DOM.\n- [`unified-doc`][unified-doc]\n- [`unified-doc-cli`][unified-doc-cli]\n- [`unified-doc-dom`][unified-doc-dom]\n\n### Parsers\nParsers parse source content into [hast][] trees.\n- [`unified-doc-parse-code-block`][unified-doc-parse-code-block]\n- [`unified-doc-parse-csv`][unified-doc-parse-csv]\n\n### Search Algorithms\nSearch algorithms use a unified search interface to return search results based on the provided `query` when searching across a document's `textContent`.\n- [`unified-doc-search-micromatch`][unified-doc-search-micromatch]\n\n### Hast Utils\n`hast` utilities operate on and transform `hast` trees.\n- 
[`unified-doc-util-mark`][unified-doc-util-mark]\n- [`unified-doc-util-text-offsets`][unified-doc-util-text-offsets]\n\n### Wrappers\nWrappers implement `unified-doc` APIs in other interfaces.\n- [`unified-doc-react`][unified-doc-react]\n\n### Types\nShared TypeScript typings used across `unified-doc` packages.\n- [`unified-doc-types`][unified-doc-types]\n\n## Development\nThis project is:\n- implemented with the [unified][] interface.\n- linted with `xo` + `prettier` + `tsc`.\n- developed and built with `microbundle`.\n- tested with `jest`.\n- softly typed with `typescript` using `checkJs` (only public APIs are typed).\n- managed with `lerna`.\n\nMonorepo scripts:\n```sh\n# install dependencies and bootstrap with lerna\nnpm run bootstrap\n\n# build all packages with microbundle\nnpm run build\n\n# clean all packages (rm dist + node_modules)\nnpm run clean\n\n# watch/rebuild all packages with microbundle\nnpm run dev\n\n# lint all packages with xo + prettier + tsc\nnpm run lint\n\n# test all packages with jest in --watch mode (make sure to run the 'dev' script)\nnpm run test\n\n# test all packages in a single run\nnpm run test:run\n\n# publish all packages with lerna\nnpm run publish\n```\n\n\u003c!-- Definitions --\u003e\n[hast]: https://github.com/syntax-tree/hast\n[mime-type]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types\n[unified]: https://github.com/unifiedjs\n[unified-doc]: https://github.com/unified-doc/unified-doc/tree/main/packages/unified-doc\n[unified-doc-cli]: https://github.com/unified-doc/unified-doc-cli\n[unified-doc-dom]: https://github.com/unified-doc/unified-doc-dom\n[unified-doc-parse-code-block]: https://github.com/unified-doc/unified-doc/tree/main/packages/unified-doc-parse-code-block\n[unified-doc-parse-csv]: https://github.com/unified-doc/unified-doc/tree/main/packages/unified-doc-parse-csv\n[unified-doc-react]: https://github.com/unified-doc/unified-doc-react\n[unified-doc-search-micromatch]: 
https://github.com/unified-doc/unified-doc/tree/main/packages/unified-doc-search-micromatch\n[unified-doc-types]: https://github.com/unified-doc/unified-doc/tree/main/packages/unified-doc-types\n[unified-doc-util-mark]: https://github.com/unified-doc/unified-doc/tree/main/packages/unified-doc-util-mark\n[unified-doc-util-text-offsets]: https://github.com/unified-doc/unified-doc/tree/main/packages/unified-doc-util-text-offsets\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Funified-doc%2Funified-doc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Funified-doc%2Funified-doc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Funified-doc%2Funified-doc/lists"}