https://github.com/0xthierry/layout-document-ai
Document AI preserving the original layout.
- Host: GitHub
- URL: https://github.com/0xthierry/layout-document-ai
- Owner: 0xthierry
- Created: 2024-07-25T12:46:59.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-08-29T22:57:42.000Z (almost 2 years ago)
- Last Synced: 2024-10-18T09:13:57.187Z (over 1 year ago)
- Topics: documentai
- Language: TypeScript
- Homepage:
- Size: 52.7 KB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Layout Document AI
## Description
`Layout Document AI` is a tool that processes documents with Google Cloud's Document AI while preserving their original layout. It reads the JSON files generated by Document AI, reconstructs the text in its original layout, and writes the result to `.txt` files.
## Features
- Processes Document AI JSON output.
- Preserves the original layout of the document.
- Generates formatted text files.
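
To illustrate what layout preservation can look like, here is a minimal sketch, not the project's actual implementation. It assumes the standard Document AI JSON shape, where each token in `pages[].tokens[]` points into the document-level `text` through `layout.textAnchor.textSegments` and carries a `layout.boundingPoly.normalizedVertices` box with coordinates between 0 and 1. The names `anchorText` and `renderPage`, and the fixed grid size, are hypothetical choices made for this example.

```typescript
// Minimal subset of the Document AI JSON shape used by this sketch.
// Index values are int64 in the API, so they usually arrive as strings,
// and a zero startIndex is typically omitted.
interface TextSegment { startIndex?: string | number; endIndex?: string | number }
interface Vertex { x?: number; y?: number }
interface Layout {
  textAnchor?: { textSegments?: TextSegment[] };
  boundingPoly?: { normalizedVertices?: Vertex[] };
}
interface DocumentAiJson {
  text?: string;
  pages?: { tokens?: { layout?: Layout }[] }[];
}

// Hypothetical helper: resolves the text that a layout element points to.
function anchorText(fullText: string, layout?: Layout): string {
  const segments = layout?.textAnchor?.textSegments ?? [];
  return segments
    .map((s) => fullText.slice(Number(s.startIndex ?? 0), Number(s.endIndex ?? 0)))
    .join("");
}

// Hypothetical renderer: writes each token onto a fixed-width character grid
// at the position given by the top-left corner of its normalized bounding box.
function renderPage(doc: DocumentAiJson, pageIndex = 0, cols = 80, rows = 60): string {
  const fullText = doc.text ?? "";
  const grid: string[][] = Array.from({ length: rows }, () =>
    new Array<string>(cols).fill(" "),
  );

  for (const token of doc.pages?.[pageIndex]?.tokens ?? []) {
    const word = anchorText(fullText, token.layout).trim();
    const vertices = token.layout?.boundingPoly?.normalizedVertices ?? [];
    if (word.length === 0 || vertices.length === 0) continue;

    const left = Math.min(...vertices.map((v) => v.x ?? 0));
    const top = Math.min(...vertices.map((v) => v.y ?? 0));
    // Clamp so that slightly out-of-range coordinates still land on the grid.
    const row = Math.max(0, Math.min(rows - 1, Math.floor(top * rows)));
    const col = Math.max(0, Math.min(cols - 1, Math.floor(left * cols)));

    const line = grid[row];
    if (!line) continue;
    for (let i = 0; i < word.length && col + i < cols; i++) {
      line[col + i] = word.charAt(i);
    }
  }

  // Keep blank lines inside the page (vertical spacing), drop trailing ones.
  return grid
    .map((line) => line.join("").trimEnd())
    .join("\n")
    .replace(/\n+$/, "");
}
```

A grid keeps the example short, but tokens that map to the same cells overwrite each other, so a real implementation would likely need a finer grid or per-line grouping of tokens.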
## Installation
1. Install dependencies:
```sh
pnpm install
```
## Usage
1. Place your Document AI JSON files in the `document-ai-json` directory.
2. Run the main script to process the documents:
```sh
pnpm run dev
```
3. The processed text files will be saved in the `document-ai-text` directory.
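
The steps above amount to a batch conversion from one directory to another. The following is a rough sketch of such a loop under stated assumptions, not the repository's actual script: `convertDirectory` is a hypothetical name, and `toText` stands in for whatever layout-preserving conversion the project applies to each parsed JSON document.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical batch runner: converts every `.json` file in `inputDir`
// into a `.txt` file with the same base name in `outputDir`.
// `toText` is a placeholder for the layout-preserving conversion.
function convertDirectory(
  inputDir: string,
  outputDir: string,
  toText: (document: unknown) => string,
): string[] {
  fs.mkdirSync(outputDir, { recursive: true });
  const written: string[] = [];

  for (const name of fs.readdirSync(inputDir).sort()) {
    // Skip anything that is not a JSON file.
    if (path.extname(name).toLowerCase() !== ".json") continue;

    const raw = fs.readFileSync(path.join(inputDir, name), "utf8");
    const document: unknown = JSON.parse(raw);

    const baseName = path.basename(name, path.extname(name));
    const target = path.join(outputDir, `${baseName}.txt`);
    fs.writeFileSync(target, toText(document), "utf8");
    written.push(target);
  }

  // Paths of the text files that were written, in alphabetical order.
  return written;
}
```

With the directory names from the steps above, a call would look like `convertDirectory("document-ai-json", "document-ai-text", toText)`, where `toText` is supplied by the caller.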
## Contributing
1. Fork the repository.
2. Create a new branch (`git checkout -b feature-branch`).
3. Make your changes.
4. Commit your changes (`git commit -m 'Add some feature'`).
5. Push to the branch (`git push origin feature-branch`).
6. Open a pull request.
## License
This project is licensed under the ISC License. See the [LICENSE](LICENSE) file for details.
## Author
Thierry Santos - [thierrysantoos123@gmail.com](mailto:thierrysantoos123@gmail.com)
## Acknowledgements
- [Google Cloud Document AI](https://cloud.google.com/document-ai) for providing the document processing capabilities.