https://github.com/0xthierry/layout-document-ai

Document AI preserving the original layout.

# Layout Document AI

## Description

`Layout Document AI` is a tool that converts the output of Google Cloud's Document AI into plain text while preserving the original document layout. It reads the JSON files generated by Document AI, reconstructs the text in its original layout, and writes the result to `.txt` files.

## Features

- Processes Document AI JSON outputs.
- Preserves the original layout of the document.
- Generates formatted text files.
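
To illustrate what layout preservation can involve, here is a minimal sketch. It is not necessarily how this project implements it. It assumes the Document AI JSON exposes a top-level `text` string and, for each page, `tokens` whose `layout` carries a `textAnchor` (with `textSegments`) and a `boundingPoly` (with `normalizedVertices`). The names `layoutText` and `renderPage` and the grid sizes are hypothetical.

```typescript
// Minimal subset of the Document AI JSON shape used by this sketch.
interface Vertex { x?: number; y?: number }
interface TextSegment { startIndex?: string | number; endIndex?: string | number }
interface Layout {
  textAnchor?: { textSegments?: TextSegment[] };
  boundingPoly?: { normalizedVertices?: Vertex[] };
}
interface Page { tokens?: { layout?: Layout }[] }

// Resolve a layout element's text by slicing the document-level text string.
// Indices may arrive as strings in JSON, and startIndex may be omitted when it is 0.
function layoutText(fullText: string, layout?: Layout): string {
  const segments = layout?.textAnchor?.textSegments ?? [];
  return segments
    .map((s) => fullText.slice(Number(s.startIndex ?? 0), Number(s.endIndex ?? 0)))
    .join("");
}

// Hypothetical renderer: places each token on a fixed-width character grid
// using its normalized (0..1) bounding-box coordinates.
function renderPage(fullText: string, page: Page, cols = 80, rows = 60): string {
  const grid = new Map<number, { col: number; text: string }[]>();
  for (const token of page.tokens ?? []) {
    const text = layoutText(fullText, token.layout).trim();
    const vertices = token.layout?.boundingPoly?.normalizedVertices ?? [];
    if (!text || vertices.length === 0) continue;
    const xs = vertices.map((v) => v.x ?? 0);
    const ys = vertices.map((v) => v.y ?? 0);
    const midY = (Math.min(...ys) + Math.max(...ys)) / 2;
    const row = Math.floor(midY * rows);          // vertical center picks the text row
    const col = Math.floor(Math.min(...xs) * cols); // left edge picks the column
    const items = grid.get(row) ?? [];
    items.push({ col, text });
    grid.set(row, items);
  }
  const lines: string[] = [];
  for (const row of Array.from(grid.keys()).sort((a, b) => a - b)) {
    let line = "";
    for (const item of grid.get(row)!.sort((a, b) => a.col - b.col)) {
      // Pad to the target column, keeping at least one space between tokens.
      const target = line.length === 0 ? item.col : Math.max(item.col, line.length + 1);
      line = line.padEnd(target) + item.text;
    }
    lines.push(line);
  }
  return lines.join("\n");
}
```

For a parsed file `doc`, calling `renderPage(doc.text, doc.pages[0])` would return that page's tokens aligned in columns. This sketch drops empty rows, so vertical gaps are not reproduced; a fuller version could emit blank lines for skipped row numbers.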

## Installation

1. Install dependencies:
```sh
pnpm install
```

## Usage

1. Place your Document AI JSON files in the `document-ai-json` directory.

2. Run the main script to process the documents:
```sh
pnpm run dev
```

3. The processed text files will be saved in the `document-ai-text` directory.
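
The three steps above amount to a batch conversion from one directory to another. The following sketch shows that flow under stated assumptions. `convertDirectory` is a hypothetical helper, not a function exported by this project, and the default `render` callback simply returns the raw `text` field rather than a layout-preserving rendering.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: converts every .json file in inputDir into a .txt file
// in outputDir and returns the paths that were written.
// `render` stands in for the layout-preserving step.
function convertDirectory(
  inputDir: string,
  outputDir: string,
  render: (doc: { text?: string }) => string = (doc) => doc.text ?? "",
): string[] {
  fs.mkdirSync(outputDir, { recursive: true }); // create the output directory if missing
  const written: string[] = [];
  for (const file of fs.readdirSync(inputDir).sort()) {
    const ext = path.extname(file);
    if (ext.toLowerCase() !== ".json") continue; // skip anything that is not JSON
    const doc = JSON.parse(fs.readFileSync(path.join(inputDir, file), "utf8"));
    const outPath = path.join(outputDir, `${path.basename(file, ext)}.txt`);
    fs.writeFileSync(outPath, render(doc), "utf8");
    written.push(outPath);
  }
  return written;
}
```

With the directory names used in this README, the call would be `convertDirectory("document-ai-json", "document-ai-text")`.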

## Contributing

1. Fork the repository.
2. Create a new branch (`git checkout -b feature-branch`).
3. Make your changes.
4. Commit your changes (`git commit -m 'Add some feature'`).
5. Push to the branch (`git push origin feature-branch`).
6. Open a pull request.

## License

This project is licensed under the ISC License. See the [LICENSE](LICENSE) file for details.

## Author

Thierry Santos - [thierrysantoos123@gmail.com](mailto:thierrysantoos123@gmail.com)

## Acknowledgements

- [Google Cloud Document AI](https://cloud.google.com/document-ai) for providing the document processing capabilities.