https://github.com/docling-project/docling4j

Docling4j brings the functionalities of Docling in document understanding to Java® projects

[![License MIT](https://img.shields.io/github/license/docling-project/docling-parse)](https://opensource.org/licenses/MIT)
[![semantic-release: angular](https://img.shields.io/badge/semantic--release-angular-e10079?logo=semantic-release)](https://github.com/semantic-release/semantic-release)
[![graalpy](https://img.shields.io/badge/pyenv-graalpy-blue)](#start-replacing-cpython-with-graalpy)

# Docling4j version 0.1.1

**Docling4j** brings the functionalities of [Docling](https://github.com/docling-project/docling) in document understanding to Java® projects.

## Installation (WIP)
The current version of this library is 0.1.1.

To use it in your project, declare a dependency with the library's artifact coordinates (group ID, artifact ID, and version),
like this:

##### Maven

```xml
<dependency>
  <groupId>com.ibm.docling</groupId>
  <artifactId>docling4j</artifactId>
  <version>0.1.1</version>
</dependency>
```

**docling4j** uses [GraalPy](https://www.graalvm.org/python), a high-performance embeddable Python 3 runtime for Java. Although not required, [Oracle GraalVM JDK](https://www.oracle.com/java/graalvm/) is recommended for running **docling4j**, since it supports runtime compilation to native code and efficient execution of embedded applications. For details on how the level of optimization differs between Java runtimes, see the [GraalVM runtime optimization support documentation](https://www.graalvm.org/latest/reference-manual/embed-languages/#runtime-optimization-support).
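
To see which Java runtime an application is actually running on, the standard JVM system properties can be printed at startup. The following is a minimal sketch and not part of docling4j: the class `RuntimeCheck` and its method `looksLikeGraalVm` are hypothetical names, and matching the string "GraalVM" in the properties is a heuristic assumption rather than an official detection API.

```java
import java.util.Objects;
import java.util.stream.Stream;

// Hypothetical helper, not part of docling4j.
public class RuntimeCheck {

    // Heuristic (assumption): treat the runtime as GraalVM if any of the
    // given property values mentions "GraalVM". Null values are ignored.
    public static boolean looksLikeGraalVm(String... values) {
        return Stream.of(values)
                .filter(Objects::nonNull)
                .anyMatch(v -> v.contains("GraalVM"));
    }

    public static void main(String[] args) {
        // Standard system properties; java.vendor.version may be absent (null).
        String vmName = System.getProperty("java.vm.name");
        String vendorVersion = System.getProperty("java.vendor.version");
        String javaVersion = System.getProperty("java.version");

        System.out.println("java.vm.name        = " + vmName);
        System.out.println("java.vendor.version = " + vendorVersion);
        System.out.println("java.version        = " + javaVersion);

        if (looksLikeGraalVm(vmName, vendorVersion)) {
            System.out.println("Runtime appears to be GraalVM.");
        } else {
            System.out.println("Runtime does not appear to be GraalVM; "
                    + "embedded Python code may run with fewer optimizations.");
        }
    }
}
```

Running this on the JDK used for your application shows whether the recommended runtime is in place before you investigate performance.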

## Get help and support

Please feel free to connect with us using the [discussion section](https://github.com/docling-project/docling/discussions).

## Technical report

For more details on Docling's inner workings, check out the [Docling Technical Report](https://arxiv.org/abs/2408.09869).

## Code of conduct

See [Code of Conduct](https://github.com/docling-project/docling4j/blob/main/CODE_OF_CONDUCT.md) for details.

## References

If you use Docling in your projects, please consider citing the following:

```bib
@techreport{Docling,
author = {Deep Search Team},
month = {8},
title = {Docling Technical Report},
url = {https://arxiv.org/abs/2408.09869},
eprint = {2408.09869},
doi = {10.48550/arXiv.2408.09869},
version = {1.0.0},
year = {2024}
}
```

## License

The Docling codebase is under the MIT license.
For individual model usage, please refer to the model licenses found in the original packages.

## LF AI & Data

Docling is hosted as a project in the [LF AI & Data Foundation](https://lfaidata.foundation/projects/).

### IBM ❤️ Open Source AI

The project was started by the AI for knowledge team at IBM Research Zurich.