https://github.com/projectnessie/nessie
Nessie: Transactional Catalog for Data Lakes with Git-like semantics
- Host: GitHub
- URL: https://github.com/projectnessie/nessie
- Owner: projectnessie
- License: apache-2.0
- Created: 2020-04-09T18:39:03.000Z (about 5 years ago)
- Default Branch: main
- Last Pushed: 2025-04-18T01:51:51.000Z (17 days ago)
- Last Synced: 2025-04-18T08:54:31.970Z (17 days ago)
- Topics: aws-lambda, data, git, iceberg, java, spark
- Language: Java
- Homepage: https://projectnessie.org
- Size: 186 MB
- Stars: 1,182
- Watchers: 28
- Forks: 148
- Open Issues: 125
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
Awesome Lists containing this project
- jimsghstars - projectnessie/nessie - Nessie: Transactional Catalog for Data Lakes with Git-like semantics (Java)
- awesome-data-engineering - Project Nessie - Project Nessie is a Transactional Catalog for Data Lakes with Git-like semantics. Works with Apache Iceberg tables. (Data Lake Management)
- awesome-datalake - Nessie - Project Nessie is a Transactional Catalog for Data Lakes with Git-like semantics. (Data Lake Storages)
README
# Project Nessie
Project Nessie is a Transactional Catalog for Data Lakes with Git-like semantics.
More information can be found at [projectnessie.org](https://projectnessie.org/).
Nessie supports Iceberg tables and views. It also aims to work with the widest possible range of tools; see the [feature matrix](https://projectnessie.org/tools/#feature-matrix).
## Using Nessie
You can quickly get started with Nessie by using our small, fast Docker image.

**IMPORTANT NOTE** Nessie has moved away from `docker.io` to GitHub's container registry `ghcr.io`,
and also `quay.io`. Recent releases are available only on `ghcr.io` and `quay.io`. Please
update references to `projectnessie/nessie` in your code to either `ghcr.io/projectnessie/nessie`
or `quay.io/projectnessie/nessie`.

```
docker pull ghcr.io/projectnessie/nessie
docker run -p 19120:19120 ghcr.io/projectnessie/nessie
```
_To try the Nessie image with different configuration options, refer to the templates under the [docker module](./docker#readme)._

A local [Web UI](https://projectnessie.org/tools/ui/) will be available at this point.
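
To confirm that the container is answering before moving on, a small script can query the server's configuration endpoint. The sketch below assumes the REST API is served under `/api/v2` on port 19120 and that the response is JSON carrying a `defaultBranch` field; neither detail is stated in this README, so check both against the API documentation for your Nessie version. The function names are illustrative only.

```python
import json
import urllib.request

# Port taken from the `docker run -p 19120:19120` command above.
BASE_URL = "http://localhost:19120"


def config_url(base_url: str = BASE_URL) -> str:
    """Build the URL of the configuration endpoint (path is an assumption)."""
    return base_url.rstrip("/") + "/api/v2/config"


def default_branch(payload: str) -> str:
    """Extract the default branch name from a JSON config response.

    Assumes the response object has a `defaultBranch` key.
    """
    return json.loads(payload)["defaultBranch"]


def fetch_default_branch(base_url: str = BASE_URL, timeout: float = 5.0) -> str:
    """Query a running server and return its default branch name."""
    with urllib.request.urlopen(config_url(base_url), timeout=timeout) as resp:
        return default_branch(resp.read().decode("utf-8"))
```

With the container running, calling `fetch_default_branch()` should return the name of the server's default branch; a connection error suggests the port mapping or the container itself needs a second look.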
Then install the Nessie CLI tool (to learn more about the CLI tool and how to use it, see the [Nessie CLI Documentation](https://projectnessie.org/tools/cli/)).
```
pip install pynessie
```

From there, you can use one of our technology integrations, such as those for:

* [Spark via Iceberg](https://projectnessie.org/tools/iceberg/spark/)
* [Hive via Iceberg](https://projectnessie.org/tools/iceberg/hive/)

To learn more about all supported integrations and tools, check [here](https://projectnessie.org/tools/).
Have fun! We have a Google Group and a Slack channel that we use for both developers and
users. Check them out [here](https://projectnessie.org/community/).

### Authentication
By default, Nessie servers run with authentication disabled and all requests are processed under the "anonymous"
user identity.

Nessie supports bearer tokens and uses [OpenID Connect](https://openid.net/connect/) for validating them.
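
On the client side, a bearer token travels in the standard HTTP `Authorization` header. The following is a minimal sketch of building such a request with the Python standard library; the helper name `authed_request` is hypothetical, and obtaining the token from your OpenID Connect provider is outside its scope.

```python
import urllib.request


def authed_request(url: str, token: str) -> urllib.request.Request:
    """Build a GET request that carries a bearer token.

    The token is expected to be issued by the OpenID Connect provider
    that the Nessie server is configured to trust.
    """
    req = urllib.request.Request(url, method="GET")
    req.add_header("Authorization", f"Bearer {token}")
    return req
```

The returned object can be passed to `urllib.request.urlopen`. Which URL to call depends on your deployment and API version, so treat any path you use here as something to verify against the Nessie API documentation.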
Authentication can be enabled by setting the following Quarkus properties:
* `nessie.server.authentication.enabled=true`
* `quarkus.oidc.auth-server-url=`
* `quarkus.oidc.client-id=`

#### Experimenting with Nessie Authentication in Docker
You can start the `ghcr.io/projectnessie/nessie` Docker image in authenticated mode by setting
the properties mentioned above via Docker environment variables. For example:

```shell
docker run -p 19120:19120 \
-e QUARKUS_OIDC_CLIENT_ID= \
-e QUARKUS_OIDC_AUTH_SERVER_URL= \
-e NESSIE_SERVER_AUTHENTICATION_ENABLED=true \
--network host \
ghcr.io/projectnessie/nessie
```

## Building and Developing Nessie
### Requirements
- JDK 21 or higher is needed to build Nessie (some artifacts are built
for Java 8 or 11).

### Installation
Clone this repository:
```bash
git clone https://github.com/projectnessie/nessie
cd nessie
```

Then open the project in IntelliJ or Eclipse, or use either IDE to clone this GitHub repository directly.
Refer to [CONTRIBUTING](./CONTRIBUTING.md) for build instructions.
### Compatibility
Nessie's Iceberg integration is compatible with the Iceberg and query engine versions listed in the following table:
| Nessie version | Iceberg version | Spark version (Scala 2.12+2.13) | Hive version | Flink version | Presto version | Trino version |
|----------------|-----------------|---------------------------------|--------------|------------------------|-------------------------------------|---------------|
| 0.103.3 | 1.5.0 | 3.3.x, 3.4.x, 3.5.x | n/a | 1.16.x, 1.17.x, 1.18.x | 0.277, 0.278.x, 0.279, 0.280, 0.281 | 419 |

### Distribution
To run:
1. Set the configuration in `servers/quarkus-server/src/main/resources/application.properties`.
2. Execute `./gradlew :nessie-quarkus:assemble && java -jar servers/quarkus-server/build/quarkus-app/quarkus-run.jar`.
3. Go to `http://localhost:19120`.

### UI
Nessie UI sources have moved to their own repository: https://github.com/projectnessie/nessie-ui.
### Docker image
Official Nessie images are built with support for [multiplatform builds](./tools/dockerbuild#readme). To quickly
build a Docker image for testing purposes, run the following commands:

```shell
./gradlew :nessie-quarkus:clean :nessie-quarkus:quarkusBuild
docker build -f ./tools/dockerbuild/docker/Dockerfile-server -t nessie-unstable:latest ./servers/quarkus-server
```

Check that your image is available locally:
```shell
docker images
```

You should see something like this:
```
REPOSITORY        TAG       IMAGE ID       CREATED          SIZE
nessie-unstable   latest    24bb4c7bd696   15 seconds ago   555MB
```

Once this is done, you can run your image with `docker run -p 19120:19120 nessie-unstable:latest`, passing the relevant
environment variables, if any. Environment variable names must follow MicroProfile Config's [mapping
rules](https://github.com/eclipse/microprofile-config/blob/master/spec/src/main/asciidoc/configsources.asciidoc#environment-variables-mapping-rules).

## Nessie related repositories
* [CEL Java](https://github.com/projectnessie/cel-java): Java port of the Common Expression Language
* [Nessie apprunner](https://github.com/projectnessie/nessie-apprunner): Maven and Gradle plugins to use Nessie in integration tests.

## Contributing
### Code Style
The Nessie project uses the Google Java Code Style, scalafmt and pep8.
See [CONTRIBUTING.md](./CONTRIBUTING.md) for more information.

## Acknowledgements
See [ACKNOWLEDGEMENTS.md](ACKNOWLEDGEMENTS.md)