{"id":19840722,"url":"https://github.com/projectnessie/nessie-demos","last_synced_at":"2025-05-12T14:54:32.479Z","repository":{"id":37046405,"uuid":"363992758","full_name":"projectnessie/nessie-demos","owner":"projectnessie","description":"Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake.","archived":false,"fork":false,"pushed_at":"2025-05-02T11:21:40.000Z","size":832,"stargazers_count":29,"open_issues_count":9,"forks_count":22,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-05-02T12:32:05.013Z","etag":null,"topics":["binder","iceberg","jupyter-notebooks","nessie","spark"],"latest_commit_sha":null,"homepage":"https://projectnessie.org/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/projectnessie.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-05-03T16:29:35.000Z","updated_at":"2025-04-20T01:04:43.000Z","dependencies_parsed_at":"2024-01-12T17:27:48.498Z","dependency_job_id":"d8c2f203-39d1-41bc-a2c3-f410419341e8","html_url":"https://github.com/projectnessie/nessie-demos","commit_stats":{"total_commits":406,"total_committers":11,"mean_commits":36.90909090909091,"dds":"0.46059113300492616","last_synced_commit":"64598e0f2c7d09063981a842ed61fb497cde2df9"},"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/projectnessie%2Fnessie-demos","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/rep
ositories/projectnessie%2Fnessie-demos/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/projectnessie%2Fnessie-demos/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/projectnessie%2Fnessie-demos/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/projectnessie","download_url":"https://codeload.github.com/projectnessie/nessie-demos/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253759469,"owners_count":21959772,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["binder","iceberg","jupyter-notebooks","nessie","spark"],"created_at":"2024-11-12T12:27:57.429Z","updated_at":"2025-05-12T14:54:32.426Z","avatar_url":"https://github.com/projectnessie.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Nessie Binder Demos\n\nThese demos run under binder and can be found at:\n\n* [Spark and Iceberg](https://mybinder.org/v2/gh/projectnessie/nessie-demos/main?labpath=notebooks%2Fnessie-iceberg-demo-nba.ipynb)\n* [Flink and Iceberg](https://mybinder.org/v2/gh/projectnessie/nessie-demos/main?labpath=notebooks%2Fnessie-iceberg-flink-demo-nba.ipynb)\n* [Hive and Iceberg](https://mybinder.org/v2/gh/projectnessie/nessie-demos/main?labpath=notebooks%2Fnessie-iceberg-hive-demo-nba.ipynb)\n\nThey are automatically rebuilt every time we push to main. 
They are unit tested using the `testbook` library to ensure we get\nthe correct results as the underlying libraries continue to grow/mature.\n\n\n## Upgrade instructions\n\nBecause of the split between Binder and the unit tests, it was not trivial to create a single place to update all versions.\nSome versions have to be updated in multiple places:\n\n### Nessie\n\nThe Nessie version is set in Binder at `docker/binder/requirements_base.txt`. Currently, the demos use Nessie 0.74.x.\n\n### Iceberg\n\nCurrently we are using Iceberg `1.4.2`, which is specified in both Iceberg notebooks as well as `docker/utils/__init__.py`.\n\n### Spark\n\nThe Spark version only has to be updated in `docker/binder/requirements.txt`. Currently, Iceberg supports Spark 3.2, 3.3, 3.4 and 3.5; we use Spark 3.2 in the demos.\n\n### Flink\n\nThe Flink version is set in Binder at `docker/binder/requirements_flink.txt`. Currently, we are using `1.17.1`.\n\n### Hadoop\n\nHadoop libs are used by Flink and are currently specified in `docker/utils/__init__.py` only. We use `2.10.1` with Flink and Hive.\n\n### Hive\n\nThe Hive version currently in use is `2.3.9`, which supports Hadoop `2.10.1`. To update the version, it only needs to be updated\nin `docker/utils/__init__.py`.\n\n## Binder\n\n[Binder](https://mybinder.org) is a more customizable platform for Jupyter notebooks and\nmore (see their website). Binder generates a Dockerfile + image based on the settings in the\nsource GitHub repository (other sources are possible). It is possible to pre-install\ne.g. 
Ubuntu and/or Python packages into the Docker image generated by Binder.\n\nOf course, Binder just lets a user \"simply start\" a notebook via a simple \"click on a link\".\n\n\n## Development\nFor development, you will need to have the following installed:\n- Python 3.10+\n- pre-commit\n\nRegarding pre-commit, make sure to run `pre-commit install` to install the hooks locally, since this repo\nexecutes several scripts in the pre-commit stage.\n\nTo run the notebook unit tests, run the following commands in the `notebook` folder:\n1. `python -m pip install -r requirements_dev.txt`\n2. `tox`\n\nRunning the unit tests takes time, since it first needs to download all the binary files (Hive, Flink, etc.) and then\nruns the tests.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprojectnessie%2Fnessie-demos","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprojectnessie%2Fnessie-demos","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprojectnessie%2Fnessie-demos/lists"}