{"id":13527142,"url":"https://github.com/NationalSecurityAgency/datawave","last_synced_at":"2025-04-01T09:31:09.292Z","repository":{"id":37547506,"uuid":"116999027","full_name":"NationalSecurityAgency/datawave","owner":"NationalSecurityAgency","description":"DataWave is an ingest/query framework that leverages Apache Accumulo to provide fast, secure data access.","archived":false,"fork":false,"pushed_at":"2024-10-29T12:13:07.000Z","size":101637,"stargazers_count":563,"open_issues_count":402,"forks_count":244,"subscribers_count":59,"default_branch":"integration","last_synced_at":"2024-10-29T14:39:29.483Z","etag":null,"topics":["accumulo","bigdata","java"],"latest_commit_sha":null,"homepage":"https://code.nsa.gov/datawave","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NationalSecurityAgency.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-01-10T19:05:58.000Z","updated_at":"2024-10-28T23:57:16.000Z","dependencies_parsed_at":"2023-12-15T19:23:24.425Z","dependency_job_id":"8b3d1b29-7ff3-4c74-870e-9f8a909d18d8","html_url":"https://github.com/NationalSecurityAgency/datawave","commit_stats":{"total_commits":2552,"total_committers":62,"mean_commits":41.16129032258065,"dds":0.7033699059561129,"last_synced_commit":"4be665cf49ea253ed5972abb047cd951ef3a0a64"},"previous_names":[],"tags_count":684,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NationalSecurityAgency%2Fdatawave","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/NationalSecurityAgency%2Fdatawave/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NationalSecurityAgency%2Fdatawave/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NationalSecurityAgency%2Fdatawave/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NationalSecurityAgency","download_url":"https://codeload.github.com/NationalSecurityAgency/datawave/tar.gz/refs/heads/integration","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246615976,"owners_count":20806038,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["accumulo","bigdata","java"],"created_at":"2024-08-01T06:01:42.026Z","updated_at":"2025-04-01T09:31:09.286Z","avatar_url":"https://github.com/NationalSecurityAgency.png","language":"Java","funding_links":[],"categories":["Java","Security Tools","大数据"],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n   \u003cimg src=\"datawave-readme.png\" /\u003e\n\u003c/p\u003e\n\n[![Apache License][li]][ll] ![Build Status](https://github.com/NationalSecurityAgency/datawave/actions/workflows/tests.yml/badge.svg)\n\nDataWave is a Java-based ingest and query framework that leverages [Apache Accumulo](http://accumulo.apache.org/) to provide fast, secure access to your data. 
DataWave supports a wide variety of use cases, including but not limited to...\n\n* Data fusion across structured and unstructured datasets\n* Construction and analysis of distributed graphs\n* Multi-tenant data architectures, with tenants having distinct security requirements and data access patterns\n* Fine-grained control over data access, integrated easily with existing user-authorization services and PKI\n\nThe easiest way to get started is the [DataWave Quickstart](https://code.nsa.gov/datawave/docs/quickstart)\n\nDocumentation is located [here](https://code.nsa.gov/datawave/docs/)\n\nBasic build instructions are [here](BUILDME.md)\n\n## How to Use this Repository\n\nThe microservices and associated utility projects are intended to be developed, versioned,\nand released independently.  The following subdirectories contain those independently\nversioned modules:\n\n```\ncore/utils/type-utils\ncontrib/datawave-utils\ncore/base-rest-responses\ncore/in-memory-accumulo\ncore/metrics-reporter\ncore/utils/accumulo-utils\ncore/utils/common-utils\ncore/utils/metadata-utils\nmicroservices/microservice-parent\nmicroservices/microservice-service-parent\nmicroservices/starters/audit\nmicroservices/starters/cache\nmicroservices/starters/cached-results\nmicroservices/starters/datawave\nmicroservices/starters/metadata\nmicroservices/starters/query\nmicroservices/starters/query-metric\nmicroservices/services/accumulo\nmicroservices/services/audit\nmicroservices/services/authorization\nmicroservices/services/config\nmicroservices/services/dictionary\nmicroservices/services/file-provider\nmicroservices/services/hazelcast\nmicroservices/services/map\nmicroservices/services/mapreduce-query\nmicroservices/services/modification\nmicroservices/services/query\nmicroservices/services/query-executor\nmicroservices/services/query-metric\n```\n\nEach of those subdirectories contains a .gitrepo file that keeps track of where the code came from.\n\n### Updating one of the datawave 
sub-repositories\nAt one point we used submodules to link in all of the sub-repositories.  We have now switched\nto including the submodules' code directly into the main datawave repository.  The git subrepo\nmechanism (https://github.com/ingydotnet/git-subrepo) was used to facilitate the transition.\nThat same mechanism can be used to pull in changes from the other repositories as needed until\nthey can be removed altogether.  The original cloning of the sub repositories was done using\nthe subrepo command as follows:\n```\ngit subrepo clone \u003crepo\u003e \u003cdir\u003e\n```\nIf changes need to be pulled in, then the following process can be used:\n```\ngit subrepo pull \u003cdir\u003e\n```\n### Building\n\nIt is recommended to build the project using multiple threads.  The following command does not build the starters, utilities, and services.\n```\nmvn -Pdocker,dist clean install -T 1C\n```\n\nIf you want to build the starters, util modules, and services as well, then try this:\n```\nmvn -Pdocker,dist -Dstarters -Dservices -Dutils clean install -T 1C\n```\nIf you want to build the service APIs but not the services themselves, then add -DonlyServiceApis\n\nNOTE: The util modules, starters, and services are actually tagged and deployed separately.\n  Hence the snapshot versions within those sub repos are not connected together.\n\n### DataWave Microservices\n\nFor more information about deploying the datawave quickstart and microservices, check out the [Docker Readme](docker/README.md#usage)\n\n[li]: http://img.shields.io/badge/license-ASL-blue.svg\n[ll]: https://www.apache.org/licenses/LICENSE-2.0\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FNationalSecurityAgency%2Fdatawave","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FNationalSecurityAgency%2Fdatawave","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FNationalSecurityAgency%2Fdatawave/lists"}