{"id":13415319,"url":"https://github.com/eBay/akutan","last_synced_at":"2025-03-14T22:33:20.608Z","repository":{"id":40643597,"uuid":"179586596","full_name":"eBay/akutan","owner":"eBay","description":"A distributed knowledge graph store","archived":true,"fork":false,"pushed_at":"2019-07-18T23:39:49.000Z","size":3364,"stargazers_count":1655,"open_issues_count":19,"forks_count":108,"subscribers_count":68,"default_branch":"master","last_synced_at":"2024-07-31T21:53:39.160Z","etag":null,"topics":["go","graph","rdf","sparql"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/eBay.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-04-04T22:32:19.000Z","updated_at":"2024-07-25T15:12:20.000Z","dependencies_parsed_at":"2022-08-09T23:50:49.959Z","dependency_job_id":null,"html_url":"https://github.com/eBay/akutan","commit_stats":null,"previous_names":["ebay/beam"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eBay%2Fakutan","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eBay%2Fakutan/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eBay%2Fakutan/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eBay%2Fakutan/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/eBay","download_url":"https://codeload.github.com/eBay/akutan/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243658057,"owners_count":20326459,"icon
_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["go","graph","rdf","sparql"],"created_at":"2024-07-30T21:00:47.064Z","updated_at":"2025-03-14T22:33:20.602Z","avatar_url":"https://github.com/eBay.png","language":"Go","funding_links":[],"categories":["Go","Repositories","Tools","Databases","Libraries, Softwares and Tools","Graph databases"],"sub_categories":["LPG","Data Cube extensions","Knowledge Graph Database","Triple stores"],"readme":"# Akutan\n\n[![Build Status](https://travis-ci.com/eBay/akutan.svg?branch=master)](https://travis-ci.com/eBay/akutan)\n[![GoDoc](https://godoc.org/github.com/ebay/akutan/src/github.com/ebay/akutan?status.svg)](https://godoc.org/github.com/ebay/akutan/src/github.com/ebay/akutan)\n\nThere's a blog post that's a [good introduction to Akutan](https://www.ebayinc.com/stories/blogs/tech/beam-a-distributed-knowledge-graph-store/).\n\nAkutan is a distributed knowledge graph store, sometimes called an RDF store or a\ntriple store. Knowledge graphs are suitable for modeling data that is highly\ninterconnected by many types of relationships, like encyclopedic information\nabout the world. A knowledge graph store enables rich queries on its data, which\ncan be used to power real-time interfaces, to complement machine learning\napplications, and to make sense of new, unstructured information in the context\nof the existing knowledge.\n\nHow to model your data as a knowledge graph and how to query it will feel a bit\ndifferent for people coming from SQL, NoSQL, and property graph stores. 
In a\nknowledge graph, data is represented as a single table of *facts*, where each\nfact has a *subject*, *predicate*, and *object*. This representation enables the\nstore to sift through the data for complex queries and to apply inference rules\nthat raise the level of abstraction. Here's an example of a tiny graph:\n\nsubject         | predicate | object\n----------------|-----------|-----------------\n`\u003cJohn_Scalzi\u003e` | `\u003cborn\u003e`  | `\u003cFairfield\u003e`\n`\u003cJohn_Scalzi\u003e` | `\u003clives\u003e` | `\u003cBradford\u003e`\n`\u003cJohn_Scalzi\u003e` | `\u003cwrote\u003e` | `\u003cOld_Mans_War\u003e`\n\nTo learn about how to represent and query data in Akutan, see\n[docs/query.md](docs/query.md).\n\nAkutan is designed to store large graphs that cannot fit on a single server. It's\nscalable in how much data it can store and the rate of queries it can execute.\nHowever, Akutan serializes all changes to the graph through a central log, which\nfundamentally limits the total rate of change. The rate of change won't improve\nwith a larger number of servers, but a typical deployment should be able to\nhandle tens of thousands of changes per second. In exchange for this limitation,\nAkutan's architecture is a relatively simple one that enables many features. For\nexample, Akutan supports transactional updates and historical global snapshots. We\nbelieve this trade-off is suitable for most knowledge graph use cases, which\naccumulate large amounts of data but do so at a modest pace. To learn more about\nAkutan's architecture and this trade-off, see\n[docs/central_log_arch.md](docs/central_log_arch.md).\n\nAkutan isn't ready for production-critical deployments, but it's useful today for\nsome use cases. We've run a 20-server deployment of Akutan for development\npurposes and off-line use cases for about a year, which we've most commonly\nloaded with a dataset of about 2.5 billion facts. 
We believe Akutan's current\ncapabilities exceed this capacity and scale; we haven't yet pushed Akutan to its\nlimits. The project has a good architectural foundation on which additional\nfeatures can be built and higher performance could be achieved.\n\nAkutan needs more love before it can be used for production-critical deployments.\nMuch of Akutan's code consists of high-quality, documented, unit-tested modules,\nbut some areas of the code base are inherited from Akutan's earlier prototype days\nand still need attention. In other places, some functionality is lacking before\nAkutan could be used as a critical production data store, including deletion of\nfacts, backup/restore, and automated cluster management. We have filed\nGitHub issues for these and a few other things. There are also areas where Akutan\ncould be improved that wouldn't necessarily block production usage. For example,\nAkutan's query language is not quite compatible with SPARQL, and its inference\nengine is limited.\n\nSo, Akutan has a nice foundation and may be useful to some people, but it also\nneeds additional love. If that's not for you, here are a few alternative\nopen-source knowledge and property graph stores that you may want to consider\n(we have no affiliation with these projects):\n\n- [Blazegraph](https://github.com/blazegraph/database): an RDF store. Supports\n  several query languages, including SPARQL and Gremlin. Disk-based,\n  single-master, scales out for reads only. Seems unmaintained. Powers\n  \u003chttps://query.wikidata.org/\u003e.\n- [Dgraph](https://github.com/dgraph-io/dgraph): a triple-oriented property\n  graph store. GraphQL-like query language, no support for SPARQL. Disk-based,\n  scales out.\n- [Neo4j](https://github.com/neo4j/neo4j): a property graph store. Cypher query\n  language, no support for SPARQL. 
Single-master, scales out for reads only.\n- See also Wikipedia's\n  [Comparison of Triplestores](https://en.wikipedia.org/wiki/Comparison_of_triplestores)\n  page.\n\nThe remainder of this README describes how to get Akutan up and running. Several\ndocuments under the `docs/` directory describe aspects of Akutan in more\ndetail; see [docs/README.md](docs/README.md) for an overview.\n\n## Installing dependencies and building Akutan\n\nAkutan has the following system dependencies:\n - It's written in [Go](https://golang.org/). You'll need v1.11.5 or newer.\n - Akutan uses [Protocol Buffers](https://developers.google.com/protocol-buffers/)\n   extensively to encode messages for [gRPC](https://grpc.io/), the log of data\n   changes, and storage on disk. You'll need protobuf version 3. We recommend\n   3.5.2 or later. Note that 3.0.x is the default in many Linux distributions, but\n   doesn't work with the Akutan build.\n - Akutan's Disk Views store their facts in [RocksDB](https://rocksdb.org/).\n\nOn Mac OS X, these can all be installed via [Homebrew](https://brew.sh/):\n\n\t$ brew install golang protobuf rocksdb zstd\n\nOn Ubuntu, refer to the files within the [docker/](docker/) directory for\npackage names to use with `apt-get`.\n\nAfter cloning the Akutan repository, pull down several Go libraries and additional\nGo tools:\n\n\t$ make get\n\nFinally, build the project:\n\n\t$ make build\n\n## Running Akutan locally\n\nThe fastest way to run Akutan locally is to launch the in-memory log store:\n\n\t$ bin/plank\n\nThen open another terminal and run:\n\n\t$ make run\n\nThis will bring up several Akutan servers locally. It starts an API server that\nlistens on localhost for gRPC requests on port 9987 and for HTTP requests on\nport 9988, such as \u003chttp://localhost:9988/stats.txt\u003e.\n\nThe easiest way to interact with the API server is using `bin/akutan-client`. See\n[docs/query.md](docs/query.md) for examples. 
The API server exposes the\n`FactStore` gRPC service defined in\n[proto/api/akutan_api.proto](proto/api/akutan_api.proto).\n\n## Deployment concerns\n\n### The log\n\nEarlier, we used `bin/plank` as a log store, but this is unsuitable for real\nusage! Plank is in-memory only, isn't replicated, and by default, it only\nkeeps 1000 entries at a time. It's only meant for development.\n\nAkutan also supports using [Apache Kafka](https://kafka.apache.org/) as its log\nstore. This is recommended over Plank for any deployment. To use Kafka, follow the\n[Kafka quick start](https://kafka.apache.org/quickstart) guide to install\nKafka, start ZooKeeper, and start Kafka. Then create a topic called \"akutan\"\n(not \"test\" as in the Kafka guide) with `partitions` set to 1. You'll want to\nconfigure Kafka to synchronously write entries to disk.\n\nTo use Kafka with Akutan, set the `akutanLog`'s `type` to `kafka` in your Akutan\nconfiguration (default: `local/config.json`), and update the `locator`'s\n`addresses` accordingly (Kafka uses port 9092 by default). You'll need to clear\nout Akutan's Disk Views' data before restarting the cluster. The Disk Views\nby default store their data in $TMPDIR/rocksdb-akutan-diskview-{space}-{partition}\nso you can delete them all with `rm -rf $TMPDIR/rocksdb-akutan-diskview*`.\n\n### Docker and Kubernetes\n\nThis repository includes support for running Akutan inside\n[Docker](https://www.docker.com/) and\n[Minikube](https://kubernetes.io/docs/setup/minikube/). These environments can\nbe tedious for development purposes, but they're useful as a step towards a\nmodern and robust production deployment.\n\nSee the `cluster/k8s/Minikube.md` file for the steps to build and deploy Akutan\nservices in `Minikube`. It also includes the steps to build the Docker images.\n\n### Distributed tracing\n\nAkutan generates distributed [OpenTracing](https://opentracing.io/) traces for use\nwith [Jaeger](https://www.jaegertracing.io/). 
To try it, follow the\n[Jaeger Getting Started Guide](https://www.jaegertracing.io/docs/getting-started/#all-in-one-docker-image)\nfor running the all-in-one Docker image. The default `make run` is configured to\nsend traces there, which you can query at \u003chttp://localhost:16686\u003e. The Minikube\ncluster also includes a Jaeger all-in-one instance.\n\n## Development\n\n### VS Code\n\nYou can use whichever editor you'd like, but this repository contains some\nconfiguration for [VS Code](https://code.visualstudio.com/Download). We\nsuggest the following extensions:\n - [Go](https://marketplace.visualstudio.com/items?itemName=ms-vscode.Go)\n - [Code Spell Checker](https://marketplace.visualstudio.com/items?itemName=streetsidesoftware.code-spell-checker)\n - [Rewrap](https://marketplace.visualstudio.com/items?itemName=stkb.rewrap)\n - [vscode-proto3](https://marketplace.visualstudio.com/items?itemName=zxh404.vscode-proto3)\n - [Docker](https://marketplace.visualstudio.com/items?itemName=PeterJausovec.vscode-docker)\n\nOverride the default settings in `.vscode/settings.json` with\n[./vscode-settings.json5](./vscode-settings.json5).\n\n### Test targets\n\nThe `Makefile` contains various targets related to running tests:\n\nTarget       | Description\n------------ | -----------\n`make test`  | run all the akutan unit tests\n`make cover` | run all the akutan unit tests and open the web-based coverage viewer\n`make lint`  | run basic code linting\n`make vet`   | run all static analysis tests including linting and formatting\n\n## License Information\n\nCopyright 2019 eBay Inc.\n\nPrimary authors: Simon Fell, Diego Ongaro, Raymond Kroeker, Sathish Kandasamy\n\nLicensed under the Apache License, Version 2.0 (the \"License\"); you may not use\nthis file except in compliance with the License. 
You may obtain a copy of the\nLicense at \u003chttps://www.apache.org/licenses/LICENSE-2.0\u003e.\n\nUnless required by applicable law or agreed to in writing, software distributed\nunder the License is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR\nCONDITIONS OF ANY KIND, either express or implied. See the License for the\nspecific language governing permissions and limitations under the License.\n\n\n----\n**Note** the project was renamed to Akutan in July 2019.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FeBay%2Fakutan","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FeBay%2Fakutan","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FeBay%2Fakutan/lists"}