https://github.com/indeedeng/mph-table
Immutable key/value store with efficient space utilization and fast reads, ideal for tables built by batch processes and shipped to multiple servers.
- Host: GitHub
- URL: https://github.com/indeedeng/mph-table
- Owner: indeedeng
- License: apache-2.0
- Archived: true
- Created: 2017-12-20T15:26:10.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2023-07-11T18:58:19.000Z (over 2 years ago)
- Last Synced: 2025-11-16T17:02:21.786Z (2 months ago)
- Language: Java
- Homepage: http://engineering.indeedblog.com/blog/2018/02/indeed-mph/
- Size: 132 KB
- Stars: 100
- Watchers: 18
- Forks: 20
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- awesome-java - MPH Table
- awesome-jvm - mph-table - Minimal Perfect Hash Tables are an immutable key/value store with efficient space utilization and fast reads. (Memory and concurrency)
README
Indeed has decided to archive this project, as it is no longer supported nor used internally. Thank you for your understanding. We won't be accepting pull requests or responding to issues for this project anymore. If you are a current user and would like to take over support for this project, please create a fork.
# Minimal Perfect Hash Tables

## About
Minimal Perfect Hash Tables are an immutable key/value store with
efficient space utilization and fast reads. They are ideal for the
use-case of tables built by batch processes and shipped to multiple
servers.
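To make the term concrete, here is a toy sketch of a minimal perfect hash: a function that maps a fixed set of n keys onto the slots 0..n-1 with no collisions and no empty slots. It is only an illustration and is not the construction mph-table uses; the class name `ToyMinimalPerfectHash`, the brute-force seed search, and the bit-mixing constants are all assumptions made for this example. It expects a non-empty list of distinct keys and is practical only for very small sets.

```java
import java.util.Arrays;
import java.util.List;

/** Toy minimal perfect hash: brute-force a seed that maps n keys onto 0..n-1 without collisions. */
final class ToyMinimalPerfectHash {
    private final int seed;
    private final int size;

    private ToyMinimalPerfectHash(int seed, int size) {
        this.seed = seed;
        this.size = size;
    }

    /** Tries seeds in order until every key lands in its own slot. Keys must be distinct. */
    public static ToyMinimalPerfectHash build(List<String> keys) {
        final int n = keys.size();
        for (int seed = 0; seed < Integer.MAX_VALUE; seed++) {
            final boolean[] used = new boolean[n];
            boolean collision = false;
            for (String key : keys) {
                final int slot = slot(key, seed, n);
                if (used[slot]) {
                    collision = true;
                    break;
                }
                used[slot] = true;
            }
            if (!collision) {
                return new ToyMinimalPerfectHash(seed, n);
            }
        }
        throw new IllegalStateException("no collision-free seed found");
    }

    /** Mixes the key's hash code with the seed and reduces it to a slot in [0, n). */
    private static int slot(String key, int seed, int n) {
        int h = key.hashCode() ^ (seed * 0x9E3779B9);
        h ^= h >>> 16;
        h *= 0x85EBCA6B;
        h ^= h >>> 13;
        return Math.floorMod(h, n);
    }

    /** Slot for a key; only meaningful for keys that were passed to build(). */
    public int indexOf(String key) {
        return slot(key, seed, size);
    }

    public static void main(String[] args) {
        final List<String> keys = Arrays.asList("apple", "banana", "cherry", "date", "elderberry");
        final ToyMinimalPerfectHash mph = build(keys);
        // Values live in a dense array with exactly one slot per key.
        final long[] values = new long[keys.size()];
        for (String key : keys) {
            values[mph.indexOf(key)] = key.length();
        }
        System.out.println(values[mph.indexOf("cherry")]); // prints 6
    }
}
```

Because every key has its own slot, values can sit in a dense array with no empty buckets, which is where the space efficiency comes from. Note that a key outside the original set still hashes to some slot, so a lookup structure has to keep the keys (or some check on them) if it needs to detect absent keys.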
## Usage
Indeed MPH is available on [Maven Central](https://mvnrepository.com/artifact/com.indeed/mph-table).
Add the following dependency:
```xml
<dependency>
    <groupId>com.indeed</groupId>
    <artifactId>mph-table</artifactId>
    <version>1.0.4</version>
</dependency>
```
The primary interfaces are
[TableReader](src/main/java/com/indeed/mph/TableReader.java), to
construct a reader to an existing table,
[TableWriter](src/main/java/com/indeed/mph/TableWriter.java), to build
a table, and
[TableConfig](src/main/java/com/indeed/mph/TableConfig.java), to
specify the configuration for the writer.
How to write a table:
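```java
// Configure how keys and values are serialized.
final TableConfig<Long, Long> config = new TableConfig()
    .withKeySerializer(new SmartLongSerializer())
    .withValueSerializer(new SmartVLongSerializer());
// Collect the entries to store: each number mapped to its square.
final Set<Pair<Long, Long>> entries = new HashSet<>();
for (long i = 0; i < 20; ++i) {
    entries.add(new Pair<>(i, i * i));
}
// Write the table to "squares".
TableWriter.write(new File("squares"), config, entries);
```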
How to read a table:
```java
// try-with-resources closes the reader when the block exits.
try (final TableReader<Long, Long> reader = TableReader.open("squares")) {
    final Long value = reader.get(3L);          // get one
    for (final Pair<Long, Long> p : reader) {   // iterate over all
        ...
    }
}
```
## Command Line
In addition to the Java API, TableReader and TableWriter provide
convenience command-line interfaces to read and write tables, allowing
you to quickly get started without writing any code:

    # print all key-values in a table as TSV
    $ java com.indeed.mph.TableReader --dump

    # print the value for a single key
    $ java com.indeed.mph.TableReader --get

    # create a table from a TSV file of words with counts
    $ java com.indeed.mph.TableWriter --valueSerializer .SmartVLongSerializer

    # create a table from a TSV file mapping movie ids to lists of actor names (compressed by reference)
    $ java com.indeed.mph.TableWriter --keySerializer .SmartVLongSerializer --valueSerializer '.SmartListSerializer(.SmartDictionarySerializer)'

    # same as above, not actually storing the movie ids but still allowing retrieval by them
    $ java com.indeed.mph.TableWriter --keyStorage IMPLICIT --keySerializer .SmartVLongSerializer --valueSerializer '.SmartListSerializer(.SmartDictionarySerializer)'
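For the "words with counts" case above, the input is a plain TSV file. The following sketch shows one way such a file might be produced using only the Java standard library. It does not use mph-table itself; the class name `WordCountTsv`, the file name `words.tsv`, and the column layout (key in the first column, value in the second) are assumptions for illustration, so check the layout against what `TableWriter` actually expects.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that turns text into a "word<TAB>count" TSV file. */
final class WordCountTsv {
    /** Counts whitespace-separated words, keeping first-seen order. */
    public static Map<String, Long> countWords(String text) {
        final Map<String, Long> counts = new LinkedHashMap<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1L, Long::sum);
            }
        }
        return counts;
    }

    /** Writes one line per entry: the word, a tab, then its count. */
    public static void writeTsv(Map<String, Long> counts, Path out) throws IOException {
        try (PrintWriter writer = new PrintWriter(Files.newBufferedWriter(out, StandardCharsets.UTF_8))) {
            counts.forEach((word, count) -> writer.print(word + "\t" + count + "\n"));
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed output file name; pass whatever path your TableWriter invocation reads.
        writeTsv(countWords("to be or not to be"), Paths.get("words.tsv"));
    }
}
```

Running `main` writes four lines to `words.tsv`, one per distinct word, with `to` and `be` each counted twice.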
## Code of Conduct
This project is governed by the [Contributor Covenant v 1.4.1](CODE_OF_CONDUCT.md)
## License
This project is licensed under the Apache-2.0 License - see the [LICENSE](LICENSE) file for details.