https://github.com/baidu/tera

An Internet-Scale Database.

baidu bigtable c-plus-plus data database hbase nosql storage

Host: GitHub
URL: https://github.com/baidu/tera
Owner: baidu
License: bsd-3-clause
Created: 2014-03-26T07:22:05.000Z (almost 12 years ago)
Default Branch: master
Last Pushed: 2024-06-05T12:31:05.000Z (over 1 year ago)
Last Synced: 2025-04-11T15:57:06.568Z (9 months ago)
Topics: baidu, bigtable, c-plus-plus, data, database, hbase, nosql, storage
Language: C++
Homepage:
Size: 15.7 MB
Stars: 1,900
Watchers: 183
Forks: 435
Open Issues: 172
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-bigdata - Baidu Tera - an Internet-scale database, inspired by BigTable. (Key Map Data Model)
awesome-bigtable - Tera - High performance distributed NoSQL database. (Cool Stuff / Inspired by Bigtable)
fucking-awesome-bigdata - Baidu Tera - an Internet-scale database, inspired by BigTable. (Key Map Data Model)
A-curated-list-of-awesome-big-data-frameworks-ressources-and-other-awesomeness.- - Baidu Tera - an Internet-scale database, inspired by BigTable. (Key Map Data Model)
data-engineering-collection - Baidu Tera - an Internet-scale database, inspired by BigTable. (Key Map Data Model)

README

# Tera - An Internet-Scale Database

[![Build Status](https://travis-ci.org/baidu/tera.svg?branch=master)](https://travis-ci.org/baidu/tera)

[![Coverity Scan Build Status](https://scan.coverity.com/projects/10959/badge.svg)](https://scan.coverity.com/projects/tera)

[![Documentation Status](https://img.shields.io/badge/中文文档-最新-brightgreen.svg)](readme-cn.md)

Copyright 2015, Baidu, Inc.

Tera is a high-performance distributed NoSQL database, inspired by Google's [BigTable](http://static.googleusercontent.com/media/research.google.com/zh-CN//archive/bigtable-osdi06.pdf) and designed for real-time applications. Tera can easily scale to __petabytes__ of data across __thousands__ of commodity servers. It is widely used in Baidu products with varied demands, ranging from throughput-oriented applications to latency-sensitive services, including web indexing, WebPage DB, and LinkBase DB. ([Chinese version](readme-cn.md))

## Features

* Linear and modular scalability

* Automatic and configurable sharding

* Ranged and hashed sharding strategies

* MVCC

* Column-oriented storage and locality group support

* Strictly consistent

* Automatic failover support

* Online schema change

* Snapshot support

* RAMDISK/SSD/DFS tiered cache support

* Block cache and Bloom Filters for real-time queries

* Multi-type table support (RAMDISK/SSD/DISK table)

* Easy-to-use [C++](doc/en/sdk_guide.md)/[Java](doc/en/sdk_guide_java.md)/[Python](doc/en/sdk_guide_python.md)/[RESTful](doc/en/sdk_guide_http.md) APIs

## Data model

A Tera cluster is a collection of many sparse, distributed, multidimensional tables. Each table is indexed by a row key, a column key, and a timestamp; each value in the table is an uninterpreted array of bytes.

* (row:string, (column family+qualifier):string, time:int64) → string

To learn more about the schema, you can refer to [BigTable](http://static.googleusercontent.com/media/research.google.com/zh-CN//archive/bigtable-osdi06.pdf).
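The mapping above can be pictured as a sorted, versioned map. The following is a minimal in-memory sketch of that idea only; `SparseTable`, `Put`, and `Get` are hypothetical names chosen for illustration and are not Tera's SDK. It assumes that a read returns the newest version at or before a requested timestamp.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <limits>
#include <map>
#include <string>
#include <utility>

// Hypothetical in-memory model of one table. This is NOT Tera's API;
// it only illustrates (row, column, timestamp) -> value.
class SparseTable {
 public:
  // Store one version of a cell. Values are uninterpreted bytes.
  void Put(const std::string& row, const std::string& column,
           int64_t timestamp, const std::string& value) {
    cells_[Key(row, column)][timestamp] = value;
  }

  // Copy the newest version with timestamp <= max_timestamp into *value.
  // Returns false if the cell has no such version.
  bool Get(const std::string& row, const std::string& column,
           int64_t max_timestamp, std::string* value) const {
    assert(value != nullptr);
    std::map<Key, Versions>::const_iterator cell =
        cells_.find(Key(row, column));
    if (cell == cells_.end()) return false;
    // Versions are sorted newest first, so lower_bound yields the first
    // timestamp that is <= max_timestamp.
    Versions::const_iterator version = cell->second.lower_bound(max_timestamp);
    if (version == cell->second.end()) return false;
    *value = version->second;
    return true;
  }

  // Convenience overload: read the latest version of a cell.
  bool Get(const std::string& row, const std::string& column,
           std::string* value) const {
    return Get(row, column, std::numeric_limits<int64_t>::max(), value);
  }

 private:
  typedef std::pair<std::string, std::string> Key;  // (row, column)
  typedef std::map<int64_t, std::string, std::greater<int64_t> > Versions;
  std::map<Key, Versions> cells_;  // sparse: only written cells exist
};
```

Because only written cells occupy space, rows may carry entirely different column sets, which is what "sparse" means here.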

## Architecture

![Architecture diagram](resources/images/arch.png)

Tera has three major components: the SDK, the master, and tablet servers.

- __SDK__: a library that is linked into every application client to access a Tera cluster.

- __Master__: the master is responsible for managing tablet servers and tablets, automatic load balancing, and garbage collection of files in the filesystem.

- __Tablet Server__: the tablet server is the core module in Tera, and it uses an __enhanced__ [LevelDB](https://github.com/google/leveldb) as its basic storage engine. A tablet server manages a set of tablets, handles read/write/scan requests, and schedules tablet splits and merges online.
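To make the tablet concept concrete, here is a minimal sketch of range-based routing, assuming each tablet owns a contiguous row-key range that starts at its start key and ends just before the next tablet's start key. `TabletLocator`, its methods, and the server addresses are hypothetical and do not reflect Tera's actual meta table layout or SDK.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical routing table from row-key ranges to tablet servers.
// Names and layout are illustrative assumptions, not Tera's API.
class TabletLocator {
 public:
  // Register a tablet serving keys in [start_key, next start key).
  // The first tablet of a table uses the empty string as its start key.
  // Adding a start key inside an existing range models a tablet split.
  void AddTablet(const std::string& start_key, const std::string& server) {
    tablets_[start_key] = server;
  }

  // Return the server whose key range covers row_key.
  const std::string& Locate(const std::string& row_key) const {
    // upper_bound finds the first start key strictly greater than row_key;
    // the tablet just before it is the one that covers row_key.
    std::map<std::string, std::string>::const_iterator it =
        tablets_.upper_bound(row_key);
    assert(it != tablets_.begin() && "no tablet covers this key");
    --it;
    return it->second;
  }

 private:
  std::map<std::string, std::string> tablets_;  // start key -> server
};
```

With ordered start keys, a lookup costs O(log n) in the number of tablets, and a split only adds one entry without touching the other ranges.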

## Building blocks

Tera is built on several pieces of open source infrastructure.

- __Filesystem__ (required)

    Tera uses a distributed file system to store transaction logs and data files. To adapt to different file system implementations (e.g., [BFS](https://github.com/baidu/bfs), HDFS, HDFS2, POSIX filesystems), Tera accesses storage through an abstract file system interface called Env.

- __Distributed lock service__ (required)

    Tera relies on a highly available, persistent distributed lock service for a variety of tasks: ensuring that there is at most one active master at any time, storing the meta table's location, discovering new tablet servers, and finalizing tablet server deaths. Tera uses an adapter class to support different lock service implementations (e.g., ZooKeeper, [Nexus](https://github.com/baidu/ins)).

- __High performance RPC framework__ (required)

    Tera is designed to handle a variety of demanding workloads, ranging from throughput-oriented applications to latency-sensitive services, so it needs a high-performance network programming framework. Tera currently relies heavily on [Sofa-pbrpc](https://github.com/baidu/sofa-pbrpc/) to meet this demand.

- __Cluster management system__ (not necessary)

    A Tera cluster at Baidu typically operates in a shared pool of machines that runs a wide variety of other distributed applications. Tera can therefore be deployed on the cluster management system [Galaxy](https://github.com/baidu/galaxy), which is used for scheduling jobs, managing resources on shared machines, dealing with machine failures, and monitoring machine status. Tera can also be deployed on raw machines or in Docker containers.
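The filesystem item above describes an abstract interface that hides the concrete storage backend. The sketch below illustrates that adapter pattern in a heavily simplified form. `SimpleEnv`, `MemEnv`, and `AppendLogRecord` are hypothetical names invented for this example; the real Env interface is richer and is not reproduced here.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, heavily simplified stand-in for an Env-style interface.
// It is NOT Tera's or LevelDB's actual Env class.
class SimpleEnv {
 public:
  virtual ~SimpleEnv() {}
  // Replace the contents of path with data. Returns true on success.
  virtual bool WriteFile(const std::string& path, const std::string& data) = 0;
  // Copy the contents of path into *data. Returns false if path is missing,
  // in which case *data is left unchanged.
  virtual bool ReadFile(const std::string& path, std::string* data) const = 0;
};

// In-memory backend. A backend for a distributed file system would
// implement the same two methods against its own client library.
class MemEnv : public SimpleEnv {
 public:
  bool WriteFile(const std::string& path, const std::string& data) override {
    files_[path] = data;
    return true;
  }
  bool ReadFile(const std::string& path, std::string* data) const override {
    assert(data != nullptr);
    std::map<std::string, std::string>::const_iterator it = files_.find(path);
    if (it == files_.end()) return false;
    *data = it->second;
    return true;
  }

 private:
  std::map<std::string, std::string> files_;
};

// Storage code depends only on the interface, so the backend can be
// swapped without changing this function.
bool AppendLogRecord(SimpleEnv* env, const std::string& log_path,
                     const std::string& record) {
  std::string existing;
  env->ReadFile(log_path, &existing);  // a missing file leaves it empty
  return env->WriteFile(log_path, existing + record);
}
```

The design choice is that callers such as `AppendLogRecord` never name a concrete backend, which is what lets one storage engine run on several file systems.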

## Documents

* [Developer Doc](doc/en/README.md)

## Quick start

* __How to build__

    Use sh [./build.sh](BUILD) to build Tera.

* __How to deploy__

    [Pseudo Distributed Mode](doc/en/onebox.md)

    [Build on Docker](example/docker)

* __How to access__

    [teracli](doc/en/teracli.md)

    [API](doc/en/sdk_guide.md)

## Contributing to Tera

Contributions are welcome and greatly appreciated.

Read the [Roadmap](doc/en/roadmap.md) for an overview of our development plan.

See [Contributions](doc/en/contributor.md) for more details.

## Follow us

To join us, please send your resume to tera-user at baidu.com.
