https://github.com/gchq/stroom
Stroom is a highly scalable data storage, processing and analysis platform.
- Host: GitHub
- URL: https://github.com/gchq/stroom
- Owner: gchq
- License: apache-2.0
- Created: 2016-11-03T11:55:47.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2026-02-06T18:47:12.000Z (about 1 month ago)
- Last Synced: 2026-02-07T00:36:58.711Z (about 1 month ago)
- Topics: big-data, dashboards, data-analytics, enrichment, lucene, pipeline-processor, visualisation, xml, xslt
- Language: Java
- Homepage: https://gchq.github.io/stroom-docs/
- Size: 206 MB
- Stars: 465
- Watchers: 27
- Forks: 62
- Open Issues: 657
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- Security: SECURITY.md
- Roadmap: ROADMAP.md
- Notice: NOTICE.md
Awesome Lists containing this project
- awesome-java - Stroom
README
# Stroom
Stroom is a data processing, storage and analysis platform.
It is scalable: add more CPUs or servers for greater throughput.
It is suitable for processing high volume data such as system logs, to provide valuable insights into IT performance and usage.
Stroom provides a number of powerful capabilities:
* **Data ingest.** Receive and store large volumes of data such as native format logs.
Ingested data is always available in its raw form.
* **Data transformation pipelines.** Create sequences of XSL and text operations, in order to normalise or export data in any format.
It is possible to enrich data using lookups and reference data.
* **Integrated transformation development.** Easily add new data formats and debug the transformations if they don't work as expected.
* **Scalable Search.** Create multiple indexes with different retention periods.
These can be sharded across your cluster.
* **Dashboards.** Run queries against your indexes or statistics and view the results within custom visualisations.
* **Statistics.** Record counts or values of items over time, providing answers to questions such as "how many times has a specific machine provided data in the last hour/day/month?"
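
To make the statistics idea above concrete, here is a minimal sketch of counting events per machine per hour. It is illustrative only and is not how Stroom stores or computes statistics. The input format (`<epoch_seconds> <machine_name>` per line) and the function name `count_events_per_hour` are assumptions made for this example.

```bash
#!/usr/bin/env bash
# Illustrative sketch only: NOT Stroom's implementation or API.
# Hypothetical input, one event per line: "<epoch_seconds> <machine_name>"
# Output, one line per (hour, machine): "<hour_start_epoch> <machine_name> <count>"
count_events_per_hour() {
  awk '{
    bucket = $1 - ($1 % 3600)   # truncate the timestamp to the start of its hour
    counts[bucket " " $2]++     # one counter per (hour bucket, machine) pair
  }
  END {
    for (key in counts) print key, counts[key]
  }' | LC_ALL=C sort            # awk iteration order is unspecified, so sort the output
}

# Example usage:
#   printf '%s\n' '3600 host-a' '3700 host-a' '7300 host-a' '3650 host-b' \
#     | count_events_per_hour
```

Answering "how many times has a specific machine provided data in the last hour?" then amounts to reading the count for that machine in the most recent bucket.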

## Get Stroom
To run Stroom and Stroom-Proxy in Docker, do the following:
``` bash
# Download and extract Stroom v7.10 stack
bash <(curl -s https://gchq.github.io/stroom-resources/v7.10/get_stroom.sh)
# Navigate into the new stack directory
cd stroom_core_test/stroom_core_test*
# Start the stack
./start.sh
```
For more details on the commands above and any prerequisites, see [Single Node Docker Installation](https://gchq.github.io/stroom-docs/docs/install-guide/single-node-docker/).
For the releases of the core Stroom and Stroom-Proxy products, see [Stroom releases](https://github.com/gchq/stroom/releases).
For the releases of the Docker application stacks, see [Stroom-Resources releases](https://github.com/gchq/stroom-resources/releases).
## Documentation
The Stroom application spans several repositories but we've bundled all the documentation into one place.
See [Stroom Documentation](https://gchq.github.io/stroom-docs/).
## Contributing
Contributing to Stroom can come in the following forms:
* Raising issues for bugs or feature requests in Stroom.
* Making code contributions to fix bugs or add new features.
* Developing new Stroom content (see [stroom-content](https://github.com/gchq/stroom-content)).
* Adding to or updating the documentation.
Details on how to make any of these contributions are
in [CONTRIBUTING.md](https://github.com/gchq/stroom/blob/master/CONTRIBUTING.md).
To contribute to Stroom's documentation, see the [documentation contribution guide](https://gchq.github.io/stroom-docs/community/documentation/).
The documentation repository can be found at [github.com/gchq/stroom-docs](https://github.com/gchq/stroom-docs).
## GitHub Repositories
Stroom and its associated libraries, services and content span several GitHub repositories:
* [`stroom`](https://github.com/gchq/stroom) - The core Stroom application and proxy.
* [`stroom-clients`](https://github.com/gchq/stroom-clients) - Various client libraries for sending logs to Stroom.
* [`stroom-content`](https://github.com/gchq/stroom-content) - Packaged content packs for import into Stroom.
* [`stroom-docs`](https://github.com/gchq/stroom-docs) - Documentation for the Stroom family of products.
* [`stroom-headless`](https://github.com/gchq/stroom-headless) - An example of how to run Stroom in headless mode from
the command line.
* [`stroom-resources`](https://github.com/gchq/stroom-resources) - Configuration for orchestrating Stroom in Docker
containers, and released Docker stacks.
* [`stroom-visualisations-dev`](https://github.com/gchq/stroom-visualisations-dev) - A set of visualisations for use in
Stroom.
* [`event-logging-schema`](https://github.com/gchq/event-logging-schema) - An XML Schema for describing auditable
events.
* [`event-logging`](https://github.com/gchq/event-logging) - A JAXB API for the `event-logging` XML Schema.