https://github.com/borys25ol/elasticsearch-reindex
Tool for migrating Elasticsearch indexes between different nodes.
cli elasticsearch migration-tool python reindex
- Host: GitHub
- URL: https://github.com/borys25ol/elasticsearch-reindex
- Owner: borys25ol
- License: mit
- Created: 2021-08-03T13:00:54.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2024-11-19T11:31:50.000Z (over 1 year ago)
- Last Synced: 2025-09-01T01:15:26.698Z (10 months ago)
- Topics: cli, elasticsearch, migration-tool, python, reindex
- Language: Python
- Homepage:
- Size: 105 KB
- Stars: 7
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Elasticsearch Reindex
====================
## Description
`elasticsearch-reindex` is a CLI tool for transferring Elasticsearch indexes between different servers.
## Installing
Install the package using pip:
```bash
pip install elasticsearch-reindex
```
Usage
-----
## Configuration
The source Elasticsearch host must be whitelisted on the destination host.
To do this, edit the `elasticsearch.yml` configuration file on the destination Elasticsearch server.
#### Path to config file:
```shell
/etc/elasticsearch/elasticsearch.yml
```
Add the following line to the file:
```shell
reindex.remote.whitelist: <es-source-host>:<es-source-port>
```
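The whitelist entry takes a `host:port` pair without the URL scheme, whereas the CLI expects a full URL. A minimal sketch for deriving one from the other, assuming the source URL always carries an explicit port; `whitelist_entry` is a hypothetical helper and not part of `elasticsearch-reindex`, and port `9200` is only an example value:

```python
from urllib.parse import urlsplit


def whitelist_entry(source_url: str) -> str:
    """Return the host:port form of a source URL for reindex.remote.whitelist."""
    parts = urlsplit(source_url)
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"expected scheme://host:port, got {source_url!r}")
    return f"{parts.hostname}:{parts.port}"


# Hypothetical source URL; substitute the real source host and port.
print(f"reindex.remote.whitelist: {whitelist_entry('http://es-source-host:9200')}")
```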
## Running the Tool
Use the CLI to migrate data between Elasticsearch instances:
```shell
elasticsearch_reindex \
--source_host http(s)://es-source-host:es-source-port \
--source_http_auth username:password \
--dest_host http(s)://es-dest-host:es-dest-port \
--dest_http_auth username:password \
--check_interval 5 \
--concurrent_tasks 3 \
-i test_index_1 -i test_index_2
```
Also, there is a command alias `elasticsearch-reindex`:
```shell
elasticsearch-reindex ...
```
### CLI Parameters
Required fields:
* `source_host` - Elasticsearch endpoint from which data will be extracted.
* `dest_host` - Elasticsearch endpoint to which data will be transferred.
Optional fields:
* `source_http_auth` - HTTP Basic authentication, username and password.
* `dest_http_auth` - HTTP Basic authentication, username and password.
* `check_interval` - Interval (in seconds) between checks of the task success status.
`Default value` - `10` (seconds)
* `concurrent_tasks` - Number of reindex tasks Elasticsearch will process in parallel.
`Default value` - `1` (sync mode)
* `indexes` - List of specific indexes to migrate. If omitted, all source indexes are migrated.
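When the CLI is driven from another script, the parameters above can be assembled into an argument list. The sketch below only uses the flags shown in the command example; `build_command` is a hypothetical helper, not something the package provides, and its defaults mirror the documented ones:

```python
from typing import List, Optional, Sequence


def build_command(
    source_host: str,
    dest_host: str,
    source_http_auth: Optional[str] = None,
    dest_http_auth: Optional[str] = None,
    check_interval: int = 10,
    concurrent_tasks: int = 1,
    indexes: Sequence[str] = (),
) -> List[str]:
    """Assemble an argv list for the elasticsearch_reindex CLI."""
    argv = [
        "elasticsearch_reindex",
        "--source_host", source_host,
        "--dest_host", dest_host,
        "--check_interval", str(check_interval),
        "--concurrent_tasks", str(concurrent_tasks),
    ]
    # Credentials are optional and use the "username:password" form.
    if source_http_auth:
        argv += ["--source_http_auth", source_http_auth]
    if dest_http_auth:
        argv += ["--dest_http_auth", dest_http_auth]
    for index in indexes:
        argv += ["-i", index]  # -i is repeated once per index
    return argv


command = build_command(
    "http://localhost:9201",
    "http://localhost:9202",
    indexes=["test_index_1", "test_index_2"],
)
print(" ".join(command))
```

With the package installed, the resulting list could be handed to `subprocess.run(command, check=True)`.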
### Run library from Python script:
```python
from elasticsearch_reindex import ReindexManager


def main() -> None:
    """
    Example reindex function.
    """
    dict_config = {
        "source_host": "http://localhost:9201",
        "dest_host": "http://localhost:9202",
        "check_interval": 20,
        "concurrent_tasks": 5,
    }
    reindex_manager = ReindexManager.from_dict(data=dict_config)
    reindex_manager.start_reindex()


if __name__ == "__main__":
    main()
```
With a custom list of indexes and optional HTTP Basic authentication:
```python
from elasticsearch_reindex import ReindexManager


def main() -> None:
    """
    Example reindex function with HTTP Basic authentication.
    """
    dict_config = {
        "source_host": "http://localhost:9201",
        # If the source host requires authentication
        # "source_http_auth": "tmp-source-user:tmp-source-PASSWD.220718",
        "dest_host": "http://localhost:9202",
        # If the destination host requires authentication
        # "dest_http_auth": "tmp-reindex-user:tmp--PASSWD.220718",
        "check_interval": 20,
        "concurrent_tasks": 5,
        "indexes": ["es-index-1", "es-index-2", "es-index-n"],
    }
    reindex_manager = ReindexManager.from_dict(data=dict_config)
    reindex_manager.start_reindex()


if __name__ == "__main__":
    main()
```
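To keep credentials out of the script itself, the same dict can be built from environment variables. This is only a sketch: the variable names `ES_SOURCE_HOST`, `ES_DEST_HOST`, `ES_SOURCE_AUTH`, `ES_DEST_AUTH`, `ES_CHECK_INTERVAL` and `ES_CONCURRENT_TASKS` are hypothetical and are not read by the library, and `config_from_env` is not part of its API.

```python
import os
from typing import Any, Dict, Mapping


def config_from_env(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Build a config dict with the documented keys from environment variables."""
    config: Dict[str, Any] = {
        "source_host": env["ES_SOURCE_HOST"],  # required (hypothetical name)
        "dest_host": env["ES_DEST_HOST"],      # required (hypothetical name)
        # Fall back to the documented defaults when the variables are unset.
        "check_interval": int(env.get("ES_CHECK_INTERVAL", "10")),
        "concurrent_tasks": int(env.get("ES_CONCURRENT_TASKS", "1")),
    }
    # Add credentials only when they are set, in "username:password" form.
    for key, var in (
        ("source_http_auth", "ES_SOURCE_AUTH"),
        ("dest_http_auth", "ES_DEST_AUTH"),
    ):
        if env.get(var):
            config[key] = env[var]
    return config


# A plain dict stands in for os.environ here so the example is reproducible.
example = config_from_env({
    "ES_SOURCE_HOST": "http://localhost:9201",
    "ES_DEST_HOST": "http://localhost:9202",
    "ES_DEST_AUTH": "username:password",
})
print(example)
```

The returned dict has the same shape as `dict_config` in the examples above and could be passed to `ReindexManager.from_dict(data=...)`.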
Local install
-------------
Set up and activate a Python 3 virtual environment:
```shell
make ve
```
To install Git hooks:
```shell
make install_hooks
```
Create a `.env` file and fill in the values:
```shell
cp .env.example .env
```
Export env variables:
```shell
export $(xargs < .env)
```
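The `export $(xargs < .env)` idiom is suited to simple `KEY=VALUE` lines. For scripts that should not depend on the shell, a rough Python equivalent might look like the following; `parse_env` is a hypothetical helper with the same limitation (no quoting or multi-line values), and the sample values are made up for illustration:

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# Illustrative content only; the real keys come from .env.example.
sample = """
# Elasticsearch docker settings
ENV=test
ES_SOURCE_PORT=9201
ES_DEST_PORT=9202
"""
print(parse_env(sample))
```

To load a real file into the current process, something like `os.environ.update(parse_env(Path(".env").read_text()))` would work, given `import os` and `from pathlib import Path`.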
### Key Environment Variables:
Testing mode:
* `ENV` - enables testing mode.
To activate test mode, set it to `test`.
Elasticsearch docker settings:
* `ES_SOURCE_PORT` - Source Elasticsearch port
* `ES_DEST_PORT` - Destination Elasticsearch port
* `ES_VERSION` - Elasticsearch version
* `LOCAL_IP` - Address of your local machine on the LAN, e.g. `192.168.4.106`.
### How to find your Local IP?
* macOS (look for it in the output):
```shell
ifconfig
```
* Linux (look for it in the output):
```shell
ip r
```
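As a cross-platform alternative, the address can usually be discovered from Python with the standard `socket` module. This is a best-effort sketch rather than a guaranteed method: it reports the address of whichever interface the OS would use for outbound traffic, and falls back to `127.0.0.1` when no route is available. `local_ip` is a hypothetical helper name.

```python
import socket


def local_ip() -> str:
    """Best-effort LAN address of this machine; falls back to 127.0.0.1."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only makes the OS
        # choose the outgoing interface. 192.0.2.1 is a documentation address.
        sock.connect(("192.0.2.1", 80))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        sock.close()


print(local_ip())
```

On machines with several network interfaces, the result should still be checked against the `ifconfig` or `ip r` output.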
Testing
-------
Start Elasticsearch nodes using Docker Compose:
```shell
docker-compose up -d
```
Verify Elasticsearch nodes are running:
* Source Elasticsearch:
```shell
curl -X GET $LOCAL_IP:$ES_SOURCE_PORT
```
* Destination Elasticsearch:
```shell
curl -X GET $LOCAL_IP:$ES_DEST_PORT
```
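The same check can be scripted, for example to wait for the containers before running tests. A minimal sketch using only the standard library, assuming the nodes answer plain HTTP without authentication (a secured node would answer 401 and be reported as unreachable here); `is_reachable` and `node_url` are hypothetical helpers:

```python
import os
import urllib.request


def node_url(host: str, port: str) -> str:
    """curl assumes http:// when no scheme is given; urllib needs it spelled out."""
    return f"http://{host}:{port}"


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP GET to url answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # URLError, HTTPError and socket timeouts derive from OSError
        return False


# Check whichever of the two ports are set in the environment.
host = os.environ.get("LOCAL_IP", "127.0.0.1")
for name in ("ES_SOURCE_PORT", "ES_DEST_PORT"):
    port = os.environ.get(name)
    if port:
        print(name, is_reachable(node_url(host, port)))
```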
Set the `PYTHONPATH` environment variable:
```shell
export PYTHONPATH="."
```
To run tests with `pytest`, use:
```shell
make test
```
To run tests with `pytest` and a `coverage` report, use:
```shell
make test-cov
```