https://github.com/liamca/azure-search-backup-restore
Tool to help extract content from an Azure Search index and restore it to a new index
- Host: GitHub
- URL: https://github.com/liamca/azure-search-backup-restore
- Owner: liamca
- License: mit
- Created: 2016-03-17T15:47:53.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2023-07-07T21:31:30.000Z (almost 3 years ago)
- Last Synced: 2025-04-02T10:39:16.669Z (about 1 year ago)
- Language: Jupyter Notebook
- Size: 35.2 KB
- Stars: 11
- Watchers: 3
- Forks: 11
- Open Issues: 8
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Azure Search Index Backup and Restore
This tool helps extract content from an Azure Search index and restore it to a new index during the development phase.
## Important - Please Read
Search indexes differ from other datastores in that it is really hard to extract all content from them. Because a search index is constantly ranking and scoring results, paging through search results, or even using continuation tokens as this tool does, can miss data during extraction. As an example, let's say you search for all documents, and a document with ID 101 is part of page 5 of the search results. As you extract data page by page and move from page 4 to page 5, it is possible that ID 101 has shifted to page 4 in the meantime, meaning that when you look at page 5, it is no longer there and you have missed that document.
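To make the scenario above concrete, here is a small simulation in plain Python. It does not call Azure Search; the 12-document ranking, the `get_page` helper, and the re-ranking step are all hypothetical and chosen only to show how a document can slip between pages.

```python
def get_page(ranking, page_number, page_size):
    """Return one page (1-based) from a ranked list of document IDs."""
    start = (page_number - 1) * page_size
    return ranking[start:start + page_size]


def extract_with_reranking(page_size=5):
    """Page through a hypothetical index whose ranking changes mid-extraction."""
    all_ids = list(range(1, 13))          # assumed index of 12 documents
    ranking = list(all_ids)

    # Page 1 is read against the original ranking: IDs 1..5.
    extracted = list(get_page(ranking, 1, page_size))

    # Assumed re-ranking between requests: document 1 drops to the end,
    # so every other document moves up one position. Document 6 is now
    # the last entry of page 1, which has already been read.
    ranking = ranking[1:] + ranking[:1]

    # Pages 2 and 3 are read against the new ranking.
    extracted += get_page(ranking, 2, page_size)
    extracted += get_page(ranking, 3, page_size)

    missed = set(all_ids) - set(extracted)
    return extracted, missed


extracted, missed = extract_with_reranking()
print("extracted:", extracted)
print("missed:", missed)
```

In this run the number of rows fetched still equals the number of documents, because document 1 is returned twice while document 6 is never returned. A check based on distinct keys would catch that; a check based on raw row count would not.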
For that reason, this tool keeps a count of the document keys it has extracted and compares it to the count of documents in the Azure Search index to make sure they match. Although this is not a perfect solution, it does help reduce the chance of missing data.
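The check described above might look roughly like the following. This is a minimal sketch, not the tool's actual code: `collect_keys`, `counts_match`, and the `id` key field are assumed names, and the pages are in-memory lists standing in for responses from the service.

```python
def collect_keys(pages, key_field="id"):
    """Gather the distinct key values seen across all extracted pages."""
    keys = set()
    for documents in pages:
        for document in documents:
            keys.add(document[key_field])
    return keys


def counts_match(extracted_keys, index_document_count):
    """Return True when the number of distinct keys equals the index count."""
    return len(extracted_keys) == index_document_count


# Hypothetical extraction result: document "3" was never returned,
# and document "1" came back twice.
pages = [
    [{"id": "1"}, {"id": "2"}],
    [{"id": "1"}, {"id": "4"}],
]
reported_count = 4  # assumed document count reported by the index

keys = collect_keys(pages)
if not counts_match(keys, reported_count):
    print(f"Extracted {len(keys)} distinct keys, index reports {reported_count}")
```

Using a set rather than a running counter means a duplicated document cannot mask a missing one. A mismatch only signals that something was missed; it does not say which document, so a sensible reaction is to rerun the extraction.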
Also, as an extra precaution, it is best if no changes are being made and the search index is in a steady state during the extraction phase.