https://github.com/dataoneorg/sysmwatch
Simple monitoring tool for DataONE systemmetadata
- Host: GitHub
- URL: https://github.com/dataoneorg/sysmwatch
- Owner: DataONEorg
- Created: 2021-01-21T16:32:08.000Z (over 5 years ago)
- Default Branch: main
- Last Pushed: 2025-04-21T21:16:13.000Z (about 1 year ago)
- Last Synced: 2025-05-20T03:13:24.747Z (about 1 year ago)
- Language: Python
- Size: 43.9 KB
- Stars: 0
- Watchers: 9
- Forks: 0
- Open Issues: 4
Metadata Files:
- Readme: README.md
# sysmwatch
This Python tool watches the Postgres systemmetadata tables, pulling out entries
whose dateModified is more recent than a specified value.
Each identifier is then looked up in the Solr index and flagged if the indexed dateModified
does not match that of the systemMetadata.
The process is fairly efficient and may provide a basis for implementing
a replacement for the index-task-generator, which currently relies on Hazelcast events.
Output is written to a JSON file that can be rendered with a simple Handsontable implementation.
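The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the function names, the row keys (`pid`, `sysmeta_modified`, `index_modified`), the sample identifiers, and the choice to flag identifiers that are absent from the index are all assumptions. Timestamps are parsed before comparing so that formatting differences (a trailing `Z`, a space instead of `T`) do not produce false mismatches.

```python
import json
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse an ISO 8601 string; values without an offset are assumed to be UTC."""
    # Replace a trailing "Z" so older Python versions can parse the value.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def flag_mismatches(sysmeta, indexed):
    """Return one row per identifier whose indexed dateModified differs.

    `sysmeta` and `indexed` map identifier -> dateModified string
    (hypothetical shapes). Identifiers missing from `indexed` are
    flagged too, which is an assumption of this sketch.
    """
    rows = []
    for pid, modified in sorted(sysmeta.items()):
        index_modified = indexed.get(pid)
        matches = (
            index_modified is not None
            and parse_timestamp(modified) == parse_timestamp(index_modified)
        )
        if not matches:
            rows.append({
                "pid": pid,
                "sysmeta_modified": modified,
                "index_modified": index_modified,
            })
    return rows


# Made-up sample data: pid-a matches, pid-b is stale, pid-c is not indexed.
sysmeta = {
    "pid-a": "2021-01-21 16:32:08.123",
    "pid-b": "2021-02-01 00:00:00",
    "pid-c": "2021-03-01 00:00:00",
}
indexed = {
    "pid-a": "2021-01-21T16:32:08.123Z",
    "pid-b": "2021-01-15T00:00:00Z",
}

flagged = flag_mismatches(sysmeta, indexed)
report = json.dumps(flagged, indent=2)  # a list of objects, one per flagged identifier
```

A list of row objects is used here because Handsontable can take an array of objects as its data source; the actual file layout produced by the tool may differ.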
## Approaches
There are several approaches to Postgres event notification:
1. Polling. Periodic query to retrieve a (hopefully) limited set of changes. Implemented in `main.py`.
2. Listening for `pg_notify` events. Implemented in `listen.py`.
3. Using a Postgres extension to call an AMQP message queue. Listener in `listenq.py`.
4. Writing events to a queue table (e.g. using a trigger) and retrieving them by querying the table with `SKIP LOCKED` semantics. Not implemented here; there is some discussion on [Stack Overflow](https://stackoverflow.com/questions/297280/the-best-way-to-use-a-db-table-as-a-job-queue-a-k-a-batch-queue-or-message-queu) and a simple example on [HN](https://news.ycombinator.com/item?id=20020501).
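As an illustration of the polling approach (item 1), the sketch below keeps a high-water mark and asks only for rows modified after it. It is not the code in `main.py`. To stay runnable without a database server, it uses Python's built-in `sqlite3` as a stand-in for Postgres; the table and column names (`systemmetadata`, `guid`, `date_modified`), the batch size, and the sample rows are assumptions.

```python
import sqlite3

# Assumed table and column names; the real schema may differ.
POLL_SQL = """
    SELECT guid, date_modified
    FROM systemmetadata
    WHERE date_modified > ?
    ORDER BY date_modified
    LIMIT ?
"""


def poll_once(conn, watermark, batch_size=100):
    """Fetch rows modified after `watermark`; return (rows, new_watermark)."""
    rows = conn.execute(POLL_SQL, (watermark, batch_size)).fetchall()
    # Advance to the newest row seen; keep the old mark if nothing changed.
    new_watermark = rows[-1][1] if rows else watermark
    return rows, new_watermark


def demo():
    """Run two polls against an in-memory table with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE systemmetadata (guid TEXT PRIMARY KEY, date_modified TEXT)"
    )
    conn.executemany(
        "INSERT INTO systemmetadata VALUES (?, ?)",
        [
            ("pid-a", "2021-01-01T00:00:00+00:00"),
            ("pid-b", "2021-02-01T00:00:00+00:00"),
        ],
    )
    # First poll: only pid-b is newer than the starting mark.
    first, mark = poll_once(conn, "2021-01-15T00:00:00+00:00")
    # A later update to pid-a is picked up by the next poll.
    conn.execute(
        "UPDATE systemmetadata SET date_modified = ? WHERE guid = ?",
        ("2021-03-01T00:00:00+00:00", "pid-a"),
    )
    second, mark = poll_once(conn, mark)
    conn.close()
    return first, second, mark
```

In a real deployment the same query shape would run periodically against Postgres with a native timestamp column rather than text. One caveat of a strict `>` comparison is that rows sharing the exact timestamp of the high-water mark can be missed when a batch limit cuts through them, so a production poller would likely need a tie-breaker or an overlapping window.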