https://github.com/cisagov/cyhy-feeds
Tools to create and retrieve Cyber Hygiene (CyHy) data extracts
https://github.com/cisagov/cyhy-feeds
cyhy data export
Last synced: 8 months ago
JSON representation
Tools to create and retrieve Cyber Hygiene (CyHy) data extracts
- Host: GitHub
- URL: https://github.com/cisagov/cyhy-feeds
- Owner: cisagov
- License: cc0-1.0
- Created: 2018-08-13T20:54:54.000Z (almost 8 years ago)
- Default Branch: develop
- Last Pushed: 2025-09-09T19:26:19.000Z (10 months ago)
- Last Synced: 2025-10-14T14:47:31.455Z (8 months ago)
- Topics: cyhy, data, export
- Language: Python
- Homepage:
- Size: 401 KB
- Stars: 1
- Watchers: 4
- Forks: 4
- Open Issues: 6
-
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS
Awesome Lists containing this project
README
# Cyber Hygiene Feeds 📥 📤 #
[](https://github.com/cisagov/cyhy-feeds/actions)
[](https://coveralls.io/github/cisagov/cyhy-feeds?branch=develop)
[](https://lgtm.com/projects/g/cisagov/cyhy-feeds/alerts/)
[](https://lgtm.com/projects/g/cisagov/cyhy-feeds/context:python)
[](https://snyk.io/test/github/cisagov/cyhy-feeds)
cyhy-feeds consists of two parts: the extractor and the retriever.
`cyhy-data-extract` retrieve and compress the specified data, sign the compressed
file, encrypt the file, and optionally push the encrypted, compressed file to an
S3 bucket using provided AWS credentials.
`cyhy-data-retriever` take a provided file (optionally stored in an S3 bucket),
decrypt it, and then decompress it to local storage.
## Getting Started ##
Both `cyhy-data-extract` and `cyhy-data-retriever` require Python 3.
To run the tool locally first install the requirements:
```console
pip install -r requirements.txt
```
### cyhy-data-extract Usage and Examples ###
#### cyhy-data-extract Usage ####
```console
Create compressed, encrypted, signed extract file with Federal CyHy data for
integration with the Weathermap project.
Usage:
COMMAND_NAME --config CONFIG_FILE [--cyhy-config CYHY_CONFIG] [--scan-config SCAN_CONFIG] [--assessment-config ASSESSMENT_CONFIG] [-v | --verbose] [-a | --aws ] [--cleanup-aws] [--date DATE] [--debug]
COMMAND_NAME (-h | --help)
COMMAND_NAME --version
Options:
-h --help Show this screen
--version Show version
-x CYHY_CONFIG --cyhy-config=CYHY_CONFIG CyHy MongoDB configuration to use
-y SCAN_CONFIG --scan-config=SCAN_CONFIG Scan MongoDB configuration to use
-z ASSESSMENT_CONFIG --assessment-config=ASSESSMENT_CONFIG Assessment MongoDB configuration to use
-v --verbose Show verbose output
-a --aws Output results to S3 bucket
--cleanup-aws Delete old files from the S3 bucket
-c CONFIG_FILE --config=CONFIG_FILE Configuration file for this script
-d DATE --date=DATE Specific date to export data from in form: %Y-%m-%d (eg. 2018-12-31) NOTE that this date is in UTC
--debug Enable debug logging
```
#### cyhy-data-extract Examples ####
Extract CyHy data for the current day using the MongoDB configuration in `cyhy.yml`
and the runtime configuration in `cyhy-data-extract.cfg`.
```console
python3 cyhy-data-extract.py --cyhy-config cyhy.yml --config cyhy-data-extract.cfg
```
Extract scan data for the current day using the MongoDB configuration in 'scan.yml'
and the runtime configuration in `cyhy-data-extract.cfg`.
```console
python3 cyhy-data-extract.py --scan-config scan.yml --config cyhy-data-extract.cfg
```
Extract assessment data for the current day using the MongoDB configuration in
`assessment.yml` and the runtime configuration in `cyhy-data-extract.cfg`.
```console
python3 cyhy-data-extract.py --assessment-config assessment.yml --config cyhy-data-extract.cfg
```
Extract CyHy and scan data for the current day using the MongoDB configurations
in `cyhy.yml` and `scan.yml`, respectively, and use the runtime configuration in
`cyhy-data-extract.cfg`.
```console
python3 cyhy-data-extract.py --cyhy-config cyhy.yml --scan-config scan.yml
--config cyhy-data-extract.cfg
```
Extract CyHy and scan data for the current day using the MongoDB configurations
in `cyhy.yml` and `scan.yml`, upload the results to AWS, and use the runtime
configuration in `cyhy-data-extract.cfg`.
```console
python3 cyhy-data-extract.py --cyhy-config cyhy.yml --scan-config scan.yml
--aws --config cyhy-data-extract.cfg
```
Extract CyHy and scan data for January 25th, 2019 using the MongoDB configurations
in `cyhy.yml` and `scan.yml`, upload the results to AWS, and use the runtime
configuration in `cyhy-data-extract.cfg`.
```console
python3 cyhy-data-extract.py --cyhy-config cyhy.yml --scan-config scan.yml
--aws --config cyhy-data-extract.cfg --date 2019-01-25
```
Extract CyHy, scan, and assessment data for January 25th, 2019 using the MongoDB
configurations in `cyhy.yml`, `scan.yml`, and `assessment.yml`, upload the results
to AWS, and use the runtime configuration in `cyhy-data-extract.cfg`.
```console
python3 cyhy-data-extract.py --cyhy-config cyhy.yml --scan-config scan.yml
--assessment-config assessment.yml --aws --config cyhy-data-extract.cfg
--date 2019-01-25
```
### cyhy-data-retriever Usage and Examples ###
#### cyhy-data-retriever Usage ####
```console
Retrieve a compressed, encrypted, signed extract file and
verify/decrypt/uncompress it.
NOTES:
* the python modules below must be installed for the script to work
* This script expects to operate on a GPG-encrypted bzip2 tar file
e.g. filename.tbz.gpg
Usage:
COMMAND_NAME [-v | --verbose] [--filename EXTRACT_FILENAME] [-a | --aws]
--config CONFIG_FILE
COMMAND_NAME (-h | --help)
COMMAND_NAME --version
Options:
-h --help Show this screen
--version Show version
-f EXTRACT_FILENAME --filename=EXTRACT_FILENAME Name of extract file to retrieve
-v --verbose Show verbose output
-c CONFIG_FILE --config=CONFIG_FILE Configuration file for this script
-a --aws Output results to S3 bucket
```
#### cyhy-data-retriever Examples ####
Retrieve the data stored in file `cyhy_extract_2019-01-25T000000+0000.tbz.gpg`
residing on AWS using the runtime configuration in `cyhy-data-retriever.cfg`.
```console
cyhy-data-retriever --filename cyhy_extract_2019-01-25T000000+0000.tbz.gpg --aws
--config cyhy-data-retriever.cfg
```
### Extract Config File Parameters ###
- `FED_ORGS_EXCLUDED` - Orgs to exclude from extract
- `GNUPG_HOME` - Location of GNUPG database (eg. /Users/bob/.gnupg)
- `RECIPIENTS` - Names on the gpg public key(s)
- `SIGNER` - Gpg signer to ensure integrity
- `SIGNER_PASSPHRASE` - Passphrase for signer gpg key
- `OUTPUT_DIR` - Directory to output extract to
- `FILE_RETENTION_NUM_DAYS` - Number of days to hold extract
- `ES_AWS_CONFIG_SECTION_NAME` - Name of the AWS config file section
containing the configuration to be used when accessing the
Elasticsearch data
- `ES_REGION` - Region for DMARC bucket
- `ES_URL` - Elasticsearch URL
- `ES_RETRIEVE_SIZE` - Elasticsearch size
### Retriever Config File Parameters ###
- `CLIENT_PRIVATE_KEY_FILE` - Path to gpg private key
- `GNUPG_HOME` - Location of GPG database (eg. /Users/bob/.gnupg)
- `GPG_DECRYPTION_PASSPHRASE` - Passphrase for private gpg key
- `AWS_ACCESS_KEY_ID` - User ID used for AWS S3 bucket read access
- `AWS_SECRET_ACCESS_KEY` - Key for AWS S3 bucket read access
- `PROXY_CONFIG` - Only needed when proxy is present
## Contributing ##
We welcome contributions! Please see [here](CONTRIBUTING.md) for
details.
## License ##
This project is in the worldwide [public domain](LICENSE).
This project is in the public domain within the United States, and
copyright and related rights in the work worldwide are waived through
the [CC0 1.0 Universal public domain
dedication](https://creativecommons.org/publicdomain/zero/1.0/).
All contributions to this project will be released under the CC0
dedication. By submitting a pull request, you are agreeing to comply
with this waiver of copyright interest.