{"id":19883610,"url":"https://github.com/muchdogesec/obstracts","last_synced_at":"2025-05-02T14:33:43.608Z","repository":{"id":262285079,"uuid":"883262599","full_name":"muchdogesec/obstracts","owner":"muchdogesec","description":"Turn any blog into structured threat intelligence.","archived":false,"fork":false,"pushed_at":"2024-11-11T15:40:26.000Z","size":148,"stargazers_count":6,"open_issues_count":3,"forks_count":2,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-11-11T16:33:03.932Z","etag":null,"topics":["atom","blog","rss","threat-hunting","threat-intel","threat-intelligence"],"latest_commit_sha":null,"homepage":"https://www.obstracts.com/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/muchdogesec.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-04T16:52:12.000Z","updated_at":"2024-11-11T14:48:49.000Z","dependencies_parsed_at":"2024-11-11T16:47:07.144Z","dependency_job_id":null,"html_url":"https://github.com/muchdogesec/obstracts","commit_stats":null,"previous_names":["muchdogesec/obstracts"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/muchdogesec%2Fobstracts","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/muchdogesec%2Fobstracts/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/muchdogesec%2Fobstracts/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/muchdogesec%2Fobstracts/manifests","owner_ur
l":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/muchdogesec","download_url":"https://codeload.github.com/muchdogesec/obstracts/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224316384,"owners_count":17291243,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["atom","blog","rss","threat-hunting","threat-intel","threat-intelligence"],"created_at":"2024-11-12T17:21:32.285Z","updated_at":"2024-11-12T17:21:32.826Z","avatar_url":"https://github.com/muchdogesec.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Obstracts\n\n## Before you begin...\n\nWe offer a fully hosted web version of Obstracts which includes many features beyond those in this codebase. [You can find out more about the web version here](https://www.obstracts.com/).\n\n## Overview\n\n![](docs/obstracts.png)\n\nObstracts takes a blog ATOM or RSS feed and converts it into structured threat intelligence.\n\nOrganisations subscribe to lots of blogs for security information. These blogs contain interesting indicators of malicious activity (e.g. malicious URLs).\n\nTo help automate the extraction of this information, Obstracts automatically downloads blog articles and extracts indicators for a user to view.\n\nIt works at a high level like so:\n\n1. A feed is added to Obstracts by a user (who selects the profile to be used)\n2. Obstracts uses history4feed as a microservice to handle the download and storage of posts.\n3. 
The HTML from history4feed for each blog post is converted to markdown using file2txt in `html` mode\n4. The markdown is run through txt2stix, where txt2stix pattern extractions/whitelists/aliases are run based on the staff-defined profile\n5. STIX bundles are generated for each post of the blog and stored in an ArangoDB database called `obstracts_database`, in Collections named after the blog\n6. A user can access the bundle data or specific objects in the bundle via the API\n7. As new posts are added to remote blogs, a user makes a request to update the blog and the new posts are retrieved by history4feed\n\n## tl;dr\n\n[![Obstracts](https://img.youtube.com/vi/plp4hw95WdA/0.jpg)](https://www.youtube.com/watch?v=plp4hw95WdA)\n\n[Watch the demo](https://www.youtube.com/watch?v=plp4hw95WdA).\n\n## Install\n\n### Download and configure\n\n```shell\n# clone the latest code\ngit clone https://github.com/muchdogesec/obstracts\n```\n\n### Configuration options\n\nObstracts has various settings that are defined in an `.env` file.\n\nTo create a template for the file:\n\n```shell\ncp .env.example .env\n```\n\nFor more information about how to set the variables and what they do, read the `.env.markdown` file.\n\n### Build the Docker Image\n\n```shell\nsudo docker compose build\n```\n\n### Start the server\n\n```shell\nsudo docker compose up\n```\n\n### Access the server\n\nThe webserver (Django) should now be running on: http://127.0.0.1:8001/\n\nYou can access the Swagger UI for the API in a browser at: http://127.0.0.1:8001/api/schema/swagger-ui/\n\n## Contributing notes\n\nObstracts is made up of different core external components that support most of its functionality.\n\nAt a high level, the Obstracts pipeline looks like this: https://miro.com/app/board/uXjVKD2mg_0=/\n\nGenerally, if you want to improve how Obstracts performs a piece of functionality, you should make the changes in:\n\n* [history4feed](https://github.com/muchdogesec/history4feed): responsible for downloading the blog posts, 
including the historical archive, and keeping posts updated\n* [file2txt](https://github.com/muchdogesec/file2txt/): converts the HTML post content into a markdown file (from which data is extracted)\n* [txt2stix](https://github.com/muchdogesec/txt2stix): turns the markdown file into STIX objects\n* [stix2arango](https://github.com/muchdogesec/stix2arango): manages the logic to insert the STIX objects into the database\n* [dogesec_commons](https://github.com/muchdogesec/dogesec_commons): where the API Objects, Profiles, Extractors, Whitelist and Alias endpoints are imported from\n\nFor anything else, the Obstracts codebase is where you need to be :)\n\n## Useful supporting tools\n\n* [Turn any blog post into structured threat intelligence](https://www.dogesec.com/blog/launching_obstracts_open_source/)\n* [An up-to-date list of threat intel blogs that post cyber threat intelligence research](https://github.com/muchdogesec/awesome_threat_intel_blogs)\n\n## Support\n\n[Minimal support provided via the DOGESEC community](https://community.dogesec.com/).\n\n## License\n\n[Apache 2.0](/LICENSE).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmuchdogesec%2Fobstracts","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmuchdogesec%2Fobstracts","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmuchdogesec%2Fobstracts/lists"}