{"id":18076396,"url":"https://github.com/needmorecowbell/funnel","last_synced_at":"2025-08-07T02:14:17.323Z","repository":{"id":86178797,"uuid":"179622121","full_name":"needmorecowbell/Funnel","owner":"needmorecowbell","description":"Funnel is a lightweight yara-based feed scraper","archived":false,"fork":false,"pushed_at":"2023-05-22T21:37:14.000Z","size":90,"stargazers_count":39,"open_issues_count":3,"forks_count":5,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-04-12T08:15:45.530Z","etag":null,"topics":["osint","rss","yara"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/needmorecowbell.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2019-04-05T05:04:18.000Z","updated_at":"2023-07-25T04:14:32.000Z","dependencies_parsed_at":"2023-11-27T06:31:41.124Z","dependency_job_id":null,"html_url":"https://github.com/needmorecowbell/Funnel","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/needmorecowbell/Funnel","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/needmorecowbell%2FFunnel","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/needmorecowbell%2FFunnel/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/needmorecowbell%2FFunnel/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/needmorecowbell%2FFunnel/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/owners/needmorecowbell","download_url":"https://codeload.github.com/needmorecowbell/Funnel/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/needmorecowbell%2FFunnel/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":269185726,"owners_count":24374634,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-07T02:00:09.698Z","response_time":73,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["osint","rss","yara"],"created_at":"2024-10-31T11:09:48.753Z","updated_at":"2025-08-07T02:14:17.315Z","avatar_url":"https://github.com/needmorecowbell.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# Funnel\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"https://user-images.githubusercontent.com/7833164/55665412-1ca67180-580d-11e9-8e63-c09f83d919da.png\" height=\"150\"  width=\"150\"\u003e\u003c/img\u003e\n\u003c/p\u003e\n\n\nFunnel is a lightweight yara-based feed scraper. Give a list of inputs and it will check them. Put it in a crontab and it will regularly update the database. If the article gets matched to the yara rule, it will be put into the database. 
All matched results are stored in an SQLite database, along with the rule that flagged them.\n\n## Installation\n\nInstall the required dependencies and you're good to go.\n\n` pip3 install -r requirements.txt `\n\n## Usage:\n\n```bash\nFunnel.py [-h] [-v] [-u] rule_path target_path\n\npositional arguments:\n  rule_path      path to directory of rules used on list of feeds\n  target_path    path to sources list or url\n\noptional arguments:\n  -h, --help     show this help message and exit\n  -v, --verbose  increase output verbosity\n  -u, --url      scan one url instead of using sources list\n```\n\n**Example:** \n\nYou want to get every new post on the internet that has your name or personal info in it. You would use as many sources as possible, and fill out the personal_info.yar rule. \n\nSchedule this command to run regularly using crontab:\n\n`python3 Funnel.py rules/personal/ sources/sources-large.json`\n\nWant to scan just one URL to see whether it matches any rule in your rule set? \n\n`python3 Funnel.py -u rules/ https://www.bbc.com/news/world-asia-47844000`\n\nA bar that wants all the newest margarita recipes? You could do that. Every single post about a politician, for a data visualization project on how much each person is talked about? Works too! Just add rules and sources.\n\n### Sources\n\nThe sources should be in a JSON file, with a URL and a title for each source in the list. 
Here is a bare-bones example:\n\n```json\n{\n    \"sources-rss\":[\n        {\n                \"url\": \"https://www.reddit.com/r/netsec/.rss\",\n                \"title\": \"netsec subreddit\"\n        },\n        {\n                \"url\": \"https://www.reddit.com/r/malware/.rss\",\n                \"title\": \"malware subreddit\"\n        }\n\n    ]\n\n}\n\n```\n**Tip:** To reuse your Feedly subscriptions, run the opml_to_json.py script in the util folder to turn an exported Feedly OPML file into a valid sources file.\n\n\n### Rules\n\nSome sample rules have been provided in the rules folder. Any standard YARA rule will work. At this point rules are only matched against text content; file analysis is not supported yet. You can pass in a directory of rules, a nested directory of rules, or a single rule.\n\n### Database\n\nThe database uses SQLite and has two tables. The first stores the links of matched articles, each with a unique id. The second stores each matched rule together with the id of the article it matched. This keeps duplicates out of the links table, and makes for easy reference.\n\n## Contribute\n\nFeel free to suggest additions to this project. Even better, send me a pull request!\n\n\nInspired by ThreatIngestor from InQuest\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fneedmorecowbell%2Ffunnel","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fneedmorecowbell%2Ffunnel","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fneedmorecowbell%2Ffunnel/lists"}