{"id":16167173,"url":"https://github.com/spamscope/spamscope","last_synced_at":"2025-04-04T22:04:53.462Z","repository":{"id":57469544,"uuid":"66502644","full_name":"SpamScope/spamscope","owner":"SpamScope","description":"Fast Advanced Spam Analysis Tool","archived":false,"fork":false,"pushed_at":"2024-03-18T20:27:33.000Z","size":6541,"stargazers_count":290,"open_issues_count":2,"forks_count":59,"subscribers_count":20,"default_branch":"develop","last_synced_at":"2024-10-17T08:16:58.322Z","etag":null,"topics":["ansible","ansible-playbook","apache-storm","application-security","dialect","docker","docker-image","mail-analyzer","outlook","python","security","smtp","spam-analyzer","spamscope","streamparse"],"latest_commit_sha":null,"homepage":"https://pypi.python.org/pypi/SpamScope","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SpamScope.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":["fedelemantuano"]}},"created_at":"2016-08-24T21:55:46.000Z","updated_at":"2024-10-15T16:59:44.000Z","dependencies_parsed_at":"2024-07-29T09:44:37.443Z","dependency_job_id":null,"html_url":"https://github.com/SpamScope/spamscope","commit_stats":{"total_commits":438,"total_committers":3,"mean_commits":146.0,"dds":0.004566210045662156,"last_synced_commit":"ffbfc53b9a3503ef3041cee94c6726c8b899118d"},"previous_names":[],"tags_count":50,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SpamScope%2Fspamscope","tags_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/SpamScope%2Fspamscope/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SpamScope%2Fspamscope/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SpamScope%2Fspamscope/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SpamScope","download_url":"https://codeload.github.com/SpamScope/spamscope/tar.gz/refs/heads/develop","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247256110,"owners_count":20909240,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ansible","ansible-playbook","apache-storm","application-security","dialect","docker","docker-image","mail-analyzer","outlook","python","security","smtp","spam-analyzer","spamscope","streamparse"],"created_at":"2024-10-10T03:06:31.300Z","updated_at":"2025-04-04T22:04:53.439Z","avatar_url":"https://github.com/SpamScope.png","language":"Python","readme":"[![PyPI version](https://badge.fury.io/py/SpamScope.svg)](https://badge.fury.io/py/SpamScope)\n[![Build Status](https://travis-ci.org/SpamScope/spamscope.svg?branch=master)](https://travis-ci.org/SpamScope/spamscope)\n[![Coverage Status](https://coveralls.io/repos/github/SpamScope/spamscope/badge.svg?branch=develop)](https://coveralls.io/github/SpamScope/spamscope?branch=develop)\n[![BCH compliance](https://bettercodehub.com/edge/badge/SpamScope/spamscope?branch=develop)](https://bettercodehub.com/)\n\n![SpamScope](https://raw.githubusercontent.com/SpamScope/spamscope/develop/docs/logo/spamscope.png)\n\n# 
Overview\nSpamScope is an advanced spam analysis tool that uses [Apache Storm](http://storm.apache.org/) with [streamparse](https://github.com/Parsely/streamparse) to process a stream of mails.\nTo understand how SpamScope works, I suggest reading these overviews:\n - [Apache Storm Concepts](http://storm.apache.org/releases/1.2.3/Concepts.html)\n - [Streamparse Quickstart](http://streamparse.readthedocs.io/en/stable/quickstart.html)\n\nIn general, the first step is to run Apache Storm; then you can run the topologies on it.\nSpamScope has some topologies in the [topologies folder](./topologies/), but you can write other topologies.\n\n![Schema topology](docs/images/schema_topology.png?raw=true \"Schema topology\")\n\n# Apache 2 Open Source License\nSpamScope can be downloaded, used, and modified free of charge. It is available under the Apache 2 license.\n\n## Support the project\n\n**Dogecoin**: `DAUbDUttkf8WN1kwP9YYQQKyEJYY2WWtEG`\n\n[![Donate with Bitcoin](https://en.cryptobadges.io/badge/big/1BCJ8wok4DNW8KbdL8H3VwZviXAWibhEPe)](https://en.cryptobadges.io/donate/1BCJ8wok4DNW8KbdL8H3VwZviXAWibhEPe)\n\n[![Donate](https://www.paypal.com/en_US/i/btn/btn_donateCC_LG.gif \"Donate\")](https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick\u0026hosted_button_id=VEPXYP745KJF2)\n\n# What Does SpamScope do?\nSpamScope takes raw emails (both RFC822 and Outlook formats) as input and returns a JSON object. Then it extracts URLs and attachments (if the attachments are zipped, it also extracts the files they contain). All information is saved in JSON objects. This is the first analysis. After that, SpamScope runs a _phishing_ module that gives a _phishing score_ to the emails.\n\nThen you can enable or disable post-processing modules that connect SpamScope with third-party tools. 
There are three main categories:\n - raw emails analysis\n - attachments analysis\n - sender emails analysis\n\n It's possible to add new modules in these three categories, if you want to connect SpamScope with other tools.\n\n## Raw emails analysis\nThese modules (see [here](./src/modules/mails)) analyze the raw emails:\n - SMTP dialect\n - SpamAssassin\n\n## Attachments analysis\nThese modules (see [here](./src/modules/attachments)) analyze the attachments of emails:\n - Apache Tika\n - Store sample on disk (by default SpamScope saves samples in JSON objects)\n - Thug\n - VirusTotal\n - Zemana\n\n## Sender emails analysis\nSpamScope can detect the exact sender IP and then analyze it (see [here](./src/modules/networks)):\n - Shodan\n - VirusTotal\n\n# Why should I use SpamScope\n- It's very fast: the job is split into functions that work in parallel.\n- It's flexible: you can choose what SpamScope has to do.\n- It's distributed: SpamScope uses Apache Storm, a free and open source distributed realtime computation system.\n- It makes JSON output that you can save where you want.\n- It's easy to set up: there are Docker images and docker-compose ready for use.\n- It's integrated with Apache Tika, VirusTotal, Thug, Shodan and SpamAssassin (for now).\n- It's free and open source (for special functions you can contact me).\n- It can analyze Outlook msg files.\n\n## Distributed\nSpamScope uses Apache Storm, which allows you to start small and scale horizontally as you grow. 
Simply add more workers.\n\n## Flexibility\nYou can choose your mail input sources (with **spouts**) and your functionality (with **bolts**).\n\nSpamScope comes with the following bolts:\n - **tokenizer** splits each mail into tokens like headers, body and attachments, and it can filter out emails, attachments and IP addresses already seen\n - **phishing** looks for your keywords in the email and connects the email to targets (bank, your customers, etc.)\n - **raw_mail** is for all third party tools that analyze raw mails like SpamAssassin\n - **attachments** analyzes all mail attachments and uses third party tools like VirusTotal\n - **network** analyzes all sender IP addresses with third party tools like Shodan\n - **urls** extracts all URLs in the email and attachments\n - **json_maker** and **outputs** make the JSON report and save it\n\n## Store where you want\nYou can build your custom output bolts and store your data in Elasticsearch, MongoDB, filesystem, etc.\n\n## Build your topology\nWith streamparse technology you can build your topology in Python, adding and/or removing spouts and bolts.\n\n## API\nFor now SpamScope doesn't have its own API, because it isn't tied to any technology.\nIf you use `Redis` as a spout (input), you'll use the Redis API to put mails into the topology.\nIf you use `Elasticsearch` as output, you'll use the Elasticsearch API to get results.\n\nIt's possible to develop a middleware API that talks with the input and output and changes the configuration, but for now there isn't one.\n\n# SpamScope on Web\n - [Shodan Applications \u0026 Integrations](https://developer.shodan.io/apps)\n - [The Honeynet Project](http://honeynet.org/node/1329)\n - [securityonline.info](http://securityonline.info/pcileech-direct-memory-access-dma-attack-software/)\n - [jekil/awesome-hacking](https://github.com/jekil/awesome-hacking)\n - [Linux Security Expert](https://linuxsecurity.expert/tools/spamscope/)\n\n# Authors\n\n## Main Author\n Fedele Mantuano (**LinkedIn**: [Fedele 
Mantuano](https://www.linkedin.com/in/fmantuano/))\n\n# Requirements\nFor operating system requirements you can read the [Ansible playbooks](./ansible), which go into detail.\n\nFor Python requirements you can read:\n * [mandatory requirements](./requirements.txt)\n * [optional requirements](./requirements_optional.txt)\n\n_Thug_ is another optional requirement that is not listed in the requirements files. See [Thug section](#thug-optional) for more details.\n\n## Apache Storm\n[Apache Storm](http://storm.apache.org/) is a free and open source distributed realtime computation system.\n\n## streamparse\n[streamparse](https://github.com/Parsely/streamparse) lets you run Python code against real-time streams of data via Apache Storm.\n\n## mail-parser\n[mail-parser](https://github.com/SpamScope/mail-parser) is the raw email parser used by SpamScope.\n\n## Faup\n[Faup](https://github.com/stricaud/faup) stands for Finally An Url Parser and is a library and command line tool to parse URLs and normalize fields.\n\n## rarlinux (optional)\n[rarlinux](https://www.rarlab.com/) unarchives rar files.\n\n## SpamAssassin (optional)\nSpamScope can use [SpamAssassin](http://spamassassin.apache.org/), an open source anti-spam tool, to analyze every mail.\n\n## Apache Tika (optional)\nSpamScope can use [Apache Tika](https://tika.apache.org/) to parse every attachment.\nThe Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).\nTo use Apache Tika in SpamScope you must install [tika-app-python](https://github.com/fedelemantuano/tika-app-python) with `pip` and [Apache Tika](https://tika.apache.org/download.html).\n\n## Thug (optional)\nSince release v1.3, SpamScope can analyze JavaScript and HTML attachments with [Thug](https://github.com/buffer/thug).\nIf you want to analyze the attachments with Thug, follow [these instructions](http://buffer.github.io/thug/doc/build.html) to install it. 
Enable it in the `attachments` section of the [main configuration file](./conf/spamscope.example.yml).\n\nWhat is Thug? From the project README:\n\u003e Thug is a Python low-interaction honeyclient aimed at mimicing the behavior of a web browser in order to detect and emulate malicious contents.\n\nYou can see a complete SpamScope report with Thug analysis [here](https://goo.gl/Y4kWCv).\n\nThug analysis can be very slow, and you may get `heartbeat timeout` errors in Apache Storm.\nTo avoid this, set `supervisor.worker.timeout.secs` so that:\n\n```\nnr. user agents * timeout_thug \u003c supervisor.worker.timeout.secs\n```\n\nThe best value for `threshold` is 1.\n\n## VirusTotal (optional)\nIt's possible to add the VirusTotal report to the results (for mail attachments and sender IP address). You need a private API key.\n\n## Shodan (optional)\nIt's possible to add the Shodan report for the sender IP address to the results. You need a private API key.\n\n## Elasticsearch (optional)\nIt's possible to store the results in Elasticsearch. In this case you should install the `elasticsearch` package.\n\n## Redis (optional)\nIt's possible to store the results in Redis. 
In this case you should install the `redis` package.\n\n# Configuration\nRead the [example main configuration file](./conf/spamscope.example.yml).\nBy default SpamScope looks for the configuration file at `/etc/spamscope/spamscope.yml`, but it's possible to set the path with the environment variable `SPAMSCOPE_CONF_FILE`:\n\n```\n$ export SPAMSCOPE_CONF_FILE=/etc/spamscope/spamscope.yml\n```\n\nWhen you change the configuration file, SpamScope automatically reloads the changes.\n\n# Installation\nYou can use:\n  * [Docker images](./docker/README.md) to run SpamScope with Docker Engine\n  * [Ansible](./ansible/README.md): to install and run SpamScope on a server\n\n# Topologies\nSpamScope comes with six topologies:\n   - [spamscope_debug](./topologies/spamscope_debug.py): the output is written as JSON files on the file system.\n   - [spamscope_elasticsearch](./topologies/spamscope_elasticsearch.py): the output is stored in Elasticsearch indexes.\n   - [spamscope_redis](./topologies/spamscope_redis.py): the output is stored in Redis.\n   - [spamscope_debug_iter](./topologies/spamscope_debug_iter.py): it uses a generator to send mails into the topology. The output is written as JSON files on the file system.\n   - [spamscope_elasticsearch_iter](./topologies/spamscope_elasticsearch_iter.py): it uses a generator to send mails into the topology. The output is stored in Elasticsearch indexes.\n   - [spamscope_redis_iter](./topologies/spamscope_redis_iter.py): it uses a generator to send mails into the topology. The output is stored in Redis.\n\nTo submit a SpamScope topology, use the `spamscope-topology submit` tool. For more details see [SpamScope cli tools](src/cli/README.md):\n\n```\n$ spamscope-topology submit --topology {spamscope_debug,spamscope_elasticsearch,spamscope_redis}\n```\n\nIt's possible to change the default settings for all Apache Storm options. 
I suggest changing these options:\n\n - **topology.tick.tuple.freq.secs**: interval at which all bolts reload their configuration\n - **topology.max.spout.pending**: Apache Storm will throttle your spout as needed to meet the `topology.max.spout.pending` requirement\n - **topology.sleep.spout.wait.strategy.time.ms**: maximum sleep time before emitting a new tuple (mail)\n\nYou can use `spamscope-topology submit` to make these changes.\n\n# Important\nIf you are using Elasticsearch output, I suggest you use the [Elasticsearch templates](./conf/templates) that come with SpamScope.\n\n# Unittest\nSpamScope comes with unit tests for each module. The bolts and spouts have no special features; all the intelligence is in external modules.\nAll unit tests are in the [tests folder](tests/).\n\nTo run the complete tests, set the following environment variables:\n\n```\n$ export THUG_ENABLED=True\n$ export VIRUSTOTAL_ENABLED=True\n$ export VIRUSTOTAL_APIKEY=\"your key\"\n$ export ZEMANA_ENABLED=True\n$ export ZEMANA_APIKEY=\"your key\"\n$ export ZEMANA_PARTNERID=\"your partner id\"\n$ export ZEMANA_USERID=\"your userid\"\n$ export SHODAN_ENABLED=True\n$ export SHODAN_APIKEY=\"your key\"\n$ export SPAMASSASSIN_ENABLED=True\n```\n\n# Output example\nThis is a [raw email](https://goo.gl/wMBfbF) that I analyzed with SpamScope:\n  - [SpamScope output](https://goo.gl/fr4i7C).\n\nThis is another example with [Thug analysis](https://goo.gl/Y4kWCv).\n\n# Screenshots\n![Apache Storm](docs/images/Docker00.png?raw=true \"Apache Storm\")\n\n![SpamScope](docs/images/Docker01.png?raw=true \"SpamScope\")\n\n![SpamScope Topology](docs/images/Docker02.png?raw=true \"SpamScope Topology\")\n\n![SpamScope Map](docs/images/map.png?raw=true \"SpamScope 
Map\")\n","funding_links":["https://github.com/sponsors/fedelemantuano","https://www.paypal.com/en_US/i/btn/btn_donateCC_LG.gif","https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick\u0026hosted_button_id=VEPXYP745KJF2"],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspamscope%2Fspamscope","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fspamscope%2Fspamscope","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspamscope%2Fspamscope/lists"}