https://github.com/etsy/MIDAS

Mac Intrusion Detection Analysis System
Host: GitHub
URL: https://github.com/etsy/MIDAS
Owner: etsy
Archived: true
Created: 2013-12-05T15:51:26.000Z (over 11 years ago)
Default Branch: master
Last Pushed: 2015-09-23T15:45:08.000Z (almost 10 years ago)
Last Synced: 2024-08-01T22:05:28.651Z (12 months ago)
Topics: non-sox
Size: 357 KB
Stars: 831
Watchers: 124
Forks: 160
Open Issues: 1
Metadata Files:
- Readme: README.archived.md
Awesome Lists containing this project

osx-and-ios-security-awesome - MIDAS - macOS Intrusion Detection Analysis System. (macOS Security)
README

MIDAS
=====

MIDAS is a framework for developing a Mac Intrusion Detection Analysis System,
based on work and collaborative discussions between the Etsy and
Facebook security teams. This repository provides a modular framework and a
number of helper utilities, as well as an example module for detecting
modifications to common OS X persistence mechanisms.

The MIDAS project is based on concepts presented in
[Homebrew Defensive Security](http://www.slideshare.net/mimeframe/ruxcon-2012-15195589) and
[Attack-Driven Defense](http://www.slideshare.net/zanelackey/attackdriven-defense),
as well as lessons learned during the development of the Tripyarn and BigMac
products.

Our mutual goal in releasing this framework is to foster more discussion in
this area and provide organizations with a starting point in instrumenting
OS X endpoints to detect common patterns of compromise and persistence.

Overview
---------

The `midas` subdirectory is where the core MIDAS code lives. The entry point is
`launcher.py`. From there, each module in `midas/modules` is executed and the
stdout of the module is written to a log file. When deploying MIDAS, this is
the code that's put on users' systems.

The `develop` subdirectory is where development resources (like a `.pylintrc`)
live.

The `templates` resource is where template and base resources live. These can be
used as a starting point when adding modules.

Architecture
------------

MIDAS allows you to define a set of "modules" that implement
host-based checks, verifications, analysis, etc.

### Launcher

The `launcher.py` file exists at the top level of the `midas` directory. It
gathers some simple information about the host it's executing on (such as time,
hostname, etc.) and defines how it should handle modules written in certain
languages. To add a supported language, create a new instance of
`TyLanguage` in `launcher.py` and add it to the `SUPPORTED_LANGUAGES` list. If
you'd like to change the way a certain language is supported (perhaps you'd
like all Python modules to be executed with a custom version of Python), you
can change the attributes (such as `execution_string`) of the language.

Once key definitions are made, the launcher iterates through all files
(directories are explicitly skipped) in the `modules` subdirectory.
For each file in the directory, if a language entry is found that indicates
how to deal with that filetype, the file is executed and the stdout of the
module is appended to a log file in the `log` subdirectory. Note that a
module `modules/example.py` will generate a log file `logs/example.log`.
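
The loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in `launcher.py`: the `Language` named tuple, its `extension` field, the `run_modules` function, and the interpreter paths are all assumptions. Only `execution_string` and `SUPPORTED_LANGUAGES` are names taken from this README.

```python
import os
import subprocess
import sys
from collections import namedtuple

# Hypothetical stand-in for the language entries defined in launcher.py.
Language = namedtuple("Language", ["extension", "execution_string"])

# Assumed entries; sys.executable is used so the sketch runs anywhere.
SUPPORTED_LANGUAGES = [
    Language(extension=".py", execution_string=sys.executable),
    Language(extension=".sh", execution_string="/bin/sh"),
]


def run_modules(modules_dir, log_dir, languages=SUPPORTED_LANGUAGES):
    """Run every supported file in modules_dir and append its stdout to a log."""
    os.makedirs(log_dir, exist_ok=True)
    executed = []
    for filename in sorted(os.listdir(modules_dir)):
        path = os.path.join(modules_dir, filename)
        if os.path.isdir(path):
            continue  # directories are explicitly skipped
        name, extension = os.path.splitext(filename)
        language = next(
            (entry for entry in languages if entry.extension == extension), None
        )
        if language is None:
            continue  # no entry says how to run this filetype
        result = subprocess.run(
            [language.execution_string, path],
            stdout=subprocess.PIPE,
            universal_newlines=True,
        )
        # modules/example.py is logged to <log_dir>/example.log (append mode)
        with open(os.path.join(log_dir, name + ".log"), "a") as log_file:
            log_file.write(result.stdout)
        executed.append(filename)
    return executed
```

Opening the log in append mode means repeated runs accumulate in the same file, which suits a launcher that is invoked on a schedule.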

### Module language

Modules can be written in any language, so long as a named tuple for that
language is defined in `midas/launcher.py`. These named tuples (which
already exist for Python, Ruby and shell) exist so that MIDAS knows how
to handle certain filetypes when it sees them.

As long as your code can be executed and prints something to stdout, it can be
a module.

Components
----------

### Example module

The file `midas/modules/example.py` is an example MIDAS module created to
illustrate what a MIDAS module might look like. This module performs
analysis of LaunchAgents and LaunchDaemons on the host and logs any
modifications that it identifies. The rest of the checks and
verifications analyze the host firewall configurations and log any additions or
differences that are identified. **This is not meant to be a complete intrusion
detection mechanism alone**; instead, it is meant as a reference example of what
a MIDAS module may look like.

### Helpers

There are several helper files in `midas/lib/helpers` that are generally
grouped by category. Functions in these helpers can be imported by modules to
assist in general tasks. Some functionality exposed by helpers includes:

- list all weak SSH keys on a host
- find all files in a given directory with given permissions
- list all startup items
- list all LaunchAgents, LaunchDaemons, etc.
- list and hash all kernel extensions
- return the SSID of the currently connected WiFi network
- return the IP and MAC address of the current network gateway
- return DNS configuration information
- and much, much more
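
To give a feel for what such a helper might look like, here is a minimal sketch of the "find all files in a given directory with given permissions" idea. The function name `find_files_with_permissions` and its signature are hypothetical and are not taken from `midas/lib/helpers`.

```python
import os
import stat


def find_files_with_permissions(directory, mode):
    """Return paths under directory whose permission bits equal mode.

    Hypothetical helper for illustration; mode is an integer such as 0o777.
    """
    matches = []
    for root, _dirs, files in os.walk(directory):
        for filename in files:
            path = os.path.join(root, filename)
            try:
                file_mode = stat.S_IMODE(os.lstat(path).st_mode)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if file_mode == mode:
                matches.append(path)
    return sorted(matches)
```

A module could call this with `0o777` on a directory of interest and print each match to stdout, which is all the launcher needs in order to log it.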

### Config system

The config file, which can be found at `midas/lib/config.py`, is a way to group
together information that can be abstracted away from modules. Usually these
are things like strings that should be checked in a certain module validation,
directories to search during a given check, etc. Abstracting the data away
from the individual module/code makes it easier for people who might not
maintain the code to contribute to it.

Since the config dictionary is operated on via a static method, you do not
need to instantiate the `Config` object in order to use it. To add a new value
to the config dictionary, simply add an entry in the class.
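
A minimal sketch of this pattern, assuming a class-level dictionary and a static accessor. The method name `get` and the keys shown here are hypothetical; check `midas/lib/config.py` for the real names.

```python
class Config(object):
    """Illustrative config holder; the keys and values below are made up."""

    config = {
        "plist_check_directories": [
            "/Library/LaunchAgents",
            "/Library/LaunchDaemons",
        ],
        "suspicious_strings": ["curl", "nc -l"],
    }

    @staticmethod
    def get(key):
        """Return the value stored for key, or None if it is not defined."""
        return Config.config.get(key)


# No Config() instance is needed because the accessor is static.
directories = Config.get("plist_check_directories")
```

Adding a new value is then a one-line change to the dictionary, with no module code touched.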

### ORM and table definitions

MIDAS relies on a local datastore to do some simple host-based analytics with
the data gathered by modules. For this reason, MIDAS comes with a simple,
custom ORM.

#### Table definitions

Table schemas are described via simple dictionaries, all of which can be found
in `midas/lib/tables/`. Each dictionary is parsed and valid SQL is created
from it. Each item in the dictionary represents a column. The
column definition should be a key-value pair where the key is a string that
represents the name of the column and the value is a dictionary that describes
the column. The column definition dictionary can have the following attributes:

- default
  - If this is set, it will be the default value of the column. This is most frequently set to `"NULL"`
- nullable
  - If this is set to `False`, then the `NOT NULL` attribute will be used when creating the table
- attrs
  - Use this to add additional SQL syntax that you want added to the table creation statement that isn't supported by TyORM
- primary_key
  - If this is set to `True`, then the column listed will be set to the primary key

See the following sample table definition for reference:

```python
test_table = {
    "data_field_name" : {
        "type" : "text",
        "nullable" : False,
    },
    "other_data_field" : {
        "type" : "text",
        "default" : "NULL",
    }
}
```
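
To make the dictionary-to-SQL step concrete, here is a rough sketch of how such a definition could be turned into a `CREATE TABLE` statement. It is not TyORM's implementation: `build_create_table` is a hypothetical function, the auto-incrementing `id` column is an assumption, and the `primary_key` attribute is left out for brevity.

```python
import sqlite3


def build_create_table(table_name, definition):
    """Build a CREATE TABLE statement from a table definition dictionary."""
    # Assumed: an auto-incrementing id column is always added first.
    columns = ['"id" INTEGER PRIMARY KEY AUTOINCREMENT']
    for name, attributes in definition.items():
        parts = ['"%s"' % name, attributes.get("type", "text")]
        if attributes.get("nullable") is False:
            parts.append("NOT NULL")
        if "default" in attributes:
            parts.append("DEFAULT %s" % attributes["default"])
        if "attrs" in attributes:
            parts.append(attributes["attrs"])  # raw SQL passed through as-is
        columns.append(" ".join(parts))
    return 'CREATE TABLE IF NOT EXISTS "%s" (%s)' % (
        table_name, ", ".join(columns))


test_table = {
    "data_field_name": {"type": "text", "nullable": False},
    "other_data_field": {"type": "text", "default": "NULL"},
}
statement = build_create_table("test_table", test_table)

# Run the generated SQL against an in-memory database to check it is valid.
connection = sqlite3.connect(":memory:")
connection.execute(statement)
```

Table and column names are interpolated directly, so a builder like this is only safe when the definitions come from trusted code rather than user input.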

#### Instantiating an ORM object

When instantiating an ORM object, the class takes one parameter: the database
filename. If the file doesn't exist, the ORM will create it.

See the following example code for reference:

```python
from lib.ty_orm import TyORM

ORM = TyORM("midas_hids.db")
```

#### Transparent primary key system

Although it is possible to specify a primary key when creating a table, TyORM
transparently creates an auto-incrementing `id` column and sets it as a primary
key. Although SQLite allows you to specify several primary keys, this is not
necessary. The `id` column is always used as the `WHERE` clause identifier when
doing `UPDATE` and `DELETE` operations.

#### Creating

Use the `insert` method to insert data into a table. The `insert` method takes
two arguments: the table name and the data that you'd like to insert. The table
name should be a string that describes the name of the table. The data should
be a dictionary such that the keys describe the columns that the value should
be inserted into.

See the following example `insert` call for reference:

```python
ORM = TyORM("midas_hids.db")

data = {
    "data_field_name" : "foo",
    "other_data_field" : "bar",
}

ORM.insert("test_table", data)
```

#### Reading

Use the `select` method to read data from the database. The `select` method takes one mandatory argument and four optional arguments.

The mandatory argument that all `select` method calls need is the name of the table that you'd like to select from. The optional arguments are as follows:

- columns
  - An array of columns that you would like returned
- where
  - A string that describes the "WHERE" clause that you would like added to the SQL query
  - Note that this can be a string, but if you're supplying user input, this should be an array such that the first item is the where clause with '?' placeholders and the second item is an array with the corresponding values.
- limit
  - An integer describing the LIMIT value that you would like added to the SQL query
- order_by
  - A string dictating which column to order results by

See the following example `select` calls for reference:

```python
ORM = TyORM("midas_hids.db")

# this will return all table data
ORM.select("test_table")

# this will return only the "data_field_name" column of the
# first five rows where the "data_field_name" column is "foobar", ordered by "data_field_name"
ORM.select("test_table", ["data_field_name"], "data_field_name = 'foobar'", 5, "data_field_name")
```
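
The array form of the `where` argument matters whenever a value comes from outside your code. The sketch below shows, using only the standard `sqlite3` module, how a `[clause, values]` pair could map onto a parameterized query. `build_select` is a hypothetical function written for this illustration and is not TyORM's own code.

```python
import sqlite3


def build_select(table, columns=None, where=None, limit=None, order_by=None):
    """Return (sql, parameters) for a SELECT; a sketch, not TyORM's code."""
    sql = "SELECT %s FROM %s" % (", ".join(columns) if columns else "*", table)
    parameters = []
    if isinstance(where, (list, tuple)):
        clause, values = where  # ["col = ?", [value]] form for untrusted input
        sql += " WHERE " + clause
        parameters.extend(values)
    elif where:
        sql += " WHERE " + where  # plain string: only for fixed, trusted clauses
    if order_by:
        sql += " ORDER BY " + order_by
    if limit is not None:
        sql += " LIMIT ?"
        parameters.append(int(limit))
    return sql, parameters


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE test_table (data_field_name text)")
connection.execute("INSERT INTO test_table VALUES ('foobar')")

# A hostile value is bound as data, so it can never change the SQL itself.
user_input = "foobar' OR '1'='1"
sql, parameters = build_select(
    "test_table", ["data_field_name"], ["data_field_name = ?", [user_input]], 5
)
rows = connection.execute(sql, parameters).fetchall()
```

Had the same value been pasted into a plain string clause, the `OR '1'='1'` fragment would have matched every row; with placeholders the query matches nothing.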

#### Updating

Use the `update` method to update data in the database. Simply `select` some
data, change the returned dictionary to reflect the values you want stored,
and the ORM will take care of the rest via some "hidden" values.

See the following example `update` call for reference:

```python
ORM = TyORM("midas_hids.db")

data = ORM.select("test_table", "*", "data_field_name = 'foobar123'", 1)
data['data_field_name'] = 'newname123'

ORM.update(data)
```

#### Deleting

Use the `delete` method to delete data from the database. Simply `select` some
data and call the `delete` method and the ORM will take care of the rest.

See the following example `delete` call for reference:

```python
ORM = TyORM("midas_hids.db")

data = ORM.select("test_table", "*", "data_field_name = 'foobar123'", 1)

ORM.delete(data)
```

#### Initializing tables and dynamic ALTERs

One of the strengths of TyORM is its ability to dynamically ALTER a table if
the table's schema doesn't match the table definition dictionary.
Simply call the `initialize_table` method before operating on the table.
The `initialize_table` method takes two arguments: the table name and the table
definition. The `initialize_table` method will create the table if it doesn't
exist and it will alter the table if any new columns have been added.

See the following example code for reference:

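```python
# the second argument is a table definition, not row data
test_table = {
    "data_field_name" : {"type" : "text", "nullable" : False},
    "other_data_field" : {"type" : "text", "default" : "NULL"},
}

ORM = TyORM("midas_hids.db")
ORM.initialize_table("test_table", test_table)

# operate on the ORM here
```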

Due to limitations of SQLite, this only supports columns that are added, not
columns that are removed.
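
The add-only behaviour can be sketched with plain `sqlite3`: read the existing columns with `PRAGMA table_info`, then issue `ALTER TABLE ... ADD COLUMN` for anything the definition has that the table lacks. `add_missing_columns` is a hypothetical name, and this is an illustration of the technique rather than TyORM's actual code.

```python
import sqlite3


def add_missing_columns(connection, table_name, definition):
    """Add columns that are in definition but missing from the table.

    Table and column names are assumed to come from trusted code.
    """
    existing = {
        row[1]  # the second field of each table_info row is the column name
        for row in connection.execute("PRAGMA table_info(%s)" % table_name)
    }
    added = []
    for name, attributes in definition.items():
        if name in existing:
            continue
        # Only the type is applied here; NOT NULL would also need a default.
        connection.execute('ALTER TABLE %s ADD COLUMN "%s" %s' % (
            table_name, name, attributes.get("type", "text")))
        added.append(name)
    return added


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE test_table (id INTEGER PRIMARY KEY, data_field_name text)")
definition = {
    "data_field_name": {"type": "text"},
    "other_data_field": {"type": "text"},
}
added = add_missing_columns(connection, "test_table", definition)
```

Columns that exist in the table but not in the definition are simply left alone, which matches the add-only limitation described above.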

### Host-based analytics

The file `lib/data_science.py` is used for simple host-based data aggregation.
The `DataScience` class is used to query the database and log any changes,
given a new dataset. Using `data_science` is very simple. The class constructor
takes three arguments:

- a TyORM object that is already instantiated with the database which is to be
  operated on
- a dataset
- a table name that the dataset should be compared against

The dataset should be an array of dictionaries. Each item in the array should
be a dictionary where each key represents a column in the
database and each value is the value for that column. It's
OK if some columns of the table are not in the dictionary, but the `name`
column should always be present. Although TyORM has its own transparent
primary key system using the `id` column, for the sake of `data_science`, the
`name` column should be present and it should be unique. The `data_science`
code will then select all of the data from the given table and compare it
against the supplied dataset, printing out log lines illustrating all data that
has been added, removed and changed.
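
The comparison itself amounts to a keyed diff on the `name` column. The following sketch shows the idea on plain lists of dictionaries; `diff_by_name` and the log line format are hypothetical and do not reproduce what `DataScience` prints.

```python
def diff_by_name(stored_rows, dataset):
    """Compare two lists of row dictionaries keyed on the unique name column."""
    stored = {row["name"]: row for row in stored_rows}
    current = {row["name"]: row for row in dataset}
    log_lines = []
    for name in sorted(current):
        if name not in stored:
            log_lines.append("added: %s" % name)
            continue
        # Columns missing from the dataset are ignored, as the text allows.
        for column, value in sorted(current[name].items()):
            if column in stored[name] and stored[name][column] != value:
                log_lines.append("changed: %s %s %r -> %r" % (
                    name, column, stored[name][column], value))
    for name in sorted(stored):
        if name not in current:
            log_lines.append("removed: %s" % name)
    return log_lines


stored_rows = [
    {"name": "com.example.a", "hash": "aaa"},
    {"name": "com.example.b", "hash": "bbb"},
]
dataset = [
    {"name": "com.example.a", "hash": "zzz"},
    {"name": "com.example.c", "hash": "ccc"},
]
log_lines = diff_by_name(stored_rows, dataset)
```

Because rows are matched by `name`, two entries sharing a name would silently overwrite each other in the lookup dictionaries, which is why the column needs to be unique.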

### Decorators

The file `midas/lib/decorators.py` contains a number of decorators that can be
used for a variety of things, but currently they mostly control how frequently
code is executed.
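
As an illustration of what an execution-frequency decorator can look like, here is a minimal sketch. The name `run_every` and its behaviour are assumptions and are not taken from `midas/lib/decorators.py`. It keeps its state in memory only; since a scheduled MIDAS run is a fresh process each time, a real implementation would need to persist the last-run time somewhere, for example in the local datastore.

```python
import functools
import time


def run_every(seconds):
    """Skip calls made less than `seconds` after the last real run."""
    def decorator(function):
        last_run = [None]  # a list so the inner function can update it

        @functools.wraps(function)
        def wrapper(*args, **kwargs):
            now = time.monotonic()
            if last_run[0] is not None and now - last_run[0] < seconds:
                return None  # called too soon; do nothing
            last_run[0] = now
            return function(*args, **kwargs)
        return wrapper
    return decorator


@run_every(3600)
def hash_kernel_extensions():
    # Placeholder body; an expensive check would go here.
    return "hashed"
```

The first call to `hash_kernel_extensions()` runs the body, and any further call within the hour returns `None` without doing the work.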

### Property List parsing

This is the utility that MIDAS uses to operate on property list files such
as LaunchAgents and LaunchDaemons. This is mostly the
[biplist](https://github.com/wooster/biplist) Python module; however, you
should never actually call any of the biplist functions directly.

The `read_plist` function is what you should call when trying to read a plist.
When you call `read_plist`, it first tries to use biplist's `readPlist`. That
code determines if the plist is a binary plist or an XML plist. If it is an XML
plist, it just uses Python's `plistlib` to read the plist. If it is a binary
plist, it uses a native Python implementation to parse it and return its
contents. If that fails for whatever reason, the `read_plist` function will
then try to shell out to `plutil` to parse the plist.

The `get_plist_key` function takes a plist and a key as input. It returns the
value for the key (if the key exists) or `False` if it does not. This is so that, when
operating on property lists, you don't have to roll your own exception handling
on every access.

`read_plist` and `get_plist_key` are the only two functions that should be
called from this file.
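
The calling pattern can be sketched as follows. This is a simplified stand-in and not the MIDAS code: it swaps biplist and the `plutil` fallback for the Python 3 standard library's `plistlib.load`, which reads both XML and binary plists. The function names match the ones described above, but the bodies, and the choice to return `False` from `read_plist` on failure, are assumptions.

```python
import plistlib


def read_plist(path):
    """Return the parsed plist at path, or False if it cannot be read."""
    try:
        with open(path, "rb") as plist_file:
            return plistlib.load(plist_file)  # handles XML and binary plists
    except Exception:
        # Broad on purpose: missing file, bad XML, or corrupt binary data.
        return False


def get_plist_key(plist, key):
    """Return plist[key] if it exists, otherwise False."""
    try:
        return plist[key]
    except (KeyError, IndexError, TypeError):
        return False
```

One consequence of returning `False` for a missing key is that a key whose stored value is literally `False` looks the same as an absent one, so callers that care about the difference need an explicit membership test.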

Customization
-------------

A MIDAS deployment in an organization typically follows these steps:

- Create a private fork of MIDAS
- Add modules and helpers that implement instrumentation specific to the environment
- Deploy the code (with updated modules) to OS X endpoints in the organization
- Set a crontab/LaunchAgent that executes MIDAS at a set interval
- Use syslog/a log aggregation mechanism to forward the logs to a central logging
  host
- Analyze the collected data and create alerts for anomalies
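
For the scheduling step, a launchd job definition can be generated with the standard `plistlib` module. The sketch below is only an example: the label `com.example.midas`, the interpreter path, the install path `/opt/midas/launcher.py`, and the one-hour interval are all hypothetical values to replace with your own.

```python
import plistlib

# Hypothetical values: label, paths, and interval are illustrative only.
launch_agent = {
    "Label": "com.example.midas",
    "ProgramArguments": ["/usr/bin/python", "/opt/midas/launcher.py"],
    "StartInterval": 3600,  # seconds between runs
    "RunAtLoad": True,      # also run once when the job is loaded
}

# Serialize to the XML plist format that launchd reads.
plist_bytes = plistlib.dumps(launch_agent)
```

Writing `plist_bytes` to a file such as `~/Library/LaunchAgents/com.example.midas.plist` and loading it with `launchctl` would then run the launcher on that interval.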

Contributors
------------

+ __Mike Arpaia__ ([@mikearpaia](https://twitter.com/mikearpaia))
+ __Chris Biettchert__ ([@chrisbiettchert](https://twitter.com/chrisbiettchert))
+ __Ben Hughes__ ([@benjammingh](https://twitter.com/benjammingh))
+ __Zane Lackey__ ([@zanelackey](https://twitter.com/zanelackey))
+ __mimeframe__ ([@mimeframe](https://twitter.com/mimeframe))