PyORAM
======

.. image:: https://travis-ci.org/ghackebeil/PyORAM.svg?branch=master
   :target: https://travis-ci.org/ghackebeil/PyORAM

.. image:: https://ci.appveyor.com/api/projects/status/1tpnf7fr0qthrwxx/branch/master?svg=true
   :target: https://ci.appveyor.com/project/ghackebeil/PyORAM?branch=master

.. image:: https://codecov.io/github/ghackebeil/PyORAM/coverage.svg?branch=master
   :target: https://codecov.io/github/ghackebeil/PyORAM?branch=master

.. image:: https://img.shields.io/pypi/v/PyORAM.svg
   :target: https://pypi.python.org/pypi/PyORAM/

Python-based Oblivious RAM (PyORAM) is a collection of
Oblivious RAM algorithms implemented in Python. This package
serves to enable rapid prototyping and testing of new ORAM
algorithms and ORAM-based applications tailored for the
cloud-storage setting. PyORAM is written to support as many
Python versions as possible, including Python 2.7+, Python
3.4+, and PyPy 2.6+.

This software is copyright (c) by Gabriel A. Hackebeil.

This software is released under the MIT software license.
This license, including disclaimer, is available in the 'LICENSE' file.

This work was funded by the Privacy Enhancing Technologies
project under the guidance of Professor Attila Yavuz at
Oregon State University.

Why Python?
-----------

This project is meant for research. It is provided mainly as
a tool for other researchers studying the applicability of
ORAM to the cloud-storage setting. In such a setting, we
observe that network latency far outweighs any overhead
introduced from switching to an interpreted language such as
Python (as opposed to C++ or Java). Thus, our hope is that
by providing a Python-based library of ORAM tools, we will
enable researchers to spend more time prototyping new and
interesting ORAM applications and less time fighting with a
compiler or chasing down segmentation faults.

Installation
------------

To install the latest release of PyORAM, simply execute::

    $ pip install PyORAM

To install the trunk version of PyORAM, first clone the repository::

    $ git clone https://github.com/ghackebeil/PyORAM.git

Next, enter the directory where PyORAM has been cloned and run setup::

    $ python setup.py install

If you are a developer, you should instead install using::

    $ pip install -e .
    $ pip install nose2 unittest2

Installation Tips
-----------------

* OS X users are encouraged to use the Homebrew version of
  Python 2 or Python 3. If you must use the default system
  Python, the best approach is to create a virtual
  environment and install PyORAM into it. Creating a virtual
  environment stored in the PyORAM directory looks something
  like::

      $ sudo pip install virtualenv
      $ cd <PyORAM-directory>
      $ virtualenv local_python2.7

  If you have already attempted to install PyORAM into the
  system Python and encountered errors, it may be necessary
  to delete the directories :code:`build` and :code:`dist`
  from the current directory using the command::

      $ sudo rm -rf build dist

  Once the virtual environment has been created, you can
  *activate* it using the command::

      $ . local_python2.7/bin/activate

  Then, proceed with the normal installation steps to
  install PyORAM into this environment. Note that you must
  *activate* this environment each time you open a new
  terminal if PyORAM is installed in this way. Also note
  that the :code:`sudo` command is no longer necessary (and
  should be avoided) once a virtual environment is activated
  in the current shell.

* If you have trouble installing the cryptography package
  on OS X with PyPy, see the related discussion on
  Stack Overflow.

* If you encounter the dreaded "unable to find
  vcvarsall.bat" error when installing packages with C
  extensions through pip on Windows, the fix depends on
  installing a C compiler that matches your Python build.

Tools Available (So Far)
------------------------

Encrypted block storage
~~~~~~~~~~~~~~~~~~~~~~~

* The basic building block for any ORAM implementation.

* Available storage interfaces include:

  - local storage using a file, a memory-mapped file, or RAM

    + Dropbox

  - cloud storage using SFTP (requires SSH access to a server)

    + Amazon EC2

    + Microsoft Azure

    + Google Cloud Platform

  - cloud storage using Amazon Simple Storage Service (S3)

* See Examples:

  - examples/encrypted_storage_ram.py

  - examples/encrypted_storage_mmap.py

  - examples/encrypted_storage_file.py

  - examples/encrypted_storage_sftp.py

  - examples/encrypted_storage_s3.py

Path ORAM
~~~~~~~~~

* Reference: Stefanov et al., "Path ORAM: An Extremely
  Simple Oblivious RAM Protocol"

* Generalized to work over k-ary storage heaps. Default
  settings use a binary storage heap and a bucket size
  parameter of 4. Using a k-ary storage heap can reduce
  the access cost; however, stash size behavior has not been
  formally analyzed in this setting.

* Tree-top caching can be used to reduce data transmission
  per access as well as reduce access latency by exploiting
  parallelism across independent sub-heaps below the last
  cached heap level.

* See Examples:

  - examples/path_oram_ram.py

  - examples/path_oram_mmap.py

  - examples/path_oram_file.py

  - examples/path_oram_sftp.py

  - examples/path_oram_s3.py
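
To make the effect of a k-ary storage heap on access cost concrete, the following is a minimal sketch that assumes a standard array-based heap layout (root at index 0, parent of bucket ``i`` at ``(i - 1) // k``). The function names are hypothetical and are not part of the PyORAM API; PyORAM's internal layout may differ. The block count is a rough illustration of how path length shrinks as ``k`` grows, not a statement about stash behavior.

```python
def heap_path(leaf_bucket, k=2):
    """Return bucket indices from the root (0) down to leaf_bucket.

    Assumes an array-based k-ary heap where the parent of bucket i
    is (i - 1) // k. This layout is an assumption for illustration.
    """
    path = [leaf_bucket]
    while path[-1] > 0:
        path.append((path[-1] - 1) // k)
    path.reverse()
    return path


def heap_levels(leaf_count, k=2):
    """Number of levels in a full k-ary heap with at least leaf_count leaves."""
    levels, leaves = 1, 1
    while leaves < leaf_count:
        leaves *= k
        levels += 1
    return levels


def blocks_per_path(leaf_count, k=2, bucket_size=4):
    """Blocks on one root-to-leaf path (what a single access must read)."""
    return heap_levels(leaf_count, k) * bucket_size


binary_path = heap_path(11, k=2)
binary_cost = blocks_per_path(1024, k=2, bucket_size=4)
quaternary_cost = blocks_per_path(1024, k=4, bucket_size=4)
```

With 1024 leaves and a bucket size of 4, the binary heap in this sketch has 11 levels and the 4-ary heap has 6, so a path touches fewer blocks in the 4-ary case.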

Performance Tips
----------------

Setup Storage Locally
~~~~~~~~~~~~~~~~~~~~~

Storage schemes such as BlockStorageFile ("file"), BlockStorageMMap
("mmap"), BlockStorageRAM ("ram"), and BlockStorageSFTP ("sftp") all
employ the same underlying storage format. Thus, an oblivious storage
scheme can be initialized locally and then transferred to an external
storage location and accessed via BlockStorageSFTP using SSH login
credentials. See the following pair of files for an example of this:

* examples/path_oram_sftp_setup.py

* examples/path_oram_sftp_test.py

BlockStorageS3 ("s3") employs a different format in which
each underlying block is stored as a separate "file" object.
This design is necessary because the Amazon S3 API does not
allow modifications to a specific byte range within an
object; instead, the entire modified object must be
re-uploaded. Thus, any efficient block storage scheme on S3
must use a separate object for each block.
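
The difference between the two layouts can be sketched as follows. This is an illustration only (Python 3): the ``ObjectStore`` class is a hypothetical in-memory stand-in for an object store that only supports whole-object replacement, the key naming is made up, and neither part reflects PyORAM's actual on-disk format or the real S3 client API.

```python
import os
import tempfile

BLOCK_SIZE = 16  # bytes per block (illustrative value)


def write_block_in_file(path, index, data):
    """Overwrite one block in place by seeking to its byte offset."""
    assert len(data) == BLOCK_SIZE
    with open(path, "r+b") as f:
        f.seek(index * BLOCK_SIZE)
        f.write(data)


def read_block_in_file(path, index):
    """Read one block from its byte offset."""
    with open(path, "rb") as f:
        f.seek(index * BLOCK_SIZE)
        return f.read(BLOCK_SIZE)


class ObjectStore(object):
    """Hypothetical stand-in: objects can only be replaced whole."""

    def __init__(self):
        self._objects = {}
        self.bytes_uploaded = 0

    def put(self, key, data):
        self._objects[key] = bytes(data)
        self.bytes_uploaded += len(data)

    def get(self, key):
        return self._objects[key]


def block_key(index):
    return "block/%d" % index  # hypothetical naming scheme


block_count = 8

# Single-file layout: a block update touches only BLOCK_SIZE bytes.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "wb") as f:
    f.write(bytes(BLOCK_SIZE * block_count))
write_block_in_file(path, 3, b"A" * BLOCK_SIZE)
file_block = read_block_in_file(path, 3)
file_size = os.path.getsize(path)
os.remove(path)

# One-object-per-block layout: an update re-uploads one small object
# rather than one large object holding every block.
store = ObjectStore()
for i in range(block_count):
    store.put(block_key(i), bytes(BLOCK_SIZE))
setup_bytes = store.bytes_uploaded
store.put(block_key(3), b"B" * BLOCK_SIZE)
update_bytes = store.bytes_uploaded - setup_bytes
```

Had all blocks been packed into one object, the same update would have required uploading ``BLOCK_SIZE * block_count`` bytes instead of ``BLOCK_SIZE``.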

Tree-Top Caching
~~~~~~~~~~~~~~~~

For schemes that employ a storage heap (such as Path ORAM),
tree-top caching provides the ability to parallelize I/O
operations across the independent sub-heaps below the last
cached heap level. The default behavior of this
implementation of Path ORAM, for instance, caches the top
three levels of the storage heap in RAM, which creates eight
independent sub-heaps across which write operations can be
asynchronous.
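
The sub-heap count follows directly from the heap shape: caching the top ``n`` levels of a k-ary heap leaves ``k ** n`` independent sub-heaps rooted at the first uncached level. A small sketch (the function name is hypothetical, not part of PyORAM):

```python
def sub_heap_count(cached_levels, k=2):
    """Independent sub-heaps rooted just below the cached levels."""
    return k ** cached_levels


default_sub_heaps = sub_heap_count(3)       # binary heap, 3 cached levels
four_level_sub_heaps = sub_heap_count(4)    # one more cached level
ternary_sub_heaps = sub_heap_count(3, k=3)  # same depth, wider heap
```

For a binary heap this gives 8 sub-heaps at 3 cached levels and 16 at 4 cached levels.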

If the underlying storage is being accessed through SFTP, the
tree-top cached storage heap will attempt to open an
independent SFTP session for each sub-heap using the same
SSH connection. Typically, the maximum number of allowable
sessions associated with a single SSH connection is limited
by the SSH server. For instance, the default maximum number
of sessions allowed by a server using OpenSSH is 10. Thus,
increasing the number of cached levels beyond 3 when using
a binary storage heap will attempt to generate 16 or more SFTP
sessions and result in an error such as::

    paramiko.ssh_exception.ChannelException: (1, 'Administratively prohibited')

There are two options for avoiding this error:

1. If you have administrative privileges on the server, you
   can increase the maximum number of allowed sessions for a
   single SSH connection. For example, to set the maximum
   allowed sessions to 128 on a server using OpenSSH, one
   would set::

       MaxSessions 128

   in :code:`/etc/ssh/sshd_config`, and then run the
   command :code:`sudo service ssh restart`.

2. You can limit the number of concurrent devices that will
   be created by setting the concurrency level to something
   below the last cached level using the
   :code:`concurrency_level` keyword. For example, the
   settings :code:`cached_levels=5` and
   :code:`concurrency_level=0` would cache the top 5 levels
   of the storage heap locally, but all external I/O
   operations would take place through a single storage
   device (e.g., using 1 SFTP session).
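
For the second option, a helper like the one below can pick a concurrency level that stays within the server's session limit. It is a sketch under an assumption: that the number of storage devices opened is ``k ** concurrency_level``, which matches the two cases described above (level 0 gives one device; a level equal to 3 cached levels on a binary heap gives eight). The helper names are hypothetical and not part of PyORAM; only the :code:`cached_levels` and :code:`concurrency_level` keywords come from the text.

```python
def device_count(concurrency_level, k=2):
    """Assumed number of storage devices (e.g., SFTP sessions) opened."""
    return k ** concurrency_level


def max_concurrency_level(cached_levels, max_sessions, k=2):
    """Largest level <= cached_levels whose device count fits max_sessions."""
    level = 0
    while (level < cached_levels
           and device_count(level + 1, k) <= max_sessions):
        level += 1
    return level


# OpenSSH default limit of 10 sessions, 5 cached levels, binary heap.
level_default_ssh = max_concurrency_level(cached_levels=5, max_sessions=10)

# After raising MaxSessions to 128, all 5 levels can be used.
level_raised_limit = max_concurrency_level(cached_levels=5, max_sessions=128)

settings = {"cached_levels": 5, "concurrency_level": level_default_ssh}
```

Under this assumption, a 10-session limit allows a concurrency level of 3 (8 sessions), since level 4 would require 16.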