https://github.com/jni/streaming-talk
Resources for a talk about streaming data analysis in Python
- Host: GitHub
- URL: https://github.com/jni/streaming-talk
- Owner: jni
- Created: 2015-05-02T08:17:29.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2015-08-29T12:36:37.000Z (about 9 years ago)
- Last Synced: 2024-10-05T00:31:48.939Z (about 2 months ago)
- Language: Python
- Size: 502 KB
- Stars: 15
- Watchers: 4
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.markdown
README
# Notes from a talk about streaming programming patterns
### IPython notebook from EuroSciPy 2015
`'Big data in little laptop.ipynb'` contains the IPython notebook used.
Download the Drosophila genome file from
[here](http://hgdownload.soe.ucsc.edu/goldenPath/dm6/bigZips/dm6.fa.gz),
unzip it, and place it in the `data` directory.

You also need to download and `python setup.py install` IPython Memory
Usage from [here](https://github.com/ianozsvald/ipython_memory_usage).

Then, if you use Anaconda and Python 3.4, it should just work!
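
If you would rather script the genome download than do it by hand, something like the following should work. It is only a sketch: the helper names `gunzip` and `fetch_genome` are made up for illustration and are not part of this repository, and the output name `dm6.fa` is simply the archive name with `.gz` removed. Both the download and the decompression are copied in chunks, so the file never has to fit in memory.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path

# URL taken from the download link above.
GENOME_URL = "http://hgdownload.soe.ucsc.edu/goldenPath/dm6/bigZips/dm6.fa.gz"


def gunzip(src, dest, chunk_size=1 << 20):
    """Decompress the gzip file `src` into `dest`, one chunk at a time."""
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
    return Path(dest)


def fetch_genome(url=GENOME_URL, data_dir="data"):
    """Download the archive into `data_dir` (hypothetical helper) and unzip it."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    archive = data_dir / url.rsplit("/", 1)[-1]
    if not archive.exists():  # skip the download if the archive is already there
        with urllib.request.urlopen(url) as response, open(archive, "wb") as out:
            shutil.copyfileobj(response, out)
    # "dm6.fa.gz" -> "dm6.fa"
    return gunzip(archive, archive.with_suffix(""))


# Usage (downloads a large file, so it is not run automatically):
# fetch_genome()
```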
### (Given at Melbourne Python Users Group meeting 2015-05-04)
`notes.markdown` is the (enhanced post-talk) script I was following. You should
be able to run it using [notedown](https://github.com/aaren/notedown).

Use [conda](http://conda.pydata.org/docs/index.html) to recreate the
environment I used for the demo. You can use the
[`conda env`](http://conda.pydata.org/docs/commands/env/conda-env-create.html)
command with the environment descriptor file `environment.yml`.

`session.py` is the log of the IPython session I did during the talk, warts and
all.

`nxeuler.py` contains some helper functions for the genome assembly example,
which I didn't get to in the talk but which is in the notes.

`iris.csv` and `sample.fasta` are the sample datasets used in various parts of
the notebook.
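
To give a flavour of the streaming style the talk is about, here is a minimal sketch of a generator pipeline over a FASTA file. It is not code from the notebook or the notes: the function names are invented for this example, and the path in the usage comment is an assumption about where you keep `sample.fasta`. Each stage pulls one line at a time from the stage before it, so memory use stays roughly constant however large the file is.

```python
from collections import Counter


def read_lines(path):
    """Yield the lines of a text file one at a time, without the newline."""
    with open(path) as fin:
        for line in fin:
            yield line.rstrip("\n")


def sequence_lines(lines):
    """Drop FASTA header lines (starting with '>') and blank lines."""
    for line in lines:
        if line and not line.startswith(">"):
            yield line


def count_bases(lines):
    """Consume the stream and count each character, case-insensitively."""
    counts = Counter()
    for line in lines:
        counts.update(line.upper())
    return counts


# Usage (the path is an assumption; adjust it to your checkout):
# counts = count_bases(sequence_lines(read_lines("sample.fasta")))
```

Nothing is read from disk until `count_bases` starts iterating, which is what lets the same pipeline run on the full genome file as well as on the small sample.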