Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/eiriniar/CellCnn
Representation Learning for detection of phenotype-associated cell subsets
https://github.com/eiriniar/CellCnn
Last synced: 2 months ago
JSON representation
Representation Learning for detection of phenotype-associated cell subsets
- Host: GitHub
- URL: https://github.com/eiriniar/CellCnn
- Owner: eiriniar
- License: gpl-3.0
- Created: 2016-02-15T14:49:10.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2020-10-24T15:29:23.000Z (about 4 years ago)
- Last Synced: 2024-08-02T16:44:09.755Z (6 months ago)
- Language: Jupyter Notebook
- Homepage: http://www.imsb.ethz.ch/research/claassen/Software/cellcnn.html
- Size: 78.5 MB
- Stars: 64
- Watchers: 10
- Forks: 28
- Open Issues: 4
-
Metadata Files:
- Readme: README.rst
- License: LICENSE.txt
Awesome Lists containing this project
- awesome-single-cell - CellCNN - [Python] - Representation Learning for detection of phenotype-associated cell subsets (Software packages / RNA-seq)
README
=======
CellCnn
=======Installation
============CellCnn was originally written in Python2.7. For a Python3 version, please check out this branch:
https://github.com/eiriniar/CellCnn/tree/python3There are several ways to run Python, but we recommend using a virtual environment.
To set up a virtual environment, you can perform the following steps:1. Download the Python2.7 installation script corresponding to your operating system from https://conda.io/miniconda.html . For example, for Mac OS it should be called "Miniconda2-latest-MacOSX-x86_64.sh".
2. Run the installation script (please use the script name corresponding to your operating system):
``bash Miniconda2-latest-MacOSX-x86_64.sh``3. Open a new terminal and create a virtual environment for CellCnn, e.g. "cellcnn_env":
``conda create --name cellcnn_env python=2.7``4. Activate the virtual environment:
``source activate cellcnn_env``----
After having Python2.7 running on your system, please do the following to install CellCnn:
1. Clone the CellCnn repository:
``git clone https://github.com/eiriniar/CellCnn.git``2. Install the CellCnn dependencies:
``pip install -r https://raw.githubusercontent.com/eiriniar/CellCnn/master/requirements.txt``3. To install CellCnn, run the following command after replacing ``path_to_CellCnn`` with the actual path in your system:
``pip install -e path_to_CellCnn/CellCnn``----
Changed in **version v0.2**: we now use the lightweight ``flowIO`` package for reading mass/flow cytometry data.
Thanks to the package author Scott White for pointing out this possibility!Usage
=====Examples are provided in the subfolder ``CellCnn/cellCnn/examples``.
----
Alternatively, for the analysis of mass/flow cytometry samples, CellCnn can be run from the command line.
To get a list of command line options please run:``python run_analysis.py --help``
For a CellCnn analysis with default settings only two arguments have to be provided:
``python run_analysis.py -f fcs_samples_with_labels.csv -m markers.csv``
| The first input argument is a two-column CSV file, where the first column specifies input sample filenames and the second column the corresponding class labels. An example file is provided in ``CellCnn/cellCnn/examples/NK_fcs_samples_with_labels.csv``.
| The second input argument is a CSV file containing the names of markers/channels that should be used for the analysis. An example file is provided in ``CellCnn/cellCnn/examples/NK_markers.csv``.For example, to perform the analysis outlined in ``CellCnn/cellCnn/examples/NK_cell.ipynb``
from the command line, you can run the following (assuming your current directory is ``CellCnn/cellCnn/examples``):``python ../run_analysis.py -f NK_fcs_samples_with_labels.csv -m NK_markers.csv -i NK_cell_dataset/gated_NK/ -o outdir_NK --export_csv --group_a CMV- --group_b CMV+ --verbose 0``
The above command performs a binary classification CellCnn analysis, exports the learned filter weights as CSV files in the directory ``outdir_NK/exported_filter_weights`` and generates result plots in ``outdir_NK/plots``. The following plots are generated:
filter_plots
""""""""""""- clustered_filter_weights.pdf :
Filter weight vectors from all trained networks that pass a validation accuracy
threshold, grouped in clusters via hierarchical clustering. Each row corresponds to
a filter. The last column(s) indicate the weight(s) connecting each filter to the output
class(es). Indices on the y-axis indicate the filter cluster memberships, as a
result of the hierarchical clustering procedure.
- consensus_filter_weights.pdf :
One representative filter per cluster is chosen (the filter with minimum distance to all
other memebers of the cluster). We call these selected filters "consensus filters".
- best_net_weights.pdf :
Filter weight vectors of the network that achieved the highest validation accuracy.
- filter_response_differences.pdf :
Difference in cell filter response between classes for each consensus filter.
To compute this difference for a filter, we first choose a filter-specific class, that's
the class with highest output weight connection to the filter. Then we compute the
average cell filter response (value after the pooling layer) for validation samples
belonging to the filter-specific class (``v1``) and the average cell filter response
for validation samples not belonging to the filter-specific class (``v0``).
The difference is computed as ``v1 - v0``. For regression problems, we cannot compute
a difference between classes. Instead we compute Kendall's rank correlation coefficient
between the predictions of each individual filter (value after the pooling layer) and
the true response values. This plot helps decide on a cutoff (``filter_diff_thres`` parameter)
for selecting discriminative filters.training_plots
""""""""""""""These plots are generated on the basis of samples used for model training.
- tsne_all_cells.png :
Marker distribution overlaid on t-SNE map.In addition, the following plots are produced for each selected filter (e.g. filter ``i``):
- cdf_filter_i.pdf :
Cumulative distribution function of cell filter response for filter ``i``. This plot
helps decide on a cutoff (``filter_response_thres`` parameter) for selecting the
responding cell population.
- selected_population_distribution_filter_i.pdf :
Histograms of univariate marker expression profiles for the cell population selected by
filter ``i`` vs all cells.
- selected_population_frequencies_filter_i.pdf :
Boxplot of selected cell population frequencies in samples of the different classes, if running a classification problem.
For regression settings, a scatter plot of selected cell population frequencies vs response variable is generated.
- tsne_cell_response_filter_i.png :
Cell filter response overlaid on t-SNE map.
- tsne_selected_cells_filter_i.png :
Marker distribution of selected cell population overlaid on t-SNE map.validation_plots
""""""""""""""""Same as the training_plots, but generated on the basis of samples used for model validation.
----
After performing model training once, you can refine the plots with different cutoff values for the selected filters and
cell populations. Training does not have to be repeated for refining the plots. The pre-computed results can be used with the option ``--load_results``.Another relevant argument is ``--export_selected_cells``, which produces a CSV result file for each input FCS file and stores it in ``outdir/selected_cells``. Rows in the CSV result file correspond to cells in the order found in the FCS input file.
The CSV result file contains two columns per selected filter, the first indicating the cell filter response as a continuous value and the second containing a binary value resulting from thresholding the continuous cell filter response. This later column is an indicator of whether a cell belongs to the cell population selected by a particular filter.``python ../run_analysis.py -f NK_fcs_samples_with_labels.csv -m NK_markers.csv -i NK_cell_dataset/gated_NK/ -o outdir_NK --group_a CMV- --group_b CMV+ --filter_response_thres 0.3 --load_results --export_selected_cells``
Documentation
=============For additional information, CellCnn's documentation is hosted on http://eiriniar.github.io/CellCnn/