{"id":19091981,"url":"https://github.com/allencellmodeling/fov_processing_pipeline","last_synced_at":"2025-02-22T07:27:50.248Z","repository":{"id":91697815,"uuid":"220048346","full_name":"AllenCellModeling/fov_processing_pipeline","owner":"AllenCellModeling","description":null,"archived":false,"fork":false,"pushed_at":"2020-02-20T19:34:54.000Z","size":48179,"stargazers_count":0,"open_issues_count":8,"forks_count":0,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-01-02T23:14:21.768Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AllenCellModeling.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-11-06T17:04:01.000Z","updated_at":"2020-02-04T23:36:44.000Z","dependencies_parsed_at":"2023-03-05T04:30:52.356Z","dependency_job_id":null,"html_url":"https://github.com/AllenCellModeling/fov_processing_pipeline","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AllenCellModeling%2Ffov_processing_pipeline","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AllenCellModeling%2Ffov_processing_pipeline/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AllenCellModeling%2Ffov_processing_pipeline/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AllenCellModeling%2Ffov_processing_pipeline/manifests
","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AllenCellModeling","download_url":"https://codeload.github.com/AllenCellModeling/fov_processing_pipeline/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240139517,"owners_count":19754128,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-09T03:17:43.843Z","updated_at":"2025-02-22T07:27:50.238Z","avatar_url":"https://github.com/AllenCellModeling.png","language":"Python","readme":"# FOV Processing Pipeline\n\n[![Build Status](https://github.com/AllenCellModeling/fov_processing_pipeline/workflows/Build%20Master/badge.svg)](https://github.com/AllenCellModeling/fov_processing_pipeline/actions)\n[![Code Coverage](https://codecov.io/gh/AllenCellModeling/fov_processing_pipeline/branch/master/graph/badge.svg)](https://codecov.io/gh/AllenCellModeling/fov_processing_pipeline)\n\nPipeline tools for high-throughput analysis of AICS Pipeline FOVs\n\n---\n\n## Features\nIt's a data pipeline for Pipeline 4 Data!\nThe toolkit demonstrates a proof of concept for...\n* Accessing files via LabKey\n* Accessing FOV-level and cell-level images and metadata\n* Performing simple quality control tests\n* Making some simple plots for data exploration\nand...\n* Distributed parallelization via Prefect/Dask. 
Wow!\n\nFor more information see [this presentation](https://docs.google.com/presentation/d/13nFQ0KDxBti7Vgont6fcrv0gaE3NaGr-Deb-aNl2xLY/edit?usp=sharing)\n\n## To do\n* ~**Source data from Quilt rather than LabKey**~ \u003cbr\u003e\n* **Store results such as summary statistics with AnnData rather than Pandas dataframes** \u003cbr\u003e\nThe data type and process for producing statistics calculated from this pipeline require flexibility and annotations. The current formatting in Pandas is rigid in the types of data that can be added and does not carry any annotations with it. This can result in column names that are simultaneously lengthy and inadequate to describe the stored statistics. This pipeline would be improved by adopting a tool like AnnData to better document all stored statistics, results and summaries in tables.\n\n\n## If installing *somewhere other than AICS compute-cluster infrastructure* (e.g. your local machine)\n... you will need:\n\n**AICS certificates** to be able to install the required package `lkaccess`. Instructions to set up certs on a macOS machine are as follows:\n\n- Visit http://confluence.corp.alleninstitute.org/display/SF/Certs+on+OS+X\n- Download the three .crt files, open each, set the keychain to System and hit 'Add' to trust\n- Download `pip_conf_setup.sh` to the project directory\n- Install wget: `brew install wget`\n- Run the downloaded setup file: `sudo bash pip_conf_setup.sh`\n\n## Installation\n**(Optional)** Make a conda environment\n```\nconda create --name fov_processing_pipeline python=3.7  \nconda activate fov_processing_pipeline\n```\n\n**Clone the Repo**\n```\ngit clone https://github.com/AllenCellModeling/fov_processing_pipeline.git\ncd fov_processing_pipeline\n```\n\n**Install**  \n```\npip install .[dev]\n```\n\n## If running locally (e.g. 
macOS)\n(do this after the installation)  \nNote: Loading images from the remotely mounted data repository will be much slower than running from AICS compute infrastructure.\n\n**Mount the remote data repository**, which can be done on macOS with \n\n```\nmount_smbfs //\u003cYOUR_USERNAME\u003e@allen/programs/allencell/data ./data/\n```\n\nTo unmount when you're all done:\n\n```\numount ./data/\n```\n\n## Quick start\nTo run the entire pipeline with default settings, start by following the above instructions for installation, environment setup and mounting the data. Then navigate to the repository in a terminal and simply run\n\n```\nfpp_process\n```\n\nIf you want to see all the options do\n```\nfpp_process -h\n```\n\nIf you want to use this in a distributed context, then read the directions [here](./docs/distributed_instructions.md).\n\n\nRunning the pipeline creates an FOV summary table, performs quality control, produces diagnostic images and creates some basic plots of z-intensity profiles of FOV channels for all structures. This runs the pipeline in the default configuration, which trims the data to only 10 FOVs per cell line, and includes only the following cell lines:\n- Nuclear lamin (Lamin B1)\n- Nucleolus DFC (Fibrillarin)\n- Nucleolus GC (Nucleophosmin)\n- Golgi (Sialyltransferase 1)\n- ER (Sec61 beta)\n- Alpha actinin (Alpha actinin 1)\n- Actomyosin bundles (Non-muscle myosin heavy chain IIB)\n- Alpha tubulin\n\nFlags can be used to overwrite existing results if you have previously run the pipeline, or to generate new plots without regenerating the data, using (respectively):\n```\n--overwrite\n--use_current_results\n```\n\n## Description of Software\n\nThe main function to run the code is `fov_processing_pipeline/bin/process.py:main`. The code is run via a Dask/Prefect flow that runs locally by default. \n\n`process.py:main` calls functions from `fov_processing_pipeline/wrappers.py`, each of which performs a specific task. 
An incomplete list of those tasks follows:\n\n### Save and Load Data - wrappers.save_load_data()\n\nThis function returns a dataframe containing all of the FOV information needed for processing.\n\n### Get manifest of all files that are saved out - wrappers.get_save_paths()\n\n### Per-FOV processing operations - wrappers.process_fov_row()\n\n### Gather all FOV results - wrappers.load_stats()\n\n### Quality Control - wrappers.qc_stats()\nThis pipeline includes a couple of simple protocols for quality control of FOV data.\n* Number of z-slices in z-stacks\nNot all FOVs have the same number of z-slices (or xy images) in their z-stacks. This can impact the user's ability to perform some statistical analyses on the data, such as PCA. To correct for this, the pipeline finds the most common number of z-slices, and interpolates z-stacks with more or fewer z-slices into a new z-stack with that number of z-slices.\n* Out-of-order z-slices\nThrough our data exploration using this pipeline, we discovered that some z-stacks have a z-slice from the middle of the z-stack misplaced to the bottom of the z-stack. To address this, the QC processing of this pipeline removes those FOVs from the FOV dataset (i.e. 
they are not included in the FOV summary table used for all image analysis).\n\n### Diagnostics and Analysis - wrappers.stats2plots(), wrappers.im2diagnostics()\nThis pipeline includes some basic image diagnostic and analysis tools to get the user started with exploring the data.\n* Diagnostic images: For a quick view of FOV z-stacks, maximum projections in the xy-, xz- and yz-planes are rendered in a single image for each z-stack, with the channels shown in different colors\n* Channel intensity by z-depth: To display how structure varies across height within a z-stack, an average intensity profile as a function of z is generated for each channel, for each structure; that is, for all FOVs with the same labeled structure, the brightfield, DNA, cell membrane, and structure intensities are averaged across all FOVs at each z-height, and plotted. These intensity profiles may be plotted against the actual z-height, or can be centered relative to the position of the DNA intensity maximum\n \n## Documentation\nFor full package documentation please visit [AllenCellModeling.github.io/fov_processing_pipeline](https://AllenCellModeling.github.io/fov_processing_pipeline).\n\n## Development\nSee [CONTRIBUTING.md](CONTRIBUTING.md) for information related to developing the code.\n\n***Free software: Allen Institute Software License***\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fallencellmodeling%2Ffov_processing_pipeline","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fallencellmodeling%2Ffov_processing_pipeline","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fallencellmodeling%2Ffov_processing_pipeline/lists"}