{"id":29224557,"url":"https://github.com/sing-group/my-brain-seq","last_synced_at":"2026-02-23T09:49:07.338Z","repository":{"id":88302806,"uuid":"479318166","full_name":"sing-group/my-brain-seq","owner":"sing-group","description":"myBrain-Seq: a Compi pipeline for miRNA-Seq data analysis in neuropsychiatry","archived":false,"fork":false,"pushed_at":"2025-04-16T06:27:12.000Z","size":8767,"stargazers_count":9,"open_issues_count":7,"forks_count":1,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-04-16T08:08:30.824Z","etag":null,"topics":["compi","differential-expression","docker","mirna","mirna-seq","pipeline"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sing-group.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-04-08T09:00:42.000Z","updated_at":"2025-04-16T06:27:16.000Z","dependencies_parsed_at":null,"dependency_job_id":"27cc38de-7e6f-4c6a-8ab9-95d9d5baaa67","html_url":"https://github.com/sing-group/my-brain-seq","commit_stats":null,"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"purl":"pkg:github/sing-group/my-brain-seq","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sing-group%2Fmy-brain-seq","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sing-group%2Fmy-brain-seq/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sing-group%2Fmy-brain-seq/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposito
ries/sing-group%2Fmy-brain-seq/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sing-group","download_url":"https://codeload.github.com/sing-group/my-brain-seq/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sing-group%2Fmy-brain-seq/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263271501,"owners_count":23440396,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["compi","differential-expression","docker","mirna","mirna-seq","pipeline"],"created_at":"2025-07-03T06:07:17.423Z","updated_at":"2026-02-23T09:49:07.292Z","avatar_url":"https://github.com/sing-group.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ctable border=\"0\"\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003cimg src=\"https://raw.githubusercontent.com/sing-group/my-brain-seq/master/resources/docs/mybrain-seq_logo.png\"\n         alt=\"myBrain-Seq logo\"\n         width=\"240px\"\n\t\t style=\"vertical-align: middle; margin-left: 0;\"/\u003e\u003c/td\u003e\n    \u003ctd style=\"vertical-align:middle;\"\u003e\n      \u003cblockquote\u003e\n        \u003cp\u003e\u003cstrong\u003emyBrain-Seq\u003c/strong\u003e is a \n\t\t\t\u003ca href=\"https://www.sing-group.org/compi/\"\u003eCompi\u003c/a\u003e pipeline for miRNA-Seq analysis of neuropsychiatric data. 
A Docker image is available for this pipeline in \u003ca href=\"https://hub.docker.com/r/singgroup/my-brain-seq\"\u003e this Docker Hub repository \u003c/a\u003e.\n\t\t\u003c/p\u003e\n      \u003c/blockquote\u003e\n\t\t\u003cp\u003e\n\t\t\t\u003ca href=\"https://hub.docker.com/r/singgroup/my-brain-seq\"\u003e\n\t\t\t\t\u003cimg src=\"https://img.shields.io/badge/docker-v1.3.0-green\" alt=\"dockerhub\"\u003e\n\t\t\t\u003c/a\u003e \n\t\t\t\u003ca href=\"https://github.com/sing-group/my-brain-seq\"\u003e\n\t\t\t\t\u003cimg src=\"https://img.shields.io/badge/license-MIT-brightgreen\" alt=\"license\"\u003e\n\t\t\t\u003c/a\u003e \n\t\t\t\u003ca href=\"https://hub.docker.com/r/singgroup/my-brain-seq\"\u003e\n\t\t\t\t\u003cimg src=\"https://img.shields.io/badge/hub-docker-blue\" alt=\"dockerhub\"\u003e\n\t\t\t\u003c/a\u003e \n\t\t\t\u003ca href=\"https://www.sing-group.org/compihub/explore/625e719acc1507001943ab7f\"\u003e\n\t\t\t\t\u003cimg src=\"https://img.shields.io/badge/hub-compi-blue\" alt=\"compihub\"\u003e\n\t\t\t\u003c/a\u003e\n\t\t\u003c/p\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n## myBrain-Seq repositories\n\n- [GitHub](https://github.com/sing-group/my-brain-seq)\n- [Docker Hub](https://hub.docker.com/r/singgroup/my-brain-seq)\n- [CompiHub](https://www.sing-group.org/compihub/explore/625e719acc1507001943ab7f)\n\n## Table of contents\n\n * [What does myBrain-Seq do?](#what-does-mybrain-seq-do)\n * [Using the myBrain-Seq image in Linux](#using-the-mybrain-seq-image-in-linux)\n * [Test data](#test-data)\n * [Troubleshooting](#troubleshooting)\n * [Publications](#publications)\n * [For Developers](#for-developers)\n * [Team](#team)\n\n# What does myBrain-Seq do?\n\n**MyBrain-Seq** is a [Compi](https://www.sing-group.org/compi/) pipeline for performing full analyses of miRNA-Seq data, with particular interest on neuropsychiatric data. 
\n\nIt can automatically identify differentially expressed microRNAs (DE miRNAs) between two conditions using two differential expression analysis tools, DESeq2 and EdgeR, and offers an integrated result suitable for experimental validation. Additionally, a functional analysis module puts biological meaning behind the list of DE miRNAs and eases the process of biomarker identification. \n\n\u003cp align=\"center\"\u003e\n\t\u003cimg src=\"https://raw.githubusercontent.com/sing-group/my-brain-seq/master/resources/docs/pipeline_workflow.png\" alt=\"myBrain-Seq workflow\" title=\"myBrain-Seq workflow\" width=\"80%\"/\u003e\n\t\u003c/br\u003e\n\u003c/p\u003e\n\nIts features and analyses are designed and tuned to work with miRNA data. We designed myBrain-Seq with the particularities of neuropsychiatric data in mind. In this way, myBrain-Seq addresses the most common limitations of such data while offering results that help the investigator to identify potential biomarkers and molecular mechanisms for the studied conditions. When more than two conditions are involved, myBrain-Seq facilitates performing all the pairwise comparisons of interest.\n\nA typical analysis with myBrain-Seq comprises the following steps, which are further detailed below:\n- Preprocessing\n- Differential expression analysis\n- Functional analysis\n- Results summarization\n\n### Preprocessing\n\nThis stage prepares the input FastQ files for the differential expression analysis. It comprises:\n\n1. Quality control of the sequences using FastQC.\n2. Trimming of the adapter sequences using Cutadapt (optional).\n3. Alignment to the reference genome with Bowtie. \n4. Format conversion of the Bowtie output files to BAM using Samtools.\n5. Quality control of the alignments with Samtools. \n6. Quantification and annotation with featureCounts.\n\n### Differential expression analysis\n\nOnce preprocessing is complete, myBrain-Seq performs the differential expression analysis. 
This process comprises:\n\n1. Differential expression analysis with DESeq2 (with/without factor correction). \n2. Differential expression analysis with EdgeR (with/without factor correction).\n3. Intersection of the DESeq2 and EdgeR results and averaging of their q-values and fold changes (optional).\n4. Creation of a Venn diagram with the integrated results using VennDiagram.\n5. Creation of a volcano plot with the results using EnhancedVolcano.\n\nIn addition, the user can instruct myBrain-Seq to generate a genome index for the Bowtie alignment; this index will be built in parallel with the preprocessing tasks.\n\n### Functional analysis\n\nAfter the differential expression analysis, myBrain-Seq performs a functional analysis. This process comprises:\n\n1. Hierarchical clustering of the samples using the expression of the DE miRNAs.\n2. Functional enrichment analysis of the DE miRNAs using the DIANA-TarBase and Reactome databases as references.\n3. Creation of a miRNA-target network, expanded using Reactome protein-protein interactions.\n\n### Results summarization\n\nFinally, a single MultiQC report is generated to summarize the quality control, alignment, assignment and quantification results of all the samples. \n\n# Using the myBrain-Seq image in Linux\n\nTo perform a myBrain-Seq analysis users must first:\n\n1. Initialize a working directory with the files required by myBrain-Seq.\n2. Add the data for the analysis (fastQ reads, genomes, contrast files, and so on).\n3. Configure the pipeline parameters.\n\nThis section explains how to perform these steps and describes the tools and scripts included in the myBrain-Seq image that simplify them. \n\n## Running the myBrain-Seq's terminal user interface\n\nSome steps in the preparation of a myBrain-Seq analysis require either adapting and running code on a console or using myBrain-Seq's terminal user interface (*v.console*). 
As the *v.console* can perform several operations, please refer to this section whenever you need to use it. To launch the *v.console*, run the following command on a terminal:\n\n```bash\ndocker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock -v /tmp:/tmp singgroup/my-brain-seq visual_console.sh\n```\n\nAn interactive menu should be displayed in your terminal. \n\n\u003cp align=\"center\"\u003e\n\t\u003cimg src=\"https://raw.githubusercontent.com/sing-group/my-brain-seq/master/resources/docs/vconsole.png\" alt=\"myBrain-Seq visual console\" title=\"myBrain-Seq visual console\" width=\"80%\"/\u003e\n\t\u003c/br\u003e\n\u003c/p\u003e\n\n## Building the directory tree\n\nTo start a new analysis, the first thing to do is build the directory tree in your local file system. This directory tree will be referred to as the **working directory** and its structure is recognized and used by the pipeline during the analysis. \n\nMyBrain-Seq offers two options to generate the working directory: interactively using myBrain-Seq's terminal user interface (*v.console*) or by adapting and running a command in the console.  
\n\n#### Creating the working directory interactively with the v.console\n\nRun the *v.console* (see section \"*Running the myBrain-Seq's terminal user interface*\") and select the option \"Initialize the working-directory\"; then, paste the full path where the \"working-directory\" should be placed and confirm.\n\n#### Creating the working directory with a command\n\nTo build the working directory, adapt the first line of the following code and run it:\n\n```bash\nWORKING_DIRECTORY=/path/to/the/working-directory\ndocker run --rm -v \"${WORKING_DIRECTORY}:${WORKING_DIRECTORY}\" -u \"$(id -u)\":\"$(id -g)\" singgroup/my-brain-seq init_working_dir.sh \"${WORKING_DIRECTORY}\"\n```\n\n#### Structure of the working-directory\n\nAfter completing either of the above options, the selected working-directory (`mbs-project` in the example below) should have the following structure: \n\n```\n/home/user/mbs-project \n\t|-- README.txt\n\t|-- input\n\t|   |-- compi.parameters\n\t|   |-- conditions_file.txt\n\t|   `-- contrast_file.txt\n\t|-- output\n\t`-- run.sh\n```\n\nWhere:\n\n- **README.txt** lists the next steps needed to run the analysis. \n- **compi.parameters** contains the paths and parameters needed for the analysis.\n- **conditions_file.txt** contains the names and conditions of each fastQ file. \n- **contrast_file.txt** contains the names and labels of the conditions to compare in the differential expression analysis.\n- **run.sh** is the script to run the analysis.\n\nThe creation of these files is detailed in the following sections and briefly outlined in the `README.txt` file. You may find it convenient to create additional directories and files within the working directory to group all the data related to a particular study.\n\n## Writing the `compi.parameters` file\n\nThe `compi.parameters` file is used by myBrain-Seq to locate the files needed for the analysis as well as to define which optional tasks will be run. 
Here is an example of a `compi.parameters` file using the working directory created in the previous example:\n\n```\nworkingDir=/path/to/mbs-project\nfastqDir=/path/to/study_1/data/\ngffFile=/path/to/study_1/refs/mirbase_hsa.gff3\nconditions=/path/to/mbs-project/input/conditions_file_study_1.txt\ncontrast=/path/to/mbs-project/input/contrast_file_study_1.txt\nbwtIndex=/path/to/study_1/refs/bowtie-index_GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set\nadapter=TGGAATTCTCGGGTGCCAAGG\norganism=Homo sapiens\n```\n\nThe following are mandatory parameters:\n\n- **workingDir**: the path to the myBrain-Seq working directory of the analysis (see \"Building the directory tree\").\n- **fastqDir**: the path to the directory with the fastQ files to analyse.\n- **gffFile**: the path to the GFF3 file with the miRNA annotations. This file can be obtained from [miRBase](https://www.mirbase.org/ftp.shtml) or [NCBI Genomes](https://www.ncbi.nlm.nih.gov/genome/).\n- **conditions**: the path to `conditions_file.txt`.\n- **contrast**: the path to `contrast_file.txt`.\n- **genome** *(optional if bwtIndex is provided)*: the path to the reference genome in FASTA format from which the Bowtie index will be built.\n- **bwtIndex** *(optional if genome is provided)*: the path to a directory containing a Bowtie index, including the basename of the Bowtie index files. If this parameter is omitted, myBrain-Seq will build a new index using the FASTA genome provided in the genome parameter.\n- **organism**: the organism used in the study. This parameter is used for the functional enrichment analysis and for the network construction. Available organisms are: *Mus musculus, Homo sapiens, Caenorhabditis elegans, Danio rerio, Rattus norvegicus, Gallus gallus, Drosophila melanogaster*.\n\nThese are some useful optional parameters:\n\n- **adapter**: the sequence of the adapter to remove. 
If this parameter is omitted, myBrain-Seq will skip the adapter removal step.\n- **gffFeature**: the name of the feature of the GFF3 file from which the attributes will be obtained; the default value is \"miRNA\".\n- **gffAttribute**: the name of the attribute in the GFF3 file to use in the annotations; the default value is \"Name\".\n- **qvalue**: FDR-corrected p-value threshold used to filter miRNAs after the differential expression analysis; the default value is \"0.05\".\n- **log2FC**: Absolute log2 fold change (log2FC) threshold used to filter miRNAs after the differential expression analysis; the default value is \"0.5\".\n- **distance_method**: Method used to compute distances in the hierarchical clustering step; the default value is \"euclidean\". Available methods are: \"euclidean\", \"maximum\", \"manhattan\", \"canberra\", \"binary\", \"pearson\", \"abspearson\", \"correlation\", \"abscorrelation\", \"spearman\" and \"kendall\".\n\nA full list of the parameters is in the [myBrain-Seq parameters](#mybrain-seq-parameters) section.\n\n## Writing the `conditions_file.txt` file\n\nThe `conditions_file.txt` is a TSV file used by myBrain-Seq to link each fastQ file with its condition. This information will be used to choose the groups of samples to compare in the differential expression analysis. 
Here is an example of a conditions file:\n\n\tname\tcondition\tlabel\t\t\t\t\tsex\t\talcohol\n\tC019 \tcontrol\t\tC_before_treatment\t\tM\t\t0\n\tC020 \tcontrol\t\tC_before_treatment\t\tM\t\t1\n\tC021 \tcontrol\t\tC_after_treatment\t\tF\t\t0\n\tC022\tcontrol\t\tC_after_treatment\t\tF\t\t1\n\tP012D\tFE \t\t\tFE_before_treatment\t\tM\t\t1\n\tP013A\tFE\t\t\tFE_before_treatment\t\tF\t\t0\n\tP014A\tFE\t\t\tFE_after_treatment\t\tM\t\t0\n\tP015D\tFE\t\t\tFE_after_treatment\t\tF\t\t1\n\tP014A\tSEP\t\t\tSEP_before_treatment\tM\t\t1\n\tP015D\tSEP\t\t\tSEP_before_treatment\tF\t\t0\n\nTo obtain a file with a valid format, the following considerations must be taken into account:\n\n- Columns must be separated by single tabulations (the example above is padded with extra tabs only for readability).\n- The first row must be the header: “name”, “condition” and “label”.\n- The first column must contain the root names of the fastQ files (e.g.: C019.fastq --\u003e C019).\n- The second column must contain the conditions.\n- The third column is the label, which is only used so that the user can identify each sample in case there is more than one condition. It has no impact on the analysis result but it must be present.\n- Additional columns with factors can be included. All these factors will be added to the statistical model of the differential expression analysis. Use one factor per column; these columns are optional.\n\n## Writing the `contrast_file.txt` file\n\nThe `contrast_file.txt` is used by myBrain-Seq to perform the comparisons between samples of two different conditions in the differential expression analysis. Each line in this file corresponds to a contrast that myBrain-Seq has to perform. 
Here is an example of a contrast file:\n\n```\nname\n\"Control-First_episode\" = \"C-FE\"\n\"Control-Second_episode\" = \"C-SEP\"\n\"First_episode-Second_episode\" = \"FE-SEP\"\n```\n\nThe first line of `contrast_file.txt` is the header; each following line consists of the contrast label (left side of the equal sign) and the factors to compare (right side of the equal sign). To obtain a file with a valid format, the following considerations must be taken into account:\n\n- The first row should be \"name\", in lowercase.\n- The following rows must follow this structure: double quotes, ***label of the factor to compare***, hyphen, ***label of the reference factor***, double quotes, space, equal sign, space, double quotes, ***factor to compare***, hyphen, ***reference factor***, double quotes. No additional spaces should be added; use the underscore symbol instead (e.g.: *First episode* should be *First_episode*). Here is a visual representation of this structure where \"B\" is the reference factor: `\"Label_A-Label_B\" = \"Factor_A-Factor_B\"`\n- The names of the factors to be compared (right side of the equal sign) must be the same as those specified in the \"condition\" column of the `conditions_file.txt`.\n\n## Running myBrain-Seq analysis\n\nOnce all the required files have been created, start the myBrain-Seq analysis by running the script \"run.sh\" located at the root of the working directory. This can also be done interactively by using the *v.console* (see section \"*Running the myBrain-Seq's terminal user interface*\"). To run the script manually, adapt the following code:\n\n```bash\n/path/to/working-dir/run.sh /path/to/compi.parameters\n```\n\n### Adapting myBrain-Seq execution\n\nMyBrain-Seq accepts some parameters to customize the execution. Using these parameters you can perform partial executions or control the number of parallel processes of the analysis. 
These parameters should be quoted and added at the end of the running command:\n\n- **Start myBrain-Seq execution at a specific task**: `/path/to/working-dir/run.sh /path/to/compi.parameters \"--from task_name\"`\n- **Run myBrain-Seq until a specific task**: `/path/to/working-dir/run.sh /path/to/compi.parameters \"--until task_name\"`\n- **Start myBrain-Seq execution after a specific task**: `/path/to/working-dir/run.sh /path/to/compi.parameters \"--after task_name\"`\n- **Change the number of parallel processes** *(default 5)*: `/path/to/working-dir/run.sh /path/to/compi.parameters \"--num-tasks 2\"`\n\nYou can combine several parameters to gain greater control of the analysis process. Since these parameters are controlled by the Compi framework, please refer to the [Compi manual](https://www.sing-group.org/compi/docs/introduction.html#) for more information.\n\n## Find out tasks with errors\n\nSome tasks may produce errors that do not cause the pipeline to fail but can still be important. Such errors are reported in the log files produced in the `logs` directory of the pipeline working directory. Inside this directory, myBrain-Seq creates one subdirectory per execution, named with the date and time of the analysis. Files containing the errors are saved with the extension `*.err.log`, whereas normal output is saved with the extension `*.out.log`.\n\n# myBrain-Seq parameters\n\nMyBrain-Seq needs the values of some parameters to work, as already indicated in the [*writing the `compi.parameters` file*](#writing-the-compiparameters-file) section. However, optional parameters with default values can also be edited by adding them to the `compi.parameters` file. 
Below is a list of all myBrain-Seq parameters:\n\n\u003ctable class=\"tg\" style=\"undefined;table-layout: fixed; width: 759px\"\u003e\n\u003ccolgroup\u003e\n\u003ccol style=\"width: 200px\"\u003e\n\u003ccol style=\"width: 559px\"\u003e\n\u003c/colgroup\u003e\n\u003cthead\u003e\n  \u003ctr\u003e\n    \u003cth class=\"tg-3mtf\"\u003eParameter\u003c/th\u003e\n    \u003cth class=\"tg-3mtf\"\u003eDescription\u003c/th\u003e\n  \u003c/tr\u003e\n\u003c/thead\u003e\n\u003ctbody\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003eworkingDir\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe working directory of the project.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003efastqDir\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe directory containing the fastq files (default is relative to workingDir).\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003eoutDir\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe directory containing the pipeline outputs (relative to workingDir).\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003eorganism\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe organism from which the data was obtained, needed for the functional enrichment analysis.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003eadapter\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe sequence of the adapter to remove.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003egenome\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe directory path to the genome to align.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003ebwtIndex\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe absolute path to the rootname of the Bowtie index.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd 
class=\"tg-1ot7\"\u003egffFile\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe path to the .gff file of the reference genome.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003egffFeature\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eFeature of the .gff file to use for the annotations (e.g. miRNA, gene, transcript...), default \"miRNA\".\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003egffAttribute\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eAttribute of the .gff to use in the annotations (e.g. Name, gene_id, transcript_id...), default \"Name\".\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003econditions\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe path to the .tsv file with the rootnames of the samples, conditions and labels.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003econtrast\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe path to the .tsv file with the contrast DESeq2 has to perform.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003eqvalue\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eFDR-corrected pvalue used to filter miRNAs after the differential expression analysis.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003elog2FC\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eAbsolute value of the log2FC, used to filter miRNAs after the differential expression analysis.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003edistance_method\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eMethod used to compute distances on the hierarchical clustering step, default euclidean.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003evennFormat\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe file format of 
the Venn diagram (png/svg/tiff), default png.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003efqcOut\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe relative path to the directory containing the FastQC results.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003ectdOut\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe relative path to the directory containing the Cutadapt results.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003ebwtOut\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe relative path to the directory containing the Bowtie results.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003ebamstOut\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe relative path to the directory containing the Samtools stats and Plot-bamstats results.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003eftqOut\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe relative path to the directory containing the FeatureCounts results.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003edsqOut\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe relative path to the directory containing the DESeq2 results.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003eedgOut\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe relative path to the directory containing the EdgeR results.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003edeaIntOut\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe relative path to the directory containing the results of the DESeq2 and EdgeR integration.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003emqcOut\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe relative path to the 
directory containing the MultiQC report.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003escriptsDir\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe relative path to the directory containing the R scripts.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003etestAdapterBashScript\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe relative path to the directory containing the R script to get the path of the aligned/unaligned data.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003edeSeq2Rscript\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe relative path to the directory containing the R script to run the DESeq2 analysis.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003efilterCtsRscript\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe relative path to the directory containing the R script used to filter all-counts.txt and conditions_file.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003eedgerRscript\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe relative path to the directory containing the R script to run the EdgeR analysis.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003eenhancedVolcanoRscript\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe relative path to the directory containing the R script to build the Volcano plot.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003edeaIntRscript\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe relative path to the directory containing the R script to run the DESeq-EdgeR results integration.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003evennRscript\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe relative path to the directory containing the R script to run 
the Venn diagram creation.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003edeseq2NormalizationRscript\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe relative path to the directory containing the R script for the creation of the hclust table.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003ehclustMakeTableRscript\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe relative path to the directory containing the R script for the creation of the hclust table.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003ehclustRscript\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe relative path to the directory containing the R script for the hierarchical clustering analysis.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003efunctionalEnrichmentRscript\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe relative path to the directory containing the R script for the functional enrichment analysis.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003enetworkRscript\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe relative path to the directory containing the R script for the network creation.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003edatabasesDir\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe relative path to the directory containing the TarBase and Reactome databases.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003ebwt_additional_args\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eAdditional arguments to pass to Bowtie.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003eft_additional_args\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eAdditional arguments to pass to FeatureCounts.\u003c/td\u003e\n  
\u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003etarbaseDB\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe relative path to the TarBase file.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003ereactomeDB\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eThe relative path to a Reactome file with Ensembl IDs and Reactome IDs.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003ereactomeInteractionsDB\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eThe relative path to the Reactome file downloaded from \u003ca href=\"https://reactome.org/download/current/interactors/reactome.all_species.interactions.tab-delimited.txt\" target=\"_blank\" rel=\"noopener noreferrer\"\u003ehere\u003c/a\u003e and renamed as ReactomeInteractions.txt.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003erDeseq2Version\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eVersion of the pegi3s/r_deseq2 Docker image to use.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003erEdgerVersion\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eVersion of the pegi3s/r_edger Docker image to use.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003erEnhancedVolcanoVersion\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eVersion of the pegi3s/r_enhanced-volcano Docker image to use.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003ecutadaptVersion\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eVersion of the pegi3s/cutadapt Docker image to use.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003efastqcVersion\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eVersion of the pegi3s/fastqc Docker image to use.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd 
class=\"tg-1ot7\"\u003ebowtieVersion\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eVersion of the pegi3s/bowtie1 Docker image to use.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003efeatureCountsVersion\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eVersion of the pegi3s/feature-counts Docker image to use.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003esamtoolsVersion\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eVersion of the pegi3s/samtools_bcftools Docker image to use.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003esamtoolsBamstatsVersion\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eVersion of the pegi3s/samtools_bcftools Docker image to use for bam analysis.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003erdatanalysisVersion\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eVersion of the pegi3s/r_data-analysis Docker image to use.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003erVennVersion\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eVersion of the pegi3s/r_venn-diagram Docker image to use.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003erNetworkVersion\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eVersion of the pegi3s/r_network Docker image to use.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-v5y2\"\u003emultiqcVersion\u003c/td\u003e\n    \u003ctd class=\"tg-9ika\"\u003eVersion of the pegi3s/multiqc Docker image to use.\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd class=\"tg-1ot7\"\u003eselectDEAsoftware\u003c/td\u003e\n    \u003ctd class=\"tg-azvr\"\u003eUse this param to select the differential expression analysis software (deseq, edger or both).\u003c/td\u003e\n  
\u003c/tr\u003e\n\u003c/tbody\u003e\n\u003c/table\u003e\n\n# Test data\n\nThe sample data is available [here](http://static.sing-group.org/software/myBrainSeq/downloads/test-data-1.0.zip). Download and decompress it to get a directory named `working-dir`, an example of a functional working directory in which the data and biological references are grouped together. It contains:\n\n- A directory called `input`, with the `compi.parameters`, `condition_file.txt` and `contrast_file.txt` of this particular study.\n- A directory called `data`, with the FASTQ files of the study, the Bowtie index and the miRNA annotations.\n\nTo run the pipeline with this test data, edit the `compi.parameters` file (at `/working-dir/input`) and change the paths to match the absolute location of the working directory on your computer (e.g., `workingDir=/working-dir` could be `workingDir=/home/user/working-dir`). Then run the included `run.sh` script.\n\n## Running time\n\n- ≈ 6 minutes - 5 parallel tasks - Ubuntu 20.04.4 LTS, 8 CPUs (Intel® Core™ i7-9700 @ 3.00GHz), 16GB of RAM and SSD disk.\n- ≈ 12 minutes - 5 parallel tasks - Ubuntu 18.04.6 LTS, 8 CPUs (Intel® Core™ i7-8565U @ 1.80GHz), 16GB of RAM and SSD disk.\n\n# Troubleshooting\n\n## Using hard disk drives in Windows formats\n\nThe `build-genome-index` task can fail when a hard disk drive formatted with a Windows file system such as FAT/exFAT is used. 
When using such formats for storing the pipeline data, the following error may appear in the logs:\n\n```\n\u003e [MBS | build-genome-index]: Running bowtie for the genome index creation\n\u003e docker: Error response from daemon: error while creating mount source path '/path/to/analysis/input/genome/Homo_sapiens/bowtie-index_Homo_sapiens': chown /path/to/analysis/input/genome/Homo_sapiens/bowtie-index_Homo_sapiens: operation not permitted.\n```\n\nTo overcome this issue, the folder for the genome index must be created before running the pipeline (`mkdir -p /path/to/analysis/input/genome/Homo_sapiens/bowtie-index_Homo_sapiens`).\n\n# Publications\n\n- Pérez-Rodríguez, D., Agís-Balboa, R. C., \u0026 López-Fernández, H. (2023). [MyBrain-Seq: A Pipeline for MiRNA-Seq Data Analysis in Neuropsychiatric Disorders](https://doi.org/10.3390/biomedicines11041230). Biomedicines, 11(4), Article 4. \n- D. Pérez-Rodríguez; M. Pérez-Rodríguez; R.C. Agís-Balboa; H. López-Fernández (2022) [Towards a flexible and portable workflow for analyzing miRNA-seq neuropsychiatric data: an initial replicability assessment](https://doi.org/10.1007/978-3-031-17024-9_4). 16th International Conference on Practical Applications of Computational Biology \u0026 Bioinformatics: PACBB 2022. L'Aquila, Italy. 13 - July\n\n## Papers using myBrainSeq\n\n- D. Pérez-Rodríguez; M. Arancha Penedo; T. Rivera-Baltanás; T. Peña-Centeno; S. Burkhardt; A. Fischer; J. M. Prieto-González; J. M. Olivares Díez; H. López-Fernández; R. C. Agís-Balboa (2023) [MiRNA differences related to treatment resistant schizophrenia](https://doi.org/10.3390/ijms24031891). International Journal of Molecular Sciences. Volume 24(3), 1891. ISSN: 1422-0067\n\n## Related work\n\n- Pérez-Rodríguez, D., López-Fernández, H., \u0026 Agís-Balboa, R. C. (2021). Application of miRNA-seq in neuropsychiatry: A methodological perspective. Computers in Biology and Medicine, 135, 31-42. 
https://doi.org/10.1016/j.compbiomed.2021.104603\n- Pérez-Rodríguez, D., López-Fernández, H., \u0026 Agís-Balboa, R. C. (2022). On the Reproducibility of MiRNA-Seq Differential Expression Analyses in Neuropsychiatric Diseases. In M. Rocha, F. Fdez-Riverola, M. S. Mohamad, \u0026 R. Casado-Vara (Eds.), Practical Applications of Computational Biology \u0026 Bioinformatics, 15th International Conference (PACBB 2021) (pp. 41-51). Springer International Publishing. https://doi.org/10.1007/978-3-030-86258-9_5\n\n# For Developers\n\n## Building the Docker image\n\nTo build the Docker image, [`compi-dk`](https://www.sing-group.org/compi/#downloads) is required. Once it is installed, run `compi-dk build -drd -tv` from the project directory to build the Docker image. The image will be created with the name specified in the `compi.project` file. This file also specifies the version of Compi that goes into the Docker image.\n\n## Versioning\n\nThe pipeline version is set in the `\u003cversion\u003e` element of `pipeline.xml`. However, because the version number is also referenced elsewhere in the project, it is recommended to update it with the following command:\n\n```\n./resources/development/set-new-version.sh $(pwd) \u003cnew_version\u003e\n```\n\n# Team \n\nMyBrain-Seq is a pipeline developed by the SING and NeuroEpigenetics Lab groups.\n\n- Daniel Pérez-Rodríguez [![ORCID](https://info.orcid.org/wp-content/uploads/2020/12/orcid_16x16.gif)](https://orcid.org/0000-0002-8110-3567), daniel.prz.rodriguez@gmail.com \n- Hugo López-Fernández [![ORCID](https://info.orcid.org/wp-content/uploads/2020/12/orcid_16x16.gif)](https://orcid.org/0000-0002-6476-7206), hlfernandez@uvigo.es\n- Roberto C. 
Agís-Balboa [![ORCID](https://info.orcid.org/wp-content/uploads/2020/12/orcid_16x16.gif)](https://orcid.org/0000-0001-9899-9569), roberto.carlos.agis.balboa@sergas.es\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsing-group%2Fmy-brain-seq","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsing-group%2Fmy-brain-seq","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsing-group%2Fmy-brain-seq/lists"}