{"id":19162920,"url":"https://github.com/hernanmd/structurepipelines","last_synced_at":"2025-08-09T21:11:22.613Z","repository":{"id":71141384,"uuid":"169195455","full_name":"hernanmd/STRUCTUREPipelines","owner":"hernanmd","description":"Pipelines to run STRUCTURE using multiple front-ends (CLUMPAK+StrAuto, Structure_threader)","archived":false,"fork":false,"pushed_at":"2020-10-05T21:48:32.000Z","size":30,"stargazers_count":10,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-05-07T11:39:39.740Z","etag":null,"topics":["bioinformatics","clumpp","population-genetics","shell-script","structure"],"latest_commit_sha":null,"homepage":"","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hernanmd.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2019-02-05T05:35:55.000Z","updated_at":"2024-11-02T13:17:14.000Z","dependencies_parsed_at":null,"dependency_job_id":"e1a96f38-19dd-4ee7-a729-5c607cbb1a83","html_url":"https://github.com/hernanmd/STRUCTUREPipelines","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/hernanmd/STRUCTUREPipelines","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hernanmd%2FSTRUCTUREPipelines","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hernanmd%2FSTRUCTUREPipelines/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hernanmd%2FSTRUCTUREPipelines/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hernanmd%2FSTRUCTUREPipelines/manifests","owner_ur
l":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hernanmd","download_url":"https://codeload.github.com/hernanmd/STRUCTUREPipelines/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hernanmd%2FSTRUCTUREPipelines/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259426881,"owners_count":22855548,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","clumpp","population-genetics","shell-script","structure"],"created_at":"2024-11-09T09:13:32.888Z","updated_at":"2025-06-12T08:03:56.909Z","avatar_url":"https://github.com/hernanmd.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Table of Contents\n\n- [Description](#description)\n- [General Installation](#general-installation)\n- [Pipeline CLUMPAK(StrAuto) + distruct](#pipeline-clumpakstrauto--distruct)\n  - [Requirements](#requirements)\n  - [Usage](#usage)\n  - [Download CLUMPAK](#download-clumpak)\n  - [Run Main Pipeline (CLUMPAK)](#run-main-pipeline-clumpak)\n  - [Run Distruct for many Ks](#run-distruct-for-many-ks)\n- [Pipeline Structure_threader](#pipeline-structure_threader)\n  - [Installation](#installation)\n  - [Usage](#usage-1)\n    - [Create a \"popfile\"](#create-a-popfile)\n    - [Edit the \"mainparams\" and optionally the \"extraparams\" file](#edit-the-mainparams-and-optionally-the-extraparams-file)\n    - [Run analysis](#run-analysis)\n    - [Plot results](#plot-results)\n- [Pipeline fastSTRUCTURE with Docker](#pipeline-faststructure-with-docker)\n  - 
[Installation](#installation-1)\n  - [Run image under Windows MSYS2 or WSL](#run-image-under-windows-msys2-or-wsl)\n  - [Run image under Linux/OSX](#run-image-under-linuxosx)\n\n# Description\n\nThis repository contains three pipelines with scripts to run several types of STRUCTURE analysis locally:\n\n  - Pipeline [CLUMPAK](http://clumpak.tau.ac.il/index.html): analyzes STRUCTURE results produced by [StrAuto](https://www.crypticlineage.net/software/strauto/) and generates [distruct](https://www.crypticlineage.net/software/distruct/) figures.\n  - Pipeline Structure_threader with fastStructure\n  - Pipeline fastStructure with Docker\n\n# General Installation\n\n```bash\ngit clone https://github.com/hernanmd/STRUCTUREPipelines.git\ncd STRUCTUREPipelines\n```\n\n# Pipeline CLUMPAK(StrAuto) + distruct\n\n## Requirements\n\n  - STRUCTURE input file (.str).\n    - The default name used in the configuration files is project_data.str\n  - StrAuto results should already be available in a .zip file\n    - The results should be zipped into a single .zip file.\n    - Default name is stresults.zip, with the following structure:\n\n```bash\n       k1.zip\n               k1/\n\t\t\tproject_data_k1_run10_f\n\t\t\tproject_data_k2_run1_f\n\t\t\t...\n       k2.zip\n               k2/\n                       project_data_k1_run10_f\n                       project_data_k2_run1_f\n\t\t\t...\n```\n\n## Usage\n\n  - Put your StrAuto results into a subdirectory\n  - Edit environment variables in the file rsEnvVars.sh\n\n## Download CLUMPAK\n\n```bash\n./rsGetClumpak.sh\n```\n\n## Run Main Pipeline (CLUMPAK)\n\n  - The runStrClumpak script performs the following actions:\n    - Read environment variables in rsEnvVars.sh as parameters.\n    - Create the output directory.\n    - Build the populations file.\n    - Run the CLUMPAK Perl script.\n\n```bash\n./runStrClumpak\n```\n\n## Run Distruct for many Ks\n\n  - The runStrDistructForManyKs script performs the following actions:\n    - Read environment variables as 
parameters.\n    - Create the output directory.\n    - Build the populations file.\n    - Run the DistructForManyKs Perl script.\n\n```bash\n./runStrDistructForManyKs\n```\n\n# Pipeline Structure_threader\n\n## Installation\n\nWiP\n\n## Usage\n\nTo use this pipeline you need your input files in both PED/MAP format (to generate the populations file) and BED/BIM/FAM format (required by fastSTRUCTURE).\nYou also need the STRUCTURE input file, which can be generated with PLINK using the \"--recode structure\" option.\nIt is highly recommended to put the input files in a separate subdirectory.\nThe output directory will be created if not already present.\n\n### Create a \"popfile\" \n\n```bash\n# Create the required popfile from the PED file. \n# The first parameter is the PED file name, specified WITHOUT the .ped extension\n# The second parameter is the species name (as understood by PLINK): cow, horse, etc.\n# The output is a new file named \"popfile\" suitable for Structure_threader plots\n./mkPopFile ../STRUCTURE_PIPrun/project_input/file species\n```\n\n### Edit the \"mainparams\" and optionally the \"extraparams\" file\n\n  - If you do not have mainparams and extraparams files in your input directory, run the ./runFsStrThreader.sh script to generate a template version of both files.\n  - Edit with your favorite editor: `nano project_input/mainparams`\n\n### Run analysis\n\nTo run Structure_threader you must specify the following parameters:\n\n  - 1st parameter is the DIRECTORY where input files are located\n  - 2nd parameter is the BED file (using PED is not valid for now)\n  - 3rd parameter is the DIRECTORY where output will be written\n  - 4th parameter is the name of the popfile generated with the mkPopFile script\n  - 5th parameter is the maximum value of K\n\nExample:\n\n```bash\n./runFsStrThreader.sh project_input/ file.bed project_output/ popfile 24\n```\n\nStructure_threader already generates a plots subdirectory with 
paired HTML/SVG files in the output directory; this script also generates \"Comparative Plots\" in a comparativePlotAllKs directory.\n\n# Pipeline fastSTRUCTURE with Docker \n\n## Installation\n\n  - Install Docker\n  - Under Windows: launch MSYS2\n  - Under Linux/OSX: launch Terminal\n  - Fetch the fastStructure Docker image from [https://hub.docker.com/r/dockerbiotools/faststructure](https://hub.docker.com/r/dockerbiotools/faststructure)\n\n```bash\ndocker pull dockerbiotools/faststructure\n```\n\n## Run image under Windows MSYS2 or WSL\n\n```bash\n# Get the image id from the following command\ndocker images\n# Make a directory for your dataset\nmkdir data # Or whatever your population name is\n# Install the winpty package if necessary\n# pacman -S winpty (or apt-get install winpty)\n# Run the image (replace 6ca with your image id)\nwinpty docker run -it -v /${PWD}/data/:/fastStructure/data 6ca\n```\n\n## Run image under Linux/OSX\n\n```bash\n# Get the image id from the following command\ndocker images\n# Run the image (replace 6ca with your image id)\ndocker run -it -v /${PWD}/data/:/fastStructure/data 6ca\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhernanmd%2Fstructurepipelines","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhernanmd%2Fstructurepipelines","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhernanmd%2Fstructurepipelines/lists"}