{"id":33164614,"url":"https://github.com/SUwonglab/PECA","last_synced_at":"2025-11-20T17:01:57.936Z","repository":{"id":37924299,"uuid":"144921055","full_name":"SUwonglab/PECA","owner":"SUwonglab","description":"PECA is a software for inferring context specific gene regulatory network from paired gene expression and chromatin accessibility data","archived":false,"fork":false,"pushed_at":"2025-10-07T16:57:48.000Z","size":20266,"stargazers_count":45,"open_issues_count":7,"forks_count":7,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-10-07T18:35:04.597Z","etag":null,"topics":["atac-seq","chromatin-accessibiity","dnase-seq","gene-expression","gene-regulatory-network","rna-seq"],"latest_commit_sha":null,"homepage":null,"language":"MATLAB","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SUwonglab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2018-08-16T01:31:10.000Z","updated_at":"2025-10-07T16:57:51.000Z","dependencies_parsed_at":"2023-02-09T09:01:20.192Z","dependency_job_id":"969e0406-d379-4447-9129-f722f7218866","html_url":"https://github.com/SUwonglab/PECA","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/SUwonglab/PECA","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SUwonglab%2FPECA","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SUwonglab%2FPECA/tags","releases_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/repositories/SUwonglab%2FPECA/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SUwonglab%2FPECA/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SUwonglab","download_url":"https://codeload.github.com/SUwonglab/PECA/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SUwonglab%2FPECA/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":285475217,"owners_count":27178110,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-20T02:00:05.334Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["atac-seq","chromatin-accessibiity","dnase-seq","gene-expression","gene-regulatory-network","rna-seq"],"created_at":"2025-11-16T00:00:26.724Z","updated_at":"2025-11-20T17:01:57.930Z","avatar_url":"https://github.com/SUwonglab.png","language":"MATLAB","funding_links":[],"categories":["Uncategorized"],"sub_categories":[],"readme":"# PECA\n\n## Introduction:\n\nPECA is software for inferring context-specific gene regulatory networks from paired gene expression and chromatin accessibility data.\nPlease cite the PECA and PECA2 papers:\n\nDuren, Zhana, et al. 
\"Modeling gene regulation from paired expression and chromatin accessibility data.\" Proceedings of the National Academy of Sciences 114.25 (2017): E4914-E4923.\n\nDuren, Zhana, et al. \"Time course regulatory analysis based on paired expression and chromatin accessibility data.\" Genome Research 30.4 (2020): 622-634.\n\n## Quick start:\n```\nwget https://github.com/SUwonglab/PECA/archive/master.zip\nunzip master.zip\ncd PECA-master/\nbash install.sh\n```\n```\nbash PECA.sh sampleName genome\n```\n## Install:\n\n```\nbash install.sh\n```\n\n## Run PECA:\n\nRun PECA in two steps:\n\n### Step 1: Input\nPut the input files in a folder named `./Input`. Three files are required: `${SampleName}.txt`, `${SampleName}.bam`, `${SampleName}.bam.bai`.\n\n`${SampleName}.txt` is the gene expression file, containing two tab-delimited columns: gene symbol and FPKM (or TPM).\n\n`${SampleName}.bam` is the chromatin accessibility data (DNase-seq or ATAC-seq).\n\n`${SampleName}.bam.bai` is the index file of the BAM file.\n\nNote that all three files must share the same base name `${SampleName}` and differ only in the extension: \".txt\", \".bam\" or \".bam.bai\". Please see the RAd4 example in the `./Input` directory.\n\n### Step 2: Run\n```\nsh PECA.sh ${SampleName} ${genome}\n```\n\nExample: `sh PECA.sh RAd4 mm9`\n\nTo make sure the code runs smoothly, please provide at least 64 GB of memory.\n\nThe results will be in `./Results/${SampleName}/`.\n`${SampleName}_network.txt` is the tissue-specific network.\n\nTFTG_score.txt is the regulation strength of every transcription factor (TF) on every target gene (TG). Each row represents one TF and each column represents one target gene. A higher value represents a higher likelihood of regulation.\n\nCRB_pval.txt is the chromatin regulator (CR) binding site matrix: each column represents one CR, each row represents one region, and the values are p-values.\n\n## Run PECA without ENCODE data information\nThe PECA model uses prior information from ENCODE data. 
You can learn this prior information from your own data, without using the ENCODE data, if the number of paired samples is greater than 5.\n\n```sh PECA_withoutENCODE.sh FullPath_to_sampleNameFile ${genome}```\n\nExample: `sh PECA_withoutENCODE.sh /home/user/sampleName.txt hg19`\nHere /home/user/sampleName.txt is a text file that contains the sample names, one per line. For example:\n```\nES_day0\nES_day2\nES_day4\nES_day6\nES_day10\nES_day20\n```\nUnder the Input folder you should have ES_day0.txt, ES_day0.bam, and ES_day0.bam.bai, and the same for the other samples. The results of ES_day0 will be stored in ./Results__withoutENCODE/ES_day0/.\n\n## Run PECA_compReg:\nIf you have two conditions (multiple samples in each condition) and want to compare the two conditions at the network level, please see the tutorial in comparative_regulatory_analysis.md: https://github.com/SUwonglab/PECA/blob/master/comparative_regulatory_analysis.md.\n\n## Run PECA_net_dif:\nIf you have two samples and want to compare them at the network level, follow these steps:\n\n1. Prepare two networks: run PECA on the two samples one by one with `sh PECA.sh ${sampleName} ${genome}`\n\n2. Run: `sh PECA_compare_dif.sh ${Sample1} ${Sample2} ${Organism}`\n\nExample: `sh PECA_compare_dif.sh K562 GM12878 human` ; `sh PECA_compare_dif.sh mESC RAd4 mouse`\n\nThe results will be in `./Results/Compare_${Sample1}_${Sample2}`, which contains six files:\n\nSpecific networks of the two samples: `${Sample1}_specific_network.txt` and `${Sample2}_specific_network.txt`\n\nCommon network of the two samples: `${Sample1}_${Sample2}_common_network.txt`\n\nSpecific modules of the two networks: `${Sample1}_specific_module.txt` and `${Sample2}_specific_module.txt`\n\nCommon module of the two samples: `${Sample1}_${Sample2}_common_module.txt`\n\nEither PooledNetwork.txt or PooledModule.txt can be used to visualize the network in Cytoscape, and the node labels are given in the file Node_lable.txt. 
\"1\" and \"-1\" in PooledNetwork.txt or PooledModuole.txt represent \"Activation\" and \"Repression\", respectively. \"1\" and \"2\" in Node_lable.txt indicate whether the gene is Sample1-specific or Sample2-specific.\n\n## Run PECA_net_dif_multiple:\nIf you have two conditions (multiple samples in each condition) and want to compare the two conditions at the network level, follow these steps:\n\n1. Prepare networks: run PECA on all the samples from the two conditions one by one with `sh PECA.sh ${sampleName} ${genome}`\n\n2. Construct labels: write the sample names of Group1 and Group2 into text files named $Group1 and $Group2, respectively. (E.g., create one text file named \"Control\" and put the sample names of one condition in it, then create another text file named \"Case\" and put the sample names of the other condition in it. Note that the sample name files contain one sample name per line.)\n\n3. Run: `sh PECA_compare_dif_multiple.sh $Group1 $Group2 ${Organism}`\n\nExample: `sh PECA_compare_dif_multiple.sh Control Case human`\n\nThe results will be in `./Results/CompareGroup_${Group1}_${Group2}`, which contains six files:\n\nSpecific networks of the two conditions: `${Group1}_specific_network.txt` and `${Group2}_specific_network.txt`\n\nCommon network of the two conditions: `${Group1}_${Group2}_common_network.txt`\n\nSpecific modules of the two conditions: `${Group1}_specific_module.txt` and `${Group2}_specific_module.txt`\n\nCommon module of the two conditions: `${Group1}_${Group2}_common_module.txt`\n\nEither PooledNetwork.txt or PooledModuole.txt can be used to visualize the network in Cytoscape, and the node labels are given in the file Node_lable.txt. \"1\" and \"-1\" in PooledNetwork.txt or PooledModuole.txt represent \"Activation\" and \"Repression\", respectively. 
\"1\" and \"2\" in Node_lable.txt indicate whether the gene is Group1-specific or Group2-specific.\n\n## Requirements:\n\n* MATLAB (Optimization Toolbox)\n\n* macs2\n\n* homer\n\n* samtools\n\n* bedtools\n\n## Contact:\n\nIf you have any issues, please contact Zhana Duren at zduren@clemson.edu\n\n***\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSUwonglab%2FPECA","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FSUwonglab%2FPECA","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSUwonglab%2FPECA/lists"}