{"id":21015627,"url":"https://github.com/gersteinlab/texp","last_synced_at":"2025-12-26T09:46:46.843Z","repository":{"id":143365016,"uuid":"68731011","full_name":"gersteinlab/texp","owner":"gersteinlab","description":"TeXP is a pipeline to gauge the autonomous transcription level of L1 subfamilies using short read RNA-seq data","archived":false,"fork":false,"pushed_at":"2019-10-10T14:24:57.000Z","size":41207,"stargazers_count":6,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-20T12:07:45.580Z","etag":null,"topics":["bioinformatics","bioinformatics-pipeline"],"latest_commit_sha":null,"homepage":"","language":"Makefile","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gersteinlab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-09-20T16:23:16.000Z","updated_at":"2024-11-20T07:44:27.000Z","dependencies_parsed_at":"2023-07-18T14:30:07.345Z","dependency_job_id":null,"html_url":"https://github.com/gersteinlab/texp","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gersteinlab%2Ftexp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gersteinlab%2Ftexp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gersteinlab%2Ftexp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gersteinlab%2Ftexp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
owners/gersteinlab","download_url":"https://codeload.github.com/gersteinlab/texp/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243446902,"owners_count":20292446,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","bioinformatics-pipeline"],"created_at":"2024-11-19T10:10:37.300Z","updated_at":"2025-12-26T09:46:46.805Z","avatar_url":"https://github.com/gersteinlab.png","language":"Makefile","funding_links":[],"categories":[],"sub_categories":[],"readme":"# TeXP\nTeXP is a pipeline to evaluate the transcription level of transposable elements in short read RNA-seq data\n\n# About\nTeXP is a pipeline for quantifying abundances of Transposable Element transcripts from RNA-Seq data. 
TeXP is based on the assumption that RNA-seq reads overlapping Transposable Elements are a composition of pervasive transcription signal and autonomous transcription of Transposable Elements.\n\nhttps://www.biorxiv.org/content/10.1101/648667v1.full\n\n# How to quickly run TeXP\n\n```\ndocker run -it fnavarro/texp:latest /bin/bash\n```\n\nDownload a fastq file from an RNA-seq experiment, for example, MCF-7 from the ENCODE project\n\n```  \n  wget -c -t0 \"https://www.encodeproject.org/files/ENCFF000HFF/@@download/ENCFF000HFF.fastq.gz\" -O file.fastq.gz\n```\n  \nRun TeXP\n```\n  ./TeXP.sh -f file.fastq.gz -t 1 -o process/example/ -n quick_texp_run\n```\nThe output files will be generated at:\n```\n  ls process/example/quick_texp_run\n\t*.L1HS_hg38.count (Naive counts) \n\t*.L1HS_hg38.count.corrected (Corrected counts)\n\t*.L1HS_hg38.count.rpkm (Naive RPKM)\n\t*.L1HS_hg38.count.rpkm.corrected (Corrected RPKM)\n\t*.L1HS_hg38.count.signal_proportions \n```\n\nTIPS:\nIf fastq files are stored locally, you can use\n```\ndocker run -it -v ~/Desktop/:/texp fnavarro/texp:latest /bin/bash\n```\nto mount \"~/Desktop\" at /texp in your docker container\n\n# Requirements\n - Bowtie2 (2.3+)\n - Bedtools (2.26+)\n - Fastx-toolkit (0.0.14+)\n - perl (5.24+)\n - python (2.7)\n - R (3.3+)\n  - Penalized package (0.49+)\n - samtools (1.3+)\n - wgsim (a12da33 on Oct 17, 2011)\n---\n - Bowtie2 hg38 reference index (http://files2.gersteinlab.org/public-docs/2019/08.14/bowtie2.hg38.tar.bz2)\n - Hg38 repetitive element annotation (http://files2.gersteinlab.org/public-docs/2019/08.14/rep_annotation.hg38.tar.bz2)\n \n# Download TeXP\n $\u003e git clone https://github.com/gersteinlab/texp.git\n\n Edit TeXP.sh and update the INSTALL_DIR variable to the path where TeXP was cloned \n\n # Installing TeXP dependencies\napt-get update\n\n- Install binary dependencies\n\napt-get install -y \\\n\tbedtools=2.26.0+dfsg-3 \\\n\tbowtie2=2.3.0-2 \\\n\tfastx-toolkit=0.0.14-3 \\\n\tgawk=1:4.1.4+dfsg-1 \\\n\tgit 
\\\n\tperl=5.24.1-3+deb9u1 \\\n\tpython=2.7.13-2 \\\n\tr-base=3.3.3-1 \\\n\tr-base-dev=3.3.3-1 \\\n\tsamtools=1.3.1-3 \\\n\twget \n\n\n- Install Wgsim\n\nmkdir -p /src; \\\n\tcd /src ; \\\n\tgit clone https://github.com/lh3/wgsim.git; \\\n\tcd wgsim; \\\n\tgcc -g -O2 -Wall -o wgsim wgsim.c -lz -lm; \\\n\tmv wgsim /usr/bin/; \\\n\tcd /;\n\n\n- Download Libraries\n\nChange the path (/data/library) to a proper location in your computing environment\n\nmkdir -p /data/library/rep_annotation; \\\n\tcd /data/library/rep_annotation; \\\n\twget -c -t0 \"http://files2.gersteinlab.org/public-docs/2019/08.14/rep_annotation.hg38.tar.bz2\" -O rep_annotation.hg38.tar.bz2; \\\n\ttar xjvf rep_annotation.hg38.tar.bz2; \\\n\trm -Rf rep_annotation.hg38.tar.bz2\n\t\nmkdir -p /data/library/bowtie2; \\\n\tcd /data/library/bowtie2; \\\n\twget -c -t0 \"http://files2.gersteinlab.org/public-docs/2019/08.14/bowtie2.hg38.tar.bz2\" -O bowtie2.hg38.tar.bz2; \\\n\ttar xjvf bowtie2.hg38.tar.bz2; \\\n\trm -Rf bowtie2.hg38.tar.bz2\n\n\n\n- Install R package dependencies\n\necho 'install.packages(c(\"penalized\"), repos=\"http://cloud.r-project.org\", dependencies=TRUE)' \u003e /tmp/packages.R \\\n    \u0026\u0026 Rscript /tmp/packages.R\n\n\n# TeXP config\n A few parameters must be set so that TeXP works properly outside a docker environment. Parameters are set in opts.mk, and the user MUST set them correctly.\n\n - LIBRARY_PATH: Absolute path pointing to the TeXP library; generally this is the path where you downloaded TeXP\n - EXT_LIBRARY_PATH: Absolute path containing the bowtie2 reference index and Transposable Element annotation bed file, downloaded as instructed above\n - EXE_DIR: If binaries are found in a single path, EXE_DIR can be used to generalize binary location. 
For example, if bowtie2, bedtools, etc. are located at /usr/bin/, you should set EXE_DIR := /usr\n - Dependencies installed in different paths should be defined manually. For example, if wgsim is installed in the home folder, the user must set:\n    - WGSIM_BIN := ~/wgsim/bin/wgsim\n  - Finally, the user must set CONFIGURED := TRUE\n\n---\n\n\n# Docker image\nAlternatively, docker images containing all dependencies and libraries can be used. The TeXP docker image is also pre-configured to work out of the box.\nCheck https://hub.docker.com/r/fnavarro/texp/ for further instructions:\ndocker pull fnavarro/texp\n\n\n---\n# Running TeXP\n $\u003e ./TeXP.sh -f [FILE_NAME] -t [INT] -o [OUTPUT_PATH] -n [SAMPLE_ID]\n \n\n -f: Input file (fastq,fastq.gz,sra)\n\n -t: Number of threads\n\n -o: Output path (e.g. ./ or ./processed)\n\n -n: Sample name (e.g. SAMPLE01)\n \n ---\n\n # FAQ - Frequently Asked Questions\n \u003e 1) Does TeXP work for paired end data?\n\nTeXP has been implemented to run one fastq file at a time. Overall, we empirically find that if the RNA-seq library is good, P1 and P2 should yield very similar estimates. Therefore, if using paired-end RNA-seq data, we recommend running TeXP on each file separately and calculating the mean of the two estimates.\n \n \u003e 2) Does TeXP work for unstranded data?\n \n Yes!\n\n\u003e 3) Can I use other aligners?\n\nIn [figure 15](https://journals.plos.org/ploscompbiol/article/file?type=supplementary\u0026id=info:doi/10.1371/journal.pcbi.1007293.s015) we show that aligners do not drastically change TeXP estimates. Therefore, while you could use other aligners, we suggest using bowtie2, since all TeXP parameterization has been done with bowtie2.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgersteinlab%2Ftexp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgersteinlab%2Ftexp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgersteinlab%2Ftexp/lists"}