{"id":20837778,"url":"https://github.com/astrazeneca/detectis","last_synced_at":"2025-06-19T17:38:46.571Z","repository":{"id":43730052,"uuid":"300214077","full_name":"AstraZeneca/detectIS","owner":"AstraZeneca","description":"A pipeline to rapidly detect exogenous DNA integration sites using DNA or RNA paired-end sequencing data","archived":false,"fork":false,"pushed_at":"2023-04-25T08:59:37.000Z","size":1662,"stargazers_count":9,"open_issues_count":0,"forks_count":6,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-06-14T00:20:44.235Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Perl","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AstraZeneca.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2020-10-01T09:01:35.000Z","updated_at":"2024-01-13T03:10:50.000Z","dependencies_parsed_at":"2023-10-20T18:03:52.605Z","dependency_job_id":null,"html_url":"https://github.com/AstraZeneca/detectIS","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AstraZeneca%2FdetectIS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AstraZeneca%2FdetectIS/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AstraZeneca%2FdetectIS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AstraZeneca%2FdetectIS/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AstraZeneca","download_url":"https://codeload.github.com/AstraZeneca/detectIS/ta
r.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225110573,"owners_count":17422411,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-18T01:08:33.068Z","updated_at":"2024-11-18T01:08:33.865Z","avatar_url":"https://github.com/AstraZeneca.png","language":"Perl","funding_links":[],"categories":[],"sub_categories":[],"readme":"![Maturity level-Prototype](https://img.shields.io/badge/Maturity%20Level-Prototype-red)\n\n# detectIS\n\nDetectIS is a pipeline specifically designed to detect exogenous DNA integration sites using DNA or RNA paired-end sequencing data.\nThe pipeline is run with the workflow manager [Nextflow](https://www.nextflow.io/), using a configuration file and a Singularity image.\n\n\n## Getting Started\n\nTo run the workflow, the user has to create a configuration file specifying:\n\n\t\ta) the FASTA file with the reference host genome;\n\t\tb) the FASTA file with the reference exogenous sequence;\n\t\tc) the directory containing the raw data, in FASTQ format;\n\t\td) the output directory.\nThe analysis can be executed locally or in an HPC environment; in the latter case the user must also specify the cluster executor.
\n\n\n### Prerequisites\n\nThe detectIS software requirements are:\n\t- [Singularity](https://www.sylabs.io/docs/) version 2.6 or higher.\n\t- [Nextflow](https://www.nextflow.io/); the workflow has been developed and tested with version 0.32.0.4897.\n\n\n### Creating a Singularity container\n\nA Singularity container with all the necessary software is required to run the pipeline.\nThe image can be created from the recipe file \"detectIS.rec\" in the \"utils\" directory. Superuser privileges are necessary to generate a Singularity container with the command:\n\n```\nsudo singularity build detectIS.simg detectIS.rec\n```\n\nN.B. Superuser privileges are necessary only to create the container, not to use it. This means you can create the container on your local PC or workstation and copy it to the system where you run analyses (e.g. your HPC cluster).\n\nAlternatively, if you have problems generating a Singularity container from the recipe, you can download the image from [Singularity Hub](https://singularity-hub.org/).\n\n\n### Running the workflow\n\nOnce you have installed Singularity and Nextflow, and [configured Singularity](https://www.sylabs.io/guides/2.6/user-guide/faq.html?highlight=disk%20access#how-are-external-file-systems-and-paths-handled-in-a-singularity-container) so that the image has read and write access to the relevant disk partitions, you can run any workflow.\n\n```\nnextflow run Workflows/detectIS.nf -c detectIS_TestDataset.conf -with-report detectIS_TestDataset_nextflow_report.html\n```\n\nIn this example, Workflows/detectIS.nf is the workflow for the detectIS analysis and detectIS_TestDataset.conf is the configuration file with all the information needed for the project. The configuration file specifies the input and output directories, the reference (FASTA) directories, and cluster-specific parameters.
\n\n\n### Test data sets\n\nThe directory \"TestDataset\" contains paired-end reads and reference files for running a detectIS analysis.\nThe dataset simulates the integration of a plasmid into the genome of the Chinese hamster ovary cell line (CHOK1).\n\nThe analysis can be executed using the bash script \"Run_detectIS.sh\", also contained in the directory \"TestDataset\", or directly with Nextflow:\n\n```\nnextflow run Workflows/detectIS.nf -c detectIS_TestDataset.conf -with-report detectIS_TestDataset_nextflow_report.html\n```\n\nBoth the configuration file and the bash script can be used as templates for other analyses.\n\n\n## Deployment\n\nPlease note that Singularity containers can be [kernel-dependent](https://www.sylabs.io/guides/2.6/user-guide/faq.html?highlight=disk%20access#are-singularity-containers-kernel-dependent). This means that the image recipes in this project will not necessarily produce an image able to run on your HPC system. If none of the available images is compatible with your system, you might need to modify the recipe to use an OS with a compatible kernel. Please raise an issue if this is the case and you need support.\n\n## Citation\n\nIf you use detectIS in your research, please cite our latest [publication](https://doi.org/10.1093/bioinformatics/btab366).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fastrazeneca%2Fdetectis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fastrazeneca%2Fdetectis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fastrazeneca%2Fdetectis/lists"}