{"id":13773937,"url":"https://github.com/GfellerLab/EPIC","last_synced_at":"2025-05-11T06:31:49.184Z","repository":{"id":117640547,"uuid":"79361381","full_name":"GfellerLab/EPIC","owner":"GfellerLab","description":"Repository for the R package EPIC, to Estimate the Proportion of Immune and Cancer cells from bulk gene expression data.","archived":false,"fork":false,"pushed_at":"2023-07-12T09:29:09.000Z","size":16641,"stargazers_count":69,"open_issues_count":1,"forks_count":23,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-02-14T06:36:05.879Z","etag":null,"topics":["bulk-data","cancer-cells","cell-type","gene-expression","rna-seq"],"latest_commit_sha":null,"homepage":"https://gfellerlab.shinyapps.io/EPIC_1-1/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/GfellerLab.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2017-01-18T16:46:41.000Z","updated_at":"2024-02-14T06:36:05.880Z","dependencies_parsed_at":"2023-07-26T22:15:16.564Z","dependency_job_id":null,"html_url":"https://github.com/GfellerLab/EPIC","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GfellerLab%2FEPIC","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GfellerLab%2FEPIC/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GfellerLab%2FEPIC/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GfellerLab%2FEPIC/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/GfellerLab"
,"download_url":"https://codeload.github.com/GfellerLab/EPIC/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225021915,"owners_count":17408517,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bulk-data","cancer-cells","cell-type","gene-expression","rna-seq"],"created_at":"2024-08-03T17:01:22.079Z","updated_at":"2024-11-17T09:30:24.761Z","avatar_url":"https://github.com/GfellerLab.png","language":"R","readme":"---\r\ntitle: \"EPIC package\"\r\noutput: github_document\r\n---\r\n\r\n\u003c!-- README.md is generated from README.Rmd. Please edit that file --\u003e\r\n\r\n```{r, echo = FALSE}\r\nknitr::opts_chunk$set(\r\n  collapse = TRUE,\r\n  comment = \"#\u003e\",\r\n  fig.path = \"README-\"\r\n)\r\n```\r\n\r\n## Description\r\nPackage implementing EPIC method to estimate the proportion of immune, stromal,\r\nendothelial and cancer or other cells from bulk gene expression data.\r\nIt is based on reference gene expression profiles for the main non-malignant\r\ncell types and it predicts the proportion of these cells and of the remaining\r\n\"*other cells*\" (that are mostly cancer cells) for which no reference profile is\r\ngiven.\r\n\r\nThis method is described in the publication from *Racle et al., 2017* available\r\nat \u003chttps://elifesciences.org/articles/26476\u003e.\r\n\r\nEPIC is also available as a web application: \u003chttp://epic.gfellerlab.org\u003e.\r\n\r\n## Usage\r\nThe main function in this package is `EPIC`. 
It needs as input a matrix of the\r\nTPM (or RPKM) gene expression from the samples for which to estimate cell\r\nproportions. One can also define the reference cells to use:\r\n```{r, eval = FALSE}\r\n# library(EPIC) ## If the package isn't loaded (or use EPIC::EPIC and so on).\r\nout \u003c- EPIC(bulk = bulkSamplesMatrix)\r\nout \u003c- EPIC(bulk = bulkSamplesMatrix, reference = referenceCellsList)\r\n```\r\n\r\n`out` is a list containing the various mRNA and cell fractions in each sample as well as a *data.frame* of the goodness of fit.\r\n\r\nValues of mRNA per cell and signature genes to use can also be changed:\r\n```{r, eval = FALSE}\r\nout \u003c- EPIC(bulk = bulkSamplesMatrix, reference = referenceCellsList, mRNA_cell = mRNA_cell_vector, sigGenes = sigGenes_vector)\r\nout \u003c- EPIC(bulk = bulkSamplesMatrix, reference = referenceCellsList, mRNA_cell_sub = mRNA_cell_sub_vector)\r\n```\r\n\r\nVarious other options are available and are well documented in the help pages\r\nfrom EPIC:\r\n```{r, eval = FALSE}\r\n?EPIC::EPIC\r\n?EPIC::EPIC.package\r\n```\r\n\r\n\r\n## Installation\r\n```{r, eval = FALSE}\r\ninstall.packages(\"devtools\")\r\ndevtools::install_github(\"GfellerLab/EPIC\", build_vignettes=TRUE)\r\n```\r\n\r\n\r\n## Web application\r\nEPIC is also available as a web application: \u003chttp://epic.gfellerlab.org\u003e.\r\n\r\n## Python wrapper\r\nA Python wrapper has been written by Stephen C. Van Nostrand from MIT and is\r\navailable at \u003chttps://github.com/scvannost/epicpy\u003e.\r\n\r\n## License\r\nEPIC can be used freely by academic groups for non-commercial purposes. The\r\nproduct is provided free of charge, and, therefore, on an \"*as is*\" basis,\r\nwithout warranty of any kind. 
Please read the file \"*LICENSE*\" for details.\r\n\r\nIf you plan to use EPIC (version 1.1) in any for-profit application, you are\r\nrequired to obtain a separate license.\r\nTo do so, please contact Nadette Bulgin \r\n([nbulgin@lcr.org](mailto:nbulgin@lcr.org)) at the Ludwig Institute for\r\nCancer Research Ltd.\r\n\r\n\r\n## Contact information\r\nJulien Racle ([julien.racle@unil.ch](mailto:julien.racle@unil.ch)),\r\nand David Gfeller ([david.gfeller@unil.ch](mailto:david.gfeller@unil.ch)).\r\n\r\n\r\n## FAQ\r\n##### Which proportions returned by EPIC should I use?\r\n* EPIC returns two proportion values: *mRNAProportions* and *cellFractions*, \r\nwhere the second represents the true proportion of cells coming from the different\r\ncell types when considering differences in mRNA expression between cell types.\r\nSo in principle, it is best to consider these *cellFractions*.\r\n\r\n  However, please note that when the goal is to benchmark EPIC predictions, if\r\nthe 'bulk samples' correspond in fact to in silico samples reconstructed for\r\nexample from single-cell RNA-seq data, then it is usually better to compare the\r\n'true' proportions against the *mRNAProportions* from EPIC. Indeed, when\r\nbuilding such in silico samples, the fact that different cell types express\r\ndifferent amounts of mRNA is usually not taken into account. On the other hand,\r\nif working with true bulk samples, then you should compare the true cell\r\nproportions (measured e.g., by FACS) against the *cellFractions*.\r\n\r\n##### What do the \"*other cells*\" represent?\r\n* EPIC predicts the proportions of the various cell types for which we have\r\ngene expression reference profiles (and corresponding gene signatures). But,\r\ndepending on the bulk sample, it is possible that some other cell types are\r\npresent for which we don't have any reference profile. EPIC returns the\r\nproportion of these remaining cells under the name \"*other cells*\". 
In the\r\ncase of tumor samples, most of these other cells would certainly correspond\r\nto the cancer cells, but it could be that there are also some stromal cells or\r\nepithelial cells for example.\r\n\r\n##### I receive an error message \"*attempt to set 'colnames' on an object with less than two dimensions*\". What can I do?\r\n* This most likely means that some of your data is a vector instead of a matrix.\r\nPlease make sure that your bulk data is in the form of a matrix (and also\r\nyour reference gene expression profiles if using custom ones).\r\n\r\n##### Is there some caution to consider about the *cellFractions* and *mRNA_cell* values?\r\n* As described in our manuscript, EPIC first estimates the proportion of mRNA\r\nper cell type in the bulk and then it uses the fact that some cell types have\r\nmore mRNA copies per cell than others to normalize this and obtain an estimate of\r\nthe proportion of cells instead of mRNA (the EPIC function returns both values,\r\nin case you need one or the other). For this normalization we had either measured\r\nthe amount of mRNA per cell or found it in the literature (fig. 1 – fig.\r\nsupplement 2 of our paper). However, we don’t currently have such values for the\r\nendothelial cells and CAFs. Therefore, for these two cell types, we use an average\r\nvalue, which might not reflect their true value and this could slightly bias the\r\npredictions, especially for these cell types. If you have some values for these\r\nmRNA/cell abundances, you can also add them into EPIC, with the help of the parameter\r\n\"*mRNA_cell*\" or \"*mRNA_cell_sub*\" (and it would be great if you could share these values).\r\n\r\n    If the mRNA proportions of these cell types are low, then even if you don't\r\ncorrect the results with their true mRNA/cell abundances, it would not really\r\nhave a big impact on the results. 
On the other hand, if there are many of these\r\ncells in your bulk sample, the results might be a little bit biased, but the\r\neffect should be similar for all samples and thus not matter too much\r\n(maybe you wouldn’t be fully able to tell if there are more CAFs than T cells for\r\nexample, but you should still have a good estimate of which samples have more CAFs\r\n(or T cells) than others).\r\n\r\n##### I receive a warning message that \"*the optimization didn't fully converge for some samples*\". What does it mean?\r\n* When estimating the cell proportions, EPIC performs a least-squares regression between the observed expression of the signature genes and the expression of these genes predicted based on the estimated proportions and gene expression reference profiles of the various cell types.\r\n\r\n    When such a warning message appears, it means that the optimization didn’t manage to fully converge for this regression, for some of the samples. You can then check the \"*fit.gof\\$convergeCode*\" (and possibly also \"*fit.gof\\$convergeMessage*\") that are returned by EPIC alongside the cell proportions. This will tell you which samples had convergence issues (a value of 0 means it converged properly, while other values are errors/warnings; their meaning can be found in the help of the \"*optim*\" (or \"*constrOptim*\") function from the R \"*stats*\" package, which is used during the optimization, and we simply forward the message it returns).\r\n\r\n    The error code that usually appears is \"1\", which means that the maximum number of iterations has been reached in the optimization. This could mean there is an issue with the bulk gene expression data, which may not completely follow the assumption of equation (1) from our manuscript. 
In our experience, even when such a warning message appeared, the proportions still seemed to be predicted well. It may be that the optimization just wants to be *too precise*, or that a few of the signature genes didn’t match well while the rest of the signature genes could still be used to obtain a good estimate of the proportions.\r\n\r\n    If you have some samples that seem to have strange results, it could however be useful to check that the issue is not that these samples didn’t converge well. To\r\nbe more conservative, you could also remove all the samples that didn't converge\r\nwell, as these may be outliers, if they represent only a small fraction of your original samples. Another possibility would be to change the parameters of the optim/constrOptim function to allow for more iterations or maybe a weaker tolerance for the convergence, but for this you would need to modify the code of EPIC directly, as no such option is implemented in EPIC.\r\n\r\n\r\n##### Who should I contact in case of a technical or other issue?\r\n* Julien Racle  ([julien.racle@unil.ch](mailto:julien.racle@unil.ch)). Please\r\nprovide as many details as possible and ideally also send an example input file (and/or reference profiles) that causes the issue.\r\n","funding_links":[],"categories":["RNA-seq"],"sub_categories":["Cell-Type Deconvolution"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FGfellerLab%2FEPIC","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FGfellerLab%2FEPIC","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FGfellerLab%2FEPIC/lists"}