{"id":18046579,"url":"https://github.com/ltla/bioconductor.js","last_synced_at":"2025-04-05T04:24:39.248Z","repository":{"id":62278352,"uuid":"558624525","full_name":"LTLA/bioconductor.js","owner":"LTLA","description":"Javascript implementations of common Bioconductor classes","archived":false,"fork":false,"pushed_at":"2023-08-09T17:43:46.000Z","size":1445,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-12T06:17:29.311Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://ltla.github.io/bioconductor.js/","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LTLA.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-10-27T23:50:18.000Z","updated_at":"2023-08-02T23:51:28.000Z","dependencies_parsed_at":"2024-12-18T09:40:30.772Z","dependency_job_id":"3389f04b-ce6e-44a4-940b-d23842990694","html_url":"https://github.com/LTLA/bioconductor.js","commit_stats":{"total_commits":54,"total_committers":1,"mean_commits":54.0,"dds":0.0,"last_synced_commit":"e15779dbf606d5f77ba6813d5b2cdcd776a0987b"},"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LTLA%2Fbioconductor.js","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LTLA%2Fbioconductor.js/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LTLA%2Fbioconductor.js/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/LTLA%2Fbioconductor.js/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LTLA","download_url":"https://codeload.github.com/LTLA/bioconductor.js/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247287892,"owners_count":20914263,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-30T19:07:57.355Z","updated_at":"2025-04-05T04:24:39.228Z","avatar_url":"https://github.com/LTLA.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Bioconductor objects in Javascript\n\nThis package aims to provide Javascript implementations of [Bioconductor](https://github.com/Bioconductor) data structures for use in web applications.\nMuch like the original R code, we focus on the use of common generics to provide composability, allowing users to construct complex objects that \"just work\".\nWe also attempt to circumvent Javascript's pass-by-reference behavior to avoid unintended modifications to unrelated objects when calling setter methods from their nested child objects.\n\n## Quick start\n\nHere, we perform some generic operations on a `DataFrame` object, equivalent to Bioconductor's `S4Vectors::DFrame` class.\n\n```js\n// Import using ES6 notation\nimport * as bioc from \"bioconductor\";\n\n// Construct a DataFrame\nlet results = new bioc.DataFrame(\n    { \n        logFC: new Float64Array([-1, -2, 1.3, 2.1]),\n        pvalue: new Float64Array([0.01, 0.02, 0.001, 1e-8])\n    },\n    {\n        rowNames: [ \"p53\", 
\"SNAP25\", \"MALAT1\", \"INS\" ]\n    }\n);\n\n// Run generics\nbioc.LENGTH(results);\nbioc.SLICE(results, [ 2, 3, 1 ]); \nbioc.CLONE(results);\n\nlet more_results = new bioc.DataFrame(\n    { \n        logFC: new Float64Array([0, 0.1, -0.1]),\n        pvalue: new Float64Array([1e-5, 1e-4, 0.5])\n    },\n    {\n        rowNames: [ \"GFP\", \"mCherry\", \"tdTomato\" ]\n    }\n);\n\nbioc.COMBINE([results, more_results]);\n```\n\nSee the [reference documentation](https://ltla.github.io/bioconductor.js) for more details.\n\n# Using generics\n\nOur generics allow users to operate on different objects in a consistent manner.\nFor example, a `DataFrame` allows us to store any object as a column as long as it defines methods for the `LENGTH`, `SLICE`, `CLONE` and `COMBINE` generics.\nThis enables the construction of complex objects like a `DataFrame` nested inside another `DataFrame`.\n\n```js\nlet genomic_results = new bioc.DataFrame(\n    { \n        logFC: new Float64Array([-1, -2, 1.3, 2.1]),\n        pvalue: new Float64Array([0.01, 0.02, 0.001, 1e-8]),\n        location: new bioc.DataFrame({\n            \"chromosome\": [ \"chrA\", \"chrB\", \"chrC\", \"chrD\" ],\n            \"start\": [ 1, 2, 3, 4 ],\n            \"width\": [ 10, 20, 30, 40 ],\n            \"strand\": new Int8Array([-1, 1, 1, -1 ])\n        })\n    },\n    {\n        rowNames: [ \"p53\", \"SNAP25\", \"MALAT1\", \"INS\" ]\n    }\n);\n\nlet subset = bioc.SLICE(genomic_results, { start: 2, end: 4 });\nbioc.LENGTH(subset); \nsubset.column(\"location\");\n```\n\nAlternatively, we could store a `GRanges` (see below) as a column of our `DataFrame`.\nAll generics on the parent `DataFrame` will be automatically applied to the `GRanges` column.\n\n```js\nlet old_location = genomic_results.column(\"location\");\nlet new_location = new bioc.GRanges(old_location.column(\"chromosome\"),\n    new bioc.IRanges(old_location.column(\"start\"), old_location.column(\"width\")),\n    { strand: 
old_location.column(\"strand\") });\ngenomic_results.$setColumn(\"location\", new_location);\n\nsubset = bioc.SLICE(genomic_results, { start: 2, end: 4 });\nsubset.column(\"location\");\n```\n\nWe mimic R's S4 generics using methods in Javascript classes.\nFor example, each vector-like class should define a `_bioconductor_LENGTH` method to quantify its concept of \"length\".\nThe `LENGTH` function will then call this method to obtain a length value for any instance of any supported class.\nWe prefix this method with `_bioconductor_` to avoid collisions with other properties;\nthis allows safe monkey patching of third-party classes if they are sufficiently vector-like.\n\n(Admittedly, the `LENGTH` function is not really necessary, as users could just call `_bioconductor_LENGTH` directly.\nHowever, the latter is long and unpleasant to type, so we might as well wrap it in something that's easier to remember.\nIt would also require monkey patching of built-in classes like Arrays and TypedArrays, which is somewhat concerning as it risks interfering with the behavior of other packages.\nBy defining our own `LENGTH` function, we can safely handle the built-in classes as special cases without modifying their prototypes.)\n\n# Mimicking copy-on-write\n\nWe mimic R's copy-on-write behavior by returning a new object from any setter, rather than mutating the existing object.\nThis avoids silent pass-by-reference changes in separate objects, which would be particularly problematic in complex classes that contain many child objects.\nIn the example below, `another_reference` still retains the original set of row names while only `modified` has its row names removed.\n\n```js\n// Construct a DataFrame\nlet results = new bioc.DataFrame(\n    { \n        logFC: new Float64Array([-1, -2, 1.3, 2.1]),\n        pvalue: new Float64Array([0.01, 0.02, 0.001, 1e-8])\n    },\n    {\n        rowNames: [ \"p53\", \"SNAP25\", \"MALAT1\", \"INS\" ]\n    }\n);\n\nlet another_reference = 
results;\nlet modified = results.setRowNames(null);\n```\n\nFor users who are very sure that they are only operating on a single instance of the object,\nor for those who wish to exploit pass-by-reference behavior to modify multiple objects at once, \nwe can use mutating setters for slightly better efficiency.\nThese are prefixed with `$` signs to indicate their potentially unexpected behavior.\n\n```js\nresults.$setRowNames(null);\nanother_reference.rowNames(); // this will now be null.\n```\n\nNote that this copy-on-write paradigm only applies to the setters defined in the **bioconductor.js** classes.\nAssignments to base objects (e.g., arrays, TypedArrays) will still exhibit pass-by-reference behavior.\nIf there is a risk of inadvertently modifying a shared object, users should consider `CLONE`ing their object before modifying it.\n\n```js\n// Returns a base object, i.e., Float64Array of log-fold changes.\nlet lfc = results.column(\"logFC\");\n\n// We clone it so that changes don't propagate to 'results' by reference.\n// We can then apply our arbitrary modifications to the copy.\nlet lfc_copy = bioc.CLONE(lfc);\nlfc_copy[0] = 100;\n\n// Only 'more_modified' will contain the new log-FCs;\n// 'results' itself is not affected.\nlet more_modified = results.setColumn(\"logFC\", lfc_copy);\n```\n\n# Representing (genomic) ranges\n\nWe can construct equivalents of Bioconductor's `IRanges` and `GRanges` objects, representing integer and genomic ranges respectively.\nSimilarly, Bioconductor's `GRangesList` is implemented as a `GroupedGRanges` in this package.\n\n```js\nlet ir = new bioc.IRanges(/* start = */ [1,2,3], /* width = */ [ 10, 20, 30 ]);\nlet gr = new bioc.GRanges([ \"chrA\", \"chrB\", \"chrC\" ], ir, { strand: [ 1, 0, -1 ] });\n\n// Generics still work on these range objects:\nbioc.LENGTH(gr);\nbioc.SLICE(gr, [ 2, 1, 0 ]);\nbioc.CLONE(gr);\n```\n\nWe can find overlaps between two sets of ranges, akin to Bioconductor's `findOverlaps()` function:\n\n```js\nlet 
index = gr.buildOverlapIndex();\nlet gr2 = new bioc.GRanges([ \"chrA\", \"chrC\", \"chrA\" ], new bioc.IRanges([5, 3, 2], [9, 9, 9]));\nlet overlaps = index.overlap(gr2);\n```\n\nWe can store per-range metadata in the `elementMetadata` field of each object, just like Bioconductor's `mcols()`.\n\n```js\nlet meta = gr.elementMetadata();\nmeta.$setColumn(\"symbol\", [ \"Nanog\", \"Snap25\", \"Malat1\" ]);\ngr.$setElementMetadata(meta);\ngr.elementMetadata().columnNames();\n```\n\n# Handling experimental assays\n\nThe `SummarizedExperiment` object is a data structure for storing experimental data in a matrix-like object, \nalong with further annotations on the rows (usually features) and columns (usually samples).\nTo illustrate, let's mock up a small count matrix, ostensibly from an RNA-seq experiment:\n\n```js\n// Making a column-major dense matrix of random data.\nlet ngenes = 100;\nlet nsamples = 20;\nlet expression = new Int32Array(ngenes * nsamples);\nexpression.forEach((x, i) =\u003e expression[i] = Math.random() * 10);\nlet mat = new bioc.DenseMatrix(ngenes, nsamples, expression);\n\n// Mocking up row names, column annotations.\nlet rownames = [];\nfor (let g = 0; g \u003c ngenes; g++) {\n    rownames.push(\"Gene_\" + String(g));\n}\n\nlet treatment = new Array(nsamples);\ntreatment.fill(\"control\", 0, 10);\ntreatment.fill(\"treated\", 10, nsamples);\nlet sample_meta = new bioc.DataFrame({ group: treatment });\n```\n\nWe can now store all of this information in a `SummarizedExperiment`:\n\n```js\nlet se = new bioc.SummarizedExperiment({ counts: mat }, \n    { rowNames: rownames, columnData: sample_meta });\n```\n\nThis can be manipulated by generics for two-dimensional objects:\n\n```js\nbioc.NUMBER_OF_ROWS(se);\nbioc.SLICE_2D(se, { start: 0, end: 50 }, [0, 2, 4, 8, 10, 12, 14, 16, 18]);\nbioc.COMBINE_COLUMNS([se, se]);\n```\n\nSimilar implementations are provided for the `RangedSummarizedExperiment` and 
[`SingleCellExperiment`](https://bioconductor.org/packages/SingleCellExperiment) classes.\n\n# Supported classes and generics\n\nFor classes:\n\n|**Javascript**|**R/Bioconductor equivalent**|\n|---|---|\n| [`DataFrame`](https://ltla.github.io/bioconductor.js/DataFrame.html) | `S4Vectors::DFrame` |\n| [`IRanges`](https://ltla.github.io/bioconductor.js/IRanges.html) | `IRanges::IRanges` |\n| [`GRanges`](https://ltla.github.io/bioconductor.js/GRanges.html) | `GenomicRanges::GRanges` |\n| [`GroupedGRanges`](https://ltla.github.io/bioconductor.js/GroupedGRanges.html) | `GenomicRanges::GRangesList` |\n| [`SummarizedExperiment`](https://ltla.github.io/bioconductor.js/SummarizedExperiment.html) | `SummarizedExperiment::SummarizedExperiment` |\n| [`RangedSummarizedExperiment`](https://ltla.github.io/bioconductor.js/RangedSummarizedExperiment.html) | `SummarizedExperiment::RangedSummarizedExperiment` |\n| [`SingleCellExperiment`](https://ltla.github.io/bioconductor.js/SingleCellExperiment.html) | `SingleCellExperiment::SingleCellExperiment` |\n\nFor generics:\n\n|**Javascript**|**R/Bioconductor equivalent**|\n|---|---|\n| [`LENGTH`](https://ltla.github.io/bioconductor.js/LENGTH.html) | `base::NROW` |\n| [`SLICE`](https://ltla.github.io/bioconductor.js/SLICE.html) | `S4Vectors::extractROWS` |\n| [`COMBINE`](https://ltla.github.io/bioconductor.js/COMBINE.html) | `S4Vectors::bindROWS` |\n| [`CLONE`](https://ltla.github.io/bioconductor.js/CLONE.html) | - |\n| [`NUMBER_OF_ROWS`](https://ltla.github.io/bioconductor.js/NUMBER_OF_ROWS.html) | `base::NROW` |\n| [`NUMBER_OF_COLUMNS`](https://ltla.github.io/bioconductor.js/NUMBER_OF_COLUMNS.html) | `base::NCOL` |\n| [`SLICE_2D`](https://ltla.github.io/bioconductor.js/SLICE_2D.html) | `base::\"[\"` |\n| [`COMBINE_ROWS`](https://ltla.github.io/bioconductor.js/COMBINE_ROWS.html) | `S4Vectors::bindROWS` |\n| [`COMBINE_COLUMNS`](https://ltla.github.io/bioconductor.js/COMBINE_COLUMNS.html) | `S4Vectors::bindCOLS` |\n\n# Further reading\n\nA 
high-level description of Bioconductor data structures is given in the [\"Orchestrating high-throughput genomic analysis with Bioconductor\"](https://doi.org/10.1038/nmeth.3252) paper.\n\nThe formulation of the generics was mostly based on the code in the [**S4Vectors**](https://github.com/Bioconductor/S4Vectors) package.\n\nThe implementation of each class is based on the code in the corresponding R package, e.g., `GRanges` in [**GenomicRanges**](https://bioconductor.org/packages/GenomicRanges).\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fltla%2Fbioconductor.js","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fltla%2Fbioconductor.js","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fltla%2Fbioconductor.js/lists"}