# stream-reader-mzxml
[![Build Status](https://github.com/p2m2/stream-reader-mzxml/workflows/release/badge.svg)](https://github.com/p2m2/stream-reader-mzxml/actions?query=workflow%3Arelease)

A Scala library specializing in stream processing of mzXML files, based on FS2.

### mzXML specifications

https://sashimi.sourceforge.net/schema_revision/mzXML_2.1/Doc/mzXML_2.0_tutorial.pdf

## Installation using Docker

### 1) First option: build the Docker image

```bash
docker build . -t p2m2/stream-reader-mzxml
```

### 2) Second option: pull the image from Docker Hub

```bash
docker pull inraep2m2/stream-reader-mzxml
```

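### Run the command on a file in the current directory

If you built the image locally, use the tag `p2m2/stream-reader-mzxml` instead of `inraep2m2/stream-reader-mzxml`.

```bash
docker run -v "$(pwd)":/data inraep2m2/stream-reader-mzxml /data/myfile.mzXML
```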

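### Run the command on a file in a specific directory

Put the absolute path of the host directory that contains the mzXML file before `:/data`.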

```bash
docker run -v :/data inraep2m2/stream-reader-mzxml /data/myfile.mzXML
```

## Installation using sbt

```bash
sbt assembly
```

### Run command

Append one of the main classes listed below, followed by its arguments, to this command:

```bash
java -cp ./assembly/pack.jar
```

| Main class | Arguments | Description |
|:-------------------------------------|:-----------------------------------------|:------------|
| MainDistributionIntensityIons | | Number of ions per intensity threshold. |
| MainDistributionMzIons | *Minimum intensity* | Occurrences of the most frequent m/z values (ions). |
| MainDistributionDiffMzIons | *Minimum intensity of peaks of interest* | Occurrences of the differences between the m/z (mass-to-charge ratio) of each peak of interest and the other ions in the same mass spectrum, to detect adduct formation. |
| MainPrecursorMzMatchingGlucosinolate | | Selects the precursor m/z values that match an MS2 signature (diagnostic neutral losses and ions). |
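
To make the `MainDistributionDiffMzIons` description more concrete, here is a minimal sketch of counting m/z differences in plain Scala. It is not the library's implementation. The `DiffMzSketch` object, its function names, the rounding to two decimals, and the choice to compare each peak of interest only with heavier ions are all assumptions made for illustration.

```scala
// Hypothetical sketch, independent of the stream-reader-mzxml API.
object DiffMzSketch {
  /** A peak as an (m/z, intensity) pair. */
  type Peak = (Double, Double)

  /** For every peak at or above `minIntensity`, list the m/z differences to the
    * heavier peaks of the same spectrum, rounded to `decimals` places. */
  def diffs(spectrum: Seq[Peak], minIntensity: Double, decimals: Int = 2): Seq[Double] = {
    val scale = math.pow(10, decimals)
    for {
      (mz0, int0) <- spectrum if int0 >= minIntensity
      (mz1, _)    <- spectrum if mz1 > mz0
    } yield math.round((mz1 - mz0) * scale) / scale
  }

  /** Count how often each rounded difference occurs across all spectra. */
  def countDiffs(spectra: Seq[Seq[Peak]], minIntensity: Double): Map[Double, Int] =
    spectra
      .flatMap(s => diffs(s, minIntensity))
      .groupBy(identity)
      .map { case (d, xs) => d -> xs.size }
}
```

A difference that recurs across many spectra, for example one close to 21.98 between a protonated ion and its sodium adduct, is the kind of signal this count is meant to surface.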

### Examples

```bash
java -cp ./assembly/pack.jar MainDistributionIntensityIons
```
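
As a rough illustration of what "number of ions per intensity threshold" means, the sketch below counts, for each threshold, the ions whose intensity is at or above it. The object name, the function name, and the "at or above" convention are assumptions; the actual output of `MainDistributionIntensityIons` may be organized differently.

```scala
// Hypothetical sketch, independent of the stream-reader-mzxml API.
object IntensityThresholdSketch {
  /** For each threshold, in the order given, the number of intensities at or above it. */
  def countByThreshold(intensities: Seq[Double], thresholds: Seq[Double]): Seq[(Double, Int)] =
    thresholds.map(t => t -> intensities.count(_ >= t))
}
```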

```bash
java -cp ./assembly/pack.jar MainDistributionMzIons
```

```bash
java -cp ./assembly/pack.jar MainDistributionDiffMzIons -i 50000
```

```bash
java -cp ./assembly/pack.jar MainGlucosinolates src/test/resources/Orbitrap_Exploris_240_precision64.mzXML --output test.out --startRT 0 --endRT 2 --minIntensity 7500 --deltaMp0Mp2 1.9958 --carbonMin 3 --carbonMax 35 --sulfurMin 1.5 --sulfurMax 5 --precisionMz 0.0001
```

## Ammonite examples

### Precursor Mz search

```scala
import $cp.`assembly/pack.jar`
import cats.effect.IO
import fs2.{Pipe, text}
import fs2.io.file.{Files, Path}

import cats.effect.unsafe.implicits._
import fr.inrae.p2m2.mzxml._

val mzXMLFile = "./src/test/resources/LTQ_Orbitrap_precision32.mzXML"
val outputFile = "precursor_288p93.txt"

// Format the first precursor of each matching spectrum as one line of text.
val formatPrecursorMz: Pipe[IO, Option[Seq[PrecursorMz]], String] = {
  inStream =>
    inStream.map {
      case Some(p) => s"Precursor ${p.head.value} with precursorIntensity ${p.head.precursorIntensity} " +
        s"and precursorScanNum ${p.head.precursorScanNum}\n"
      case _ => ""
    }
}

// Query precursor m/z 288.93, keep the defined results and write them to outputFile.
SpectrumRequest(mzXMLFile).precursorMz(288.93, 5000).map {
  case Some(r) => Some(r.precursorMz)
  case None => None
}
  .filter(_.isDefined)
  .through(formatPrecursorMz)
  .through(text.utf8.encode)
  .through(Files[IO].writeAll(Path(outputFile)))
  .compile
  .drain
  .unsafeRunSync()

println(outputFile)
```

### Glucosinolate isotope pattern search

```scala
import $cp.`assembly/pack.jar`
import cats.effect.IO
import fs2.text
import fs2.io.file.{Files, Path}

import cats.effect.unsafe.implicits._
import fr.inrae.p2m2.mzxml._
import fr.inrae.p2m2.mzxml.utils.ChemicalConst

val mzXMLFile = "./src/test/resources/LTQ_Orbitrap_precision32.mzXML"
val outputFile = "glucosinolate_candidates.txt" // output file (hypothetical name; choose your own)
val deltaMp0Mp2 = 1.996 // m/z gap between M and M+2 used for glucosinolates
val numberCarbonMin = 3
val numberCarbonMax = 35
val numberSulfurMin = 1.5
val numberSulfurMax = 5
val startTime = 0 // retention time window
val endTime = 5

{
  SpectrumRequest(mzXMLFile)
    .msLevel(1)
    .filter(_.isDefined)
    .map(_.get)
    .filter(_.retentionTimeInSeconds.getOrElse(0) >= startTime)
    .filter(_.retentionTimeInSeconds.getOrElse(Int.MaxValue) <= endTime)
    .map {
      (spectrum: Spectrum) => {
        spectrum.peaks.map {
          case (mz0, int0) =>
            val mz_ms_p2 = mz0 + deltaMp0Mp2
            val (mz1, int1) = spectrum.findClosestValueMz(mz0 + 1.0)
            val (mz2, int2) = spectrum.findClosestValueMz(mz_ms_p2)
            ((mz0, int0), (mz1, int1), (mz2, int2)) // isotopes

        }.filter {
          case (v0, v1, _) =>
            v1._2 >= v0._2 *
              (ChemicalConst.abundanceIsotope("C")(1) * numberCarbonMin +
                ChemicalConst.abundanceIsotope("S")(1) * numberSulfurMin)

        }.filter {
          case (v0, v1, _) =>
            v1._2 < v0._2 *
              (ChemicalConst.abundanceIsotope("C")(1) * numberCarbonMax +
                ChemicalConst.abundanceIsotope("S")(1) * numberSulfurMax)

        } /* criteria M2 of Isotope S are present 4.4 % */
          .filter {
            case (v0, _, v2) =>
              v2._2 >= v0._2 * ChemicalConst.abundanceIsotope("S")(2) * numberSulfurMin
          }
          .filter {
            case (v0, _, v2) =>
              v2._2 < v0._2 * ChemicalConst.abundanceIsotope("S")(2) * numberSulfurMax
          }.map(
            (spectrum.retentionTimeInSeconds.getOrElse(-1), _)
          )
      }
    }
    .map(x => x.map(y => y.toString() + "\n").mkString("\n"))
    .through(text.utf8.encode)
    .through(Files[IO].writeAll(Path(outputFile)))
    .compile
    .drain
    .unsafeRunSync()
}

```
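
The four filters in the script above amount to one test on the intensities of a peak (M), its M+1 neighbour and its M+2 neighbour. The sketch below restates that test as a standalone function. The abundance constants are approximate, illustrative values chosen here (the 4.4 % figure comes from the comment in the script) and may differ from those in `ChemicalConst`. The object and function names are hypothetical.

```scala
// Hypothetical sketch of the isotope-ratio criterion, independent of the library.
object IsotopeFilterSketch {
  // Assumed relative abundances (illustrative, not taken from ChemicalConst).
  val c13 = 0.011 // 13C, contributes to M+1
  val s33 = 0.008 // 33S, contributes to M+1
  val s34 = 0.044 // 34S, contributes to M+2

  /** True when the M+1 and M+2 intensities are compatible with a molecule holding
    * between cMin and cMax carbon atoms and between sMin and sMax sulfur atoms. */
  def isCandidate(i0: Double, i1: Double, i2: Double,
                  cMin: Double, cMax: Double,
                  sMin: Double, sMax: Double): Boolean = {
    val m1Low  = i0 * (c13 * cMin + s33 * sMin)
    val m1High = i0 * (c13 * cMax + s33 * sMax)
    val m2Low  = i0 * s34 * sMin
    val m2High = i0 * s34 * sMax
    i1 >= m1Low && i1 < m1High && i2 >= m2Low && i2 < m2High
  }
}
```

With the bounds used in the script (3 to 35 carbons, 1.5 to 5 sulfurs) and these assumed constants, a peak of intensity 100000 is kept when its M+1 intensity lies in [4500, 42500) and its M+2 intensity lies in [6600, 22000).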

Run a script with Ammonite (`amm example.sc`), for example:

```bash
amm glucosinolateIons.sc ../mzxml-glucosinolate-analyser/src/test/resources/20181018-037.mzXML 100000
```