https://github.com/mouzkolit/mirfetch

miRNA WebScraper that allows for TargetSpectrum Search
multithreading python selenium

Host: GitHub
URL: https://github.com/mouzkolit/mirfetch
Owner: mouzkolit
Created: 2022-12-07T09:33:19.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2023-03-16T10:18:00.000Z (over 3 years ago)
Last Synced: 2025-01-17T23:19:48.378Z (over 1 year ago)
Topics: multithreading, python, selenium
Language: Jupyter Notebook
Homepage:
Size: 3.26 MB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

          
miRFetch: a package providing easy access to the DIANA microT and microT-CDS web servers

This package submits RNA sequences, for example tRNA fragments, to the DIANA web servers and determines their target spaces. It automates the web interface with Selenium web scraping and exposes the workflow through an easily accessible API.

Data are fetched from https://mrmicrot.imsi.athenarc.gr/?r=mrmicrot/index and from https://dianalab.e-ce.uth.gr/html/dianauniverse/index.php?r=microT_CDS


 miRT Fetching Segment: 


To start an analysis, first create a dictionary of sequences. Each key is the name of an RNA and each value is a list holding the individual RNA sequences, as shown below. In the future we will also support pd.DataFrame input as well as output from MintMap, a pipeline for annotating tRNA fragments.


```
from rnaFetch.mirTFetch import mirTFetch
from rnaFetch.mirCDSFetch import microTCDS

RNA = {"GlyCCC": ["GCATTGGTGGTTCAGTGGTAGAATTCTC",
                  "GCATTGGTGGTTCAGTGGTAGAATTCTCGCC",
                  "GCATTGGTGGTTCAGTGGTAGAATTCT"],
       "LysTTT": ["GGGAGCGCCCGGATAGCTCAGTCGGTAGAGCATCAGACTTTT",
                  "TCGGGCGGGAGTGGTGGCTTTT",
                  "TCGGGCGGGAGTGGTGGCTTT"],
       "ThrAGT": ["TCGAATCCCAGCGGTGCCTCCA",
                  "ATCCCAGCGGTGCCTCCA",
                  "ATCCCAGCGGTGCCTCCG"]
      }
```
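If the sequences start out as a flat list of (name, sequence) pairs, for example rows read from a table, the dictionary above can be assembled in a few lines. The following is a minimal sketch and not part of miRFetch: `build_sequence_dict` is a hypothetical helper, and accepting only the bases A, C, G, T and U is an assumption made for this example.

```python
from collections import defaultdict

VALID_BASES = set("ACGTU")  # assumption: only unambiguous nucleotides are accepted


def build_sequence_dict(records):
    """Group (name, sequence) pairs into {name: [sequence, ...]}.

    Hypothetical helper for illustration, not part of miRFetch.
    """
    grouped = defaultdict(list)
    for name, sequence in records:
        sequence = sequence.strip().upper()
        invalid = set(sequence) - VALID_BASES
        if not sequence or invalid:
            raise ValueError(f"{name}: invalid sequence {sequence!r}")
        if sequence not in grouped[name]:  # skip exact duplicates, keep order
            grouped[name].append(sequence)
    return dict(grouped)


records = [
    ("GlyCCC", "GCATTGGTGGTTCAGTGGTAGAATTCTC"),
    ("GlyCCC", "GCATTGGTGGTTCAGTGGTAGAATTCT"),
    ("ThrAGT", "ATCCCAGCGGTGCCTCCA"),
]
rna_from_records = build_sequence_dict(records)
```

The result has the same shape as the `RNA` dictionary, so it could be passed on in the same way. Rejecting unexpected characters early avoids spending a browser session on a sequence the web server would refuse.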

Then you can initialize the Selenium driver using "Chrome", "Firefox" or "Edge".


```
# Change to "Firefox" or "Edge" if you prefer.
# The Selenium driver is initialized in headless mode;
# set headless = None to get a browser window.
fetcht = mirTFetch("Chrome")
```

Next, set the score threshold used to accept a prediction as a target. The threshold can also be applied manually with pandas once the table has been generated. Then run the pipeline so that the miRNA web server determines the target spaces. The run also includes a multithreaded BioMart mapping step that converts each Ensembl transcript ID to its Ensembl gene ID and external gene name, which are better suited to downstream analyses such as GProfiler or DIANA microT-CDS.


```
fetcht.threshold = 0.95

# Run the analysis on the RNA dictionary defined above.
# The prediction table is returned and also stored in self.prediction_data.
# In addition, the UTR sequence table is provided in self.utr_table.
final_table = fetcht.run_miRNA_analysis(RNA)
```
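The ID conversion step suits threads because each lookup spends most of its time waiting on the network. The sketch below shows the general pattern with `concurrent.futures.ThreadPoolExecutor`; it is not miRFetch's implementation. `lookup_gene` and the `FAKE_BIOMART` table are stand-ins for a real BioMart query, and every ID in the table is a made-up placeholder.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the remote service: made-up IDs, for illustration only.
FAKE_BIOMART = {
    "ENST_A": ("ENSG_1", "GENE1"),
    "ENST_B": ("ENSG_1", "GENE1"),
    "ENST_C": ("ENSG_2", "GENE2"),
}


def lookup_gene(transcript_id):
    """Hypothetical lookup: return (gene_id, gene_name), or None if unknown."""
    return FAKE_BIOMART.get(transcript_id)


def map_transcripts(transcript_ids, max_workers=4):
    """Resolve transcript IDs concurrently; unknown IDs are left out."""
    unique_ids = list(dict.fromkeys(transcript_ids))  # de-duplicate, keep order
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lookup_gene, unique_ids)  # results follow input order
        return {tid: hit for tid, hit in zip(unique_ids, results) if hit is not None}


mapping = map_transcripts(["ENST_A", "ENST_C", "ENST_A", "ENST_X"])
```

De-duplicating before dispatch keeps the number of requests down, which matters when many predicted targets share the same transcript.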

 Get RNA miRNA overlap 


We also provide the overlapping target spaces between miRNAs and the queried sequences through the mirCDSFetch module.

Pass the final_table generated after the BioMart annotation to the following code snippet. The output is a table of miRNA:sequence prediction partners, with the shared partners summarized in the grouped table. Sankey plots are available for visualizing target space overlaps between the queried RNAs and miRNAs.



```
# This connects to the microTCDS web page via a Selenium driver.
# Genes are supplied in chunks of 500 per run.
# The threshold can be set to a float between 0 and 1; otherwise it is set automatically.
fetchcds = microTCDS(final_table)
new_table = fetchcds.run_miRNA_analysis(threshold = 0.95)
overlap, grouped = fetchcds.get_mt_cds_overlap(final_table, new_table)
```
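Two ideas from this step can be illustrated without the web server: splitting a gene list into batches of 500, and finding the targets shared between a queried sequence and each miRNA. The sketch below is an illustration under assumptions, not the code behind `get_mt_cds_overlap`. The function names, the input shapes (plain lists and sets of gene names) and the sample names are all hypothetical.

```python
def chunked(items, size=500):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def shared_targets(query_targets, mirna_targets):
    """Return {(query, mirna): sorted shared genes} for every non-empty overlap.

    Both arguments map a name to a set of target gene names.
    """
    overlaps = {}
    for query, q_genes in query_targets.items():
        for mirna, m_genes in mirna_targets.items():
            common = q_genes & m_genes
            if common:
                overlaps[(query, mirna)] = sorted(common)
    return overlaps


genes = [f"GENE{i}" for i in range(1200)]  # placeholder gene names
batches = list(chunked(genes))             # batches of 500, 500 and 200

pairs = shared_targets(
    {"GlyCCC": {"GENE1", "GENE2", "GENE3"}},
    {"miR-x": {"GENE2", "GENE3"}, "miR-y": {"GENE9"}},
)
```

Keying the result by (query, miRNA) pairs gives one row per link, which is the shape a Sankey-style plot of sources and targets typically needs.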

A second tutorial, showing how to start directly from a list of detected miRNAs, is available in the Tutorial folder.
