Projects in Awesome Lists by gauravcodepro
A curated list of projects in awesome lists by gauravcodepro.
https://github.com/gauravcodepro/genome-shell-utility
A genome shell utility to help you with project management and organization. It automatically creates the folders, transfers the files to the server, and moves them around.
clusters genome genomeutility hpc-clusters server
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/pangraphs-genome-assembly
Genome assembly to pangraphs, from Illumina to long reads.
bioinformatics container-image containerization docker longread pacbio-data pacbio-iso-seq pacbio-sequencing
Last synced: 18 Mar 2025
https://github.com/gauravcodepro/protein-annotator
A Python package to analyze protein-coding regions for genome annotation. It uses miniprot for the alignment and gives you all the protein-predicted mRNAs, coding regions, and other exon positions.
bioinformatics genome-alignment genome-analysis genome-annotation protein-sequences python-library
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/numpy-builder
A NumPy shell builder that extracts and shows how to use NumPy across arrays. I am putting the entire manual here for those who like to search immediately rather than looking here and there.
bash-prompt bash-script bash-scripting data-analysis data-mining data-science numpy numpy-arrays shell-prompt shell-script
Last synced: 16 Jun 2025
https://github.com/gauravcodepro/nextflow-pacbiohifi
A Nextflow pipeline for genome assembly from PacBio HiFi reads.
bioinformatics nextflow nextflow-language nextflow-pipeline nextflow-process pacbio-data pacbiohifi
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/pubmed-abstract-fetcher
This function prepares the abstract and ID information for all the PubMed articles that you want to read and cite. I coded this using a web scraping approach; in my experience it is fast and parses more cleanly than NCBI E-utilities.
bioinformatics corpus genomics language-model language-modeling natural-language-processing pubmed pubmed-parser
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/genome-size-estimation
A reference genome size estimator based on peak size calibration. Given the peak size calibration values, it estimates the genome size and the subgenome fraction.
genome genome-analysis genome-assembly genome-sequencing size-calculation size-optimization
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/genomic-alignment-extraction
A scalable genomic fraction aligner and extractor for large-scale alignment of genomes and transcriptomes, processing them across cores to extract the aligned regions. The aligned regions can also be passed to the length plotter and used to train machine learning models for specific applications.
alignment-algorithms alignmentextract alignments genome genome-annotation genome-assembly genomealignment transcriptome-analysis transcriptomealignment transcriptomics
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/genome-analysis-slurm-pbs
Slurm and PBS scripts for genome, metagenome, and transcriptome analysis.
bioinformatics bioinformatics-analysis bioinformatics-pipeline bioinformatics-scripts genomic-data-analysis genomics genomics-visualization
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/codingplotter
A coding-region plotter for protein annotations derived from genome annotation with protein hints. It extracts and plots the specific length estimates.
bioinformatics bioinformatics-visualization genome-analysis genome-annotation protein-sequences
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/langchain-candida-literature-miner
Candida literature mining with BERT machine learning. Works on any literature term from PubMed.
fungal-genomes fungalgenomics machinelearning natural-language-generation natural-language-processing pubmed pubmed-abstracts pubmed-records
Last synced: 26 Feb 2025
https://github.com/gauravcodepro/graphanalyzer
graphanalyzer: a Python package for working with PAF and GFA alignments.
alignments bioinformatics genomes graphalignments graphs graphs-algorithms machinelearning
Last synced: 12 Jun 2025
https://github.com/gauravcodepro/tairaccession
tairaccession: a Python package for interacting with TAIR and analyzing the Arabidopsis genome.
annotation-processor annotation-tool arabidopsis bioinformatics bioinformatics-tool comparative-analysis comparative-genomics genome-analysis genome-annotation phytozome plantgenomes plantgenomics tair3
Last synced: 02 Aug 2025
https://github.com/gauravcodepro/ko-phylogenomics
Analyzing bacterial genome-based ontologies and phylogenomic informativeness.
bacterial-genome-analysis bacterial-genomes geneontology genome-annotation ontologies ontology plant plantgenomes
Last synced: 02 Aug 2025
https://github.com/gauravcodepro/python-datastructures
Python data structures and algorithms.
codewars codewars-kata codewars-kata-solution codewars-solutions datastructures datastructures-algorithms faang-interview interview interview-preparation interview-questions leetcode-python leetcode-questions leetcode-solutions
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/pacbio-nanopore-repeat-coverage
A long-read repeat coverage calculator. Given a long-read file before assembly, either directly from the sequencing runs or after cleaning, it calculates the total amount of repeat stretches present in the sequencing reads, which you can plot before assembly.
bioinformatics bioinformatics-algorithms bioinformatics-tool longread oxford-nanopore pacbio pacbio-sequencing sequencing-coverage sequencing-data sequencing-error
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/pangenome-single-copy-gene
A single-copy gene analysis for pangenomes that takes precomputed orthology and supports alignment and phylogeny building using three alignment approaches, with the GTRCAT and GTRGAMMA models and 1000 bootstraps. To check the file conversion, it iterates over the first files at the start and checks whether the number matches.
bioinformatics bioinformatics-analysis pangenome pangenome-inference pangenome-pipeline
Last synced: 22 Jul 2025
https://github.com/gauravcodepro/tair-pubmed-connector
There is no function to automatically fetch the PubMed article links reported in TAIR for use with language models, so I coded this function. It takes the TAIR information (a gene or locus tag), fetches the corresponding PubMed entries, and then fetches the corresponding abstracts from PubMed.
bioinformatics bioinformatics-analysis bioinformatics-tool genome-annotation language-model pubmed pubmed-abstracts pubmed-articles-grabber pubmed-records tair3
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/plant-microarray-analysis
Analysis of already normalized microarray expression profiles. It performs batch analysis and plots volcano plots and differential expression.
microarray microarray-analysis microarray-data-analysis microarray-experiments
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/pangenome-evolutionary
A complete workflow for analyzing pangenomes from the core gene sets. You simply provide the FASTA files and it does everything, making all the accessory information plots from the evolutionary analysis. It also checks for breakage in the phylogeny and performs the repoint analysis.
bioinformatics evolutionary-computation evolutionary-computing pangenome pangenome-clustering pangenome-inference
Last synced: 04 Jul 2025
https://github.com/sablokgaurav/graphanalyzer
graphanalyzer: a Python package for working with PAF and GFA alignments.
alignments bioinformatics genomes graphalignments graphs graphs-algorithms machinelearning
Last synced: 31 Jul 2025
https://github.com/gauravcodepro/pangenomes-metagenomes
A workflow for complete analysis of bacterial metagenomes and pangenome graphs, with direct viewing in Panache. It also analyzes metagenomes from both Illumina and long reads.
bacterial-genomes bioinformatics metagenomes metagenomics-data pangenomics pangraphs
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/domain-analyzer
This repository contains a faster, data-science-based implementation for handling the domain predictions from InterProScan. It gives you complete domain information, coordinates, and other associated information. I used a dataframe mapping approach to make it faster rather than looping over the data again and again.
bacterial-genome-analysis bacterial-genomes domains functional-programming functionaldomains fungal-genomes fungaleffectors interproscan plantgenomes protein-sequence protein-structure proteindomains
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/evolutionary-rate-analyer
An R function for analyzing evolutionary rates from FASTA files. It uses Ka/Ks and dN/dS and plots the evolutionary rates.
bacterial-genome-analysis bacterial-genomes bioinformatics evolution evolutionary-computation fungal-genomes fungal-metagenomics plantgenomics rprogramming
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/shard-fetcher
A Python way to web scrape the shards. Some of the categories don't have descriptions, otherwise I would have made a dataframe; the functions can still be used regardless of the name tags to find the corresponding names on GitHub.
crystal-lang shard web3 webscraping
Last synced: 10 Jun 2025
https://github.com/gauravcodepro/streamlit-pangraphs-pangenome
A Streamlit component for pangraph and pangenome visualization.
bioinformatics pangenome pangenome-graph streamlit streamlit-application streamlit-webapp
Last synced: 12 Jun 2025
https://github.com/gauravcodepro/tair-gff-ids
A set of functions that provide easy access to a cleaned GFF from TAIR, using a dataframe-based data science approach to get the systematic TAIR IDs and their coordinates from the TAIR10 GFF. It can be applied to any TAIR version for systematic retrieval of the TAIR IDs.
genome genome-analysis genome-annotation genome-assembly genome-sequencing tair tair3 transcriptome-annotation transcriptome-assembly
Last synced: 05 Sep 2025
https://github.com/gauravcodepro/plant-resistance_gene_isolator
I coded this function for comprehensive isolation of plant resistance genes from long-read sequencing. Given PacBio or Oxford Nanopore reads, it assembles them, predicts the plant disease resistance genes, and lets you analyze mutations in those genes.
bioinformatics bioinformatics-pipeline bioinformatics-tool fungal-effectors fungal-genomes plantdisease plantdiseasedetection plantgenomics planthealth
Last synced: 06 Jul 2025
https://github.com/gauravcodepro/genomehifi-contiguity
A conda YAML for genomehifi-contiguity that lets you create the environment for all the analyses.
bioinformatics genome-analysis genome-assembly hifi pacbio pacbio-sequencing
Last synced: 06 Jul 2025
https://github.com/gauravcodepro/ontology-network-candida
analyzing candida ontology terms for network analysis.
candida expressionnetworks fungal-genomes geneontology network-analysis
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/genomeassembly-pacbiohifi
PacbioHifi genome assembly benchmarks
bioinformatics genome-assembly pacbio-hifi-sequencing-reads pacbio-sequencing pacbioreadsconnection protoype
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/streamlit-metabolic-json-modelling
A Streamlit application for metabolic JSON from the BiGG modelling database.
metabolic-models metabolic-pathways metabolic-reconstruction streamlit-application streamlit-webapp
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/ruby-rails-webrowser
A PacBio HiFi Ruby on Rails web browser deployable on Heroku for rapid PacBio HiFi genomics analysis.
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/cascading-style-sheets-genome
cascading style sheets for the genome analysis, javascript and pacbiohifi
bioinformatics css css-framework css-grid css-grid-layout css3 genome-browser
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/warp-workflow
A Warp workflow builder that uses the shell's built-in array addition to append your commands to the built-in array declaration.
array-methods arrays cluster-computing rust shellcode warp workflow
Last synced: 27 Mar 2025
https://github.com/gauravcodepro/awk-pacbiohifi1
awk library and functions for pacbiohifi read analysis.
awk-programming-language biological-expression-language compiler genome-analysis
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/trie-suffixarray.jl
An implementation of trie-suffix arrays for PacBio HiFi.
pacbiohifi suffix-array suffix-tree suffix-trie
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/flask-api-genome
A Flask API implementation of Fastify and FastAPI to disseminate genome information using APIs.
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/python-pbs-slurm-writer
This repository contains two custom functions that prepare PBS files for your cluster computing. Simply call the function; it asks for the parameters and then outputs the complete PBS file so that you can submit it to the cluster.
bash-script cluster computing-cluster computing-framework high-performance-computing pbs pbs-torque python python-script
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/visual-pacbiohifi-verkko
multiple visualizations for the pacbiohifi assembly from verkko including a graph visualization, a link visualization, plotting the coverage and other parameters
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/evoseq-genome-informatics
An R package that goes from genome annotations to phylogeny, for the analysis of specific genes from sequenced genomes.
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/sequencing-tag-generator
A Ruby function to generate DNA sequence barcodes for sequencing labelling. It takes a barcode length and the number of iterations you want to produce.
barcode-scanner barcodegenerator genome-sequencing ruby rubygem
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/plant-resistance-gene-logistic-regressor
An application of logistic regression to plant disease resistance genes. Given a FASTA file, the corresponding expression file, and the motif types that you think are associated with plant disease resistance, it prepares the classification datasets and then fits a logistic regressor for model building.
logistic-regression machine-learning machine-learning-algorithms plantdisease plantdiseaseclassification plantdiseasedetection plantgenomes plantgenomics sequence-labeling
Last synced: 30 Jun 2025
https://github.com/gauravcodepro/semanticweb-ontologies-prepare
generating the ontology graphs and the system relationship.
bioinformatics graph-algorithms graph-theory ontologies ontologies-api
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/r-package-backup
r function for the R package installation
installer package-management rpackage rpackages rprogramming
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/panachegraph
A Julia package implementing genome graphs and a nearest-neighbour approach for pangenome graphs.
Last synced: 21 Jun 2025
https://github.com/gauravcodepro/linear-regression-model-sequence
I coded this linear regression training model based on sequence features across the sequences. It takes one of two arguments: train the model only, or train the model and predict with it.
bioinformatics bioinformatics-algorithms linear-regression machine-learning machine-learning-algorithms
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/genotyping-platform
A custom function to prepare files for genotyping or sequencing. You can specify the path and the FASTA files and mark them according to the desired condition for genotyping or sequencing.
genome-sequencing genomes genotyping sequencing sequencing-data sequencing-reads
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/intergenic-extractor
extracting all the intergenic regions from the genome annotation using the protein alignments.
annotation-tool bioinformatics genome-alignment genome-analysis genome-annotation protein-sequences
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/json-github
A JSON parser for GitHub repositories. Given the source code from GitHub, it builds the direct clone addresses of the GitHub repositories.
configuration-files docker github github-pages kubernetes shell
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/hmm-genome-annotations
An HMM genome annotation parser that filters hmmscan results and parses annotations from the genome to generate the final protein annotations for genome submission.
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/docker-visual
Applies a Nushell (Rust) programming approach to Docker containerization and creates arrays from it. A fresh way to view Docker containerization.
cluster-computing docker docker-container docker-images dockerfiles
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/gfa-fasta
A GFA-to-FASTA writer that reads the GFA alignment from the graphs and writes the FASTA. Modify the headers as you need. It runs in a single execution with faster iteration rates.
bioinformatics bioinformatics-algorithms graphalignments graphs machinelearning pangenomes visualization
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/ensembl-plants
A single line of code to resolve all the Ensembl Plants versions and make them usable for genome assembly and further downstream analysis. You don't have to invoke the API directly, and you don't have to write a short::route connector.
Last synced: 28 Aug 2025
https://github.com/gauravcodepro/acetylation-factors
A set of functions to estimate acetylation factors, usable across a multitude of genes. Support for JAX and for shell invocation is being added to give you the complete histone analysis up to the acetylation factors. It self-executes the Python class.
acetylation bioinformatics chipseq chipseqanalysis chipseqdata datascience
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/diff-alternative-data-structure-r
I read a post today that mentioned diff, which I have used a lot in R, and I put up this repository just to show that you can also do this from a data structure point of view.
data-science data-structures r rdataframe rprog rprogramming
Last synced: 17 Aug 2025
https://github.com/gauravcodepro/expression-neural-network
A deep neural network, expression-based classifier, demonstrated on an unbalanced dataset of expression data across samples.
deep-neural-networks deeplearning expression-evaluator machine-learning machine-learning-algorithms neural-network neural-networks
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/pacbio-hifibrowse-electron-app
An Electron-based PacBio HiFi browser for visualizing aligned PacBio HiFi long reads.
Last synced: 07 Jul 2025
https://github.com/gauravcodepro/illumina-genome-size
Genome size estimation from Illumina reads, for pre-screening purposes only. Also includes an R function.
genome-analysis genome-assembly genomes jellyfish rprogramming size-calculation
Last synced: 07 Jul 2025
https://github.com/gauravcodepro/evolutionary-fitness-calculation
A data structure approach to generate a random sequence from the polyATGC stretches for evolutionary fitness
bioinformatics evolutionary-algorithms evolutionary-biology evolutionary-computation modelling-framework ruby rubyprogramming
Last synced: 12 Jul 2025
https://github.com/gauravcodepro/zenity-docker-nu
A Zenity app for checking Docker subnet masks, written with Nu (Rust) programming. I originally coded this for Slurm and PBS clusters and have now adapted it for Docker as well. You can define your Docker process ID and it will get the netmask for it, so that you can connect to those instances easily.
cluster-computing docker nushell nushell-core nushell-script
Last synced: 15 Jun 2025
https://github.com/gauravcodepro/kmer-terminal-extract-nodes
a go program for getting the terminal edges with the specific types.
graph graph-algorithms kmer kmer-edges
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/genomeprofiler-fyne
A Fyne Go application for genome profiling. It profiles all genomes.
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/modelling-geospatial
ArcPy and QGIS for vector and raster images and time-series machine learning.
Last synced: 30 Aug 2025
https://github.com/gauravcodepro/pacbio-polyatgc-trimmer-recursion
long_read_polyATGC_trimmer using regular expression.
bioinformatics fasta-sequences oxford-nanopore pacbio-data pacbio-sequencing pattern-matching trimming-bases
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/pacbiohifi-analyzer
a pacbiohifi analyzer for pacbiohifi reads from sequence analysis to the graph alignments.
bioinformatics genome-assembly pacbio-data pacbio-iso-seq pacbiohifi
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/pbs-configure-r
R code for the PBS users for generating and working on the PBS clusters
computational-biology high-performance high-performance-computing r rprofile-management rprog rprogramming rproject
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/metagenomics-otu-plotter
plotting metagenomics otu abundances
bash-scripting bioinformatics metagenomics metagenomics-binning metagenomics-counts metagenomics-data shell-script
Last synced: 23 Aug 2025
https://github.com/gauravcodepro/bigg-slint-interface
A Slint interface to the BiGG modeler database that establishes all the links to MetaNetX.
genome-analysis genome-annotation metabolic-pathways slint slint-ui
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/pacbiohifi-seq-app
A single-page Shiny Express pacbiohifi-seq-app for sequencing facilities or startups.
pacbio-sequencing pacbiohifi sequencing-data sequencing-data-analysis shinycore shinydashboard shinyexpress startups
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/bigg-modeler
A desktop application for the BiGG modeler, built with React Native.
autoupdater desktop-app desktop-environment electron javascript metabolic-models react-native
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/visualize-proteins
A protein visualization for the annotation of genome transcripts and their integration into genome maps. You can supply the alignments in GFA, MFA, or PAF format, and it visualizes the transcript-to-gene alignments.
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/pacbiohifi-shiny-report-application
pacbiohifi shiny application
bioinformatics data-visualization data-visualization-dashboard expression pyshiny-core pyshiny-express
Last synced: 13 Jul 2025
https://github.com/gauravcodepro/zsh-posh-scrap
ZSH_POSH_web_scrapping: this repository contains Bash-based web scraping for installing the Nerd Fonts for programming.
bash bash-scripting font-awesome fonts nerd-fonts oh-my-posh oh-my-posh-theme oh-my-zsh-theme programming programming-fonts webscraping
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/transdecoder-visualization
A regular-expression-based encoder for TransDecoder predictions from Trinity assemblies. It parses and prepares the transcript annotations for visualization with any genome visualization kit, such as pyGenomeViz, Mauve, and others, and prepares the coordinates as tuples.
geneprediction genome-annotation transcriptome-analysis transcriptome-annotation transcriptome-assembly transcriptome-wide trinity
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/shell-repo-generator
A shell sudo repo generator for finding and updating packages on cloud instances as well as Slurm and PBS instances. Just provide the instance name and the package to be searched, and it retrieves all the available installs.
cloud cloud-computing linux-shell slurm-cluster
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/shiny-metabolic_analysis
A Shiny application for analyzing the metabolic genome, with visualization and mapping analysis. It also lets you create multiple mapping terms.
python3 rshiny rshinyapp shiny-applications shiny-python shiny-r
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/genome-datautilities
genome-datautilities for genome analysis, as well as sequence and string manipulation.
bash bash-script bashpro chpc cluster computing-cluster computing-framework programming shell shell-script
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/zenity-processid-application
A Zenity-based process ID application that automatically kills the process after plotting.
cluster-computing network network-programming processes zenity-gui
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/ruby-genome-annotation
A genome annotation length calculator written in Ruby. It invokes a shell subprocess within Ruby to parse the iterators at a faster rate. If you have dozens of sequenced genomes, simply mention the column number and the iterator will hash the length. Added support for the features as
bioinformatics genome-analysis genome-annotation ruby-gem
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/pangenome-proportion
An R function to estimate the proportion of the pangenome across species after the orthology runs. It builds the proportion table across the species. You provide the species file and the proportion is then estimated.
bioinformatics pangenome rprogramming
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/plant-resistance-gene-fetcher
A custom function that fetches entries from the plant resistance gene database and returns the corresponding dna_sequence and protein_sequence.
bioinformatics disease-detection diseaseresistance plantgenomics plantresistance sequence-analysis sequence-labeling
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/odd-ratio-estimator
This function takes a data frame of the outbreak and predicts the odds ratios and the specific likelihood of occurrence of the disease in that specific geographical location.
genomic-data-analysis genomics-data infectious-disease-models infectious-diseases odds-ratio python risk-analysis risk-assessment risk-management
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/neural-network-metagenomics
A function to generate the hidden layers from the given FASTA and expression files. It takes the replicate columns and then calculates the expression and length as a hidden layer. Applicable to transcriptomics, metatranscriptomics, and other expression datasets.
bioinformatics machinelearning machinelearningalgorithms metagenomics metatranscriptomics neural-network neural-networks
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/phytozome-pacid-fetcher
This function takes an IDs file with the genes of interest and the Phytozome GFF files, and fetches the PACIDs for those genes.
annotation-processor bioinformatics bioinformatics-analysis genome-analysis genome-annotation phytozome
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/protein-multialign-gem
A protein multialign gem for sequence extractions: upstream and downstream promoters, enhancers, and the motifs localized near the mRNA.
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/awk-plotter
An awk-based sort-index way to plot files or directories across Docker containers. Integrate it into your ~/.bashrc, your ~/.zshrc, or a cron job to manage disk space across Docker containers.
awk awk-programming-language cluster-computing containers docker filesystem watcher
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/paf-graphs-plotter
A PAF formatter and plotter for graph analysis that lets you view miniasm alignments. It uses a faster approach for analyzing PAF alignments and also extracts and outputs the corresponding PAF alignments for the graphs.
alignment-algorithm alingmentplotter graph miniasm paf visualization-tools
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/maf-diagonal-information
A MAF diagonal information tool written in AWK. It gets the specific coordinates of the alignments present on the plus or the minus strand, along with their diagonal information, so that they can be used as nodes.
awk-programming-language bioinformatics diagonal mafalignments pacbiohifi
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/mirna-neural-network
A function to prepare the neural network sequence for the miRNA predictions. It uses the target prediction and the transcript and prepares the targets for the neural networks
bioinformatics machine-learning machine-learning-algorithms microrna-sequence microrna-target-prediction microrna-targets micrornas neural-machine-translation neural-networks
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/miniprot-ruby-gem
A Ruby gem for parsing miniprot alignments and extracting locus-specific and ID-specific records, as well as scores and iters.
bioinformatics genome-analysis genome-annotation ruby-gem rubyprogramming
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/pbs-altair-pro-bash
This repository contains code for Altair PBS Pro at CHPC. Save the code in a file ending with .sh and run it as a shell script, so you don't have to remember the PBS Pro manual.
altair bash-script bash-scripting computing-cluster high-performance high-performance-computing pbs pbs-torque
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/plant-resistance-gene-miner
A plant resistance gene miner that combines regular expressions with web scraping. Given a resistance gene ID, it returns the GenBank ID.
bioinformatics bioinformatics-analysis genemining plantdisease plantdiseaseclassification plantgenomics resistancegenes
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/pacbiohifi-universitat-potsdam
A Slurm-configured analytical pipeline for analyzing PacBio HiFi sequenced genomes using Verkko, hifiasm, and genomeasm4pg. Just provide the link to the sequencing data or the folder with the files, and it does the rest.
genome-annotation genome-assembly genome-sequencing pacbio-data pacbio-sequencing slurm-cluster slurm-job slurm-workload-manager
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/literature-bert
A Python class for training on biomedical literature. It reads the text from a PDF, tokenizes it, and then trains using a BERT model.
bert-embeddings bert-models bioinformatics genomics literature-mining machine-learning-algorithms natural-language-processing naturallanguageprocessing
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/lastz-alignment-awk
Sorts genomes and plots alignment lengths from LASTZ alignments for node calculations, right before you import them for the calculations.
genome-analysis genomealignment hifi pacbio-data pacbio-sequencing
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/genome-annotation-multivisual
A dplyr version of the visualization of all coding regions for specific IDs from protein alignments.
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/linear-regression-bounded
Fitting a linear regression on the height and bolting time of lettuce phenotypes to see whether a linear relationship can be established.
bioinformatics linear-models linear-regression machinelearning memorylinearregression python rprogramming
Last synced: 22 Feb 2025
https://github.com/gauravcodepro/graph-sequence-similarity
A site-specific function to estimate site similarity across graphs. To make it faster, I implemented a linear approach that compares the iterable as a list and stores the result in another iterable. You can nest these functions as callables, since graphs are nested linearly in the grammar of graphics.
Last synced: 02 Aug 2025