https://github.com/bryancastillo10/dna-seq-explorer
Web App for Basic Molecular Biology Analysis (with trained AI feature for DNA Type/Possible Taxa Prediction)
- Host: GitHub
- URL: https://github.com/bryancastillo10/dna-seq-explorer
- Owner: bryancastillo10
- License: mit
- Created: 2025-02-21T06:55:27.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-06-27T01:28:12.000Z (7 months ago)
- Last Synced: 2025-07-04T13:21:51.408Z (6 months ago)
- Topics: bioinformatics-tool, fastapi, machine-learning, material-ui, react-query, tanstack-router, zustand
- Language: TypeScript
- Homepage: https://dna-seq-explorer.fly.dev/
- Size: 8.42 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.MD
- License: LICENSE
README
## 🧬 DNA Seq Explorer Web App
This **React-TypeScript & Python-FastAPI** web application helps life science enthusiasts perform basic analysis of fundamental biomolecules such as **DNA, RNA, & protein**. Its features include calculating sequence parameters, predicting taxa and DNA type, and performing pairwise sequence alignments.
The architecture of this app does not include a storage system (no database), but it offers a **report export feature** for saving the analysis results. As for the tech stack, the client side is mainly built with React-TypeScript, together with Material UI, TanStack Router, React-Query, and Zustand. The server side is powered by Python-FastAPI with libraries such as NumPy, ReportLab, Pydantic, and scikit-learn.
## 1. Client Side Directory
```bash
#client/src
├── assets
│   ├── icons
│   └── images
├── components
│   ├── common
│   ├── layout
│   ├── navigations
│   ├── providers
│   └── ui
├── constants
├── context
├── features
│   ├── dotplot
│   ├── fileExport
│   ├── home
│   ├── pairSeq
│   └── singleSeq
├── hooks
├── routes
├── utils
└── zustand
```
## 2. Server Side Directory
```bash
├── api
├── example
├── lib
├── models
├── service
│   ├── advanced_analysis
│   ├── basic_analysis
│   ├── dotplot
│   ├── file_export
│   ├── global_alignment
│   └── local_alignment
└── utils
```
## 3. Preview (Screenshots)


## 4. User Features

> Analyze DNA/RNA or protein sequences to extract key biological information. For DNA/RNA, the results include the transcription, reverse complement, translation, GC content, and nucleotide frequency. For protein sequences, the results include molecular weight (Da), isoelectric point, and amino acid frequency.
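
The DNA parameters listed above can be sketched in plain Python. This is a minimal illustration, not the project's actual service code: the function names are hypothetical, the input is assumed to be an uppercase-able A/C/G/T string, transcription follows the coding-strand convention (T replaced by U), and translation uses the standard genetic code read from the first base.

```python
from collections import Counter

# Complement table for DNA bases (assumes an A/C/G/T alphabet).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
_CODON_TABLE = {
    b1 + b2 + b3: _AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(_BASES)
    for j, b2 in enumerate(_BASES)
    for k, b3 in enumerate(_BASES)
}


def gc_content(dna: str) -> float:
    """Return the percentage of G and C bases in the sequence."""
    dna = dna.upper()
    if not dna:
        return 0.0
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def reverse_complement(dna: str) -> str:
    """Complement every base, then reverse the strand."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def transcribe(dna: str) -> str:
    """Coding-strand convention: the mRNA equals the DNA with T replaced by U."""
    return dna.upper().replace("T", "U")


def translate(dna: str) -> str:
    """Translate complete codons from position 0; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(_CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def nucleotide_frequency(dna: str) -> dict[str, int]:
    """Count how often each base occurs."""
    return dict(Counter(dna.upper()))
```

For example, `reverse_complement("ATGC")` yields `"GCAT"` and `translate("ATGGCCTAA")` yields `"MA*"`.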

> Leverages a trained AI model to predict taxonomic classification (Kingdom level) and DNA type (e.g. genomic, mitochondrial, chloroplast). The AI was built using a public dataset from the UCI Machine Learning Repository. This feature demonstrates my skill set in the foundations of machine learning and its API integration via a Python microservice.

> Dotplot alignment is a fundamental computational biology method for quickly comparing two sequences: it plots a dot wherever a base or amino acid in one sequence matches one in the other. Using the matplotlib graphing library, the API endpoint generates an image from the input sequence pair. Note that a dotplot relies on a simple matching algorithm and is highly susceptible to noise and redundancy, so it may not be suitable for complex gene comparisons.
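
The matching step behind a dotplot can be sketched as a binary matrix. The function names here are hypothetical, and the sketch renders the matrix as text to stay dependency-free, whereas the app itself produces an image with matplotlib.

```python
def dotplot_matrix(seq_a: str, seq_b: str) -> list[list[int]]:
    """Return a matrix with 1 where seq_a[i] == seq_b[j], else 0.

    Rows follow seq_a and columns follow seq_b.
    """
    a, b = seq_a.upper(), seq_b.upper()
    return [[1 if x == y else 0 for y in b] for x in a]


def render_text(matrix: list[list[int]]) -> str:
    """Draw the matrix with '*' for a match and '.' for a mismatch."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in matrix
    )


if __name__ == "__main__":
    # Identical sequences show an unbroken main diagonal.
    print(render_text(dotplot_matrix("GATTACA", "GATTACA")))
```

Every single-character match becomes a dot, which is why short or repetitive sequences produce the noisy plots mentioned above. A common remedy, not necessarily used by this project, is to require a run of consecutive matches before plotting a dot.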

> Local pairwise sequence alignment based on the scoring system of the Smith-Waterman algorithm. This feature is ideal for identifying conserved regions or subsequences, called motifs, within large sequences.
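
A minimal sketch of Smith-Waterman scoring follows. The function name and the scoring values (match +2, mismatch -1, linear gap -2) are assumptions for illustration; the scheme the app actually uses is not stated here. The sketch returns only the best local score, without the traceback.

```python
def smith_waterman_score(
    a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2
) -> int:
    """Return the best local alignment score between a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] is the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere,
            # which is what makes the alignment local.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best
```

With these values, `smith_waterman_score("TTACGTT", "GGACGGG")` scores the shared `ACG` region at 6, regardless of the mismatching flanks.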

> Global pairwise sequence alignment based on the scoring system of the Needleman-Wunsch algorithm. It is suitable for comparing entire DNA or protein sequences to reveal evolutionary relationships and functional similarities.
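
A minimal sketch of Needleman-Wunsch with traceback follows. As with the local case, the function name and the scoring values (match +1, mismatch -1, linear gap -1) are assumptions for illustration rather than the app's actual parameters.

```python
def needleman_wunsch(
    a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1
) -> tuple[int, str, str]:
    """Return (score, aligned_a, aligned_b) for a global alignment."""
    n, m = len(a), len(b)
    # F[i][j] is the best score for aligning a[:i] with b[:j].
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        F[0][j] = j * gap  # b[:j] aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s, F[i - 1][j] + gap, F[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return F[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Unlike the local variant, scores are not clamped at zero and the alignment must span both sequences end to end, so `needleman_wunsch("ACGT", "AGT")` returns a score of 2 with `ACGT` aligned to `A-GT`.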
## 5. Software System Architecture & Machine Learning Workflow

Raw Data Preprocessing & AI Modeling (AI Development Phase)


## 6. API Documentation
The server exposes a REST API built with the Python FastAPI framework. It consists mostly of POST endpoints for the analysis features and for exporting report files.
**[View Full Documentation](server/docs/index.md)**
## 7. License
MIT License
Copyright (c) 2025 Bryan Castillo
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.