https://github.com/csbl-br/awesome-compbio-chatgpt
An awesome repository of community-curated applications of ChatGPT and other LLMs in computational biology
List: awesome-compbio-chatgpt
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/csbl-br/awesome-compbio-chatgpt
- Owner: csbl-br
- License: unlicense
- Created: 2023-03-24T00:02:03.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-04-16T18:02:14.000Z (9 months ago)
- Last Synced: 2024-05-20T11:02:37.368Z (8 months ago)
- Size: 85 KB
- Stars: 241
- Watchers: 10
- Forks: 24
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
- awesome-ChatGPT-repositories - awesome-compbio-chatgpt - An awesome repository of community-curated applications of ChatGPT and other LLMs in computational biology (Awesome-lists)
- awesome-llm-and-aigc - csbl-br/awesome-compbio-chatgpt: An awesome repository of community-curated applications of ChatGPT and other LLMs in computational biology! (Summary)
README
# awesome-compbio-chatgpt
[![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome)
An awesome repository of community-curated applications of ChatGPT and other LLMs in computational biology!
Any material is welcome, as long as it is:
* Free to read without accounts
* High quality (subjective, but we know it when we see it)
* Relevant to bioinformaticians

To contribute, just add the link to this document and open a Pull Request!
Have fun!
## Prompt collections
* Data science prompts: https://github.com/travistangvh/ChatGPT-Data-Science-Prompts
* ChatGPT cheatsheet for Data Science: https://www.datacamp.com/cheat-sheet/chatgpt-cheat-sheet-data-science
* Fun and useful prompts: https://github.com/f/awesome-chatgpt-prompts
* List of awesome ChatGPT lists: https://github.com/OpenMindClub/awesome-chatgpt
* Collection of prompts for developers (YouTube): https://www.youtube.com/watch?v=sTeoEFzVNSc&t=901s
## Applications that use the GPT API
* GPT for Google Sheets: https://gptforwork.com/
* GPT for R Studio: https://github.com/MichelNivard/gptstudio
* GPT for R Developers: https://jameshwade.github.io/gpttools/
* GPT for PDFs (in a web interface, freemium service): https://chatpdf.com
* GPT directly from the command line: https://github.com/npiv/chatblade
* GPT-based chatbot fine-tuned for bioinformatics: https://ai.tinybio.cloud/chat (discussed on [Biostars](https://www.biostars.org/p/9565757/))
* Very powerful question answering using GPT and internet searches: https://www.perplexity.ai/
* Python framework for connecting conversational AI to bioinformatics pipelines: https://github.com/biocypher/biochatter and [ChatGSE](https://chat.biocypher.org)
## Prompt Engineering
* Quick guide of best practices for Prompt Engineering: https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api
* More advanced guidance on Prompt Engineering: https://www.promptingguide.ai/
* Good 20-minute blog post covering core aspects of Prompt Engineering: https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/
## Understanding the tool
* A very good post by Stephen Wolfram on the details of _how_ it works: https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/
* List of papers published by OpenAI: https://openai.com/research
* Basic YouTube Crash Course on ChatGPT (12/December/2022, so a bit old): https://www.youtube.com/watch?v=JTxsNm9IdYU
* Technical overview of how LLMs work and what their future may hold (Eight Things to Know about Large Language Models): https://cims.nyu.edu/~sbowman/eightthings.pdf
### Ethics and Accountability Discussions
* "Pause Giant AI Experiments: An Open Letter": https://futureoflife.org/open-letter/pause-giant-ai-experiments/
## Bioinformatics-specific resources
* Ten Quick Tips for Harnessing the Power of ChatGPT/GPT-4 in Computational Biology (our article): https://arxiv.org/abs/2303.16429 and https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011319
* Single-cell RNA-seq analysis in Python guided by ChatGPT (nice video, from Cell Ranger output to final plots): https://www.youtube.com/watch?v=fkuLFlC2ZWk
* Using ChatGPT in bioinformatics and biomedical research (good blog post): https://omicstutorials.com/using-chatgpt-in-bioinformatics-and-biomedical-research/
* ChatGPT for bioinformatics (from the perspective of a bioinformatician): https://medium.com/@91mattmoore/chatgpt-for-bioinformatics-404c6d0817a1
* ChatGPT and bioinformatics careers (Reddit forum discussion): https://www.reddit.com/r/bioinformatics/comments/11wwnqj/chatgpt_and_bioinformatics_careers/
* ChatG-PPi-T: Finding Interactions with OpenAI (far from production-ready, but a creative idea): https://www.linkedin.com/pulse/chatg-ppi-t-finding-interactions-openai-jon-hill/
* BioGPT: generative pre-trained transformer for biomedical text generation and mining: https://github.com/microsoft/BioGPT
* A Platform for the Biomedical Application of Large Language Models: https://arxiv.org/abs/2305.06488
* scGPT: Foundation model for single-cell biology: https://github.com/bowang-lab/scGPT ([Paper (not OA)](https://doi.org/10.1038/s41592-024-02201-0))
## API and advanced applications
* LangChain combines LLM/GPT API requests with (1) access to external documents and (2) abilities to talk to the wider web, creating semi-autonomous agents: https://python.langchain.com/en/latest/
* LlamaIndex is an interface for combining LLMs with external data (such as documents): https://gpt-index.readthedocs.io/en/latest/index.html
* guardrails is an application to improve and refine LLM outputs in Python: https://github.com/ShreyaR/guardrails
* Embedchain is a framework to create ChatGPT-like bots over your personalised dataset in 3 lines of code: https://embedchain.ai/
* The [Hugging Face Hub open LLM leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) collects and ranks open-source LLMs; very important for accessibility and reproducibility of LLM-based bioinformatics
## Quick Prompts
The following prompts may not be the most efficient in terms of prompt engineering, but they are quick, useful and exemplify usage of ChatGPT for computational biologists.
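The prompts in this section use `{code here}` as a placeholder for the snippet you paste in. A minimal Python sketch of filling such templates programmatically could look like the following. The names `PROMPTS` and `build_prompt` are hypothetical (they come from no library and are not part of this repository), and the placeholder is shortened to `{code}` so that `str.format` can fill it.

```python
# Hypothetical helper for illustration only: PROMPTS and build_prompt are
# made-up names, not part of any library or of this repository.
PROMPTS = {
    "comment": "Add explanatory comments to this code: {code}",
    "rename": "Rename the variables for clarity: {code}",
    "unit_test": (
        "Write a unit test for the following function "
        "and help me implement it: {code}"
    ),
}


def build_prompt(task: str, code: str) -> str:
    """Return the prompt for `task` with `code` substituted for the placeholder."""
    if task not in PROMPTS:
        raise KeyError(f"Unknown task {task!r}; choose from {sorted(PROMPTS)}")
    # Leading and trailing blank lines in the snippet are trimmed before insertion.
    return PROMPTS[task].format(code=code.strip())


snippet = """
def gc(s):
    return (s.count("G") + s.count("C")) / len(s)
"""

prompt = build_prompt("rename", snippet)
print(prompt)
```

The resulting string can be pasted into ChatGPT or passed to whichever client you use; keeping the templates in one dictionary makes it easy to reuse the same wording across scripts.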
### Improve Code Readability and Documentation
* "Add explanatory comments to this code: {code here}"
* "Rename the variables for clarity: {code here}"
* "Render roxygen2 documentation for the function: {R code here}"
### Write Code Efficiently
* "Extract functions to increase modularity: {code here}"
* "Write a unit test for the following function and help me implement it: {code here}"
* "Re-write and optimize this for loop: {code here}"
### Clean up data
* "Write me regex for R/python/Excel with a pattern that will extract {} from {}"
* "Act as a table. Add a new column with consistent labels to this dataset:"
### Improve data visualization
* "Create a ggplot2 violin plot with a log10 Y axis"
* "Change my code to make the plot color-blind friendly"
* "Translate this plot from ggplot2 to matplotlib syntax"
## Tutorials
* "A Guide to Using ChatGPT For Data Science Projects": https://www.datacamp.com/tutorial/chatgpt-data-science-projects
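
As a small worked illustration of the "Clean up data" regex prompt from the Quick Prompts section, the sketch below shows the kind of answer one might check and adapt when asking for a pattern that extracts human Ensembl gene IDs from free text. It assumes the usual `ENSG` plus 11 digits format with an optional version suffix. The function name `extract_gene_ids`, the sample sentence, and the `.18` version number are invented for illustration.

```python
import re

# Human Ensembl gene IDs: "ENSG" followed by 11 digits, optionally ".<version>".
ENSEMBL_GENE = re.compile(r"\bENSG\d{11}(?:\.\d+)?\b")


def extract_gene_ids(text: str) -> list[str]:
    """Return every Ensembl gene ID found in `text`, in order of appearance."""
    return ENSEMBL_GENE.findall(text)


# Made-up sample text; the version suffix ".18" is illustrative only.
notes = (
    "TP53 (ENSG00000141510.18) and BRCA1 (ENSG00000012048) were upregulated; "
    "ENSG123 is malformed."
)

print(extract_gene_ids(notes))
# → ['ENSG00000141510.18', 'ENSG00000012048']
```

Whatever pattern an LLM suggests, it is worth testing against both valid and malformed identifiers, as done here with `ENSG123`.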