{"id":19751732,"url":"https://github.com/paulosalem/gpt3-poc-tutorial-with-braindump","last_synced_at":"2025-07-12T01:08:29.912Z","repository":{"id":150930995,"uuid":"592757073","full_name":"paulosalem/gpt3-poc-tutorial-with-braindump","owner":"paulosalem","description":"A demo application to support my tutorial on building applications with GPT-3.","archived":false,"fork":false,"pushed_at":"2023-11-18T14:55:51.000Z","size":494,"stargazers_count":36,"open_issues_count":0,"forks_count":19,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-04-30T10:32:18.956Z","etag":null,"topics":["data-science","gpt","gpt-3","natural-language-understanding","openai","proof-of-concept"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/paulosalem.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-01-24T13:23:04.000Z","updated_at":"2025-03-08T11:26:48.000Z","dependencies_parsed_at":"2025-04-30T10:31:54.853Z","dependency_job_id":"27964242-0fa4-4b34-9ba7-ec8e1fe6ff13","html_url":"https://github.com/paulosalem/gpt3-poc-tutorial-with-braindump","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/paulosalem/gpt3-poc-tutorial-with-braindump","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paulosalem%2Fgpt3-poc-tutorial-with-braindump","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paulosalem%2Fgpt3-poc-tutorial-with-braindump/tags","releases
_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paulosalem%2Fgpt3-poc-tutorial-with-braindump/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paulosalem%2Fgpt3-poc-tutorial-with-braindump/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/paulosalem","download_url":"https://codeload.github.com/paulosalem/gpt3-poc-tutorial-with-braindump/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paulosalem%2Fgpt3-poc-tutorial-with-braindump/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264922866,"owners_count":23683701,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-science","gpt","gpt-3","natural-language-understanding","openai","proof-of-concept"],"created_at":"2024-11-12T02:45:19.292Z","updated_at":"2025-07-12T01:08:29.890Z","avatar_url":"https://github.com/paulosalem.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Braindump\n\n  **Update (November 2023): new GPT-3.5-Turbo version to be preferred.** *I added a new version supporting the Chat Completion API (tested with GPT-3.5-Turbo). Appropriate subfolders (`gpt-3`, `gpt-35-turbo`) now contain the original and the new version. 
Other than the model change and corresponding adjustments, they are the same, but `gpt-35-turbo` is to be preferred, because GPT-3 completion is deprecated.*\n\nBraindump is a prototype application for taking notes and converting them to a database that can be more easily queried. Just type what is on your mind and the application properly classifies, slices, and stores it for later use. **It was built as a demo to show how to leverage GPT-3 to build applications starting with Proofs-of-Concept, as described in [my Data Science @ Microsoft tutorial, \"Building GPT-3 applications — beyond the prompt\"](https://medium.com/data-science-at-microsoft/building-gpt-3-applications-beyond-the-prompt-504140835560).** You can use it both to follow the tutorial and as a starting point for your\nown studies and applications (e.g., by reusing the utility functions and overall program structures in your own, different, problems).\n\nIt is a simple Python application that leverages [Streamlit](https://streamlit.io/) to provide a web interface. To actually call the GPT-3 model, you need to have a working [OpenAI API](https://openai.com/api/) key. At the time of writing, once you create your account, you get some free credits that should be enough to follow the tutorial and get started with the application. 
The application should also work with [Azure OpenAI Service](https://azure.microsoft.com/en-us/products/cognitive-services/openai-service/) instead of the original OpenAI offer, though I have not yet tested it there.\n\nBesides the application itself, this repository includes the studies, in the form of Jupyter notebooks, that led to it.\n\nThe UI for searching looks like this:\n![Search facts tab](./docs/braindump-search-facts.png)\n\nTo add facts, the UI is as follows, including an optional manual inspection of the model interpretation:\n![Add facts tab, including the optional manual inspection of the model interpretation](./docs/braindump-add-facts-with-inspection.png)\n\n## Running the Application or Studies\n\nThe application has been tested on Python 3.8 (GPT-3) and 3.10 (GPT-3.5-Turbo). The main libraries you'll need are: `openai`, `streamlit`, `pandas`, `notebook`, `pytest`. You can install them manually, or follow the below procedure to create a new environment and install them automatically. Note that for the older codebase you will need an older version of the `openai` library.\n\n**To run the application:**\n\n  1. It is recommended that you run Python 3.10+, from the Anaconda distribution, which can be obtained [here](https://www.anaconda.com/products/distribution).\n  2. To ensure dependencies are properly installed, you can first create a new environment just for this application using `conda create -n braindump_py310 python=3.10`\n  3. Activate the new environment using `conda activate braindump_py310`\n  4. For GPT-3.5-Turbo (recommended), install the dependencies listed in `requirements.txt`. You can do this by running `pip install -r requirements.txt` from the root of the project. For the original GPT-3 version (deprecated), use the `requirements.gpt3.txt` instead, to get the older dependencies necessary for its operation.\n  5. 
Obtain a working [OpenAI API](https://openai.com/api/) key and make it available as an environment variable called `OPENAI_API_KEY`.\n  6. Finally, launch the application from the root of the project. On Windows: `run.gpt3.bat` (GPT-3 version) or `run.gpt35turbo.bat` (GPT-3.5-Turbo version); on Linux: `run.gpt3.sh` (GPT-3 version) or `run.gpt35turbo.sh` (GPT-3.5-Turbo version).\n\n**To run the studies:**\n  1. Follow the steps above, except the last one.\n  2. Open the desired Jupyter notebook under `notebooks/` with your favorite Jupyter client (personally, I use VS Code a lot for that).\n\n## Project Structure\n\nThe project is structured as follows:\n  - `notebooks/`: Jupyter notebooks used for prompt engineering.\n  - `src/`: source code for the final application.\n    * `src/gpt-3`: sources for the original GPT-3 version (deprecated).\n    * `src/gpt-3.5-turbo`: sources for the GPT-3.5-Turbo version (**recommended** since November 2023).\n  - `data/`: data stored by the application.\n  - `tests/`: unit tests for the application.\n    * `tests/gpt-3/`: tests for the original GPT-3 version (deprecated).\n    * `tests/gpt-3.5-turbo/`: tests for the GPT-3.5-Turbo version (**recommended** since November 2023).\n  - `docs/`: documentation and related assets.\n\n## Approach\nThe approach is presented in detail in [my Data Science @ Microsoft tutorial, \"Building GPT-3 applications — beyond the prompt\"](https://medium.com/data-science-at-microsoft/building-gpt-3-applications-beyond-the-prompt-504140835560). Nevertheless, let me highlight some key points here:\n\n  - Large Language Models, notably GPT-3, GPT-3.5-Turbo and GPT-4, offer a relatively easy and very flexible way to build some types of software. 
However, considerable additional Software Engineering aspects are required to actually build a robust and usable application.\n  - Proofs-of-Concept (PoC) are great to explore the capabilities of new technologies and demonstrate value quickly and at low cost. They thus provide a way to secure further investments if warranted. Since the application of LLMs like GPT-3 remains a very new area, PoCs are a great way to explore the space and learn.\n  - A gradual, iterative process is the best way to build such PoCs and applications. Start with a simple use case and add features and complexity as you go.\n  - In this manner, it is now possible to achieve impressive results with relatively little effort. Things that would be too costly or even impossible to do previously are now feasible. It is thus a great way to improve productivity -- both for individuals and for organizations. Time to explore and experiment with formerly unthinkable projects!\n\nIn terms of specific phases, the following is advisable:\n  - Try the OpenAI Playground with some very simple cases to see if the idea merits more work.\n  - Once you decide to proceed, write a simple specification consisting of the basic data structures you'll manipulate and some examples of inputs and outputs.\n  - Break the problem into subproblems, and determine which ones can be handled by GPT-3 or similar models.\n  - Gradually and iteratively engineer your prompts, preferably using Jupyter notebooks.\n  - Once satisfied with the quality of the prompts, encapsulate them and the auxiliary mechanisms in an engine.\n  - Build a UI for your engine, preferably with something like [Streamlit](https://streamlit.io/) or [Gradio](https://www.gradio.app/), both of which produce good results very fast.\n  - Show the PoC to stakeholders and iterate as appropriate.\n\n## License\n\nMIT License\n\nCopyright (c) 2023 Paulo Salem da Silva\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and 
associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpaulosalem%2Fgpt3-poc-tutorial-with-braindump","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpaulosalem%2Fgpt3-poc-tutorial-with-braindump","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpaulosalem%2Fgpt3-poc-tutorial-with-braindump/lists"}