{"id":51116347,"url":"https://github.com/oracle-devrel/springai-rag-db23ai","last_synced_at":"2026-06-24T22:30:26.584Z","repository":{"id":238929635,"uuid":"790773541","full_name":"oracle-devrel/springai-rag-db23ai","owner":"oracle-devrel","description":"springai-rag-db23ai","archived":false,"fork":false,"pushed_at":"2026-01-22T18:23:24.000Z","size":3889,"stargazers_count":24,"open_issues_count":2,"forks_count":11,"subscribers_count":7,"default_branch":"main","last_synced_at":"2026-06-24T11:36:12.717Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"upl-1.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/oracle-devrel.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-04-23T13:57:46.000Z","updated_at":"2026-06-17T01:52:08.000Z","dependencies_parsed_at":"2024-05-09T00:24:09.337Z","dependency_job_id":"83e92b91-9fd1-44c0-aa58-8e041b6ba653","html_url":"https://github.com/oracle-devrel/springai-rag-db23ai","commit_stats":null,"previous_names":["oracle-devrel/springai-rag-db23ai"],"tags_count":0,"template":false,"template_full_name":"oracle-devrel/repo-template","purl":"pkg:github/oracle-devrel/springai-rag-db23ai","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oracle-devrel%2Fspringai-rag-db23ai","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oracle-devrel%2Fspringai-rag-db23ai/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oracle-devrel%2Fspringai-rag-db23ai/releases","manifests_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oracle-devrel%2Fspringai-rag-db23ai/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/oracle-devrel","download_url":"https://codeload.github.com/oracle-devrel/springai-rag-db23ai/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oracle-devrel%2Fspringai-rag-db23ai/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34752465,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-24T02:00:07.484Z","response_time":106,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-06-24T22:30:21.981Z","updated_at":"2026-06-24T22:30:26.577Z","avatar_url":"https://github.com/oracle-devrel.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Spring AI for RAG on Oracle 23ai Vector DB with OpenAI and private LLMs\n\n![cover](./img/cover.png)\n\n## Introduction\n\nIn this demo, we'll guide you through the process of leveraging Java, Spring Boot, Oracle DB23ai and the innovative Spring AI APIs to create next-generation applications.\n\n- Build a Spring Boot Application with RAG (Retrieval Augmented Generation): Discover how to leverage Spring AI to implement a knowledge management system that retrieves relevant information and utilizes large language models to generate 
insightful responses.\n- Integrate Domain Knowledge from Oracle 23ai: Learn how to connect your Spring Boot application with Oracle's 23ai to access and utilize domain-specific knowledge for more accurate and relevant responses.\n- Transition to Production with Oracle Backend Platform: We'll address the challenges of moving your knowledge management system from development to production using the Oracle Backend Platform for Spring Boot and Microservices.\n\nCheck out the [demo here](https://www.youtube.com/watch?v=H2w6oULzFCo\u0026list=PLPIzp-E1msraY9To-BB-vVzPsK08s4tQD\u0026index=26)\n\nThe demo shows a Retrieval-Augmented Generation pipeline using the following modules:\n\n* Spring AI API\n* Oracle DB 23ai\n* OpenAI Embeddings\n* OpenAI Chat\n* OLLAMA local LLM embeddings model\n* OLLAMA local LLM LLama2 model for chat\n\nThis demo is based on an early draft example of **Spring AI API**'s implementation for **Oracle 23ai** as a vector store, according to the specifications reported here: **[Vector DBs](https://docs.spring.io/spring-ai/reference/api/vectordbs.html)**.\n\nThere are two different types of files that contribute to the Retrieval-Augmented Generation (RAG) system in this solution:\n\n- A **PDF** file is split into chunks and stored as text with vector embeddings.\n- **JSON** docs are created exploiting the **JSON-Duality** capability on existing tables.\n\nThe interface that exposes Oracle Database 23ai as a Vector Store in a Spring AI pipeline is the following:\n\n```\npublic interface VectorStore {\n\n        void add(List\u003cDocument\u003e documents);\n\n        Optional\u003cBoolean\u003e delete(List\u003cString\u003e idList);\n\n        List\u003cDocument\u003e similaritySearch(SearchRequest request);\n\n        List\u003cDocument\u003e similaritySearch(String query);\n}\n```\n\nThese operations allow uploading documents into a vector database, deleting them by ID, and searching for similar documents using the vector distance algorithm chosen (you can change this in the 
`.properties` files). \n\n```\ndefault List\u003cDocument\u003e similaritySearch(String query) {\n    return this.similaritySearch(SearchRequest.query(query));\n}\n```\n\nThe file `src/main/java/com/example/demoai/OracleDBVectorStore.java` holds this implementation.\n\nThe Vector Store saves the data in this **VECTORTABLE**:\n\n```\nCREATE TABLE VECTORTAB (\n        id NUMBER GENERATED AS IDENTITY,\n        text CLOB,\n        embeddings VECTOR,\n        metadata JSON,\n        PRIMARY KEY (id)\n);\n```\n\nThe **id** is based on a generated **Identity** column key, but this can be changed if you prefer.\n\nThe metadata content depends on what's coming from the Document object, and in this case it will hold the following data:\n\n```\n{\n    \"page_number\":\"xxx\",\n    \"file_name\":\"xxx\"\n}\n```\n\nThis table is created at each application startup by default but, by setting the `config.dropDb` parameter to `false` in `application-dev.properties`, you can accumulate data across application restarts in the same vector table, and the added documents will grow the vector database's knowledge base.\n\n## Docs\n\nWith regard to endpoint services, you can find the implementation in [DemoaiController.java](src/main/java/com/example/demoai/controller/DemoaiController.java). The following main REST services have been implemented:\n\n- **/store**\n\n    Accepts a PDF doc, splits it into chunks, creates vector embeddings, and stores them in the **VECTORTABLE**.\n\n- **/store-json**\n\n    Given the name of a **relational duality view** created on the DB, this service creates, for each JSON record, a vector embedding, chunks it, and stores it in the **VECTORTABLE**. 
This service shows that you can put both structured and unstructured text data into the RAG, and you'll be able to query this data in natural language as if querying a JSON document.\n\n- **/rag**\n\n    Given a query in natural language, it runs a Retrieval-Augmented Generation pipeline that uses the content of **VECTORTABLE**: it adds the chunks most similar to the question to the context and sends everything using the template in the file [prompt-template.txt](src/main/resources/prompt-template.txt)\n\nThe following test services have also been implemented, to debug and play with the solution if you're really interested:\n\n- **/search-similar**\n\n    Returns the chunks stored in the **VECTORTABLE** that are nearest to the message provided. This means you can check the \"closest matches\" in your vector database. It's useful for inspecting the context used to build the prompt sent to the LLM for the completion process, which the LLM uses as references to provide a response.\n\n- **/delete**\n\n    Allows you to remove a list of chunks, identified by their IDs, from **VECTORTABLE**.\n\n- **/embedding**\n\n    Returns the vector embedding generated for a given input string.\n\n- **/generate**\n\n    Chat client that doesn't use the RAG pipeline. It can be used as a baseline to show the differences between a response provided by the LLM service as-is (OpenAI, OLLAMA) and an augmented request. It's useful for checking, without providing your documents, whether any public content has been used for LLM training and whether the response is close to what you expect.\n\n## 0. Prerequisites\n\n### JDBC driver for Oracle DB 23ai\n\nThis demo works with the latest `ojdbc11.jar` driver related to the Oracle DBMS (23.4). To run this project, download this driver from the Oracle site or directly from your DB server, in the directory: `$ORACLE_HOME/jdbc/lib/ojdbc11.jar`. 
After downloading it to your local home dir, import it as a local Maven artifact with this command:\n\n```\nmvn install:install-file -Dfile=\u003cHOME_DIR\u003e/ojdbc11.jar -DgroupId=com.oracle.database.jdbc -DartifactId=ojdbc11 -Dversion=23.4.0.0 -Dpackaging=jar -DgeneratePom=true\n```\n\nor by including the following dependency in the `pom.xml`:\n\n```xml\n\u003cdependency\u003e\n\t\u003cgroupId\u003ecom.oracle.database.jdbc\u003c/groupId\u003e\n\t\u003cartifactId\u003eojdbc11\u003c/artifactId\u003e\n\t\u003cversion\u003e23.4.0.24.05\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n### Environment variables\n\nSet the correct environment variables in an `env.sh` file (or put these directly into `/home/$USER/.bashrc`) with this content, according to your server IPs (if you're planning on deploying with OLLAMA):\n\n```\nexport OPENAI_URL=https://api.openai.com\nexport OPENAI_MODEL=gpt-3.5-turbo\nexport OPENAI_EMBEDDING_MODEL=text-embedding-ada-002\nexport VECTORDB=[VECTORDB_IP]\nexport DB_USER=vector\nexport DB_PASSWORD=vector\nexport OLLAMA_URL=http://[GPU_SERVER_IP]:11434\nexport OLLAMA_EMBEDDINGS=NousResearch--llama-2-7b-chat-hf\nexport OLLAMA_MODEL=llama2:7b-chat-fp16\nexport OPENAI_API_KEY=[YOUR_OPENAI_KEY]\n#export OPENAI_URL=http://[GPU_SERVER_IP]:3000\n#export OPENAI_MODEL=NousResearch--llama-2-7b-chat-hf\n```\n\nTo invoke both OpenAI `gpt-3.5-turbo` and `text-embedding-ada-002`, you'll also need your OpenAI API key (`YOUR_OPENAI_KEY`), which must be obtained directly from the [OpenAI developer platform](https://platform.openai.com/).\n\nFor `OLLAMA_EMBEDDINGS` and `OLLAMA_MODEL`, you are free to experiment with other models from the [OLLAMA Library](https://ollama.com/library).\n\nAs you can see, you can also configure `OPENAI_URL`, which lets you invoke LLM providers compatible with the OpenAI APIs. This way, you can easily switch to other providers, even private ones.\n\nSet the environment by running this command in a shell:\n\n```\nsource ./env.sh\n```\n\n## 1. 
Setup\n\n### Oracle Database 23ai setup\n\n1. Download and install from [Oracle Database Free Get Started](https://www.oracle.com/database/free/get-started/) site an **Oracle Database 23ai Free**, for example, as a docker container in this way:\n\n```\ndocker run -d -p 1521:1521 --name db23ai container-registry.oracle.com/database/free:latest\ndocker exec db23ai ./setPassword.sh manager\n```\n\n2. After startup, download and install an Oracle Instant Client from the same [site](https://www.oracle.com/database/free/get-started/), and connect to the instance as shown here:\n\n```\nsqlplus sys/manager@\"${VECTORDB}:1521/FREEPDB1\" as sysdba\n```\n\n3. If running locally:\n\n```\nsqlplus sys/manager@\"localhost:1521/FREEPDB1\" as sysdba\n```\n\nto create a **vector** user to run the example:\n\n```\ncreate user vector identified by \"vector\";\ngrant connect to vector;\ngrant resource to vector;\nalter user vector default role connect, resource;\nalter user vector quota unlimited on users;\n```\n\nOnce we've created the user, we'll be able to use it in our Spring AI application by modifying `application-dev.properties`.\n\nIf running locally:\n\n```\nsqlplus vector/vector@\"localhost:1521/FREEPDB1\" as sysdba\n```\n\nWe can check the content by connecting to the Oracle DB:\n\n```\nsqlplus vector/vector@\"${VECTORDB}:1521/FREEPDB1\"\n```\n\n### Application\n\nIn the `application-dev.properties` files will be used the environment variables set at the step 
before:\n\n```\nspring.ai.openai.api-key=${OPENAI_API_KEY}\nspring.ai.openai.base-url=${OPENAI_URL}\nspring.ai.openai.chat.options.model=${OPENAI_MODEL}\nspring.ai.openai.embedding.options.model=${OPENAI_EMBEDDING_MODEL}\nspring.ai.openai.chat.options.temperature=0.3\nspring.datasource.url=jdbc:oracle:thin:@${VECTORDB}:1521/ORCLPDB1\nspring.datasource.username=${DB_USER}\nspring.datasource.password=${DB_PASSWORD}\nspring.datasource.driver-class-name=oracle.jdbc.OracleDriver\nconfig.tempDir=tempDir\nconfig.dropDb=true\nconfig.vectorDB=vectortable\nconfig.distance=EUCLIDEAN\nspring.servlet.multipart.max-file-size=10MB\nspring.servlet.multipart.max-request-size=20MB\nspring.ai.ollama.base-url=${OLLAMA_URL}\nspring.ai.ollama.embedding.options.model=${OLLAMA_EMBEDDINGS}\nspring.ai.ollama.chat.options.model=${OLLAMA_MODEL}\n```\n\nIn `application.properties`, check if the default env is set as `dev`:\n\n```\nspring.profiles.active=dev\n```\n\nThen build and run the application:\n\n- Set env: `source ./env.sh`\n- Build: `mvn clean package -Dmaven.test.skip=true`\n- Run: `mvn spring-boot:run`\n\nFor each source update, repeat these two steps.\n\n## 1. 
Test OpenAI version\n\nCheck code:\n\npom.xml:\n\n```\n\u003c!--//CHANGE--\u003e\n\u003c!-- Ollama for embeddings/chat\n    \u003cdependency\u003e\n        \u003cgroupId\u003eorg.springframework.ai\u003c/groupId\u003e\n        \u003cartifactId\u003espring-ai-ollama-spring-boot-starter\u003c/artifactId\u003e\n    \u003c/dependency\u003e\n--\u003e\n```\n\nDemoaiController.java:\n\n```\n    //CHANGE\n    //import org.springframework.ai.ollama.OllamaEmbeddingClient;\n    //import org.springframework.ai.ollama.OllamaChatClient;\n    ...\n\n    //CHANGE\n    private final EmbeddingClient embeddingClient;\n    //private final OllamaEmbeddingClient embeddingClient;\n\n    //CHANGE\n        private final ChatClient chatClient;\n        //private final OllamaChatClient chatClient;\n\n    ...\n\n    //CHANGE\n        @Autowired\n        public DemoaiController(EmbeddingClient embeddingClient, @Qualifier(\"openAiChatClient\") ChatClient chatClient, VectorService vectorService) {  // OpenAI full\n        //public DemoaiController(OllamaEmbeddingClient embeddingClient, @Qualifier(\"openAiChatClient\") ChatClient chatClient, VectorService vectorService) {  // Ollama Embeddings - OpenAI Completion \n        //public DemoaiController(OllamaEmbeddingClient embeddingClient, OllamaChatClient chatClient, VectorService vectorService) { // Ollama full \n\n```\n\nVectorService.java:\n\n```\n    //CHANGE\n    //import org.springframework.ai.ollama.OllamaChatClient;\n    ...\n    //CHANGE\n        private final ChatClient aiClient;\n        //private final OllamaChatClient aiClient;\n\n        //CHANGE\n        VectorService(@Qualifier(\"openAiChatClient\") ChatClient aiClient) {\n        //VectorService(OllamaChatClient aiClient) {\n```\n\nDemoaiApplication.java:\n\n```\n    //CHANGE\n    //import org.springframework.ai.ollama.OllamaEmbeddingClient;\n\n    ...\n    //CHANGE\n        @Bean\n        VectorStore vectorStore(EmbeddingClient ec, JdbcTemplate t) {\n        //VectorStore 
vectorStore(OllamaEmbeddingClient ec, JdbcTemplate t) {\n            return new OracleDBVectorStore(t, ec); \n        }\n\n```\n\n### Pre document store\n\n#### Generic chat\n\n```bash\ncurl -X POST http://localhost:8080/ai/generate \\\n    -H \"Content-Type: application/json\" \\\n    -d '{\"message\":\"What is a Generative AI?\"}' | jq -r .generation\n```\n\nHere's a sample output from the command:\n\n```\n    Generative AI refers to artificial intelligence systems that are capable of creating new content, such as images, text, or music, based on patterns and examples provided to them. These systems use algorithms and machine learning techniques to generate realistic and original content that mimics human creativity. Generative AI can be used in a variety of applications, such as creating art, writing stories, or designing products.\n```\n\n#### RAG request without any data stored in the DB\n\n```\ncurl -X POST http://localhost:8080/ai/rag \\\n        -H \"Content-Type: application/json\" \\\n        -d '{\"message\":\"Can I use any kind of development environment to run the example?\"}'\n```\n\nOutput from the command:\n\n```\n    {\n        \"generation\" : \"Based on the provided documents, it is not specified whether any kind of development environment can be used to run the example. Therefore, I'm sorry but I haven't enough information to answer.\"\n    }\n```\n\n### Search on data coming from a PDF stored\n\nStore a PDF document in the DBMC 23c library: [**Oracle® Database: Get Started with Java Development**](https://docs.oracle.com/en/database/oracle/oracle-database/23/tdpjd/get-started-java-development.pdf) in the Oracle DB 23ai with embeddings coming from the OpenAI Embedding service. 
Download it locally, and run in a shell:\n\n```\ncurl -X POST -F \"file=@./docs/get-started-java-development.pdf\" http://localhost:8080/ai/store\n```\n\n**Note**: this process usually takes time because the document is split into hundreds or thousands of chunks, and an embedding vector is requested from the OpenAI API service for each one. A small document has been chosen here so that you only wait a few seconds.\n\n#### Q\u0026A Sample\n\nLet's look at some info in this document and try to query it, comparing the results with the actual content:\n\n- **4.1.1 Oracle Database**\n\n![dbtype](./img/dbtype.png)\n\n```\ncurl -X POST http://localhost:8080/ai/rag \\\n    -H \"Content-Type: application/json\" \\\n    -d '{\"message\":\"Which kind of database you can use to run the Java Web example application) \"}' | jq -r .generation\n```\n\nResponse:\n\n```\n    You can use either Oracle Autonomous Database or Oracle Database Free available on OTN to run the Java Web example application.\n```\n\n- **4.1.5 Integrated Development Environment**\n\n![ide](./img/ide.png)\n\n```\ncurl -X POST http://localhost:8080/ai/rag \\\n    -H \"Content-Type: application/json\" \\\n    -d '{\"message\":\"Can I use any kind of development environment to run the example?\"}' | jq -r .generation\n```\n\nResponse:\n\n```\n    Based on the information provided in the documents, you can use an Integrated Development Environment (IDE) like IntelliJ Idea community version to develop the Java application that connects to the Oracle Database. The guide specifically mentions using IntelliJ Idea for creating and updating the files for the application. 
Therefore, it is recommended to use IntelliJ Idea as the development environment for running the example.\n```\n\n- **4.2 Verifying the Oracle Database Installation**\n\n![dbverify](./img/dbverify.png)\n\n```\ncurl -X POST http://localhost:8080/ai/rag \\\n    -H \"Content-Type: application/json\" \\\n    -d '{\"message\":\"To run the example, how can I check if the dbms it is working correctly?\"}' | jq -r .generation\n```\n\nResponse:\n\n```\n    To check if the Oracle Database is working correctly, you can verify the installation by connecting to the database using the following commands:\n    1. Navigate to the Oracle Database bin directory: $ cd $ORACLE_HOME/bin\n    2. Connect to the database as sysdba: $ ./sqlplus / as sysdba\n\n    If the connection is successful, you will see an output confirming that you are connected to the root container of the database. This indicates that the Oracle Database installation is working correctly. Additionally, you can download the Client Credentials for an ATP instance and verify the connection by following the steps provided in the documentation.\n```\n\nNow, let's ask a question unrelated to the stored document:\n\n```\ncurl -X POST http://localhost:8080/ai/rag \\\n        -H \"Content-Type: application/json\" \\\n        -d '{\"message\":\"How is the weather tomorrow?\"}' | jq -r .generation\n```\n\nResponse:\n\n```\n{\n    \"generation\" : \"I'm sorry but I haven't enough info to answer.\"\n}\n```\n\nThen, let's test similarity search with the message **\"To run the example, how can I check if the dbms it is working correctly?\"**. The `top_k` parameter, which determines how many nearest chunks to retrieve, is set to **4** by default, and the result set is returned in reverse order by default. 
So, we need to execute the following command:\n\n```\ncurl -X POST http://localhost:8080/ai/search-similar \\\n        -H \"Content-Type: application/json\" \\\n        -d '{\"message\":\"To run the example, how can I check if the dbms it is working correctly?\"}' | jq '.[3]'\n```\n\nThen, we test the deletion. Indexes begin counting at `1`, so let's execute the following command to delete occurrences 1, 4 and 5:\n\n```\ncurl \"http://localhost:8080/ai/delete?id=1\u0026id=5\u0026id=4\"\n```\n\n## 2. Running generations and chat with private LLMs through OLLAMA\n\nWe'll need to create an OCI Compute instance and install OLLAMA inside. Then, we will expose the server through an Internet Gateway and allow our Spring AI application to connect to the OLLAMA server and make the same requests as with OpenAI generations.\n\nThe following shape and image are recommended for the server (it requires a GPU, as we'll be running an HPC workload that needs far more compute than a CPU can currently handle without quantization enabled):\n\n- Shape: `VM.GPU.A10.2` (2x NVIDIA A10 Tensor Cores)\n- OCPU: 30\n- GPU Memory: 48GB\n- CPU Memory: 480GB\n- Storage: \u003e250GB\n- Max Network Bandwidth: 48Gbps (6GBps)\n- Image: Oracle Linux 8.9\n\n1. From the OCI console, choose the Compute/Instances menu:\n\n    ![image](./img/instance.png)\n\n2. Press the **Create instance** button:\n\n    ![image](./img/create.png)\n\n3. Choose the `VM.GPU.A10.2` shape, selecting **Virtual machine**/**Specialty and previous generation**:\n\n    ![image](./img/shape.png)\n\n4. Choose the Image `Oracle-Linux-8.9-Gen2-GPU-2024.02.26-0` from the Oracle Linux 8 list of images:\n\n    ![image](./img/image.png)\n\n5. Specify a custom boot volume size and set it to 100 GB:\n\n    ![image](./img/bootvolume.png)\n\n6. Create the instance.\n\n7. 
At the end of creation process, obtain the **Public IPv4 address**, and with your private key (the one you generated or uploaded during creation), connect to:\n\n```\n    ssh -i ./\u003cyour_private\u003e.key opc@[GPU_SERVER_IP]\n```\n\n8. Install and configure docker to use GPUs:\n\n```\n    sudo /usr/libexec/oci-growfs\n    curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo |   sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo\n    sudo dnf install -y dnf-utils zip unzip\n    sudo dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo\n    sudo dnf remove -y runc\n    sudo dnf install -y docker-ce --nobest\n    sudo useradd docker_user\n```\n\n9. We need to make sure that your Operating System user has permissions to run Docker containers. To do this, we can run the following command:\n\n```\nsudo visudo\n```\n\nAnd add this line at the end:\n\n```\ndocker_user  ALL=(ALL)  NOPASSWD: /usr/bin/docker\n```\n\n10. For convenience, we need to switch to our new user. For this, run:\n\n```\nsudo su - docker_user\n```\n\n11. Finally, let's add an alias to execute Docker with admin privileges every time we type `docker` in our shell. For this, we need to modify a file, depending on your OS (in `.bash_profile` (MacOS) / `.bashrc` (Linux)). Insert, at the end of the file, this command:\n\n```\nalias docker=\"sudo /usr/bin/docker\"\nexit\n```\n\n12. We finalize our installation by executing:\n\n```\nsudo yum install -y nvidia-container-toolkit\nsudo nvidia-ctk runtime configure --runtime=docker\nsudo systemctl restart docker\nnvidia-ctk runtime configure --runtime=docker --config=$HOME/.config/docker/daemon.json\n```\n\n13. 
If you're on Ubuntu instead, run:\n\n```\nsudo apt-get install nvidia-container-toolkit=1.14.3-1 \\\n        nvidia-container-toolkit-base=1.14.3-1 \\\n        libnvidia-container-tools=1.14.3-1 \\\n        libnvidia-container1=1.14.3-1\nsudo apt-get install -y nvidia-docker2\n```\n\n14. Let's reboot and re-connect to the VM, and run again:\n\n```\nsudo reboot now\n# after restart, run:\nsudo su - docker_user\n```\n\n15. Run `docker` to check that everything is OK.\n\n16. Let's run an `ollama/ollama` Docker container and pull the models for embeddings/completion:\n\n```\ndocker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama serve\ndocker exec -it ollama ollama pull nomic-embed-text\ndocker exec -it ollama ollama pull llama2:13b-chat-fp16\ndocker logs -f --tail 10 ollama\n```\n\nBoth models, for embeddings and completion, run on the same server, and each REST request selects the model it needs by naming it.\n\nTo handle the firewall, we need to open port `11434` on our Security List. To do this:\n\n1. In **Instance details** click on the **Virtual cloud network:** link:\n\n    ![securitylist](./img/vcn.png)\n\n2. In the **Resources** menu, click on **Security Lists**:\n\n    ![security](./img/securitylist.png)\n\n3. Click on the link of **Default Security List...**\n\n4. Click on the **Add Ingress Rules** button:\n\n    ![security](./img/addIngress.png)\n\n5. Click on the **Add Ingress Rules** button:\n\n    ![security](./img/addIngress.png)\n\n6. Insert the details as shown in the following image and then click the **Add Ingress Rules** button:\n\n    ![security](./img/rule.png)\n\n7. 
Update the `env.sh` file and run `source ./env.sh`:\n\n```\n#export OPENAI_URL=http://[GPU_SERVER_IP]:3000\nexport OPENAI_URL=https://api.openai.com\n#export OPENAI_MODEL=NousResearch--llama-2-7b-chat-hf\nexport OPENAI_MODEL=gpt-3.5-turbo\nexport OPENAI_EMBEDDING_MODEL=text-embedding-ada-002\nexport VECTORDB=[VECTORDB_IP]\nexport DB_USER=vector\nexport DB_PASSWORD=vector\nexport OLLAMA_URL=http://[GPU_SERVER_IP]:11434\nexport OLLAMA_EMBEDDINGS=NousResearch--llama-2-7b-chat-hf\nexport OLLAMA_MODEL=llama2:7b-chat-fp16\nexport OPENAI_API_KEY=[YOUR_OPENAI_KEY]\n```\n\n8. Test with a shell running:\n\n```\ncurl ${OLLAMA_URL}/api/generate -d '{\n        \"model\": \"llama2:7b-chat-fp16\",\n        \"prompt\":\"Why is the sky blue?\"\n}'\n```\n\nThe response arrives as a stream of sequential chunks, delivering the content little by little instead of forcing users to wait for the whole response to be generated before it's displayed to them.\n\n### Customize for private LLMs: Vector Embeddings local, Open AI for Completion\n\n* pom.xml: uncomment the ollama dependency:\n\n```\n    \u003c!--//CHANGE--\u003e\t\n\t\u003c!-- Ollama for embeddings --\u003e\n\t\t\u003cdependency\u003e\n\t\t\t\u003cgroupId\u003eorg.springframework.ai\u003c/groupId\u003e\n\t\t\t\u003cartifactId\u003espring-ai-ollama-spring-boot-starter\u003c/artifactId\u003e\n\t\t \u003c/dependency\u003e\n\t \t\u003c!--  --\u003e\n```\n\n* DemoaiController.java - uncomment with final source code:\n\n```\n    //CHANGE\n    import org.springframework.ai.ollama.OllamaEmbeddingClient;\n    //import org.springframework.ai.ollama.OllamaChatClient;\n...\n\n    //CHANGE\n    //private final EmbeddingClient embeddingClient;\n    private final OllamaEmbeddingClient embeddingClient;\n\n    //CHANGE\n    private final ChatClient chatClient;\n    //private final OllamaChatClient chatClient;\n...\n\n    //CHANGE\n    //public DemoaiController(EmbeddingClient embeddingClient, 
@Qualifier(\"openAiChatClient\") ChatClient chatClient, VectorService vectorService) {  // OpenAI full\n    public DemoaiController(OllamaEmbeddingClient embeddingClient, @Qualifier(\"openAiChatClient\") ChatClient chatClient, VectorService vectorService) {  // Ollama Embeddings - OpenAI Completion \n    //public DemoaiController(OllamaEmbeddingClient embeddingClient, OllamaChatClient chatClient, VectorService vectorService) { // Ollama full \n        \n\n```\n\nVectorService.java - check if it's like this:\n\n```\n    //CHANGE\n    //import org.springframework.ai.ollama.OllamaChatClient;\n\n    ...\n\n    //CHANGE\n        private final ChatClient aiClient;\n        //private final OllamaChatClient aiClient;\n\n        //CHANGE\n        VectorService(@Qualifier(\"openAiChatClient\") ChatClient aiClient) {\n        //VectorService(OllamaChatClient aiClient) {\n\n```\n\nTest as done before. In the gpu docker logs, you'll see the chunks coming to be embedded.\n\n### Full private LLMs with `llama2:7b-chat-fp16`\n\nDemoaiController.java - uncomment with final source code:\n\n```\n    //CHANGE\n    import org.springframework.ai.ollama.OllamaEmbeddingClient;\n    import org.springframework.ai.ollama.OllamaChatClient;\n\n    ...\n        //CHANGE\n        //private final EmbeddingClient embeddingClient;\n        private final OllamaEmbeddingClient embeddingClient;\n\n        //CHANGE\n        //private final ChatClient chatClient;\n        private final OllamaChatClient chatClient;\n    ...\n\n\n    //CHANGE\n        @Autowired\n        //public DemoaiController(EmbeddingClient embeddingClient, @Qualifier(\"openAiChatClient\") ChatClient chatClient, VectorService vectorService) {  // OpenAI full\n        //public DemoaiController(OllamaEmbeddingClient embeddingClient, @Qualifier(\"openAiChatClient\") ChatClient chatClient, VectorService vectorService) {  // Ollama Embeddings - OpenAI Completion \n        public DemoaiController(OllamaEmbeddingClient embeddingClient, 
OllamaChatClient chatClient, VectorService vectorService) { // Ollama full \n        \n\n```\n\nVectorService.java - uncomment with final source code:\n\n```\n    //CHANGE\n    import org.springframework.ai.ollama.OllamaChatClient;\n\n    ...\n\n\n        //CHANGE\n        //private final ChatClient aiClient;\n        private final OllamaChatClient aiClient;\n\n        //CHANGE\n        //VectorService(@Qualifier(\"openAiChatClient\") ChatClient aiClient) {\n        VectorService(OllamaChatClient aiClient) {\n```\n\nTest as before. Now you may see a slight degradation in quality (since the model is quite small), while maintaining the same embeddings and context retrieval. For example, with the question:\n\n```\n    Q: Which kind of database you can use to run the Java Web example application:\n    A: The document states that the application uses Oracle JDBC Thin driver, Universal Connection Pool (UCP), and Java in the Database (using embedded OJVM). Therefore, you can use an Oracle database to run the Java Web example application.\n```\n\nThis first answer isn't very good or specific. But for:\n\n```\n    Q: Can I use any kind of development environment to run the example?\n    A: the user can use any development environment to run the example as long as it has a Java compiler and a database manager installed. The DOCUMENTS text does not specify any particular IDE that must be used, so any IDE with Java development capabilities should work. However, the guide does recommend using IntelliJ Idea Community Edition for ease of development, but this is not a requirement.\\n\\nTo answer the user's question, you could say: \\\"Yes, you can use any development environment to run the example as long as it has a Java compiler and a database manager installed. 
While the guide recommends using IntelliJ Idea Community Edition for ease of development, any IDE with Java development capabilities should work.\\\n```\n\nThe result is more customized and acceptable.\n\nThis trade-off of using private LLMs can be overcome by choosing *larger models*, large enough to maintain good quality.\n\n\u003e **Note**: the number of billions of parameters of a model version usually has a direct correlation with the size of the model, and its generation quality. The higher, the better, although you also need to watch out for OOM (out of memory) errors and a slower generation throughput.\n\n## 3. Deploy on Oracle Backend for Spring Boot and Microservices\n\nLet's show what Oracle offers for deploying the GenAI application developed so far at enterprise grade.\n\nThe platform [**Oracle Backend for Spring Boot and Microservices**](https://oracle.github.io/microservices-datadriven/spring/) allows developers to build microservices in Spring Boot and provision a backend as a service with the Oracle Database and other infrastructure components that operate on multiple clouds. This service vastly simplifies the task of building, testing, and operating microservices platforms for reliable, secure, and scalable enterprise applications.\n\nTo set up this platform, follow the instructions included in **Lab 1: Provision an instance** and **Lab 2: Setup your Development Environment** of the [LiveLabs: CloudBank - Building an App with Spring Boot and Mobile APIs with Oracle Database and Kubernetes](https://apexapps.oracle.com/pls/apex/f?p=133:180:7384418726808::::wid:3607). At the end, proceed with the following steps:\n\n1. In `application.properties`, set the active profile to `prod`:\n\n```\nspring.profiles.active=prod\n```\n\n2. 
In `application-prod.properties`, replace the parameters in `\u003c \u003e` with the values set in `env.sh`:\n\n```\n    spring.ai.openai.api-key=\u003cOPENAI_API_KEY\u003e\n    spring.ai.openai.base-url=\u003cOPENAI_URL\u003e\n    spring.ai.openai.chat.options.model=gpt-3.5-turbo\n    spring.ai.openai.embedding.options.model=text-embedding-ada-002\n    spring.datasource.url=jdbc:oracle:thin:@\u003cVECTORDB\u003e:1521/ORCLPDB1\n    spring.datasource.username=vector\n    spring.datasource.password=vector\n    spring.datasource.driver-class-name=oracle.jdbc.OracleDriver\n    config.tempDir=tempDir\n    config.dropDb=true\n    config.vectorDB=vectortable\n    config.distance=EUCLIDEAN\n    spring.servlet.multipart.max-file-size=10MB\n    spring.servlet.multipart.max-request-size=20MB\n    spring.ai.ollama.base-url=\u003cOLLAMA_URL\u003e\n    spring.ai.ollama.embedding.options.model=nomic-embed-text\n    spring.ai.ollama.chat.options.model=llama2:7b-chat-fp16\n```\n\n3. Open a terminal and use the **Kubernetes** admin command to open a port forward to the backend:\n\n```\nkubectl -n obaas-admin port-forward svc/obaas-admin 8080:8080\n```\n\n4. Using the command-line tool `oractl`, deploy the application by running the following commands:\n\n```\noractl:\u003econnect\n? username obaas-admin\n? password **************\n\noractl:\u003ecreate --app-name rag\noractl:\u003edeploy --app-name rag --service-name demoai --artifact-path /Users/cdebari/Documents/GitHub/spring-ai-demo/target/demoai-0.0.1-SNAPSHOT.jar --image-version 0.0.1 --service-profile prod\n\n```\n\n5. Test the application with port forwarding. First, stop the `demoai` instance currently running in the background to free the port it is using; then, in a different terminal, forward port 8080 to the remote service on the **Oracle Backend for Spring Boot and Microservices**:\n\n```\nkubectl -n rag port-forward svc/demoai 8080:8080\n```\n\n6. 
In a different terminal, test the service as done before, for example:\n\n```\ncurl -X POST http://localhost:8080/ai/rag \\\n        -H \"Content-Type: application/json\" \\\n        -d '{\"message\":\"Can I use any kind of development environment to run the example?\"}' | jq -r .generation\n```\n\n## Notes/Issues\n\nAdditional Use Cases like summarization and embedding coming soon.\n\n## URLs\n\n- [Oracle AI](https://www.oracle.com/artificial-intelligence/)\n- [AI for Developers](https://developer.oracle.com/technologies/ai.html)\n\n## Contributing\n\nThis project is open source.  Please submit your contributions by forking this repository and submitting a pull request!  Oracle appreciates any contributions that are made by the open-source community.\n\n## License\n\nCopyright (c) 2024 Oracle and/or its affiliates.\n\nLicensed under the Universal Permissive License (UPL), Version 1.0.\n\nSee [LICENSE](LICENSE) for more details.\n\nORACLE AND ITS AFFILIATES DO NOT PROVIDE ANY WARRANTY WHATSOEVER, EXPRESS OR IMPLIED, FOR ANY SOFTWARE, MATERIAL OR CONTENT OF ANY KIND CONTAINED OR PRODUCED WITHIN THIS REPOSITORY, AND IN PARTICULAR SPECIFICALLY DISCLAIM ANY AND ALL IMPLIED WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.  FURTHERMORE, ORACLE AND ITS AFFILIATES DO NOT REPRESENT THAT ANY CUSTOMARY SECURITY REVIEW HAS BEEN PERFORMED WITH RESPECT TO ANY SOFTWARE, MATERIAL OR CONTENT CONTAINED OR PRODUCED WITHIN THIS REPOSITORY. IN ADDITION, AND WITHOUT LIMITING THE FOREGOING, THIRD PARTIES MAY HAVE POSTED SOFTWARE, MATERIAL OR CONTENT TO THIS REPOSITORY WITHOUT ANY REVIEW. USE AT YOUR OWN RISK.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foracle-devrel%2Fspringai-rag-db23ai","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Foracle-devrel%2Fspringai-rag-db23ai","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foracle-devrel%2Fspringai-rag-db23ai/lists"}