{"id":31156115,"url":"https://github.com/aborroy/spring-ai-summarizer","last_synced_at":"2026-02-13T23:31:09.728Z","repository":{"id":297429294,"uuid":"996683301","full_name":"aborroy/spring-ai-summarizer","owner":"aborroy","description":"Tutorial to create a summarizer endpoint with Spring AI and Docker Model Runner","archived":false,"fork":false,"pushed_at":"2025-06-05T11:43:37.000Z","size":15,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-05T12:30:45.530Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aborroy.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-05T09:57:43.000Z","updated_at":"2025-06-05T11:43:39.000Z","dependencies_parsed_at":"2025-06-05T12:42:03.395Z","dependency_job_id":null,"html_url":"https://github.com/aborroy/spring-ai-summarizer","commit_stats":null,"previous_names":["aborroy/spring-ai-summarizer"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/aborroy/spring-ai-summarizer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aborroy%2Fspring-ai-summarizer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aborroy%2Fspring-ai-summarizer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aborroy%2Fspring-ai-summarizer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aborroy%2Fspr
ing-ai-summarizer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aborroy","download_url":"https://codeload.github.com/aborroy/spring-ai-summarizer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aborroy%2Fspring-ai-summarizer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":275830188,"owners_count":25536280,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-18T02:00:09.552Z","response_time":77,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-09-18T20:55:03.280Z","updated_at":"2025-09-18T20:55:05.452Z","avatar_url":"https://github.com/aborroy.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# Lab: Building a Local PDF Summarizer with Spring Boot \u0026 Docker-Hosted LLM\n\nWelcome to this hands-on lab! You'll create a **Spring Boot 3.5** microservice that transforms any PDF into a concise summary using a **local Large Language Model** (LLM) served through Docker Model Runner. By the end, you'll have a running REST endpoint (`POST /api/summarize`) that accepts a PDF and returns a summary – **without ever sending data to the cloud**.\n\n## Learning Objectives\n\n1. Scaffold a modern Spring Boot project with proper dependency management\n2. 
Configure Spring AI to communicate with a locally running LLM (Mistral) via Docker Model Runner\n3. Implement PDF processing with page-by-page text extraction using Spring AI utilities\n4. Build a multi-stage summarization pipeline with prompt chaining\n5. Expose functionality through a robust REST API with proper error handling\n6. Package the complete solution with Docker Compose for easy deployment\n\n## Prerequisites \u0026 Environment Setup\n\n| Tool           | Minimum Version | Check Command        | Installation Notes |\n| -------------- | --------------- | -------------------- | ------------------ |\n| Java (Temurin) | 21              | `java --version`     | Use SDKMAN or official installer |\n| Maven          | 3.9             | `mvn --version`      | Bundled with most IDEs |\n| Docker Desktop | 4.40            | `docker --version`   | Required for Model Runner |\n| IDE            | Latest          | N/A                  | IntelliJ IDEA recommended |\n\n**Verification Task:** Run all check commands. If any fail, install the missing tools before proceeding.\n\n## Step 0: Download and Verify LLM Model\n\nWe'll use **Mistral 7B** - it's lightweight (~4GB) yet powerful enough for quality summarization.\n\n**Your Task:**\n1. Use `docker model pull` to download the `ai/mistral` model\n2. Verify the download with `docker model list`\n3. Check if Docker Model Runner is active with `docker model status`\n\n**Common Issues:**\n- Command not recognized? Your Docker Desktop version may be older than 4.40, or Model Runner may not be enabled in its settings\n- Download hanging? Check your internet connection and Docker daemon\n\n**Success Indicator:** `docker model status` should report that Docker Model Runner is running, and `docker model list` should include the `ai/mistral` model.\n\n## Step 1: Create Project Structure\n\n**Your Task:**\n1. Choose meaningful names:\n   - Group ID: Use your domain (e.g., `com.yourname.pdfsummarizer`)\n   - Artifact ID: Something descriptive (e.g., `pdf-summarizer-service`)\n\n2. 
Generate a Maven project using the `maven-archetype-quickstart` archetype\n\n```bash\nmvn archetype:generate                                   \\\n  -DgroupId=com.yourname.pdfsummarizer                   \\\n  -DartifactId=pdf-summarizer-service                    \\\n  -DarchetypeGroupId=org.apache.maven.archetypes         \\\n  -DarchetypeArtifactId=maven-archetype-quickstart       \\\n  -DarchetypeVersion=1.4 -DinteractiveMode=false\n```\n\n3. Open the project in your IDE and run an initial `mvn clean compile` to verify setup\n\n## Step 2: Configure Dependencies (pom.xml)\n\n**Your Task:** Transform the basic Maven project into a Spring Boot application by modifying `pom.xml`:\n\n### Required Changes:\n1. **Parent Configuration:** Set Spring Boot as the parent with version 3.5.0\n2. **Properties Block:** Define Java version (21) and Spring AI version (1.0.0)\n3. **Dependency Management:** Import the Spring AI BOM to manage versions\n4. **Core Dependencies:** Add these three essential starters:\n   - Web starter (for REST endpoints)\n   - Spring AI OpenAI starter (for LLM communication)\n   - PDF document reader (for PDF processing)\n5. 
**Optional Enhancement:** Add Lombok for cleaner code\n\n```xml\n\u003c!-- parent --\u003e\n\u003cparent\u003e...Spring Boot 3.5.0...\u003c/parent\u003e\n\n\u003c!-- properties --\u003e\n\u003cjava.version\u003e21\u003c/java.version\u003e\n\u003cspring-ai.version\u003e1.0.0\u003c/spring-ai.version\u003e\n\n\u003c!-- dependencyManagement --\u003e\n\u003cdependencyManagement\u003e...spring-ai-bom...\u003c/dependencyManagement\u003e\n\n\u003c!-- starters you’ll need (Spring AI 1.0.0 names the OpenAI starter spring-ai-starter-model-openai) --\u003e\n\u003cdependency\u003e spring-boot-starter-web \u003c/dependency\u003e\n\u003cdependency\u003e spring-ai-starter-model-openai \u003c/dependency\u003e\n\u003cdependency\u003e spring-ai-pdf-document-reader \u003c/dependency\u003e\n\u003c!-- optional --\u003e\n\u003cdependency scope=\"provided\"\u003e lombok \u003c/dependency\u003e\n```\n\n**Verification:** `mvn clean package` should complete without errors.\n\n## Step 3: Application Configuration\n\n**Your Task:** Create `src/main/resources/application.yml` with proper Spring AI configuration.\n\n### Configuration Requirements:\n1. **Server Settings:** Configure the application port\n2. **Multipart Configuration:** Set appropriate file size limits for PDF uploads\n3. **Spring AI OpenAI Settings:** Configure four key properties:\n   - `base-url`: Where is Docker Model Runner listening? (Hint: localhost with a specific port and path)\n   - `api-key`: What dummy value works for local usage?\n   - `model`: Which model did you download in Step 0?\n   - `temperature`: What value gives consistent summaries? 
(0.0-1.0 range)\n\n```yaml\nserver:\n  port: 8080\n\nspring:\n  servlet:\n    multipart:\n      max-file-size: 20MB\n  ai:\n    openai:\n      base-url: http://localhost:12434/\n      api-key: dummy\n      chat:\n        options:\n          model: ai/mistral\n          temperature: 0.0\n```\n\n**Research Required:**\n- What port does Docker Model Runner use by default?\n- What's the correct API endpoint path for OpenAI-compatible APIs?\n- How does temperature affect LLM output consistency?\n\n## Step 4: Core Business Logic - Service Layer\n\n**Your Task:** Create a `PdfSummarizationService` class that orchestrates the summarization process.\n\n### Class Structure:\n\n```java\n@Service\n@RequiredArgsConstructor\npublic class PdfSummarizationService {\n\n    private final ChatModel chatModel;\n\n    public String summarize(MultipartFile pdf) throws IOException {\n        // TODO 1: validate pdf (size, emptiness)\n        // TODO 2: use PagePdfDocumentReader to pull List\u003cDocument\u003e\n        // TODO 3: loop over pages → ask LLM for 2-3 sentence summary each\n        //         Hint: ChatClient.create(chatModel).prompt(\"Prompt\" + pageContent).call().content()\n        // TODO 4: combine page summaries → ask LLM again for global summary\n        // Return final summary\n        return null; // placeholder\n    }\n}\n```\n\n### Implementation Strategy:\n1. **Input Validation:** Check if the file is valid and not empty\n2. **PDF Processing:** Use `PagePdfDocumentReader` to extract text from each page\n3. **Page Summarization:** Create individual summaries for each page using the LLM\n4. 
**Global Summarization:** Combine page summaries into a comprehensive document summary\n\n### Key Classes to Research:\n- `ChatClient`: How do you build one from a `ChatModel`?\n- `PagePdfDocumentReader`: What constructor parameters does it need?\n- `Document`: How do you extract text content?\n- `InputStreamResource`: How do you create one from a `MultipartFile`?\n\n**Design Questions:**\n- Why summarize page-by-page instead of the entire document at once?\n- What happens if a PDF has no readable text?\n- How should you handle very large PDFs?\n\n**Prompt Engineering Tips:**\n- Be specific about desired summary length\n- Ask for key points and main ideas\n\n## Step 5: REST API Layer - Controller\n\n**Your Task:** Create a REST controller that exposes your summarization service.\n\n### Class Structure:\n\n```java\n@RestController\n@RequestMapping(\"/api\")\n@RequiredArgsConstructor\npublic class SummarizationController {\n\n    private final PdfSummarizationService service;\n\n    @PostMapping(\"/summarize\")\n    public ResponseEntity\u003cString\u003e summarize(@RequestParam(\"file\") MultipartFile file) {\n        // TODO: call service\n        return null;\n    }\n}\n```\n\n### Controller Requirements:\n1. **Annotations:** Use appropriate Spring annotations for REST endpoints\n2. **Endpoint Mapping:** Map to `/api/summarize` with POST method\n3. **File Handling:** Accept multipart file uploads\n4. **Error Handling:** Return appropriate HTTP status codes\n5. 
**Response Format:** Return plain text summaries\n\n### Implementation Considerations:\n- What `@RequestMapping` configuration do you need?\n- How do you handle `MultipartFile` parameters?\n- What should happen when summarization fails?\n- Should you return `ResponseEntity` or plain `String`?\n\n## Step 6: Application Bootstrap\n\n**Your Task:** Replace the default `App.java` with a proper Spring Boot application class.\n\n### Class Structure:\n\n```java\n@SpringBootApplication\npublic class PdfSummarizerApplication {\n    public static void main(String[] args) { SpringApplication.run(PdfSummarizerApplication.class, args); }\n}\n```\n\n*(This one is boilerplate; left intact so the app actually starts.)*\n\n## Step 7: Local Testing \u0026 Debugging\n\n**Your Task:** Get everything running and test the complete workflow.\n\n### Testing Checklist:\n1. **Verify Model Runner:** Confirm Docker Model Runner is active\n2. **Start Application:** Use Maven to run your Spring Boot app\n3. **Test Endpoint:** Use curl or Postman to send a PDF file\n\n### Sample Test Command Structure:\n\n```bash\n# You'll need to construct the proper curl command\ncurl -F file=@your-test.pdf http://localhost:YOUR_PORT/YOUR_ENDPOINT\n```\n\n**Troubleshooting Guide:**\n- `ResourceAccessException`: Check Docker Model Runner status and base-url configuration\n- `404 Not Found`: Verify your controller mappings and component scanning\n- `400 Bad Request`: Check file upload configuration and request format\n- Empty response: Examine your service logic and LLM prompts\n\n**Testing Tips:**\n- Start with a small, simple PDF\n- Check application logs for detailed error information\n- Verify each step independently (PDF reading, LLM calls, etc.)\n\n## Step 8: Containerization with Docker Compose\n\n**Your Task:** Package your application for easy deployment.\n\n### Docker Setup:\n1. **Initialize Docker:** Use `docker init` to generate Docker configuration\n2. 
**Configure Environment:** Set environment variables for container-to-host communication\n3. **Network Configuration:** Ensure the containerized app can reach Docker Model Runner on the host\n\n### Key Considerations:\n- What base URL should the containerized app use to reach the host?\n- How do you override Spring configuration with environment variables?\n- What port mapping do you need in Docker Compose?\n\n## Success Criteria\n\n**Functional Requirements:**\n- Application starts without errors\n- PDF files can be uploaded and processed\n- Summaries are generated and returned\n- Local LLM integration works correctly\n\n**Technical Requirements:**\n- Proper Spring Boot project structure\n- Clean separation of concerns (Controller → Service → AI)\n- Appropriate error handling\n- Containerized deployment option\n\n**Learning Outcomes:**\n- Understanding of Spring AI framework\n- Experience with local LLM deployment\n- REST API development skills\n- Docker containerization knowledge\n\n\u003e Remember: The goal is to understand each component and how they work together. Don't hesitate to experiment, break things, and fix them – that's how real learning happens!","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faborroy%2Fspring-ai-summarizer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faborroy%2Fspring-ai-summarizer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faborroy%2Fspring-ai-summarizer/lists"}