https://github.com/ap-dev-github/adobe-hackathon-2024-round1b-winners
Round 1B submission for the Adobe India Hackathon 2025
- Host: GitHub
- URL: https://github.com/ap-dev-github/adobe-hackathon-2024-round1b-winners
- Owner: ap-dev-github
- License: MIT
- Created: 2025-07-28T15:59:04.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-07-28T16:23:53.000Z (8 months ago)
- Last Synced: 2025-07-28T18:25:53.068Z (8 months ago)
- Language: Python
- Size: 4.2 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Adobe Hackathon Round 1B Submission
## Team: Winners
### Members:
- Ayush Pandey
- Ayush Banerjee
---
## Goal
To build a **Persona-Keyword Mapping** system that scans PDF documents and surfaces the most relevant sections for a given **persona** and **job role**. For example, a "student" looking for a "summary" of a document will receive relevant, ranked excerpts tailored to that context.
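The README does not show the matching rules themselves, so the following is only a minimal sketch of how a rule-based persona/job ranking could work. The keyword table, the function names, and the scoring rule (counting keyword occurrences) are all assumptions made for illustration, not the project's actual implementation.

```python
# Hypothetical keyword table: the real persona/job vocabulary is not given in this README.
PERSONA_KEYWORDS = {
    ("student", "summary"): ["summary", "overview", "introduction", "conclusion"],
}


def score_section(text, keywords):
    """Count case-insensitive keyword occurrences in a section's text."""
    lowered = text.lower()
    return sum(lowered.count(word) for word in keywords)


def rank_sections(sections, persona, job):
    """Return matching sections ordered by score, with a 1-based importance rank."""
    keywords = PERSONA_KEYWORDS.get((persona.lower(), job.lower()), [])
    scored = [(score_section(s["text"], keywords), s) for s in sections]
    # Keep only sections with at least one hit. sorted() is stable,
    # so sections with equal scores stay in document order.
    matches = sorted((item for item in scored if item[0] > 0), key=lambda item: -item[0])
    return [
        {"title": s["title"], "page": s["page"], "importance_rank": rank}
        for rank, (_, s) in enumerate(matches, start=1)
    ]
```

Because the scoring uses only string counts and a stable sort, the same input always produces the same ranking, which is consistent with the deterministic design described in this README.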
---
## Tech Stack
- **Python 3.9 (slim)**
- **PDF Parsing:** `pdfplumber`
- No external NLP libraries were used. The system is fully **rule-based and deterministic**.
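As a rough illustration of the stack above, the sketch below pulls text page by page with `pdfplumber` and applies a simple rule to spot candidate section headings. The helper names and the heading heuristic are assumptions for this example; the README does not describe how `src/pdf_analyzer.py` detects sections.

```python
def looks_like_heading(line):
    """Heuristic (an assumption): a short title-case or all-caps line with no trailing period."""
    stripped = line.strip()
    if not stripped or len(stripped) > 80 or stripped.endswith("."):
        return False
    return stripped.isupper() or stripped.istitle()


def extract_pages(pdf_path):
    """Yield (page_number, text) for each page of the PDF."""
    import pdfplumber  # imported here so the heading helper works without it installed

    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            # extract_text() may return None for pages without a text layer
            yield page.page_number, page.extract_text() or ""
```

A caller could loop over `extract_pages(...)`, split each page's text into lines, and start a new section whenever `looks_like_heading` returns `True`.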
---
## Docker Setup
### Dockerfile
```
FROM python:3.9-slim
WORKDIR /app
COPY . .
RUN pip install --no-cache-dir -r requirements.txt
CMD ["python", "src/pdf_analyzer.py"]
```
### Build the Docker Image
```
docker build --no-cache --platform linux/amd64 -t adobe-1b .
```
### Run the Container
Run this command in PowerShell (Windows), replacing the two host paths with the locations of your own `input` and `output` folders:
```
docker run --rm `
-v "D:/adobe hackathon/adobe_hackathon_1b/input:/app/input" `
-v "D:/adobe hackathon/adobe_hackathon_1b/output:/app/output" `
-e PERSONA="student" `
-e JOB="summary" `
adobe-1b
```
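The command above passes the persona and job through the `PERSONA` and `JOB` environment variables and mounts the host folders at `/app/input` and `/app/output`. A minimal sketch of how a script might read that configuration follows; the function name, the empty-string defaults, and the lowercasing are assumptions, not taken from the project's source.

```python
import os
from pathlib import Path


def load_config(env=None):
    """Read the persona/job settings and the container paths used by `docker run`."""
    env = os.environ if env is None else env
    return {
        # Normalising to lowercase is an assumption made for this sketch.
        "persona": env.get("PERSONA", "").strip().lower(),
        "job": env.get("JOB", "").strip().lower(),
        # These match the container side of the two -v volume mounts.
        "input_dir": Path("/app/input"),
        "output_dir": Path("/app/output"),
    }
```

Accepting an optional mapping instead of reading `os.environ` directly makes the function easy to exercise outside the container.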
## Output
The result is stored in `output/output.json`. It contains:
- Matching sections
- Page numbers
- Section title
- Importance rank
- 200-character preview of each section
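The exact JSON schema is not shown in this README. As a sketch only, one record per matching section could be assembled as below; the key names (`section_title`, `page_number`, `importance_rank`, `preview`), the top-level `sections` key, and the whitespace normalisation are assumptions chosen to mirror the fields listed above.

```python
import json


def build_record(title, page, rank, text, preview_chars=200):
    """Build one output entry; the preview is the first 200 characters of the section."""
    return {
        "section_title": title,
        "page_number": page,
        "importance_rank": rank,
        # Collapse runs of whitespace so the preview stays on one line.
        "preview": " ".join(text.split())[:preview_chars],
    }


def write_output(records, path):
    """Write all records to a JSON file such as output/output.json."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"sections": records}, fh, indent=2, ensure_ascii=False)
```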
## Dependencies
```
pdfplumber==0.10.3
```
Install via:
```
pip install -r requirements.txt
```