https://github.com/ap-dev-github/adobe-hackathon-2024-round1b-winners

Round 1B submission for the Adobe India Hackathon 2025

Host: GitHub
URL: https://github.com/ap-dev-github/adobe-hackathon-2024-round1b-winners
Owner: ap-dev-github
License: mit
Created: 2025-07-28T15:59:04.000Z (10 months ago)
Default Branch: main
Last Pushed: 2025-07-28T16:23:53.000Z (10 months ago)
Last Synced: 2025-07-28T18:25:53.068Z (10 months ago)
Language: Python
Size: 4.2 MB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


# Adobe Hackathon Round 1B Submission

## Team: Winners
### Members:
- Ayush Pandey
- Ayush Banerjee

---

## Goal
Build a **Persona-Keyword Mapping** system that scans PDF documents and surfaces the sections most relevant to a given **persona** and **job role**. For example, a "student" persona with the job "summary" receives ranked excerpts from the document tailored to that context.
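
As an illustration only, a rule-based mapping of this kind could be sketched as below. The `KEYWORDS` table, its contents, and the functions `score_section` and `rank_sections` are hypothetical names chosen for this example; they are not taken from the repository's code.

```python
# Hypothetical keyword table: (persona, job) -> keywords that signal relevance.
KEYWORDS = {
    ("student", "summary"): ["summary", "overview", "conclusion", "introduction"],
}


def score_section(text, keywords):
    """Count keyword occurrences in the text (case-insensitive substring match)."""
    lowered = text.lower()
    return sum(lowered.count(keyword) for keyword in keywords)


def rank_sections(sections, persona, job):
    """Return matching sections sorted by descending score.

    `sections` is a list of dicts with "title" and "text" keys.
    Sections scoring zero are dropped. Python's sort is stable, so ties
    keep their document order and the result is deterministic.
    """
    keywords = KEYWORDS.get((persona.lower(), job.lower()), [])
    scored = [
        (score_section(section["title"] + " " + section["text"], keywords), section)
        for section in sections
    ]
    matches = [(score, section) for score, section in scored if score > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [section for _, section in matches]
```

Because the scoring uses plain counting and a stable sort, the same input always yields the same ranking, which is in keeping with a deterministic, rule-based design.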

---

## Tech Stack
- **Python 3.9 (slim)**
- **PDF Parsing:** `pdfplumber`
- No external NLP libraries were used. The system is fully **rule-based and deterministic**.
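
A minimal sketch of page-wise text extraction with `pdfplumber`, followed by a simple rule for spotting section titles. The heading heuristic (short, title-case or upper-case lines with no trailing period) is an assumption made for illustration, and `extract_pages` and `looks_like_heading` are hypothetical names; the repository may detect sections differently.

```python
def extract_pages(pdf_path):
    """Return a list of (page_number, text) tuples, with 1-based page numbers."""
    import pdfplumber  # imported lazily so the heuristic below works without it

    pages = []
    with pdfplumber.open(pdf_path) as pdf:
        for number, page in enumerate(pdf.pages, start=1):
            # Guard against pages that yield no text (e.g. scanned images).
            pages.append((number, page.extract_text() or ""))
    return pages


def looks_like_heading(line, max_words=8):
    """Assumed heuristic: a short title-case or upper-case line without a trailing period."""
    stripped = line.strip()
    if not stripped or stripped.endswith("."):
        return False
    if len(stripped.split()) > max_words:
        return False
    return stripped.istitle() or stripped.isupper()
```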

---

## Docker Setup

### Dockerfile

```
FROM python:3.9-slim

WORKDIR /app
COPY . .

RUN pip install --no-cache-dir -r requirements.txt

CMD ["python", "src/pdf_analyzer.py"]
```
## Build the Docker Image
```
docker build --no-cache --platform linux/amd64 -t adobe-1b .
```
## Run the Container
Run this command in PowerShell (Windows):
```
docker run --rm `
-v "D:/adobe hackathon/adobe_hackathon_1b/input:/app/input" `
-v "D:/adobe hackathon/adobe_hackathon_1b/output:/app/output" `
-e PERSONA="student" `
-e JOB="summary" `
adobe-1b
```
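
The `-e` flags pass `PERSONA` and `JOB` into the container as environment variables. One way the script could read them is sketched below; `read_settings` is a hypothetical helper, and the lower-casing and the error on missing values are assumptions rather than documented behaviour.

```python
import os


def read_settings(env=None):
    """Read PERSONA and JOB from the environment, normalised to lower case."""
    env = os.environ if env is None else env
    persona = env.get("PERSONA", "").strip().lower()
    job = env.get("JOB", "").strip().lower()
    if not persona or not job:
        raise ValueError(
            "Both PERSONA and JOB must be set, e.g. -e PERSONA=student -e JOB=summary"
        )
    return persona, job
```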
## Output
The result is stored in `output/output.json`.

It contains:

- Matching sections
- Page numbers
- Section title
- Importance rank
- 200-character preview of each section
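
For illustration, records with these fields could be assembled and written as follows. The JSON key names (`page_number`, `section_title`, `importance_rank`, `preview`) and the input keys (`page`, `title`, `text`) are assumptions; the actual schema of `output.json` is not shown in this README.

```python
import json

PREVIEW_LENGTH = 200  # preview size stated in the README


def build_records(ranked_sections):
    """Turn ranked sections into output records; rank 1 is the most important."""
    records = []
    for rank, section in enumerate(ranked_sections, start=1):
        records.append({
            "page_number": section["page"],
            "section_title": section["title"],
            "importance_rank": rank,
            "preview": section["text"][:PREVIEW_LENGTH],
        })
    return records


def write_output(records, path="output/output.json"):
    """Write the records as pretty-printed UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, indent=2, ensure_ascii=False)
```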

## Dependencies
```
pdfplumber==0.10.3
```
Install via:
```
pip install -r requirements.txt
```
