https://github.com/hoosnick/olx-parser
OLX Real Estate Parser
- Host: GitHub
- URL: https://github.com/hoosnick/olx-parser
- Owner: hoosnick
- Created: 2025-08-15T09:49:57.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2025-08-15T10:09:38.000Z (10 months ago)
- Last Synced: 2025-08-15T12:08:59.514Z (10 months ago)
- Topics: crawler, olx
- Language: Python
- Homepage: https://t.me/arzonroqkv
- Size: 531 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
README
# OLX Parser

## Architecture
```bash
src/
├── core/                    # Core business logic and configuration
│   ├── config.py            # Application configuration and constants
│   ├── models.py            # Pydantic data models with full typing
│   └── app_factory.py       # Dependency injection factory
├── services/                # Business logic services
│   ├── olx_service.py       # Main OLX scraping logic
│   ├── telegram_service.py  # Telegram bot messaging
│   └── image_service.py     # Image processing and collages
├── adapters/                # External service adapters
│   └── database.py          # Database interface and SQLite implementation
└── utils/                   # Utility functions
    └── logging_utils.py     # Logging configuration
```
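To illustrate the adapter layer in the tree above, here is a minimal sketch of a database interface with a SQLite implementation that remembers which listings were already processed. The names `ListingStore`, `SQLiteListingStore`, and the `seen_listings` table are hypothetical and are not taken from `src/adapters/database.py`; the actual interface may look different.

```python
import sqlite3
from abc import ABC, abstractmethod


class ListingStore(ABC):
    """Hypothetical interface: tracks which listing IDs were already handled."""

    @abstractmethod
    def is_seen(self, listing_id: int) -> bool: ...

    @abstractmethod
    def mark_seen(self, listing_id: int) -> None: ...


class SQLiteListingStore(ListingStore):
    """Hypothetical SQLite-backed implementation of ListingStore."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS seen_listings (id INTEGER PRIMARY KEY)"
        )

    def is_seen(self, listing_id: int) -> bool:
        row = self._conn.execute(
            "SELECT 1 FROM seen_listings WHERE id = ?", (listing_id,)
        ).fetchone()
        return row is not None

    def mark_seen(self, listing_id: int) -> None:
        # INSERT OR IGNORE keeps the call idempotent for repeated IDs.
        with self._conn:
            self._conn.execute(
                "INSERT OR IGNORE INTO seen_listings (id) VALUES (?)",
                (listing_id,),
            )
```

Keeping the services dependent on the abstract class rather than on `sqlite3` directly is what lets a factory swap in another backend or an in-memory store for tests.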
## Installation
1. **Clone the repository, then create and activate a virtual environment:**
```bash
python -m venv .venv
```
```bash
.venv\Scripts\activate # Windows
```
```bash
source .venv/bin/activate # Linux/Mac
```
2. **Install dependencies:**
```bash
pip install -r requirements.txt
```
3. **Configure your Telegram bot:**
   - Rename `sample.env` to `.env` and fill in your bot token and channel ID
   - Set the environment variables:
     - Windows (PowerShell):
       - `$env:TELEGRAM_BOT_TOKEN="YOUR_TOKEN"`
       - `$env:TELEGRAM_CHANNEL_ID="YOUR_CHANNEL_ID"`
     - Windows (Command Prompt):
       - `set TELEGRAM_BOT_TOKEN=YOUR_TOKEN`
       - `set TELEGRAM_CHANNEL_ID=YOUR_CHANNEL_ID`
     - Linux/Mac:
       - `export TELEGRAM_BOT_TOKEN=YOUR_TOKEN`
       - `export TELEGRAM_CHANNEL_ID=YOUR_CHANNEL_ID`
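To check that both variables are visible to Python before starting the app, something like the sketch below can help. `TelegramSettings` and `load_telegram_settings` are hypothetical names, not part of this project, and the sketch reads only the process environment (it does not parse the `.env` file).

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED_VARS = ("TELEGRAM_BOT_TOKEN", "TELEGRAM_CHANNEL_ID")


@dataclass(frozen=True)
class TelegramSettings:
    """Hypothetical container for the two variables configured above."""

    bot_token: str
    channel_id: str


def load_telegram_settings(env: Mapping[str, str] = os.environ) -> TelegramSettings:
    # Fail early with a clear message if a variable is unset or empty.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return TelegramSettings(
        bot_token=env["TELEGRAM_BOT_TOKEN"],
        channel_id=env["TELEGRAM_CHANNEL_ID"],
    )
```

Calling `load_telegram_settings()` with no arguments reads the current shell's environment; passing a plain dict makes the function easy to test.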
## Usage
### Configuration
Key configuration options in `src/core/config.py`:
- `SCHEDULER_INTERVAL_MINUTES`: how often the scraper runs, in minutes
- `SEARCH_PARAMS`: query parameters for the OLX search
```py
SEARCH_PARAMS = {
    "offset": 0,
    "limit": 50,  # Results per page
    "category_id": 1147,  # Real estate category ID
    "region_id": 5,  # Tashkent region ID
    "district_id": 26,  # Tashkent district ID
    "city_id": 5,  # Tashkent city ID
    "distance": 10,  # Radius in km
    "currency": "UYE",  # Currency: UZS or UYE
    "sort_by": "created_at:desc",
    "filter_float_price:from": 100,  # Min. price
    "filter_float_price:to": 300,  # Max. price
    "filter_float_number_of_rooms:from": 1,  # Min. rooms
    "filter_float_number_of_rooms:to": 6,  # Max. rooms
    "filter_refiners": "",
}
```
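As a rough illustration of how a dictionary like this becomes a request, the sketch below encodes a subset of the parameters into a query string and pages through results by advancing `offset` in steps of `limit`. `BASE_URL` is a placeholder and `page_url` is a hypothetical helper; the endpoint the project actually calls is not stated in this README.

```python
from urllib.parse import urlencode

# Placeholder endpoint, NOT the real OLX URL used by the project.
BASE_URL = "https://example.invalid/offers"

# A subset of SEARCH_PARAMS, copied here to keep the sketch self-contained.
params = {
    "offset": 0,
    "limit": 50,
    "category_id": 1147,
    "currency": "UYE",
    "sort_by": "created_at:desc",
    "filter_float_price:from": 100,
    "filter_float_price:to": 300,
}


def page_url(base_url: str, search_params: dict, page: int) -> str:
    """Build the URL for a zero-based page by advancing `offset` by `limit`."""
    paged = {
        **search_params,
        "offset": search_params["offset"] + page * search_params["limit"],
    }
    # urlencode percent-encodes the ":" in keys and values as %3A.
    return f"{base_url}?{urlencode(paged)}"


if __name__ == "__main__":
    for page in range(3):
        print(page_url(BASE_URL, params, page))
```

The helper copies the dictionary rather than mutating it, so the same `params` can be reused on every scheduled run.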
### Run the application
```bash
python app.py
```
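Once started, the scraper repeats at the interval set by `SCHEDULER_INTERVAL_MINUTES`. How `app.py` schedules its runs is not shown in this README, so the following is only a minimal sketch of the idea using a plain loop; `run_every` is a hypothetical helper and the interval value is illustrative, not the project's default.

```python
import time
from typing import Callable, Optional

SCHEDULER_INTERVAL_MINUTES = 5  # Illustrative value, not the project's default


def run_every(
    job: Callable[[], None],
    interval_minutes: float,
    max_runs: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `job`, wait `interval_minutes`, and repeat; return the run count.

    `max_runs=None` loops forever. `sleep` is injectable so the loop can be
    tested without actually waiting.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        job()
        runs += 1
        # Skip the final wait when a run limit has been reached.
        if max_runs is None or runs < max_runs:
            sleep(interval_minutes * 60)
    return runs


if __name__ == "__main__":
    run_every(lambda: print("scrape once"), SCHEDULER_INTERVAL_MINUTES, max_runs=1)
```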