https://github.com/andprov/krisha.kz
Rental ad parser
beautifulsoup4 parser parsing python request sqlite3
- Host: GitHub
- URL: https://github.com/andprov/krisha.kz
- Owner: andprov
- License: mit
- Created: 2023-07-03T12:36:11.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2025-05-21T00:54:22.000Z (about 1 year ago)
- Last Synced: 2025-05-21T01:51:35.014Z (about 1 year ago)
- Topics: beautifulsoup4, parser, parsing, python, request, sqlite3
- Language: Python
- Homepage: https://krisha.kz
- Size: 124 KB
- Stars: 20
- Watchers: 1
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
[README in Russian](https://github.com/andprov/krisha.kz/blob/main/README.ru.md)
# Real Estate Rental Site Parser krisha.kz
# Description
Searches for listings that match the specified [parameters](#params) and collects their data:
- Requests data from preview pages of listings. Finds links to pages with detailed descriptions.
- Visits the detailed description pages of each listing and collects data.
- Stores the results in an SQLite database.
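The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses the standard-library `html.parser` instead of BeautifulSoup4 so that it runs without dependencies, parses a hard-coded HTML fragment instead of making a live request, and skips the detail-page step. The `listing-link` class, the sample URLs, and the `listings` table layout are all assumptions.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "https://krisha.kz"


class LinkCollector(HTMLParser):
    """Collects hrefs of <a> tags that carry a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and self.css_class in classes and attrs.get("href"):
            # Turn relative links into absolute ones.
            self.links.append(urljoin(BASE_URL, attrs["href"]))


def extract_listing_links(preview_html, css_class="listing-link"):
    """Step 1: find links to detail pages on a preview page."""
    parser = LinkCollector(css_class)
    parser.feed(preview_html)
    return parser.links


def save_listings(conn, rows):
    """Step 3: store collected rows; repeated URLs are ignored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings (url TEXT PRIMARY KEY, title TEXT)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO listings (url, title) VALUES (?, ?)", rows
    )
    conn.commit()


# Hypothetical preview-page fragment (the real markup may differ).
SAMPLE_PREVIEW = """
<div><a class="listing-link" href="/a/show/1">Flat 1</a></div>
<div><a class="other" href="/about">About</a></div>
<div><a class="listing-link" href="/a/show/2">Flat 2</a></div>
"""

links = extract_listing_links(SAMPLE_PREVIEW)
conn = sqlite3.connect(":memory:")
save_listings(conn, [(url, "untitled") for url in links])
stored = [row[0] for row in conn.execute("SELECT url FROM listings ORDER BY url")]
print(stored)
```

Using the URL as the primary key together with `INSERT OR IGNORE` is one simple way to keep repeated runs from duplicating listings.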
The available search parameters mirror the search filters on the website. To specify them, edit the `SEARCH_PARAMETERS.json` file in the project's root directory.
### Third-party libraries used in the project
- [BeautifulSoup4](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)
- [requests](https://requests.readthedocs.io/en/latest/)
# Installation and Running
Clone the repository:
```shell
git clone https://github.com/andprov/krisha.kz
```
Navigate to the project folder:
```shell
cd krisha.kz
```
Create a virtual environment:
```shell
python3 -m venv .venv
```
Activate the virtual environment:
```shell
source .venv/bin/activate
```
Install dependencies:
- for regular use:
```shell
pip install .
```
- for development, in editable mode:
```shell
pip install -e ".[test,lint]"
```
Specify the search parameters in the [SEARCH_PARAMETERS.json](SEARCH_PARAMETERS.json) file. See the [examples](#examples).
Run the script:
```shell
python -m krisha
```
# Setting Up Scheduled Runs
Edit the [cron.sh](cron.sh) file, adding your project path:
```shell
#!/bin/bash
cd //krisha.kz
source .venv/bin/activate
python -m krisha
```
If necessary, make `cron.sh` executable:
```shell
chmod +x //krisha.kz/cron.sh
```
Open cron settings:
```shell
crontab -e
```
Add a cron job entry:
```shell
# Daily run at 12 PM.
0 12 * * * //krisha.kz/cron.sh
```
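# Search Parameters
- `city` - City or region code, from 0 to 20 (see City Values);
- `has_photo` - Whether the listing must have photos (`true` or `false`);
- `furniture` - Whether the apartment must be furnished (`true` or `false`);
- `rooms` - List of room counts, each from 1 to 5;
- `price_from` - Minimum price;
- `price_to` - Maximum price;
- `owner` - Whether the listing must be posted by the owner (`true` or `false`).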
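A quick sanity check of the parameters before a run might look like the sketch below. The `validate_parameters` function is hypothetical and is not part of the project. The ranges come from the list above; accepting `0` in `rooms` is an assumption based on example 4, which uses `[0]` for any number of rooms.

```python
def validate_parameters(params):
    """Return a list of problems found; an empty list means the input looks fine."""
    problems = []

    def is_int(value):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)

    city = params.get("city", 0)
    if not is_int(city) or not 0 <= city <= 20:
        problems.append("city must be an integer from 0 to 20")

    for flag in ("has_photo", "furniture", "owner"):
        if not isinstance(params.get(flag, False), bool):
            problems.append(f"{flag} must be true or false")

    # Assumption: 0 is allowed and means "any number of rooms".
    rooms = params.get("rooms", [])
    rooms_ok = isinstance(rooms, list) and all(
        is_int(r) and 0 <= r <= 5 for r in rooms
    )
    if not rooms_ok:
        problems.append("rooms must be a list of integers from 0 to 5")

    low, high = params.get("price_from"), params.get("price_to")
    for name, value in (("price_from", low), ("price_to", high)):
        if value is not None and (not is_int(value) or value < 0):
            problems.append(f"{name} must be a non-negative integer or null")
    if is_int(low) and is_int(high) and low > high:
        problems.append("price_from must not exceed price_to")

    return problems


print(validate_parameters({"city": 1, "rooms": [1], "price_from": 100000}))
print(validate_parameters({"city": 21}))
```

Returning a list of messages rather than raising on the first error lets all mistakes in the file be reported at once.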
### City Values
- 0 - All of Kazakhstan.
- 1 - Almaty.
- 2 - Astana.
- 3 - Shymkent.
- 4 - Abai Region.
- 5 - Akmola Region.
- 6 - Aktobe Region.
- 7 - Almaty Region.
- 8 - Atyrau Region.
- 9 - East Kazakhstan Region.
- 10 - Zhambyl Region.
- 11 - Zhetysu Region.
- 12 - West Kazakhstan Region.
- 13 - Karaganda Region.
- 14 - Kostanay Region.
- 15 - Kyzylorda Region.
- 16 - Mangystau Region.
- 17 - Pavlodar Region.
- 18 - North Kazakhstan Region.
- 19 - Turkestan Region.
- 20 - Ulytau Region.
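For scripts that generate `SEARCH_PARAMETERS.json`, the table above can be kept as a lookup dictionary. This is only a convenience sketch: the names `CITY_NAMES` and `city_code` are hypothetical and do not come from the project.

```python
CITY_NAMES = {
    0: "All of Kazakhstan",
    1: "Almaty",
    2: "Astana",
    3: "Shymkent",
    4: "Abai Region",
    5: "Akmola Region",
    6: "Aktobe Region",
    7: "Almaty Region",
    8: "Atyrau Region",
    9: "East Kazakhstan Region",
    10: "Zhambyl Region",
    11: "Zhetysu Region",
    12: "West Kazakhstan Region",
    13: "Karaganda Region",
    14: "Kostanay Region",
    15: "Kyzylorda Region",
    16: "Mangystau Region",
    17: "Pavlodar Region",
    18: "North Kazakhstan Region",
    19: "Turkestan Region",
    20: "Ulytau Region",
}


def city_code(name):
    """Return the numeric code for a city or region name (case-insensitive)."""
    wanted = name.strip().casefold()
    for code, label in CITY_NAMES.items():
        if label.casefold() == wanted:
            return code
    raise ValueError(f"Unknown city or region: {name!r}")


print(city_code("Almaty"))         # city of Almaty
print(city_code("almaty region"))  # the surrounding region
```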
### Examples of Search Parameter Specification
1. Find one-room apartments in Almaty.
Listings with photos.
Apartments with furniture.
Price from 100000 to 300000.
Listings from owners.
```json
{
  "city": 1,
  "has_photo": true,
  "furniture": true,
  "rooms": [1],
  "price_from": 100000,
  "price_to": 300000,
  "owner": true
}
```
2. Find two-room and three-room apartments in Astana.
Listings with photos.
Apartments without furniture.
Price up to 400000.
Listings from owners.
```json
{
  "city": 2,
  "has_photo": true,
  "rooms": [2, 3],
  "price_to": 400000,
  "owner": true
}
```
3. Find apartments with any number of rooms in Kazakhstan.
Listings without photos.
Apartments without furniture.
Price from 200000.
Listings from owners, agencies, and private realtors.
```json
{
  "price_from": 200000
}
```
4. Returns the same result as example 3, with every parameter written out explicitly.
```json
{
  "city": 0,
  "has_photo": false,
  "furniture": false,
  "rooms": [0],
  "price_from": 200000,
  "price_to": null,
  "owner": false
}
```
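Examples 3 and 4 giving the same result suggests that omitted keys fall back to default values. The sketch below shows one way such a merge could work. The `DEFAULTS` dictionary is inferred from example 4 (with `price_from` assumed to default to `null`), and `with_defaults` is a hypothetical helper, not the project's own loader.

```python
import copy
import json

# Assumed defaults, inferred from example 4.
DEFAULTS = {
    "city": 0,
    "has_photo": False,
    "furniture": False,
    "rooms": [0],
    "price_from": None,
    "price_to": None,
    "owner": False,
}


def with_defaults(raw_json):
    """Parse a JSON object and fill in every key that was left out."""
    params = copy.deepcopy(DEFAULTS)  # deep copy so the rooms list is not shared
    params.update(json.loads(raw_json))
    return params


example_3 = with_defaults('{"price_from": 200000}')
example_4 = with_defaults(
    """
    {
      "city": 0,
      "has_photo": false,
      "furniture": false,
      "rooms": [0],
      "price_from": 200000,
      "price_to": null,
      "owner": false
    }
    """
)
print(example_3 == example_4)  # True
```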