https://github.com/eli64s/pyflink-poc

PyFlink data stream processing utilities 🐿

apache-flink data-stream-processing data-streaming data-streams pyflink real-time-data

FlinkFlow


๐Ÿ“ Real-time stream processing wiht PyFlink.


๐Ÿš€ Developed with the software and tools below.


py
pyflink

aioresponses
aiohttp
asyncio
pack

---
## 📚 Table of Contents
- [📚 Table of Contents](#-table-of-contents)
- [📍Overview](#overview)
- [🔮 Features](#-features)
- [⚙️ Project Structure](#️-project-structure)
- [💻 Modules](#-modules)
- [🚀 Getting Started](#-getting-started)
  - [✅ Prerequisites](#-prerequisites)
  - [💻 Installation](#-installation)
  - [🤖 Using FlinkFlow](#-using-flinkflow)
  - [🧪 Running Tests](#-running-tests)
- [🛠 Future Development](#-future-development)
- [🤝 Contributing](#-contributing)
- [🪪 License](#-license)
- [🙏 Acknowledgments](#-acknowledgments)

---

## 📍Overview

FlinkFlow is a repository for building real-time data processing apps with PyFlink.

## 🔮 Features

> `[📌 INSERT-PROJECT-FEATURES]`

---

## ⚙️ Project Structure

```bash
.
├── README.md
├── conf
│   ├── conf.toml
│   └── flink-config.yaml
├── data
│   └── data.csv
├── requirements.txt
├── scripts
│   ├── clean.sh
│   └── run.sh
├── setup
│   └── setup.sh
├── setup.py
└── src
    ├── alerts_handler.py
    ├── consumer.py
    └── logger.py

6 directories, 12 files
```
---

## 💻 Modules
Scripts

| File     | Summary |
|:---------|:--------|
| run.sh   | Bash script that starts a Flink cluster, submits a PyFlink job, and then stops the cluster. |
| clean.sh | Bash script that cleans up files and directories related to Python, Jupyter notebooks, and pytest: Python cache files, build artifacts, notebook checkpoints, and log files. |
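
The flow that the `run.sh` summary describes (start the cluster, submit the job, stop the cluster) can be sketched in Python as below. This is an illustration under stated assumptions, not the script itself: the Flink installation path, the `src/consumer.py` job path, and the helper names `build_commands` and `run_job` are hypothetical. It relies only on the `start-cluster.sh`, `flink run --python`, and `stop-cluster.sh` commands found in the `bin` directory of a Flink distribution.

```python
import subprocess
from pathlib import Path


def build_commands(flink_home: str, job_file: str) -> dict[str, list[str]]:
    """Return the three commands the run script is described as issuing."""
    bin_dir = Path(flink_home) / "bin"
    return {
        "start": [str(bin_dir / "start-cluster.sh")],
        "submit": [str(bin_dir / "flink"), "run", "--python", job_file],
        "stop": [str(bin_dir / "stop-cluster.sh")],
    }


def run_job(flink_home: str, job_file: str) -> None:
    """Start the cluster, submit the job, and always stop the cluster."""
    commands = build_commands(flink_home, job_file)
    subprocess.run(commands["start"], check=True)
    try:
        subprocess.run(commands["submit"], check=True)
    finally:
        # Stop the cluster even if the submission fails.
        subprocess.run(commands["stop"], check=True)
```

A call such as `run_job("/opt/flink", "src/consumer.py")` would then mirror the script, with `/opt/flink` standing in for wherever Flink is installed. The `try`/`finally` is a design choice of this sketch so that a failed submission does not leave the cluster running.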

Src

| File              | Summary |
|:------------------|:--------|
| alerts_handler.py | REST API alert handler for the Flink consumer. It buffers alerts, serializes them with Apache Avro, and sends them to the API in batches using aiohttp. |
| logger.py         | Logger class that provides colored output and multiple log levels for the project. |
| consumer.py       | Python script that uses Apache Flink to process streaming data. It creates a StreamExecutionEnvironment, sets the parallelism, time characteristic, and checkpointing mode, and then creates a StreamTableEnvironment. |
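
The buffering behavior described for `alerts_handler.py` can be illustrated with a small asyncio sketch. It is not the module's actual code: the `AlertBuffer` class, its `batch_size` parameter, and the injected `send_batch` coroutine are hypothetical names chosen for this example. The sender is passed in so the sketch runs without a network; in the project it would presumably wrap an aiohttp POST, and Avro serialization is left out entirely.

```python
import asyncio
from typing import Any, Awaitable, Callable

Alert = dict[str, Any]
SendBatch = Callable[[list[Alert]], Awaitable[None]]


class AlertBuffer:
    """Collect alerts and hand them to `send_batch` in fixed-size batches."""

    def __init__(self, send_batch: SendBatch, batch_size: int = 100) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._send_batch = send_batch
        self._batch_size = batch_size
        self._pending: list[Alert] = []

    async def add(self, alert: Alert) -> None:
        """Buffer one alert and flush once the batch is full."""
        self._pending.append(alert)
        if len(self._pending) >= self._batch_size:
            await self.flush()

    async def flush(self) -> None:
        """Send whatever is buffered; a no-op when the buffer is empty."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        await self._send_batch(batch)


async def demo() -> list[list[Alert]]:
    """Feed five alerts through a buffer of size two and record each batch."""
    sent: list[list[Alert]] = []

    async def record(batch: list[Alert]) -> None:
        # Stand-in for an HTTP POST (for example via aiohttp).
        sent.append(batch)

    buffer = AlertBuffer(record, batch_size=2)
    for i in range(5):
        await buffer.add({"id": i})
    await buffer.flush()  # send the final partial batch
    return sent
```

Running `asyncio.run(demo())` produces three batches of sizes 2, 2, and 1. The explicit final `flush()` matters: without it, a partial batch would stay in memory when the stream ends.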


## 🚀 Getting Started

### ✅ Prerequisites

Before you begin, ensure that you have the following prerequisites installed:
> `[📌 INSERT-PROJECT-PREREQUISITES]`

### 💻 Installation

1. Clone the FlinkFlow repository:
```sh
git clone https://github.com/eli64s/FlinkFlow
```

2. Change to the project directory:
```sh
cd FlinkFlow
```

3. Install the dependencies:
```sh
pip install -r requirements.txt
```

### 🤖 Using FlinkFlow

```sh
python main.py
```
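
For orientation, the environment setup that the `consumer.py` summary in the Modules section describes could look roughly like the sketch below. It is an assumption-laden illustration rather than the project's code: `ConsumerSettings`, `build_environments`, and the default values are hypothetical. PyFlink is imported inside the function so the settings object can be used where PyFlink is not installed. The time characteristic step is omitted because event time has been the default since Flink 1.12.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsumerSettings:
    """Hypothetical knobs for the streaming job; the defaults are illustrative."""

    parallelism: int = 1
    checkpoint_interval_ms: int = 10_000


def build_environments(settings: ConsumerSettings):
    """Create the DataStream and Table environments (requires apache-flink)."""
    # Imported lazily so this module loads even where PyFlink is not installed.
    from pyflink.datastream import CheckpointingMode, StreamExecutionEnvironment
    from pyflink.table import StreamTableEnvironment

    env = StreamExecutionEnvironment.get_execution_environment()
    env.set_parallelism(settings.parallelism)
    env.enable_checkpointing(
        settings.checkpoint_interval_ms, CheckpointingMode.EXACTLY_ONCE
    )
    table_env = StreamTableEnvironment.create(env)
    return env, table_env
```

With PyFlink installed, `env, table_env = build_environments(ConsumerSettings(parallelism=2))` would return both environments ready for sources and sinks to be registered.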

### 🧪 Running Tests
```sh
#run tests
```


## 🛠 Future Development
- [X] [📌 COMPLETED-TASK]
- [ ] [📌 INSERT-TASK]
- [ ] [📌 INSERT-TASK]

---

## 🤝 Contributing
Contributions are always welcome! Please follow these steps:
1. Fork the project repository. This creates a copy of the project on your account that you can modify without affecting the original project.
2. Clone the forked repository to your local machine using Git or a client such as GitHub Desktop.
3. Create a new branch with a descriptive name (e.g., `new-feature-branch` or `bugfix-issue-123`).
```sh
git checkout -b new-feature-branch
```
4. Make changes to the project's codebase.
5. Commit your changes to your local branch with a clear commit message that explains the changes you've made.
```sh
git commit -m 'Implemented new feature.'
```
6. Push your changes to your forked repository on GitHub using the following command:
```sh
git push origin new-feature-branch
```
7. Open a pull request against the original project repository. In it, describe the changes you've made and why they're necessary.

The project maintainers will review your changes and either provide feedback or merge them into the main branch.

---

## 🪪 License

This project is licensed under the `[📌 INSERT-LICENSE-TYPE]` License. See the [LICENSE](https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/adding-a-license-to-a-repository) file for additional info.

---

## 🙏 Acknowledgments

[📌 INSERT-DESCRIPTION]

---