https://github.com/eli64s/pyflink-poc
PyFlink data stream processing utilities
apache-flink data-stream-processing data-streaming data-streams pyflink real-time-data
- Host: GitHub
- URL: https://github.com/eli64s/pyflink-poc
- Owner: eli64s
- Created: 2023-03-24T05:40:09.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-06-16T08:53:42.000Z (over 1 year ago)
- Last Synced: 2024-11-12T09:09:06.887Z (about 1 month ago)
- Topics: apache-flink, data-stream-processing, data-streaming, data-streams, pyflink, real-time-data
- Language: Python
- Homepage:
- Size: 32.2 KB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# FlinkFlow

Real-time stream processing with PyFlink.
---
## Table of Contents
- [Table of Contents](#table-of-contents)
- [Overview](#overview)
- [Features](#features)
- [Project Structure](#project-structure)
- [Modules](#modules)
- [Getting Started](#getting-started)
  - [Prerequisites](#prerequisites)
  - [Installation](#installation)
  - [Using FlinkFlow](#using-flinkflow)
  - [Running Tests](#running-tests)
- [Future Development](#future-development)
- [Contributing](#contributing)
- [License](#license)
- [Acknowledgments](#acknowledgments)

---
## Overview
FlinkFlow is a repository for building real-time data processing apps with PyFlink.
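
For orientation, a PyFlink application of this kind usually starts by creating a DataStream environment and a Table environment on top of it. The following is only a minimal sketch under stated assumptions, not the repository's actual code: the function name `build_environments`, its parameters, and their default values are hypothetical. It requires the `apache-flink` package at call time.

```python
def build_environments(parallelism: int = 1, checkpoint_interval_ms: int = 10_000):
    """Create a DataStream environment and a Table environment on top of it.

    The name, parameters, and defaults are illustrative assumptions.
    """
    # Imported lazily so this module can be loaded on machines
    # where the apache-flink package is not installed.
    from pyflink.datastream import CheckpointingMode, StreamExecutionEnvironment
    from pyflink.table import StreamTableEnvironment

    env = StreamExecutionEnvironment.get_execution_environment()
    env.set_parallelism(parallelism)
    # Take a checkpoint every `checkpoint_interval_ms` milliseconds
    # with exactly-once state consistency.
    env.enable_checkpointing(checkpoint_interval_ms, CheckpointingMode.EXACTLY_ONCE)
    t_env = StreamTableEnvironment.create(env)
    return env, t_env
```

With `apache-flink` installed, `env, t_env = build_environments(parallelism=2)` returns both environments. Event time has been the default time characteristic since Flink 1.12, so the sketch does not set it explicitly.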
## Features
> `[INSERT-PROJECT-FEATURES]`
---
## Project Structure
```bash
.
├── README.md
├── conf
│   ├── conf.toml
│   └── flink-config.yaml
├── data
│   └── data.csv
├── requirements.txt
├── scripts
│   ├── clean.sh
│   └── run.sh
├── setup
│   └── setup.sh
├── setup.py
└── src
    ├── alerts_handler.py
    ├── consumer.py
    └── logger.py

6 directories, 12 files
```
---

## Modules

Scripts

| File | Summary |
|:---------|:--------|
| run.sh | Bash script that starts a Flink cluster, submits a PyFlink job, and then stops the cluster. |
| clean.sh | Bash script that cleans up Python, Jupyter Notebook, and pytest leftovers: Python cache files, build artifacts, notebook checkpoints, and log files. |

Src

| File | Summary |
|:------------------|:--------|
| alerts_handler.py | REST API alert handler for the Flink consumer. It buffers alerts, serializes them with Apache Avro, and sends them to the API in batches using aiohttp. |
| logger.py | Logger class for the project, providing colored output and multiple log levels. |
| consumer.py | Python script that uses Apache Flink to process streaming data. It creates a StreamExecutionEnvironment, sets the parallelism, time characteristic, and checkpointing mode, and creates a StreamTableEnvironment. |
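
The buffer-and-batch behavior described for `alerts_handler.py` can be sketched with the standard library alone. This is an illustration under stated assumptions, not the module's real code: the class name `AlertBuffer`, the `send` callback, and the batch size are hypothetical, JSON stands in for the Avro serialization, and a fake coroutine stands in for the aiohttp request.

```python
import asyncio
import json
from typing import Awaitable, Callable, Dict, List

Alert = Dict[str, object]
Sender = Callable[[bytes], Awaitable[None]]


class AlertBuffer:
    """Collects alerts and passes them to `send` in batches of `batch_size`."""

    def __init__(self, send: Sender, batch_size: int = 3) -> None:
        self._send = send
        self._batch_size = batch_size
        self._pending: List[Alert] = []

    async def add(self, alert: Alert) -> None:
        """Buffer one alert and flush as soon as a full batch is available."""
        self._pending.append(alert)
        if len(self._pending) >= self._batch_size:
            await self.flush()

    async def flush(self) -> None:
        """Send whatever is buffered as a single payload, if anything."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        # Assumption: JSON replaces the Avro encoding mentioned in the table.
        await self._send(json.dumps(batch).encode("utf-8"))


async def demo() -> List[int]:
    sent_sizes: List[int] = []

    async def fake_send(payload: bytes) -> None:
        # Hypothetical stand-in for an HTTP POST; records the batch size only.
        sent_sizes.append(len(json.loads(payload)))

    buffer = AlertBuffer(fake_send, batch_size=3)
    for i in range(7):
        await buffer.add({"id": i, "level": "warning"})
    await buffer.flush()  # send the final, partial batch
    return sent_sizes


print(asyncio.run(demo()))  # [3, 3, 1]
```

Passing the sender in as a callback keeps the batching logic testable without a network; a real handler would replace `fake_send` with a coroutine that posts the payload to the alert API.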
## Getting Started
### Prerequisites
Before you begin, ensure that you have the following prerequisites installed:
> `[INSERT-PROJECT-PREREQUISITES]`

### Installation
1. Clone the FlinkFlow repository:
```sh
git clone https://github.com/eli64s/FlinkFlow
```
2. Change to the project directory:
```sh
cd FlinkFlow
```
3. Install the dependencies:
```sh
pip install -r requirements.txt
```

### Using FlinkFlow
```sh
python main.py
```

### Running Tests
```sh
# run tests
```
## Future Development
- [X] [COMPLETED-TASK]
- [ ] [INSERT-TASK]
- [ ] [INSERT-TASK]

---
## Contributing
Contributions are always welcome! Please follow these steps:
1. Fork the project repository. This creates a copy of the project on your account that you can modify without affecting the original project.
2. Clone the forked repository to your local machine using Git or a client such as GitHub Desktop.
3. Create a new branch with a descriptive name (e.g., `new-feature-branch` or `bugfix-issue-123`).
```sh
git checkout -b new-feature-branch
```
4. Make changes to the project's codebase.
5. Commit your changes to your local branch with a clear commit message that explains the changes you've made.
```sh
git commit -m 'Implemented new feature.'
```
6. Push your changes to your forked repository on GitHub using the following command:
```sh
git push origin new-feature-branch
```
7. Open a pull request against the original project repository. In the pull request, describe the changes you've made and why they're necessary.

The project maintainers will review your changes and provide feedback or merge them into the main branch.

---
## License
This project is licensed under the `[INSERT-LICENSE-TYPE]` License. See the [LICENSE](https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/adding-a-license-to-a-repository) file for additional info.

---
## Acknowledgments
[INSERT-DESCRIPTION]

---