Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/skar-software/duckdb-py-lambda
DuckDB Python runtime working in AWS Lambda
- Host: GitHub
- URL: https://github.com/skar-software/duckdb-py-lambda
- Owner: skar-software
- License: unlicense
- Created: 2024-05-18T14:07:20.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2024-05-21T04:25:27.000Z (8 months ago)
- Last Synced: 2025-01-14T19:58:39.715Z (12 days ago)
- Language: Python
- Size: 5.86 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Usage
Pass your Python code as a string in the `pandas_code` key of the Lambda's input event JSON, as shown below:
```json
{
"pandas_code": "import duckdb\nimport pandas as pd\nimport numpy as np\ndf = pd.DataFrame({\n'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],\n'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],\n'C': np.random.randn(8),\n'D': np.random.randn(8)\n})\nresult = duckdb.query('SELECT A, AVG(D) FROM df GROUP BY A').to_df()\nprint(result)"
}
```
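Escaping a multi-line snippet by hand is error-prone. As a minimal sketch (not part of the repository), Python's standard `json` module can build the event so that newlines and double quotes are escaped automatically. The variable names `pandas_code`, `event`, and `event_json` are illustrative assumptions.

```python
import json

# Snippet to run inside the Lambda. The SQL string uses double quotes,
# which json.dumps escapes as \" in the serialized event.
pandas_code = '''import duckdb
import pandas as pd
df = pd.DataFrame({'A': ['foo', 'bar', 'foo'], 'D': [1.0, 2.0, 3.0]})
result = duckdb.query("SELECT A, AVG(D) FROM df GROUP BY A").to_df()
print(result)
'''

# The event shape follows the README: a single "pandas_code" key.
event = {"pandas_code": pandas_code}

# Serialize to the JSON text that is pasted or sent as the Lambda test event.
event_json = json.dumps(event, indent=2)
print(event_json)
```

The printed JSON can be pasted into the Lambda console as a test event. Parsing it back with `json.loads` returns the original snippet unchanged.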
If the query needs double quotes, they must be escaped when the code is passed in the event JSON.

# Example queries
Compute the average of column `D` for each value of column `A` in an in-memory DataFrame.

**Input Code**
```python
import duckdb
import pandas as pd
import numpy as np

# Create a Pandas DataFrame
df = pd.DataFrame({
'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
'C': np.random.randn(8),
'D': np.random.randn(8)
})
# Use DuckDB to run a SQL query on the DataFrame
result = duckdb.query("SELECT A, AVG(D) FROM df GROUP BY A").to_df()
# Print the result so it appears in the Lambda output
print(result)
```
**Output**
```
     A    avg(D)
0  foo  0.468670
1  bar  0.399205
```

# Option 1: Run EXE file (pre-compiled)
- The DuckDB in Lambda EXE file is available under the releases tab. You can download the zip, upload it to an AWS Lambda, and test it out yourself.
https://github.com/skarcapital/DuckDB-Py-Lambda/releases/tag/v1
- Since the zip, which contains the program and its required dependencies, is larger than 50 MB, it has to be uploaded to AWS S3 and the Lambda configured to use it from there.
- If you have questions, post them in the issues.

# Option 2: Compile steps
- The `lambda/lambda_function.py` file is the main program.
- Inside this folder, install all required dependencies locally:
`pip install duckdb pandas -t .`
- Zip the folder that contains the `lambda_function.py` file and the dependencies.
- Since the zip, which contains the program and its required dependencies, is larger than 50 MB, it has to be uploaded to AWS S3 and the Lambda configured to use it from there.
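
The zip and upload steps above can be sketched in Python. This is an illustration under stated assumptions, not the repository's own tooling: the helper names `build_zip`, `needs_s3`, and `deploy_via_s3` are hypothetical, and the clients passed to `deploy_via_s3` are assumed to be `boto3.client("s3")` and `boto3.client("lambda")`. The bucket, key, and function name are placeholders you would supply.

```python
import os
import zipfile

# 50 MB, the direct-upload threshold mentioned in the steps above.
DIRECT_UPLOAD_LIMIT = 50 * 1024 * 1024


def build_zip(source_dir, zip_path):
    """Zip the contents of source_dir so lambda_function.py sits at the archive root.

    zip_path should live outside source_dir, otherwise the archive
    would try to include itself. Returns the archive size in bytes.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Store paths relative to source_dir, not the absolute path.
                zf.write(full_path, os.path.relpath(full_path, source_dir))
    return os.path.getsize(zip_path)


def needs_s3(size_bytes):
    """True when the archive is too large to upload to Lambda directly."""
    return size_bytes > DIRECT_UPLOAD_LIMIT


def deploy_via_s3(s3_client, lambda_client, zip_path, bucket, key, function_name):
    """Upload the zip to S3, then point the Lambda function at that object."""
    s3_client.upload_file(zip_path, bucket, key)
    return lambda_client.update_function_code(
        FunctionName=function_name, S3Bucket=bucket, S3Key=key
    )

# Example usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   size = build_zip("lambda", "duckdb_lambda.zip")
#   if needs_s3(size):
#       deploy_via_s3(boto3.client("s3"), boto3.client("lambda"),
#                     "duckdb_lambda.zip", "my-bucket", "duckdb_lambda.zip",
#                     "my-function")
```

Passing the clients in as arguments keeps the helpers free of AWS calls at import time, so the packaging logic can be checked locally before anything is uploaded.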