https://github.com/lithops-cloud/lithops
A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀
- Host: GitHub
- URL: https://github.com/lithops-cloud/lithops
- Owner: lithops-cloud
- License: apache-2.0
- Created: 2018-04-23T09:02:25.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2025-02-27T11:05:17.000Z (11 months ago)
- Last Synced: 2025-04-11T17:50:42.450Z (9 months ago)
- Topics: big-data, big-data-analytics, cloud-computing, data-processing, distributed, kubernetes, multicloud, multiprocessing, object-storage, parallel, python, serverless, serverless-computing, serverless-functions
- Language: Python
- Homepage: http://lithops.cloud
- Size: 12.9 MB
- Stars: 327
- Watchers: 12
- Forks: 111
- Open Issues: 11
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
Lithops is a Python multi-cloud distributed computing framework that lets you run unmodified Python code at massive scale across cloud, HPC, and on-premise platforms. It supports major cloud providers and Kubernetes platforms, running your code transparently without requiring you to manage deployment or infrastructure.
Lithops is ideal for highly parallel workloads—such as Monte Carlo simulations, machine learning, metabolomics, or geospatial analytics—and lets you tailor execution to your priorities: you can optimize for performance using AWS Lambda to launch hundreds of functions in milliseconds, or reduce costs by running the same code on AWS Batch with Spot Instances.
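To make "highly parallel" concrete, here is a minimal sketch of a Monte Carlo estimate of π split into independent tasks. It uses only the Python standard library and the built-in `map`; the names `count_hits` and `estimate_pi` are illustrative and not part of Lithops. Each task depends only on its own arguments, which is the property that makes this kind of function a natural fit for a parallel map.

```python
import random


def count_hits(task):
    """Count random points that fall inside the unit quarter circle."""
    seed, samples = task
    rng = random.Random(seed)  # per-task generator keeps tasks independent
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(num_tasks=8, samples_per_task=20_000):
    """Combine the per-task hit counts into an estimate of pi."""
    tasks = [(seed, samples_per_task) for seed in range(num_tasks)]
    # The built-in map runs the tasks one after another; a parallel map
    # could take its place because the tasks share no state.
    hits = sum(map(count_hits, tasks))
    return 4.0 * hits / (num_tasks * samples_per_task)


if __name__ == "__main__":
    print(estimate_pi())
```

Because `count_hits` takes a single argument and returns a plain integer, it has the same shape as the functions passed to the map calls in the High-level API section.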
## Installation
1. Install Lithops from the PyPI repository:
```bash
pip install lithops
```
2. Execute a *Hello World* function:
```bash
lithops hello
```
## Configuration
Lithops provides an extensible backend architecture (compute, storage) designed to work with various cloud providers and on-premise platforms. You can write your code in Python and run it unmodified across major cloud providers and Kubernetes environments.
[Follow these instructions to configure your compute and storage backends](config/)
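As a rough illustration of the compute/storage split, the sketch below builds a minimal configuration dictionary in Python. It assumes the top-level `lithops` section with `backend` and `storage` keys, and the backend names `localhost`, `aws_lambda`, and `aws_s3`; check these against the configuration instructions for your installed version. The helper `build_config` is hypothetical, and backend-specific sections (credentials, regions, roles) are deliberately left out.

```python
def build_config(backend="localhost", storage="localhost"):
    """Return a minimal Lithops-style config dict (illustrative helper).

    Assumption: the compute and storage backends are selected under the
    top-level 'lithops' key. Provider-specific sections are omitted.
    """
    return {"lithops": {"backend": backend, "storage": storage}}


# Local execution, useful for trying things out without a cloud account.
local_cfg = build_config()

# Same structure pointing at AWS backends (names assumed, see lead-in).
lambda_cfg = build_config(backend="aws_lambda", storage="aws_s3")

if __name__ == "__main__":
    print(local_cfg)
    print(lambda_cfg)
```

Assuming your Lithops version accepts a configuration dictionary, a dict like `local_cfg` would be passed as `FunctionExecutor(config=local_cfg)`. Switching backends then changes only the configuration, not the application code.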
## High-level API
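Lithops ships with two high-level compute APIs (`FunctionExecutor` and a `multiprocessing`-style `Pool`) and two high-level storage APIs (`Storage` and an `os`-like cloud proxy), shown in that order: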
```python
from lithops import FunctionExecutor


def double(i):
    return i * 2


with FunctionExecutor() as fexec:
    f = fexec.map(double, [1, 2, 3, 4])
    print(f.result())
```
```python
from lithops.multiprocessing import Pool


def double(i):
    return i * 2


with Pool() as pool:
    result = pool.map(double, [1, 2, 3, 4])
    print(result)
```
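For comparison, here is a sketch of the same map written against the Python standard library. It uses `multiprocessing.pool.ThreadPool` as a local stand-in so that it runs without any cloud setup. The helper `run_local` is illustrative only. The `with ... as pool` and `pool.map(...)` lines have the same shape as in the `lithops.multiprocessing` example above.

```python
from multiprocessing.pool import ThreadPool


def double(i):
    return i * 2


def run_local(values, workers=4):
    """Map `double` over `values` with a local thread pool."""
    with ThreadPool(workers) as pool:
        # map preserves input order in its result list
        return pool.map(double, values)


if __name__ == "__main__":
    print(run_local([1, 2, 3, 4]))
```

Existing code written in this style is the kind of code the multiprocessing API is aimed at: only the import line differs between the two versions.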
```python
from lithops import Storage

if __name__ == "__main__":
    st = Storage()
    st.put_object(bucket='mybucket',
                  key='test.txt',
                  body='Hello World')

    # Read the object back from the same bucket it was written to
    print(st.get_object(bucket='mybucket',
                        key='test.txt'))
```
```python
from lithops.storage.cloud_proxy import os

if __name__ == "__main__":
    filepath = 'bar/foo.txt'
    with os.open(filepath, 'w') as f:
        f.write('Hello world!')

    dirname = os.path.dirname(filepath)
    print(os.listdir(dirname))
    os.remove(filepath)
```
You can find more usage examples in the [examples](/examples) folder.
## Documentation
For documentation on using Lithops, see the [latest release documentation](https://lithops-cloud.github.io/docs/).
If you are interested in contributing, see [CONTRIBUTING.md](./CONTRIBUTING.md).
## Additional resources
### Blogs and Talks
* [How to run Lithops over EC2 VMs using the new K8s backend](https://danielalecoll.medium.com/how-to-run-lithops-over-ec2-vms-using-the-new-k8s-backend-4b0a4377c4e9)
* [Simplify the developer experience with OpenShift for Big Data processing by using Lithops framework](https://medium.com/@gvernik/simplify-the-developer-experience-with-openshift-for-big-data-processing-by-using-lithops-framework-d62a795b5e1c)
* [Speed-up your Python applications using Lithops and Serverless Cloud resources](https://itnext.io/speed-up-your-python-applications-using-lithops-and-serverless-cloud-resources-a64beb008bb5)
* [Lithops, a Multi-cloud Serverless Programming Framework](https://itnext.io/lithops-a-multi-cloud-serverless-programming-framework-fd97f0d5e9e4)
* [CNCF Webinar - Toward Hybrid Cloud Serverless Transparency with Lithops Framework](https://www.youtube.com/watch?v=-uS-wi8CxBo)
* [Your easy move to serverless computing and radically simplified data processing](https://www.slideshare.net/gvernik/your-easy-move-to-serverless-computing-and-radically-simplified-data-processing-238929020) Strata Data Conference, NY 2019. See video of Lithops usage [here](https://www.youtube.com/watch?v=EYa95KyYEtg&list=PLpR7f3Www9KCjYisaG7AMaR0C2GqLUh2G&index=3&t=0s) and the example of Monte Carlo [here](https://www.youtube.com/watch?v=vF5HI2q5VKw&list=PLpR7f3Www9KCjYisaG7AMaR0C2GqLUh2G&index=2&t=0s)
### Papers
* [Serverful Functions: Leveraging Servers in Complex Serverless Workflows](https://dl.acm.org/doi/10.1145/3700824.3701095) - ACM Middleware Industrial Track 2024
* [Transparent serverless execution of Python multiprocessing applications](https://dl.acm.org/doi/10.1016/j.future.2022.10.038) - Elsevier Future Generation Computer Systems 2023
* [Outsourcing Data Processing Jobs with Lithops](https://ieeexplore.ieee.org/document/9619947) - IEEE Transactions on Cloud Computing 2022
* [Towards Multicloud Access Transparency in Serverless Computing](https://www.computer.org/csdl/magazine/so/5555/01/09218932/1nMMkpZ8Ko8) - IEEE Software 2021
* [Primula: a Practical Shuffle/Sort Operator for Serverless Computing](https://dl.acm.org/doi/10.1145/3429357.3430522) - ACM/IFIP International Middleware Conference 2020. [See presentation here](https://www.youtube.com/watch?v=v698iu5YfWM)
* [Bringing scaling transparency to Proteomics applications with serverless computing](https://dl.acm.org/doi/abs/10.1145/3429880.3430101) - 6th International Workshop on Serverless Computing (WoSC6) 2020. [See presentation here](https://www.serverlesscomputing.org/wosc6/#p10)
* [Serverless data analytics in the IBM Cloud](https://dl.acm.org/citation.cfm?id=3284029) - ACM/IFIP International Middleware Conference 2018
## Acknowledgements
This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 825184 (CloudButton).
