https://github.com/lithops-cloud/lithops
A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud
big-data big-data-analytics cloud-computing data-processing distributed kubernetes multicloud multiprocessing object-storage parallel python serverless serverless-computing serverless-functions
- Host: GitHub
- URL: https://github.com/lithops-cloud/lithops
- Owner: lithops-cloud
- License: apache-2.0
- Created: 2018-04-23T09:02:25.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2025-02-27T11:05:17.000Z (3 months ago)
- Last Synced: 2025-04-11T17:50:42.450Z (about 1 month ago)
- Topics: big-data, big-data-analytics, cloud-computing, data-processing, distributed, kubernetes, multicloud, multiprocessing, object-storage, parallel, python, serverless, serverless-computing, serverless-functions
- Language: Python
- Homepage: http://lithops.cloud
- Size: 12.9 MB
- Stars: 327
- Watchers: 12
- Forks: 111
- Open Issues: 11
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
Lithops is a Python multi-cloud distributed computing framework. It allows you to run unmodified local Python code at massive scale on the main serverless computing platforms. Lithops delivers the user's code into the cloud without requiring knowledge of how it is deployed and run. Moreover, its multicloud-agnostic architecture ensures portability across cloud providers.
Lithops is especially suited to highly parallel programs with little or no need for communication between processes, but it also supports parallel applications that need to share state among processes. Examples of applications that run with Lithops include Monte Carlo simulations, deep learning and machine learning processes, metabolomics computations, and geospatial analytics, to name a few.
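As a rough illustration of the "little or no communication" workloads mentioned above, the sketch below estimates pi with a Monte Carlo simulation split into independent tasks. The helper names `count_hits` and `estimate_pi` are hypothetical and not part of Lithops. The code runs locally with the built-in `map`; the commented lines at the end show how a pool's `map` method might be substituted, assuming Lithops is installed and configured.

```python
import random


def count_hits(task):
    """Count random points that land inside the unit quarter circle."""
    seed, samples = task
    rng = random.Random(seed)  # per-task generator, so tasks share no state
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(map_fn, tasks=8, samples_per_task=20_000):
    """Estimate pi by summing the partial counts of independent tasks.

    map_fn is any callable that behaves like the built-in map.
    """
    work = [(seed, samples_per_task) for seed in range(tasks)]
    hits = sum(map_fn(count_hits, work))
    return 4.0 * hits / (tasks * samples_per_task)


if __name__ == "__main__":
    # Sequential run with the built-in map.
    print(estimate_pi(map))
    # Assumption: with Lithops installed, a pool's map could be passed instead:
    #   from lithops.multiprocessing import Pool
    #   with Pool() as pool:
    #       print(estimate_pi(pool.map))
```

Because each task carries its own seed and returns a single integer, the tasks never need to talk to each other, which is the property that makes this kind of job easy to spread over many workers.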
## Installation
1. Install Lithops from the PyPI repository:
```bash
pip install lithops
```
2. Execute a *Hello World* function:
```bash
lithops hello
```
## Configuration
Lithops provides an extensible backend architecture (compute, storage) designed to work with different cloud providers and on-premise backends. You can write your code in Python and run it unmodified on IBM Cloud, AWS, Azure, Google Cloud, Aliyun, and Kubernetes or OpenShift.
[Follow these instructions to configure your compute and storage backends](config/)
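A minimal sketch of what a compute/storage selection might look like when built in Python rather than in a configuration file. The helper `make_config` is hypothetical, and the backend names, section names, and keys shown here are assumptions that should be checked against the configuration instructions linked above.

```python
def make_config(backend="localhost", storage="localhost", **sections):
    """Build a config dictionary with one compute and one storage backend.

    Extra keyword arguments become additional top-level sections,
    for example provider credentials.
    """
    config = {"lithops": {"backend": backend, "storage": storage}}
    config.update(sections)
    return config


# Everything runs on the local machine (assumed backend name: 'localhost').
local_config = make_config()

# Assumed backend names and a placeholder region; replace with real values.
cloud_config = make_config("aws_lambda", "aws_s3", aws={"region": "us-east-1"})

if __name__ == "__main__":
    print(local_config)
    # Assumption: the dictionary can be handed to the executor, e.g.
    #   from lithops import FunctionExecutor
    #   fexec = FunctionExecutor(config=local_config)
```

Keeping the backend choice in data rather than in code is what lets the same functions move between providers without modification.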
## High-level API
Lithops ships with two high-level Compute APIs and two high-level Storage APIs.
```python
from lithops import FunctionExecutor

def hello(name):
    return f'Hello {name}!'

with FunctionExecutor() as fexec:
    f = fexec.call_async(hello, 'World')
    print(f.result())
```

```python
from lithops.multiprocessing import Pool

def double(i):
    return i * 2

with Pool() as pool:
    result = pool.map(double, [1, 2, 3, 4])
    print(result)
```
```python
from lithops import Storage

if __name__ == "__main__":
    st = Storage()
    st.put_object(bucket='mybucket',
                  key='test.txt',
                  body='Hello World')

    # Read the object back from the same bucket it was written to
    print(st.get_object(bucket='mybucket',
                        key='test.txt'))
```

```python
from lithops.storage.cloud_proxy import os

if __name__ == "__main__":
    filepath = 'bar/foo.txt'
    with os.open(filepath, 'w') as f:
        f.write('Hello world!')

    dirname = os.path.dirname(filepath)
    print(os.listdir(dirname))
    os.remove(filepath)
```
You can find more usage examples in the [examples](/examples) folder.
## Execution Modes
Lithops is shipped with 3 different modes of execution. The execution mode allows you to decide where and how the functions are executed.
* [Localhost Mode](docs/source/execution_modes.rst#localhost-mode)
This mode allows you to execute functions on the local machine using processes, providing a convenient way to leverage Lithops' distributed computing capabilities without relying on cloud resources. This mode is particularly useful for development, testing, and debugging purposes. This is the default mode of execution if no configuration is provided.
* [Serverless Mode](docs/source/execution_modes.rst#serverless-mode)
This mode allows you to execute functions on popular serverless compute services, leveraging the scalability, isolation, and automatic resource provisioning provided by these platforms. With serverless mode, you can easily parallelize task execution, harness the elastic nature of serverless environments, and simplify the development and deployment of scalable data processing workloads and parallel applications.
* [Standalone Mode](docs/source/execution_modes.rst#standalone-mode)
This mode provides the capability to execute functions on one or multiple virtual machines (VMs) simultaneously, in a serverless-like fashion, without requiring manual provisioning as everything is automatically created. This mode can be used in a private cluster or in the cloud, where functions within each VM are executed using parallel processes.
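The three modes above can be sketched as a choice of executor class. This is an illustrative sketch, not the project's recommended pattern: `make_executor`, `run_squares`, and `EXECUTOR_BY_MODE` are hypothetical names, and the code assumes the `lithops` package exports `LocalhostExecutor`, `ServerlessExecutor`, and `StandaloneExecutor` alongside `FunctionExecutor`. Lithops is imported lazily, so the module loads even where it is not installed.

```python
# Assumed executor class names in the lithops package, keyed by execution mode.
EXECUTOR_BY_MODE = {
    'localhost': 'LocalhostExecutor',
    'serverless': 'ServerlessExecutor',
    'standalone': 'StandaloneExecutor',
}


def make_executor(mode='localhost', **kwargs):
    """Return an executor for the requested execution mode."""
    if mode not in EXECUTOR_BY_MODE:
        raise ValueError(f'unknown execution mode: {mode!r}')
    import lithops  # imported lazily; only needed once a mode is chosen
    return getattr(lithops, EXECUTOR_BY_MODE[mode])(**kwargs)


def square(x):
    return x * x


def run_squares(mode, values):
    """Map square() over values in the given mode and collect the results."""
    with make_executor(mode) as fexec:
        futures = fexec.map(square, values)
        return fexec.get_result(futures)


if __name__ == '__main__':
    # Requires Lithops; localhost mode needs no cloud configuration.
    print(run_squares('localhost', [1, 2, 3, 4]))
```

The function being executed stays the same in every mode; only the executor that runs it changes.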
## Documentation
For documentation on using Lithops, see the [latest release documentation](https://lithops-cloud.github.io/docs/) or the [current GitHub docs](docs/user_guide.md).
If you are interested in contributing, see [CONTRIBUTING.md](./CONTRIBUTING.md).
## Additional resources
### Blogs and Talks
* [Simplify the developer experience with OpenShift for Big Data processing by using Lithops framework](https://medium.com/@gvernik/simplify-the-developer-experience-with-openshift-for-big-data-processing-by-using-lithops-framework-d62a795b5e1c)
* [Speed-up your Python applications using Lithops and Serverless Cloud resources](https://itnext.io/speed-up-your-python-applications-using-lithops-and-serverless-cloud-resources-a64beb008bb5)
* [Serverless Without Constraints](https://www.ibm.com/cloud/blog/serverless-without-constraints)
* [Lithops, a Multi-cloud Serverless Programming Framework](https://itnext.io/lithops-a-multi-cloud-serverless-programming-framework-fd97f0d5e9e4)
* [CNCF Webinar - Toward Hybrid Cloud Serverless Transparency with Lithops Framework](https://www.youtube.com/watch?v=-uS-wi8CxBo)
* [Using Serverless to Run Your Python Code on 1000 Cores by Changing Two Lines of Code](https://www.ibm.com/cloud/blog/using-serverless-to-run-your-python-code-on-1000-cores-by-changing-two-lines-of-code)
* [Decoding dark molecular matter in spatial metabolomics with IBM Cloud Functions](https://www.ibm.com/cloud/blog/decoding-dark-molecular-matter-in-spatial-metabolomics-with-ibm-cloud-functions)
* [Your easy move to serverless computing and radically simplified data processing](https://www.slideshare.net/gvernik/your-easy-move-to-serverless-computing-and-radically-simplified-data-processing-238929020) Strata Data Conference, NY 2019. See video of Lithops usage [here](https://www.youtube.com/watch?v=EYa95KyYEtg&list=PLpR7f3Www9KCjYisaG7AMaR0C2GqLUh2G&index=3&t=0s) and the example of Monte Carlo [here](https://www.youtube.com/watch?v=vF5HI2q5VKw&list=PLpR7f3Www9KCjYisaG7AMaR0C2GqLUh2G&index=2&t=0s)
* [Speed up data pre-processing with Lithops in deep learning](https://developer.ibm.com/patterns/speed-up-data-pre-processing-with-pywren-in-deep-learning/)
* [Predicting the future with Monte Carlo simulations over IBM Cloud Functions](https://www.ibm.com/cloud/blog/monte-carlo-simulations-with-ibm-cloud-functions)
* [Process large data sets at massive scale with Lithops over IBM Cloud Functions](https://www.ibm.com/cloud/blog/process-large-data-sets-massive-scale-pywren-ibm-cloud-functions)
* [Industrial project in Technion on Lithops](http://www.cs.technion.ac.il/~cs234313/projects_sites/W19/04/site/)
### Papers
* [Outsourcing Data Processing Jobs with Lithops](https://ieeexplore.ieee.org/document/9619947) - IEEE Transactions on Cloud Computing 2022
* [Towards Multicloud Access Transparency in Serverless Computing](https://www.computer.org/csdl/magazine/so/5555/01/09218932/1nMMkpZ8Ko8) - IEEE Software 2021
* [Primula: a Practical Shuffle/Sort Operator for Serverless Computing](https://dl.acm.org/doi/10.1145/3429357.3430522) - ACM/IFIP International Middleware Conference 2020. [See presentation here](https://www.youtube.com/watch?v=v698iu5YfWM)
* [Bringing scaling transparency to Proteomics applications with serverless computing](https://dl.acm.org/doi/abs/10.1145/3429880.3430101) - 6th International Workshop on Serverless Computing (WoSC6) 2020. [See presentation here](https://www.serverlesscomputing.org/wosc6/#p10)
* [Serverless data analytics in the IBM Cloud](https://dl.acm.org/citation.cfm?id=3284029) - ACM/IFIP International Middleware Conference 2018
# Acknowledgements
This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 825184 (CloudButton).