{"id":13468798,"url":"https://github.com/robinhood/faust","last_synced_at":"2025-05-13T15:10:38.475Z","repository":{"id":37550387,"uuid":"84353800","full_name":"robinhood/faust","owner":"robinhood","description":"Python Stream Processing","archived":false,"fork":false,"pushed_at":"2024-07-27T10:27:41.000Z","size":8707,"stargazers_count":6787,"open_issues_count":276,"forks_count":534,"subscribers_count":136,"default_branch":"master","last_synced_at":"2025-05-12T23:45:56.361Z","etag":null,"topics":["asyncio","distributed-systems","kafka","kafka-streams","python","stream-processing"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/robinhood.png","metadata":{"files":{"readme":"README.rst","changelog":"Changelog.rst","contributing":"CONTRIBUTING.rst","funding":null,"license":"LICENSE","code_of_conduct":".github/CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS.rst","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-03-08T18:36:11.000Z","updated_at":"2025-05-12T14:46:29.000Z","dependencies_parsed_at":"2023-02-08T17:30:28.012Z","dependency_job_id":"bf136e26-7690-4158-bde2-39ab32fe6f97","html_url":"https://github.com/robinhood/faust","commit_stats":{"total_commits":3761,"total_committers":93,"mean_commits":40.44086021505376,"dds":0.4150491890454666,"last_synced_commit":"14f65ee6f2810ecab3cf3a8888757949dd12dbd8"},"previous_names":[],"tags_count":135,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/robinhood%2Ffaust","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/robinhood%2Ffaust/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/r
epositories/robinhood%2Ffaust/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/robinhood%2Ffaust/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/robinhood","download_url":"https://codeload.github.com/robinhood/faust/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253969248,"owners_count":21992263,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["asyncio","distributed-systems","kafka","kafka-streams","python","stream-processing"],"created_at":"2024-07-31T15:01:19.292Z","updated_at":"2025-05-13T15:10:33.458Z","avatar_url":"https://github.com/robinhood.png","language":"Python","funding_links":[],"categories":["Cluster Computing","Python","Table of Contents","资源列表","Distributed Computing","Libraries","分布式计算","Data Processing","数据管道和流处理","Data Stream Processing","_Table of Contents_","Development","Distributed Computing [🔝](#readme)","其他","Awesome Python"],"sub_categories":["Streaming Engine","分布式计算","Kafkaesque","On-Prem","Kafka Streams Libraries","Cluster Computing","Drone Frames"],"readme":".. XXX Need to change this image to readthedocs before release\n\n.. image:: https://raw.githubusercontent.com/robinhood/faust/8ee5e209322d9edf5bdb79b992ef986be2de4bb4/artwork/banner-alt1.png\n\n===========================\n Deprecation Notice\n===========================\n\nThis library has been deprecated and no longer managed or supported. 
The current active community project can be found at https://github.com/faust-streaming/faust\n\n===========================\n Python Stream Processing\n===========================\n\n|build-status| |coverage| |license| |wheel| |pyversion| |pyimp|\n\n:Version: 1.10.4\n:Web: http://faust.readthedocs.io/\n:Download: http://pypi.org/project/faust\n:Source: http://github.com/robinhood/faust\n:Keywords: distributed, stream, async, processing, data, queue, state management\n\n\n.. sourcecode:: python\n\n    # Python Streams\n    # Forever scalable event processing \u0026 in-memory durable K/V store;\n    # as a library w/ asyncio \u0026 static typing.\n    import faust\n\n**Faust** is a stream processing library, porting the ideas from\n`Kafka Streams`_ to Python.\n\nIt is used at `Robinhood`_ to build high performance distributed systems\nand real-time data pipelines that process billions of events every day.\n\nFaust provides both *stream processing* and *event processing*,\nsharing similarity with tools such as\n`Kafka Streams`_, `Apache Spark`_/`Storm`_/`Samza`_/`Flink`_,\n\nIt does not use a DSL, it's just Python!\nThis means you can use all your favorite Python libraries\nwhen stream processing: NumPy, PyTorch, Pandas, NLTK, Django,\nFlask, SQLAlchemy, ++\n\nFaust requires Python 3.6 or later for the new `async/await`_ syntax,\nand variable type annotations.\n\nHere's an example processing a stream of incoming orders:\n\n.. 
sourcecode:: python\n\n    app = faust.App('myapp', broker='kafka://localhost')\n\n    # Models describe how messages are serialized:\n    # {\"account_id\": \"3fae-...\", \"amount\": 3}\n    class Order(faust.Record):\n        account_id: str\n        amount: int\n\n    @app.agent(value_type=Order)\n    async def order(orders):\n        async for order in orders:\n            # process infinite stream of orders.\n            print(f'Order for {order.account_id}: {order.amount}')\n\nThe Agent decorator defines a \"stream processor\" that essentially\nconsumes from a Kafka topic and does something for every event it receives.\n\nThe agent is an ``async def`` function, so it can also perform\nother operations asynchronously, such as web requests.\n\nThis system can persist state, acting like a database.\nTables are named distributed key/value stores you can use\nas regular Python dictionaries.\n\nTables are stored locally on each machine using a super fast\nembedded database written in C++, called `RocksDB`_.\n\nTables can also store aggregate counts that are optionally \"windowed\"\nso you can keep track\nof \"number of clicks from the last day,\" or\n\"number of clicks in the last hour,\" for example. Like `Kafka Streams`_,\nwe support tumbling, hopping and sliding windows of time, and old windows\ncan be expired to stop data from filling up.\n\nFor reliability we use a Kafka topic as a \"write-ahead log\".\nWhenever a key is changed we publish to the changelog.\nStandby nodes consume from this changelog to keep an exact replica\nof the data, which enables instant recovery should any of the nodes fail.\n\nTo the user a table is just a dictionary, but data is persisted between\nrestarts and replicated across nodes so on failover other nodes can take over\nautomatically.\n\nYou can count page views by URL:\n\n.. sourcecode:: python\n\n    # data sent to 'clicks' topic sharded by URL key.\n    # e.g. 
key=\"http://example.com\" value=\"1\"\n    click_topic = app.topic('clicks', key_type=str, value_type=int)\n\n    # default value for missing URL will be 0 with `default=int`\n    counts = app.Table('click_counts', default=int)\n\n    @app.agent(click_topic)\n    async def count_click(clicks):\n        async for url, count in clicks.items():\n            counts[url] += count\n\nThe data sent to the Kafka topic is partitioned, which means\nthe clicks will be sharded by URL in such a way that every count\nfor the same URL will be delivered to the same Faust worker instance.\n\n\nFaust supports any type of stream data: bytes, Unicode and serialized\nstructures, but also comes with \"Models\" that use modern Python\nsyntax to describe how keys and values in streams are serialized:\n\n.. sourcecode:: python\n\n    # Order is a JSON-serialized dictionary,\n    # having these fields:\n\n    class Order(faust.Record):\n        account_id: str\n        product_id: str\n        price: float\n        quantity: float = 1.0\n\n    orders_topic = app.topic('orders', key_type=str, value_type=Order)\n\n    @app.agent(orders_topic)\n    async def process_order(orders):\n        async for order in orders:\n            # process each order using regular Python\n            total_price = order.price * order.quantity\n            await send_order_received_email(order.account_id, order)\n\nFaust is statically typed, using the ``mypy`` type checker,\nso you can take advantage of static types when writing applications.\n\nThe Faust source code is small, well organized, and serves as a good\nresource for learning the implementation of `Kafka Streams`_.\n\n**Learn more about Faust in the** `introduction`_ **page**\n    to read more about Faust, system requirements, installation instructions,\n    community resources, and more.\n\n**or go directly to the** `quickstart`_ **tutorial**\n    to see Faust in action by programming a streaming application.\n\n**then explore the** 
`User Guide`_\n    for in-depth information organized by topic.\n\n.. _`Robinhood`: http://robinhood.com\n.. _`async/await`:\n    https://medium.freecodecamp.org/a-guide-to-asynchronous-programming-in-python-with-asyncio-232e2afa44f6\n.. _`Celery`: http://celeryproject.org\n.. _`Kafka Streams`: https://kafka.apache.org/documentation/streams\n.. _`Apache Spark`: http://spark.apache.org\n.. _`Storm`: http://storm.apache.org\n.. _`Samza`: http://samza.apache.org\n.. _`Flink`: http://flink.apache.org\n.. _`RocksDB`: http://rocksdb.org\n.. _`Apache Kafka`: https://kafka.apache.org\n\n.. _`introduction`: http://faust.readthedocs.io/en/latest/introduction.html\n\n.. _`quickstart`: http://faust.readthedocs.io/en/latest/playbooks/quickstart.html\n\n.. _`User Guide`: http://faust.readthedocs.io/en/latest/userguide/index.html\n\nFaust is...\n===========\n\n**Simple**\n    Faust is extremely easy to use. To get started using other stream processing\n    solutions you have complicated hello-world projects, and\n    infrastructure requirements.  
Faust only requires Kafka;\n    the rest is just Python, so if you know Python you can already use Faust to do\n    stream processing, and it can integrate with just about anything.\n\n    Here's one of the easier applications you can make::\n\n        import faust\n\n        class Greeting(faust.Record):\n            from_name: str\n            to_name: str\n\n        app = faust.App('hello-app', broker='kafka://localhost')\n        topic = app.topic('hello-topic', value_type=Greeting)\n\n        @app.agent(topic)\n        async def hello(greetings):\n            async for greeting in greetings:\n                print(f'Hello from {greeting.from_name} to {greeting.to_name}')\n\n        @app.timer(interval=1.0)\n        async def example_sender(app):\n            await hello.send(\n                value=Greeting(from_name='Faust', to_name='you'),\n            )\n\n        if __name__ == '__main__':\n            app.main()\n\n    You're probably a bit intimidated by the `async` and `await` keywords,\n    but you don't have to know how ``asyncio`` works to use\n    Faust: just mimic the examples, and you'll be fine.\n\n    The example application starts two tasks: one is processing a stream,\n    the other is a background task sending events to that stream.\n    In a real-life application, your system will publish\n    events to Kafka topics that your processors can consume from,\n    and the background task is only needed to feed data into our\n    example.\n\n**Highly Available**\n    Faust is highly available and can survive network problems and server\n    crashes.  
In the case of node failure, it can automatically recover,\n    and tables have standby nodes that will take over.\n\n**Distributed**\n    Start more instances of your application as needed.\n\n**Fast**\n    A single-core Faust worker instance can already process tens of thousands\n    of events every second, and we are reasonably confident that throughput will\n    increase once we can support a more optimized Kafka client.\n\n**Flexible**\n    Faust is just Python, and a stream is an infinite asynchronous iterator.\n    If you know how to use Python, you already know how to use Faust,\n    and it works with your favorite Python libraries like Django, Flask,\n    SQLAlchemy, NLTK, NumPy, SciPy, TensorFlow, etc.\n\nInstallation\n============\n\nYou can install Faust either via the Python Package Index (PyPI)\nor from source.\n\nTo install using `pip`:\n\n.. sourcecode:: console\n\n    $ pip install -U faust\n\n.. _bundles:\n\nBundles\n-------\n\nFaust also defines a group of ``setuptools`` extensions that can be used\nto install Faust and the dependencies for a given feature.\n\nYou can specify these in your requirements or on the ``pip``\ncommand line by using brackets. Separate multiple bundles with commas:\n\n.. sourcecode:: console\n\n    $ pip install \"faust[rocksdb]\"\n\n    $ pip install \"faust[rocksdb,uvloop,fast,redis]\"\n\nThe following bundles are available:\n\nStores\n~~~~~~\n\n:``faust[rocksdb]``:\n    for using `RocksDB`_ for storing Faust table state.\n\n    **Recommended in production.**\n\n\n.. 
_`RocksDB`: http://rocksdb.org\n\nCaching\n~~~~~~~\n\n:``faust[redis]``:\n    for using Redis as a simple caching backend (Memcached-style).\n\nCodecs\n~~~~~~\n\n:``faust[yaml]``:\n    for using YAML and the ``PyYAML`` library in streams.\n\nOptimization\n~~~~~~~~~~~~\n\n:``faust[fast]``:\n    for installing all the available C speedup extensions to Faust core.\n\nSensors\n~~~~~~~\n\n:``faust[datadog]``:\n    for using the Datadog Faust monitor.\n\n:``faust[statsd]``:\n    for using the Statsd Faust monitor.\n\nEvent Loops\n~~~~~~~~~~~\n\n:``faust[uvloop]``:\n    for using Faust with ``uvloop``.\n\n:``faust[eventlet]``:\n    for using Faust with ``eventlet``.\n\nDebugging\n~~~~~~~~~\n\n:``faust[debug]``:\n    for using ``aiomonitor`` to connect and debug a running Faust worker.\n\n:``faust[setproctitle]``:\n    when the ``setproctitle`` module is installed the Faust worker will\n    use it to set a nicer process name in ``ps``/``top`` listings.\n    Also installed with the ``fast`` and ``debug`` bundles.\n\nDownloading and installing from source\n--------------------------------------\n\nDownload the latest version of Faust from\nhttp://pypi.org/project/faust\n\nYou can install it by doing:\n\n.. sourcecode:: console\n\n    $ tar xvfz faust-0.0.0.tar.gz\n    $ cd faust-0.0.0\n    $ python setup.py build\n    # python setup.py install\n\nThe last command must be executed as a privileged user if\nyou are not currently using a virtualenv.\n\nUsing the development version\n-----------------------------\n\nWith pip\n~~~~~~~~\n\nYou can install the latest snapshot of Faust using the following\n``pip`` command:\n\n.. sourcecode:: console\n\n    $ pip install https://github.com/robinhood/faust/zipball/master#egg=faust\n\n.. 
_`User Guide`: http://faust.readthedocs.io/en/latest/userguide/index.html\n\nFAQ\n===\n\nCan I use Faust with Django/Flask/etc.?\n---------------------------------------\n\nYes! Use ``eventlet`` as a bridge to integrate with ``asyncio``.\n\n\nUsing ``eventlet``\n~~~~~~~~~~~~~~~~~~~~~~\n\nThis approach works with any blocking Python library that can work with\n``eventlet``.\n\nUsing ``eventlet`` requires you to install the ``aioeventlet`` module,\nand you can install this as a bundle along with Faust:\n\n.. sourcecode:: console\n\n    $ pip install -U faust[eventlet]\n\nThen to actually use eventlet as the event loop you have to either\nuse the ``-L \u003cfaust --loop\u003e`` argument to the ``faust`` program:\n\n.. sourcecode:: console\n\n    $ faust -L eventlet -A myproj worker -l info\n\nor add ``import mode.loop.eventlet`` at the top of your entry point script:\n\n.. sourcecode:: python\n\n    #!/usr/bin/env python3\n    import mode.loop.eventlet  # noqa\n\n.. warning::\n\n    It's very important this is at the very top of the module,\n    and that it executes before you import libraries.\n\nCan I use Faust with Tornado?\n-----------------------------\n\nYes! Use the ``tornado.platform.asyncio`` bridge:\nhttp://www.tornadoweb.org/en/stable/asyncio.html\n\nCan I use Faust with Twisted?\n-----------------------------\n\nYes! Use the ``asyncio`` reactor implementation:\nhttps://twistedmatrix.com/documents/17.1.0/api/twisted.internet.asyncioreactor.html\n\nWill you support Python 2.7 or Python 3.5?\n------------------------------------------\n\nNo. Faust requires Python 3.6 or later, since it heavily uses features that were\nintroduced in Python 3.6 (`async`, `await`, variable type annotations).\n\nI get a maximum number of open files exceeded error by RocksDB when running a Faust app locally. 
How can I fix this?\n--------------------------------------------------------------------------------------------------------------------\n\nYou may need to increase the limit for the maximum number of open files. The\nfollowing post explains how to do so on OS X:\nhttps://blog.dekstroza.io/ulimit-shenanigans-on-osx-el-capitan/\n\n\nWhich Kafka versions does Faust support?\n----------------------------------------\n\nFaust supports Kafka versions \u003e= 0.10.\n\n.. _getting-help:\n\nGetting Help\n============\n\n.. _slack-channel:\n\nSlack\n-----\n\nFor discussions about the usage, development, and future of Faust,\nplease join the `fauststream`_ Slack.\n\n* https://fauststream.slack.com\n* Sign-up: https://join.slack.com/t/fauststream/shared_invite/enQtNDEzMTIyMTUyNzU2LTIyMjNjY2M2YzA2OWFhMDlmMzVkODk3YTBlYThlYmZiNTUwZDJlYWZiZTdkN2Q4ZGU4NWM4YWMyNTM5MGQ5OTg\n\nResources\n=========\n\n.. _bug-tracker:\n\nBug tracker\n-----------\n\nIf you have any suggestions, bug reports, or annoyances, please report them\nto our issue tracker at https://github.com/robinhood/faust/issues/\n\n.. _license:\n\nLicense\n=======\n\nThis software is licensed under the `New BSD License`. See the ``LICENSE``\nfile in the top distribution directory for the full license text.\n\n.. # vim: syntax=rst expandtab tabstop=4 shiftwidth=4 shiftround\n\n.. 
_`User Guide`: http://faust.readthedocs.io/en/latest/userguide/index.html\n\nContributing\n============\n\nDevelopment of `Faust` happens at GitHub: https://github.com/robinhood/faust\n\nYou're highly encouraged to participate in the development\nof `Faust`.\n\nBe sure to also read the `Contributing to Faust`_ section in the\ndocumentation.\n\n.. _`Contributing to Faust`:\n    http://faust.readthedocs.io/en/latest/contributing.html\n\nCode of Conduct\n===============\n\nEveryone interacting in the project's code bases, issue trackers, chat rooms,\nand mailing lists is expected to follow the Faust Code of Conduct.\n\nAs contributors and maintainers of these projects, and in the interest of fostering\nan open and welcoming community, we pledge to respect all people who contribute\nthrough reporting issues, posting feature requests, updating documentation,\nsubmitting pull requests or patches, and other activities.\n\nWe are committed to making participation in these projects a harassment-free\nexperience for everyone, regardless of level of experience, gender,\ngender identity and expression, sexual orientation, disability,\npersonal appearance, body size, race, ethnicity, age,\nreligion, or nationality.\n\nExamples of unacceptable behavior by participants include:\n\n* The use of sexualized language or imagery\n* Personal attacks\n* Trolling or insulting/derogatory comments\n* Public or private harassment\n* Publishing other's private information, such as physical\n  or electronic addresses, without explicit permission\n* Other unethical or unprofessional conduct.\n\nProject maintainers have the right and responsibility to remove, edit, or reject\ncomments, commits, code, wiki edits, issues, and other contributions that are\nnot aligned to this Code of Conduct. By adopting this Code of Conduct,\nproject maintainers commit themselves to fairly and consistently applying\nthese principles to every aspect of managing this project. 
Project maintainers\nwho do not follow or enforce the Code of Conduct may be permanently removed from\nthe project team.\n\nThis code of conduct applies both within project spaces and in public spaces\nwhen an individual is representing the project or its community.\n\nInstances of abusive, harassing, or otherwise unacceptable behavior may be\nreported by opening an issue or contacting one or more of the project maintainers.\n\nThis Code of Conduct is adapted from the Contributor Covenant,\nversion 1.2.0 available at http://contributor-covenant.org/version/1/2/0/.\n\n.. _`introduction`: http://faust.readthedocs.io/en/latest/introduction.html\n\n.. _`quickstart`: http://faust.readthedocs.io/en/latest/playbooks/quickstart.html\n\n.. _`User Guide`: http://faust.readthedocs.io/en/latest/userguide/index.html\n\n.. |build-status| image:: https://secure.travis-ci.org/robinhood/faust.png?branch=master\n    :alt: Build status\n    :target: https://travis-ci.org/robinhood/faust\n\n.. |coverage| image:: https://codecov.io/github/robinhood/faust/coverage.svg?branch=master\n    :target: https://codecov.io/github/robinhood/faust?branch=master\n\n.. |license| image:: https://img.shields.io/pypi/l/faust.svg\n    :alt: BSD License\n    :target: https://opensource.org/licenses/BSD-3-Clause\n\n.. |wheel| image:: https://img.shields.io/pypi/wheel/faust.svg\n    :alt: faust can be installed via wheel\n    :target: http://pypi.org/project/faust/\n\n.. |pyversion| image:: https://img.shields.io/pypi/pyversions/faust.svg\n    :alt: Supported Python versions.\n    :target: http://pypi.org/project/faust/\n\n.. |pyimp| image:: https://img.shields.io/pypi/implementation/faust.svg\n    :alt: Support Python implementations.\n    :target: http://pypi.org/project/faust/\n\n.. _`introduction`: http://faust.readthedocs.io/en/latest/introduction.html\n\n.. _`quickstart`: http://faust.readthedocs.io/en/latest/playbooks/quickstart.html\n\n.. 
_`User Guide`: http://faust.readthedocs.io/en/latest/userguide/index.html\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frobinhood%2Ffaust","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frobinhood%2Ffaust","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frobinhood%2Ffaust/lists"}