{"id":28486573,"url":"https://github.com/zipfile/python-sonyflake-turbo","last_synced_at":"2026-03-07T08:02:37.546Z","repository":{"id":297448212,"uuid":"996792729","full_name":"ZipFile/python-sonyflake-turbo","owner":"ZipFile","description":"A Sonyflake ID generator tailored for high-volume ID generation","archived":false,"fork":false,"pushed_at":"2025-10-04T11:45:32.000Z","size":47,"stargazers_count":4,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-10-04T12:28:13.613Z","etag":null,"topics":["cpython-extensions","id-generator","python","sonyflake"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/sonyflake-turbo/","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ZipFile.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-06-05T13:23:01.000Z","updated_at":"2025-10-04T11:44:33.000Z","dependencies_parsed_at":null,"dependency_job_id":"1759e0f2-5e2a-4222-bc8a-50a89e13e2dd","html_url":"https://github.com/ZipFile/python-sonyflake-turbo","commit_stats":null,"previous_names":["zipfile/python-sonyflake-turbo"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/ZipFile/python-sonyflake-turbo","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZipFile%2Fpython-sonyflake-turbo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZipFile%2Fpython-sonyflake-turbo/tags","releases_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZipFile%2Fpython-sonyflake-turbo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZipFile%2Fpython-sonyflake-turbo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ZipFile","download_url":"https://codeload.github.com/ZipFile/python-sonyflake-turbo/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZipFile%2Fpython-sonyflake-turbo/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278838472,"owners_count":26054721,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-07T02:00:06.786Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cpython-extensions","id-generator","python","sonyflake"],"created_at":"2025-06-08T01:35:14.449Z","updated_at":"2026-03-07T08:02:37.515Z","avatar_url":"https://github.com/ZipFile.png","language":"C","readme":"========================\nPython SonyFlake (Turbo)\n========================\n\nA `SonyFlake \u003chttps://github.com/sony/sonyflake\u003e`_ ID generator tailored for\nhigh-volume ID generation.\n\nInstallation\n============\n\n.. code-block:: sh\n\n    pip install sonyflake-turbo\n\nUsage\n=====\n\nEasy mode:\n\n.. 
code-block:: python\n\n    from sonyflake_turbo import SonyFlake\n\n    sf = SonyFlake(0x1337, 0xCAFE, start_time=1749081600)\n\n    print(\"one\", next(sf))\n    print(\"n\", sf(5))\n\n    for id_ in sf:\n        print(\"iter\", id_)\n        break\n\nTurbo mode:\n\n.. code-block:: python\n\n    from time import time_ns\n    from timeit import timeit\n\n    from sonyflake_turbo import MachineIDLCG, SonyFlake\n\n    get_machine_id = MachineIDLCG(time_ns())\n    EPOCH = 1749081600  # 2025-06-05T00:00:00Z\n\n    for count in [32, 16, 8, 4, 2, 1]:\n        machine_ids = [get_machine_id() for _ in range(count)]\n        sf = SonyFlake(*machine_ids, start_time=EPOCH)\n        t = timeit(lambda: sf(1000), number=1000)\n        print(f\"Speed: 1M ids / {t:.2f}sec with {count} machine IDs\")\n\nAsync:\n\n.. code-block:: python\n\n    import anyio\n    from sonyflake_turbo import AsyncSonyFlake, SonyFlake\n\n    sf = SonyFlake(0x1337, 0xCAFE, start_time=1749081600)\n    asf = AsyncSonyFlake(sf, sleep=anyio.sleep)  # defaults to asyncio.sleep\n\n    print(\"one\", await asf)\n    print(\"n\", await asf(5))\n\n    async for id_ in asf:\n        print(\"aiter\", id_)\n        break\n\nImportant Notes\n===============\n\nVanilla SonyFlake Difference\n----------------------------\n\nIn vanilla SonyFlake, whenever counter overflows, it simply waits for the next\n10ms window. Which severely limits the throughput. I.e. single generator\nproduces 256ids/10ms.\n\nTurbo version is basically the same as vanilla SonyFlake, except it accepts\nmore than one Machine ID in constructor args. On counter overflow, it advances\nto the next \"unexhausted\" Machine ID and resumes the generation. 
Waiting for\nthe next 10ms window happens only when all of the Machine IDs are exhausted.\n\nThis behavior is not much different from having multiple vanilla ID generators\nin parallel, but by doing so we ensure produced IDs are always monotonically\nincreasing (per generator instance) and avoid potential concurrency issues\n(by not doing concurrency).\n\nA few other features in comparison to other SonyFlake implementations found in\nthe wild:\n\n* Optional C extension module, for extra performance in CPython.\n* Async-framework-agnostic wrapper.\n* Thread-safe. Also has free-threading/nogil support [#]_.\n\n.. note::\n\n    Safe for concurrent use; internal locking ensures correctness. Sleeps are\n    always done after internal state updates.\n\n.. [#] \"nogil\" wheels are not published to PyPI; you have to install this\n       package with the ``--no-binary sonyflake-turbo`` flag.\n.. _Locks: https://docs.python.org/3/library/threading.html#lock-objects\n\nMachine IDs\n-----------\n\nMachine ID is a 16 bit integer in range ``0x0000`` to ``0xFFFF``. Machine IDs\nare encoded as part of the SonyFlake ID:\n\n+----+-----------------+------------+---------+\n|    | Time            | Machine ID | Counter |\n+====+=================+============+=========+\n| 0x | 0874AD4993 [#]_ | CAFE       | 04      |\n+----+-----------------+------------+---------+\n\nSonyFlake IDs, in spirit, are UUIDv6_, but compressed down to 64 bit.\nUnfortunately, we do not have the luxury of having 48 bits for encoding node id\n(UUID equivalent of SonyFlake's Machine ID). UUID standard proposes to use\npseudo-random value for this field, which is sub-optimal for our case due to\nhigh risk of collisions.\n\nVanilla SonyFlake, on the other hand, used lower 16 bits of the private IP\naddress. This sort of works, but has two major drawbacks:\n\n1. It assumes you have *exactly one* ID generator per machine in your network.\n2. 
You're leaking some of your infrastructure info.\n\nIn the modern world (k8s, \"lambdas\", etc...), both of these fall apart:\n\n1. Single machine often runs multiple different processes and/or threads.\n   More often than not they're isolated enough to be able to successfully\n   coordinate ID generation.\n2. Security aspect aside, container IPs within cluster network are not\n   something globally unique, especially if trimmed down to 16 bit.\n\nSolving this issue is up to you, as a developer. This particular library does\nnot include Machine ID management logic, so you are responsible for\ncoordinating Machine IDs in your deployment.\n\nTask is not trivial, but neither is impossible. Here are a few ideas:\n\n* Coordinate ID assignment via something like etcd_ or ZooKeeper_ using lease_\n  pattern. Optimal, but a bit bothersome to implement.\n* Reinvent Twitter's SnowFlake_ by having a centralized service/sidecar. Extra\n  round-trips SonyFlake intended to avoid.\n* Assign Machine IDs manually. DevOps team will hate you.\n* Use random Machine IDs. ``If I ignore it, maybe it will go away.jpg``\n\nBut nevertheless, it has one helper class: ``MachineIDLCG``. This is a\nprimitive LCG_-based 16 bit PRNG. It is intended to be used in tests, or in\nsituations where concurrency is not a problem (e.g. desktop or CLI apps).\nYou can also reuse it for generating IDs for a lease to avoid congestion when\ngoing etcd/ZooKeeper route.\n\nHow many Machine IDs you want to allocate per generator is something you\nshould figure out on your own. 
Here's some numbers for you to start\n(generating 1 million SonyFlake IDs):\n\n+--------+-------------+\n| Time   | Machine IDs |\n+========+=============+\n| 1.22s  | 32          |\n+--------+-------------+\n| 2.44s  | 16          |\n+--------+-------------+\n| 4.88s  | 8           |\n+--------+-------------+\n| 9.76s  | 4           |\n+--------+-------------+\n| 19.53s | 2           |\n+--------+-------------+\n| 39.06s | 1           |\n+--------+-------------+\n\n.. [#] 1409529600 + 0x874AD4993 / 100 = 2026-03-05T09:15:19.87Z\n.. _UUIDv6: https://www.rfc-editor.org/rfc/rfc9562.html#name-uuid-version-6\n.. _etcd: https://etcd.io/\n.. _ZooKeeper: https://zookeeper.apache.org/\n.. _SnowFlake: https://en.wikipedia.org/wiki/Snowflake_ID\n.. _lease: https://martinfowler.com/articles/patterns-of-distributed-systems/lease.html\n.. _LCG: https://en.wikipedia.org/wiki/Linear_congruential_generator\n\nClock Rollback\n--------------\n\nThere is no logic to handle clock rollbacks or drift at the moment. If clock\nmoves backward, it will ``sleep()`` (``await sleep()`` in async wrapper)\nuntil time catches up to last timestamp.\n\nStart Time\n----------\n\nSonyFlake ID has 39 bits dedicated for the time component with a resolution of\n10ms. The time is stored relative to ``start_time``. By default it is\n1409529600 (``2014-09-01T00:00:00Z``), but you may want to define your own\n\"epoch\".\n\nMotivation\n----------\n\nSometimes you have to bear with consequences of decisions you've made long\ntime ago. On a project I was leading, I made a decision to utilize SonyFlake.\nEverything was fine until we needed to ingest a lot of data, very quickly.\n\nA flame graph showed we were sleeping way too much. The culprit was\nSonyFlake library we were using at that time. Some RTFM later, it was revealed\nthat the problem was somewhere between the chair and keyboard.\n\nSolution was found rather quickly: just instantiate more generators and cycle\nthrough them about every 256 IDs. 
Nothing could go wrong, right? Aside from\nthe fact that the hack was of questionable quality, it did work.\n\nExcept, we got hit by `Hyrum's Law`_. An unintentional side effect of the hack\nabove was that IDs lost their \"monotonically increasing\" property [#]_. Ofc, some\nof our code and other teams' code depended on this SonyFlake feature. Duh.\n\nAdding even more workarounds like pre-generating IDs, sorting them and then\ningesting was a compelling idea, but it did not feel right. Hence, this library\nwas born.\n\n.. [#] E.g. if you cycle through generators with Machine IDs 0xCAFE and 0x1337,\n       you may get the following IDs: ``0x0874b2a7a0cafe00``,\n       ``0x0874b2a7a0133700``. Even though there are no collisions, sorting\n       them will result in a different order (vs. the order they were\n       generated in).\n.. _Hyrum's Law: https://www.hyrumslaw.com/\n\nWhy should I use it?\n--------------------\n\nIf you're starting a new project, please use UUIDv7_. It is superior to\nSonyFlake in almost every way. It is an internet standard (RFC 9562), it is\nalready available in various languages' standard libraries and is supported by\npopular databases (PostgreSQL, MariaDB, etc...).\n\nOtherwise you might want to use it for one of the following reasons:\n\n* You already use it and encountered problems similar to those mentioned in\n  the `Motivation`_ section.\n* You want to avoid extra round-trips to fetch IDs.\n* Usage of UUIDs is not feasible (legacy codebase, db indexes limited to 64\n  bit integers, etc...) but you still want to benefit from index\n  locality/strict global ordering.\n* As a cheap way to reduce predictability of IDOR_ attacks.\n* Architecture lunatism is still strong within you and you want your code to\n  be DDD-like (e.g. being able to reference an entity before it is stored in\n  DB).\n\n.. _UUIDv7: https://www.rfc-editor.org/rfc/rfc9562.html#name-uuid-version-7\n.. 
_IDOR: https://cheatsheetseries.owasp.org/cheatsheets/Insecure_Direct_Object_Reference_Prevention_Cheat_Sheet.html\n\nDevelopment\n===========\n\nInstall:\n\n.. code-block:: sh\n\n    python3 -m venv env\n    . env/bin/activate\n    pip install -r requirements-dev.txt\n    pip install -e .\n\nRun tests:\n\n.. code-block:: sh\n\n    pytest\n\nBuild wheels:\n\n.. code-block:: sh\n\n    pip install cibuildwheel\n    cibuildwheel\n\nBuild a ``py3-none-any`` wheel (without the C extension):\n\n.. code-block:: sh\n\n    SONYFLAKE_TURBO_BUILD=0 python -m build --wheel\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzipfile%2Fpython-sonyflake-turbo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzipfile%2Fpython-sonyflake-turbo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzipfile%2Fpython-sonyflake-turbo/lists"}