{"id":25239227,"url":"https://github.com/denver-code/for_test_python","last_synced_at":"2025-07-16T16:33:48.521Z","repository":{"id":50589661,"uuid":"519558647","full_name":"denver-code/for_test_python","owner":"denver-code","description":"Speed while generating id from uuid4x4","archived":false,"fork":false,"pushed_at":"2022-07-30T17:27:13.000Z","size":39,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-02-11T18:14:37.804Z","etag":null,"topics":["c","for","generating","loops","python","speed","timepy","uuid4"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/denver-code.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-07-30T15:54:18.000Z","updated_at":"2022-07-30T17:28:26.000Z","dependencies_parsed_at":"2022-09-14T17:42:43.219Z","dependency_job_id":null,"html_url":"https://github.com/denver-code/for_test_python","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/denver-code%2Ffor_test_python","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/denver-code%2Ffor_test_python/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/denver-code%2Ffor_test_python/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/denver-code%2Ffor_test_python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/denver-code","download_url":"https://codeload.github.com/denver-code/for_test
_python/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247393556,"owners_count":20931809,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c","for","generating","loops","python","speed","timepy","uuid4"],"created_at":"2025-02-11T18:14:40.385Z","updated_at":"2025-04-05T19:38:22.773Z","avatar_url":"https://github.com/denver-code.png","language":"Python","readme":"# Speed while generating id from uuid4x4\n\n### Description:\nA simple script for measuring generation speed and predicting the time needed for large numbers of IDs, each built from 4 UUID4s.  \nThe file contains several loops. The first three are just a speed test of the standard for loop.\nLoops 4 to 8 are test cycles in which I tried different generation methods to choose the fastest and most correct solution for me.\n\n# How to run:\n## Use terminal or cmd to run this project.\n```bash\n$ git clone https://github.com/denver-code/for_test_python\n$ cd for_test_python\n$ python -m venv .venv\n# Linux:\n$ source .venv/bin/activate\n# Windows:\n$ .venv\\scripts\\activate.bat\n$ python main.py\n```\n\n## Execution example: \n![Result image](result_main.png)\n\n\n# Checking  \n## Checking forecasts with a real figure:\n\nFor the projected forecast we will use the picture above, which shows the approximate times for different numbers of required IDs.\n## Step 1\nIn the first line, we see the value of the need_items variable, which is 10. 
An example of the ID I need is:\n```cmd\n8e6ca7e8-bc09-4f1f-9313-c2fc10aee4d-231903b8-191f-4fe4-b6b9-27b76aab7a93-87e4d3b1-d3b1-4feb-a9c9-eff58b407 -45b3-8a6e-6caeffde97db\n```\nIts length is 147 characters, of which 128 are letters and digits (the rest are hyphens). That is, assuming 36 possible options for each position, there are 128 such positions.  \nRecalling 11th-grade combinatorics, 36 options in each of 128 positions give 36 ^ 128 combinations (not 128 ^ 36), so we get a huge number of possible IDs. Since a UUID contains only the 16 hexadecimal digits, 36 options per position is an overestimate, but the count is enormous either way.  \nWith a large number of IDs already generated, there may be collisions during generation, but this can be fixed by calling the generator function again (the execution time will increase noticeably).  \nEven the smaller value 128 ^ 36 is approximately:\n```python\n7.237005577332262213973186563043e+75\n```\nor\n\n```python\n94.672220753319050022924562134291\n```  \nI don’t know how much it is exactly, but it’s a lot. Even 128 ^ 10 gives:\n```python\n1180591620717411303424  # 128^10\n```\npossible combinations, and the real number is far greater.  \nThe upper bound of int32 is 2,147,483,647.  \nThat is noticeably less than our number of combinations; bigint or another arbitrary-precision type can fix it (int64 tops out near 9.2e18, which is still too small).  
\n## Step 2\nNow compare the real execution time for each value with the one in the screenshot; again, the screenshot values are approximate.\n\n### Need items = 10\nExecution time:\n```python\n#loop9: 0.00024s\n```\nApproximate time from example: `0.00024s`\n\n### Need items = 20\nExecution time:\n```python\n#loop9: 0.00062s\n```\nApproximate time from example: `0.00053s`\n\n### Need items = 100\nExecution time:\n```python\n#loop9: 0.00331s\n```\nApproximate time from example: `0.00359s`\n\n### Need items = 1000\nExecution time:\n```python\n#loop9: 0.03314s\n```\nApproximate time from example: `0.03587s`\n\n\n### Need items = 10000\nExecution time:\n```python\n#loop9: 1.20738s\n```\nApproximate time from example: `1.07595s`\n\n## Checking conclusion  \n\nOn each computer the result may be different; it depends on the hardware.  \n\nEven on the same computer, you usually have to repeat the same run several times to get the same rounded answer as last time; unrounded, it will never be the same.  \nBecause of this, it is impossible to predict the exact execution time for larger values from the approximate statistics.  \nFor example, the speed test on 10 takes `0.00024` seconds, but this time changes with each run. With rounding we can reproduce \"exactly\" `0.00024`, but without rounding the result is different.  \nFor 20, the example predicts `0.00053` and the real test gives `0.00062`; the difference is small, but it is present.  \nI did not succeed in predicting accurately, because I selected the coefficients by analyzing past runs with a given number of IDs, and the coefficient is different for each multiplier.  \nHowever, we have an approximate time, so we can estimate how long it will take to generate 1000 IDs.  \nAt 10,000 the deviation is slightly larger, because the number of IDs is so much greater than the original 10.  
\n### The higher the number, the more difficult it is to predict even the approximate time. Only by analysis and selection of the coefficient.\n\n# Loops source code\nneed_items = 10\n## loop1\nBasic loop iterating current position.\n```python\ndef loop1() -\u003e int:\n    result = 0\n    for num in range(need_items):\n        result += num\n    return result\n```\n\n## loop2\nBasic loop iterating only one.\n```python\ndef loop2() -\u003e int:\n    result = 0\n    for num in range(need_items):\n        result += 1\n    return result\n```\n\n## loop3\nBasic loop with iteration of one, and output to the console (time slows down noticeably)\n```python\ndef loop3() -\u003e int:\n    result = 0\n    for num in range(need_items):\n        result += 1\n        print(f\"loop3 printed {result=}\", end='\\r')\n    print(end=\"\\n\")\n    return result\n```\n\n## loop4\nthe first code of the generator, which has a nested function for creating an ID, in which a list is formed, and uuid4 is filled 4 times from the for loop.\n```python\ndef loop4() -\u003e int:\n\n    db = []\n\n    def generate_id():\n        _uuid_list = []\n\n        for i in range(1, 5):\n            _uuid_list.append(str(uuid4()))\n\n        _id = \"-\".join(_uuid_list)\n\n        if _id not in db:\n            return _id\n        return generate_id()\n\n    for num in range(need_items):\n        db.append(generate_id())\n\n    return len(db)\n```\n\n## loop5\nEverything is the same as in the 4th loop, but instead of a loop - 4 lines of adding uuid4.\n```python\ndef loop5() -\u003e int:\n\n    db = []\n\n    def generate_id():\n        _uuid_list = []\n\n        _uuid_list.append(str(uuid4()))\n        _uuid_list.append(str(uuid4()))\n        _uuid_list.append(str(uuid4()))\n        _uuid_list.append(str(uuid4()))\n\n        _id = \"-\".join(_uuid_list)\n\n        if _id not in db:\n            return _id\n        return generate_id()\n\n    for num in range(need_items):\n        
db.append(generate_id())\n\n    return len(db)\n```\n\n## loop6\nThe same as loop 5, but instead of appending line by line or in a for loop, the elements are placed directly in the list when it is initialized.\n```python\ndef loop6() -\u003e int:\n\n    db = []\n\n    def generate_id():\n        _uuid_list = [\n            str(uuid4()),\n            str(uuid4()),\n            str(uuid4()),\n            str(uuid4())\n        ]\n\n        _id = \"-\".join(_uuid_list)\n\n        if _id not in db:\n            return _id\n        return generate_id()\n\n    for num in range(need_items):\n        db.append(generate_id())\n\n    return len(db)\n```\n\n## loop7\nWe get rid of the nested generation function, which reduces the time very noticeably. We also stop storing the list in a separate variable and pass it straight to join, and the resulting IDs are appended to a list.  \nWe also remove the check of whether the element is already in the list.\n```python\ndef loop7() -\u003e int:\n    db = []\n\n    for num in range(need_items):\n        _id = \"-\".join([\n            str(uuid4()),\n            str(uuid4()),\n            str(uuid4()),\n            str(uuid4())\n        ])\n\n        db.append(_id)\n\n    return len(db)\n```\n\n## loop8\nExactly identical to loop 7; it exists only to compare the timings of loops 7 and 8.  
\nIn the example, loop 7 takes `0.00012` seconds and loop 8 takes `0.00022` seconds, a difference of almost 2 times. This shows why it is impossible to accurately predict the time for a larger number of elements: even identical code with the same number of elements gives results that differ by a factor of 2, which is a very large deviation.\n```python\ndef loop8() -\u003e int:\n    db = []\n\n    for num in range(need_items):\n        _id = \"-\".join([\n            str(uuid4()),\n            str(uuid4()),\n            str(uuid4()),\n            str(uuid4())\n        ])\n\n        db.append(_id)\n\n    return len(db)\n```\n\n## loop9 - The final version on which statistics are generated.\nThe difference from loops 7 and 8: a check was added for whether the ID is already in the list, and a print was added to show whether any repeated element occurred during generation.  \nOn small counts matches are unlikely, and the result will equal need_items.  \nBut on large counts repetitions are possible during generation, and we will be able to find out how many there were.\n```python\ndef loop9() -\u003e int:\n    db = []\n\n    for num in range(need_items):\n        _id = \"-\".join([\n            str(uuid4()),\n            str(uuid4()),\n            str(uuid4()),\n            str(uuid4())\n        ])\n\n        if _id not in db:\n            db.append(_id)\n        else:\n            continue\n    print(f\"nine loop db count: {len(db)}\")\n    return len(db)\n```\n# Generate analysis and selection of the coefficient.\nI use multiplication factors a little larger than 2, 10, 100 and 1000, because the plain factors did not give the correct result. One likely reason is that `_id not in db` scans the whole list on every iteration, so the total time grows faster than linearly.  \nI ended up using:  \nrounding to 5 decimal places.  \nto increase by 2 times - coefficient 2.2  \nto 10 times - 15.  \nto 100 times - 150.  \nto 1000 times - 4500.  \nAs a result, it works approximately correctly. But it can work worse for large numbers.  
\n```python\nimport timeit\nfrom uuid import uuid4\n\nneed_items = 10\n\n\ndef loop9() -\u003e int:\n    db = []\n\n    for num in range(need_items):\n        _id = \"-\".join([\n            str(uuid4()),\n            str(uuid4()),\n            str(uuid4()),\n            str(uuid4())\n        ])\n\n        if _id not in db:\n            db.append(_id)\n        else:\n            continue\n    print(f\"nine loop db count: {len(db)}\")\n    return len(db)\n\nloop9_time = timeit.timeit(loop9, number=1)\nprint(f\"loop9: {round(loop9_time, 5)}s\")\nprint(\n    f\"loop9 {need_items}*2={need_items*2} about time: {round(loop9_time*2.2, 5)}s\")\nprint(\n    f\"loop9 {need_items}*10={need_items*10} about time: {round(loop9_time*15, 5)}s\")\nprint(\n    f\"loop9 {need_items}*100={need_items*100} about time: {round(loop9_time*150, 5)}s\")\nprint(f\"loop9 {need_items}*1000={need_items*1000} about time: {round(loop9_time*4500, 5)}s\")\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdenver-code%2Ffor_test_python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdenver-code%2Ffor_test_python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdenver-code%2Ffor_test_python/lists"}