{"id":17077822,"url":"https://github.com/reinderien/inspiration","last_synced_at":"2025-03-23T12:17:59.280Z","repository":{"id":69225905,"uuid":"89766990","full_name":"reinderien/inspiration","owner":"reinderien","description":"Thoughts on the PyCon 2015 Thumbtack challenge","archived":false,"fork":false,"pushed_at":"2017-05-04T22:10:00.000Z","size":446,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-01-28T18:31:16.701Z","etag":null,"topics":["algorithm","card","pycon","python","thumbtack"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/reinderien.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-04-29T05:59:36.000Z","updated_at":"2017-05-05T15:04:39.000Z","dependencies_parsed_at":"2023-02-22T12:30:44.423Z","dependency_job_id":null,"html_url":"https://github.com/reinderien/inspiration","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reinderien%2Finspiration","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reinderien%2Finspiration/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reinderien%2Finspiration/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reinderien%2Finspiration/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/reinderien","download_url":"
https://codeload.github.com/reinderien/inspiration/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245097834,"owners_count":20560319,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithm","card","pycon","python","thumbtack"],"created_at":"2024-10-14T12:17:08.554Z","updated_at":"2025-03-23T12:17:59.250Z","avatar_url":"https://github.com/reinderien.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Inspiration\n\nBack in 2015, I attended PyCon in Montreal, Canada. It was awesome. There were lots of stalls \nfrom big names, and some offered interesting programming challenges.\n[Thumbtack was among them](https://www.thumbtack.com/engineering/pycon-2015).\n\n\n## The problem\n\nTo quote Thumbtack Engineering,\n\n\u003e When I was little, my family went to our town’s district math night. We came back with a game\nthat we still play as a family. The game is called Inspiration. It’s played with a normal deck\nof cards, with the picture cards taken out. Everyone gets four cards and one card is turned \nface up for everyone to see. You then have to mathematically combine your four cards with \naddition, subtraction, multiplication, and division to get the center card. The person who \ndoes it the fastest wins.\n\u003e\n\u003e This year, our challenge was inspired by Inspiration, no pun intended. 
The first part asked \npeople to write a Python program that takes in four numbers and determines the mathematical \nexpression that can combine the first three numbers to get the fourth. If they could solve this,\nthey were awarded a t-shirt and sunglasses. The harder challenge was to solve the same problem,\nbut with an arbitrary number of inputs. The number to solve for was always the last number in \nthe string, but the total number of operands was not constant. These solvers won the coveted \nThumbtack beer glass.\n\nApart from Thumbtack's own blog, discussion and solutions (aside from a\n[dead link](https://gist.github.com/shayel/736f143cb4f7d2310c5b \"11 lines solution for https://www.thumbtack.com/engineering/pycon-2015\")\nhere and there) unfortunately seem to be non-existent. This kind of problem is dangerous for me \nbecause it's an ideal [nerd snipe](https://xkcd.com/356/): simple on its face, but with lots of \npotential for complexity given even cursory thought. And snipe it did: I've been thinking about \nthis on and off for two years and counting.\n\n\n### Original problem statement\n\n\u003cimg src=\"doc/challenge.png\" alt=\"Original problem statement\"\n    style=\"border: thin solid lightgrey\"/\u003e\n\n\n### Problem formalization\n\nStated more formally, let there be _n_ integers _x_\u003csub\u003ei\u003c/sub\u003e on the left-hand side of an \nequation, and an integer _y_ on the right-hand side of that equation, where\n\nn ≥ 2\n\n0 ≤ i \u003c n\n\n1 ≤ x\u003csub\u003ei\u003c/sub\u003e ≤ 9\n\n1 ≤ y ≤ 9\n\nA strict interpretation would enforce a maximum count of four cards (suits) per card value, but \nthat criterion is disregarded for the purposes of this discussion.\n\nA strict interpretation would also impose an upper bound on _n_. 
Since there are 4 suits, 9 cards\nper suit, a minimum of 2 players, and _n_ cards per player plus one shared _y_ card, the deck must \nsatisfy 2n + 1 ≤ 4 * 9, so the upper bound for _n_ is:\n\n2 ≤ n ≤ ⌊(4 * 9 - 1)/2⌋\n\n2 ≤ n ≤ 17\n\nThere are _n_ - 1 operators ƒ\u003csub\u003ei\u003c/sub\u003e between the elements of _x_\u003csub\u003ei\u003c/sub\u003e and \n_x_\u003csub\u003ei + 1\u003c/sub\u003e. Each ƒ is a binary arithmetic operator with conventional associativity and \ncommutativity but non-conventional order of operations. The equation is assumed to be evaluated in a\nflat, left-to-right manner, that is,\n\nx\u003csub\u003e0\u003c/sub\u003e [ƒ\u003csub\u003e0\u003c/sub\u003e] x\u003csub\u003e1\u003c/sub\u003e [ƒ\u003csub\u003e1\u003c/sub\u003e] x\u003csub\u003e2\u003c/sub\u003e [ƒ\u003csub\u003e2\u003c/sub\u003e] ...\n[ƒ\u003csub\u003en - 2\u003c/sub\u003e] x\u003csub\u003en - 1\u003c/sub\u003e = y\n\nƒ\u003csub\u003en - 2\u003c/sub\u003e( ... ƒ\u003csub\u003e1\u003c/sub\u003e(ƒ\u003csub\u003e0\u003c/sub\u003e(x\u003csub\u003e0\u003c/sub\u003e, x\u003csub\u003e1\u003c/sub\u003e), \nx\u003csub\u003e2\u003c/sub\u003e) ... x\u003csub\u003en - 1\u003c/sub\u003e) = y\n\nEach ƒ may be any of:\n\nƒ(a, b) ∈ { a + b, a - b, a * b, a / b }\n\nThe goal is to permute _x_ and choose each ƒ to satisfy the equation. Where there are multiple \nsolutions, the first one found is taken and the rest are discarded.\n\n\n### Problem Analysis\n\nThe cardinality of the ƒ function set is:\n\n|ƒ| = 4\n\nAssuming that the entries of _x_ are (in the worst case) unique, the number of permutations of _x_\n is _n_! .\n\nThe size _N_ of the (naïve) search space is then\n\nN = 4\u003csup\u003en - 1\u003c/sup\u003e ( n ! 
)\n\nFor given _x_ and _y_ the solution is not necessarily unique, nor is there necessarily a solution.\nFor example, there is no solution for\n\n    x = (1, 1), y = 9\n\nBy contrast, given these inputs:\n\n    x = (1, 2, 3), y = 4\n\nthere are 5 solutions out of a search space for which _N_ = 96:\n\n    2 - 1 + 3 = 4\n    2 + 3 - 1 = 4\n    3 - 1 + 2 = 4\n    3 - 1 * 2 = 4\n    3 + 2 - 1 = 4\n\nSome inputs are thus more difficult than others, difficulty being measured by both the number of \nsolutions in the problem space and (more importantly) the number of candidates assessed \nbefore the first solution is found.\n\nFor example, when _n_ = 4, _N_ = 1536. Using the naïve algorithm described later, the problem\ndifficulty depends heavily on which inputs are chosen.\nFor the following inputs, a solution is found after only 7 iterations. There are 480 solutions\n(without discarding non-unique solutions):\n\n    x = (1, 1, 1, 1), y = 1\n    1 + 1 - 1 * 1 = 1\n    ...\n\nHowever, the following inputs only produce a solution after 1309 iterations, exhausting 85% of the \nsearch space; and there are only two solutions:\n\n    x = (2, 1, 5, 9), y = 9\n    9 - 1 / 2 + 5 = 9\n    9 - 5 * 2 + 1 = 9\n\n\n## Algorithms\n\n### Naïve\n\nThe naïve (or brute-force) algorithm is the easiest to implement but certainly not the most \nefficient. It was the most popular (perhaps only?) algorithm seen in use at PyCon. In pseudocode \nit looks like:\n\n    for each of n! 
permutations of x:\n        for each of the 4^(n-1) combinations of ƒ:\n            evaluate LHS\n            if LHS == y:\n                print solution\n                exit\n\nA brief implementation of this algorithm (indeed, briefer than what I submitted in 2015) is:\n\n```python\n#!/usr/bin/env python3\nimport functools, itertools, operator as opr, re, sys\n\n# Pair each operator function with its printable symbol\noperators = tuple(zip((opr.add, opr.sub, opr.mul, opr.truediv), '+-*/'))\n# The last number read is y; all preceding numbers are x\ninputs = [int(i) for i in re.findall(r'\\S+', sys.stdin.readline())]\nfor perm in itertools.permutations(inputs[:-1]):\n    for ops in itertools.product(operators, repeat=len(perm)-1):\n        funcs = tuple(zip(ops, perm[1:]))\n        # Evaluate flat, left to right, starting from perm[0]\n        if inputs[-1] == functools.reduce(lambda n, fun: fun[0][0](n, fun[1]), funcs, perm[0]):\n            print(perm[0], ' '.join('%s %d' % (o[1], n) for o, n in funcs))\n            sys.exit()\nprint('Invalid')\n```\n\nThere are surely ways to code-golf this down even further, but this implementation is non-magical -\nit simply uses:\n* _itertools.permutations_ to search through all _x_; \n* _itertools.product_ (as in, the Cartesian product) with _repeat_ set to _n_ - 1 to search \n  through all ƒ; and\n* _functools.reduce_ to evaluate the result.\n\nThis brute-force method can be very slow: since the search through _x_ is O(n!), the search \nthrough ƒ is O(4\u003csup\u003en - 1\u003c/sup\u003e), and the evaluation of the LHS is O(n), the overall worst-case\nruntime is O(n! 4\u003csup\u003en - 1\u003c/sup\u003e n) .\n\n\n### Recursive-ƒ\n\nWhenever a new ƒ is attempted in the naïve algorithm, it takes O(n) time to re-evaluate the LHS in\norder to compare it to _y_ on the RHS - but this O(n) factor can be reduced.\n\nA recursive function can be used that, either from left to right or right to left, iterates \nthrough the possibilities for ƒ\u003csub\u003ei\u003c/sub\u003e at four times the speed of ƒ\u003csub\u003ei - 1\u003c/sub\u003e. 
Even \nthough it is more complicated than calling _itertools.product_, it offers a speedup by reducing\nevaluation redundancy: each recursion at depth _i_ reuses the prior value \ncomputed at the shallower depths 0 through _i_ - 1.\n\n\n### Gray code\n\nAs with the recursive-ƒ algorithm, the O(n) factor can be reduced by eliminating some redundant \nre-evaluation of sections of the LHS. In this case, we attempt this by ensuring that only one\nƒ\u003csub\u003ei\u003c/sub\u003e changes at a time, reducing this factor to O(1).\n\n[Gray code](https://en.wikipedia.org/wiki/Gray_code#n-ary_Gray_code) is a method of incrementing\nnumbers such that only one digit changes at a time, but all possible values are still visited. For \ninstance, for a 3-bit binary integer:\n\n    0 0 0\n    0 0 1\n    0 1 1\n    0 1 0\n    1 1 0\n    1 1 1\n    1 0 1\n    1 0 0\n\nƒ can be modelled as a radix-4 integer with _n_ - 1 digits, where each digit represents one \noperator, and the integer increments from 0 through 4\u003csup\u003en - 1\u003c/sup\u003e - 1. There is benefit to \nhaving it increment by (4, _n_ - 1)-Gray code instead of linearly. 
Whenever an ƒ\u003csub\u003ei\u003c/sub\u003e changes, the\ncomputed value from ƒ\u003csub\u003e0\u003c/sub\u003e through ƒ\u003csub\u003ei - 1\u003c/sub\u003e can stay the same; and the computed \ncomposite function equivalent to ƒ\u003csub\u003ei + 1\u003c/sub\u003e through ƒ\u003csub\u003en - 2\u003c/sub\u003e can also stay the same.\n\nFor _n_ = 4 and a single permutation of _x_, the search through all ƒ would look like:\n\n    + + +\n    + + -\n    + + *\n    + + /\n    + - /\n    + - +\n    + - -\n    + - *\n    + * *\n    + * /\n    + * +\n    + * -\n    ...\n\nThis optimization will have the greatest effect when the player has a large hand of up to 17 cards.\n\n\n### Heap's Permutations\n\nSimilar in spirit to the Gray code approach, there is benefit to reducing the number of changed \nelements between permutations of _x_.\n[Heap's permutation algorithm](https://en.wikipedia.org/wiki/Heap%27s_algorithm) is designed \nspecifically for this purpose. It guarantees that iteration through all permutations only does a \nsingle swap of two elements at a time. The canonical implementation of the algorithm swaps most \nfrequently at the beginning of the set, but a simple index reversal allows swaps to occur most \nfrequently at the end, allowing more of the computed portion of the LHS to be saved from \niteration to iteration.\n\nFor _n_ = 4, disregarding ƒ, the search through _x_ would look like:\n\n    (0, 1, 2, 3)\n    (0, 1, 3, 2)\n    (0, 2, 3, 1)\n    (0, 2, 1, 3)\n    (0, 3, 1, 2)\n    (0, 3, 2, 1)\n    (1, 3, 2, 0)\n    (1, 3, 0, 2)\n    (1, 2, 0, 3)\n    (1, 2, 3, 0)\n    (1, 0, 3, 2)\n    (1, 0, 2, 3)\n    ...\n\n\n### Gray-Heap Iteration\n\nIt would be interesting to try combined Gray-Heap iteration using the above methods. 
This would \nproduce a series of LHS candidates looking like:\n\n    1 + 2 + 3 + 4\n    1 + 2 + 3 - 4\n    1 + 2 + 3 * 4\n    1 + 2 + 3 / 4\n    1 + 2 - 3 / 4\n    ...\n    4 - 3 + 2 - 1\n    4 - 3 + 2 * 1\n    4 - 3 + 2 / 1\n    4 - 3 + 2 + 1\n    4 - 3 + 1 + 2\n    4 - 3 + 1 - 2\n    4 - 3 + 1 * 2\n    4 - 3 + 1 / 2\n    ...\n    4 + 1 / 2 - 3\n    4 + 1 + 2 - 3\n    4 + 1 + 2 * 3\n    4 + 1 + 2 / 3\n    4 + 1 + 2 + 3\n\n\n### Condensed-section\n\ntodo.\n\n\n## Hardware Considerations\n\n### Precision\n\nThe equation with the maximal intermediate value is where _n_ takes its maximum of 17, _x_ and _y_ \ntake their maxima of 9, and only multiplication and division are used:\n\n    9^8 / 9^8 * 9 = 9\n\nThe maximal intermediate value then requires an integer width of:\n\n    ceil( 8*ln(9)/ln(2) ) = 26 bits\n\nIf using integer math with a fraction, 32-bit signed integers will suffice. \n\nSingle-precision (32-bit) floating-point has a 24-bit significand. As such, it would suffice for _n_\n≤ 15, since 9\u003csup\u003e7\u003c/sup\u003e \u003c 2\u003csup\u003e24\u003c/sup\u003e. Above that, it is insufficient for the worst edge \ncases, and double-precision (64-bit) floating-point would be required instead.\n\n### Parallelism\n\nThe naïve algorithm is easily parallelized (and is, perhaps, even one of\n[Moler's embarrassingly parallel](https://en.wikipedia.org/wiki/Embarrassingly_parallel#Etymology)\nproblems).\nThe simplest method is to perform the permutation in a parent thread and assign subdivisions of the \nƒ search space to _m_ children, such that all children have the same set of inputs in the same \norder but attempt different combinations for ƒ. Each child will have a search space size of \n4\u003csup\u003en - 1\u003c/sup\u003e/m . Whichever child finds the first solution returns it to the parent; the \nparent then cancels all children and completes execution. In pseudocode,\n\n    for each of n! 
permutations of x:\n        fork m children each searching 4^(n-1) / m combinations of ƒ\n        join on any child completion\n        if a child found a solution:\n            cancel other children\n            print solution\n            exit\n        join on all remaining children\n\n### Parallelism with SIMD\n\nA more carefully optimized solution would be to use\n[SIMD](https://en.wikipedia.org/wiki/SIMD) on an appropriate processor architecture - either on a \nCPU or a GPU.\n\nContemporary Intel CPUs offer up to\n[28 cores](https://en.wikipedia.org/wiki/List_of_Intel_Xeon_microprocessors#.22Skylake-SP.22_.2814_nm.29_Scalable_Performance),\n2 semi-parallel \"hyper-threads\" per core, and\n[512-bit-wide vectorized math](https://en.wikipedia.org/wiki/AVX-512). In 512-bit single-precision \nvectorized mode, 16 values can be operated upon at once. Together these allow a theoretical \nparallel speedup factor of up to 28 * 2 * 16 = 896.\n\nUsing [GPGPU](https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units)\ncan offer even greater parallel speedup. For example, NVIDIA offers GPUs with core counts well in \nexcess of\n[4,000](https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#Tesla) before \ntaking other parallelism into account. There are various Python bindings for\n[OpenCL](https://en.wikipedia.org/wiki/OpenCL) able to target this hardware.\n\nUsing a SIMD scheme, there would be two levels of parallelism: core/thread children, and vectorized \nelements within each of those cores. Within a single core, each element of the vectorized operation\nwould need a different set of _x_ inputs but the same ƒ operations. Division of the _x_ \nsearch space would then need to occur across SIMD elements, with all elements in the core using the\nsame ƒ set. 
The top-level parent would divide the ƒ search space across cores.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freinderien%2Finspiration","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Freinderien%2Finspiration","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freinderien%2Finspiration/lists"}