{"id":21707807,"url":"https://github.com/flightaware/starch","last_synced_at":"2025-03-20T17:22:25.515Z","repository":{"id":48252596,"uuid":"309366497","full_name":"flightaware/starch","owner":"flightaware","description":"Framework for runtime selection of architecture-dependent code","archived":false,"fork":false,"pushed_at":"2021-08-04T09:04:40.000Z","size":110,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":7,"default_branch":"master","last_synced_at":"2023-04-18T10:34:09.699Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/flightaware.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-11-02T12:42:37.000Z","updated_at":"2021-08-04T09:04:43.000Z","dependencies_parsed_at":"2022-07-29T18:09:24.496Z","dependency_job_id":null,"html_url":"https://github.com/flightaware/starch","commit_stats":null,"previous_names":[],"tags_count":null,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flightaware%2Fstarch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flightaware%2Fstarch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flightaware%2Fstarch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flightaware%2Fstarch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/flightaware","download_url":"https://codeload.github.com/flightaware/starch/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"gith
ub","repositories_count":244657079,"owners_count":20488714,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-25T22:19:21.810Z","updated_at":"2025-03-20T17:22:25.494Z","avatar_url":"https://github.com/flightaware.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# starch - a framework for selecting architecture-specific code at runtime\n\n`starch` helps generate glue code to *s*elec*t* *arch*itecture-specific\nversions of code depending on the hardware detected at runtime.\n\nIt arranges for code to be built multiple times with different compiler\noptions. At runtime, user code calls a dispatcher entry point which\nselects the best of the compiled versions that can safely run\non the hardware used at runtime.\n\nIt tries to be agnostic about the details of the code being generated\nand the details of the hardware.\n\n## Caution caution work in progress\n\nThis documentation isn't very complete. 
You'll need to look at the example\nand the code itself.\n\n## Design notes\n\n * Architecture-independent generated output; the generated outputs can\n   be generated during development and committed as part of the main\n   source code, and at build time starch does not need to be re-run.\n\n * Doesn't care about the details of the functions you call; they can\n   have any signature.\n\n * Can automatically generate benchmarking code given a benchmarking\n   helper that sets up inputs to the function.\n\n * Does not do any hardware detection itself, and does not care about\n   the hardware details; for each combination of compiler flags, the user\n   code provides a test function to be called at runtime to determine if\n   it is safe to run code compiled with those flags.\n\n * Allows the same generic code to be compiled multiple times with different\n   compile flags to take advantage of compiler auto-vectorization that\n   requires additional instruction set features (AVX, NEON, ...) being enabled.\n\n * Emits makefile fragments to be included into a larger makefile structure.\n\n## License\n\nThe generator script and templates are licensed under a BSD 2-clause license,\nsee the LICENSE file.\n\nNo copyright claim is made on generated code.\n\n## Prerequisites\n\nAt generation time (results can be committed to version control):\n\n * Python 3\n * [Mako](https://www.makotemplates.org/)\n\nAt build time:\n\n * a C compiler\n * make\n\n## Quickstart\n\nLook in example/ for a full example.\n\n## Concepts\n\nA *function* is the user-visible API to starch-generated code. It just looks\nlike a C function pointer. Initially, this pointer points to a dispatcher\nroutine which will select an appropriate implementation at runtime and call\nit. For subsequent calls, the dispatcher updates the function pointer to\npoint directly to the selected implementation.\n\nA *function impl* is one particular way of implementing a function. 
All\nimpls should produce the same results given the same inputs to avoid confusing\nuser code. There may be different impls with different performance\ncharacteristics - for example, different degrees of manual loop unrolling, or\nan impl that takes advantage of a particular instruction set (NEON, AVX, etc).\nEach impl has a unique-within-the-function \"variant\" name that identifies it.\n\nFunction impls may be conditionally compiled depending on build features\n(see below). This is useful for impls that cannot always be compiled e.g.\nthey depend on the availability of a particular instruction set.\n\nA *build flavor* is a particular way of building the function impl. It\nconsists of a set of compiler flags to use, plus an associated test function\nthat determines at runtime if it is safe to run the code. For example,\na flavor may enable use of specific instructions that may or may not be\navailable at runtime via `-mavx`, `-march=...`, and similar flags. Each\nflavor declares that it provides zero or more *features*.\n\nA *feature* is a characteristic of the build flavor compiler flags that\nallows certain impls to be compiled. For example, an impl that uses NEON\nintrinsics can only be compiled if the compiler is building for an ARM\ninstruction set that supports NEON. Features are defined in the build flavor,\nand are advertised at compile time by the presence of a `STARCH_FEATURE_x`\nmacro; implementations may conditionally compile on this macro and should use\n`STARCH_IMPL_REQUIRES` to indicate they will only be emitted when a given\nfeature is present.\n\nA *build mix* is a combination of build flavors that can coexist in the same\nbinary. 
For example, an \"x86\" mix might include build flavors that build\nfor generic x86, x86-with-AVX, and x86-with-AVX2; but it would not include\na build flavor for ARM, because ARM and x86 object code can't be linked\ntogether into a single binary.\n\n## Alignment\n\nA function can optionally include an aligned version; this is a version of the\nfunction with an independent call point and wisdom, which assumes that\ndata passed to the function is already aligned. Each flavor has an associated\nalignment in bytes, but otherwise it is up to the implementations to decide\nwhat exactly is aligned. Implementations for an aligned function on a flavor\nthat specifies an alignment (\u003e1 byte) will be compiled twice, once with an\nalignment of 1 and once with the flavor's alignment, to generate two different\ncompiled versions.\n\nstarch provides macros to help with alignment:\n\n * `STARCH_ALIGNMENT`, in implementations, is the alignment (in bytes) that\n   implementations can assume.\n * `STARCH_MIX_ALIGNMENT`, defined in the generated header file, is the required\n   alignment (in bytes) for callers of the _aligned version of a function.\n   It is the largest alignment of all flavors in the mix.\n * `STARCH_ALIGNED(ptr)` in implementations evaluates to `ptr` while hinting to\n   the compiler that the data is aligned according to STARCH_ALIGNMENT. This\n   maps to gcc's `__builtin_assume_aligned` builtin.\n\n## Benchmarks\n\nFunctions can optionally provide a benchmark helper by defining a\n(no args, void return type) function using the STARCH_BENCHMARK macro. This\nmacro is only present when benchmark code is being compiled.\n\nThe benchmark helper should set up function inputs for benchmarking and then\nuse the `STARCH_BENCHMARK_RUN` macro. 
This macro expands to code that will\nbenchmark each possible impl in turn with the provided arguments.\n\nIf the benchmark needs to allocate possibly-aligned buffers,\ntwo macros `STARCH_BENCHMARK_ALLOC` and `STARCH_BENCHMARK_FREE`\nwill allocate suitably aligned buffers for the current `STARCH_ALIGNMENT`\nvalue. `STARCH_BENCHMARK_ALLOC(count,type)` will allocate `count` elements of\ntype `type`, aligned to either `STARCH_ALIGNMENT` or the required alignment\nfor `type`, whichever is larger. `STARCH_BENCHMARK_FREE(ptr)` will free a\nbuffer previously allocated by `STARCH_BENCHMARK_ALLOC`.\n\nSee `example/benchmark/subtract_n_benchmark.c` for examples.\n\n## Gotchas\n\nFiles added by `scan_file` are `#include`-d into surrounding support files.\nMultiple files may be included into the same compilation unit. You should\nensure that you don't pollute the global namespace (macros, static function\nnames, etc) for subsequent files that will follow.\n\nFiles added by `scan_file` will be compiled multiple times. You should ensure\nthat any symbols other than those handled by STARCH_IMPL / STARCH_IMPL_REQUIRES\nare either static or use the STARCH_SYMBOL macro to get a unique name for\nthis compilation pass.\n\nYou probably want to separate out benchmark-support code into separate files\nto avoid an extra version of any impls present in the same file from being\nemitted.\n\n## Wisdom\n\nThere is partial support for a wisdom implementation. Wisdom is a priori\ninformation about the preferred code to use for a given function, for example\nas the result of benchmarking to find the fastest version. It is simply the\norder in which compiled impls are tried until one that is supported is found.\n\nTo set wisdom, there are two options:\n\n1) Provide a wisdom ordering for the function when defining a build mix. 
This\ncontrols the order in which the compiled impls are included in the generated\nregistry that is searched at runtime.\n\n2) Call `starch_\u003cfunction\u003e_set_wisdom` at runtime. This accepts an array of\nfunction variants, terminated by NULL. When called, the registry is re-sorted\nto prefer the listed variants in the order provided (and the function pointer\nis reset to the dispatcher so the chosen code will be re-selected on the next\ncall). This could be used to load install-specific wisdom during program\nstartup.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflightaware%2Fstarch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fflightaware%2Fstarch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflightaware%2Fstarch/lists"}