{"id":18422021,"url":"https://github.com/spcl/asa","last_synced_at":"2025-04-13T12:11:30.381Z","repository":{"id":135158261,"uuid":"596202932","full_name":"spcl/ASA","owner":"spcl","description":"Application Specific Architecture Toolchain","archived":false,"fork":false,"pushed_at":"2023-02-08T17:14:38.000Z","size":1005,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-02-10T00:57:55.178Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/spcl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-02-01T17:26:13.000Z","updated_at":"2024-11-11T02:26:59.000Z","dependencies_parsed_at":"2023-07-09T07:46:28.544Z","dependency_job_id":null,"html_url":"https://github.com/spcl/ASA","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spcl%2FASA","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spcl%2FASA/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spcl%2FASA/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spcl%2FASA/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/spcl","download_url":"https://codeload.github.com/spcl/ASA/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":2
48710445,"owners_count":21149190,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T04:27:46.388Z","updated_at":"2025-04-13T12:11:30.346Z","avatar_url":"https://github.com/spcl.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![CI Tests](https://github.com/spcl/ASA/actions/workflows/python-package.yml/badge.svg)](https://github.com/spcl/ASA/actions/workflows/python-package.yml)\n\nThis repository contains a framework for building Application Specific Architecture.\n\nStarting from a user-provided application, the framework will analyze it, and perform\na _Design Space Exploration_ phase. 
The goal of this exploration is to return one or more macro-level architecture descriptions of a System on Chip (SoC) able to\nexecute the application, resulting in good performance/power/area trade-offs.\n\nThe macro architectural description considers, among other things, the number of processing elements, the \non-chip buffer space, and a schedule of the application on the resulting SoC.\n\n\nThis project depends on DaCe (for frontend and data analysis) and Streaming-Sched (for scheduling).\n\n\n_Note_: this is a work-in-progress project.\n\nFor the development guide please refer to the [wiki](https://github.com/spcl/ASA/wiki).\n\n\n## Requirements\nThe framework has been tested with Python 3.8.\n\nWhen cloning this repository, be sure to initialize all the submodules:\n\n```\ngit clone --recursive https://github.com/spcl/ASA.git\n```\n\n\nThen install the requirements and DaCe, and set `PYTHONPATH`:\n\n```\n$ cd ASA/\n$ pip install -r requirements.txt\n$ pip install --editable dace/\n$ export PYTHONPATH=$(pwd):$(pwd)/streamingsched\n```\n\n### When used with ML-based samples\n\nDaCeML currently supports only Python 3.8.\nWhen using ML-based samples, DaCeML requires a specific version of DaCe:\n\n```\n$ pip install --editable daceml/\n$ cd dace/\n$ git checkout v0133_and_inline_pass \n$ cd -\n\n```\n\nWhen dealing with larger ML samples or, more generally, large programs, it makes sense to increase the number\nof files that can be open on the system (e.g., to 100K):\n```\n$ ulimit -n 100000\n```\n\n\n*Important*: with the other samples, use the latest DaCe master or the specific commit referenced by this repository's submodules.\n\n_Note_: as this is currently under development, an automated environment setup will be provided later.\n\n\n\n## Usage\n\nThe tool flow is shown in the following figure:\n\n\n\u003cimg src=\"workflow.png\" alt=\"\" width=\"800\"\u003e\n\nThe input is a user-provided application that will be parsed using DaCe and 
represented in its intermediate representation (Stateful DataFlow Graph - SDFG). The user application must respect certain requirements (see the Application Requirements section below).\n\nThe SDFG then enters a Design Space Exploration phase where we evaluate:\n- different ways of representing the application (Application Space Exploration): the same application can be implemented in multiple ways. Let's consider the case of a Matrix-Matrix multiplication. This can be implemented by resorting to matrix-vector multiplications, outer products, and other basic operations. Similarly, it may read/produce data in row-major or column-major format. The goal of this phase is to generate different (but semantically equivalent) representations of the user-provided application that may result in different running times, parallelism opportunities, and on-chip buffer space requirements. In this stage, starting from the user-provided SDFG, the framework generates multiple _Canonical SDFGs_. Please refer to the development guide in the wiki for a more detailed\ndefinition.\n- different architectures (Architecture Space Exploration). This currently takes into consideration two main aspects:\n  - number of processing elements (PEs): architectures comprising a different number of PEs will be considered (user-defined).\n  - on-chip memory space: on-chip memory can be used to store data that will be re-used later on in the computation. The larger the on-chip memory, the fewer the off-chip accesses, but the larger the area needed to implement it. 
Here we want to explore this trade-off by considering several on-chip memory sizes (user-defined).\n\n\n\n\nLet's consider the following simple example (you can find the full program under `samples/simple/simple_1.py`).\n\nThe user-provided application is a DaCe program that computes the matrix multiplication between two $8\\times 8$ input matrices `A` and `B`:\n\n```Python\nimport dace\n\n@dace.program\ndef simple_1(A: dace.float32[8, 8], B: dace.float32[8, 8]):\n    return A @ B\n```\n\nTo perform Design Space Exploration we have to parse it using DaCe, and then call the `DSE` function:\n\n```Python\nfrom dse.dse import DSE\n# Get the SDFG\nsdfg = simple_1.to_sdfg()\n\n# Perform Design Space Exploration\nresults = DSE(\n    sdfg,\n    num_pes=[8, 16, 32, 64],  # List of allowed numbers of PEs\n    on_chip_memory_sizes=[32, 128]  # List of allowed on-chip memory sizes\n)\n```\n\nBy default, the framework returns all the evaluated configurations (Application Variations, Number of PEs, On-Chip memory area ...), together with\nsynthetic scores for their power consumption, expected performance, and area consumption (PPA).\n\nAs the typical objective is to minimize all of them, we can analyze the returned results and save the Pareto frontier points into a file for later analysis.\n\n```Python\n\nfrom dse.analysis import get_pareto_configurations\nget_pareto_configurations(results, \"my_program\")\n```\n\n\nWhen invoking `DSE`, other arguments can be specified:\n- whether or not to use multithreading (`use_multithreading`, set to `False` by default) and, if so, the number of threads to use (`n_threads`, 8 by default). This is useful to reduce DSE times when dealing with non-trivial computations.\n- unrolling factor: for iterated algorithms, where the computation is repeated multiple times but on different _independent_ input data, working on the fully unrolled SDFG may be expensive and unnecessary (the computation is the same). 
In this case, we allow partial unrolling: the computation is unrolled a given number of times, as defined by the argument. In this way, the SDFG will be smaller than the fully unrolled one (we save analysis time), but we can still exploit the parallelism given by the independent sub-computations. See `sample/simple/simple_iterated.py` for an example.\n\n\n### Application Requirements\nTo process a given user application, the DSE phase requires the user input to satisfy the following constraints:\n- The resulting user application SDFG consists of a single state, with library nodes and access nodes. Maps are allowed only as the top-level scope (in the case of iterated computation).\n- Each library node can be expanded using a canonical expression.\n- At the time of DSE, the SDFG must not have symbolic expressions, so symbols must be explicitly specialized.\n\nThe samples provided with this repository and (many of) DaCeML-generated SDFGs respect these requirements.  \n\nPlease refer to the development wiki for additional details. \n\n## Samples and tests\n\nThis repository provides samples (under the `sample/` folder) to illustrate the framework API.\nFor each sample, its documentation describes goals and various aspects of the ASA framework.\n\nUnit tests for basic functionality are in the `tests/` folder.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspcl%2Fasa","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fspcl%2Fasa","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspcl%2Fasa/lists"}