https://github.com/haskell/clc-stackage
Meta-package to facilitate impact assessment for CLC proposals
- Host: GitHub
- URL: https://github.com/haskell/clc-stackage
- Owner: haskell
- Created: 2022-01-20T00:33:10.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2024-10-25T21:00:43.000Z (over 1 year ago)
- Last Synced: 2025-01-30T06:51:19.161Z (about 1 year ago)
- Language: Haskell
- Homepage: https://github.com/haskell/core-libraries-committee/blob/main/PROPOSALS.md#impact-assessments
- Size: 280 KB
- Stars: 4
- Watchers: 3
- Forks: 7
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
# `clc-stackage`
[CI](https://github.com/haskell/clc-stackage/actions/workflows/ci.yaml)
## How to?
This is a meta-package to facilitate impact assessment for [CLC proposals](https://github.com/haskell/core-libraries-committee).
An impact assessment is due when:
1. The proposal makes a breaking change according to the [PVP](https://pvp.haskell.org/).
2. The proposal exports a new entity from `Prelude` or other modules described in the [Haskell Report](https://www.haskell.org/onlinereport/haskell2010/haskellpa2.html#x20-192000II).
3. CLC members request one at their discretion.
The procedure is as follows:
1. Rebase the changes mandated by your proposal atop the `ghc-9.10` branch.
2. Compile a patched GHC, say, `~/ghc/_build/stage1/bin/ghc`.
3. `git clone https://github.com/haskell/clc-stackage`, then `cd clc-stackage`.
4. Build the exe: `cabal install clc-stackage --installdir=./bin`.
> :warning: **Warning:** Use a normal downloaded GHC for this step, **not** your custom built one. Why? Using the custom GHC can force a build of many dependencies you'd otherwise get for free e.g. `vector`.
5. Uncomment and modify the `with-compiler` line in [generated/cabal.project](generated/cabal.project) e.g.
```
with-compiler: /home/ghc/_build/stage1/bin/ghc
```
6. Run `./bin/clc-stackage` and wait for a long time. See [below](#the-clc-stackage-exe) for more details.
* On a recent MacBook Air it takes around 12 hours, YMMV.
* You can interrupt `cabal` at any time and rerun it later.
* Consider setting `--jobs` to retain free CPU cores for other tasks.
* A full build requires roughly 7 GB of free disk space.
To get an idea of the current progress, we can run the following commands
on the log file:
```sh
# prints completed / total packages in this group
$ grep -Eo 'Completed|^ -' output/logs/current-build/stdout.log | sort -r | uniq -c | awk '{print $1}'
110
182
# combine with watch
$ watch -n 10 "grep -Eo 'Completed|^ -' output/logs/current-build/stdout.log | sort -r | uniq -c | awk '{print \$1}'"
```
7. If any packages fail to compile:
* copy them locally using `cabal unpack`,
* patch them to conform with your proposal,
* link them from the `packages` section of `cabal.project`,
* return to Step 6.
8. When everything finally builds, get back to CLC with a list of packages affected and patches required.
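The progress check from step 6 can also be wrapped in a small helper that prints a percentage. This is only a sketch: `clc_progress` is a hypothetical name, and it assumes the log format implied by the `grep` above (one line starting with ` -` per planned package, one `Completed` line per finished package). It counts matching lines with `grep -c` rather than individual matches.

```sh
# Sketch: summarise build progress from the current group's log.
# Assumes planned packages appear as lines starting with " -" and
# finished packages as lines containing "Completed".
clc_progress() {
  log="${1:-output/logs/current-build/stdout.log}"
  if [ ! -r "$log" ]; then
    echo "cannot read $log" >&2
    return 1
  fi
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
  completed=$(grep -c 'Completed' "$log" || true)
  total=$(grep -c '^ -' "$log" || true)
  if [ "$total" -eq 0 ]; then
    echo "0/0 packages (build plan not printed yet)"
    return 0
  fi
  # Integer percentage, rounded down.
  echo "$completed/$total packages ($(( completed * 100 / total ))%)"
}
```

Called without arguments it reads `output/logs/current-build/stdout.log`; pass another path to inspect a saved log instead.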
### The clc-stackage exe
`clc-stackage` is an executable that will:
1. Download the stackage snapshot from the stackage server.
2. Divide the snapshot into groups (determined by `--batch` argument).
3. For each group, generate a cabal file and attempt to build it.
#### Querying stackage
By default, `clc-stackage` queries https://www.stackage.org/ for snapshot information. In situations where this is not desirable (e.g. the server is not working, or we want to test a custom snapshot), the snapshot can be overridden:
```sh
$ ./bin/clc-stackage --snapshot-path=path/to/snapshot
```
This snapshot should be formatted similarly to the `cabal.config` endpoint on the stackage server (e.g. https://www.stackage.org/nightly/cabal.config). That is, package lines should be formatted `<name> ==<version>`:
```
abstract-deque ==0.3
abstract-deque-tests ==0.3
abstract-par ==0.3.3
AC-Angle ==1.0
acc ==0.2.0.3
...
```
The stackage config itself is valid, so trailing commas and other extraneous lines are allowed (and ignored).
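When preparing a custom snapshot, it can help to reduce a downloaded `cabal.config` to its bare package lines first, so that a subset is easy to pick out. The helper below is a rough sketch: `snapshot_lines` is a hypothetical name, and the `sed` pattern reflects an assumption about the input (an optional `constraints:` prefix, a package name, `==`, a dotted version, an optional trailing comma). It is not the parser `clc-stackage` itself uses.

```sh
# Sketch: keep only "<name> ==<version>" lines from a cabal.config-style file.
# Lines that do not match (comments, "base installed," and so on) are dropped.
# Reads the given files, or stdin when no file is given.
snapshot_lines() {
  sed -nE 's/^(constraints:)?[[:space:]]*([A-Za-z0-9-]+)[[:space:]]*==[[:space:]]*([0-9.]+),?[[:space:]]*$/\2 ==\3/p' "$@"
}
```

The output can be filtered further (for example with `grep`) and written to a file, which is then passed to `--snapshot-path`.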
#### Investigating failures
By default (`--write-logs save-failures`), the build logs are saved to the `./output/logs/` directory, with `./output/logs/current-build/` streaming the current build logs.
#### Group batching
The `clc-stackage` exe allows for splitting the entire package set into subset groups of size `N` with the `--batch N` option. Each group is then built sequentially. Not only can this be useful for situations where building the entire package set in one go is infeasible, but it also provides a "cache" functionality, that allows us to interrupt the program at any point (e.g. `CTRL-C`), and pick up where we left off. For example:
```sh
$ ./bin/clc-stackage --batch 100
```
This will split the entire downloaded package set into groups of size 100. Each time a group finishes (success or failure), stdout/err will be updated, and then the next group will start. If the group failed to build and we have `--write-logs save-failures` (the default), then the logs and error output will be in `./output/logs/<pkg>/`, where `<pkg>` is the name of the first package in the group.
See `./bin/clc-stackage --help` for more info.
##### Optimal performance
On the one hand, splitting the entire package set into `--batch` groups makes the output easier to understand and offers a nice workflow for interrupting/restarting the build. On the other hand, there is a question of what the best value of `N` is for `--batch N`, with respect to performance.
In general, the smaller `N` is, the worse the performance. There are several reasons for this:
- The smaller `N` is, the more `cabal build` processes are started, and each one adds overhead.
- Larger groups contain more packages that can be built concurrently.
Thus for optimal performance, you want to take the largest group possible, with the upper limit being no `--batch` argument at all, as that puts all packages into the same group.
> [!TIP]
>
> Additionally, the `./output/cache.json` file can be manipulated directly. For example, if you want to try building only `foo`, ensure `foo` is the only entry in the json file's `untested` field.
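To make the batch-size trade-off concrete, the number of groups (and therefore of `cabal build` invocations) for a given `--batch N` is a ceiling division. The sketch below uses a hypothetical helper name, `batch_groups`, and assumes that the last group simply holds whatever packages remain.

```sh
# Sketch: how many groups does --batch N produce for a given package count?
# Ceiling division: a partial final group still counts as one group.
batch_groups() {
  total="$1"
  n="$2"
  echo $(( (total + n - 1) / n ))
}
```

For instance, `batch_groups 250 100` prints `3`: two full groups of 100 and one of 50. The package counts here are illustrative only, not the size of any real snapshot.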
## Getting dependencies via `nix`
For Linux-based systems, there's a provided `flake.nix` and `shell.nix` to get a nix shell
with an approximation of the required dependencies (cabal itself, C libs) to build `clc-stackage`.
Note that it is not actively maintained, so it may require some tweaking to get working, and conversely, it may have some redundant dependencies.
## Misc
* Your custom GHC will need to be on the PATH to build the `stack` library e.g.
```
export PATH=/home/ghc/_build/stage1/bin/:$PATH
```
Nix users can uncomment (and modify) this line in the `flake.nix`.
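
Before starting a long build, it may be worth confirming that `ghc` on the `PATH` really resolves to the custom compiler. The following is a minimal sketch with a hypothetical helper name, `ghc_on_path_is`. It compares files with `test -ef`, which is widely supported but not strictly POSIX, so that a trailing slash in the `PATH` entry does not cause a false mismatch.

```sh
# Sketch: check that the ghc found on PATH is the expected custom build.
# Pass the same path you configured as with-compiler.
ghc_on_path_is() {
  expected="$1"
  actual=$(command -v ghc) || {
    echo "ghc not found on PATH" >&2
    return 1
  }
  # -ef is true when both paths refer to the same file.
  if [ "$actual" -ef "$expected" ]; then
    echo "ok: $actual"
  else
    echo "mismatch: PATH resolves ghc to $actual" >&2
    return 1
  fi
}
```

For example, `ghc_on_path_is /home/ghc/_build/stage1/bin/ghc` should print an `ok:` line after the `export` above has been applied.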