Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/bgamari/ghc-dump

A GHC plugin and library for analysing GHC Core
https://github.com/bgamari/ghc-dump

ghc haskell

Last synced: 9 days ago
JSON representation

A GHC plugin and library for analysing GHC Core

Awesome Lists containing this project

README

        

# ghc-dump: A tool for analysing GHC Core

`ghc-dump` is a library, GHC plugin, and set of tools for recording and
analysing GHC's Core representation. The plugin is compatible with GHC 7.10
through 9.0, exporting a consistent (albeit somewhat lossy) representation
across these versions. The AST is encoded as CBOR, which is small and easy to
deserialise.

## Dumping Core from compilation

The GHC plugin `GhcDump.Plugin` provides a Core-to-Core plugin which dumps a
representation of the Core AST to a file after every Core-to-Core pass. To use
it on a `cabal` package add the following to the relevant stanza of your cabal
file (e.g. the `library` stanza in the case of a library):
```
build-depends: ghc-dump-core
ghc-options: -fplugin GhcDump.Plugin
```

Now building the project will produce a number of `.cbor` files in the
`dist-newstyle` directory, one for each step of GHC's Core-to-Core pipeline.

```
$ cabal build
$ ls **/*.cbor
dist-newstyle/build/x86_64-linux/ghc-8.10.4/aeson-1.5.6.0/build/attoparsec-iso8601/src/Data/Attoparsec/Time/Internal.pass-0000.cbor
dist-newstyle/build/x86_64-linux/ghc-8.10.4/aeson-1.5.6.0/build/attoparsec-iso8601/src/Data/Attoparsec/Time/Internal.pass-0001.cbor
dist-newstyle/build/x86_64-linux/ghc-8.10.4/aeson-1.5.6.0/build/attoparsec-iso8601/src/Data/Attoparsec/Time/Internal.pass-0002.cbor
dist-newstyle/build/x86_64-linux/ghc-8.10.4/aeson-1.5.6.0/build/attoparsec-iso8601/src/Data/Attoparsec/Time/Internal.pass-0003.cbor
dist-newstyle/build/x86_64-linux/ghc-8.10.4/aeson-1.5.6.0/build/attoparsec-iso8601/src/Data/Attoparsec/Time/Internal.pass-0004.cbor
dist-newstyle/build/x86_64-linux/ghc-8.10.4/aeson-1.5.6.0/build/attoparsec-iso8601/src/Data/Attoparsec/Time/Internal.pass-0005.cbor
...
```

Here we see a `pass-N.cbor` file was produced for each Core-to-Core pass.

We can produce a brief summary of these files using the `summarize` mode of the `ghc-dump`
CLI tool:
```
Name Terms Types Coerc. Previous phase
Data/Attoparsec/Time/Internal.pass-0000.cbor
124 63 9 desugar
Data/Attoparsec/Time/Internal.pass-0001.cbor
138 59 38 Simplifier: Max iterations = 4
SimplMode {Phase = InitialPhase [Gentle],
inline,
rules,
eta-expand,
no case-of-case}
Data/Attoparsec/Time/Internal.pass-0002.cbor
138 59 38 Specialise:
Data/Attoparsec/Time/Internal.pass-0003.cbor
142 59 38 Float out(FOS {Lam = Just 0, Consts = True, OverSatApps = False}):
Data/Attoparsec/Time/Internal.pass-0004.cbor
118 41 38 Simplifier: Max iterations = 4
SimplMode {Phase = 2 [main],
inline,
rules,
eta-expand,
case-of-case}
Data/Attoparsec/Time/Internal.pass-0005.cbor
112 32 38 Simplifier: Max iterations = 4
SimplMode {Phase = 1 [main],
inline,
rules,
eta-expand,
case-of-case}
```
Here we see for each dump file:

* the file name
* the size of the program measured in terms, types, and coercions
* the name of the phase that produced the program

## Analysis in GHCi

One can then load this into `ghci` for analysis,
```
$ ghci
GHCi, version 8.3.20170413: http://www.haskell.org/ghc/ :? for help
Loaded GHCi configuration from /home/ben/.ghci
λ> import GhcDump.Repl as Dump
λ> mod <- readDump "Test.pass-0.cbor"
λ> pretty mod
module Main where

nsoln :: Int-> Int
{- Core Size{terms=98 types=66 cos=0 vbinds=0 jbinds=0} -}
nsoln =
λ nq →
let safe =
λ x d ds →
case ds of wild {
[] → GHC.Types.True
: q l →
GHC.Classes.&&
...
```

## Analysis with CLI tool

Alternatively, the `ghc-dump` utility can be used to render the representation
in human-readable form. For instance, we can filter the dump to include only
top-level binders containing `main` in its name,
```
$ ghc-dump show --filter='.*main.*' Test.pass-0.cbor
```
You can conveniently summarize the top-level bindings of the program,
```
$ ghc-dump list-bindings --sort=terms Test.pass-0.cbor
Name Terms Types Coerc. Type
nsoln 98 66 0 Int-> Int
main 30 28 0 IO ()
$trModule 5 0 0 Module
main 2 1 0 IO ()
...
```