https://github.com/lukechampine/ply

Painless polymorphism
https://github.com/lukechampine/ply
generics transpiler
Last synced: 12 months ago
JSON representation
Painless polymorphism
Host: GitHub
URL: https://github.com/lukechampine/ply
Owner: lukechampine
License: mit
Created: 2016-12-04T12:00:23.000Z (over 9 years ago)
Default Branch: master
Last Pushed: 2017-03-21T05:29:02.000Z (over 9 years ago)
Last Synced: 2024-06-20T17:41:48.009Z (about 2 years ago)
Topics: generics, transpiler
Language: Go
Homepage:
Size: 389 KB
Stars: 125
Watchers: 6
Forks: 4
Open Issues: 10
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project

README

          ply

===

`ply` is an experimental compile-to-Go language. Its syntax and semantics are

basically identical to Go's, but with more builtin functions for manipulating

generic containers (slices, arrays, maps). This is accomplished by forking

Go's type-checker, running it on the `.ply` file, and using the resolved types

to generate specific versions of the generic function. For example, given the

following Ply code:

```go

m1 := map[int]int{1: 1}

m2 := map[int]int{2: 2}

m3 := merge(m1, m2)

```

`merge` is a generic function. After type-checking, the Ply compiler knows the

types of `m1` and `m2`, so it can generate a specific function for these types:

```go

func mergeintint(m1, m2 map[int]int) map[int]int {

	m3 := make(map[int]int)

	for k, v := range m1 {

		m3[k] = v

	}

	for k, v := range m2 {

		m3[k] = v

	}

	return m3

}

```

`mergeintint` is then substituted for `merge` in the relevant expression, and

the modified source can then be passed to the Go compiler.

A similar approach is used to implement generic methods:

```go

xs := []int{1, 2, 3, 4, 6, 20}

b := xs.filter(func(x int) bool { return x > 3 }).

        morph(func(x int) bool { return x % 2 == 0 }).

        fold(func(x, y bool) bool { return x && y })

```

In the above, `b` is true because all the integers in `xs` greater than 3 are

even. To compile this, `xs` is wrapped in a new type that has a `filter`

method. Then, that call is wrapped in a new type that has a `morph` method,

and so on.

Note that in most cases, Ply can combine these method chains into a single

"pipeline" that **does not allocate any intermediate slices**. Without

pipelining, `filter` would allocate a slice and pass it to `morph`, which

would allocate another slice and pass it to `fold`. But Ply is able to merge

these methods into a single transformation that does not require allocations,

the same way a (good) human programmer would write it.

Usage

-----

First, install the Ply compiler:

```

go get github.com/lukechampine/ply

```

The `ply` command behaves similarly to the `go` command. In fact, you can run

any `go` subcommand through `ply`, including `build`, `run`, `install`, and

even `test`.

When you run `ply run test.ply`, `ply` parses `test.ply` and generates a

`ply-impls.go` file containing the specific implementations of any generics

used in `test.ply`. It then rewrites `test.ply` as a standard Go file,

`ply-test.go`, that calls those implementations. Finally, `go run` is invoked

on `ply-test.go` and `ply-impls.go`.

Supported Functions and Methods

-------------------------------

**Builtins:** `enum`, `max`, `merge`, `min`, `not`, `zip`

- Planned: `repeat`, `compose`

**Methods:** `all`, `any`, `contains`, `drop`, `dropWhile`, `elems`, `filter`,

`fold`, `foreach`, `keys`, `morph`, `reverse`, `sort`, `take`, `takeWhile`,

`tee`, `toMap`, `toSet`, `uniq`

- Planned: `join`, `replace`, `split`

All functions and methods are documented in the [`ply` pseudo-package](https://godoc.org/github.com/lukechampine/ply/doc).

Supported Optimizations

-----------------------

In many cases we can reduce allocations when using Ply functions and methods.

The Ply compiler will automatically apply these optimizations when it is safe

to do so. However, all optimizations have trade-offs. If performance is

important, you should always read the docstring of each method in order to

understand what optimizations may be applied. Depending on your use case, it

may be necessary to write your own implementation to squeeze out maximum

performance.

**Pipelining:**

Pipelining means chaining together multiple Ply functions and/or methods.

Currently only method chaining is supported. For example:

```go

xs := []int{1, 2, 3, 4, 6, 20}

b := xs.filter(func(x int) bool { return x > 3 }).

        morph(func(x int) bool { return x % 2 == 0 }).

        fold(func(acc, x bool) bool { return acc && x })

```

As written, this chain requires allocating a new slice for the `filter` and a

new slice for the `morph`. But if we were writing this transformation by hand,

we could optimize it like so:

```go

b := true

for _, x := range xs {

	if x > 3 {

		b = b && (x % 2 == 0)

	}

}

```

(A good rule of thumb is that, for most chains, only the allocations in the

final method are required. `fold` doesn't require any allocations, but if the

chain stopped at `morph`, then of course we would still need to allocate

memory in order to return the morphed slice.)

Ply is able to perform the above optimization automatically. The bodies of

`filter`, `morph`, and `fold` are combined into a single method, `pipe`, and

the callsite is rewritten to supply the arguments of each chained function:

```go

xs := []int{1, 2, 3, 4, 6, 20}

b := filtermorphfold(xs).pipe(

		func(x int) bool { return x > 3 },

		func(x int) bool { return x % 2 == 0 },

		func(x, y bool) bool { return x && y })

```

However, not all methods can be pipelined. `reverse` is a good example. If

`reverse` is the first method in the chain, then we can eliminate an

allocation by reversing the order in which we iterate through the slice. We

can also eliminate an allocation if `reverse` is the last method in the chain,

since we can reverse the result in-place. But what do we do if `reverse` is in

the middle? Consider this chain:

```go

xs.takeWhile(even).reverse().morph(square)

```

Since we don't know what `takeWhile` will return, there is no way to pass its

reversed elements to `morph` without allocating an intermediate slice. So we

resort to a less-efficient form, splitting the chain into `takeWhile(even)`

and `reverse().morph(square)`, each of which will perform an allocation.

Fortunately, it is usually possible to reorder the chain such that `reverse`

is the first or last method. In the above, we know that `morph` doesn't affect

the length or order of the slice, so we can move `reverse` to the end and the

result will be the same. Ply can't perform this reordering automatically

though: methods may have side effects that the programmer is relying upon.

Side effects are also problematic because pipelining can change the number of

times a function is called. For example, in this expression:

```go

[]int{1, 2, 3, 4, 6, 20}.morph(fn).take(3)

```

Without pipelining, `fn` is called on every element of the slice. But with

pipelining, it is only called 3 times. So the best practice is to avoid side

effects in functions passed to `morph`, `filter`, etc.

Lastly, it's worth pointing out that pipelining cannot eliminate any

allocations performed inside function arguments. For example, in this chain:

```go

myEnum := func(n int) []int {

	r := make([]int, n)

	for i := range r {

		r[i] = i

	}

	return r

}

concat := func(x, y []int) []int { return append(x, y...) }

list := xs.morph(myEnum).fold(concat)

```

A handwritten version of this chain could eliminate the allocations performed

by `myEnum`, but there is no way to do so programmatically.

**Parallelization (planned):**

Functor operations like `morph` can be trivially parallelized, but this

optimization should not be applied automatically. For small lists, the

overhead is probably not worth it. More importantly, if the function has side

effects, parallelizing may cause a race condition. So this optimization must

be specifically requested by the caller via separate identifiers, e.g.

`pmorph`, `pfilter`, etc.

**Reassignment (planned):**

It is a common pattern to reassign the result of a transformation to the

original variable, for example when filtering or reversing a slice. In such

cases, we would like to reuse the existing slice's memory instead of

allocating a new one. At one time, Ply did this automatically (by detecting

reassignment), but the feature was later removed because it is not provably

safe. If the underlying slice memory is referenced by a different variable,

then silently performing this optimization would affect that memory as well,

which is surprising behavior.

However, this optimization remains important. It is directly in line with

Ply's goal of generating code that is as good as the hand-written version. We

just need a different approach; probably a more explicit one. This could take

the form of separate identifiers (e.g. `rfilter`), similar to parallelization.

But this leads to an unfortunate bifurcation: what if you want both

reassignment and parallelization? So now we need four different forms:

standard, parallel, reassigned, and parallel reassigned, each with its own

identifier. More identifiers means more burden on the programmer, so I'm

hesistant to implement this approach until I've given it more thought.

**Compile-time evaluation:**

A few functions (currently just `max` and `min`) can be evaluated at compile

time if their arguments are also known at compile time. This is similar to how

the builtin `len` and `cap` functions work:

```go

len([3]int) // known at compile-time; compiles to 3

max(1, min(2, 3)) // known at compile time; compiles to 2

```

In theory, it is also possible to perform compile-time evaluation on certain

literals. For example:

```go

[]int{1, 2, 3}.contains(3) // compile to true?

```

We could even go further and support arbitrary compile-time execution. But

that seems a little dangerous. At best, it's useful for things like computing

a large table instead of including it in the source. But I don't think that

single case warrants such a powerful feature.

**Function hoisting (planned):**

`not` currently returns a function that wraps its argument. Instead, `not`

could generate a new top-level function definition, and replace the callsite

wholesale. For example, given these definitions:

```go

even := func(i int) bool { return i % 2 == 0 }

odd := not(even)

```

The compiled code currently looks like this:

```go

func not_int(fn func(int) bool) func(int) bool {

	return func(i int) bool {

		return !fn(i)

	}

}

even := func(i int) bool { return i % 2 == 0 }

odd := not_int(even)

```

But we could improve upon this by generating a top-level `not_even` function:

```go

func not_even(i int) bool {

	return !even(i)

}

even := func(i int) bool { return i % 2 == 0 }

odd := not_even

```

This is non-trivial, though, because `even` is not in the top-level scope; we

would need to hoist its definition into the function body of `not_even`.

Alternatively, we could simply not consider local functions for this

optimization -- but we'd still need a way to distinguish global functions from

local functions.

The motivation for this optimization is that the Go compiler is more likely to

inline top-level functions (AFAIK). Eliminating the overhead of a function

call could be significant when, say, filtering a large slice. Benchmarks are

needed to confirm that this would actually result in a significant speedup.

FAQ

---

**Why wouldn't you just use [existing generics solution]?**

There are basically two options: runtime generics (via reflection) and

compile-time generics (via codegen). They both suck for different reasons:

reflection is slow, and codegen is cumbersome. Ply is an attempt at making

codegen suck a bit less. You don't need to grapple with magic annotations or

custom types; you can just start using `filter` and `fold` as though Go had

always supported them.

**What are the downsides of this approach?**

The most obvious is that it's less flexible; you can only use the functions

and methods that Ply provides. Another annoyance is that since they behave

like builtins, you can't pass them around as first-class values. Fortunately

this is a pretty rare thing to do, and it's possible to work around it in most

cases. (For example, you can wrap the call in a `func`.)

Generating a specific implementation of every generic function call produces

very fast code, at the cost of slower compilation, larger binaries, and less

helpful error messages. Your build process will also be more complicated,

though hopefully not as complicated as writing template code and using `go

generate`. The fact of the matter is that *there is no silver bullet*: every

implementation of generics has its downsides. Do your research before deciding

whether Ply is the right approach for your project.

**What if I want to define my own generic functions?**

Sorry, that's not in the cards. The purpose of Ply is to make polymorphism as

painless as possible. Supporting custom generics would mean defining some kind

of template syntax, and that adds a lot of complexity to the language.

Restricting the set of generic functions also allows the Ply compiler to apply

deep optimizations, such as pipelining.

I understand that this is a controversial position, and Ply's set of functions

may not suit everyone's needs. My rationale is that by adding a small set of

new functions, Go can be made much more productive without becoming any harder

to parse (by computer or by human). If you have suggestions for new functions,

[open an issue](https://github.com/lukechampine/ply/issues) and I'll consider

adding them.

**What about generic data structures?**

Go seems pretty productive without them. Slices and maps are sufficient for

the vast majority of programs. Adding new generic data structures would

complicate Go's syntax (do we overload `make` for our new `RedBlackTree`

type?) and I really want to avoid that. Go's simplicity is one of its biggest

strengths.

**How does Ply interact with the existing Go toolchain?**

One nice thing about Ply is that because it has the same syntax as Go, many

tools built for Go will "just work" with Ply. For example, you can run `gofmt`

and `golint` on `.ply` files. Other tools (like `go vet`) are pickier about

their input filenames ending in `.go`, but will work if you rename your `.ply`

files. Lastly, tools that require type information will fail, because Go's

type-checker does not understand Ply builtins.

One current deficiency is that Ply will not automatically compile imported

`.ply` files. So you can't write pure-Ply packages (yet).

**Will you add support for feature X?**

Open an issue and I will gladly consider it.
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/lukechampine/ply

Awesome Lists containing this project

README