Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/minio/c2goasm
C to Go Assembly
asm clang gcc go golang llvm plan9 runtime
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/minio/c2goasm
- Owner: minio
- License: apache-2.0
- Archived: true
- Created: 2017-03-24T19:30:35.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2021-11-27T14:34:32.000Z (about 3 years ago)
- Last Synced: 2024-04-14T12:06:00.083Z (8 months ago)
- Topics: asm, clang, gcc, go, golang, llvm, plan9, runtime
- Language: Go
- Size: 171 KB
- Stars: 1,298
- Watchers: 50
- Forks: 108
- Open Issues: 12
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- cybersecurity-golang-security - c2goasm - C to Go Assembly (Assembly)
- awesome-go-security - c2goasm - C to Go Assembly (Assembly)
README
# c2goasm: C to Go Assembly
## Introduction
This is a tool to convert assembly as generated by a C/C++ compiler into Golang assembly. It is meant to be used in combination with [asm2plan9s](https://github.com/minio/asm2plan9s) in order to automatically generate pure Go wrappers for C/C++ code (that may for instance take advantage of compiler SIMD intrinsics or `template<>` code).
Mode of operation:
```
$ c2goasm -a /path/to/some/great/c-code.s /path/to/now/great/golang-code_amd64.s
```

You can optionally format the code nicely using [asmfmt](https://github.com/klauspost/asmfmt) by passing in an `-f` flag.
This project has been developed as part of developing a Go wrapper around [Simd](https://github.com/fwessels/go-cv-simd). However, it should also work with other projects and libraries. Keep in mind, though, that it is not intended to 'port' a complete C/C++ project in a single action but rather to do it on a case-by-case basis per function/source file (and create accompanying high level Go code to call into the assembly code).
## Command line options
```
$ c2goasm --help
Usage of c2goasm:
  -a    Immediately invoke asm2plan9s
  -c    Compact byte codes
  -f    Format using asmfmt
  -s    Strip comments
```

## A simple example
Here is a simple C function doing an AVX2 intrinsics computation:
```
#include <immintrin.h> // AVX2/FMA intrinsics

void MultiplyAndAdd(float* arg1, float* arg2, float* arg3, float* result) {
    __m256 vec1 = _mm256_load_ps(arg1);
    __m256 vec2 = _mm256_load_ps(arg2);
    __m256 vec3 = _mm256_load_ps(arg3);
    __m256 res  = _mm256_fmadd_ps(vec1, vec2, vec3); // vec1 * vec2 + vec3
    _mm256_storeu_ps(result, res);
}
```

Compiling into assembly gives the following:
```
__ZN14MultiplyAndAddEPfS1_S1_S1_: ## @_ZN14MultiplyAndAddEPfS1_S1_S1_
## BB#0:
    push         rbp
    mov          rbp, rsp
    vmovups      ymm0, ymmword ptr [rdi]
    vmovups      ymm1, ymmword ptr [rsi]
    vfmadd213ps  ymm1, ymm0, ymmword ptr [rdx]
    vmovups      ymmword ptr [rcx], ymm1
    pop          rbp
    vzeroupper
    ret
```

Running `c2goasm` will generate the following Go assembly (e.g. saved in `MultiplyAndAdd_amd64.s`):
```
//+build !noasm !appengine
// AUTO-GENERATED BY C2GOASM -- DO NOT EDIT

TEXT ·_MultiplyAndAdd(SB), $0-32

    MOVQ vec1+0(FP), DI
    MOVQ vec2+8(FP), SI
    MOVQ vec3+16(FP), DX
    MOVQ result+24(FP), CX

    LONG $0x0710fcc5             // vmovups ymm0, yword [rdi]
    LONG $0x0e10fcc5             // vmovups ymm1, yword [rsi]
    LONG $0xa87de2c4; BYTE $0x0a // vfmadd213ps ymm1, ymm0, yword [rdx]
    LONG $0x0911fcc5             // vmovups yword [rcx], ymm1

    VZEROUPPER
    RET
```

This needs to be accompanied by the following Go code (in `MultiplyAndAdd_amd64.go`):
```
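// Declaration of the assembly routine; the body lives in MultiplyAndAdd_amd64.s.
//go:noescape
func _MultiplyAndAdd(vec1, vec2, vec3, result unsafe.Pointer)

// High-level wrapper that passes the four pointers to the assembly routine.
func MultiplyAndAdd(someObj Object) {
	_MultiplyAndAdd(someObj.GetVec1(), someObj.GetVec2(), someObj.GetVec3(), someObj.GetResult())
}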
```

As you may have gathered, the amd64.go file needs to be in place in order for the argument names to be derived (and to allow `go vet` to succeed).
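
To try out the calling pattern without the generated assembly, the wrapper can be sketched against a pure-Go stand-in. In the sketch below, `multiplyAndAddGo` is a hypothetical placeholder with the same signature as the `_MultiplyAndAdd` stub, and the slice-based `MultiplyAndAdd` wrapper is an assumption for illustration rather than the project's actual API. The stand-in rounds after the multiply and again after the add, so results may differ in the last bit from a fused multiply-add.

```go
package main

import (
	"fmt"
	"unsafe"
)

// lanes is the number of float32 values held by one 256-bit AVX register.
const lanes = 8

// multiplyAndAddGo is a hypothetical pure-Go stand-in for the assembly stub
// _MultiplyAndAdd. It takes the same four pointer-sized arguments (4 x 8
// bytes on amd64, the 32 in "$0-32") and computes vec1[i]*vec2[i] + vec3[i].
func multiplyAndAddGo(vec1, vec2, vec3, result unsafe.Pointer) {
	a := (*[lanes]float32)(vec1)
	b := (*[lanes]float32)(vec2)
	c := (*[lanes]float32)(vec3)
	r := (*[lanes]float32)(result)
	for i := range r {
		r[i] = a[i]*b[i] + c[i]
	}
}

// MultiplyAndAdd is the high-level wrapper: it checks that every slice holds
// at least one register's worth of data, then passes the address of each
// slice's first element as a pointer argument.
func MultiplyAndAdd(vec1, vec2, vec3, result []float32) error {
	for _, s := range [][]float32{vec1, vec2, vec3, result} {
		if len(s) < lanes {
			return fmt.Errorf("need at least %d elements, got %d", lanes, len(s))
		}
	}
	multiplyAndAddGo(
		unsafe.Pointer(&vec1[0]), unsafe.Pointer(&vec2[0]),
		unsafe.Pointer(&vec3[0]), unsafe.Pointer(&result[0]),
	)
	return nil
}

func main() {
	a := []float32{1, 2, 3, 4, 5, 6, 7, 8}
	b := []float32{2, 2, 2, 2, 2, 2, 2, 2}
	c := []float32{1, 1, 1, 1, 1, 1, 1, 1}
	res := make([]float32, lanes)
	if err := MultiplyAndAdd(a, b, c, res); err != nil {
		panic(err)
	}
	fmt.Println(res) // [3 5 7 9 11 13 15 17]
}
```

Keeping the length check in the Go wrapper matters because the assembly routine reads and writes a full 256-bit register regardless of how long the slices are.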
## Benchmark against cgo
We have run benchmarks of `c2goasm` versus `cgo` for both Go version 1.7.5 and 1.8.1. You can find the `c2goasm` benchmark test in `test/` and the `cgo` test in `cgocmp/` respectively. Here are the results for both versions:
```
$ benchcmp ../cgocmp/cgo-1.7.5.out c2goasm.out
benchmark                      old ns/op     new ns/op     delta
BenchmarkMultiplyAndAdd-12     382           10.9          -97.15%
```
```
$ benchcmp ../cgocmp/cgo-1.8.1.out c2goasm.out
benchmark                      old ns/op     new ns/op     delta
BenchmarkMultiplyAndAdd-12     236           10.9          -95.38%
```

As you can see, Golang 1.8 has made a significant improvement (38.2%) over 1.7.5, but it is still about 20x slower than calling directly into assembly code as wrapped by `c2goasm`.
## Converted projects
- [go-cv-simd (WIP)](https://github.com/fwessels/go-cv-simd)
## Internals
The basic process is to set up the stack and registers (in the prologue) the way the C code expects them, and upon exiting the subroutine (in the epilogue) to revert to the golang world and pass a return value back if required. In more detail:
- Define assembly subroutine with proper golang decoration in terms of needed stack space and overall size of arguments plus return value.
- Function arguments are loaded from the golang stack into registers and prior to starting the C code any arguments beyond 6 are stored in C stack space.
- Stack space is reserved and set up for the C code. Depending on the C code, the stack pointer may be aligned on a certain boundary (especially needed for code that takes advantage of SIMD instructions such as AVX).
- A constants table is generated (if needed) and any `rip`-based references are replaced with proper offsets to where Go will put the table.

## Limitations
- Arguments need (for now) to be 64-bit size, meaning either a value or a pointer (this requirement will be lifted)
- Maximum of 14 arguments (hard limit -- if you hit this maybe you should rethink your API anyway...)
- Generally no `call` statements (thus inline your C code) with a couple of exceptions for functions such as `memset` and `memcpy` (see `clib_amd64.s`)

## Generate assembly from C/C++
For projects using cmake, for example, here is how to see a list of assembly targets:
```
$ make help | grep "\.s"
```

To see the actual command to generate the assembly:
```
$ make -n SimdAvx2BgraToGray.s
```

## Supported golang architectures
For now just the AMD64 architecture is supported. Also ARM64 should work just fine in a similar fashion but support is lacking at the moment.
## Compatible compilers
The following compilers have been tested:
- `clang` (Apple LLVM version) on OSX/darwin
- `clang` on linux

Compiler flags:
```
-masm=intel -mno-red-zone -mstackrealign -mllvm -inline-threshold=1000 -fno-asynchronous-unwind-tables -fno-exceptions -fno-rtti
```

| Flag | Explanation |
|:----------------------------------| :--------------------------------------------------|
| `-masm=intel` | Output Intel syntax for assembly |
| `-mno-red-zone` | Do not write below stack pointer (avoid [red zone](https://en.wikipedia.org/wiki/Red_zone_(computing))) |
| `-mstackrealign` | Use explicit stack initialization |
| `-mllvm -inline-threshold=1000` | Higher limit for inlining heuristic (default=225) |
| `-fno-asynchronous-unwind-tables` | Do not generate unwind tables (for debug purposes) |
| `-fno-exceptions` | Disable exception handling |
| `-fno-rtti` | Disable run-time type information |

The following flags are only available in `clang -cc1` frontend mode (see below):

| Flag | Explanation |
|:----------------------------------| :------------------------------------------------------------------|
| `-fno-jump-tables` | Do not use jump tables as may be generated for `switch` statements |

#### `clang` vs `clang -cc1`
As per the clang [FAQ](https://clang.llvm.org/docs/FAQ.html#driver), `clang -cc1` is the frontend, and `clang` is a (mostly GCC compatible) driver for the frontend. To see all options that the driver passes on to the frontend, use `-###` like this:
```
$ clang -### -c hello.c
"/usr/lib/llvm/bin/clang" "-cc1" "-triple" "x86_64-pc-linux-gnu" etc. etc. etc.
```

#### Command line flags for clang
To see all command line flags use either `clang --help` or `clang --help-hidden` for the clang driver or `clang -cc1 -help` for the frontend.
#### Further optimization and fine tuning
Using the LLVM optimizer ([opt](http://llvm.org/docs/CommandGuide/opt.html)) you can further optimize the code generation. Use `opt -help` or `opt -help-hidden` for all available options.
An option can be passed in via `clang` using the `-mllvm` flag, such as `-mllvm -inline-threshold=1000` as discussed above.
Also LLVM allows you to tune specific functions via [function attributes](http://llvm.org/docs/LangRef.html#function-attributes) like `define void @f() alwaysinline norecurse { ... }`.
#### What about GCC support?
For now GCC code will not work out of the box. However there is no reason why GCC should not work fundamentally (PRs are welcome).
## Resources
- [A Primer on Go Assembly](https://github.com/teh-cmc/go-internals/blob/master/chapter1_assembly_primer/README.md)
- [Go Function in Assembly](https://github.com/golang/go/files/447163/GoFunctionsInAssembly.pdf)
- [Stack frame layout on x86-64](http://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64)
- [Compiler Explorer (interactive)](https://go.godbolt.org/)

## License
c2goasm is released under the Apache License v2.0. You can find the complete text in the file LICENSE.
## Contributing
Contributions are welcome; please send PRs for any enhancements.