https://github.com/hikettei/Caten

[wip] Deep Learning Compiler based on Polyhedral Compiler, Light-weight IRs, and Optimizing Pattern Matcher.

Host: GitHub
URL: https://github.com/hikettei/Caten
Owner: hikettei
License: other
Created: 2024-07-22T05:17:22.000Z (11 months ago)
Default Branch: main
Last Pushed: 2025-02-16T10:47:51.000Z (4 months ago)
Last Synced: 2025-02-16T11:26:49.419Z (4 months ago)
Language: Common Lisp
Homepage: https://hikettei.github.io/Caten/
Size: 6.14 MB
Stars: 183
Watchers: 7
Forks: 11
Open Issues: 73
Metadata Files:
- Readme: Readme.md
- License: LICENSE

Awesome Lists containing this project

trackawesomelist - Caten (⭐192) - Deep Learning Compiler based on Polyhedral Compiler and Light-weight IRs, and Optimizing Pattern Matcher, written in Common Lisp (Recently Updated / [Mar 08, 2025](/content/2025/03/08/README.md))

README

# Caten

> **This repository is still in the early stages of development. Additionally, it includes many experimental approaches. Please consider this as a place to experiment with my ideas. Do not use it in a product under any circumstances.**

[![CI](https://github.com/hikettei/Caten/actions/workflows/tests_on_push.yml/badge.svg)](https://github.com/hikettei/Caten/actions/workflows/tests_on_push.yml) [![Benchmarks](https://github.com/hikettei/Caten/actions/workflows/benchmark.yml/badge.svg)](https://github.com/hikettei/Caten/actions/workflows/benchmark.yml) [![pages-build-deployment](https://github.com/hikettei/Caten/actions/workflows/pages/pages-build-deployment/badge.svg)](https://github.com/hikettei/Caten/actions/workflows/pages/pages-build-deployment) [![](https://dcbadge.limes.pink/api/server/tNawU7TN3s?style=flat)](https://discord.gg/tNawU7TN3s)

`Caten = Compile+AbstracTENsor`

Caten is an experimental deep learning compiler. Our goal is to create a solution that’s as simple as tinygrad yet as flexible as TVM—all while extending the possibilities of interactive programming into the realm of AI.

**We're looking for collaborators! Please join our Discord and let me know if you'd like to contribute!**

## Showcases

Caten is still under development, but it aims to support a wide range of models in the future—from image processing to text generation, and vision language models! Some models are already up and running.

### Examples

We have two doc files that explain how the Caten compilation pipeline works:

- [End-to-End Example](docs/end-to-end-example.lisp) Shows how the end-to-end compilation pipeline works.

- [Getting Started](docs/getting-started.lisp) An intro to Caten.

### Running LLMs

```sh
$ BACKEND=CLANG PARALLEL=8 ./roswell/caten.ros llm-example --model "gpt2" --prompt "Hello" --max-length 100
```

Give the GPT2 demo a try! You can pass compilation settings through environment variables.

For example, setting `BACKEND=CLANG` enables JIT compilation, while `JIT_DEBUG >= 2` allows you to view the schedule and the generated kernels. Setting `PARALLEL=8` divides the ScheduleGraph and compiles it in parallel.

You may still find token generation slow: we have not yet reached the stage of implementing an AutoScheduler to accelerate kernel performance, nor of GPU support. Once our IR matures enough to handle a wide range of deep learning models, we plan to focus on speeding things up!

### Lazy Evaluation

Caten is capable of generating the necessary kernels independently!

Instead of relying on OpenBLAS bindings or hand-optimized CUDA kernels, Caten avoids abstractions that would restrict us to specific libraries.

Let’s take `Matmul+Activation` Fusion as an example to illustrate this approach:

```lisp
(in-package :caten-user)

(pprint-graph
  (tensor-graph (!relu (!matmul (make-tensor `(a b)) (make-tensor `(b c))))))
```
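To make the idea of fusion concrete, here is a minimal sketch in plain Common Lisp that does not use Caten at all. `fused-matmul-relu` is a hypothetical helper written only for this illustration; it shows what it means to compute `relu(A x B)` in a single loop nest without storing the intermediate matmul result, and it is not a description of the code Caten actually generates.

```lisp
;; Plain Common Lisp, no Caten required.
;; FUSED-MATMUL-RELU is a hypothetical name used for illustration only.
(defun fused-matmul-relu (a b)
  "Return relu(A x B) for 2D arrays A (m x k) and B (k x n) in one loop nest."
  (destructuring-bind (m k) (array-dimensions a)
    (destructuring-bind (k2 n) (array-dimensions b)
      (assert (= k k2) () "Inner dimensions must match: ~a vs ~a" k k2)
      (let ((out (make-array (list m n) :initial-element 0.0)))
        (dotimes (i m out)
          (dotimes (j n)
            (let ((acc 0.0))
              ;; Accumulate one element of the matrix product.
              (dotimes (p k)
                (incf acc (* (aref a i p) (aref b p j))))
              ;; Apply the activation while the value is still in a local
              ;; variable, so no intermediate matmul array is ever stored.
              (setf (aref out i j) (max 0.0 acc)))))))))

(fused-matmul-relu #2A((1.0 -2.0) (3.0 4.0))
                   #2A((1.0 0.0) (0.0 1.0)))
;; => #2A((1.0 0.0) (3.0 4.0))
```

An unfused version would first materialize the full `m x n` product and then traverse it again to apply the activation; the fused loop touches each output element once.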

When you set `BACKEND=CLANG`, the graph is compiled to an external language. You can view the generated code by specifying `JIT_DEBUG >= 2`.

Give it a try in your REPL!

```lisp
(in-package :caten-user)

;; (setf (ctx:getenv :BACKEND) "CLANG") to set globally
(ctx:with-contextvar (:BACKEND "CLANG")
  (caten (!relu (!matmul (make-tensor `(a b)) (make-tensor `(b c))))))
```
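The backend and debug settings can probably be combined in a single form. The following is only a sketch: it assumes that `ctx:with-contextvar` accepts several key/value pairs and that the `JIT_DEBUG` environment variable is exposed as a `:JIT_DEBUG` key taking an integer. Neither assumption is confirmed by this README, so check the documentation if it does not work as written.

```lisp
(in-package :caten-user)

;; Assumption: :JIT_DEBUG mirrors the JIT_DEBUG environment variable and takes
;; an integer. Per the text above, values >= 2 show the generated code.
(ctx:with-contextvar (:BACKEND "CLANG" :JIT_DEBUG 2)
  (caten (!relu (!matmul (make-tensor `(a b)) (make-tensor `(b c))))))
```

Scoping the settings with `ctx:with-contextvar` keeps the verbose output limited to this one compilation instead of changing the global configuration.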

We’ve adopted a RISC-style architecture. Ultimately, everything in Caten boils down to [just 26 composable primitive ops](https://github.com/hikettei/Caten/blob/main/source/aasm/attrs.lisp).

When you replace `tensor-graph` with `tensor-lowered-graph`, you’ll see exactly what we mean! And by using `->dot` instead of `pprint-graph`, you can visualize that graph right in your browser!
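The two substitutions described above might look like the sketch below. Only the function names come from this README; that `->dot` accepts the same graph argument as `pprint-graph` is an assumption based on the wording "instead of".

```lisp
(in-package :caten-user)

;; Print the graph after lowering to primitive ops
;; (tensor-graph replaced with tensor-lowered-graph).
(pprint-graph
  (tensor-lowered-graph (!relu (!matmul (make-tensor `(a b)) (make-tensor `(b c))))))

;; Assumption: ->dot takes the graph in place of pprint-graph and opens a
;; visualization in the browser.
(->dot
  (tensor-graph (!relu (!matmul (make-tensor `(a b)) (make-tensor `(b c))))))
```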

Finally, our lazy evaluation doesn’t make debugging any harder. If you want to check an intermediate result, just insert `proceed` at any point—it won’t break the computation graph!

```lisp
;; These two forms are equivalent
(proceed (!sin (!cos (ax+b `(3 3) 1 0))))
(proceed (!sin (proceed (!cos (ax+b `(3 3) 1 0)))))
```

### Training Models (Experimental)

```lisp
(in-package :caten-user)

(defsequence MLP (in-features hidden-dim out-features &key (activation #'!relu))
  (Linear in-features hidden-dim)
  (asnode activation)
  (Linear hidden-dim hidden-dim)
  (asnode activation)
  (Linear hidden-dim out-features))

(defun build-mlp-model ()
  (let* ((model (MLP 64 32 16))
         (outputs (call model (make-tensor `(b 64) :from :x)))
         (loss (!cross-entropy (!softmax outputs) (make-tensor `(b 16) :from :y)))
         (runner (caten loss)))
    (values runner (hook-optimizers runner (SGD :lr 1e-3)))))

(defun train ()
  (multiple-value-bind (runner optimizers) (build-mlp-model)
    (dotimes (i 10)
      (forward runner `(:x . ,(rand `(10 64))) `(:y . ,(rand `(10 16))) `(b . 10)) ;; replace with mnist dataloader
      (backward runner)
      (mapc #'step-optimizer optimizers)
      (mapc #'zero-grad optimizers))))
```

Although our focus is still on inference, we plan to support training as well. Training is still experimental and unstable, and I am not sure whether our backward scheduler can scale to larger and more complicated graphs. :(

## Getting Started

1. Install [Roswell](https://github.com/roswell/roswell) and a suitable IDE. (If unsure, Emacs or [Lem](https://github.com/lem-project/lem) is recommended)

2. Install [ISL (Integer Set Library)](https://libisl.sourceforge.io/) for fast kernel generation.

3. Install [libyaml](https://github.com/yaml/libyaml) for YAML parsing and emitting, if it is not already installed.

4. Install [Qlot](https://github.com/fukamachi/qlot)

5. Check out [getting-started.lisp](./docs/getting-started.lisp)

```sh
$ git clone git@github.com:hikettei/Caten.git
$ cd Caten
$ qlot install
$ qlot exec ros run
> (ql:quickload :caten)
> (in-package :caten-user)
> (proceed (!randn `(3 3)))
```

## Get Involved

1. Join our [Discord Server](https://discord.gg/tNawU7TN3s).

2. Check out our [roadmap](https://github.com/users/hikettei/projects/2).

3. Create a PR

Caten is a project that started only a few months ago. We are currently in the stage of building a solid foundational library. Here’s what we’re looking for:

- Feature additions with tests (e.g., new activations, unimplemented matrix operations)

- Bug reports and additional tests.

- Refactoring of the core compiler components

- Improving the documentation

etc...

Before contributing, please note that there is no linter here. Make an effort to adhere to the [Google Common Lisp Style Guide](https://google.github.io/styleguide/lispguide.xml). Changes that do not follow it should be rejected during review.

## Roadmap

### Supported Models

- **Generative AI**

  - [x] GPT2

  - [ ] Llama3

  - [ ] TinyLLAMA

  - [ ] StableDiffusion

  - [ ] QwenVL2

- **Classification**

  - [x] MobileNetV2

  - [ ] MobileNetV3

  - [x] ResNet18/ResNet34/ResNet50

  - [ ] VIT_B_16

- **Segmentation**

  - [ ] CenterNet

- **Detection**

  - [ ] YOLOv3

  - [ ] YOLOv7

### Supported Formats

- [x] Common Lisp Frontend (caten/api)

- [x] ONNX (caten/onnx)

- [x] GGUF (caten/gguf)

### Quantization

- [x] Support Dequantization from GGUF

- [ ] Support QOPs

### Training

- [x] Autodiff

- [ ] Fast Autodiff

- [x] Support Training (But still limited)

- [ ] Distributed Training

### Accelerators (caten/byoc)

- [x] LISP VM (BACKEND=LISP)

- [x] LISP JIT (BACKEND=NATIVE)

- [x] CLANG JIT (BACKEND=CLANG)

- [x] METAL (BACKEND=METAL)

- [ ] WebGPU (BACKEND=WEBGPU)

- [ ] CUDA (BACKEND=CUDA)

- [ ] LLVM (BACKEND=LLVM)

- [ ] OpenCL (BACKEND=OPENCL)

- [ ] Finish AutoScheduler (Polyhedral Compiler + BEAM Search)

### Runtimes

- [x] LISP RUNTIME

- [ ] Exported Lisp Runtime (BACKEND=NATIVE)

- [ ] Exported to dylib (BACKEND=CLANG)

- [ ] JavaScript Runtime (BACKEND=WEBGPU)

## Running tests

Before running the test suite, install Python, NumPy, and PyTorch with `make install_extra`. Unless a version is specified, install the latest one.

```sh
$ make install_extra # extra dependencies for running tests
$ make test
```
