Squashed commit of the following:
commit 34e3732
Author: Lisa Ong <onglisa@microsoft.com>
Date:   Wed Jan 26 09:21:51 2022 +0000

    Merged PR 2391: Update quickstart example, updated docs structure per feedback

    * Teasers for transformations in the Quickstart sample (to differentiate Accera from others), with benchmarking
    * Removed the Miscellaneous section and redistributed its docs to related locations
    * Renamed the cross compilation tutorial so that it is ordered last

    Note: currently we are using dynamic navigation for Material with mkdocs, which avoids maintaining a separate nav section each time a markdown file is added/removed (this becomes unwieldy as the number of files increases). This means that filenames will need to be named in the order in which they will show up in tabs or sections.

commit 972b7fc
Author: Kern Handa <kerha@microsoft.com>
Date:   Wed Jan 26 08:34:09 2022 +0000

    Merged PR 2392: Populate Target.Models based on known devices

    Populate Target.Models based on known devices
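
    A minimal sketch of the resulting Python usage, based only on the Target constructor forms that appear in the onnx-emitter changes further down in this commit (the `acc` alias follows the README quickstart; variable names are illustrative):

```python
import accera as acc

# Known devices can be referenced either by model enum or by device name;
# both constructor forms appear in the onnx-emitter diff below.
pi4 = acc.Target(acc.Target.Model.RASPBERRY_PI_4B, category=acc.Target.Category.CPU)
pi3 = acc.Target("Raspberry Pi 3B", category=acc.Target.Category.CPU)
```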

commit 8d99afe
Author: Kern Handa <kerha@microsoft.com>
Date:   Wed Jan 26 03:00:06 2022 +0000

    Merged PR 2390: Merge multiple HAT files during project building

    Merge multiple HAT files during project building

    Related work items: #3559
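
    The user-visible pattern this relates to is the one in the updated README quickstart below: several functions added to one package and consumed through a single `hello_accera.hat`. A condensed, hedged restatement of that quickstart (function names are illustrative; this is not a description of the build internals):

```python
import accera as acc

# placeholder inputs/output, as in the README quickstart
A = acc.Array(role=acc.Array.Role.INPUT, shape=(512, 512))
B = acc.Array(role=acc.Array.Role.INPUT, shape=(512, 512))
C = acc.Array(role=acc.Array.Role.INPUT_OUTPUT, shape=(512, 512))

matmul = acc.Nest(shape=(512, 512, 512))
i, j, k = matmul.get_indices()

@matmul.iteration_logic
def _():
    C[i, j] += A[i, k] * B[k, j]

# two functions in one package: the build emits a single merged HAT interface
package = acc.Package()
package.add(matmul.create_schedule(), args=(A, B, C), base_name="matmul_naive")
tiled = matmul.create_schedule()
ii, jj = tiled.tile((i, j), (16, 16))
package.add(tiled, args=(A, B, C), base_name="matmul_tiled")
package.build(name="hello_accera", format=acc.Package.Format.HAT_DYNAMIC)
```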

commit 295d396
Author: Kern Handa <kerha@microsoft.com>
Date:   Tue Jan 25 20:36:04 2022 +0000

    Merged PR 2386: Add support for various targets

    Add support for various targets

    Related work items: #3631

commit d0eef65
Author: Lisa Ong <onglisa@microsoft.com>
Date:   Tue Jan 25 11:36:00 2022 +0000

    Merged PR 2389: [nfc] Doc typos and consistency fixes
Lisa Ong committed Jan 26, 2022
1 parent 0bfb779 commit 933e71f
Showing 52 changed files with 1,778 additions and 700 deletions.
8 changes: 4 additions & 4 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -1,7 +1,7 @@
---
name: Bug report
about: Create a report to help us improve
title: 'BUG: <enter bug title>'
title: '[BUG] <enter bug title>'
labels: ''
assignees: ''

@@ -11,7 +11,7 @@ assignees: ''
<!-- Please read our Rules of Conduct: https://opensource.microsoft.com/codeofconduct/ -->
<!-- Please search for existing issues to avoid creating duplicates. -->
<!-- Incomplete reports will lead to closing the issue. -->
<!-- Also, please test using the latest master make sure your issue has not already been fixed -->
<!-- Also, please test using the latest main and make sure your issue has not already been fixed -->

**Describe the bug**
A clear and concise description of what the bug is.
@@ -24,7 +24,7 @@ A clear and concise description of what the bug is.

**To Reproduce**
<!-- Include a detailed step by step process for recreating your issue. -->
<!-- If your issue includes code, create a [gist](https://gist.github.com/) and past the link here. -->
<!-- If your issue includes code, create a [gist](https://gist.github.com/) and paste the link here. -->
Steps to reproduce the behavior:
1.
2.
@@ -34,4 +34,4 @@ Include the full error message in text form so that we can help troubleshoot quickly
**Expected behavior**
A clear and concise description of what you expected to happen.

**What's better than filing an issue? Filing a pull request :).**
**What's better than filing an issue? Opening a pull request :).**
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/feature_request.md
@@ -1,7 +1,7 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
title: '[Feature] <enter feature request title>'
labels: ''
assignees: ''

4 changes: 2 additions & 2 deletions .github/ISSUE_TEMPLATE/question.md
@@ -1,7 +1,7 @@
---
name: Question
about: Support Questions
title: "[Question]: <enter question title>"
title: "[Q] <enter question title>"
labels: ''
assignees: ''

@@ -21,6 +21,6 @@ assignees: ''

#### Context details
<!-- Add OS, Accera version, Python version, if applicable -->
<!-- If it's too large, you can create a [gist](https://gist.github.com/) and past the link here. -->
<!-- If it's too large, you can create a [gist](https://gist.github.com/) and paste the link here. -->

### Include details of what you already did to find answers
8 changes: 7 additions & 1 deletion .gitignore
@@ -365,4 +365,10 @@ _version.py
.vscode*

# llvm setup
LLVMSetupConan.cmake
LLVMSetupConan.cmake

# docs build
docs/README.md

# iPython
.ipynb_checkpoints/
173 changes: 110 additions & 63 deletions README.md
@@ -3,7 +3,9 @@

<a href="https://pypi.org/project/accera/"><img src="https://badge.fury.io/py/accera.svg" alt="PyPI package version"/></a> <a href="https://pypi.org/project/accera/"><img src="https://img.shields.io/pypi/pyversions/accera" alt="Python versions"/></a> ![MIT License](https://img.shields.io/pypi/l/accera)

Accera is a programming model, a domain-specific programming language embedded in Python (eDSL), and an optimizing cross-compiler for compute-intensive code. Accera currently supports CPU and GPU targets and focuses on optimization of nested for-loops.
# Welcome to Accera

Accera is a compiler that enables you to experiment with loop optimizations without hand-writing Assembly code. Accera is available as a Python library and supports cross-compiling to a wide range of [processor targets](https://github.com/microsoft/Accera/blob/main/accera/python/accera/Targets.py).

Writing highly optimized compute-intensive code in a traditional programming language is a difficult and time-consuming process. It requires special engineering skills, such as fluency in Assembly language and a deep understanding of computer architecture. Manually optimizing the simplest numerical algorithms already requires a significant engineering effort. Moreover, highly optimized numerical code is prone to bugs, is often hard to read and maintain, and needs to be reimplemented every time a new target architecture is introduced. Accera aims to solve these problems.

@@ -27,98 +29,143 @@ See the [Install Instructions](https://microsoft.github.io/Accera/Install/) for

### Quickstart

#### Try Accera in your browser
In this example, we will:

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/microsoft/Accera/HEAD?labpath=docs%2Fdemos%2Fbinder%2Fquickstart.ipynb)
* Implement matrix multiplication with a ReLU activation (matmul + ReLU), commonly used in machine learning algorithms
* Generate two implementations: a naive algorithm and one with loop transformations
* Compare the timings of both implementations

No installation required.
#### Run in your browser

#### Run Accera on your local machine
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/microsoft/Accera/main?labpath=docs%2Fdemos%2Fquickstart.ipynb)

In this quickstart example, you will:
No installation is required. This will launch a Jupyter notebook with the quickstart example running in the cloud.

* Implement a simple `hello_accera` function that performs basic matrix multiplication with a ReLU activation
* Build a [HAT](https://github.com/microsoft/hat) package with a dynamic (shared) library that exports this function
* Call the `hello_accera` function in the dynamic library with some NumPy arrays, and checks against a NumPy implementation
#### Run on your machine

1. Create a Python 3 script called `quickstart.py`
1. Create a Python 3 script called `quickstart.py`:

```python
import accera as acc
import hatlib as hat
import numpy as np
```python
import accera as acc

A = acc.Array(role=acc.Array.Role.INPUT, shape=(16, 16))
B = acc.Array(role=acc.Array.Role.INPUT, shape=(16, 16))
C = acc.Array(role=acc.Array.Role.INPUT_OUTPUT, shape=(16, 16))
# define placeholder inputs/output
A = acc.Array(role=acc.Array.Role.INPUT, shape=(512, 512))
B = acc.Array(role=acc.Array.Role.INPUT, shape=(512, 512))
C = acc.Array(role=acc.Array.Role.INPUT_OUTPUT, shape=(512, 512))

matmul = acc.Nest(shape=(16, 16, 16))
i1, j1, k1 = matmul.get_indices()
# implement the logic for matmul and relu
matmul = acc.Nest(shape=(512, 512, 512))
i1, j1, k1 = matmul.get_indices()
@matmul.iteration_logic
def _():
C[i1, j1] += A[i1, k1] * B[k1, j1]

@matmul.iteration_logic
def _():
C[i1, j1] += A[i1, k1] * B[k1, j1]
relu = acc.Nest(shape=(512, 512))
i2, j2 = relu.get_indices()
@relu.iteration_logic
def _():
C[i2, j2] = acc.max(C[i2, j2], 0.0)

relu = acc.Nest(shape=(16, 16))
i2, j2 = relu.get_indices()
package = acc.Package()

@relu.iteration_logic
def _():
C[i2, j2] = acc.max(C[i2, j2], 0.0)
# fuse the i and j indices of matmul and relu, add to the package
schedule = acc.fuse(matmul.create_schedule(), relu.create_schedule(), partial=2)
package.add(schedule, args=(A, B, C), base_name="matmul_relu_fusion_naive")

matmul_schedule = matmul.create_schedule()
relu_schedule = relu.create_schedule()
# transform the schedule, add to the package
f, i, j, k = schedule.get_indices()
ii, jj = schedule.tile((i, j), (16, 16)) # loop tiling
schedule.reorder(j, i, f, k, jj, ii) # loop reordering
plan = schedule.create_plan()
plan.unroll(ii) # loop unrolling
package.add(plan, args=(A, B, C), base_name="matmul_relu_fusion_transformed")

# fuse the first 2 indices of matmul and relu
schedule = acc.fuse(matmul_schedule, relu_schedule, partial=2)
# build a dynamically-linked package (a .dll or .so) that exports both functions
print(package.build(name="hello_accera", format=acc.Package.Format.HAT_DYNAMIC))
```

package = acc.Package()
package.add(schedule, args=(A, B, C), base_name="hello_accera")
2. Ensure that you have a compiler in your PATH:

# build a dynamically-linked HAT package
package.build(name="mypackage", format=acc.Package.Format.HAT_DYNAMIC)
* Windows: Install Microsoft Visual Studio and run `vcvars64.bat` to setup the command prompt
* Linux/macOS: Install gcc

# load the package and call the function with random test input
hat_package = hat.load("mypackage.hat")
hello_accera = hat_package["hello_accera"]
Don't have a compiler handy? We recommend trying Accera in your browser instead [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/microsoft/Accera/main?labpath=docs%2Fdemos%2Fquickstart.ipynb).

A_test = np.random.rand(16, 16).astype(np.float32)
B_test = np.random.rand(16, 16).astype(np.float32)
C_test = np.zeros((16, 16)).astype(np.float32)

# compute using NumPy as a comparison
C_np = np.maximum(C_test + A_test @ B_test, 0)
3. Install Accera:

hello_accera(A_test, B_test, C_test)
```shell
pip install accera
```

# compare the result with NumPy
np.testing.assert_allclose(C_test, C_np)
print(C_test)
print(C_np)
```
4. Generate the library that implements two versions of matmul + ReLU:

2. Ensure that you have a compiler in your PATH:
```shell
python quickstart.py
```

* Windows: Install Microsoft Visual Studio and run `vcvars64.bat` to setup the command prompt
* Linux/macOS: Install gcc
5. To consume and compare the library functions, create a file called `benchmark.py` in the same location:

3. Install Accera:
```python
import hatlib as hat
import numpy as np

```shell
pip install accera
```
# load the package
hat_package = hat.load("hello_accera.hat")

4. Run the Python script:
# call one of the functions with test inputs
A_test = np.random.rand(512, 512).astype(np.float32)
B_test = np.random.rand(512, 512).astype(np.float32)
C_test = np.zeros((512, 512)).astype(np.float32)
C_numpy = np.maximum(C_test + A_test @ B_test, 0.0)

```python
python quickstart.py
```
matmul_relu = hat_package["matmul_relu_fusion_transformed"]
matmul_relu(A_test, B_test, C_test)

# check correctness
np.testing.assert_allclose(C_test, C_numpy, atol=1e-3)

# benchmark all functions
hat.run_benchmark("hello_accera.hat", batch_size=5, min_time_in_sec=5)
```

6. Run the benchmark to get the timing results:

```shell
python benchmark.py
```

#### Next Steps

The function can be optimized using [schedule transformations](https://microsoft.github.io/Accera/Manual/03%20Schedules/#schedule-transformations). The [Manual](https://microsoft.github.io/Accera/Manual/00%20Introduction/) is a good place to start for an introduction to the Accera programming model.
The [Manual](https://microsoft.github.io/Accera/Manual/00%20Introduction/) is a good place to start for an introduction to the Accera Python programming model.

In particular, the [schedule transformations](https://microsoft.github.io/Accera/Manual/03%20Schedules/#schedule-transformations) describe how you can experiment with different loop transformations with just a few lines of Python.

Finally, the `.hat` format is just a C header file containing metadata. Learn more about the [HAT format](https://github.com/microsoft/hat) and [benchmarking](https://github.com/microsoft/hat/tree/main/tools).


## How it works

In a nutshell, Accera takes the Python code that defines the loop schedule and algorithm and converts it into [MLIR](https://mlir.llvm.org/) intermediate representation (IR). Accera's compiler then takes this IR through a series of MLIR pipelines to perform transformations. The result is a binary library with a C header file. The library implements the algorithms that are defined in Python, and is compatible with the target.

To peek into the stages of IR transformation that Accera does, try replacing `format=acc.Package.Format.HAT_DYNAMIC` with `format=acc.Package.Format.MLIR_DYNAMIC` in `quickstart.py`, re-run the script, and search the `_tmp` subfolder for the intermediate `*.mlir` files. We plan to document these IR constructs in the future.
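
For example, the last line of `quickstart.py` would become (a sketch of the substitution described above; nothing else in the script changes):

```python
# emit MLIR intermediates so the lowering stages can be inspected under _tmp
print(package.build(name="hello_accera", format=acc.Package.Format.MLIR_DYNAMIC))
```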

## Documentation
Get to know Accera by reading the [Documentation](https://microsoft.github.io/Accera/).

You can find more step-by-step examples in the [Tutorials](https://microsoft.github.io/Accera/Tutorials).
Get to know Accera's concepts and Python constructs on the [Documentation](https://microsoft.github.io/Accera/) page.

## Tutorials

More step-by-step examples are available on the [Tutorials](https://microsoft.github.io/Accera/Tutorials) page, and more examples and tutorials are on the way.

## Contributions

Accera is a research platform-in-progress. We would love your contributions, feedback, questions, and feature requests! Please file a [Github issue](https://github.com/microsoft/Accera/issues/new) or send us a pull request. Please review the [Microsoft Code of Conduct](https://opensource.microsoft.com/codeofconduct/) to learn more.

## Credits

Accera is built using several open source libraries, including: [LLVM](https://llvm.org/), [toml++](https://marzer.github.io/tomlplusplus/), [tomlkit](https://github.com/sdispater/tomlkit), [vcpkg](https://vcpkg.io/en/index.html), [pyyaml](https://pyyaml.org/), and [HAT](https://github.com/microsoft/hat). For testing, we also use [numpy](https://github.com/numpy/numpy) and [catch2](https://github.com/catchorg/Catch2).

## License

This project is released under the [MIT License](https://github.com/microsoft/Accera/blob/main/LICENSE).
2 changes: 1 addition & 1 deletion accera/acc-gpu-runner/README.md
@@ -1,7 +1,7 @@
# acc-gpu-runner

The `acc-gpu-runner` tool is functionally a wrapper around `acc-opt` and `mlir-vulkan-runner`.
It takes in a Accera-emitted MLIR file produced by a Accera generator and does the following:
It takes in an Accera-emitted MLIR file produced by an Accera generator and does the following:
- Runs the Accera lowering passes like `acc-opt` does
- Runs the GPU and Vulkan passes that `mlir-vulkan-runner` does
- Runs the lowered MLIR code in a GPU JIT engine
4 changes: 2 additions & 2 deletions accera/accc/README.md
@@ -111,14 +111,14 @@ Generating for the sample in `samples/GEMM/MLAS_value/Accera_Sample.cpp`:
The above invocation will:
1. Create a directory `mlas_value_sample`
1. Create a subdirectory `mlas_value_sample/generator` and make a Accera generator CMake project there with the given Accera DSL file.
1. Create a subdirectory `mlas_value_sample/generator` and make an Accera generator CMake project there with the given Accera DSL file.
1. Build the generator
1. Run the generator with the given domain csv and custom argument values from the given config file.
1. Run `acc-opt.exe`, `mlir-translate.exe`, `llc.exe`, and `opt.exe` lowering the emitted code to a header and object file.
1. Create a subdirectory `mlas_value_sample/mlas_value_sample_lib_intermediate` and put intermediate IR files there that are the result of running the generator, `acc-opt.exe`, `mlir-translate.exe`, `llc.exe`, and `opt.exe`, which include the final header for the Accera sample.
1. Create a subdirectory `mlas_value_sample/lib` containing the project for the static library for the Accera sample.
1. Create a subdirectory `mlas_value_sample/logs` and put the `stdout` and `stderr` logs for each phase there.
1. (Because the `--main` argument was provided) Create a subdirectory `mlas_value_sample/main` and make a Accera main CMake project there with the given Accera main file and build the project.
1. (Because the `--main` argument was provided) Create a subdirectory `mlas_value_sample/main` and make an Accera main CMake project there with the given Accera main file and build the project.
1. (Because the `--run` argument was provided) Run the build main project.
Note: the intermediate files and the generator and runner projects will be named based on the `--library_name` parameter
4 changes: 2 additions & 2 deletions accera/hat/include/HATEmitter.h
@@ -22,15 +22,15 @@ template <typename StreamType>
void EnableTOML(StreamType& os)
{
os << "\n";
os << "#ifdef __TOML__";
os << "#ifdef TOML";
os << "\n";
}

template <typename StreamType>
void DisableTOML(StreamType& os)
{
os << "\n";
os << "#endif // __TOML__";
os << "#endif // TOML";
os << "\n";
}

7 changes: 4 additions & 3 deletions accera/onnx-emitter/onnx_emitter.py
@@ -142,15 +142,16 @@ def load_model(model_file):

def get_target(target_name):
if target_name == 'pi4':
return Target(model=Target.Model.RASPBERRY_PI4)
return Target(Target.Model.RASPBERRY_PI_4B, category=Target.Category.CPU)
elif target_name == 'pi3':
return Target(model=Target.Model.RASPBERRY_PI3)
return Target("Raspberry Pi 3B", category=Target.Category.CPU)
else:
return Target.HOST


def get_target_options(target):
if target.model == Target.Model.RASPBERRY_PI3:
if "Raspberry Pi" in target.name:
# TODO: Make use of the different attributes between the Pi devices
return MLASOptions(KUnroll=2,
BCacheSizeThreshold=64**1,
NumRowsInKernel=2,
4 changes: 2 additions & 2 deletions accera/onnx-emitter/test/pi3/emit_hat_package.py
@@ -218,9 +218,9 @@ def _emit_hat_package_for_model(model, package_name, target, output_dir, large_m
model = onnx.load(model)

if target == "pi4":
target_device = Target(model=Target.Model.RASPBERRY_PI4)
target_device = Target("Raspberry Pi 4B", category=Target.Category.CPU)
elif target == "pi3":
target_device = Target(model=Target.Model.RASPBERRY_PI3)
target_device = Target("Raspberry Pi 3B", category=Target.Category.CPU)
elif target == "host":
target_device = Target.HOST
