A deep learning accelerator ASIC design that classifies images from the MNIST handwritten digit dataset.
Source: Wikipedia - MNIST database
Design implementation for Tiny Tapeout.
Thanks to the Columbus IEEE Joint Chapter of the Solid-State Circuits and Circuits and Systems Societies!
Example:
Input images from the MNIST dataset are preprocessed by a Raspberry Pi and transmitted to the ASIC. The images in MNIST are 28x28 grayscale images. As part of the preprocessing step, each image is reduced to a 14x14 black/white image to cut the amount of data that must be transmitted to the ASIC and to reduce the complexity of the neural network. Since the images are 14x14 (196 pixels), an 8-pin interface (ui_in) is used that transmits 7 pixels at a time over 28 clock cycles per image. The remaining bit, the most significant bit (MSB), is an active-low signal pulled low to start transmitting a new image.
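This transmission format lends itself to a simple host-side packer. Below is a minimal Python sketch (a hypothetical helper, not the project's actual code); the row-major pixel order and the bit ordering within each 7-pixel word are assumptions:

```python
# Hypothetical host-side packer: a 14x14 binary image (196 pixels) becomes
# 28 bytes for ui_in, 7 pixels per clock, with bit 7 (active-low) pulled
# low only on the first word to signal the start of a new image.
import numpy as np

def pack_image(image: np.ndarray) -> list[int]:
    """Flatten a 14x14 0/1 image into 28 bytes for the ui_in bus."""
    assert image.shape == (14, 14)
    bits = image.flatten()                 # 196 pixels, row-major (assumed)
    words = []
    for i in range(28):                    # 28 clock cycles per image
        word = 0
        for b in bits[i * 7:(i + 1) * 7]:  # pixels packed into bits [6:0]
            word = (word << 1) | int(b)
        if i != 0:
            word |= 0x80                   # MSB high = not a new image
        # the first word leaves the MSB low (active-low start-of-image)
        words.append(word)
    return words
```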
A preprocessing Python script (utility.py) is provided to convert the standard MNIST images into the reduced data format used in this project. The script trains the network, tests it, converts the PyTorch implementation into Verilog, and generates cocotb unit tests directly from the MNIST dataset.
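The exact reduction used by utility.py is not detailed here; one plausible sketch, assuming 2x2 average pooling followed by a fixed threshold:

```python
# Illustrative preprocessing sketch (utility.py's actual method may differ):
# reduce a 28x28 grayscale MNIST image to 14x14 black/white via 2x2
# average pooling and a threshold.
import numpy as np

def reduce_image(img28: np.ndarray, threshold: int = 128) -> np.ndarray:
    """Return a 14x14 array of 0/1 pixels from a 28x28 uint8 image."""
    assert img28.shape == (28, 28)
    pooled = img28.reshape(14, 2, 14, 2).mean(axis=(1, 3))  # 2x2 block means
    return (pooled >= threshold).astype(np.uint8)           # binarize
```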
The network is based on the PyTorch MNIST example:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, 3, 1)   # original example: 1, 32, 3, 1
        self.conv2 = nn.Conv2d(16, 32, 3, 1)  # original example: 32, 64, 3, 1
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(800, 128)        # original example: 9216
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output
```
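The fc1 input width of 800 follows from the 14x14 input: each 3x3 convolution trims 2 pixels (14 → 12 → 10), the 2x2 max pool halves that to 5x5, and 32 channels × 5 × 5 = 800. A quick sanity check of the shapes:

```python
# Sanity-check the layer dimensions for a 14x14 single-channel input.
net = Net()
net.eval()                     # disable dropout for a deterministic pass
x = torch.zeros(1, 1, 14, 14)  # batch of one 14x14 image
with torch.no_grad():
    out = net(x)
print(out.shape)               # torch.Size([1, 10]): one log-probability per digit
```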
The network is implemented in Verilog as a main file (project.v) with three supporting files: readimage.v, neuralnetwork.v, and decoder.v.
Goal: compare the chip's classification results against an identical Python-based neural network implementation.
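A cocotb-style test can make that comparison directly. The sketch below is illustrative only: the clock period, wait time, weights file, test-image file, and output encoding on uo_out are all assumptions, and the tests generated by utility.py may differ. Signal names ui_in, uo_out, and clk follow the Tiny Tapeout template, and the helpers pack_image and reduce_image are the sketches shown earlier.

```python
# Sketch of a chip-vs-model comparison test (assumptions noted inline).
import cocotb
import numpy as np
import torch
from cocotb.clock import Clock
from cocotb.triggers import ClockCycles

@cocotb.test()
async def compare_chip_vs_model(dut):
    cocotb.start_soon(Clock(dut.clk, 10, units="us").start())
    img28 = np.load("sample_digit.npy")              # hypothetical 28x28 test image
    img14 = reduce_image(img28)                      # helper sketched earlier

    # Reference result from the identical PyTorch network.
    model = Net()
    model.load_state_dict(torch.load("weights.pt"))  # hypothetical weights file
    model.eval()
    with torch.no_grad():
        x = torch.tensor(img14, dtype=torch.float32).reshape(1, 1, 14, 14)
        expected = model(x).argmax(dim=1).item()

    # Drive the image into the ASIC, 7 pixels per clock for 28 clocks.
    for word in pack_image(img14):                   # helper sketched earlier
        dut.ui_in.value = word
        await ClockCycles(dut.clk, 1)
    await ClockCycles(dut.clk, 200)                  # assumed compute latency

    assert int(dut.uo_out.value) == expected         # chip matches the model
```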