
AGV Task 4

Neural Networks on FPGAs


Soham Agarwal 24CH10039
After reading this paper, I find that implementing neural networks on FPGAs is challenging yet fascinating. The paper presents a basic hardware architecture for neural networks running on FPGAs and then highlights the pros and cons of doing so.

Introduction to FPGA
FPGAs, or Field Programmable Gate Arrays, have become increasingly important in recent times, alongside CPUs and GPUs, but unlike them they can perform many calculations simultaneously.
A neural network consists largely of matrix multiplications, and FPGAs can complete these in far fewer clock cycles than CPUs and GPUs. This makes them well suited to the workload.
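As a rough illustration of why this matters (my own cycle-count model, not a figure from the paper), a length-n dot product needs about n multiply-accumulate steps when done one at a time, but only about n/p steps with p parallel multiply-accumulate units:

```python
import math

def mac_cycles(n, parallel_units):
    # Idealized step count for a length-n dot product: each step,
    # every parallel multiply-accumulate unit consumes one element.
    return math.ceil(n / parallel_units)

n = 1024                   # vector length (illustrative)
print(mac_cycles(n, 1))    # sequential processor: 1024 steps
print(mac_cycles(n, 256))  # 256 parallel MAC units: 4 steps
```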
Some key advantages of FPGAs mentioned in the paper include:
1. Higher energy efficiency.
2. Parallel processing capabilities.
3. Real-time computation performance.
4. Flexibility in implementing custom algorithms.
Major companies have already started using FPGAs in their AI systems, such as Microsoft with its Bing search engine and Baidu with its speech recognition applications.

Neural Network Basics


Neural networks have multiple layers containing many neurons: an input layer, an output layer, and hidden layers in between. Each neuron is a small linear regression model in itself, computing a weighted sum of its inputs plus a bias and passing the result through an activation function; the network uses backpropagation to minimize the loss by adjusting these weights and biases.
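As a software analogy for a single neuron (the names and values below are illustrative, not from the paper), the forward computation is just a weighted sum plus bias pushed through an activation:

```python
import numpy as np

def neuron_forward(x, w, b):
    # Linear part: weighted sum of inputs plus bias (like a tiny regression).
    z = np.dot(w, x) + b
    # Nonlinear part: sigmoid activation squashes the result into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 0.3])    # inputs from the previous layer
w = np.array([0.8, 0.1, -0.4])    # learned weights
print(neuron_forward(x, w, 0.2))  # one neuron's output
```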
The paper mentions different types of neural networks:
1. Deep Neural Networks (DNNs) with multiple hidden layers
2. Recurrent Neural Networks (RNNs) with feedback loops
3. Convolutional Neural Networks (CNNs) specialized for tasks like image processing
Not only neural networks but all machine learning algorithms work by predicting what the target should be and then learning from their mistakes, over and over. One full pass of making predictions and then learning from the errors is called a training epoch, and building a good neural network can take hundreds of epochs. Making a prediction is the forward propagation, and the learning step is the backpropagation.
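A minimal sketch of this loop for a one-layer network, assuming a sigmoid output with cross-entropy loss (the loss the paper itself uses for backpropagation); the data and names are purely illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data and a single-layer network; shapes and values are illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))              # 100 samples, 3 features
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # toy binary targets
W = rng.normal(size=3) * 0.1
b = 0.0
lr = 0.1

for epoch in range(200):              # hundreds of epochs, as noted above
    a = sigmoid(X @ W + b)            # forward propagation: prediction
    delta = a - y                     # error (cross-entropy + sigmoid gradient)
    W -= lr * (X.T @ delta) / len(y)  # backward propagation: weight update
    b -= lr * delta.mean()
```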
Hardware Implementation on FPGA
In this paper, the neural network architecture is implemented on an FPGA System on Chip (SoC) platform.
This is a flexible, reusable architecture that can serve many different use cases. It uses reusable components like multiply-add banks, RAM modules, and activation function lookup tables, all of which can be reconfigured and applied to many neural networks.

The model describes forward propagation in detail: input layer to hidden layer, hidden layer to hidden layer, and hidden layer to output layer. In the first stage, input vectors are multiplied by the first hidden layer's weight matrix using the multiply-add bank, which consists of many configurable parallel multiplication and accumulation units. The results are stored in memory before being passed through activation functions implemented as lookup tables. This approach allows different activation functions like sigmoid, ReLU, or tanh to be used by simply loading different values into the lookup table, as sketched below. The output of the activation function is stored in another RAM module before being processed by the next layer. For multi-layer networks, these components are reused to process each subsequent layer, making the architecture very flexible and efficient.
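The lookup-table idea can be modeled in software: the activation is precomputed once over a grid of input values, and run-time evaluation is just an index into the table, so swapping activations only means loading a different table. This is an illustrative sketch, not the paper's hardware description:

```python
import numpy as np

def make_lut(fn, lo=-8.0, hi=8.0, entries=1024):
    # Precompute fn at `entries` evenly spaced points in [lo, hi].
    grid = np.linspace(lo, hi, entries)
    return fn(grid), lo, hi, entries

def lut_activate(z, table, lo, hi, entries):
    # Approximate the activation by nearest-entry lookup, clamping the input.
    idx = ((z - lo) / (hi - lo) * (entries - 1)).astype(int)
    return table[np.clip(idx, 0, entries - 1)]

# Loading a different table switches the activation with no other changes.
sig_table, lo, hi, n = make_lut(lambda z: 1.0 / (1.0 + np.exp(-z)))
relu_table, *_ = make_lut(lambda z: np.maximum(z, 0.0))

z = np.array([-3.0, 0.0, 2.5])
print(lut_activate(z, sig_table, lo, hi, n))   # approximate sigmoid values
print(lut_activate(z, relu_table, lo, hi, n))  # approximate ReLU values
```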

The backward propagation process implements the learning algorithm that adjusts the network weights. The paper uses cross-entropy as the loss function to calculate error derivatives for the backpropagation algorithm. The core calculation multiplies the error (the difference between predicted and actual values) by the derivative of the activation function.
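For a sigmoid or softmax output combined with cross-entropy loss, that product of error and activation derivative simplifies to predicted minus actual; a sketch with illustrative names:

```python
import numpy as np

def output_delta(y_pred, y_true):
    # Error term at the output layer for cross-entropy + sigmoid/softmax:
    # the activation derivative cancels against the loss derivative,
    # leaving predicted minus actual.
    return y_pred - y_true

def hidden_delta(delta_next, W_next, a_hidden):
    # Error propagated back to a sigmoid hidden layer: the next layer's
    # delta mapped through the weights, times sigmoid'(a) = a * (1 - a).
    return (delta_next @ W_next.T) * a_hidden * (1.0 - a_hidden)
```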

The design is implemented on the Xilinx ZU9CG FPGA SoC platform, which offers 2,520 DSP slices and 32 Mb of on-chip memory. For larger networks, multiple FPGAs can be clustered. Additionally, the paper mentions the possibility of deploying deep learning frameworks like TensorFlow directly on the 64-bit FPGA SoC platform, calling FPGA hardware resources directly.

The main components of this architecture are:
1. Multiply-add banks for matrix operations
2. RAM modules for storing intermediate results
3. Activation function lookup tables
4. Control units for managing the process flow
These components are chained under the control units to carry out forward propagation stage by stage: from input layer to hidden layer, between hidden layers, and from hidden layer to output layer. A software sketch of this reuse follows.
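As a software analogy (illustrative, not the paper's implementation), one multiply-add routine and one activation function can be driven layer by layer by a small control loop, with each intermediate result stored before the next stage, playing the role of the RAM modules:

```python
import numpy as np

def multiply_add_bank(vec, weights, biases):
    # Stand-in for the parallel multiply-accumulate units:
    # one matrix-vector product plus bias per layer.
    return weights @ vec + biases

def forward(x, layers, activate):
    # Control flow: reuse the same multiply-add bank and activation
    # for every layer, keeping each result before the next stage.
    a = x
    for W, b in layers:  # input->hidden, hidden->hidden, hidden->output
        a = activate(multiply_add_bank(a, W, b))
    return a

rng = np.random.default_rng(1)
layers = [(rng.normal(size=(4, 3)), np.zeros(4)),  # 3 inputs -> 4 hidden
          (rng.normal(size=(2, 4)), np.zeros(2))]  # 4 hidden -> 2 outputs
print(forward(np.array([1.0, -0.5, 0.2]), layers,
              lambda z: 1.0 / (1.0 + np.exp(-z))))
```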

Advantages and Potential Applications


One of the main strengths of this architecture is its scalability and adaptability. Different neural
networks can be implemented by reusing modules and making slight modifications. For larger
networks, multiple FPGAs can be clustered together.
The paper suggests that this approach could be particularly useful for:
1. Embedded systems where energy efficiency is crucial
2. Real-time applications requiring fast processing
3. Autonomous vehicles
4. Speech recognition systems
5. Other AI and machine learning applications
