Q3 - Deep learning accelerator
Design and implement a Verilog module called "perceptron".
We will use fixed-point format in our simple single-layer perceptron.
It should follow the specification below.
The perceptron should also contain the following constants internally (read-only memory, ROM).
Please use the following values:
signed [7:0] w1=8'sb00000010;
signed [7:0] w2=8'sb11111110;
signed [7:0] b =8'sb11111101;
The fixed-point values x1, x2, b, w1 and w2 all have an implicit decimal point in the middle of the 4 least
significant bits, i.e. between bit 2 and bit 1.
Hence, every signed fixed-point number = (value if it were a signed integer)/4, e.g.
00000101 (binary) = 1.25 (= 5/4)
11111000 (binary) = -2 (= -8/4)
11111010 (binary) = -1.5 (= -6/4) etc.
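
As a quick check of the rule above, here is a minimal simulation-only sketch that prints the given weights and bias as real numbers. The module name fixed_point_demo and the choice of localparam for the constants are assumptions for illustration; the question does not say how the ROM should be coded.

```verilog
// Simulation-only sketch: interpret the 8-bit constants as fixed point
// with 2 fractional bits (value = signed integer / 4).
// The module name and the use of localparam are assumptions.
module fixed_point_demo;
    localparam signed [7:0] W1 = 8'sb00000010; // signed integer  2
    localparam signed [7:0] W2 = 8'sb11111110; // signed integer -2
    localparam signed [7:0] B  = 8'sb11111101; // signed integer -3

    initial begin
        // $itor converts the signed integer to a real;
        // dividing by 4.0 applies the two fractional bits.
        $display("w1 = %f", $itor(W1) / 4.0); //  2/4
        $display("w2 = %f", $itor(W2) / 4.0); // -2/4
        $display("b  = %f", $itor(B)  / 4.0); // -3/4
    end
endmodule
```

By the divide-by-4 rule, the constants correspond to w1 = 0.5, w2 = -0.5 and b = -0.75.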
You may use the same adder and multiplier that work with signed integers, but will need to
account for the decimal point. If we add two numbers of the above type, the decimal point is
still 2 places from the least significant bit (LSB). But if we multiply two such numbers, the
implicit decimal point is 4 places from the least significant bit, so to get the correct fixed point
product, we need to shift it right by 2 in this case.
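
The scaling step described above can be sketched as a small combinational module. The name fx_mul and its port names are hypothetical, and using the arithmetic shift >>> (which truncates toward negative infinity) is one possible reading of "shift it right by 2".

```verilog
// Sketch of a fixed-point multiply for operands with 2 fractional bits.
// Module and port names are assumptions, not part of the question.
module fx_mul (
    input  wire signed [7:0]  a,
    input  wire signed [7:0]  b,
    output wire signed [15:0] full,   // raw product, 4 fractional bits
    output wire signed [15:0] scaled  // product rescaled to 2 fractional bits
);
    // Both operands are signed, so * is a signed multiply
    // evaluated at the 16-bit width of the result.
    assign full = a * b;

    // >>> on a signed value is an arithmetic shift: the sign bit is kept.
    assign scaled = full >>> 2;
endmodule
```

For example, a = 00000101 (1.25) and b = 11111000 (-2) give a raw product of -40 as a signed integer; shifting right by 2 gives -10, which is -2.5 in the 2-fractional-bit format.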
Working -
The perceptron must be pipelined (each stage takes 1 clock cycle, and computation on new inputs
begins every clock cycle).
In the first stage, it uses 2 signed multipliers to calculate the products p1 = w1*x1 and p2 = w2*x2.
In the second stage, it adds p1, p2 and b to get an intermediate signed output, s.
For a deep learning accelerator we will overprovision; hence, we will use multiple multipliers
and adders such that all multiplications may be finished in 1 stage (clock cycle) and all additions
may be finished in one stage (clock cycle).
For now, we just want to get it functionally correct and pipelined. At a later point we may want
to refine the design to be our accelerator. You may use * and + sign now, but at some point we
may want to use the IP components and may want to pipeline the multiplier for best
performance. You may save yourself future effort by using the library components now (be
aware that this will change the number of clock cycles; no extra marks).
After the addition (after the 2nd stage), it asserts or de-asserts y based on s:
y = 1 if s >= 0
y = 0 otherwise
Hence, latency = 2 clock cycles.
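
Putting the specification together, one possible sketch of the two-stage pipeline is shown below. It is not an official solution. The port names clk, x1, x2 and y, the 16-bit internal widths, the absence of a reset, and the decision to rescale each product in stage 1 (rather than after the sum) are all assumptions, since the question leaves them open.

```verilog
// Sketch of a 2-stage pipelined perceptron using * and + directly.
// Port names, internal widths and the lack of a reset are assumptions.
module perceptron (
    input  wire              clk,
    input  wire signed [7:0] x1,
    input  wire signed [7:0] x2,
    output wire              y
);
    // ROM contents from the specification (2 fractional bits).
    localparam signed [7:0] W1 = 8'sb00000010; //  0.5
    localparam signed [7:0] W2 = 8'sb11111110; // -0.5
    localparam signed [7:0] B  = 8'sb11111101; // -0.75

    // Raw signed products carry 4 fractional bits.
    wire signed [15:0] m1 = W1 * x1;
    wire signed [15:0] m2 = W2 * x2;

    reg signed [15:0] p1, p2; // stage 1 registers
    reg signed [15:0] s;      // stage 2 register

    always @(posedge clk) begin
        // Stage 1: multiply, then shift right by 2 so the products
        // have the same format as b.
        p1 <= m1 >>> 2;
        p2 <= m2 >>> 2;
        // Stage 2: add the previous cycle's products and the bias.
        s  <= p1 + p2 + B;
    end

    // y is 1 when s >= 0, i.e. when the sign bit of s is clear.
    assign y = (s >= 0);
endmodule
```

Because both stages are registered on every clock edge, a new x1/x2 pair can be applied each cycle, and y for a given pair appears two clock edges after it is sampled. For instance, x1 = 00001000 (2.0) with x2 = 0 gives p1 = 1.0 and s = 1.0 - 0.75 = 0.25, so y would be 1. Until the pipeline has filled, y is undefined in simulation because no reset is modelled here.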