Question: 3 - Deep learning accelerator
Design and implement a Verilog module called perceptron. We will use fixed point format in our simple single layer perceptron. It should follow the following specification.

Top level

Port name   Direction   Width   Description
rstn        Input       :       Active low reset
clk         Input       :       Clock
x           Input       :       Signed fixed point
x           Input       :       Signed fixed point
validin     Input       :       Inputs valid
y           Output      :       Unsigned [...] or [...]
yvalid      Output      :       Y valid, driven by design

Perceptron should also contain internally: Read Only Memory (ROM)

b   reg :   Bias     Signed fixed point
w   reg :   Weight   Signed fixed point
w   reg :   Weight   Signed fixed point

Please use the following values:

signed : wsb;
signed : wsb;
signed : b sb;

Fixed point

x, x, b, w, w all have an implicit decimal point in the middle of the [...] least significant bits, i.e. between bit [...] and bit [...]. Hence, every signed fixed point number = (value if it were a signed integer) [...], etc.

You may use the same adder and multiplier that work with signed integers, but will need to account for the decimal point. If we add two numbers of the above type, the decimal point is still [...] places from the least significant bit (LSB). But if we multiply two such numbers, the implicit decimal point is [...] places from the least significant bit, so to get the correct fixed point product, we need to shift it right by [...] in this case.

Working

The perceptron must be pipelined (each stage is one clock cycle), and computation on new inputs begins every clock cycle.

In the first stage, it uses signed multipliers to calculate the products p = wx and p = wx, one per weight and input pair.

In the second stage, it adds the p, p and b to get an intermediate signed output s.

For a deep learning accelerator we will over-provision; hence, we will use multiple multipliers and adders such that all multiplications may be finished in one stage (clock cycle) and all additions may be finished in one stage (clock cycle). For now, we just want to get it functionally correct and pipelined. At a later point we may want to refine the design to be our accelerator.
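The fixed point rules in the specification can be checked in software before any RTL is written. The sketch below is a small Python reference model, not the Verilog module itself. The word width, the number of fractional bits, and the helper names (`to_fixed`, `to_real`, `fx_add`, `fx_mul`) are all assumptions made for illustration, since the actual widths are missing from the text above.

```python
# Hypothetical format: the real widths are missing from the question text.
FRAC_BITS = 4   # assumed number of fractional bits
WORD_BITS = 8   # assumed total width of x, w and b


def to_fixed(value):
    """Encode a real number as a signed integer with FRAC_BITS fractional bits."""
    raw = round(value * (1 << FRAC_BITS))
    lo, hi = -(1 << (WORD_BITS - 1)), (1 << (WORD_BITS - 1)) - 1
    if not lo <= raw <= hi:
        raise ValueError(f"{value} does not fit in {WORD_BITS} bits")
    return raw


def to_real(raw, frac_bits=FRAC_BITS):
    """Decode a raw integer that carries frac_bits fractional bits."""
    return raw / (1 << frac_bits)


def fx_add(a, b):
    # Both operands share one scale, so plain integer addition keeps it.
    return a + b


def fx_mul(a, b):
    # The raw product carries 2 * FRAC_BITS fractional bits, so shift right
    # by FRAC_BITS (arithmetic shift, rounds toward minus infinity).
    return (a * b) >> FRAC_BITS


a = to_fixed(1.5)     # raw 24
b = to_fixed(-2.25)   # raw -36
print(to_real(fx_add(a, b)))           # -0.75
print(to_real(a * b, 2 * FRAC_BITS))   # -3.375, raw product before rescaling
print(to_real(fx_mul(a, b)))           # -3.375
```

In Verilog the same rescaling would need an arithmetic right shift on a signed operand so that negative products keep their sign; a logical shift would corrupt them.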
You may use the * and + signs for now, but at some point we may want to use the IP components and may want to pipeline the multiplier for best performance. You may save yourself future effort by using the library components now (be aware that this will change the clock cycle count; no extra marks).

After the addition, i.e. after the 2nd stage, it asserts or deasserts y based on s: y = [...] if s [...], otherwise [...]. Hence, latency = [...] clock cycles.

You must use internal registers of appropriate size to contain the complete intermediate results.
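One way to pin down the pipeline timing before writing the module is a cycle-level reference model. The Python sketch below is only that, not the requested Verilog. It assumes three register stages (multiply, add, threshold), a threshold of s > 0, four fractional bits, and made-up weight and bias values; none of these numbers survive in the question, and the names `PerceptronModel`, `W1`, `W2`, `B`, `x1` and `x2` are hypothetical.

```python
FRAC_BITS = 4            # assumed; the real widths are missing in the question
W1, W2, B = 24, -16, 8   # assumed raw ROM values: 1.5, -1.0, 0.5


class PerceptronModel:
    """Cycle-level sketch: one call to step() is one rising clock edge."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Mirrors an active-low reset clearing every pipeline register.
        self.p1 = self.p2 = 0
        self.v1 = False      # valid bit travelling with stage 1
        self.s = 0
        self.v2 = False      # valid bit travelling with stage 2
        self.y = 0
        self.y_valid = False

    def step(self, x1=0, x2=0, valid_in=False):
        # Update registers back to front so each stage reads the values
        # held before this edge, as non-blocking assignments would.
        self.y = 1 if self.s > 0 else 0       # assumed threshold: s > 0
        self.y_valid = self.v2
        self.s = self.p1 + self.p2 + B        # stage 2: sum of products and bias
        self.v2 = self.v1
        self.p1 = (W1 * x1) >> FRAC_BITS      # stage 1: rescaled products
        self.p2 = (W2 * x2) >> FRAC_BITS
        self.v1 = valid_in
        return self.y, self.y_valid


dut = PerceptronModel()
inputs = [(16, 16), (16, 48), (0, 0)]   # raw (x1, x2): (1, 1), (1, 3), (0, 0)
outputs = []
for cycle in range(len(inputs) + 2):    # two extra edges drain the pipeline
    if cycle < len(inputs):
        outputs.append(dut.step(*inputs[cycle], valid_in=True))
    else:
        outputs.append(dut.step())
print(outputs)
# [(0, False), (1, False), (1, True), (0, True), (1, True)]
```

Under these assumptions a new input pair is accepted on every edge, and its result appears with `y_valid` set on the third edge, counting the edge that sampled the inputs. The model shifts each product right before the addition, which discards the low fractional bits. A design that must hold the complete intermediate results could instead keep the full-width products and shift the bias left by the same amount before adding.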
