Question: 4. Weight Quantization Assuming your design needs to fetch N weights from the memory and the average memory access bandwidth is S bytes per second,

4. Weight Quantization Assuming your design needs to fetch N weights from the memory and the average memory access bandwidth is S bytes per second, calculate the data transfer latency (in units of second) when using FLOAT32 to represent each weight. Then, calculate the data transfer latency if we apply quantization to convert the model to INT8. 5. Weight Quantization vs Weight Clustering Besides quantization, we can also compress the model by clustering the weight 10 values into M groups. As shown in Fig. 5, we can convert the N weights in the last sub-question to N indices in the cluster index matrix and M centroids. In this example, N=44=16 and M=4. If we keep using FLOAT32 to represent the centroids and log2(M) bits to represent each index, what is the maximum value of M to gain lower data transfer latency compared to the quantized model in the last sub-question? (assuming N=64,M must be a power of 2, and the bandwidth is still S bytes per second) viusi Figure 5: Example of weight clustering, credit to Oreilly book here

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

Provide a summary technical report with your own words about Pipelined Execution which is also named as Instruction Level Parallelism, addressing mainly the following areas: 1. What is Pipelined...

can someone solve this Modern workstations typically have memory systems that incorporate two or three levels of caching. Explain why they are designed like this. [4 marks] In order to investigate...

A creative engineer suggests structuring the TLB so that not all the bits of the presented address need match to result in a hit. Suggest how this might be achieved, and what might be the costs and...

software projects failed. In the run-up to Y2K, most of the world's large companies claimed that fixing the Millennium Bug was a large project whose success was critical to their survival. One would...

ttth Suppose that the sequence of bags {Bn | n N} is recursively enumerated by the computable function e(n, x) = fn(x), [7 marks] Hence prove that the set of all recursive bags cannot be recursively...

Show the parse trees for the two parses that the grammar assigns for sentence S1. S1: the train station bus rumbles [3 marks] (b) Give an algorithm for a bottom-up passive chart parser without...

re Regular Languages and Finite Automata (a) Let L be the set of all strings over the alphabet {a, b} that end in a and do not contain the substring bb. Describe a deterministic finite automaton...

1 Distributed Systems In a distributed electronic conference application each participant has a replica of a shared whiteboard. Only one user at a time may write to the whiteboard, after which the...

(iii) Now express your decision rule instead using only the quantities p(Ck), p(Cj ), p(x|Ck), p(x|Cj ), and relate it to the diagram above. [2 marks] (iv) If the classifier decision rule assigns...

Consider the neural network model MLP0 from Figure 7.5. That model has 20 Mweights in five fully connected layers (neural network researchers count the input layer as if it were a layer in the stack,...

1. The faces of the thin square plate in Fig. 297 with side a = 24 are perfectly insulated. The upper side is kept at 25C and the other sides are kept at 0C. Find the steady-state temperature u (x,...

Using loop analysis, find Vo in the circuit infigure. 1 knV 2 kn: 1 kn 2 kn 2 mA 1 kN Vo 12 v(+ 2V ww ww

Which of the following is true when comparing next month s estimates to the most recent month? Unit Sales Total Sales Variable Expenses Net Operating Income A . Decrease Decrease Increase Decrease B...

A company is preparing completing their Cash Budget. The following data has been prepared for cash receipts and payments. January February March Cash receipts $1,061,200 $1,182,400 $1,091,700 Cash...