Translate the code shown below (the nested-loop version) into our DLX vector instruction set. Assume:
Vector registers of length 8
Load unit has a startup of L clocks
Adder unit has a startup of A clocks
Multiplier unit has a startup of M clocks
For vectors of length N, compute the number of clock cycles needed to execute the inner loop (the vector operations), first for normal execution and then with chaining of loads, stores, additions, and multiplications allowed. How much speedup do we achieve with chaining?
    low = 0;
    VL = (n % MVL);                        /*find odd-size piece using modulo op %*/
    for (j = 0; j <= (n/MVL); j = j+1) {   /*outer loop*/
        for (i = low; i < (low+VL); i = i+1)   /*runs for length VL*/
            Y[i] = a * x[i] + Y[i];        /*main operation*/
        low = low + VL;                    /*start of next vector*/
        VL = MVL;                          /*reset the length to maximum vector length*/
    }
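Before translating the loop to vector instructions, it can help to confirm that the strip-mined form touches every element exactly once. The following is a minimal C sketch of the loop above, assuming MVL = 8 (to match the 8-element vector registers), a starting index of 0, and double-precision data. The function name daxpy_strip_mined is hypothetical and is not part of the original problem.

```c
#include <stddef.h>

#define MVL 8  /* assumed maximum vector length: 8-element vector registers */

/* Hypothetical helper: strip-mined Y = a*x + Y.
 * The first pass handles the odd-size piece of n % MVL elements
 * (possibly zero); every later pass handles a full MVL elements. */
static void daxpy_strip_mined(int n, double a, const double *x, double *Y)
{
    int low = 0;        /* index of the first element of the current strip */
    int VL = n % MVL;   /* length of the odd-size first strip */

    for (int j = 0; j <= n / MVL; j = j + 1) {      /* one pass per strip */
        /* On a vector machine this inner loop is what becomes the
         * vector load / multiply / add / store sequence of length VL. */
        for (int i = low; i < low + VL; i = i + 1)
            Y[i] = a * x[i] + Y[i];
        low = low + VL;  /* start of the next strip */
        VL = MVL;        /* all remaining strips are full length */
    }
}
```

For example, with n = 27 the passes cover 3, 8, 8, and 8 elements. With n = 16 the first pass is empty and two full passes follow. Only the inner loop is vectorized, so the cycle counts asked for apply to each pass with its own VL.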