Question: a) Lay out the following code sequence in convoys and compute the time the sequence takes, considering one load/store unit, one FP multiplier, one FP adder, and a vector register length of 64 elements.
vld      v1, 100(x5)      // double precision vector load
vld      v2, 200(x6)
vadd.vv  v3, v1, v2       // double precision vector add (vector, vector)
vlsd     v4, 0(x7), x4    // double precision vector load with stride
vmul.vv  v5, v4, x5       // double precision vector mul (vector, vector)
vsub.vx  v5, v5, v3       // double precision vector sub (vector, scalar)
vsub.vv  v4, v4, v2
vsd      v5, 200(x7)      // double precision vector store
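One way to approach the convoy layout is to group instructions greedily, in program order, and start a new convoy whenever the next instruction needs a functional unit that the current convoy already uses. The sketch below does that under assumptions the question does not state: chaining is available (so data dependences do not split a convoy), start-up latency is ignored, and subtracts run on the FP adder. The unit names and the function names are illustrative only.

```python
from math import ceil

# (mnemonic, functional unit) for the sequence as transcribed in the question.
# Assumption: vector subtracts execute on the FP adder.
SEQUENCE = [
    ("vld", "load_store"),
    ("vld", "load_store"),
    ("vadd.vv", "fp_add"),
    ("vlsd", "load_store"),
    ("vmul.vv", "fp_mul"),
    ("vsub.vx", "fp_add"),
    ("vsub.vv", "fp_add"),
    ("vsd", "load_store"),
]


def form_convoys(sequence):
    """Start a new convoy whenever the next instruction needs a busy unit."""
    convoys, current, busy = [], [], set()
    for mnemonic, unit in sequence:
        if unit in busy:  # structural hazard: close the current convoy
            convoys.append(current)
            current, busy = [], set()
        current.append(mnemonic)
        busy.add(unit)
    if current:
        convoys.append(current)
    return convoys


def total_cycles(num_convoys, vector_length=64, lanes=1):
    """One chime per convoy; a chime costs ceil(vector_length / lanes) cycles."""
    return num_convoys * ceil(vector_length / lanes)


convoys = form_convoys(SEQUENCE)
for number, convoy in enumerate(convoys, start=1):
    print(number, convoy)
print("cycles:", total_cycles(len(convoys)))
```

Under these assumptions the sketch yields four convoys, and with 64-element vectors on a single lane that is 4 x 64 = 256 cycles before any start-up overhead is added. A different policy (for example, no chaining) would give a different grouping.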
b) For the code sequence of part a), now consider that there are two lanes. Lay out the same sequence in convoys and compute the cycles/FLOP.
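The effect of lanes on cycles/FLOP can be sketched as follows. The sketch assumes that each convoy takes one chime of ceil(vector length / lanes) cycles, that start-up overhead is ignored, and that every FP instruction performs one FLOP per element. The inputs in the example (a 4-convoy schedule with 4 FP instructions) are hypothetical and depend on how the convoys are actually laid out.

```python
from math import ceil


def cycles_per_flop(num_convoys, fp_instructions, vector_length=64, lanes=1):
    """Cycles per floating-point operation, ignoring start-up overhead."""
    chime = ceil(vector_length / lanes)      # cycles needed by one convoy
    total = num_convoys * chime              # cycles for the whole sequence
    flops = fp_instructions * vector_length  # one FLOP per element per FP instruction
    return total / flops


# Hypothetical inputs: 4 convoys, 4 FP instructions, 64-element vectors.
one_lane = cycles_per_flop(4, 4, lanes=1)
two_lanes = cycles_per_flop(4, 4, lanes=2)
print(one_lane, two_lanes)
```

With these inputs, doubling the number of lanes halves the chime length and therefore halves the cycles/FLOP; the number of convoys itself does not change.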
c) What features are available in vector processors to support the following:
i. Conditional execution
ii. Loading the non-zero elements of a sparse matrix
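The features usually named for these two cases are vector mask (predicate) registers and gather/scatter (indexed) memory accesses. The sketch below emulates both in plain Python on small lists; it illustrates the semantics only and is not a model of any particular instruction set. All function names are hypothetical.

```python
def masked_subtract(a, b, mask):
    """Emulate a masked vector subtract: lanes whose mask bit is False keep a[i]."""
    return [x - y if m else x for x, y, m in zip(a, b, mask)]


def gather(memory, indices):
    """Indexed load: pack memory[indices[k]] into a dense vector."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Indexed store: write each packed value back to its original position."""
    for i, v in zip(indices, values):
        memory[i] = v


# Conditional execution: for each i, if a[i] != 0 then a[i] = a[i] - b[i].
a = [4.0, 0.0, 6.0, 0.0]
b = [1.0, 1.0, 1.0, 1.0]
mask = [x != 0 for x in a]            # stands in for the vector mask register
result = masked_subtract(a, b, mask)

# Sparse data: operate only on the non-zero elements through an index vector.
dense = [0.0, 7.0, 0.0, 0.0, 9.0, 0.0]
index_vector = [i for i, x in enumerate(dense) if x != 0]
packed = gather(dense, index_vector)
scatter(dense, index_vector, [2 * x for x in packed])
print(result, packed, dense)
```

In hardware the mask is applied per element by the functional unit, and the index vector drives the address generation of the load/store unit; the list comprehensions here only mimic that behaviour.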
d) What is chaining in the context of vector architecture?
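Chaining lets a dependent vector instruction start as soon as the first element of its source operand is produced, instead of waiting for the whole source vector. A minimal timing sketch, assuming a simple model in which each instruction has a fixed start-up latency and then delivers one element per cycle; the latencies used in the example (7 cycles for a multiply, 6 for an add) are illustrative and do not come from the question.

```python
def unchained_cycles(vector_length, startups):
    """Each dependent instruction waits for the previous one to finish completely."""
    return sum(startup + vector_length for startup in startups)


def chained_cycles(vector_length, startups):
    """Results are forwarded element by element, so only the start-ups add up."""
    return sum(startups) + vector_length


# Hypothetical pair: a multiply (start-up 7) feeding an add (start-up 6).
startups = [7, 6]
without_chaining = unchained_cycles(64, startups)
with_chaining = chained_cycles(64, startups)
print(without_chaining, with_chaining)
```

For 64-element vectors this model gives 141 cycles without chaining and 77 cycles with it, which is why chained instructions can share a convoy despite the data dependence.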
e) What is meant by strided access while loading a vector from memory?
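Strided access means that consecutive vector elements are fetched from addresses separated by a constant distance, rather than from adjacent memory locations. The sketch below generates the addresses for such a load; the base address, the matrix shape and the function name are assumptions chosen for illustration. In the code sequence of part a), the extra register operand of `vlsd` (`x4`) presumably plays the role of the stride.

```python
def strided_addresses(base, stride_bytes, vector_length):
    """Address of element i is base + i * stride_bytes."""
    return [base + i * stride_bytes for i in range(vector_length)]


# Hypothetical example: one column of a 4x4 row-major matrix of 8-byte doubles.
# Moving down a column skips a full row each time, so the stride is 4 * 8 bytes.
ROW_LENGTH = 4
ELEMENT_BYTES = 8
column = strided_addresses(base=0x1000,
                           stride_bytes=ROW_LENGTH * ELEMENT_BYTES,
                           vector_length=4)
print([hex(address) for address in column])
```

A unit-stride load is the special case where the stride equals the element size.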
[2] [1] [1]

Write the MIPS code after unrolling the following MIPS loop twice, and explain the benefits of loop unrolling by calculating the CPI for the original code and for the twice-unrolled code. No rescheduling is required. You must show the steps of calculating the CPI; a final answer without steps will not receive credit. (8 points)

Loop: lw   R2, 0(R1)
      add  R2, R2, R3
      sw   R2, 0(R1)
      addi R1, R1, -4
      bne  R1, R5, Loop
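The CPI comparison depends on pipeline details the question does not give, so the following is only a sketch under a stated, hypothetical stall model: every instruction issues in one cycle, an instruction that uses the result of the immediately preceding `lw` costs one extra stall cycle, and each `bne` costs one extra cycle. The unrolled body keeps the original instruction order (no rescheduling), reuses `R2` for the second copy with offset -4, and merges the two `addi` instructions into a single decrement by 8.

```python
# Each instruction: (opcode, destination register or None, source registers).
ORIGINAL = [
    ("lw", "R2", ("R1",)),        # lw   R2, 0(R1)
    ("add", "R2", ("R2", "R3")),  # add  R2, R2, R3
    ("sw", None, ("R2", "R1")),   # sw   R2, 0(R1)
    ("addi", "R1", ("R1",)),      # addi R1, R1, -4
    ("bne", None, ("R1", "R5")),  # bne  R1, R5, Loop
]

UNROLLED_TWICE = [
    ("lw", "R2", ("R1",)),        # lw   R2, 0(R1)
    ("add", "R2", ("R2", "R3")),  # add  R2, R2, R3
    ("sw", None, ("R2", "R1")),   # sw   R2, 0(R1)
    ("lw", "R2", ("R1",)),        # lw   R2, -4(R1)
    ("add", "R2", ("R2", "R3")),  # add  R2, R2, R3
    ("sw", None, ("R2", "R1")),   # sw   R2, -4(R1)
    ("addi", "R1", ("R1",)),      # addi R1, R1, -8
    ("bne", None, ("R1", "R5")),  # bne  R1, R5, Loop
]


def cycles_per_pass(body, load_use_stall=1, branch_penalty=1):
    """Cycles for one pass through the loop body under the assumed stall model."""
    cycles = len(body)
    for previous, current in zip(body, body[1:]):
        if previous[0] == "lw" and previous[1] in current[2]:
            cycles += load_use_stall  # load result needed by the very next instruction
    cycles += branch_penalty * sum(1 for ins in body if ins[0] == "bne")
    return cycles


def cpi(body):
    return cycles_per_pass(body) / len(body)


print(cycles_per_pass(ORIGINAL), cpi(ORIGINAL))
print(cycles_per_pass(UNROLLED_TWICE), cpi(UNROLLED_TWICE))
```

Under this model the original loop takes 7 cycles for 5 instructions (CPI 1.4) and the unrolled loop takes 11 cycles for 8 instructions (CPI 1.375). The larger gain shows up per array element: 7 cycles versus 5.5, because the `addi` and `bne` overhead is paid once for every two elements. A different pipeline (for example, one with branch delay slots or a longer load latency) would change these numbers.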
