Question: computer architecture li $t2, 32 # We'll stop when we've inserted 32 pieces of data li $t0, 9000 # Data memory starts at 8192, but

computer architecture
li $t2, 32 # We'll stop when we've inserted 32 pieces of data li $t0, 9000 # Data memory starts at 8192, but 9000 is easier to think about. defineMatrices : addi $t1, $t1, 1 # Increment the value to be stored in memory sw $t1, 0($t0) # Store the current value (tl) in the current location (to) addi $t0,$t0, 4 # Increment the data location bne $t1, $t2, defineMatrices # If we haven't yet finished defining, go back. # MAIN PROGRAM FLOW: # 1. Initialize row and column to (0,0). # 2. Load the first element of Matrix l's relevant row and Matrix 2's column # 3. Add the product of those elements to the output element. # 4. Repeat 2-3 for the second, third, and fourth elements. # 5. Store output element in data memory. # 6. Increment row. If row=4, then row=0, column++. If column = 4, stop. # 1. Initialize row and column to (0,0). li $a0, 0# row li Sal, 0 # column # 2-4... nextElement: # li $a2, 0# multiplying element number li $v0, 0 # output element number nextSubelement: # get the location in data memory for Matrix l's element sll $t1, $a0, 4 # the row number times 16... sll $t2, $a2, 2 # ... plus the row number times 4... add $t4, $t1, $t2 # ...is the location in data memory. (other than the +9000) # get the location in data memory for Matrix 2's element sll $ti, Sal, 2 # the column number times 2... sll $t2, $a2, 4 # ... plus the element number times 16... add $t5, $t1, $t2 # ...is the location in data memory. (other than the +9064) # actually load the data lw $t4, 9000 ($t4) lw $t5, 9064($t5) mult $t4, $t5 mflo $t5 add $v0, $v0, $t 5 addi $a2, $a2, 1 # increment the multiplying element number bne $a2, 4, nextSubelement # if we're not already at 4, go back # figure out where we're putting this data sll $t1, $a0, 4 # row num times 16 sll $t2, $al, 2 # plus col num times 4 add $t3, $t1,$t2 # is the location. (other than the +9128) sw $v0, 9128 ($t 3) # increment row addi $al, $al, 1 # if row = 4, then row = 0, column++ li $t1, 4 bne $a0, $ti, rowIsNot Four li $a0, 0 addi $al, $al, 1 rowIsNot Four: # if column != 4, keep going li $t1, 4 bne $al, $t1, nextElement Question#1: For the attached code (multiplying two 4x4 matrices) assume clk= 200MHz Instructions CPI MUL, li 5 Other ALU 2 Branch 3 LW, SW 7 1) What is the average CPI ? 2) What is the time percentage for each instruction group of the total execution time? 3) Assume after adding a cache the effective CPI for LW and SW changed to 1 cycle, recalculate questions 1 and 2 based on the new assumption
Step by Step Solution
There are 3 Steps involved in it
Get step-by-step solutions from verified subject matter experts
