Question: Rewrite the assembly code you built in the previous exercise using the RISC-V V extension

You need to rewrite the assembly code that you built in the previous exercise using the RISC-V V
extension. Before you write the code, you are advised to study the RISC-V V extension document (pp. 10
to 31 and p. 55) to become familiar with the concepts of RISC-V vector programming. After you write the
vectorized program, you should collect the performance data and derive the related performance
statistics, as you did in the previous exercise.
As shown in the improved_version1 function, you are responsible for writing the assembly for the for
loop in C: for (...) y[i] = h[i] * x[i] + c;
NOTE: You should put your assembly code in the arraymul_improved_version1.c file, as
indicated by asm volatile( #include "arraymul_improved_version1.c" : [h]"+r"(p_h), ...);
within the improved_version1() function in exercise2_1.c.
Your code should use the RISC-V V extension and run on the Spike simulator with the specified
configuration (i.e., vlen=128, elen=16). The vectorized version is expected to improve execution
efficiency, thanks to the parallel computations performed by the vector computation engine.
NOTE: Please do not modify the rest of the header file.
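To make the vlen=128, elen=16 configuration concrete, the following is a minimal scalar sketch in C of how a strip-mined vector loop would walk the arrays. It is not the required assembly. It assumes LMUL = 1, so one vector register holds 128 / 16 = 8 elements, and the function name arraymul_stripmined is hypothetical. The comments name the RVV 1.0 instructions each step roughly corresponds to; the mnemonics accepted by your toolchain may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Configuration stated in the exercise: VLEN = 128 bits, 16-bit elements.
 * Assumption: LMUL = 1, so one register holds VLEN / SEW = 8 elements. */
enum { VLEN_BITS = 128, SEW_BITS = 16, VLMAX = VLEN_BITS / SEW_BITS };

/* Scalar model of a strip-mined loop computing y[i] = h[i] * x[i] + id.
 * Returns the number of strips (vector loop iterations) that were needed. */
size_t arraymul_stripmined(const int16_t *h, const int16_t *x, int16_t *y,
                           size_t n, int16_t id)
{
    size_t strips = 0;
    assert(VLEN_BITS % SEW_BITS == 0);

    while (n > 0) {
        /* Models vsetvli in its simplest form: vl = min(remaining, VLMAX). */
        size_t vl = n < (size_t)VLMAX ? n : (size_t)VLMAX;

        /* The inner loop stands in for one group of vector instructions:
         * two unit-stride loads (vle16.v), vmul.vv, vadd.vx, one store (vse16.v). */
        for (size_t i = 0; i < vl; i++) {
            int32_t wide = (int32_t)h[i] * (int32_t)x[i] + (int32_t)id;
            y[i] = (int16_t)wide; /* keep the low 16 bits, like the original C loop */
        }

        /* Pointer bumps and the remaining-count update are scalar bookkeeping. */
        h += vl;
        x += vl;
        y += vl;
        n -= vl;
        strips++;
    }
    return strips;
}
```

With these assumptions, an array of 10 elements takes two strips (8 elements, then 2), which is why the vector version executes far fewer loop iterations than the element-by-element baseline.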
Variables/constants defined in the header files and used in this exercise:

Name                            Definition
x[]                             Input array 1 in arraymul.h
h[]                             Input array 2 in arraymul.h
y[]                             Output array in arraymul.h
improved_version1_cycle_count   Clock cycle count in vectorsum_improved_version1.c that you need to calculate
cycle_time                      The given clock cycle time of the target RISC-V processor running at 2.6 GHz
improved_version1_cpu_time      The CPU time in vectorsum_improved_version1.c that you need to calculate
arr_size                        Size of the arrays used in this exercise
student_id                      student_id = your_student_id % 100
                                e.g., for F12345678: student_id = 12345678 % 100 = 78
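The relationship between these quantities can be sketched in C as follows. This is an illustration under stated assumptions, not the contents of the provided header: the helper names are hypothetical, the cycle count is taken as an input because the per-instruction cycle costs are not shown in this excerpt, and the result is expressed in microseconds to match the "us" unit printed by the program.

```c
#include <assert.h>

/* Clock rate stated in the exercise: 2.6 GHz. */
#define CLOCK_HZ 2.6e9

/* Cycle time is the reciprocal of the clock rate; scaled here to microseconds. */
double cycle_time_us(void)
{
    return 1.0e6 / CLOCK_HZ;
}

/* CPU time = cycle count x cycle time (hypothetical helper, result in us). */
double cpu_time_us(double cycle_count)
{
    assert(cycle_count >= 0.0);
    return cycle_count * cycle_time_us();
}

/* student_id as defined above: numeric part of the ID modulo 100. */
int student_id_from_number(long numeric_part)
{
    return (int)(numeric_part % 100);
}
```

For instance, under these assumptions 2600 cycles at 2.6 GHz correspond to about 1 microsecond of CPU time.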
Your score for this exercise is determined by the correctness of your reported performance
data and the efficiency of your code relative to the serial version from the previous exercise.
1. The values of seven counters. (28%)
add_cnt (4%)
sub_cnt (4%)
mul_cnt (4%)
div_cnt (4%)
lw_cnt (4%)
sw_cnt (4%)
others_cnt (4%)
2. The total cycle count (improved_version1_cycle_count). (4%)
3. The CPU time (improved_version1_cpu_time). (4%)
4. Achieved speedup. (14%)
If 6 < speedup, you get 14 pt.
If 4 < speedup < 6, you get 9 pt.
If 2 < speedup < 4, you get 5 pt.
If speedup < 2, you get 0 pt.
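The speedup tiers above can be written out as a small C sketch. The speedup itself is the baseline CPU time divided by the vectorized CPU time, as computed at the end of improved_version1(). The function names are hypothetical, and the statement does not say how a speedup of exactly 2, 4, or 6 is scored; this sketch assumes such boundary values fall into the lower tier.

```c
#include <assert.h>

/* Speedup = baseline CPU time / improved CPU time (same time unit for both). */
double speedup_ratio(double baseline_cpu_time, double improved_cpu_time)
{
    assert(improved_cpu_time > 0.0);
    return baseline_cpu_time / improved_cpu_time;
}

/* Points for item 4. Assumption: exact boundary values (2, 4, 6) are
 * placed in the lower tier, since the statement leaves them unspecified. */
int speedup_points(double speedup)
{
    if (speedup > 6.0) return 14;
    if (speedup > 4.0) return 9;
    if (speedup > 2.0) return 5;
    return 0;
}
```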
The improved_version1 function in exercise2_1.c is as follows.
//The code snippet for improved_version1() in exercise2_1.c
void improved_version1(){
    short int *p_h = h;
    short int *p_x = x;
    short int *p_y = y;
    short int id = student_id; // id = your_student_id % 100;
    /* original C code
    for (int i = 0; i < arr_size; i++){
        p_y[i] = p_h[i] * p_x[i] + id;
    }
    */
    asm volatile(
        #include "arraymul_improved_version1.c" // Write your code in this file: arraymul_improved_version1.c
        : [h]"+r"(p_h), [x]"+r"(p_x), [y]"+r"(p_y), [add_cnt]"+r"(add_cnt),
          [sub_cnt]"+r"(sub_cnt), [mul_cnt]"+r"(mul_cnt), [div_cnt]"+r"(div_cnt),
          [lw_cnt]"+r"(lw_cnt), [sw_cnt]"+r"(sw_cnt), [others_cnt]"+r"(others_cnt)
        : [id]"r"(id), [arr_size]"r"(arr_size)
        : "t0", "v0", "v1", "v2"
    );
    printf("\n===== Question 2-1=====\n");
    printf("output: ");
    for (int i = 0; i < arr_size; i++){
        printf("%d ", y[i]);
    }
    printf("\n");
    printf("add counter used: %d\n", add_cnt);
    printf("sub counter used: %d\n", sub_cnt);
    printf("mul counter used: %d\n", mul_cnt);
    printf("div counter used: %d\n", div_cnt);
    printf("lw counter used: %d\n", lw_cnt);
    printf("sw counter used: %d\n", sw_cnt);
    printf("others counter used: %d\n", others_cnt);
    macro_improved_version1_cycle_count
    printf("The total cycle count in this program: %.0f\n", improved_version1_cycle_count);
    macro_improved_version1_cpu_time
    printf("CPU time = %f us\n", improved_version1_cpu_time);
    // Save the CPU time of this version to a file.
    FILE *fp_1;
    fp_1 = fopen("improved_version1_cpu_time.txt", "w");
    fprintf(fp_1, "%f", improved_version1_cpu_time);
    fclose(fp_1);
    // Read the baseline CPU time (temporarily stored in speedup), then divide.
    float speedup = 0.0;
    FILE *fp_2;
    fp_2 = fopen("arraymul_baseline_cpu_time.txt", "r");
    fscanf(fp_2, "%f", &speedup);
    fclose(fp_2);
    speedup = speedup / improved_version1_cpu_time;
    printf("The V extension version is %f times faster than the baseline version\n", speedup);
}
