Question:
Need help with C programming.
Write code that lets you store these smaller floating-point numbers in a 32-bit integer.
INPUT: You will read in a ‘program’ and call your functions to
implement it. An example of one of these programs is:
x = 18.113
print x
y = 4.5
a = x + y
print a
z = x * y
print z
OUTPUT: The output will be the current values of the given variables at the print statements. For the above program, output would be:
x = 18.0937500000
a = 22.5937500000
z = 81.2500000000
Some of this task is already done for you. I will provide a program that reads in the given programs, saves the variable values and calls the functions (described next) that you will be implementing.
You are going to implement a 15-bit floating-point representation, where 5 bits are for the exponent and 9 are for the fraction (the remaining bit is the sign). Using bit-level operators, write functions (shown below) to help implement the program.
• Assignment statement (variable = value) – calls your function computeFP(),
which converts from a C float value to our mini-float representation (which
only uses the 15 lowest of the given 32 bits).
int computeFP(float val) { }
// input: float value to be represented
// output: integer version in our representation
o Given the number of bits, the rounding you will have to do for this
representation is pretty substantial. For this assignment, we are always
going to take the easy way and truncate the fraction (i.e. round down).
For example, the closest representable value for 18.113 (rounding down)
is 18.09375, as can be seen in the program output.
• Print statement (print variable) – uses your getFP() function to convert from
our mini-float representation to a regular C float value, and formats/prints it out
nicely.
float getFP(int val) { }
// Using the defined representation, compute and
// return the floating point value
• Add statement – for this statement, you are going to take two values in our
representation and use the same technique as described in class/comments
to add these values and return the result in our representation.
int addVals(int source1, int source2) {}
• Multiply statement – for this statement, you are going to take two values in
our representation and use the same technique as described in class/comments
to multiply these values and return the result in our representation.
int multVals(int source1, int source2) {}
Assumptions: To make your life a little easier, we are going to make the following assumptions:
- No negative numbers. The sign bit can be ignored.
- No denormalized (or special) numbers. If the given number is too small to be
represented as a normalized number, you can return 0. Same thing with
numbers that are too large.
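The bit layout described above could be sketched with a few masks and shifts. This is only an illustration: the helper names (expField, fracField, packFields) are hypothetical, and both the field order (sign in bit 14, exponent in bits 13..9, fraction in bits 8..0) and the bias of 15 are assumptions modeled on IEEE 754, since the question does not state them.

```c
/* Hypothetical helpers for the 15-bit mini-float.
 * Assumed layout: bit 14 = sign, bits 13..9 = exponent, bits 8..0 = fraction. */
enum {
    FRAC_BITS = 9,                          /* fraction width from the handout */
    EXP_BITS  = 5,                          /* exponent width from the handout */
    FRAC_MASK = (1 << FRAC_BITS) - 1,       /* 0x1FF */
    EXP_MASK  = (1 << EXP_BITS) - 1,        /* 0x1F  */
    BIAS      = (1 << (EXP_BITS - 1)) - 1   /* 15, assumed IEEE-style bias */
};

/* Extract the 5-bit exponent field. */
static int expField(int val) {
    return (int)(((unsigned int)val >> FRAC_BITS) & EXP_MASK);
}

/* Extract the 9-bit fraction field. */
static int fracField(int val) {
    return (int)((unsigned int)val & FRAC_MASK);
}

/* Combine an exponent field and a fraction field; the sign bit stays 0. */
static int packFields(int exp, int frac) {
    return ((exp & EXP_MASK) << FRAC_BITS) | (frac & FRAC_MASK);
}
```

Under these assumptions, 18.09375 = (1 + 67/512) * 2^4 would be stored with exponent field 4 + 15 = 19 and fraction field 67.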
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include "fp.h"
// input: float value to be represented
// output: integer version in our representation
//
// Perform this the same way we did in class -
// either dividing or multiplying the value by 2
// until it is in the correct range (at least 1 and less than 2).
// Your exponent (actually E) is the number of times this operation
// was performed (negative if you had to multiply).
// Deal with rounding by simply truncating the number.
// Check for overflow and underflow -
// with 5 exponent bits (bias 15), we have overflow if the exponent
// field to be stored is > 30
// For overflow (exponent field > 30), return -1
// For underflow (exponent field < 1), return 0
int computeFP(float val) {
return 2;
}
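One way the stub above might be filled in is sketched here. It assumes the handout's 5 exponent bits and 9 fraction bits, an IEEE-style bias of 15, and a layout with the exponent in bits 13..9 and the fraction in bits 8..0 (the bias and layout are assumptions). It returns -1 on overflow as the stub comment asks, even though the handout text says to return 0 for numbers that are too large.

```c
/* Sketch of computeFP(): float -> 15-bit mini-float (assumed bias 15). */
int computeFP(float val) {
    int E = 0;                          /* unbiased exponent */

    if (!(val > 0.0f)) return 0;        /* zero, negatives and NaN are not handled */

    while (val >= 2.0f) {               /* too big: halve until val < 2 */
        val /= 2.0f;
        if (++E > 15) return -1;        /* exponent field would exceed 30 */
    }
    while (val < 1.0f) {                /* too small: double until val >= 1 */
        val *= 2.0f;
        if (--E < -14) return 0;        /* exponent field would drop below 1 */
    }

    int exp  = E + 15;                          /* add the assumed bias */
    int frac = (int)((val - 1.0f) * 512.0f);    /* keep 9 bits, truncating */
    return (exp << 9) | frac;
}
```

For 18.113 the loop divides four times, leaving about 1.13206, so E = 4, the exponent field is 19, and the truncated fraction is 67.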
float getFP(int val) {
// Using the defined representation, compute the floating point
// value
// For denormalized values (including 0), simply return 0.
// For special values, return -1;
return 2.0;
}
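A minimal sketch of the decoding direction follows, under the same assumptions as the encoding (exponent in bits 13..9, fraction in bits 8..0, bias 15, none of which the question states explicitly). It scales by repeated multiplication or division by 2 instead of calling a math library routine, which is a design choice of this sketch and not a requirement of the assignment.

```c
/* Sketch of getFP(): 15-bit mini-float -> float (assumed bias 15). */
float getFP(int val) {
    unsigned int bits = (unsigned int)val;   /* avoid shifting a negative int */
    int exp  = (int)((bits >> 9) & 0x1F);    /* 5-bit exponent field */
    int frac = (int)(bits & 0x1FF);          /* 9-bit fraction field */

    if (exp == 0)  return 0.0f;              /* denormalized values, including 0 */
    if (exp == 31) return -1.0f;             /* special values */

    float result = 1.0f + (float)frac / 512.0f;   /* M = 1.fraction */
    int E = exp - 15;                             /* remove the bias */
    for (; E > 0; E--) result *= 2.0f;            /* scale up by 2^E */
    for (; E < 0; E++) result /= 2.0f;            /* or down by 2^-E */
    return result;
}
```

With exponent field 19 and fraction field 67, this returns (1 + 67/512) * 16 = 18.09375, matching the first line of the sample output. An input of -1 (the overflow marker) has all exponent bits set, so it decodes as a special value.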
int
multVals(int source1, int source2) {
// You must implement this by using the algorithm
// described in class:
// Add the exponents: E = E1+E2
// multiply the fractional values: M = M1*M2
// if M too large, divide it by 2 and increment E
// save the result
// Be sure to check for overflow - return -1 in this case
// Be sure to check for underflow - return 0 in this case
return 2;
}
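The multiplication steps listed in the stub comments could be sketched with integer arithmetic as below. The assumptions are the same as elsewhere: exponent in bits 13..9, fraction in bits 8..0, bias 15. Treating an operand with exponent field 0 as zero, and an operand with exponent field 31 as an overflow marker, are also choices of this sketch.

```c
/* Sketch of multVals(): multiply two 15-bit mini-floats (assumed bias 15). */
int multVals(int source1, int source2) {
    unsigned int b1 = (unsigned int)source1, b2 = (unsigned int)source2;
    int exp1 = (int)((b1 >> 9) & 0x1F), frac1 = (int)(b1 & 0x1FF);
    int exp2 = (int)((b2 >> 9) & 0x1F), frac2 = (int)(b2 & 0x1FF);

    if (exp1 == 31 || exp2 == 31) return -1;   /* propagate overflow marker */
    if (exp1 == 0 || exp2 == 0)   return 0;    /* anything times 0 is 0 */

    int E = (exp1 - 15) + (exp2 - 15);         /* E = E1 + E2 */

    /* Mantissas with the implicit leading 1, each scaled by 2^9, so the
     * product is scaled by 2^18 and lies in [2^18, 2^20). */
    int M = (512 + frac1) * (512 + frac2);

    if (M >= (1 << 19)) {                      /* M >= 2.0: renormalize */
        M >>= 1;
        E++;
    }

    int exp = E + 15;
    if (exp > 30) return -1;                   /* overflow */
    if (exp < 1)  return 0;                    /* underflow */

    return (exp << 9) | ((M >> 9) & 0x1FF);    /* truncate to 9 fraction bits */
}
```

For the sample program, 18.09375 * 4.5 = 81.421875, which this sketch truncates to 81.375. That differs from the 81.25 shown in the sample output, so the reference solution may truncate differently; treat the sketch's result as an illustration of the algorithm, not as the expected grader output.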
int
addVals(int source1, int source2) {
// Do this function last - it is the most difficult!
// You must implement this as described in class:
// If needed, adjust one of the two numbers so that
// they have the same exponent E
// Add the two fractional parts: F1' + F2 = F
// (assumes F1' is the adjusted F1)
// Adjust the sum F and E so that F is in the correct range
//
// As described in the handout, you only need to implement this for
// positive, normalized numbers
// Also, return -1 if the sum overflows
return 2;
}
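The addition steps in the comments above might look like the following sketch. It assumes the exponent in bits 13..9, the fraction in bits 8..0, and a bias of 15. It aligns the operand with the smaller exponent to the larger one by shifting its mantissa right, which truncates the shifted-out bits. Returning the other operand unchanged when one operand is zero is a choice of this sketch.

```c
/* Sketch of addVals(): add two positive 15-bit mini-floats (assumed bias 15). */
int addVals(int source1, int source2) {
    unsigned int b1 = (unsigned int)source1, b2 = (unsigned int)source2;
    int exp1 = (int)((b1 >> 9) & 0x1F), frac1 = (int)(b1 & 0x1FF);
    int exp2 = (int)((b2 >> 9) & 0x1F), frac2 = (int)(b2 & 0x1FF);

    if (exp1 == 31 || exp2 == 31) return -1;        /* propagate overflow marker */
    if (exp1 == 0) return exp2 == 0 ? 0 : source2;  /* 0 + y = y */
    if (exp2 == 0) return source1;                  /* x + 0 = x */

    int M1 = 512 | frac1;            /* mantissas with implicit leading 1, */
    int M2 = 512 | frac2;            /* each scaled by 2^9 */

    if (exp1 < exp2) {               /* make operand 1 the larger exponent */
        int t = exp1; exp1 = exp2; exp2 = t;
        t = M1; M1 = M2; M2 = t;
    }

    M2 >>= (exp1 - exp2);            /* align: shift is at most 29 bits */

    int M = M1 + M2;                 /* sum lies in [512, 2048) */
    if (M >= 1024) {                 /* sum >= 2.0: renormalize */
        M >>= 1;
        exp1++;
    }
    if (exp1 > 30) return -1;        /* overflow */

    return (exp1 << 9) | (M & 0x1FF);
}
```

For the sample program, 18.09375 (exponent field 19, fraction 67) plus 4.5 (exponent field 17, fraction 64) gives exponent field 19 and fraction 211, which decodes to 22.59375 and agrees with the sample output.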