Question:

Single-precision (32-bit) and double-precision (64-bit) are the most common floating-point formats, but formats with alternative field widths may be implemented to achieve better precision-speed or precision-range trade-offs for specific tasks. Consider the following two shorter floating-point formats:

FP16: 1-bit sign, 5-bit exponent, and 10-bit mantissa
BF16: 1-bit sign, 8-bit exponent, and 7-bit mantissa

(b) (10 marks) Find the closest representation (i.e., with minimal absolute difference) of the number 1428 in both formats and calculate the error. Which format has better precision?
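
One way to check an answer to part (b) is to round 1428 to the spacing of each format. In binary, 1428 = 10110010100 = 1.0110010100 x 2^10, so the fraction needs 10 bits. The sketch below assumes IEEE-style normalized values (implicit leading 1, exponent bias 2^(exp_bits - 1) - 1) and round-to-nearest with ties to even; the question states neither, and the function name `nearest_representable` is made up for illustration.

```python
import math

def nearest_representable(x, exp_bits, man_bits):
    """Nearest normal value to a positive x for the given field widths.

    Assumes an implicit leading 1 and an IEEE-style exponent bias.
    Python's round() breaks ties toward the even integer, which matches
    round-to-nearest-even on the mantissa.
    """
    bias = 2 ** (exp_bits - 1) - 1
    _, e = math.frexp(x)           # x = m * 2**e with 0.5 <= m < 1
    exponent = e - 1               # so x = 1.f * 2**exponent
    if not (1 - bias <= exponent <= bias):
        raise ValueError("x is outside the normal range of this format")
    ulp = 2.0 ** (exponent - man_bits)   # spacing of values in this binade
    return round(x / ulp) * ulp

x = 1428
for name, exp_bits, man_bits in [("FP16", 5, 10), ("BF16", 8, 7)]:
    y = nearest_representable(x, exp_bits, man_bits)
    print(name, y, "abs error:", abs(x - y))
# FP16 1428.0 abs error: 0.0
# BF16 1424.0 abs error: 4.0
```

Under these assumptions FP16 stores 1428 exactly (spacing 1 at this magnitude), while BF16 has a spacing of 8 here, so 1428 falls exactly halfway between 1424 and 1432. Both are equally close, with an absolute error of 4 (about 0.28% relative), and ties-to-even selects 1424. FP16 therefore has the better precision for this number.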
