Question: A common operation performed in graphics rendering is computing the inverse square root of a number, $\frac{1}{\sqrt{n}}$. This is used when normalizing vectors

A common operation performed in graphics rendering is computing the inverse square root of a number, $\frac{1}{\sqrt{n}}$. This is used when normalizing vectors to calculate angles of incidence and reflection, an operation so common that modern 3D graphics programs typically perform millions of these calculations every second to simulate lighting.

In the early 1990s, when 3D graphics rendering started to become widespread in the video game industry, the floating point processing power of most CPUs was too slow to perform the inverse square root at the speed required, necessitating certain speedups. This question investigates the tradeoffs between these speedups.

Suppose the application QUAKE spends 20% of its execution time computing the inverse square root of a floating point number, and 15% of its time computing the inverse square root of an integer. Consider each of the following speedups:
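One way to approach each part is Amdahl's Law. The question does not name this method, so treating it as the intended approach is an assumption. The symbols $f$ (the fraction of execution time that is accelerated) and $s$ (the factor by which that fraction is sped up) are introduced here for illustration only:

```latex
S_{\text{overall}} = \frac{1}{(1 - f) + \dfrac{f}{s}}
```

With the figures given for QUAKE, $f = 0.20$ when only floating point inverse square roots are accelerated, $f = 0.15$ when only integer ones are, and $f = 0.35$ when both are.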

(1) Proposed additions to the x86 ISA would add the rsqrtss instruction, which could be run in parallel to speed up computation of the inverse square root for all numbers (integer and floating point) by 20%. Supposing this instruction is added, what would the overall speedup for the QUAKE application be when considering this speedup alone?

You should input a float number here (round your result to the nearest thousandth).

The overall speedup is:
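A minimal sketch of how part (1) might be worked, assuming Amdahl's Law applies and assuming "speed up by 20%" means the accelerated portion runs 1.2 times as fast. The function name `overall_speedup` and the constant names are hypothetical, chosen for this illustration.

```python
def overall_speedup(fraction: float, factor: float) -> float:
    """Amdahl's Law: overall speedup when `fraction` of the run time
    is accelerated by `factor` and the rest is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Fractions taken from the question: 20% float + 15% integer.
INV_SQRT_FRACTION = 0.20 + 0.15

# Assumption: "20% speedup" is read as a factor of 1.2.
RSQRTSS_FACTOR = 1.2

part1 = overall_speedup(INV_SQRT_FRACTION, RSQRTSS_FACTOR)
print(f"{part1:.3f}")  # 1.062
```

Under a different reading of "20%" (for example, 20% less time per operation) the factor would be 1/0.8 = 1.25 instead, and the result would change accordingly.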

(2) On existing hardware (without the speedup in (1)), computing the inverse square root of an integer is 30% faster than that of a floating point number. However, by utilizing a trick involving a magic number, 0x5f3759df, a floating point number can be converted to an integer (for the purposes of this computation). This allows all inverse square root calculations to be as fast as those for integers.
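For readers unfamiliar with the trick, here is a rough Python sketch of the widely published fast inverse square root routine, which reinterprets the bits of a 32-bit float as an integer and uses the constant 0x5F3759DF. It is an illustration under stated assumptions, not the code QUAKE shipped: the function name `fast_inv_sqrt` is hypothetical, a single Newton-Raphson refinement step is assumed, and the input must be a positive, finite number.

```python
import struct

MAGIC = 0x5F3759DF  # widely published constant for 32-bit floats


def fast_inv_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for positive, finite x."""
    # Reinterpret the 32-bit float's bit pattern as an unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # An integer shift and subtract produce a rough first guess.
    bits = MAGIC - (bits >> 1)
    # Reinterpret the integer bit pattern as a float again.
    guess = struct.unpack("<f", struct.pack("<I", bits))[0]
    # One Newton-Raphson step tightens the estimate.
    return guess * (1.5 - 0.5 * x * guess * guess)


approx = fast_inv_sqrt(4.0)  # close to the exact value 0.5
```

The expensive square root and division are replaced by integer operations plus a few multiplications, which is why the question treats the float case as becoming as fast as the integer case.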

What is the overall speedup for the QUAKE application considering this optimization alone? Show your work and round your result to the nearest thousandth.

You should input a float number here (round your result to the nearest thousandth).

The overall speedup is:
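A possible worked calculation for part (2), assuming Amdahl's Law applies and that "30% faster" means the floating point portion speeds up by a factor of 1.3 once it runs at integer speed. Only the 20% floating point share is accelerated; the 15% integer share is unchanged. The symbols $f$ and $s$ are introduced here for illustration:

```latex
f = 0.20, \qquad s = 1.3
\\[4pt]
S_{\text{overall}}
  = \frac{1}{(1 - f) + \dfrac{f}{s}}
  = \frac{1}{0.80 + \dfrac{0.20}{1.3}}
  \approx \frac{1}{0.9538}
  \approx 1.048
```

If "30% faster" were instead read as taking 30% less time, the factor would be $1/0.7$ and the result would differ, so the interpretation should be stated with the answer.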

(3) You are not satisfied with either speedup, and go purchase a DSP which supports calculating the inverse square root for all numbers with a speedup equivalent to 20 CPU cores. Supposing this DSP is added, the entire inverse square root calculation is parallelizable, and the DSP is only used to accelerate the inverse square root calculation, what is the overall speedup for QUAKE considering this optimization alone?

You should input a float number here (round your result to the nearest thousandth).

The overall speedup is:
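A sketch of part (3) under the same Amdahl's Law assumption, reading "equivalent to 20 CPU cores" as a factor of 20 on the full 35% spent on inverse square roots. The names `overall_speedup`, `dsp`, and `ceiling` are hypothetical. The second value shows the upper bound that would apply even if the DSP were infinitely fast.

```python
def overall_speedup(fraction: float, factor: float) -> float:
    """Amdahl's Law for a single accelerated fraction of run time."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# 20% float + 15% integer inverse square roots, per the question.
INV_SQRT_FRACTION = 0.20 + 0.15

# Assumption: the DSP speeds up that fraction by a factor of 20.
dsp = overall_speedup(INV_SQRT_FRACTION, 20.0)

# Limit as the factor grows without bound: 1 / (1 - fraction).
ceiling = 1.0 / (1.0 - INV_SQRT_FRACTION)

print(f"{dsp:.3f}")      # 1.498
print(f"{ceiling:.3f}")  # 1.538
```

The gap between the two numbers is small, which illustrates the tradeoff the question is probing: once the accelerated fraction is fast enough, the remaining 65% of the run time dominates.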

