Question:

Figure 6.12 shows the impact of user-perceived response time on revenue, and motivates the need to achieve high throughput while maintaining low latency.

a. Taking Web search as an example, what are the possible ways of reducing query latency?

b. What monitoring statistics can you collect to help understand where time is spent? How do you plan to implement such a monitoring tool?
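As a starting point for part (b), one possible approach is to time each stage of query handling and report tail percentiles rather than averages. The sketch below is only illustrative: the stage names, the `timed` helper, and the in-process `stage_samples` store are hypothetical, and a real deployment would more likely export samples to a separate collection service.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from statistics import quantiles

# Hypothetical in-process store: stage name -> list of latencies in seconds.
stage_samples = defaultdict(list)

@contextmanager
def timed(stage):
    """Record the wall-clock time spent inside the `with` block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_samples[stage].append(time.perf_counter() - start)

def p95(samples):
    """95th percentile: the last of the 19 cut points that split data into 20 groups."""
    return quantiles(samples, n=20)[-1]

# Usage with made-up stage names; the sleep stands in for real work.
for _ in range(5):
    with timed("index_lookup"):
        time.sleep(0.001)
    with timed("ranking"):
        pass

for stage, samples in stage_samples.items():
    print(stage, "samples:", len(samples), "p95 (s):", p95(samples))
```

Per-stage percentiles of this kind show which stage dominates the tail, which is what a percentile-based SLA depends on.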

c. Assuming that the number of disk accesses per query follows a normal distribution with a mean of 2 and a standard deviation of 3, what disk access latency is needed to satisfy a latency SLA of 0.1 s for 95% of the queries?
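One way to approach part (c) is to find the 95th percentile of the number of disk accesses and divide the SLA by it. The sketch below assumes that query latency is dominated by disk accesses, that the accesses are issued one after another, and that the distribution can be treated as continuous (ignoring that a normal with mean 2 and standard deviation 3 puts some mass on negative counts). These assumptions are not stated in the exercise.

```python
from statistics import NormalDist

SLA_SECONDS = 0.1   # latency target from the question
COVERAGE = 0.95     # fraction of queries that must meet the target

# Distribution of disk accesses per query, as given in the question.
accesses = NormalDist(mu=2, sigma=3)

# Number of accesses that 95% of queries stay at or below (one-sided bound).
p95_accesses = accesses.inv_cdf(COVERAGE)

# Assumption: accesses are serial, so latency = accesses * per-access latency.
max_access_latency = SLA_SECONDS / p95_accesses

print(f"95th percentile of accesses: {p95_accesses:.2f}")
print(f"required disk access latency: {max_access_latency * 1000:.1f} ms")
```

Under these assumptions the 95th percentile is about 6.93 accesses, so each disk access would need to complete in roughly 14.4 ms or less.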

d. In-memory caching can reduce the frequency of long-latency events (e.g., accessing hard drives). Assuming a steady-state hit rate of 40%, hit latency of 0.05 s, and miss latency of 0.2 s, does caching help meet a latency SLA of 0.1 s for 95% of the queries?
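The arithmetic for part (d) can be checked with a short sketch. It assumes each query is served by exactly one lookup that either hits or misses, which the exercise does not state explicitly; the function name `fraction_within_sla` is made up for illustration.

```python
def fraction_within_sla(hit_rate, hit_latency, miss_latency, sla):
    """Fraction of queries whose latency is at or below the SLA.

    Assumption: every query is a single cache lookup, either a hit or a miss.
    """
    fraction = 0.0
    if hit_latency <= sla:
        fraction += hit_rate          # hits that meet the SLA
    if miss_latency <= sla:
        fraction += 1.0 - hit_rate    # misses that meet the SLA
    return fraction

# Values from the question.
met = fraction_within_sla(hit_rate=0.40, hit_latency=0.05,
                          miss_latency=0.2, sla=0.1)
print(f"queries within SLA: {met:.0%}")
print("meets 95% target:", met >= 0.95)
```

With these numbers only the hits (40% of queries) finish within 0.1 s, so under this model the hit rate would have to reach at least 95% before the SLA is met.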

e. When can cached content become stale or even inconsistent? How often can this happen? How can you detect and invalidate such content?
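For part (e), two common detection mechanisms are a time-to-live (TTL) bound and a version check against the backing store. The sketch below combines both; the class name `TTLCache`, the version argument, and the injectable clock are hypothetical choices made for illustration, not a description of how any particular search engine works.

```python
import time

class TTLCache:
    """Cache that treats an entry as stale if it is too old or its version changed."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so that tests can control time
        self.entries = {}       # key -> (value, version, stored_at)

    def put(self, key, value, version):
        self.entries[key] = (value, version, self.clock())

    def get(self, key, current_version):
        """Return the cached value, or None if missing, expired, or outdated."""
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, version, stored_at = entry
        expired = self.clock() - stored_at > self.ttl
        if expired or version != current_version:
            del self.entries[key]   # invalidate the stale entry
            return None
        return value

# Usage with a fake clock so the example is deterministic.
now = [0.0]
cache = TTLCache(ttl_seconds=10, clock=lambda: now[0])
cache.put("query", "results", version=1)
fresh = cache.get("query", current_version=1)      # same version, within TTL
outdated = cache.get("query", current_version=2)   # index version changed
cache.put("query", "results", version=2)
now[0] = 11.0                                      # advance past the TTL
expired = cache.get("query", current_version=2)
```

The TTL bounds how long a stale entry can survive when no version signal is available, while the version check catches updates immediately at the cost of consulting the source of truth for the current version.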