Question: Debugging a system issue

Background

As a beta offering for our product, we are trying out a new third-party recommendation service to help suggest a next action for users. We surface those next actions as part of the user's landing page when they sign in - maybe it's sending an important message, checking off an important todo, etc.

You're settling in at your desk after a delicious and energizing team lunch when you receive a performance alert from the monitoring system. Rendering the beta landing page for a user is averaging around 10 seconds, up from the normal average of 800ms. About 10% of the users in the beta are even seeing the dreaded 504 Gateway Timeout error.

The beta users represent a small subset of the overall traffic, but this is something that we'd like to understand and fix in the next few days. No changes have been deployed to the frontend that powers the recommendation service in several weeks.

Architecture

Here's a highly simplified diagram of the system in question: (diagram not included)

And here's a sequence diagram showing the flow of the request through the main components of the architecture: (diagram not included)

Use your understanding of the architecture above to list as many reasons as you can think of that the recommendations endpoint could be running slowly or timing out.

Please spend 10-15 minutes on this part of the scenario.

Please focus your response on ideas about what could be causing the recommendations endpoint to be running slowly; avoid talking about debugging steps until part 2 of the scenario.

Consider performance issues across the entire stack, including databases, third-party services, networking, hardware, cloud infrastructure, etc. You never know where the issue might be!

We're looking for responses that demonstrate both breadth (how many areas of the system you address) and depth of knowledge (level of detail on individual ideas). For example, we would value covering 3 areas of the system each with 3 ideas more highly than we would value 10 ideas all focused on hardware issues.
