Context
Hello, I’m using FastAPI and Sqlalchemy to create a backend in python3.12. Fairly recently, we decided to introduce the Server Timing out of curiosity to see in which parts of the application we were spending the most time. Once we completed the implementation and took a look at the results, we were pretty astonished. Here is an example of the Timing tab from the Chrome browser:
So as you can see, out of the 395.41 milliseconds, only 13 milliseconds were actually captured by our server timing tracking code. Our first reaction was to assume that our tracking was actually really broken. After reviewing and testing the implementation, it is actually working perfectly.
Implementation
So basically to try to track time, we just use middleware and some decorators to track different things:
- Time spent in the db
- Time spent in redis
- Time spent in calling some third parties that we depend on
The middleware is pretty simple:
class ServerTimingMiddleware(BaseHTTPMiddleware):
async def dispatch(self, request: Request, call_next):
self._init_tracking()
response = await call_next(request)
self._finish_tracking()
response.headers["Server-Timing"] = self._generate_server_timing_header()
return response
app.middleware("http")(catch_exceptions_middleware)
app.add_middleware(ServerTimingMiddleware) # this is the middleware
# rest of the middleware
This is a simplified example, but it gets the point across.
Question
What is going on? I’m pretty sure we must be ignoring something, but it baffles me that only 3% of the time it takes to generate the response is spent on our code. Any suggestions?