How to Create Structured and Traceable Logs in FastAPI Applications
Log messages are fundamental for monitoring applications in production, but often these messages are inadequate. This article presents how to enhance log message generation in FastAPI applications using Loguru as the logging module, JSON formatting, and log message identification through request IDs.
Introduction
Logs are the primary source of information about an application's behavior in production. They are crucial for identifying issues, tracking events, and monitoring application performance. However, most applications create log messages in an inadequate manner. Logs are often generated in plain text format, without a defined structure, making processing and analysis challenging. Additionally, there is no link between log messages, making it impossible to trace the processing of a specific request in a web application.
The objective of this article is to improve log message generation in FastAPI applications. To achieve this, we will implement the following features:
- Use of Loguru to replace the default Python logging module. Loguru is a good alternative because it is simpler to configure and use.
- Logs in JSON format. The JSON format is easy to read and has a well-defined structure that facilitates processing. It is widely accepted for integration with centralized logging services and command-line tools such as jq.
- Association of every log message with a request ID to enable tracking of the entire processing of a response.
- Recording the total processing time of each request. This information will be essential for monitoring the application's performance.
- Logging application errors and exceptions to assist in debugging issues."
Logs in JSON Format
When using loguru
, the log message serialization in JSON is done through the serialize=True
parameter during the inclusion of the handler
:
$ python -m venv .venv $ source .venv/bin/activate (.venv) $ pip install loguru (.venv) $ python Python 3.11.4 (main, Aug 10 2023, 20:08:21) [GCC 12.2.0] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import sys >>> from loguru import logger >>> logger.remove() >>> logger.add(sys.stderr, serialize=True) 1 >>> logger.info('Hello World', request_id='abcdef') {"text": "2023-10-16 07:22:49.234 | INFO | __main__:<module>:1 - Hello World\n", "record": {"elapsed": {"repr": "4:50:14.048257", "seconds": 17414.048257}, "exception": null, "extra": {"request_id": "abcdef"}, "file": {"name": "<stdin>", "path": "<stdin>"}, "function": "<module>", "level": {"icon": "i", "name": "INFO", "no": 20}, "line": 1, "message": "Hello World", "module": "<stdin>", "name": "__main__", "process": {"id": 834569, "name": "MainProcess"}, "thread": {"id": 140172067380096, "name": "MainThread"}, "timestamp": {"repr": "2023-10-16 07:22:49.234603-03:00", "timestamp": 1697451769.234603}}}
Below, we have the same log, but in an improved formatting:
{ "text": "2023-10-16 07:22:49.234 | INFO | __main__:<module>:1 - Hello World\n", "record": { "elapsed": { "repr": "4:50:14.048257", "seconds": 17414.048257 }, "exception": null, "extra": { "request_id": "abcdef" }, "file": { "name": "<stdin>", "path": "<stdin>" }, "function": "<module>", "level": { "icon": "i", "name": "INFO", "no": 20 }, "line": 1, "message": "Hello World", "module": "<stdin>", "name": "__main__", "process": { "id": 834569, "name": "MainProcess" }, "thread": { "id": 140172067380096, "name": "MainThread" }, "timestamp": { "repr": "2023-10-16 07:22:49.234603-03:00", "timestamp": 1697451769.234603 } } }
There are some issues with this structure:
- There are too many levels of nesting. This makes reading and processing logs difficult.
- Some fields are unnecessary, such as
process
andthread
. - The
elapsed
field doesn`t represent the response time of a request but the time since the application started. - The
request_id
field is nested within theextra
field, but it would be better placed at the top level. - The
level
field represents the message level using a dictionary withicon
,name
, andno
fields. It would be better if it were a singlelevel
field with the value fromname
. - The source of the log message is presented in the
text
field and also broken down into different fields inrecord
. It would be better if there were a singlesource
field with the value<file.name>:<function>:<line>
, as initially presented.
We will resolve these issues through custom log message serialization.
First Version of Custom Serialization
The recommendation from the Loguru documentation for customizing log serialization is to create a specific serialization function and associate it with a sink. This will be done in a new module named logging.py
:
The serialize
function receives a dictionary record
with the log message fields and returns a string with the serialized message in JSON format.
The extra
fields from the original record are moved to the top level of the log message (line 16).
The conversion to JSON is done using the orjson library, which is faster than the standard json library (line 17).
The call to orjson.dumps passes str as the default function to serialize unsupported types. It may not always produce the best results, but it prevents the log from being interrupted by a serialization exception.
The result for the command logger.info('Hello World', request_id='abcdef')
is now:
{"timestamp": "2023-10-16T11:01:03.223737-03:00", "level": "INFO", "message": "Hello World", "source": "<stdin>:<module>:1", "request_id": "abcdef"}
This first version of serialize
fixes the issues listed earlier, but there's still room for further improvement. We can enhance the function to produce a more developer-friendly formatting with indentation and syntax highlighting.
Second Version with Indentation and Syntax Highlighting
The second version of the serialize
function is as follows:
During development, when the DEBUG
configuration variable is set to True
, the JSON formatting is improved with indentation and syntax highlighting.
Indentation is achieved by adding the OPT_INDENT_2
option to the orjson
serialization options (lines 14 and 15).
Syntax highlighting is done using the highlight
function from the Pygments library (line 28). The formatting style is determined by the PYGMENTS_STYLE
configuration variable (line 12), which can be customized by the developer by setting an environment variable with the same name.
The output for the command logger.info('Hello World', request_id='abcdef')
appears as follows during development and testing:
Third and Final Version of Custom Serialization
There's one more adjustment to make. In case of an exception, the log message should include the exception's traceback
. Its formatting will be done using the stackprinter library, which provides an improved context of the exception with the values of nearby variables:
Adjustments to the Minimal FastAPI Project
The template for the Minimal FastAPI project requires some adjustments to adopt the presented improvements. First, you need to add orjson
, pygments
, and stackprinter
to the project's dependencies:
Second, the call to init_loguru
should be made during the application initialization in resources.py
:
With these changes, log messages will now be generated in JSON format. The next step is to include the request identifier in all log messages.
Request Tracking
The processing of a response to a request is recorded in the log in several different messages, interspersed with messages from other concurrent processing. If we generate a unique identifier for each request and use this identifier in subsequent log messages, we can filter and group related log messages to track the entire processing.
For a FastAPI application, a good way to generate the request identifier is through middleware. The following middleware
implementation creates a random ID for each request and includes it in the context of the logger to be included in all log messages for the same processing. Additionally, it also records the processing time for each request.
Let's analyze the code in the following sections.
Request Identifier
The request identifier is generated through the secrets.token_urlsafe function, which returns a random text encoded in Base64. The size of the identifier does not need to be very large as it will not be used for encryption. A default value of 8
bytes for the configuration variable REQUEST_ID_LENGTH
gives us 18.446.744.073.709.551.616
possible combinations, which is enough for a random identifier.
Passing the Request Identifier for Subsequent Calls to the Logger
On line 21, the request identifier is included in the logger
's context through the contextualize function. This way, all calls to logger
in the same thread or asynchronous task automatically receive the request identifier within the extra
field, without the need to explicitly pass it as a parameter. The extra
field is then expanded to the first level in the function serialize, as we saw in the previous section.
Request and Response Data
At the end of the request processing, a log message is displayed with the processing time and the data of the request and the response. Instead of extracting request and response information directly from objects request
and response
, an AccessLogAtoms object is used to do the heavy lifting (line 30).
Registering the Middleware in the Application
The middleware
must be registered in the application for it to be activated. In the minimal FastAPI project, it happens in main.py
:
The activation is done by calling add_middleware, passing the function log_request_middleware
as the dispatch
parameter (line 22).
Log Message After Middleware is Registered
Considering the /hello
endpoint:
After middleware
is implemented and registered in the application, the log messages for a call to /hello
look like this:
{ "timestamp": "2023-10-18T07:23:59.274914+00:00", "level": "INFO", "message": "Hello world!", "source": "hello.py:hello_world:9", "request_id": "M5L1Dftkkog" } { "timestamp": "2023-10-18T07:23:59.275833+00:00", "level": "INFO", "message": "log request", "source": "middleware.py:log_request_middleware:31", "request_id": "M5L1Dftkkog", "remote_ip": "192.168.112.3:47140", "schema": "http", "protocol": "2", "method": "GET", "path_with_query": "/hello", "status_code": 200, "response_length": "25", "elapsed": 0.0012085437774658203, "referer": "-", "user_agent": "HTTPie/3.2.2" }
The first log message is generated upon calling /hello
, as indicated by the field source. The second is the final processing log message, generated in log_request_middleware
.
Logging Validation Errors
A validation error 422
is generated when an endpoint receives unprocessable parameters. Let the endpoint below calculate the division of two integers:
Calling this endpoint with a float
instead of an int
generates a validation error:
$ httpx --http2 --no-verify 'https://localhost/api/divide?a=1.5&b=2' HTTP/2 422 Unprocessable Entity alt-svc: h3=":443"; ma=2592000 content-type: application/json date: Thu, 19 Oct 2023 08:05:14 GMT server: Caddy server: hypercorn-h2 x-processed-time: 0.0010986328125 x-request-id: Gw5vFSCiHFM content-length: 201 { "detail": [ { "type": "int_parsing", "loc": [ "query", "a" ], "msg": "Input should be a valid integer, unable to parse string as an integer", "input": "1.5", "url": "https://errors.pydantic.dev/2.4/v/int_parsing" } ] }
Even with middleware
implemented, the validation error is not recorded in the log. We only get the log from the request processing:
{ "timestamp": "2023-10-19T08:05:14.601434+00:00", "level": "INFO", "message": "log request", "source": "middleware.py:log_request_middleware:30", "request_id": "Gw5vFSCiHFM", "remote_ip": "192.168.128.3:37830", "schema": "http", "protocol": "2", "method": "GET", "path_with_query": "/divide?a=1.5&b=2", "status_code": 422, "response_length": "201", "elapsed": 0.0010986328125, "referer": "-", "user_agent": "python-httpx/0.24.1" }
Logging validation errors can be useful for debugging validation issues during development and testing. To do that, we are going to create an exception handler
that will intercept RequestValidationError
exceptions. The idea is to just log the validation error and then call the FastAPI default exception handler:
A call to /divide?a=1.5&b=2
now generates the following log message:
{ "timestamp": "2023-10-19T08:05:14.600671+00:00", "level": "INFO", "message": "request validation exception", "source": "exception_handlers.py:request_validation_exception_handler:17", "request_id": "Gw5vFSCiHFM", "method": "GET", "path_with_query": "/divide?a=1.5&b=2", "detail": [ { "type": "int_parsing", "loc": [ "query", "a" ], "msg": "Input should be a valid integer, unable to parse string as an integer", "input": "1.5", "url": "https://errors.pydantic.dev/2.4/v/int_parsing" } ] }
Logging Unhandled Exceptions
Another type of error that can occur during the processing of a request is an unhandled exception. For example, if the endpoint /divide
is called with the parameter b=0
, an exception ZeroDivisionError
will be thrown. The client receives a 500
response with a generic error message:
$ httpx --http2 --no-verify 'https://localhost/api/divide?a=1&b=0' HTTP/2 500 Internal Server Error alt-svc: h3=":443"; ma=2592000 content-type: text/plain; charset=utf-8 date: Thu, 19 Oct 2023 20:58:55 GMT server: Caddy server: hypercorn-h2 content-length: 21 Internal Server Error
In the current configuration, the FastAPI application does not log the exception, but passes it to the ASGI
server, which creates a server log message:
[2023-10-19 20:58:53 +0000] [10] [INFO] Running on http://0.0.0.0:5000 (CTRL + C to quit) [2023-10-19 20:58:55 +0000] [10] [ERROR] Error in ASGI Framework Traceback (most recent call last): File "/venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 108, in __call__ response = await self.dispatch_func(request, call_next) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/app/fastapi_minimum/middleware.py", line 22, in log_request_middleware response = await call_next(request) ^^^^^^^^^^^^^^^^^^^^^^^^ File "/venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 84, in call_next raise app_exc File "/venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 70, in coro await self.app(scope, receive_or_disconnect, send_no_error) File "/venv/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 79, in __call__ raise exc File "/venv/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 68, in __call__ await self.app(scope, receive, sender) File "/venv/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__ raise e File "/venv/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__ await self.app(scope, receive, send) File "/venv/lib/python3.11/site-packages/starlette/routing.py", line 718, in __call__ await route.handle(scope, receive, send) File "/venv/lib/python3.11/site-packages/starlette/routing.py", line 276, in handle await self.app(scope, receive, send) File "/venv/lib/python3.11/site-packages/starlette/routing.py", line 66, in app response = await func(request) ^^^^^^^^^^^^^^^^^^^ File "/venv/lib/python3.11/site-packages/fastapi/routing.py", line 273, in app raw_response = await run_endpoint_function( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/venv/lib/python3.11/site-packages/fastapi/routing.py", line 190, in run_endpoint_function return await dependant.call(**values) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/app/fastapi_minimum/routers/hello.py", line 15, in divide return a / b ~~^~~ ZeroDivisionError: division by zero ...
There is no connection between the log message and the request, making it difficult to track the request. Ideally, the exception log message should contain the request identifier as in previous log messages.
Typically, we would solve this by creating an exception handler
to log unhandled exceptions:
The exception handler
above creates a log message with the request identifier and the exception, but does not prevent the exception from being passed on to the server ASGI, which continues to generate a duplicate and decontextualized log message (see this link for more details).
To avoid this, let's improve middleware
to also catch unhandled exceptions:
Unhandled exceptions are caught in the try..except
block (lines 23 to 27) and the exception is stored in the variable exception
to be used later in the log message. The default response for unhandled exceptions is a 500
response with a generic error message (line 27).
The request log message type will depend on whether there was an exception or not. If there was no exception, the log message is generated with the level INFO
(line 48). If there was an exception, the log message is generated with the level ERROR
(line 50) and the exception included through the opt method, which allows changing the configuration of a specific call to logger
.
The unhandled traceback
exception is now presented along with the request log message:
{ "timestamp": "2023-10-20T21:18:55.045870+00:00", "level": "ERROR", "message": "Unhandled exception", "source": "middleware.py:log_request_middleware:50", "request_id": "zv9ueo-i1Hg", "remote_ip": "192.168.144.3:36196", "schema": "http", "protocol": "2", "method": "GET", "path_with_query": "/divide?a=1&b=0", "status_code": 500, "response_length": "34", "elapsed": 0.0005097389221191406, "referer": "-", "user_agent": "python-httpx/0.24.1", "exception": "File \"/venv/lib/python3.11/site-packages/starlette/middleware/base.py\", line 78, in call_next\n 31 async def call_next(request: Request) -> Response:\n (...)\n 74 task_group.start_soon(close_recv_stream_on_response_sent)\n 75 task_group.start_soon(coro)..." }
Automated Tests
To ensure that log messages are being generated correctly, let's create some automated tests in test_logging.py
.
Application Fixture for Testing
The first part of the file test_logging.py
is presented below:
Lines 8 to 19 define the fields that must be present in each type of log message. These fields will be used in tests to verify that log messages are being generated correctly.
logging_client
(line 23 to 56) is a fixture that creates a new application with the endpoints necessary for testing, adds the logging middleware
and exception handler
, and returns an HTTP
client for that application.
The /info
endpoint returns a simple response (lines 38 to 40). It represents a normal call to an endpoint without exceptions or validation errors.
/divide
takes two integer parameters a
and b
, and returns the result of dividing a
by b
(lines 42 to 44). This endpoint will be used both to test validation errors, by receiving a float
instead of an integer, and for unhandled exceptions, when receives b=0
and generates a ZeroDivisionError
exception.
The function init_loguru
(line 51) initializes the logger
with custom serialization. There may be other initializations in the main application, but only this one is necessary for logging tests .
Testing JSON Format in Log Messages
In the second part of the same file, we have the first test that verifies whether the log message is generated in the JSON format and contains the expected fields:
capsys is a pytest fixture that captures standard and error output.
Lines 65 through 67 replace the function highlight
in the logging
module with a function that returns the input unchanged. It prevents the output to the terminal from being colored during testing, which would make it difficult to compare with the expected JSON.
The log message must contain at least the fields of request_log_fields
(line 73).
Since it is not an exception, the level must be INFO
(line 74) and the field exception
must not be present (line 75).
Testing Validation Error Log Messages
In the third part of test_logging.py
, we have a test that checks if the log message is generated correctly for a validation error:
The response to the request that causes the validation error generates two log messages: one for the exception and one for the request. The messages are separated by the character sequence \n}\n{\n
(line 93).
The first message is the exception. One of the test conditions is that the detail
in the log message is the same as detail
in the response field.
The second message is the request. Checking the fields is the same as the previous test.
Testing Unhandled Exception Log Messages
In the fourth part of test_logging.py
, we have the test that checks if the log message is generated correctly for an unhandled exception:
The request response is a 500
response with a generic error message (lines 119 and 120). The log message must contain at least the fields of exception_log_fields
(line 123) and the log level must be ERROR
(line 124).
Testing Non-Serializable Types in the Log
In the fifth and final part of test_logging.py
, we have the test that verifies that types that are not serializable do not cause errors in the log:
Although they are not serializable to JSON by default, Path
, datetime
and set
should not cause errors in the log because they are converted to str
before being serialized, as defined in the serialize
function in logging.py
.
Conclusion
FastAPI is one of the most productive and efficient frameworks for developing REST APIs in Python. However, recording log messages is not part of FastAPI's native functionality and needs to be implemented separately.
The solution presented in this article starts with configuring Loguru
to generate log messages in JSON format. From there, we created additional configurations for FastAPI based on middleware
and an exception handlers
to generate log messages with unique identifiers for each request, record the processing time of each response, and record validation errors and unhandled exceptions. These configurations have been tested and validated through automated tests.
The end result is an efficient and flexible logging system that can be used as a basis for developing any FastAPI application. Adjustments can be made to meet the specific needs of each project, but basic monitoring and debugging needs are already covered.
The presented code can be used with few modifications in other FastAPI projects. But if you want to start a new FastAPI project with these features, the best option is to use the template available on GitHub. To instantiate a new project, you need to use cookiecutter. I recommend combining it with pipx:
References
- A Complete Guide to Logging in Python with Loguru
- FastAPI: Experiment Middleware feature
- OWASP - Logging Cheat Sheet
- Logging in Python: A Comparison of the Top 6 Libraries
Next article: Developing a FastAPI Application for File Sharing - Part 1
Previous article: Packaging and Distribution of the Minimal FastAPI Project
Comments
Comments powered by Disqus