How to Create Structured and Traceable Logs in FastAPI Applications

André Felipe Dias

2023-10-25

Also available in:

https://www.flickr.com/photos/gregloby/3763720734/

Log messages are fundamental for monitoring applications in production, but often these messages are inadequate. This article presents how to enhance log message generation in FastAPI applications using Loguru as the logging module, JSON formatting, and log message identification through request IDs.

Introduction

Logs are the primary source of information about an application's behavior in production. They are crucial for identifying issues, tracking events, and monitoring application performance. However, most applications create log messages in an inadequate manner. Logs are often generated in plain text format, without a defined structure, making processing and analysis challenging. Additionally, there is no link between log messages, making it impossible to trace the processing of a specific request in a web application.

The objective of this article is to improve log message generation in FastAPI applications. To achieve this, we will implement the following features:

Use of Loguru to replace the default Python logging module. Loguru is a good alternative because it is simpler to configure and use.
Logs in JSON format. The JSON format is easy to read and has a well-defined structure that facilitates processing. It is widely accepted for integration with centralized logging services and command-line tools such as jq.
Association of every log message with a request ID to enable tracking of the entire processing of a response.
Recording the total processing time of each request. This information will be essential for monitoring the application's performance.
Logging application errors and exceptions to assist in debugging issues."

Logs in JSON Format

When using loguru, the log message serialization in JSON is done through the serialize=True parameter during the inclusion of the handler:

$ python -m venv .venv
$ source .venv/bin/activate
(.venv) $ pip install loguru
(.venv) $ python
Python 3.11.4 (main, Aug 10 2023, 20:08:21) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> from loguru import logger
>>> logger.remove()
>>> logger.add(sys.stderr, serialize=True)
1
>>> logger.info('Hello World', request_id='abcdef')
{"text": "2023-10-16 07:22:49.234 | INFO     | __main__:<module>:1 - Hello World\n", "record": {"elapsed": {"repr": "4:50:14.048257", "seconds": 17414.048257}, "exception": null, "extra": {"request_id": "abcdef"}, "file": {"name": "<stdin>", "path": "<stdin>"}, "function": "<module>", "level": {"icon": "i", "name": "INFO", "no": 20}, "line": 1, "message": "Hello World", "module": "<stdin>", "name": "__main__", "process": {"id": 834569, "name": "MainProcess"}, "thread": {"id": 140172067380096, "name": "MainThread"}, "timestamp": {"repr": "2023-10-16 07:22:49.234603-03:00", "timestamp": 1697451769.234603}}}

Below, we have the same log, but in an improved formatting:

{
    "text": "2023-10-16 07:22:49.234 | INFO     | __main__:<module>:1 - Hello World\n",
    "record": {
        "elapsed": {
            "repr": "4:50:14.048257",
            "seconds": 17414.048257
        },
        "exception": null,
        "extra": {
            "request_id": "abcdef"
        },
        "file": {
            "name": "<stdin>",
            "path": "<stdin>"
        },
        "function": "<module>",
        "level": {
            "icon": "i",
            "name": "INFO",
            "no": 20
        },
        "line": 1,
        "message": "Hello World",
        "module": "<stdin>",
        "name": "__main__",
        "process": {
            "id": 834569,
            "name": "MainProcess"
        },
        "thread": {
            "id": 140172067380096,
            "name": "MainThread"
        },
        "timestamp": {
            "repr": "2023-10-16 07:22:49.234603-03:00",
            "timestamp": 1697451769.234603
        }
    }
}

There are some issues with this structure:

There are too many levels of nesting. This makes reading and processing logs difficult.
Some fields are unnecessary, such as process and thread.
The elapsed field doesn`t represent the response time of a request but the time since the application started.
The request_id field is nested within the extra field, but it would be better placed at the top level.
The level field represents the message level using a dictionary with icon, name, and no fields. It would be better if it were a single level field with the value from name.
The source of the log message is presented in the text field and also broken down into different fields in record. It would be better if there were a single source field with the value <file.name>:<function>:<line>, as initially presented.

We will resolve these issues through custom log message serialization.

First Version of Custom Serialization

The recommendation from the Loguru documentation for customizing log serialization is to create a specific serialization function and associate it with a sink. This will be done in a new module named logging.py:

	`import sys`

	`import orjson`
	`from loguru import logger`

	`orjson_options = orjson.OPT_NAIVE_UTC`


	`def serialize(record: dict) -> str:`
	`subset = {`
	`'timestamp': record['time'].isoformat(),`
	`'level': record['level'].name,`
	`'message': record['message'],`
	`'source': f'{record["file"].name}:{record["function"]}:{record["line"]}',`
	`}`
	`subset.update(record['extra'])`
	`return orjson.dumps(subset, default=str, options=orjson_options).decode()`


	`def init_loguru() -> None:`
	`logger.remove()`
	`logger.add(lambda message: print(serialize(message.record), file=sys.stderr))`

The serialize function receives a dictionary record with the log message fields and returns a string with the serialized message in JSON format.

The extra fields from the original record are moved to the top level of the log message (line 16).

The conversion to JSON is done using the orjson library, which is faster than the standard json library (line 17).

The call to orjson.dumps passes str as the default function to serialize unsupported types. It may not always produce the best results, but it prevents the log from being interrupted by a serialization exception.

The result for the command logger.info('Hello World', request_id='abcdef') is now:

{"timestamp": "2023-10-16T11:01:03.223737-03:00", "level": "INFO", "message": "Hello World", "source": "<stdin>:<module>:1", "request_id": "abcdef"}

This first version of serialize fixes the issues listed earlier, but there's still room for further improvement. We can enhance the function to produce a more developer-friendly formatting with indentation and syntax highlighting.

Second Version with Indentation and Syntax Highlighting

The second version of the serialize function is as follows:

	`import sys`

	`import orjson`
	`from loguru import logger`
	`from pygments import highlight`
	`from pygments.formatters import Terminal256Formatter`
	`from pygments.lexers import JsonLexer`

	`from . import config`

	`lexer = JsonLexer()`
	`formatter = Terminal256Formatter(style=config.PYGMENTS_STYLE)`
	`orjson_options = orjson.OPT_NAIVE_UTC`
	`if config.DEBUG:`
	`orjson_options \|= orjson.OPT_INDENT_2`


	`def serialize(record: dict) -> str:`
	`subset = {`
	`'timestamp': record['time'].isoformat(),`
	`'level': record['level'].name,`
	`'message': record['message'],`
	`'source': f'{record["file"].name}:{record["function"]}:{record["line"]}',`
	`}`
	`subset.update(record['extra'])`
	`formatted_json = orjson.dumps(subset, default=str, option=orjson_options).decode()`
	`if config.DEBUG:`
	`formatted_json = highlight(formatted_json, lexer, formatter)`
	`return formatted_json`


	`def init_loguru() -> None:`
	`logger.remove()`
	`logger.add(lambda message: print(serialize(message.record), file=sys.stderr))`

During development, when the DEBUG configuration variable is set to True, the JSON formatting is improved with indentation and syntax highlighting.

Indentation is achieved by adding the OPT_INDENT_2 option to the orjson serialization options (lines 14 and 15).

Syntax highlighting is done using the highlight function from the Pygments library (line 28). The formatting style is determined by the PYGMENTS_STYLE configuration variable (line 12), which can be customized by the developer by setting an environment variable with the same name.

The output for the command logger.info('Hello World', request_id='abcdef') appears as follows during development and testing:

{
  "timestamp": "2023-10-16T11:01:03.223737-03:00",
  "level": "INFO",
  "message": "Hello World",
  "source": "<stdin>:<module>:1",
  "request_id": "abcdef"
}

Third and Final Version of Custom Serialization

There's one more adjustment to make. In case of an exception, the log message should include the exception's traceback. Its formatting will be done using the stackprinter library, which provides an improved context of the exception with the values of nearby variables:

	`import sys`

	`import orjson`
	`import stackprinter`
	`from loguru import logger`
	`from pygments import highlight`
	`from pygments.formatters import Terminal256Formatter`
	`from pygments.lexers import JsonLexer`

	`from . import config`

	`lexer = JsonLexer()`
	`formatter = Terminal256Formatter(style=config.PYGMENTS_STYLE)`
	`orjson_options = orjson.OPT_NAIVE_UTC`
	`if config.DEBUG:`
	`orjson_options \|= orjson.OPT_INDENT_2`


	`def serialize(record: dict) -> str:`
	`subset = {`
	`'timestamp': record['time'].isoformat(),`
	`'level': record['level'].name,`
	`'message': record['message'],`
	`'source': f'{record["file"].name}:{record["function"]}:{record["line"]}',`
	`}`
	`subset.update(record['extra'])`
	`if record['exception']:`
	`subset['exception'] = stackprinter.format(record['exception'])`
	`formatted_json = orjson.dumps(subset, default=str, option=orjson_options).decode()`
	`if config.DEBUG:`
	`formatted_json = highlight(formatted_json, lexer, formatter)`
	`return formatted_json`


	`def init_loguru() -> None:`
	`logger.remove()`
	`logger.add(lambda message: print(serialize(message.record), file=sys.stderr)) # type: ignore`

Adjustments to the Minimal FastAPI Project

The template for the Minimal FastAPI project requires some adjustments to adopt the presented improvements. First, you need to add orjson, pygments, and stackprinter to the project's dependencies:

$ poetry add orjson=="*" pygments=="*" stackprinter=="*"

Second, the call to init_loguru should be made during the application initialization in resources.py:

	`from collections.abc import AsyncIterator`
	`from contextlib import asynccontextmanager`

	`from loguru import logger`
	`from fastapi import FastAPI`

	`from . import config`
	`from .logging import init_loguru`


	`@asynccontextmanager`
	`async def lifespan(app: FastAPI) -> AsyncIterator: # noqa: ARG001`
	`await startup()`
	`try:`
	`yield`
	`finally:`
	`await shutdown()`


	`async def startup() -> None:`
	`init_loguru()`
	`show_config()`
	`# insert here calls to connect to database and other services`
	`logger.info('started...')`

	`...`

With these changes, log messages will now be generated in JSON format. The next step is to include the request identifier in all log messages.

Request Tracking

The processing of a response to a request is recorded in the log in several different messages, interspersed with messages from other concurrent processing. If we generate a unique identifier for each request and use this identifier in subsequent log messages, we can filter and group related log messages to track the entire processing.

For a FastAPI application, a good way to generate the request identifier is through middleware. The following middleware implementation creates a random ID for each request and includes it in the context of the logger to be included in all log messages for the same processing. Additionally, it also records the processing time for each request.

	`from collections.abc import Callable`
	`from secrets import token_urlsafe`
	`from time import time`

	`from fastapi import Request, Response`
	`from fastapi.responses import JSONResponse`
	`from hypercorn.logging import AccessLogAtoms`
	`from loguru import logger`

	`from . import config`


	`async def log_request_middleware(request: Request, call_next: Callable) -> Response:`
	`"""`
	`Uniquely identify each request and logs its processing time.`
	`"""`
	`start_time = time()`
	`request_id: str = token_urlsafe(config.REQUEST_ID_LENGTH)`

	`# keep the same request_id in the context of all subsequent calls to logger`
	`with logger.contextualize(request_id=request_id):`
	`response = await call_next(request)`
	`final_time = time()`
	`elapsed = final_time - start_time`

	`response_dict = {`
	`'status': response.status_code,`
	`'headers': response.headers.raw,`
	`}`
	`atoms = AccessLogAtoms(request, response_dict, final_time) # type: ignore`
	`logger.info(`
	`'log request',`
	`remote_ip=request.headers.get('x-forwarded-for') or atoms['h'],`
	`schema=request.headers.get('x-forwarded-proto') or atoms['S'],`
	`protocol=atoms['H'],`
	`method=atoms['m'],`
	`path_with_query=atoms['Uq'],`
	`status_code=response.status_code,`
	`response_length=atoms['b'],`
	`elapsed=elapsed,`
	`referer=atoms['f'],`
	`user_agent=atoms['a'],`
	`)`
	`response.headers['X-Request-ID'] = request_id`
	`response.headers['X-Processed-Time'] = str(elapsed)`
	`return response`

Let's analyze the code in the following sections.

Request Identifier

The request identifier is generated through the secrets.token_urlsafe function, which returns a random text encoded in Base64. The size of the identifier does not need to be very large as it will not be used for encryption. A default value of 8 bytes for the configuration variable REQUEST_ID_LENGTH gives us 18.446.744.073.709.551.616 possible combinations, which is enough for a random identifier.

Passing the Request Identifier for Subsequent Calls to the Logger

On line 21, the request identifier is included in the logger's context through the contextualize function. This way, all calls to logger in the same thread or asynchronous task automatically receive the request identifier within the extra field, without the need to explicitly pass it as a parameter. The extra field is then expanded to the first level in the function serialize, as we saw in the previous section.

Request and Response Data

At the end of the request processing, a log message is displayed with the processing time and the data of the request and the response. Instead of extracting request and response information directly from objects request and response, an AccessLogAtoms object is used to do the heavy lifting (line 30).

Registering the Middleware in the Application

The middleware must be registered in the application for it to be activated. In the minimal FastAPI project, it happens in main.py:

	`from fastapi import FastAPI`
	`from fastapi.exceptions import RequestValidationError`
	`from starlette.middleware.base import BaseHTTPMiddleware`

	`from . import config # noqa: F401`
	`from .resources import lifespan`
	`from .routers import hello`
	`from .middleware import log_request_middleware`

	`app = FastAPI(`
	`title='fastapi_minimum',`
	`lifespan=lifespan,`
	`)`

	`routers = (`
	`hello.router,`
	`)`

	`for router in routers:`
	`app.include_router(router)`

	`app.add_middleware(BaseHTTPMiddleware, dispatch=log_request_middleware)`

The activation is done by calling add_middleware, passing the function log_request_middleware as the dispatch parameter (line 22).

Log Message After Middleware is Registered

Considering the /hello endpoint:

	`from fastapi import APIRouter`
	`from loguru import logger`

	`router = APIRouter()`


	`@router.get('/hello')`
	`async def hello_world() -> dict[str, str]:`
	`logger.info('Hello world!')`
	`return {'message': 'Hello World'}`

After middleware is implemented and registered in the application, the log messages for a call to /hello look like this:

{
  "timestamp": "2023-10-18T07:23:59.274914+00:00",
  "level": "INFO",
  "message": "Hello world!",
  "source": "hello.py:hello_world:9",
  "request_id": "M5L1Dftkkog"
}

{
  "timestamp": "2023-10-18T07:23:59.275833+00:00",
  "level": "INFO",
  "message": "log request",
  "source": "middleware.py:log_request_middleware:31",
  "request_id": "M5L1Dftkkog",
  "remote_ip": "192.168.112.3:47140",
  "schema": "http",
  "protocol": "2",
  "method": "GET",
  "path_with_query": "/hello",
  "status_code": 200,
  "response_length": "25",
  "elapsed": 0.0012085437774658203,
  "referer": "-",
  "user_agent": "HTTPie/3.2.2"
}

The first log message is generated upon calling /hello, as indicated by the field source. The second is the final processing log message, generated in log_request_middleware.

Logging Validation Errors

A validation error 422 is generated when an endpoint receives unprocessable parameters. Let the endpoint below calculate the division of two integers:

@router.get('/divide')
def divide(a: int, b: int) -> float:
    return a / b

Calling this endpoint with a float instead of an int generates a validation error:

$ httpx --http2 --no-verify 'https://localhost/api/divide?a=1.5&b=2'
HTTP/2 422 Unprocessable Entity
alt-svc: h3=":443"; ma=2592000
content-type: application/json
date: Thu, 19 Oct 2023 08:05:14 GMT
server: Caddy
server: hypercorn-h2
x-processed-time: 0.0010986328125
x-request-id: Gw5vFSCiHFM
content-length: 201

{
    "detail": [
        {
            "type": "int_parsing",
            "loc": [
                "query",
                "a"
            ],
            "msg": "Input should be a valid integer, unable to parse string as an
integer",
            "input": "1.5",
            "url": "https://errors.pydantic.dev/2.4/v/int_parsing"
        }
    ]
}

Even with middleware implemented, the validation error is not recorded in the log. We only get the log from the request processing:

{
  "timestamp": "2023-10-19T08:05:14.601434+00:00",
  "level": "INFO",
  "message": "log request",
  "source": "middleware.py:log_request_middleware:30",
  "request_id": "Gw5vFSCiHFM",
  "remote_ip": "192.168.128.3:37830",
  "schema": "http",
  "protocol": "2",
  "method": "GET",
  "path_with_query": "/divide?a=1.5&b=2",
  "status_code": 422,
  "response_length": "201",
  "elapsed": 0.0010986328125,
  "referer": "-",
  "user_agent": "python-httpx/0.24.1"
}

Logging validation errors can be useful for debugging validation issues during development and testing. To do that, we are going to create an exception handler that will intercept RequestValidationError exceptions. The idea is to just log the validation error and then call the FastAPI default exception handler:

	`from fastapi import Request`
	`from fastapi.exception_handlers import (`
	`request_validation_exception_handler as _request_validation_exception_handler,`
	`)`
	`from fastapi.exceptions import RequestValidationError`
	`from fastapi.responses import JSONResponse`
	`from loguru import logger`


	`async def request_validation_exception_handler(`
	`request: Request, exc: RequestValidationError`
	`) -> JSONResponse:`
	`"""`
	`This is a wrapper to the default RequestValidationException handler of FastAPI`
	`that logs the exception for easier debugging.`
	`"""`
	`method = request.get('method', 'GET')`
	`query_string = request['query_string'].decode()`
	`path_with_query = request['path'] + ('?' + query_string if query_string else '')`
	`logger.info('request validation exception', method=method, path_with_query=path_with_query, detail=exc.errors())`
	`return await _request_validation_exception_handler(request, exc)`

A call to /divide?a=1.5&b=2 now generates the following log message:

{
  "timestamp": "2023-10-19T08:05:14.600671+00:00",
  "level": "INFO",
  "message": "request validation exception",
  "source": "exception_handlers.py:request_validation_exception_handler:17",
  "request_id": "Gw5vFSCiHFM",
  "method": "GET",
  "path_with_query": "/divide?a=1.5&b=2",
  "detail": [
    {
      "type": "int_parsing",
      "loc": [
        "query",
        "a"
      ],
      "msg": "Input should be a valid integer, unable to parse string as an integer",
      "input": "1.5",
      "url": "https://errors.pydantic.dev/2.4/v/int_parsing"
    }
  ]
}

Logging Unhandled Exceptions

Another type of error that can occur during the processing of a request is an unhandled exception. For example, if the endpoint /divide is called with the parameter b=0, an exception ZeroDivisionError will be thrown. The client receives a 500 response with a generic error message:

$ httpx --http2 --no-verify 'https://localhost/api/divide?a=1&b=0'
HTTP/2 500 Internal Server Error
alt-svc: h3=":443"; ma=2592000
content-type: text/plain; charset=utf-8
date: Thu, 19 Oct 2023 20:58:55 GMT
server: Caddy
server: hypercorn-h2
content-length: 21

Internal Server Error

In the current configuration, the FastAPI application does not log the exception, but passes it to the ASGI server, which creates a server log message:

[2023-10-19 20:58:53 +0000] [10] [INFO] Running on http://0.0.0.0:5000 (CTRL + C to quit)
[2023-10-19 20:58:55 +0000] [10] [ERROR] Error in ASGI Framework
Traceback (most recent call last):
  File "/venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 108, in __call__
    response = await self.dispatch_func(request, call_next)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/fastapi_minimum/middleware.py", line 22, in log_request_middleware
    response = await call_next(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 84, in call_next
    raise app_exc
  File "/venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 70, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "/venv/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/venv/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/venv/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/venv/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/venv/lib/python3.11/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/venv/lib/python3.11/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/venv/lib/python3.11/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
               ^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/fastapi/routing.py", line 273, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/fastapi/routing.py", line 190, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/fastapi_minimum/routers/hello.py", line 15, in divide
    return a / b
           ~~^~~
ZeroDivisionError: division by zero
...

There is no connection between the log message and the request, making it difficult to track the request. Ideally, the exception log message should contain the request identifier as in previous log messages.

Typically, we would solve this by creating an exception handler to log unhandled exceptions:

	`async def unhandled_exception_handler(request: Request, exc: Exception) -> JSONResponse:`
	`method = request.get('method', 'GET')`
	`query_string = request['query_string'].decode()`
	`path_with_qs = request['path'] + ('?' + query_string if query_string else '')`
	`logger.opt(exception=exc).info('Unhandled exception', method=method, path=path)`
	`return JSONResponse(content={'detail': 'Internal Server Error'}, status_code=500)`

The exception handler above creates a log message with the request identifier and the exception, but does not prevent the exception from being passed on to the server ASGI, which continues to generate a duplicate and decontextualized log message (see this link for more details).

To avoid this, let's improve middleware to also catch unhandled exceptions:

	`async def log_request_middleware(request: Request, call_next: Callable) -> Response:`
	`"""`
	`Uniquely identify each request and logs its processing time.`
	`"""`
	`start_time = time()`
	`request_id: str = token_urlsafe(config.REQUEST_ID_LENGTH)`
	`exception = None`

	`# keep the same request_id in the context of all subsequent calls to logger`
	`with logger.contextualize(request_id=request_id):`
	`try:`
	`response = await call_next(request)`
	`except Exception as exc:`
	`exception = exc`
	`response = PlainTextResponse('Internal Server Error', status_code=500)`
	`final_time = time()`
	`elapsed = final_time - start_time`
	`response_dict = {`
	`'status': response.status_code,`
	`'headers': response.headers.raw,`
	`}`
	`atoms = AccessLogAtoms(request, response_dict, final_time) # type: ignore`
	`data = {`
	`'remote_ip': request.headers.get('x-forwarded-for') or atoms['h'],`
	`'schema': request.headers.get('x-forwarded-proto') or atoms['S'],`
	`'protocol': atoms['H'],`
	`'method': atoms['m'],`
	`'path_with_query': atoms['Uq'],`
	`'status_code': response.status_code,`
	`'response_length': response_length,`
	`'elapsed': elapsed,`
	`'referer': atoms['f'],`
	`'user_agent': atoms['a'],`
	`}`
	`if not exception:`
	`logger.info('log request', **data)`
	`else:`
	`logger.opt(exception=exception).error('unhandled exception', **data)`
	`response.headers['X-Request-ID'] = request_id`
	`response.headers['X-Processed-Time'] = str(elapsed)`
	`return response`

Unhandled exceptions are caught in the try..except block (lines 23 to 27) and the exception is stored in the variable exception to be used later in the log message. The default response for unhandled exceptions is a 500 response with a generic error message (line 27).

The request log message type will depend on whether there was an exception or not. If there was no exception, the log message is generated with the level INFO (line 48). If there was an exception, the log message is generated with the level ERROR (line 50) and the exception included through the opt method, which allows changing the configuration of a specific call to logger.

exception cannot be passed as a parameter to logger.error() because it would be added to the extra field and would need another processing to be correctly serialized. Using logger.opt(exception=exception).error(...), exception is added to the first level of the log message and serialized without further changes.

The unhandled traceback exception is now presented along with the request log message:

{
  "timestamp": "2023-10-20T21:18:55.045870+00:00",
  "level": "ERROR",
  "message": "Unhandled exception",
  "source": "middleware.py:log_request_middleware:50",
  "request_id": "zv9ueo-i1Hg",
  "remote_ip": "192.168.144.3:36196",
  "schema": "http",
  "protocol": "2",
  "method": "GET",
  "path_with_query": "/divide?a=1&b=0",
  "status_code": 500,
  "response_length": "34",
  "elapsed": 0.0005097389221191406,
  "referer": "-",
  "user_agent": "python-httpx/0.24.1",
  "exception": "File \"/venv/lib/python3.11/site-packages/starlette/middleware/base.py\", line 78, in call_next\n    31   async def call_next(request: Request) -> Response:\n (...)\n    74       task_group.start_soon(close_recv_stream_on_response_sent)\n    75       task_group.start_soon(coro)..."
}

Automated Tests

To ensure that log messages are being generated correctly, let's create some automated tests in test_logging.py.

Application Fixture for Testing

The first part of the file test_logging.py is presented below:

	`import json`
	`from collections.abc import AsyncIterable`
	`from unittest.mock import patch`

	`from fastapi import APIRouter, FastAPI, status`
	`from httpx import AsyncClient`
	`from pytest import CaptureFixture, fixture`

	`basic_log_fields = {`
	`'timestamp',`
	`'level',`
	`'message',`
	`'source',`
	`'request_id',`
	`'method',`
	`'path_with_query',`
	`}`
	`failed_validation_log_fields = basic_log_fields \| {'detail'}`
	`request_log_fields = basic_log_fields \| {'protocol', 'schema', 'elapsed', 'status_code'}`
	`exception_log_fields = request_log_fields \| {'exception'}`


	`@fixture(scope='module')`
	`async def logging_client() -> AsyncIterable[AsyncClient]:`
	`"""`
	`Independent client/app instance for testing the logging.`
	`"""`
	`from fastapi_minimum.logging import init_loguru`
	`from fastapi_minimum.main import (`
	`BaseHTTPMiddleware,`
	`RequestValidationError,`
	`log_request_middleware,`
	`request_validation_exception_handler,`
	`)`

	`router = APIRouter()`

	`@router.get('/info')`
	`async def info() -> dict[str, int]:`
	`return {'data': 1234}`

	`@router.get('/divide')`
	`async def divide(a: int, b: int) -> float:`
	`return a / b`

	`app = FastAPI()`
	`app.include_router(router)`
	`app.add_middleware(BaseHTTPMiddleware, dispatch=log_request_middleware)`
	`app.add_exception_handler(RequestValidationError, request_validation_exception_handler)`

	`init_loguru()`

	`async with AsyncClient(app=app, base_url='http://test_logging') as client:`
	`yield client`

	`return`

Lines 8 to 19 define the fields that must be present in each type of log message. These fields will be used in tests to verify that log messages are being generated correctly.

logging_client (line 23 to 56) is a fixture that creates a new application with the endpoints necessary for testing, adds the logging middleware and exception handler, and returns an HTTP client for that application.

The /info endpoint returns a simple response (lines 38 to 40). It represents a normal call to an endpoint without exceptions or validation errors.

/divide takes two integer parameters a and b, and returns the result of dividing a by b (lines 42 to 44). This endpoint will be used both to test validation errors, by receiving a float instead of an integer, and for unhandled exceptions, when receives b=0 and generates a ZeroDivisionError exception.

The function init_loguru (line 51) initializes the logger with custom serialization. There may be other initializations in the main application, but only this one is necessary for logging tests .

Testing JSON Format in Log Messages

In the second part of the same file, we have the first test that verifies whether the log message is generated in the JSON format and contains the expected fields:

	`async def test_json_logging(`
	`logging_client: AsyncClient, capsys: CaptureFixture`
	`) -> None:`
	`"""`
	`Test that the log is in JSON format.`
	`"""`
	`with patch(`
	`'fastapi_minimum.logging.highlight', side_effect=lambda x, y, z: x`
	`): # prevents highlighting`
	`response = await logging_client.get('/info')`
	`assert response.status_code == status.HTTP_200_OK`
	`assert response.json() == {'data': 1234}`

	`log = json.loads(capsys.readouterr().err)`
	`assert request_log_fields <= set(log.keys())`
	`assert log['level'] == 'INFO'`
	`assert 'exception' not in log`

capsys is a pytest fixture that captures standard and error output.

Lines 65 through 67 replace the function highlight in the logging module with a function that returns the input unchanged. It prevents the output to the terminal from being colored during testing, which would make it difficult to compare with the expected JSON.

The log message must contain at least the fields of request_log_fields (line 73).

Since it is not an exception, the level must be INFO (line 74) and the field exception must not be present (line 75).

Testing Validation Error Log Messages

In the third part of test_logging.py, we have a test that checks if the log message is generated correctly for a validation error:

	`async def test_logging_422_exception(`
	`app_with_logging_routes: FastAPI, client: AsyncClient, capsys: CaptureFixture`
	`) -> None:`
	`"""`
	`Test if the log contains the exception when the request is invalid.`
	`"""`
	`with patch(`
	`'fastapi_minimum.logging.highlight', side_effect=lambda x, y, z: x`
	`): # prevents highlighting`
	`response = await client.get('/test_logging/divide', params={'a': 1.1, 'b': 0})`
	`assert response.status_code == status.HTTP_422_UNPROCESSABLE_ENTITY`
	`detail = response.json()['detail']`

	`# there must be 2 log entries: 1 for the exception and 1 for the request`
	`text = capsys.readouterr().err`
	`position = text.index('\n}\n{\n')`

	`# test validation log`
	`validation_log = json.loads(text[: position + 2])`
	`assert failed_validation_log_fields <= set(validation_log.keys())`
	`assert 'exception' not in validation_log`
	`assert validation_log['level'] == 'INFO'`
	`assert detail == validation_log['detail']`

	`# test request_log`
	`request_log = json.loads(text[position + 3 :])`
	`assert request_log_fields <= set(request_log.keys())`
	`assert request_log['level'] == 'INFO'`
	`assert 'exception' not in request_log`

The response to the request that causes the validation error generates two log messages: one for the exception and one for the request. The messages are separated by the character sequence \n}\n{\n (line 93).

The first message is the exception. One of the test conditions is that the detail in the log message is the same as detail in the response field.

The second message is the request. Checking the fields is the same as the previous test.

Testing Unhandled Exception Log Messages

In the fourth part of test_logging.py, we have the test that checks if the log message is generated correctly for an unhandled exception:

	`async def test_logging_500_exception(`
	`app_with_logging_routes: FastAPI, client: AsyncClient, capsys: CaptureFixture`
	`) -> None:`
	`"""`
	`Test the log message of a unhandled exception.`
	`"""`
	`with patch(`
	`'fastapi_minimum.logging.highlight', side_effect=lambda x, y, z: x`
	`): # prevents highlighting`
	`response = await client.get('/test_logging/divide', params={'a': 1, 'b': 0})`
	`assert response.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR`
	`assert response.text == 'Internal Server Error'`

	`log = json.loads(capsys.readouterr().err)`
	`assert exception_log_fields <= set(log.keys())`
	`assert log['level'] == 'ERROR'`

The request response is a 500 response with a generic error message (lines 119 and 120). The log message must contain at least the fields of exception_log_fields (line 123) and the log level must be ERROR (line 124).

Testing Non-Serializable Types in the Log

In the fifth and final part of test_logging.py, we have the test that verifies that types that are not serializable do not cause errors in the log:

	`async def test_default_encoding(logging_client: AsyncClient, capsys: CaptureFixture) -> None:`
	`"""`
	`Test the encoding behavior for non-serializable types.`
	`"""`
	`from datetime import datetime`
	`from pathlib import Path`

	`from loguru import logger`

	`# Path, datetime and set are non-serializable types by default`
	`for param in (Path('.'), datetime.now(), {1, 2}):`
	`logger.info('test param encoding', param=param)`
	`assert 'TypeError:' not in capsys.readouterr().err`

Although they are not serializable to JSON by default, Path, datetime and set should not cause errors in the log because they are converted to str before being serialized, as defined in the serialize function in logging.py.

Conclusion

FastAPI is one of the most productive and efficient frameworks for developing REST APIs in Python. However, recording log messages is not part of FastAPI's native functionality and needs to be implemented separately.

The solution presented in this article starts with configuring Loguru to generate log messages in JSON format. From there, we created additional configurations for FastAPI based on middleware and an exception handlers to generate log messages with unique identifiers for each request, record the processing time of each response, and record validation errors and unhandled exceptions. These configurations have been tested and validated through automated tests.

The end result is an efficient and flexible logging system that can be used as a basis for developing any FastAPI application. Adjustments can be made to meet the specific needs of each project, but basic monitoring and debugging needs are already covered.

The presented code can be used with few modifications in other FastAPI projects. But if you want to start a new FastAPI project with these features, the best option is to use the template available on GitHub. To instantiate a new project, you need to use cookiecutter. I recommend combining it with pipx:

$ pipx run cookiecutter gh:andredias/perfect_python_project \
       -c fastapi-minimum

References

Next article: Developing a FastAPI Application for File Sharing - Part 1

Previous article: Packaging and Distribution of the Minimal FastAPI Project

Blog

Introduction

Logs in JSON Format

First Version of Custom Serialization

Second Version with Indentation and Syntax Highlighting

Third and Final Version of Custom Serialization

Adjustments to the Minimal FastAPI Project

Request Tracking

Request Identifier

Passing the Request Identifier for Subsequent Calls to the Logger

Request and Response Data

Registering the Middleware in the Application

Log Message After Middleware is Registered

Logging Validation Errors

Logging Unhandled Exceptions

Automated Tests

Application Fixture for Testing

Testing JSON Format in Log Messages

Testing Validation Error Log Messages

Testing Unhandled Exception Log Messages

Testing Non-Serializable Types in the Log

Conclusion

References

Comments