In this post I’ll show several approaches to logging in Python.

  • We’ll start off with the simplest, most basic approach: using print statements. This is often looked down on, but it does have some real utility and there are a few useful hints you can use to get the most of this approach.

  • Next, we will move onto a more formal approach of using Python’s built-in logging capabilities.

  • Finally, we’ll look at the structlog library, and talk about why I consider it to be the best approach. We’ll also look at the pros and cons of some other third-party logging libraries, and why I would choose structlog over them.

Basic: print statements

Sometimes, a print statement is all you need to quickly verify your code’s behaviour.

In a pinch, you might only want confirmation that the code you are running is entering a particular code block, such as the expected branch of an if statement. For instance, I often use print("sanity") or print("got here").

To trace the path through some code, scattering print(“A”), print(“B”), etc., can help. This is especially effective if an exception is raised and you’re having difficulty tracking down the cause.

This method is quick and dirty, with quickness being the key. It’s not elegant but can be effective and typically won’t hit source control.

As an aside, using the f-string syntax with a variable name followed by = is a useful shortcut. Instead of writing print("answer=", answer), you can simply do this:

>>> answer = 42
>>> print(f"{answer=}")
answer=42

I use prints for transient testing and debugging. For more permanent solutions, logging is a better choice.

Better: logging

Base logger

Python has a logging component in the standard library. This is straightforward to use and configure, and well documented.

It is the benchmark against which all third-party logging libraries must be judged; each of them addresses aspects of the standard logging library that their creators think can be improved.

import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s"
)
logging.info("Hello, world.")
2024-05-18 22:13:06,880 - INFO - Hello, world.

There are various logging levels you can use for messages, in increasing order of severity:

  • logger.debug('This is a debug message')
  • logger.info('This is an info message')
  • logger.warning('This is a warning message')
  • logger.error('This is an error message')
  • logger.critical('This is a critical message')

You can adjust the logging level to control which messages are logged. For instance, you might set a lower level when debugging to see more detailed messages, and a higher logging level in production to log only warnings and above.

A note about message construction

When constructing log messages, there are different ways to include variable data. Consider the following examples:

widget = 3
logger.info(f"Widget {widget} is out of space")
logger.info("Widget %s is out of space", widget)

Both these logger calls output Widget 3 is out of space. However, the second format is preferable for a couple of reasons:

  1. The conversion of widget to a string and embedding it into the larger message string will only happen if the log level is low enough for the message to be output. This can save processing time and resources when logging is disabled for that level.

  2. Some formatters will be able to group together all the messages using the same template. This is important if you use a log aggregation service that you will later want to search through and generate statistics from.

Custom logger

In full-fledged applications, it’s beneficial to have each Python module maintain its own independent logger. This approach offers several benefits:

  • You can set log levels and handler types independently in different parts of your code.
  • Log messages can include the logger name to highlight which part of your codebase they came from.
  • Your logging won’t interfere with the other code that calls into yours (for example, if you are creating a library).

Here is a typical example. It creates a logger with two handlers: one to log to the console all messages at level INFO and above, and one to log to a file all messages at level DEBUG and above.

Note that the base logger itself is set to DEBUG; if it were set to a higher level then debug messages would be filtered out before getting to the handlers, whatever their own logging levels were set to.

import logging

logger = logging.getLogger(__file__)
logger.setLevel(logging.DEBUG)

console_handler = logging.StreamHandler()
file_handler = logging.FileHandler("app.log")

console_handler.setLevel(logging.INFO)
file_handler.setLevel(logging.DEBUG)

formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
console_handler.setFormatter(formatter)
file_handler.setFormatter(formatter)

logger.addHandler(console_handler)
logger.addHandler(file_handler)

logger.debug("Debug message to file")
logger.info("Info message to console and file")
matthew@grug ~/p/2024-05-15> python3 main_logging.py
2024-05-18 22:56:16,249 - /home/matthew/permatmp/2024-05-15/main_logging.py - INFO - Info message to console and file
matthew@grug ~/p/2024-05-15> cat app.log
2024-05-18 22:56:16,249 - /home/matthew/permatmp/2024-05-15/main_logging.py - DEBUG - Debug message to file
2024-05-18 22:56:16,249 - /home/matthew/permatmp/2024-05-15/main_logging.py - INFO - Info message to console and file

Typically, you’d write a utility function to apply your preferred setup to each logger.

def setupLogging(log_name):
    logger = logging.getLogger(log_name)
    logger.setLevel(logging.DEBUG)

    console_handler = logging.StreamHandler()
    file_handler = logging.FileHandler("app.log")

    console_handler.setLevel(logging.INFO)
    file_handler.setLevel(logging.DEBUG)

    formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
    console_handler.setFormatter(formatter)
    file_handler.setFormatter(formatter)

    logger.addHandler(console_handler)
    logger.addHandler(file_handler)

    return logger

Then you would set up the logs in your modules by calling it:

from utils import setupLogging

logger = setupLogging(__file__)

Best: structlog

structlog is a powerful and flexible logging library that goes above and beyond the functionality of Python’s built-in logging. It focuses on structured logging, where simple log messages (events) accompanied by value-key metadata (context), making them easy for other tools to parse later.

Structured logging separates the context from the message, making the logs easier to write and parse. While you can still use %s placeholders (as described above), structlog’s approach simplifies adding context to your logs.

>>> from structlog import get_logger
>>> logger = get_logger()
>>> widget = 3
>>> logger.info("Widget is out of space", widget=widget)

2024-06-02 21:46:31 [info     ] Widget is out of space         widget=3

Binding values

One of the powerful features of structlog is the ability to bind values to loggers. This means you can attach extra context, such as a session ID, to a logger, and these values will automatically be included in all subsequent log messages.

>>> from structlog import get_logger
>>> logger = get_logger()
>>> bound_logger = logger.bind(session=123)
>>> bound_logger.info("hello")

2024-06-04 11:37:49 [info     ] Hello                          session=123

Exceptions

When debugging locally, the more quickly you can get to the source of the problem the better. Structlog’s display of stack traces is clear and helpful, making it easier to diagnose issues.

Here’s an example of a stack trace generated by structlog:


╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
 /home/matthew/permatmp/2024-05-15/main_exception.py:4 in <module>                                
                                                                                                  
   1 from structlog import get_logger                                                             
   2 logger = get_logger()                                                                        
   3 try:                                                                                         
 4 div_by_zero = 1 / 0                                                                      
   5 except BaseException as e:                                                                   
   6 logger.exception(e)                                                                      
   7                                                                                              
                                                                                                  
 ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ 
           e = ZeroDivisionError('division by zero')                                            
  get_logger = <function get_logger at 0x75c44db8dd80>                                          
      logger = <BoundLoggerLazyProxy(logger=None, wrapper_class=None, processors=None,          
               context_class=None, initial_values={}, logger_factory_args=())>                  
 ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ 
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ZeroDivisionError: division by zero

As you can see, structlog highlights the problematic line with a red arrow, along with some context around it. It also lists all the local variables (locals), which can be useful for diagnosing the issue.

Aside: this colourful HTML was generated by piping into ansi2html with --partial --inline parameters set.

Typical structlog config

Structlog’s configuration is incredibly flexible. The documentation is thorough but fragmented, so here I’ve consolidated guidance into a realistic configuration that includes the following features:

  • Optimised for human readability when running locally; optimised for aggregation and machine readability when running in production.
  • Adds the file name and line number to log messages, which isn’t included by default but is generally useful information.
  • Uses a LOG_LEVEL environment variable to set the logging level; log entries at lower levels will be treated as no-ops.
  • Uses the orjson library for JSON generation in production, for ultra-fast JSON creation (straight to bytes).
import logging
import os
import sys

import orjson
import structlog
from structlog.contextvars import merge_contextvars
from structlog.dev import ConsoleRenderer, set_exc_info
from structlog.processors import (
    CallsiteParameter,
    CallsiteParameterAdder,
    EventRenamer,
    JSONRenderer,
    StackInfoRenderer,
    TimeStamper,
    add_log_level,
    dict_tracebacks,
)

shared_processors = [
    # This list is based on _BUILTIN_DEFAULT_PROCESSORS.
    # See https://github.com/hynek/structlog/blob/main/src/structlog/_config.py
    merge_contextvars,
    add_log_level,
    StackInfoRenderer(),
    set_exc_info,
    # Extras not in the defaults
    CallsiteParameterAdder([CallsiteParameter.MODULE, CallsiteParameter.LINENO]),
]

# Approach based on https://www.structlog.org/en/stable/logging-best-practices.html
logger_factory = None  # use the default
if sys.stderr.isatty():
    # Pretty printing when we run in a terminal session.
    # Automatically prints pretty tracebacks when "rich" is installed
    processors = shared_processors + [
        TimeStamper(fmt="%Y-%m-%d %H:%M:%S", utc=False),
        ConsoleRenderer(),
    ]
else:
    # Print JSON when we run, e.g., in a Docker container.
    # Also print structured tracebacks.
    processors = shared_processors + [
        # slightly less readable and more easily processed, e.g. `2024-06-02T23:04:00.000000Z`.
        TimeStamper(fmt="iso", utc=True),
        EventRenamer("msg"),  # rename "event" key to "msg"
        dict_tracebacks,
        JSONRenderer(serializer=orjson.dumps),
    ]
    # Note that orjson generates bytes not strings
    # logger_factory=structlog.BytesLoggerFactory()  # for outputting JSON to console
    logger_factory=structlog.BytesLoggerFactory(
        file=open("output.log", "ab")  # for outputting JSON to file
    )

log_level_raw = os.environ.get("LOG_LEVEL", "INFO").upper()
log_level = logging.getLevelName(log_level_raw)
if isinstance(log_level, str):
    raise ValueError(f"Invalid log level: {log_level_raw}")


structlog.configure(
    cache_logger_on_first_use=True,
    wrapper_class=structlog.make_filtering_bound_logger(log_level),
    processors=processors,
    logger_factory=logger_factory,
)

The configuration above should be run once in your application. Then, in the modules where you use logging, you can set up and use the logger as follows:

import structlog

logger = structlog.get_logger()
widgets = 3

logger.debug("Too low a level -- will be skipped");
logger.info("Test info", widgets=widgets)
logger.warning("Test warning", widgets=widgets)
logger.error("Test error", widgets=widgets)
logger.critical("Test critical", widgets=widgets)

2024-06-02 23:04:00 [info     ] Test info                      lineno=5 module=my_module widgets=3
2024-06-02 23:04:00 [warning  ] Test info                      lineno=5 module=my_module widgets=3
2024-06-02 23:04:00 [error    ] Test info                      lineno=5 module=my_module widgets=3
2024-06-02 23:04:00 [critical ] Test info                      lineno=5 module=my_module widgets=3

Or, if outputting to a file:

> cat output.log
{"widgets":3,"level":"info","module":"main_structlog2","lineno":75,"timestamp":"2024-06-04T09:20:57.636807Z","msg":"Test info"}
{"widgets":3,"level":"warning","module":"main_structlog2","lineno":76,"timestamp":"2024-06-04T09:20:57.637064Z","msg":"Test warning"}
{"widgets":3,"level":"error","module":"main_structlog2","lineno":77,"timestamp":"2024-06-04T09:20:57.637170Z","msg":"Test error"}
{"widgets":3,"level":"critical","module":"main_structlog2","lineno":78,"timestamp":"2024-06-04T09:20:57.637254Z","msg":"Test critical"}

A note about async logging

Structlog also supports async logging, which can be useful in scenarios where logging is particularly slow and frequent, such as logging to a database. The async function calls use a prefixes, like await logger.ainfo("...").

Async logging can be beneficial when logging involves complex pipelines or expensive-to-write syncs. However, it’s important to combine this with batching, saving up log entries and writing them in bulk periodically. Otherwise, the overhead of async call machinery might outweigh the benefits of asynchronous writing, potentially resulting in worse performance.

As with most performance tweaks, you should profile the code to check that it works in practice before adopting this wholesale.

Other options

Loguru

Loguru is a good alternative logging framework that simplifies the logging setup in Python.

Pros:

  • Much easier configuration that structlog.
  • Faster than the standard Python logging module.

Cons:

  • Uses string-based logging with {} placeholders; I would prefer full structured logging, and this form is not drop-in compatible with existing standard logging implementations using %s style placeholders.

Logbook

Logbook is another interesting Python logging library that offers unique features and improvements over the standard logging module.

Pros:

  • Faster than the standard Python logging module.
  • Includes esoteric handlers for logging directly to some unusual syncs. For example, you can log to Twitter or ZeroMQ, or send critical errors to an mail eaddress.
  • Trivially easy to redirect standard logging calls to Logbook with a redirect_logging() call.

Cons:

  • Does not support full structured logging.

Sentry

There is a Python API for the Sentry platform, which might be of interest if you want to aggregate all your logs into Sentry.

python-json-logger

The python-json-logger library adds JSON-formatted output to the standard logging module. However, it doesn’t appear to be under active maintenance.

Wrap-up

Structlog offers some welcome extra functionality over that provided by the standard library’s logging component. The coloured output makes it easier to parse console output; in particular, the formatting of logged exceptions makes it simpler to track down problems.

Logging is one of those things that you set up once in a project and then rarely have to tweak the configuration again. While structlog’s flexibility can make its documentation challenging to follow, this initial setup effort pays off with simplified maintenance in the long run.

Although many other logging libraries are easier to configure, they primarily focus on filling in templated message strings rather than providing JSON-formatted output. As demonstrated in this post, structlog makes it possible to achieve both easy-to-read logging locally and machine-friendly output in production.