Agenda

This post aims at making you aware about the internals of the Python logging module.

I often find myself struggling to use the logging module and have to refer the documentation. It's either my ignorance towards logging or probably logging isn't documented as well and crisply as it should be.

So this is my take to unravel what goes on in the logging module. If you have a solid grasp of different components which work under the hood, there is a higher likelihood that you don't have to refer the documentation everytime you want to use logging.

Logging components

Python logging module can be categorized into four broad components:

  • Formatter
  • Filter
  • Handler
  • Logger

These four components work in harmony to provide a smooth logging experience.

There is another component at the core of all these four components. This other component is called LogRecord.

A LogRecord instance represents an event being logged.

Pay attention to the event in the above sentence. A LogRecord instance should be considered to represent an event.

An event could be a user registration, or a user changing their password or a user downloading the report etc. Your application might want to log and keep a track of these events.

LogRecord

Python logging module has a class called LogRecord. It can be imported in the following way:

In [35]: from logging import LogRecord

Any call to logging.info(), logging.debug(), logger.info(), logger.debug() etc. generates a LogRecord instance. That's why it's essential to understand a LogRecord.

A LogRecord instance has five major attributes:

  • name — which serves as the identifier for the LogRecord.
  • levelno and levelname — work together to represent the severity of the event. Generally you might want to associate different severity with different events.
  • msg — a descriptive text to associate with the log
  • args – a set of arguments that can be used for the purpose of string substitution in the msg

Let's create few LogRecord instances. We will use a helper method called makeLogRecord() provided by logging module

In [37]: from logging import makeLogRecord

LogRecord __init__ signature isn't very straightforward so we are using makeLogRecord to create LogRecord instances.

In [44]: record = makeLogRecord({'name': 'first record', 'levelname': logging.INFO, 'levelno':20, 'msg': 'User registered', 'args': ()})

In [45]: type(record)
Out[45]: logging.LogRecord

In [46]: record.name
Out[46]: 'first record'

In [47]: record.msg
Out[47]: 'User registered'

In [48]: record.args
Out[48]: ()

In [49]: record.levelname
Out[49]: 20

We passed levelname as logging.INFO. Module logging has a constant called INFO.

Severity levels

Let's talk a bit on the severity levels and the constants used to represent them.

As described earlier, levelname and levelno represent severity of LogRecord events. The logging module uses constants to define the severities. The severity levels are:

  • logging.DEBUG
  • logging.INFO
  • logging.WARNING

I have deliberately left out a few levels to keep this simple.

Each severity has a string representation and a numeric value associated with it. eg: logging.INFO has a string representation of INFO and has a numeric value of 20. logging.DEBUG has a string representation of DEBUG and has a numeric value of 10.

makeLogRecord

Coming back to how we created a LogRecord using makeLogRecord.

We passed args as empty tuple in this scenario because there is no string substitution required in the msg.

Let's create another LogRecord where we will have a string substitution.

In [54]: record = makeLogRecord({'name': 'second record', 'levelname': loggin
    ...: g.INFO, 'levelno':20, 'msg': 'User registered with username %s', 'ar
    ...: gs': ('steve',)})

In [55]: record.msg
Out[55]: 'User registered with username %s'

In [56]: record.args
Out[56]: ('steve',)

In [57]: record.getMessage()
Out[57]: 'User registered with username steve'

LogRecord has a method called getMessage() which combines msg and args and does string substitution to generate a readable message for the event.

Let's create another record.

In [74]: msg = 'User registered with username %(username)s. Assigned id is %(id)s'

In [75]: args = {'username': 'steve', 'id': 5}

In [62]: record = makeLogRecord({'name': 'third record', 'levelname': logging.INFO, 'levelno':20, 'msg': msg, 'args': args})

In [63]: record.getMessage()
Out[63]: 'User registered with username steve. Assigned id is 5'

Formatter

You would expect a control over the format of your event logs.

Sometimes you might want to display the severity level while at other times you might not want to display the severity level.

Similarly, sometimes you might want the event time to precede the severity level while sometime you might want time info to show after the severity level.

This is where Formatter comes into picture.

Let's create a Formatter and log a LogRecord using the formatter.

In [72]: record = makeLogRecord({'name': 'user.username', 'levelname': getLevelName(logging.INFO), 'levelno':logging.INFO, 'msg': msg, 'args': args})

In [73]: from logging import Formatter

In [76]: formatter = Formatter("Name: %(name)s Level: %(levelname)s Message: %(message)s")

In [73]: formatter.format(record)
Out[73]: 'Name: user.username Level: INFO Message: User registered with username steve. Assigned id is 5'

Let's create another Formatter and log using the formatter. Let's use levelno instead of levelname this time.

In [76]: formatter = Formatter("Name: %(name)s Level: %(levelno)s Message: %(message)s")

In [77]: formatter.format(record)
Out[77]: 'Name: user.username Level: 20 Message: User registered with username steve. Assigned id is 5'

Essentially any attribute of LogRecord can be passed as format specifier in Formatter. And they will be substitued with LogRecord attribute values during formatting.

So the attributes that can be used are:

  • name
  • args
  • levelname
  • levelno
  • pathname
  • filename
  • module
  • lineno
  • funcName
  • thread
  • threadName
  • processName

And few more!

LogRecord doesn't have an attribute called message. But it can be used as format specifier in Formatter. In such cases a message is generated using LogRecord's msg and args attribute and is substitued for message specifier.

In addition to message, asctime can be used as format specifier too. This too isn't an attribute of LogRecord but the formatter will generate a value for asctime based on LogRecord's created attribute.

In [78]: formatter = Formatter("Time: %(asctime)s Name: %(name)s Level: %(levelno)s Message: %(message)s")

In [82]: formatter.format(record)
Out[82]: 'Time: 2020-03-05 01:45:10,295 Name: user.username Level: 20 Message: User registered with username steve. Assigned id is 5'

It is possible to specify a date format on the formatter.

In [89]: formatter = Formatter("Time: %(asctime)s Name: %(name)s Level: %(levelno)s Message: %(message)s", datefmt="%B %dth %H:%M %p")

In [90]: formatter.format(record)
Out[90]: 'Time: March 05th 01:45 AM Name: user.username Level: 20 Message: User registered with username steve. Assigned id is 5'

Handler

A handler decides how the output of a particular logrecord should be handled. A logrecord could be handled in multiple ways. Some possible ways are:

  • Write logrecord to a file
  • Write logrecord to a sys.stderr
  • Email the logrecord details to an admin
  • Add an entry for the logrecord in the database

Let's create a handler which writes to sys.stderr

In [91]: from logging import StreamHandler

In [92]: handler = StreamHandler()

In [93]: handler.handle(record)
User registered with username steve. Assigned id is 5
Out[93]: 1

A handler also requires an associated formatter.

In the last example we did not associate a formatter with the handler, so the logging module associated a default formatter with this handler. The format for the default formatter is "%(message)s". That's why only the record's message was logged.

Let's associate the handler with an explicit formatter.

In [94]: formatter = Formatter("Time: %(asctime)s Name: %(name)s Level: %(levelno)s Message: %(message)s", datefmt="%B %dth %H:%M %p")

In [95]: handler.setFormatter(formatter)

In [96]: handler.handle(record)
Time: March 05th 01:45 AM Name: user.username Level: 20 Message: User registered with username steve. Assigned id is 5
Out[96]: 1

Calling handler.handle() internally calls the associated formatter's format().

Handlers can also have multiple filters associated with them. Let's create a custom filter which can dictate to handler to log only events with severity equal to higher than logging.INFO.

In [105]: from logging import Filter

In [106]: class SeverityFilter(Filter):
     ...:     def filter(self, record):
     ...:         if record.levelno < logging.INFO:
     ...:             return False
     ...:         return True
     ...:

In [108]: severity_filter = SeverityFilter()

In [109]: handler.addFilter(severity_filter)

In [110]: handler.handle(record)
Out[110]: 0

The record wasn't logged here because the severity of this record is DEBUG and our filter does not allow the handler to log events with a severity less than INFO.

Let's create another record with severity of INFO and handle it using the handler.

In [111]: record = makeLogRecord({'name': 'user.username', 'levelname': getLevelName(logging.INFO), 'levelno':logging.INFO, 'msg': msg, 'args': args})

In [112]: handler.handle(record)
User registered with username steve. Assigned id is 5
Out[112]: 1

Logger

The final component of the logging is Logger. Docstring of Logger defines it as:

Instances of the Logger class represent a single logging channel.
A "logging channel" indicates an area of an application.

Loggers are used to compartmentalize different logging messages such that each message belongs to a particular application area. This will become clear with the upcoming examples.

A logger has following things associated with it:

  • Severity level
  • Multiple filters
  • Multiple handlers

If a logrecord's severity level is below the logger's severity level, then the record is ignored.

If any of the associated filters of the logger return False, then the record is ignored.

If record is not ignored by either of the severity level and the associated filters, then logrecord is passed to all associated handlers. And each handler will handle the logrecord as per it's own logic.

Let's first create a simple logger without any explicit level and log few messages.

In [1]: from logging import Logger

In [2]: logger = Logger('urbanpiper')

In [3]: logger.debug('This code flow kicks in')
No handlers could be found for logger "urbanpiper"

The logging system complains that No handlers could be found for logger "urbanpiper". This means that we definitely need to associate atleast one handler with the logger.

Let's create a handler and associate with the logger.

In [6]: handler = StreamHandler()

In [7]: logger.addHandler(handler)

In [8]: logger.debug('This code flow kicks in')
This code flow kicks in

Internally, a LogRecord was created with msg as This code flow kicks in. The created logrecord was passed to handler.handle(). Since we did not associate a formatter with handler, so the handler used the default formatter and the logrecord was logged.

Let's set the severity level on the logger.

In [11]: logger = Logger('urbanpiper', level=logging.INFO)

In [12]: handler = StreamHandler()

In [13]: logger.addHandler(handler)

In [14]: logger.debug('This code flow kicks in')

In this case, the severity level of logger is INFO while we are trying to log a record with severity level DEBUG. Since level DEBUG is less than INFO, so it wasn't logged.

Let's try logging a record with level INFO.

In [15]: logger.info('This code flow kicks in')
This code flow kicks in

As expected, this was logged.

Similar to how we earlier associated a filter with handler, we could associate the filter with loggger too.

In [16]: from logging import Filter

In [17]: class SeverityFilter(Filter):
    ...:     def filter(self, record):
    ...:         if record.levelno < logging.WARNING:
    ...:             return False
    ...:         return True
    ...:

In [18]: filter = SeverityFilter()

In [19]: logger = Logger('urbanpiper', level=logging.INFO)

In [20]: logger.addHandler(StreamHandler())

In [21]: logger.addFilter(filter)

Our filter doesn't allow a logrecord with severity less than WARNING to be logged.

In [22]: logger.debug('This code flow kicks in')

In [23]: logger.info('This code flow kicks in')

In [24]: logger.warning('This code flow kicks in')
This code flow kicks in

As expected, trying to log a record with severity level less than WARNING were ignored.

We can add multiple handlers to a logger too. eg: We can add a StreamHandler as well as a FileHandler to the logger.

In [26]: from logging import FileHandler

In [19]: logger = Logger('urbanpiper', level=logging.INFO)

In [20]: logger.addHandler(StreamHandler())

In [27]: handler = FileHandler('log.txt')

In [28]: logger.addHandler(handler)

In [29]: logger.warning('This code flow kicks in')
This code flow kicks in

A file called log.txt should have been created and you should be able to see the record message written to that file.

handler.close() needs to be called to close the underlying file.

In [30]: handler.close()

Logger Heirarchy

In last section, we explicitly created a Logger to understand what happens under the hood.

In a production setting, you shouldn't be creating a Logger explicitly. Instead you should be using getLogger.

getLogger is clever enought to allow nested loggers. The advantage of nested loggers is that every child logger gets the handlers of it's parent logger for free.

An example would make it clear. Let's create a logger.

In [4]: import logging

In [5]: urbanpiper_logger = logging.getLogger('urbanpiper')

In [11]: urbanpiper_logger.setLevel(logging.DEBUG)

We set name for our logger as urbanpiper. We need to associate a handler with the logger.

In [12]: handler = StreamHandler()

In [15]: formatter = logging.Formatter('Level:%(levelname)s Message:%(message)s')

In [16]: handler.setFormatter(formatter)

In [17]: urbanpiper_logger.addHandler(handler)

Let's log a record with severity INFO.

In [18]: urbanpiper_logger.info('User created with id %s', 'user-1')
Level:INFO Message:User created with id user-1

Let's create a child logger named urbanpiper.tasks

In [19]: task_logger = logging.getLogger('urbanpiper.tasks')

Without setting any handler on this child logger, let's try to log a record.

In [22]: task_logger.info('Task %s executed', 'send_mail')
Level:INFO Message:Task send_mail executed

Despite not associating any handler with the child logger, logging system didn't complain with No handlers could be found for logger "urbanpiper.tasks". The handler of logger urbanpiper was invoked instead.

Infact, any logger you create having name of the patter 'urbanpiper.A' or 'urbanpiper.A.B' etc. can use the parent logger as a fallback.

This pattern comes very handy when you have a large project and setting up the logger in each module would become cumbersome. Consider the following project structure.

settings.py
urls.py
polls
    |- __init__.py
    |- views.py
    |- tasks.py
    |- models.py
people
    |- __init__.py
    |- views.py
    |- tasks.py
    |- models.py

Your project has two packages namely polls and people. Each package has 3 modules each. polls has views.py, tasks.py and models.py. Similarly people has 3 modules.

If logging library didn't provide the nested logger semantics, you would have had to define a logger in each of the module and associate a handler with it. Something like the following:

# polls/views.py
logger = logging.getLogger('polls.views')
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler())

...more code..

# polls/tasks.py
logger = logging.getLogger('polls.tasks')
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler())

Same way you would need to add similar boilerplate logging code in other moduled.But with nested loggers, we can set severity level and handler at the package level and all modules of that package will get the level and handler for free.

We will have the following code in polls/__init__.py

# polls/__init__.py
logger = logging.getLogger('polls')
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler())

And only define the logger in the modules, there is no need to set the level and handler.

# polls/views.py
logger = logging.getLogger('polls.views')

# polls/tasks.py
logger = logging.getLogger('polls.tasks')

Infact we can remove the hardcoded logger names and instead use the module name.

# polls/views.py
logger = logging.getLogger(__name__)

# polls/tasks.py
logger = logging.getLogger(__name__)

Logging system allows compartmentalizing the log records in a namespace while providing the flexibility to not associate the handler with each logger. Instead a logger's parent heirarchy is capable to handle the log records.


Hope you found this unraveling of the logging system insightful and it helps you understand what goes under the hood when you use the logging library.