Skip to content

Deadlock when objects emit messages at garbage collection #712

@jacksmith15

Description

@jacksmith15

Hello 👋

I have found a issue whereby loguru can hit a deadlock due to garbage collection. Specifically, if there is an object in memory that has the following properties:

  1. Is part of a Cyclic Isolate (i.e. it will be garbage collected by the generational garbage collector, not the reference counter)
  2. Implements a __del__ finalizer which explicitly or implicitly triggers a log statement

Then any memory assignment during log handling has a chance to trigger a deadlock.

Reproducible test case

Python version: 3.8.12
Loguru version: 0.6.0

The following is a script which demonstrates the deadlock:

import sys

import loguru


class CyclicReference:
    """A minimal cyclic reference.

    Cyclical references are garbage collected using the generational collector rather than
    via reference counting. This is important here, because the generational collector runs
    periodically, meaning that it is hard to predict when the stack will be overtaken by a
    garbage collection process - but it will almost always be when allocating memory of some
    kind.

    When this object is garbage-collected, a log will be emitted.
    """
    def __init__(self, _other: "CyclicReference" = None):
        self.other = _other or CyclicReference(_other=self)

    def __del__(self):
        loguru.logger.info("Goodbye!")


class Sink:
    def write(self, message):
        # This simulates assigning some memory as part of log handling. Eventually this will
        # trigger the generational garbage collector to take over here.
        # You could replace it with `json.dumps(message.record)` to get the same effect.
        _ = [dict() for _ in range(20)]

        # Actually write the message somewhere
        sys.stderr.write(message)

    def flush(self):
        sys.stderr.flush()


def main():
    # Set up logger with sink:
    loguru.logger.add(Sink(), level="INFO")

    # Keep logging until a cyclic reference is garbage collected, at which point deadlock
    # should be reached:
    for index in range(10000):
        CyclicReference()  # Here we create cyclic isolates, simulating code which uses them
        loguru.logger.info("Hello", index=index)


if __name__ == "__main__":
    main()

Expected behaviour

A series of Hello logs are emitted, interspersed with Goodbye! logs.

Actual behaviour

The program reaches a deadlock, requiring a SIGTERM to exit.

Explanation of the issue

The emit method of loguru's handler uses a lock to avoid contention for the log handler. It is for this reason, that emitting additional log messages in custom sinks is not supported and will result in deadlocks (see #390).

This seems to be a relatively sensible limitation, however the nature of the generational garbage collector in CPython can cause this to happen inadvertently and unpredictably in code which has cyclic isolates and performs some level of processing during log handling.

Real world example

The reproducible example above seems contrived, and one might suggest that the answer is simply to avoid reference cycles, so I will give an actual example of this that I've seen. The Python Elasticsearch client contains reference cycles as a means for sharing connections between interfaces.

As a transport it uses aiohttp, which implements a __del__ garbage collection finalizer. This in turn invokes the asyncio exception handler, which defaults to emitting a log statement. Since this is a stdlib logging call, it doesn't cause a problem on its own, but it is likely that users of loguru will set up some kind of intercept handler to forward third party stdlib logs to loguru.

Additional notes

It took me some time to figure out a reproducible example for this, as I was only familiar with the reference counting part of the garbage collector. I found the this article useful to understand the generational collector, which is what led me to isolate the circumstances to reproduce.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementImprovement to an already existing feature

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions