Fix DNS resolver object churn for multiple sessions #10897

Merged
merged 22 commits into from
May 20, 2025

bdraco
Member

@bdraco bdraco commented May 20, 2025

Problem

As reported in #10847, aiohttp currently creates a new DNSResolver object for each AsyncResolver instance, leading to excessive resolver object churn when using multiple sessions. This causes unnecessary resource usage and potential performance degradation, as the c-ares library documentation indicates that a single resolver can handle unlimited queries.

Solution

This PR introduces a shared resolver management system:

  1. Created a new _DNSResolverManager singleton class that maintains a single shared aiodns.DNSResolver instance
  2. Modified AsyncResolver to use the shared resolver for default arguments while still supporting custom resolver configurations
  3. Implemented proper client tracking and cleanup through the get_resolver and release_resolver methods
  4. Added comprehensive tests to ensure correct behavior

The key design point is requiring clients to be explicitly passed to get_resolver, ensuring that the manager can track all users of the shared resolver and properly clean up when they're done.
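The tracking described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code merged in this PR: FakeDNSResolver is a hypothetical stand-in for aiodns.DNSResolver so the example runs without aiodns, and SharedResolverManager and SketchAsyncResolver are illustrative names for the manager and its client.

```python
import weakref
from typing import Any, Optional


class FakeDNSResolver:
    """Hypothetical stand-in for aiodns.DNSResolver (assumption, not the real class)."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs
        self.cancelled = False

    def cancel(self) -> None:
        self.cancelled = True


class SharedResolverManager:
    """Illustrative singleton that hands out one shared resolver."""

    _instance: Optional["SharedResolverManager"] = None

    def __new__(cls) -> "SharedResolverManager":
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._resolver = None
            cls._instance._clients = weakref.WeakSet()
        return cls._instance

    def get_resolver(self, client: object) -> FakeDNSResolver:
        # The client must be passed in so the manager knows who is using the resolver.
        if self._resolver is None:
            self._resolver = FakeDNSResolver()
        self._clients.add(client)
        return self._resolver

    def release_resolver(self, client: object) -> None:
        self._clients.discard(client)
        # Cancel and drop the shared resolver once the last client is gone.
        if not self._clients and self._resolver is not None:
            self._resolver.cancel()
            self._resolver = None


class SketchAsyncResolver:
    """Hypothetical client: shares the resolver unless custom kwargs are given."""

    def __init__(self, **kwargs: Any) -> None:
        self._closed = False
        self._shared = not kwargs
        if self._shared:
            self.resolver = SharedResolverManager().get_resolver(self)
        else:
            # Custom configuration gets a dedicated resolver.
            self.resolver = FakeDNSResolver(**kwargs)

    def close(self) -> None:
        if self._closed:
            return
        self._closed = True
        if self._shared:
            SharedResolverManager().release_resolver(self)
        else:
            self.resolver.cancel()
```

In this sketch, two SketchAsyncResolver() instances created with default arguments receive the same resolver object, and that resolver is cancelled only after both have called close(). An instance created with keyword arguments always owns its own resolver.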

Implementation Details

  • _DNSResolverManager is a singleton that ensures only one shared resolver exists
  • Custom resolver configurations still create dedicated resolver instances
  • Client tracking uses weakref.WeakSet to avoid reference cycles
  • The shared resolver is automatically canceled and cleaned up when the last client is released
  • All resolvers are properly cleaned up after tests
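To illustrate why weakref.WeakSet suits client tracking, here is a small standalone sketch. Client is a hypothetical placeholder class, not an aiohttp type. The point is that an entry disappears once nothing else references the client, so the tracking set itself never keeps a client alive.

```python
import gc
import weakref


class Client:
    """Hypothetical resolver client; any weak-referenceable object works."""


clients = weakref.WeakSet()

first = Client()
second = Client()
clients.add(first)
clients.add(second)
count_with_both = len(clients)

# Dropping the last strong reference removes the entry without an explicit release.
del first
gc.collect()  # CPython frees the object at once; collect() covers other interpreters
count_after_drop = len(clients)
```

Here count_with_both reflects both clients, while count_after_drop reflects only the one that is still referenced.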

Benefits

  • Drastically reduces the number of DNSResolver objects created for multiple sessions
  • Decreases memory and resource usage
  • Improves performance by avoiding unnecessary resolver creation and destruction
  • Properly leverages c-ares's ability to handle unlimited queries with a single resolver

Related issue number

fixes #10847

@bdraco bdraco requested a review from asvetlov as a code owner May 20, 2025 15:38
@bdraco bdraco added this to the 3.12 milestone May 20, 2025
@bdraco bdraco added the backport-3.12 Trigger automatic backporting to the 3.12 release branch by Patchback robot label May 20, 2025
@bdraco bdraco requested a review from webknjaz as a code owner May 20, 2025 15:41
@psf-chronographer psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label May 20, 2025
codecov bot commented May 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.76%. Comparing base (802152a) to head (dfbcb9f).
Report is 3 commits behind head on master.

✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff            @@
##           master   #10897    +/-   ##
========================================
  Coverage   98.75%   98.76%            
========================================
  Files         129      129            
  Lines       38931    39085   +154     
  Branches     2164     2169     +5     
========================================
+ Hits        38447    38601   +154     
  Misses        336      336            
  Partials      148      148            
Flag Coverage Δ
CI-GHA 98.64% <100.00%> (+<0.01%) ⬆️
OS-Linux 98.35% <100.00%> (+<0.01%) ⬆️
OS-Windows 96.52% <100.00%> (+<0.01%) ⬆️
OS-macOS 97.51% <100.00%> (+<0.01%) ⬆️
Py-3.10.11 97.40% <100.00%> (+0.01%) ⬆️
Py-3.10.17 97.93% <100.00%> (+<0.01%) ⬆️
Py-3.11.12 98.00% <100.00%> (+<0.01%) ⬆️
Py-3.11.9 97.48% <100.00%> (+0.01%) ⬆️
Py-3.12.10 98.42% <100.00%> (+<0.01%) ⬆️
Py-3.13.3 98.41% <100.00%> (-0.01%) ⬇️
Py-3.9.13 97.27% <100.00%> (+0.01%) ⬆️
Py-3.9.22 97.80% <100.00%> (+0.01%) ⬆️
Py-pypy7.3.16 88.86% <100.00%> (-4.63%) ⬇️
VM-macos 97.51% <100.00%> (+<0.01%) ⬆️
VM-ubuntu 98.35% <100.00%> (+<0.01%) ⬆️
VM-windows 96.52% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.

codspeed-hq bot commented May 20, 2025

CodSpeed Performance Report

Merging #10897 will improve performance by 8.06%

Comparing single_async_resolver (dfbcb9f) with master (a4be2cb)

Summary

⚡ 1 improvement
✅ 59 untouched benchmarks

Benchmarks breakdown

Benchmark                                                BASE     HEAD     Change
test_one_hundred_simple_get_requests_no_session[pyloop]  84.6 ms  78.3 ms  +8.06%

@bdraco
Member Author

bdraco commented May 20, 2025

oh right, we support multiple event loops still so we need to account for that
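
One way to account for multiple event loops is to keep one shared resolver per loop rather than one globally. The sketch below is an assumption about how that could look, not necessarily what was merged: PerLoopResolverManager and FakeDNSResolver are hypothetical names, and the loop is used as a weak key so that a discarded loop does not keep its entry alive.

```python
import asyncio
import weakref


class FakeDNSResolver:
    """Hypothetical stand-in for aiodns.DNSResolver (holds no reference to the loop)."""

    def __init__(self) -> None:
        self.cancelled = False

    def cancel(self) -> None:
        self.cancelled = True


class PerLoopResolverManager:
    """Illustrative manager keeping one shared resolver per event loop."""

    def __init__(self) -> None:
        # Maps loop -> (resolver, weak set of clients using it).
        self._per_loop = weakref.WeakKeyDictionary()

    def get_resolver(
        self, client: object, loop: asyncio.AbstractEventLoop
    ) -> FakeDNSResolver:
        entry = self._per_loop.get(loop)
        if entry is None:
            entry = (FakeDNSResolver(), weakref.WeakSet())
            self._per_loop[loop] = entry
        resolver, clients = entry
        clients.add(client)
        return resolver

    def release_resolver(
        self, client: object, loop: asyncio.AbstractEventLoop
    ) -> None:
        entry = self._per_loop.get(loop)
        if entry is None:
            return
        resolver, clients = entry
        clients.discard(client)
        # Last client for this loop is gone: cancel and forget its resolver.
        if not clients:
            resolver.cancel()
            del self._per_loop[loop]
```

With this shape, clients on the same loop share a resolver, while clients on different loops never do, and releasing the last client on one loop leaves the other loop's resolver untouched.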

@bdraco
Member Author

bdraco commented May 20, 2025

Something else is actually leaking them outside of this

@bdraco
Member Author

bdraco commented May 20, 2025

I need to move the fixture to the top level and make it autouse to find it.

I'm going to put it at the top level, and then revert it after I fix all the test leaks

@bdraco
Member Author

bdraco commented May 20, 2025

We have far too many session leaks to fix them in this PR. I'll have to count before and after instead

@bdraco bdraco merged commit 4624fed into master May 20, 2025
40 checks passed
@bdraco bdraco deleted the single_async_resolver branch May 20, 2025 20:12
Contributor

patchback bot commented May 20, 2025

Backport to 3.12: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply 4624fed on top of patchback/backports/3.12/4624fed82608d9b659c4284814a570e0b332b8db/pr-10897

Backporting merged PR #10897 into master

  1. Ensure you have a local repo clone of your fork. Unless you cloned it
    from the upstream, this would be your origin remote.
  2. Make sure you have an upstream repo added as a remote too. In these
    instructions you'll refer to it by the name upstream. If you don't
    have it, here's how you can add it:
    $ git remote add upstream https://github.com/aio-libs/aiohttp.git
  3. Ensure you have the latest copy of upstream and prepare a branch
    that will hold the backported code:
    $ git fetch upstream
    $ git checkout -b patchback/backports/3.12/4624fed82608d9b659c4284814a570e0b332b8db/pr-10897 upstream/3.12
  4. Now, cherry-pick the contents of PR #10897 into that branch:
    $ git cherry-pick -x 4624fed82608d9b659c4284814a570e0b332b8db
    If it fails with something like "fatal: Commit 4624fed82608d9b659c4284814a570e0b332b8db is a merge but no -m option was given.", add -m 1 as follows instead:
    $ git cherry-pick -m1 -x 4624fed82608d9b659c4284814a570e0b332b8db
  5. At this point, you'll probably encounter some merge conflicts. You must
    resolve them so that the patch from PR #10897 stays as close to the
    original as possible.
  6. Push this branch to your fork on GitHub:
    $ git push origin patchback/backports/3.12/4624fed82608d9b659c4284814a570e0b332b8db/pr-10897
  7. Create a PR, ensure that the CI is green. If it's not — update it so that
    the tests and any other checks pass. This is it!
    Now relax and wait for the maintainers to process your pull request
    when they have some cycles to do reviews. Don't worry — they'll tell you if
    any improvements are necessary when the time comes!

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

bdraco added a commit that referenced this pull request May 20, 2025
bdraco added a commit that referenced this pull request May 20, 2025
Labels
backport-3.12 Trigger automatic backporting to the 3.12 release branch by Patchback robot bot:chronographer:provided There is a change note present in this PR
Development

Successfully merging this pull request may close these issues.

DNSResolver objects churn on transient requests without a session
2 participants