Fix cookies with duplicate names being lost when updating cookie jar #11106

Merged
merged 35 commits into from
Jun 1, 2025

Conversation

bdraco
Member

@bdraco bdraco commented May 31, 2025

Summary

This PR fixes an issue where cookies with duplicate names but different domains or paths were being lost when updating the cookie jar. The root cause was that SimpleCookie uses only the cookie name as its key, causing later cookies with the same name to overwrite earlier ones.
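The overwrite described above can be reproduced with the standard library alone. The following is a minimal sketch, not code from this PR; the header values and the `example.com` domain are made up for illustration.

```python
from http.cookies import SimpleCookie

# Two hypothetical Set-Cookie headers: same cookie name, different paths.
headers = [
    "session=abc; Domain=example.com; Path=/",
    "session=xyz; Domain=example.com; Path=/admin",
]

jar = SimpleCookie()
for header in headers:
    # SimpleCookie is keyed by cookie name only, so the second
    # "session" replaces the first instead of sitting next to it.
    jar.load(header)

print(len(jar))                # number of distinct names kept
print(jar["session"].value)    # value from the last header loaded
print(jar["session"]["path"])  # path from the last header loaded
```

After both loads only one `session` entry remains, carrying the value and path of the last header.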

Changes

  • Modified ClientResponse to store raw Set-Cookie headers
  • Added update_cookies_from_headers method to AbstractCookieJar that processes each Set-Cookie header individually to preserve duplicate cookie names

Related Issues

Technical Details

The fix works by processing each Set-Cookie header separately and collecting all cookies before updating the jar, rather than relying on SimpleCookie's dictionary-like behavior which loses duplicates.
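As a rough illustration of that approach, the sketch below parses each header into its own temporary SimpleCookie and collects (name, morsel) tuples. The function name `collect_cookies` and the sample headers are hypothetical; the real method in this PR is update_cookies_from_headers on the cookie jar, which also logs parse errors rather than silently skipping them.

```python
from http.cookies import CookieError, Morsel, SimpleCookie
from typing import List, Sequence, Tuple


def collect_cookies(headers: Sequence[str]) -> List[Tuple[str, "Morsel[str]"]]:
    """Parse each Set-Cookie header on its own and keep every cookie."""
    collected: List[Tuple[str, "Morsel[str]"]] = []
    for header in headers:
        tmp = SimpleCookie()  # fresh container per header, so nothing is overwritten
        try:
            tmp.load(header)
        except CookieError:
            continue  # skip headers SimpleCookie refuses to parse
        collected.extend(tmp.items())
    return collected


# Hypothetical headers with the same name but different paths.
pairs = collect_cookies(
    [
        "session=abc; Domain=example.com; Path=/",
        "session=xyz; Domain=example.com; Path=/admin",
    ]
)
for name, morsel in pairs:
    print(name, morsel.value, morsel["path"])
```

Because the result is a list of tuples rather than a name-keyed mapping, both `session` cookies survive and can be handed to the jar together.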

@psf-chronographer psf-chronographer bot added the bot:chronographer:provided (There is a change note present in this PR) label May 31, 2025

codecov bot commented May 31, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.83%. Comparing base (3a2a9b2) to head (7bc4b59).
Report is 1 commits behind head on master.

✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff            @@
##           master   #11106    +/-   ##
========================================
  Coverage   98.82%   98.83%            
========================================
  Files         129      129            
  Lines       41505    41726   +221     
  Branches     2234     2241     +7     
========================================
+ Hits        41019    41239   +220     
  Misses        337      337            
- Partials      149      150     +1     
Flag Coverage Δ
CI-GHA 98.71% <100.00%> (+<0.01%) ⬆️
OS-Linux 98.43% <100.00%> (+<0.01%) ⬆️
OS-Windows 96.71% <100.00%> (+0.01%) ⬆️
OS-macOS 97.62% <100.00%> (+<0.01%) ⬆️
Py-3.10.11 97.40% <100.00%> (+0.01%) ⬆️
Py-3.10.17 97.91% <100.00%> (+<0.01%) ⬆️
Py-3.11.12 98.07% <100.00%> (+<0.01%) ⬆️
Py-3.11.9 97.57% <100.00%> (+0.01%) ⬆️
Py-3.12.10 98.47% <100.00%> (+<0.01%) ⬆️
Py-3.13.3 98.45% <100.00%> (+<0.01%) ⬆️
Py-3.9.13 97.29% <100.00%> (+0.01%) ⬆️
Py-3.9.22 97.78% <100.00%> (+<0.01%) ⬆️
Py-pypy7.3.16 84.12% <100.00%> (-9.28%) ⬇️
VM-macos 97.62% <100.00%> (+<0.01%) ⬆️
VM-ubuntu 98.43% <100.00%> (+<0.01%) ⬆️
VM-windows 96.71% <100.00%> (+0.01%) ⬆️


@bdraco bdraco changed the title from "Fix Missing cookies" to "Fix cookies with duplicate names being lost when updating cookie jar" May 31, 2025
@bdraco bdraco added the backport-3.13 (Trigger automatic backporting to the 3.13 release branch by Patchback robot) label May 31, 2025
@bdraco bdraco marked this pull request as ready for review May 31, 2025 21:31
@bdraco bdraco requested review from webknjaz and asvetlov as code owners May 31, 2025 21:31
@bdraco
Member Author

bdraco commented Jun 1, 2025

This will be a bit slower, but correct.

We could go faster by doing something like... but I'm too afraid to do it because there might be subtle differences between it and SimpleCookie and we need bug for bug compatibility.

We could also reduce this to use http.cookiejar.parse_ns_headers, which would be a good solution here because it parses these headers correctly and would help solve #2502.

It would be nice to make the parser more flexible, because then we could solve #2683 as well.

diff --git a/aiohttp/abc.py b/aiohttp/abc.py
index 0b584eba4..b468caa8d 100644
--- a/aiohttp/abc.py
+++ b/aiohttp/abc.py
@@ -1,4 +1,5 @@
 import logging
+import re
 import socket
 from abc import ABC, abstractmethod
 from collections.abc import Sized
@@ -169,6 +170,75 @@ else:
 
 ClearCookiePredicate = Callable[["Morsel[str]"], bool]
 
+# Cookie parsing optimization constants
+_COOKIE_NAME_RE = re.compile(r"^[!#$%&\'*+\-.0-9A-Z^_`a-z|~]+$")
+_COOKIE_SPLIT_RE = re.compile(r";\s*")
+_KNOWN_ATTRS = frozenset(
+    ["path", "domain", "max-age", "expires", "secure", "httponly", "samesite"]
+)
+
+
+def _parse_cookie_header(header: str) -> Optional[Tuple[str, Morsel[str]]]:
+    """Fast cookie header parser that creates a Morsel directly."""
+    if not header:
+        return None
+
+    # Split by semicolon to get parts
+    parts = _COOKIE_SPLIT_RE.split(header)
+    if not parts:
+        return None
+
+    # First part must be name=value
+    first_part = parts[0]
+    eq_idx = first_part.find("=")
+    if eq_idx <= 0:  # No = or empty name
+        return None
+
+    name = first_part[:eq_idx].strip()
+    value = first_part[eq_idx + 1 :].strip()
+
+    # Validate cookie name - raise CookieError for invalid names
+    if not name:
+        return None
+    if not _COOKIE_NAME_RE.match(name):
+        raise CookieError(f"Illegal cookie name {name!r}")
+
+    # Remove quotes from value if present
+    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
+        value = value[1:-1]
+
+    # Create Morsel
+    morsel = Morsel()
+    morsel.set(name, value, value)
+
+    # Parse attributes
+    for part in parts[1:]:
+        part = part.strip()
+        if not part:
+            continue
+
+        eq_idx = part.find("=")
+        if eq_idx > 0:
+            # Attribute with value
+            attr_name = part[:eq_idx].strip().lower()
+
+            # Only process known attributes
+            if attr_name in _KNOWN_ATTRS:
+                attr_value = part[eq_idx + 1 :].strip()
+                morsel[attr_name] = attr_value
+        else:
+            # Boolean attribute
+            attr_name = part.lower()
+            if attr_name == "secure":
+                morsel["secure"] = True
+            elif attr_name == "httponly":
+                morsel["httponly"] = True
+            elif attr_name not in _KNOWN_ATTRS:
+                # Unknown boolean attribute - reject for SimpleCookie compatibility
+                return None
+
+    return name, morsel
+
 
 class AbstractCookieJar(Sized, IterableBase):
     """Abstract Cookie Jar."""
@@ -196,18 +266,16 @@ class AbstractCookieJar(Sized, IterableBase):
         """
         Update cookies from raw Set-Cookie headers.
 
-        Default implementation parses each header separately to preserve
-        cookies with same name but different domain/path.
+        Optimized implementation that parses cookies 40-57% faster than
+        SimpleCookie by using a direct parser that creates Morsel objects.
         """
-        # Default implementation for backward compatibility
         cookies_to_update: List[Tuple[str, Morsel[str]]] = []
+
         for cookie_header in headers:
-            tmp_cookie = SimpleCookie()
             try:
-                tmp_cookie.load(cookie_header)
-                # Collect all cookies as tuples (name, morsel)
-                for name, morsel in tmp_cookie.items():
-                    cookies_to_update.append((name, morsel))
+                result = _parse_cookie_header(cookie_header)
+                if result:
+                    cookies_to_update.append(result)
             except CookieError as exc:
                 client_logger.warning("Can not load response cookies: %s", exc)
 

@bdraco
Member Author

bdraco commented Jun 1, 2025

If we decide we want to do something about #2683 and #2502, we can use the above and make this faster as well. Until we want to tackle that, I think it's best to leave this one as SimpleCookie, even if it's a bit slower.
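For readers unfamiliar with the http.cookiejar.parse_ns_headers helper mentioned earlier in this thread, here is a small sketch of what it returns. It is an undocumented standard-library function, so treat its output shape as an observation about CPython rather than a stable contract; the sample headers are hypothetical.

```python
from http.cookiejar import parse_ns_headers

# Hypothetical Set-Cookie header values with a duplicated cookie name.
headers = [
    "session=abc; Domain=example.com; Path=/",
    "session=xyz; Domain=example.com; Path=/admin",
]

# One list of (key, value) pairs per header; the first pair is the
# cookie's own name and value, the remaining pairs are its attributes.
parsed = parse_ns_headers(headers)

for pairs in parsed:
    name, value = pairs[0]
    attrs = dict(pairs[1:])
    print(name, value, attrs.get("domain"), attrs.get("path"))
```

Each header yields its own entry, so duplicate names are kept apart without any name-keyed container in between.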

@bdraco bdraco enabled auto-merge (squash) June 1, 2025 14:20
@bdraco bdraco merged commit cfb9931 into master Jun 1, 2025
40 checks passed
@bdraco bdraco deleted the missing_cookies branch June 1, 2025 14:20
Contributor

patchback bot commented Jun 1, 2025

Backport to 3.12: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply cfb9931 on top of patchback/backports/3.12/cfb99316e1fd89943b1f6c3793d286629b7acc3d/pr-11106

Backporting merged PR #11106 into master

  1. Ensure you have a local repo clone of your fork. Unless you cloned it
    from the upstream, this would be your origin remote.
  2. Make sure you have an upstream repo added as a remote too. In these
    instructions you'll refer to it by the name upstream. If you don't
    have it, here's how you can add it:
    $ git remote add upstream https://github.com/aio-libs/aiohttp.git
  3. Ensure you have the latest copy of upstream and prepare a branch
    that will hold the backported code:
    $ git fetch upstream
    $ git checkout -b patchback/backports/3.12/cfb99316e1fd89943b1f6c3793d286629b7acc3d/pr-11106 upstream/3.12
  4. Now, cherry-pick PR Fix cookies with duplicate names being lost when updating cookie jar #11106 contents into that branch:
    $ git cherry-pick -x cfb99316e1fd89943b1f6c3793d286629b7acc3d
    If it'll yell at you with something like fatal: Commit cfb99316e1fd89943b1f6c3793d286629b7acc3d is a merge but no -m option was given., add -m 1 as follows instead:
    $ git cherry-pick -m1 -x cfb99316e1fd89943b1f6c3793d286629b7acc3d
  5. At this point, you'll probably encounter some merge conflicts. You must
    resolve them so as to preserve the patch from PR Fix cookies with duplicate names being lost when updating cookie jar #11106 as close to the
    original as possible.
  6. Push this branch to your fork on GitHub:
    $ git push origin patchback/backports/3.12/cfb99316e1fd89943b1f6c3793d286629b7acc3d/pr-11106
  7. Create a PR, ensure that the CI is green. If it's not — update it so that
    the tests and any other checks pass. This is it!
    Now relax and wait for the maintainers to process your pull request
    when they have some cycles to do reviews. Don't worry — they'll tell you if
    any improvements are necessary when the time comes!

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

Contributor

patchback bot commented Jun 1, 2025

Backport to 3.13: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply cfb9931 on top of patchback/backports/3.13/cfb99316e1fd89943b1f6c3793d286629b7acc3d/pr-11106

Backporting merged PR #11106 into master

  1. Ensure you have a local repo clone of your fork. Unless you cloned it
    from the upstream, this would be your origin remote.
  2. Make sure you have an upstream repo added as a remote too. In these
    instructions you'll refer to it by the name upstream. If you don't
    have it, here's how you can add it:
    $ git remote add upstream https://github.com/aio-libs/aiohttp.git
  3. Ensure you have the latest copy of upstream and prepare a branch
    that will hold the backported code:
    $ git fetch upstream
    $ git checkout -b patchback/backports/3.13/cfb99316e1fd89943b1f6c3793d286629b7acc3d/pr-11106 upstream/3.13
  4. Now, cherry-pick PR Fix cookies with duplicate names being lost when updating cookie jar #11106 contents into that branch:
    $ git cherry-pick -x cfb99316e1fd89943b1f6c3793d286629b7acc3d
    If it'll yell at you with something like fatal: Commit cfb99316e1fd89943b1f6c3793d286629b7acc3d is a merge but no -m option was given., add -m 1 as follows instead:
    $ git cherry-pick -m1 -x cfb99316e1fd89943b1f6c3793d286629b7acc3d
  5. At this point, you'll probably encounter some merge conflicts. You must
    resolve them so as to preserve the patch from PR Fix cookies with duplicate names being lost when updating cookie jar #11106 as close to the
    original as possible.
  6. Push this branch to your fork on GitHub:
    $ git push origin patchback/backports/3.13/cfb99316e1fd89943b1f6c3793d286629b7acc3d/pr-11106
  7. Create a PR, ensure that the CI is green. If it's not — update it so that
    the tests and any other checks pass. This is it!
    Now relax and wait for the maintainers to process your pull request
    when they have some cycles to do reviews. Don't worry — they'll tell you if
    any improvements are necessary when the time comes!

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

@bdraco
Member Author

bdraco commented Jun 1, 2025

Thanks for the review

bdraco added a commit that referenced this pull request Jun 1, 2025
@Dreamsorcerer
Member

We could go faster by doing something like... but I'm too afraid to do it because there might be subtle differences between it and SimpleCookie and we need bug for bug compatibility.

I think we did suggest somewhere that in v4 or beyond we could consider replacing http.cookies completely, as it does seem to have quite a few issues and is not properly designed for what we're trying to achieve here.

Labels
backport-3.12: Trigger automatic backporting to the 3.12 release branch by Patchback robot
backport-3.13: Trigger automatic backporting to the 3.13 release branch by Patchback robot
bot:chronographer:provided: There is a change note present in this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cookies not loaded in response.cookies but present in response.headers
[Client] Double Set-Cookie header management
2 participants