Skip to content

nogil list reverse breaks atomicity #129619

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
luisggpina opened this issue Feb 3, 2025 · 3 comments
Open

nogil list reverse breaks atomicity #129619

luisggpina opened this issue Feb 3, 2025 · 3 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-free-threading type-bug An unexpected behavior, bug, or error

Comments

@luisggpina
Copy link

luisggpina commented Feb 3, 2025

Bug report

Bug description:

Hi,

We're a research group focused on testing concurrent runtimes. Our work-in-progress prototype found a violation of atomicity on the current nogil build when using list reverse with other concurrent operations on the same list. The program below shows the wrong behavior for reverse and __contains__:

import threading
import sys
def t0(b1,s,res):
    b1.wait()
    s.reverse()

def t1(b1,s,res):
    b1.wait()
    res.append(s.__contains__(3))

def Test():
  l =  [1, 2, 3]
  threads=[]
  barrier = threading.Barrier(2)
  res = []
  threads.append(threading.Thread(target= t0, args=(barrier, l,res)))
  threads.append(threading.Thread(target= t1, args=(barrier, l,res)))
  for i in range(0, len(threads)):
      threads[i].start()
  for i in range(0, len(threads)):
      threads[i].join()
  if res[0] != True:
      print("found bug: " + str(res))

print("test begin...")
for i in range(0,50000):
  threads = []
  if i % 1000 == 0:
      print(i)
  for i in range(0,100):
      threads.append(threading.Thread(target= Test))
  for t in threads:
      t.start()
  for t in threads:
      t.join()
print("test Done")

Operation __contains__ should always see element 3 in the list, either the original or reversed. However, it can see a list without element 3 and return False, as shown in the sample output:

test begin...                                                                                                                                                                                                       
0                                                                                                         
found bug: [False]
found bug: [False]
1000

We observed the same behavior with operations count and index executing concurrently with reverse on the same list, I'll add a comment with those tests and sample outputs.

@flypoodles and @overlorde are part of the team, adding them so they get notified about further discussion.

We note that we observed these outputs in several x86_64 machines and one ARM machine.

CPython versions tested on:

3.14, CPython main branch

Operating systems tested on:

Linux

@luisggpina luisggpina added the type-bug An unexpected behavior, bug, or error label Feb 3, 2025
@luisggpina
Copy link
Author

Adding the tests mentioned in the main issue here, with sample outputs.

reverse and count:

import threading
import sys
def t0(b1,s,res):
    b1.wait()
    res.append(s.count(3))

def t1(b1,s,res):
    b1.wait()
    s.reverse()

def Test():
  l =  [1, 2, 3]
  threads=[]
  barrier = threading.Barrier(2)
  res = []
  threads.append(threading.Thread(target= t0, args=(barrier, l,res)))
  threads.append(threading.Thread(target= t1, args=(barrier, l,res)))
  for i in range(0, len(threads)):
      threads[i].start()
  for i in range(0, len(threads)):
      threads[i].join()
  if res[0] != 1:
      print("found bug: " + str(res[0]))

print("test begin...")
for i in range(0,50000):
  threads = []
  if i % 1000 == 0:
      print(i)
  for i in range(0,100):
      threads.append(threading.Thread(target= Test))
  for t in threads:
      t.start()
  for t in threads:
      t.join()
print("test Done")

Sample output:

test begin...
0
1000
found bug: 0

reverse and index:

import threading
import sys
def t0(b1,s,res):
    b1.wait()
    s.reverse()

def t1(b1,s,res):
    b1.wait()
    try:
        s.index(3)
    except Exception as e:
        res.append(e)

def Test():
  l =  [1, 2, 3]
  threads=[]
  barrier = threading.Barrier(2)
  e = []
  threads.append(threading.Thread(target= t0, args=(barrier, l,e)))
  threads.append(threading.Thread(target= t1, args=(barrier, l,e)))
  for i in range(0, len(threads)):
      threads[i].start()
  for i in range(0, len(threads)):
      threads[i].join()
  if len(e) != 0:
      print("found bug: " + str(e[0]))

print("test begin...")
for i in range(0,50000):
  threads = []
  if i % 1000 == 0:
      print(i)
  for i in range(0,100):
      threads.append(threading.Thread(target= Test))
  for t in threads:
      t.start()
  for t in threads:
      t.join()
print("test Done")

Sample output:

test begin...                                                                                             
0                                                                                                         
found bug: list.index(x): x not in list                                                                   
found bug: list.index(x): x not in list                                                                   
1000                                                

@picnixz picnixz added interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-free-threading labels Feb 3, 2025
@sobolevn sobolevn self-assigned this Feb 4, 2025
@sobolevn
Copy link
Member

sobolevn commented Feb 4, 2025

I tried all examples several times, but I can't reproduce any of the errors on my macos with m2 with main version of --disable-git builds :(

@colesbury
Copy link
Contributor

I don't think you need to attempt to reproduce the issue. It's clear to me from the implementation that this can happen: list. __contains__ and list.__getitem__ do not lock, so they can overlap with operations like reverse. reverse operates in-place, so you can see partial results.

I don't think we should change the implementation, but we will want to document the behavior.

Making reverse behave atomically would require either locking in all read operations or changing reverse to perform the operation out of place before swapping the ob_item pointers atomically. I don't think either option is worthwhile.

In general, we should not rush to make multi-element operations atomic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-free-threading type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

4 participants