This issue was moved to a discussion.

Inconsistent behavior with zero-width matches on empty strings #1163

Closed
rootCircle opened this issue Feb 13, 2024 · 0 comments

rootCircle commented Feb 13, 2024

What version of regex are you using?

v1.10.3

Describe the bug at a high level.

replace_all in the regex crate handles an empty match that immediately follows a non-empty match differently from Python's standard library regex engine. (The Rust regex crate doesn't treat the empty string between a matched character and the following non-matching character as a valid match.)

What are the steps to reproduce the behavior?

  1. Create a Regex object with the pattern r"x*" (matches zero or more "x"s).
  2. Apply replace_all to the string "abxd" with a hyphen as the replacement string.
  3. Observed output (Rust): "-a-b-d-"
  4. Expected output (Python): "-a-b--d-"

Rust Code

use regex::Regex;

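fn main() {
    // `x*` can match the empty string, so it also matches where there is no "x".
    let re = Regex::new(r"x*").unwrap();
    let hay = "abxd";

    // Prints "-a-b-d-" with regex v1.10.3.
    println!("{:?}", re.replace_all(hay, "-"));
}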

Equivalent Python Code:

import re

regex = r"x*"
test_str = "abxd"
subst = "-"

# count and flags are passed by keyword; count=0 replaces every match.
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)
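To make the difference easier to see, here is a minimal sketch that lists the span of every match Python reports for the same pattern and input. The helper name match_spans is hypothetical, and the output shown assumes Python 3.7 or later, where an empty match adjacent to a previous non-empty match is reported.

```python
import re


def match_spans(pattern, text):
    """Return the (start, end) span of every match that re.finditer reports."""
    return [m.span() for m in re.finditer(pattern, text)]


# On Python 3.7+ this prints [(0, 0), (1, 1), (2, 3), (3, 3), (4, 4)].
print(match_spans(r"x*", "abxd"))
```

The span (3, 3) is the empty match between "x" and "d". It is the one that produces the extra hyphen in "-a-b--d-".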

What is the actual behavior?

In Rust, replace_all replaces the empty matches before "a" and before "b", the "x" itself, and the empty match at the end of the string, but not the empty match between "x" and "d".
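The observed Rust output is consistent with a rule that drops an empty match starting exactly where the previous match ended. The sketch below emulates that rule in Python by filtering the matches from re.finditer. It is an assumption based on the output in this report, not the regex crate's actual implementation, and the function name sub_skipping_adjacent_empty is hypothetical. The replacement is inserted literally, without group expansion.

```python
import re


def sub_skipping_adjacent_empty(pattern, repl, text):
    """Replace matches, dropping any empty match that starts where the previous match ended."""
    out = []
    pos = 0          # end of the input copied to the output so far
    last_end = None  # end offset of the previous accepted match
    for m in re.finditer(pattern, text):
        start, end = m.span()
        if start == end and start == last_end:
            continue  # empty match adjacent to the previous match: skip it
        out.append(text[pos:start])
        out.append(repl)
        pos = end
        last_end = end
    out.append(text[pos:])
    return "".join(out)


print(sub_skipping_adjacent_empty(r"x*", "-", "abxd"))  # -a-b-d-
print(re.sub(r"x*", "-", "abxd"))  # -a-b--d- on Python 3.7+
```

For this input, the only match the filter drops is the empty one at offset 3, right after "x", which accounts for the single missing hyphen.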

What is the expected behavior?

The empty match between "x" and "d" should also be replaced, resulting in "-a-b--d-".

By the way, I am not sure whether this is an intentional difference or a potential bug.

@rust-lang rust-lang locked and limited conversation to collaborators Feb 13, 2024
@BurntSushi BurntSushi converted this issue into discussion #1164 Feb 13, 2024
