incompatible change in number literals since v5.32 #22040

Open
happy-barney opened this issue Feb 27, 2024 · 12 comments
Labels
Closable? We might be able to close this ticket, but we need to check with the reporter

Comments

@happy-barney

Before v5.32, Perl accepted hex and binary zeroes written as 0x_ and 0b_.

v5.32 no longer does.

Since the change is not listed as an incompatible change, it is a bug.

@sisyphus
Contributor

From perl-5.32.0 perldelta documentation:

    *   Perl no longer treats strings starting with "0x" or "0b" as hex or
        binary numbers respectively when converting a string to a number.
        This reverts a change in behaviour inadvertently introduced in perl
        5.30.0 intended to improve precision when converting a string to a
        floating point number. [perl #134230
        <https://rt.perl.org/Ticket/Display.html?id=134230>]

That is, the "bug" was deemed to be in perl-5.30.0.
In perl-5.32.0 (and later), the behaviour is as it was for perl-5.28.0 and earlier.

I've always felt that decision to be an opportunity missed, but it would have required changes to looks_like_number() if we had stayed with it.
Besides, backwards-compatibility is sacrosanct.

@happy-barney
Author

@sisyphus

  • that works even in v5.10
  • I'm not talking about strings but about number literals

Backward compatibility is a gem that Perl could use to its advantage. I'm not talking about CPAN libraries; I'm talking about the tons of legacy code that no one will pay to modify without a business reason: it works, so don't touch it.

If Perl introduces small changes like this one (or larger ones, like the upcoming implicit builtins), it is often cheaper to rewrite things from scratch in a different language than to fix the old code.

@sisyphus
Contributor

sisyphus commented Feb 27, 2024

@happy-barney, a code example that demonstrates the issue might help me understand.

@happy-barney
Author

@sisyphus

perl -E 'say 0x_';
perl -E 'say 0b_';
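
To compare perl versions without the script itself failing to compile, the same literals can be compiled at runtime with a string eval. This is only a sketch: `try_literal` is a hypothetical helper name, and the only behaviour assumed is what is reported in this thread (a value of 0 before v5.32, a "No digits found" compile error from v5.32 on).

```perl
use strict;
use warnings;

# Hypothetical helper: compile and evaluate one numeric literal at runtime,
# so that a compile-time error ends up in $@ instead of aborting the script.
sub try_literal {
    my ($src) = @_;
    no warnings;              # silence "Misplaced _ in number" for this probe
    my $value = eval $src;    # string eval: the literal is tokenised now
    my $error = $@;
    return { value => $value, error => $error };
}

for my $src ('0x_', '0b_', '0x_1', '0b_1') {
    my $r = try_literal($src);
    if ($r->{error} ne '') {
        # Keep only the first line of the compile error for display.
        (my $first = $r->{error}) =~ s/\n.*//s;
        print "$src: compile error: $first\n";
    }
    else {
        print "$src: value $r->{value}\n";
    }
}
```

Running this under each installed perl shows which of the four forms that version still accepts.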

@sisyphus
Contributor

I've done a diff -wu ... on toke.c between 5.30.0 and 5.32.0. (IIUC, that's when the change occurred.)
This part of that diff looks likely to be relevant to the issue:

@@ -11329,6 +11766,21 @@
                 }
             }
 
+            if (shift != 3 && !has_digs) {
+                /* 0x or 0b with no digits, treat it as an error.
+                   Originally this backed up the parse before the b or
+                   x, but that has the potential for silent changes in
+                   behaviour, like for: "0x.3" and "0x+$foo".
+                */
+                const char *d = s;
+                char *oldbp = PL_bufptr;
+                if (*d) ++d; /* so the user sees the bad non-digit */
+                PL_bufptr = (char *)d; /* so yyerror reports the context */
+                yyerror(Perl_form(aTHX_ "No digits found for %s literal",
+                                  shift == 4 ? "hexadecimal" : "binary"));
+                PL_bufptr = oldbp;
+            }
+
 	    if (overflowed) {
 		if (n > 4294967295.0)
 		    Perl_ck_warner(aTHX_ packWARN(WARN_PORTABLE), 
@@ -11367,8 +11819,21 @@

@tonycoz, do we want to revert to the previous behaviour, or do we document the change?

@happy-barney
Author

If the change is documented, it would be nice to guard it with something like (pseudocode):

if (effective_use_version > v5.30) {
...
}

@tonycoz
Contributor

tonycoz commented Feb 27, 2024

I don't think we'd revert to exactly the previous behaviour, which allowed 0x and 0b and was reported as a bug in #17010.

I could see it accepting 0x_ and 0b_, though both of these have warned since 5.8-ish:

$ ~/perl/5.005_04/bin/perl -wle 'print 0x_'
0
$ ~/perl/5.6.2/bin/perl -wle 'print 0x_'
0
$ ~/perl/5.8.8-nothread/bin/perl -wle 'print 0b_'
Misplaced _ in number at -e line 1.
Misplaced _ in number at -e line 1.
0
$ ~/perl/5.8.8-nothread/bin/perl -wle 'print 0x_'
Misplaced _ in number at -e line 1.
Misplaced _ in number at -e line 1.
0

The change here was reported as a new diagnostic in perl5320delta:

=item *

C<L<No digits found for %s literal|perldiag/"No digits found for %s literal">>

(F) No hexadecimal digits were found following C<0x> or no binary digits were
found following C<0b>.

@khwilliamson khwilliamson added the Closable? We might be able to close this ticket, but we need to check with the reporter label Apr 24, 2025
@khwilliamson
Contributor

Is this closable?

@jkeenan
Contributor

jkeenan commented May 1, 2025

Is this closable?

I think it is closable, but I'd like @tonycoz to approve its closing.

@tonycoz
Contributor

tonycoz commented May 4, 2025

It depends on whether we want to support 0x_, 0b_ and 0o_ as numeric literals.

I don't think so, but I've been wrong before.

@happy-barney
Author

@tonycoz IMHO these three should not be supported (or should be defined as 0 explicitly)

But values like:

0b_1111_000
0x_0000_1111_222233334444_5555_6666

are IMHO more readable than

0b1111_000
0x0000_1111_222233334444_5555_6666
  • especially when several are written one below another
  • reason: the underscore gives the eyes a visual boundary to fixate on. Even a brief extra fixation on tokens without meaning adds to eye fatigue

@tonycoz
Contributor

tonycoz commented May 5, 2025

But values like:

0b_1111_000
0x_0000_1111_222233334444_5555_6666

The behaviour of these hasn't changed anytime recently:

$ ~/perl/5.10.0-debug/bin/perl -Wle 'print 0b_1111_000'
Misplaced _ in number at -e line 1.
120

# ./perl is blead-ish
$ ./perl -Wle 'print 0b_1111_000'
Misplaced _ in number at -e line 1.
120

$ ~/perl/5.10.0-debug/bin/perl -Wle 'print 0x_0000_1111_222233334444_5555_6666'
Misplaced _ in number at -e line 1.
Integer overflow in hexadecimal number at -e line 1.
Hexadecimal number > 0xffffffff non-portable at -e line 1.
5.2819580972354e+27

$ ./perl -Wle 'print 0x_0000_1111_222233334444_5555_6666'
Misplaced _ in number at -e line 1.
Integer overflow in hexadecimal number at -e line 1.
Hexadecimal number > 0xffffffff non-portable at -e line 1.
5.2819580972354e+27

# try a smaller number
$ ~/perl/5.10.0-debug/bin/perl -Wle 'print 0x_0000_1111_2244_5555_6666'
Misplaced _ in number at -e line 1.
Hexadecimal number > 0xffffffff non-portable at -e line 1.
1229801850133636710

$ ./perl -Wle 'print 0x_0000_1111_2244_5555_6666'
Misplaced _ in number at -e line 1.
Hexadecimal number > 0xffffffff non-portable at -e line 1.
1229801850133636710
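
The binary sessions above can also be reproduced from a script by capturing compile-time warnings through `$SIG{__WARN__}`. This is a sketch, not anything from the perl core: `literal_warnings` is a hypothetical helper, and it relies only on the "Misplaced _ in number" warning shown in the transcripts.

```perl
use strict;
use warnings;

# Hypothetical helper: compile one literal at runtime and collect the
# warnings the tokeniser emits for it.
sub literal_warnings {
    my ($src) = @_;
    my @seen;
    local $SIG{__WARN__} = sub { push @seen, $_[0] };
    my $value = eval $src;    # inherits "use warnings" from this file
    die $@ if $@;             # a literal that fails to compile is fatal here
    return ($value, @seen);
}

for my $src ('0b_1111_000', '0b1111_000') {
    my ($value, @seen) = literal_warnings($src);
    printf "%-12s => %s (%d warning%s)\n",
        $src, $value, scalar @seen, @seen == 1 ? '' : 's';
    print "    $_" for @seen;
}
```

Both spellings evaluate to the same number. The difference is that the form with an underscore directly after the prefix triggers the warning, while an underscore between digits does not.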
