add support to ignore ansi sequences when formatting usage display. Fixes #879 #880

timmaffett · 2025-04-16T21:38:46Z

This PR adds logic to exclude hidden ANSI escape sequences when formatting the Usage display.
The ANSI escape sequences are excluded when calculating the lengths of help strings used in the Usage display by using a string extension lengthWithoutAnsi which removes any hidden ANSI escape sequences present before calculating the string length.

It includes tests of the lengthWithoutAnsi getter to ensure that it returns correct length values when no ANSI escape sequences are present as well as when a variety of ANSI escape sequences are present.

(The regex that i am using here is well exercised, as it is the same reg ex that I use within VS Code when parsing and formatting ANSI sequences within the debug console)

[X ] I’ve reviewed the contributor guide and applied the relevant portions to this PR.

munificent · 2025-04-24T17:20:36Z

Sorry, but I definitely don't have time right now to maintain args. :(

timmaffett · 2025-04-25T12:41:16Z

Understandable - I'm excited for whatever you are working on 😄

lrhn · 2025-04-29T15:16:15Z

pkgs/args/lib/src/utils.dart

+  int get lengthWithoutAnsi {
+    if (!_AnsiUtils.hasAnsi(this)) return length;
+    return _AnsiUtils.stripAnsi(this).length;
+  }


This is more expensive than it needs to be.
You know the length of the total string, and you have the matches for every Ansi escape sequence, so you could just subtract the lengths of that, without needing to actually create a new string.

int get lengthWithoutAnsi { var length = this.length; for (var escape in _AnsiUtils.ansiRE.allMatches(this)) { length -= escape.end - escape.start; } return length; }

(If we are a little lucky, the RegExp match won't eagerly allocate a match string if we don't ask for it.)

lrhn · 2025-04-29T15:18:05Z

pkgs/args/lib/src/utils.dart

+  static final String ansiCodePattern = [
+      '[\\u001B\\u009B][[\\]()#;?]*(?:(?:(?:[a-zA-Z\\d]*(?:;[-a-zA-Z\\d\\/#&.:=?%@~_]*)*)?\\u0007)',
+      '(?:(?:\\d{1,4}(?:;\\d{0,4})*)?[\\dA-PR-TZcf-ntqry=><~]))'
+    ].join('|');


Use raw strings for RegExp patterns. No need to use lists and join, just have adjacent strings:

static const String ansiCodePattern = r'[\x1b\x9b][[\]()#;?]*' r'(?:' // r'(?:' // - r'(?:[a-zA-Z\d]*' //-- r'(?:;[\-a-zA-Z\d/#&.:=?%@~_]*)*' r')?\x07' // - r'|' r'(?:' // -- r'(?:\d{1,4}(?:;\d{0,4})*[\dA-PR-TZcf-ntqry=><~]' r')' //- r')'; //

lrhn · 2025-04-29T15:32:56Z

pkgs/args/lib/src/utils.dart

+
+/// A utility class for finding and stripping ANSI codes from strings.
+class _AnsiUtils {
+  static final String ansiCodePattern = [


It would be very nice to have a textual description of what the pattern is supposed to match.
It's hard enough to debug RegExps even if you have a specification of what they're intended to do.
Without that, it's impossible to tell whether it's right or wrong.

Is this a correct description:

One of U+001b or U+009b

Followed by any number of the characters: [, ], (, ), #, ;, ?

Followed by either:

zero or more letters or digits

followed by any number of repeats of:

;

followed by any number of digits, letters, or characters from -/#^.:=?%@~_

or:

1-4 digits

followed by any number of repeats of:

;

followed by 0-4 digits

followed by on digit, letter other than QUVWXY or abdeopsuvwxz, or one of =><~.

I don't know what the official specification of this is.
I'd be tempted to just go with what Wikpedia says is a simple control sequence introduction:

r'\x1b[0-?]*[ -/]*[@-~]'

Using U+009B as well is dangerous. It means something else in both UTF-8 and Windows-1252 code pages. (But I guess we're working on Dart strings, so we're past encoding issues.)

lrhn · 2025-04-29T15:34:14Z

pkgs/args/lib/src/utils.dart

 /// Pads [source] to [length] by adding spaces at the end.
 String padRight(String source, int length) =>
-    source + ' ' * (length - source.length);
+    source + ' ' * (length - source.lengthWithoutAnsi);


I'd consider adding .ansiLength which only accumulates the length of the Ansi sequences,
so that it's just int get lengthWithoutAnsi => length - ansiLength;.
Then this can be source.padRight(length + source.ansiLength). (Using padRight is more efficient because it doesn't have to create the intermediate ' ' * ... string. I guess a good optimizing compiler can avoid that too, but it's better to be safe.)

lrhn · 2025-04-29T15:36:56Z

pkgs/args/test/utils_test.dart

+const _ansiMixedStyles = 'Normal, \x1B[31mRed\x1B[0m, \x1B[1mBold\x1B[0m, \x1B[4mUnderline\x1B[0m, \x1B[1;34mBold Blue\x1B[0m, Normal again.';
+const _ansiLongSequence = 'Start \x1B[1;3;4;5;7;9;31;42;38;5;196;48;5;226m Beaucoup formatting! \x1B[0m End';
+const _ansiCombined256 = '\x1B[1;38;5;27;48;5;220mBold Bright Blue FG (27) on Gold BG (220)\x1B[0m';
+const _ansiCombinedTrueColor = '\x1B[4;48;2;50;50;50;38;2;150;250;150mUnderlined Light Green FG on Dark Grey BG\x1B[0m';


These are all of the form \x1b\[[\d;]*m. That doesn't cover most of the complexity of the RegExp.

If this is all the functionality needs to do, the RegExp shouldn't be as complicated. (It probably shouldn't, even if recognizing more escapes than just the color ones.)

add support to ignore ansi sequences when formatting usage display

02d4bf2

timmaffett requested a review from munificent as a code owner April 16, 2025 21:38

github-actions bot added the package:args label Apr 16, 2025

munificent removed their request for review April 24, 2025 17:20

lrhn reviewed Apr 29, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add support to ignore ansi sequences when formatting usage display. Fixes #879 #880

add support to ignore ansi sequences when formatting usage display. Fixes #879 #880

timmaffett commented Apr 16, 2025

munificent commented Apr 24, 2025

timmaffett commented Apr 25, 2025

lrhn Apr 29, 2025

lrhn Apr 29, 2025

lrhn Apr 29, 2025 •

edited

Loading

lrhn Apr 29, 2025 •

edited

Loading

lrhn Apr 29, 2025

add support to ignore ansi sequences when formatting usage display. Fixes #879 #880

Are you sure you want to change the base?

add support to ignore ansi sequences when formatting usage display. Fixes #879 #880

Conversation

timmaffett commented Apr 16, 2025

munificent commented Apr 24, 2025

timmaffett commented Apr 25, 2025

lrhn Apr 29, 2025

Choose a reason for hiding this comment

lrhn Apr 29, 2025

Choose a reason for hiding this comment

lrhn Apr 29, 2025 • edited Loading

Choose a reason for hiding this comment

lrhn Apr 29, 2025 • edited Loading

Choose a reason for hiding this comment

lrhn Apr 29, 2025

Choose a reason for hiding this comment

lrhn Apr 29, 2025 •

edited

Loading

lrhn Apr 29, 2025 •

edited

Loading