-
Notifications
You must be signed in to change notification settings - Fork 50
[html] Various performance optimizations #2019
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
PR Health
Breaking changes
|
Package | Change | Current Version | New Version | Needed Version | Looking good? |
---|---|---|---|---|---|
html | Breaking | 0.15.5+1 | 0.15.6 | 0.16.0 Got "0.15.6" expected >= "0.16.0" (breaking changes) |
This check can be disabled by tagging the PR with skip-breaking-check
.
Changelog Entry ✔️
Package | Changed Files |
---|
Changes to files need to be accounted for in their respective changelogs.
Coverage ⚠️
File | Coverage |
---|---|
pkgs/html/lib/parser.dart | 💔 99 % ⬇️ 0 % |
pkgs/html/lib/src/constants.dart | 💚 93 % ⬆️ 0 % |
pkgs/html/lib/src/html_input_stream.dart | 💚 80 % ⬆️ 11 % |
pkgs/html/lib/src/tokenizer.dart | 💚 100 % |
pkgs/html/lib/src/treebuilder.dart | 💚 95 % |
pkgs/html/lib/src/trie.dart | 💔 Not covered |
pkgs/html/tool/generate_trie.dart | 💔 Not covered |
This check for test coverage is informational (issues shown here will not fail the PR).
This check can be disabled by tagging the PR with skip-coverage-check
.
API leaks ⚠️
The following packages contain symbols visible in the public API, but not exported by the library. Export these symbols or remove them from your publicly visible API.
Package | Leaked API symbols |
---|---|
html | HtmlTokenizer Token HtmlInputStream TagToken DoctypeToken StringToken TreeBuilder ActiveFormattingElements StartTagToken TagAttribute CommentToken CharactersToken SpaceCharactersToken EndTagToken |
This check can be disabled by tagging the PR with skip-leaking-check
.
License Headers ⚠️
// Copyright (c) 2025, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.
Files |
---|
pkgs/html/lib/parser.dart |
pkgs/html/lib/src/constants.dart |
pkgs/html/lib/src/html_input_stream.dart |
pkgs/html/lib/src/tokenizer.dart |
pkgs/html/lib/src/treebuilder.dart |
pkgs/html/test/parser_feature_test.dart |
pkgs/html/test/trie_test.dart |
pkgs/html/tool/generate_trie.dart |
All source files should start with a license header.
Unrelated files missing license headers
Files |
---|
pkgs/bazel_worker/benchmark/benchmark.dart |
pkgs/bazel_worker/example/client.dart |
pkgs/bazel_worker/example/worker.dart |
pkgs/benchmark_harness/integration_test/perf_benchmark_test.dart |
pkgs/boolean_selector/example/example.dart |
pkgs/clock/lib/clock.dart |
pkgs/clock/lib/src/clock.dart |
pkgs/clock/lib/src/default.dart |
pkgs/clock/lib/src/stopwatch.dart |
pkgs/clock/lib/src/utils.dart |
pkgs/clock/test/clock_test.dart |
pkgs/clock/test/default_test.dart |
pkgs/clock/test/stopwatch_test.dart |
pkgs/clock/test/utils.dart |
pkgs/coverage/lib/src/coverage_options.dart |
pkgs/coverage/test/collect_coverage_config_test.dart |
pkgs/coverage/test/config_file_locator_test.dart |
pkgs/html/example/main.dart |
pkgs/html/lib/dom.dart |
pkgs/html/lib/dom_parsing.dart |
pkgs/html/lib/html_escape.dart |
pkgs/html/lib/src/encoding_parser.dart |
pkgs/html/lib/src/list_proxy.dart |
pkgs/html/lib/src/query_selector.dart |
pkgs/html/lib/src/token.dart |
pkgs/html/lib/src/utils.dart |
pkgs/html/test/dom_test.dart |
pkgs/html/test/parser_test.dart |
pkgs/html/test/query_selector_test.dart |
pkgs/html/test/selectors/level1_baseline_test.dart |
pkgs/html/test/selectors/level1_lib.dart |
pkgs/html/test/selectors/selectors.dart |
pkgs/html/test/support.dart |
pkgs/html/test/tokenizer_test.dart |
pkgs/pubspec_parse/test/git_uri_test.dart |
pkgs/stack_trace/example/example.dart |
pkgs/watcher/test/custom_watcher_factory_test.dart |
pkgs/yaml_edit/example/example.dart |
This check can be disabled by tagging the PR with skip-license-check
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution! A 2x Improvement sounds great. I had some minor comments and you'll need to up the version to 0.15.6
and add a changelog and it's good to go!
pkgs/html/example/generate_trie.dart
Outdated
@@ -0,0 +1,18 @@ | |||
import 'package:dart_style/dart_style.dart'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file should live in html/tool/
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkgs/html/example/generate_trie.dart
Outdated
'const entitiesTrieRoot = $root;'.replaceAll('{}', '<int, dynamic>{}'); | ||
final formatted = DartFormatter().format(source); | ||
// trimRight() because print adds its own newline | ||
print(formatted.trimRight()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of printing to stdout, this could override lib/src/trie.dart
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkgs/html/example/generate_trie.dart
Outdated
} | ||
} | ||
final source = | ||
'const entitiesTrieRoot = $root;'.replaceAll('{}', '<int, dynamic>{}'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add a // AUTO GENERATED by 'tool/generate_trie.dart'. DO NOT EDIT!
to the beginning of the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkgs/html/example/generate_trie.dart
Outdated
for (final entity in entities.keys) { | ||
var node = root; | ||
for (final charCode in entity.codeUnits) { | ||
node = (node[charCode] ??= <int, dynamic>{}) as Map<int, dynamic>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good enough approach. Another way would be to create a 2d array of STATE * ALPHABET which would be around 500KBs of memory but faster to traverse. Best is using double array tries but that's more complicated to implement. A potential fun project for later perhaps since a 2x improvement is great already!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never heard of those, interesting
pkgs/html/lib/src/constants.dart
Outdated
@@ -424,7 +468,60 @@ const List<String> tableInsertModeElements = [ | |||
]; | |||
|
|||
// TODO(jmesserly): remove these in favor of the test functions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the TODO comment mentions, a possible improvement here is to have a test function that only checks 2 integer bounds instead of using Set.contains
. Could be done in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems it doesn't quite contain all letters right? Since there are some symbols between upper and lowercase letters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well then 4 integer bound checks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -77,9 +77,6 @@ On line 4, column 3 of ParseError: Unexpected DOCTYPE. Ignored. | |||
expect(parser.errors.length, 1); | |||
final error = parser.errors[0]; | |||
expect(error.errorCode, 'unexpected-doctype'); | |||
expect(error.span!.start.line, 3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't quite understand why we deleted these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because I no longer generate fileInfo unless generateSpans=true, we don't get any error.span any more for this test which has generateSpans=false
|
pkgs/html/tool/generate_trie.dart
Outdated
'const entitiesTrieRoot = $root;'.replaceAll('{}', '<int, dynamic>{}'); | ||
final formatted = DartFormatter().format(source); | ||
final htmlDir = File(Platform.script.path).parent.parent; | ||
File('${htmlDir.path}/lib/src/trie.dart').writeAsStringSync(formatted); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this work on windows? You can use join
from package:path
to make it platform independent:
File(p.join(htmlDir.path, 'lib', 'src', 'trie.dart').writeAsStringSync(formatted);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also it seems that this file is not formatted using dart format
causing the CI to complain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Package publishing
Documentation at https://github.com/dart-lang/ecosystem/wiki/Publishing-automation. |
- Format generate_trie.dart - Use path join() in generate_trie.dart - Add charsUntilAsciiLetter() - Move all codes to Charcode {}
This is a noticeable speedup by scanning and avoiding creating a new string first.
I found another 40% speedup via some new commits |
Actually surprisingly slow
pkgs/html/lib/src/constants.dart
Outdated
String toAsciiLowerCase() => | ||
String.fromCharCodes(codeUnits.map(_asciiToLower)); | ||
String toAsciiLowerCase() { | ||
if (codeUnits.any((c) => c >= Charcode.kUpperA && c <= Charcode.kUpperZ)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These conditions can live as functions: codeUnits.any(isUpperCaseCode)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkgs/html/lib/src/constants.dart
Outdated
static const int kGreaterThan = 0x3E; | ||
|
||
/// A | ||
static const int kUpperA = 0x41; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: We usually don't use kFooBar
when identifying constants in Dart, instead we just use fooBar
.
https://dart.dev/effective-dart/style#dont-use-prefix-letters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
: String.fromCharCode(_chars[_offset]); | ||
final charCode = _chars[_offset]; | ||
if (charCode < 256) { | ||
return asciiCharacters[charCode]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does changing this to String.fromCharCode(charCode)
affect the performance much?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The improvement is 1% in AoT compilation and 6% in JIT
@@ -107,8 +107,9 @@ class HtmlInputStream { | |||
} | |||
} | |||
|
|||
final isSurrogatePair = | |||
i + 1 < charsLength && _isLeadSurrogate(c) && _isTrailSurrogate(rawChars[i + 1]); | |||
final isSurrogatePair = _isLeadSurrogate(c) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also order it in the original way but when you think about the fact that most characters aren't surrogate, this makes sense as it saves a check per character! Good find!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for your contribution!
Revisions updated by `dart tools/rev_sdk_deps.dart`. dartdoc (https://github.com/dart-lang/dartdoc/compare/1788710..1a08646): 1a08646a 2025-04-28 Sam Rawlins Stop using InterfaceElementImpl2 (dart-lang/dartdoc#4041) 38b7d2e0 2025-04-28 Sam Rawlins Stop using ConstructorElementImpl (dart-lang/dartdoc#4040) ad65ae89 2025-04-25 Sarah Zakarias Make redirect pages for categories (dart-lang/dartdoc#4037) 9ab9ba05 2025-04-23 Parker Lougheed Migrate remaining styles to use sass nesting (dart-lang/dartdoc#4024) aaea6377 2025-04-22 Jonas Finnemann Jensen Use _category name_ in the URL, instead of the _category displayName_. (dart-lang/dartdoc#4031) 0388c64e 2025-04-22 Jonas Finnemann Jensen Do not make a copy of PUB_CACHE, just create an empty one instead. (dart-lang/dartdoc#4032) tools (https://github.com/dart-lang/tools/compare/11a7719..11f4cf7): 11f4cf7e 2025-04-24 Callum Moffat [html] Various performance optimizations (dart-lang/tools#2019) 98d4e4d9 2025-04-22 Simone Stasi [html] fix TypeError in nth-child query selector (dart-lang/tools#2015) Change-Id: Ic5d0a5f7f9393fa1a585b80ec1e014647224f151 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/425200 Auto-Submit: Devon Carew <[email protected]> Reviewed-by: Konstantin Shcheglov <[email protected]> Commit-Queue: Konstantin Shcheglov <[email protected]>
Should be twice as fast according to my use cases. Happy to discuss what you need for testing, (micro)benchmarking, preserving public API for old slow functions, or whatever else to merge it. I know I still need to do stuff like bump version, changelog, etc.
This was originally at dart-archive/html#251
That version has proper separation of each commit which may be more helpful, it's been squashed for the repo migration.