Query planning benchmark #127550
Conversation
Pinging @elastic/es-analytical-engine (Team:Analytics)
private LogicalPlan plan(String query) {
    var parsed = parser.createStatement(query, new QueryParams(), telemetry);
I am puzzled why telemetry is passed as a method argument (even though it is a stateless, service-like object), while AnalyzerContext is passed to Analyzer via constructor even though it is query specific and prevents us from reusing the Analyzer instance across queries; similarly with LogicalOptimizerContext and LogicalPlanOptimizer.
Could somebody share the history behind that?
If we changed this, it would also make it easier to add benchmarks, e.g. plan(context, "FROM test | LIMIT 10")
I am puzzled why telemetry is passed as a method argument (even though it is a stateless service like)
The PlanTelemetry object gathers metrics specific to each query. If the query succeeds, the metrics are pushed through the service later, in a listener. If it fails, just one counter is bumped.
AnalyzerContext is passed to Analyzer via constructor even though it is query specific and prevents us from reusing Analyzer instance across queries
I think this is required by the inheritance hierarchy: the context is set in a parent class, ParameterizedRuleExecutor, so that any executor rules that are context specific have guaranteed access to it.
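To make the trade-off being discussed concrete, here is a toy contrast of the two styles, with stand-in class names (not the real AnalyzerContext/Analyzer, which carry much more state): injecting the context via the constructor ties an instance to one query, while accepting it per call allows one instance to serve many queries.

```java
public class ContextStyles {
    // Stand-in for a query-specific context such as AnalyzerContext.
    record Context(String defaultIndex) {}

    // Style 1: context via constructor - the instance is tied to one query's
    // context and cannot be reused across queries.
    static class PerQueryAnalyzer {
        private final Context ctx;

        PerQueryAnalyzer(Context ctx) {
            this.ctx = ctx;
        }

        String analyze(String query) {
            return query + " @ " + ctx.defaultIndex();
        }
    }

    // Style 2: context per call - one instance can be shared across queries,
    // which is what would let a benchmark build the analyzer once in setup.
    static class ReusableAnalyzer {
        String analyze(Context ctx, String query) {
            return query + " @ " + ctx.defaultIndex();
        }
    }

    public static void main(String[] args) {
        var reusable = new ReusableAnalyzer();
        System.out.println(reusable.analyze(new Context("test"), "FROM test | LIMIT 10"));
        System.out.println(new PerQueryAnalyzer(new Context("test")).analyze("FROM test | LIMIT 10"));
    }
}
```

The rule-executor hierarchy mentioned above is the reason style 1 is used in practice: rules read the context from a field set by the parent class.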
Yeah, that should work for micro-benchmarking just the logical planning on the coordinator, excluding the field caps calls and their construction (EsqlSession.fieldNames). Thanks Ievgen!
There's a bit of a risk that Java is too smart for us, loading fewer classes than a production environment and thus being more aggressive with the JIT. To guard against this, we could have a static section that plans a bunch of different queries (and appropriately consumes the results), just so most of the relevant classes are loaded before the benchmark does its thing.
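A minimal sketch of that static-warmup idea, with plan as a stand-in for the real parse/analyze/optimize pipeline (the actual benchmark would call the parser, analyzer, and optimizer here). The static block runs once at class load, before any measurement, and consumes each result so the work cannot be dead-code eliminated:

```java
public class WarmupSketch {
    // Stand-in for parser.createStatement(...) + analyzer + optimizer.
    static String plan(String query) {
        return "plan(" + query + ")";
    }

    // Plan a variety of query shapes at class-load time so the relevant
    // classes are loaded before the benchmark methods are measured.
    static {
        String[] warmupQueries = {
            "FROM test | LIMIT 10",
            "FROM test | EVAL a = field1 + field2",
            "FROM test | STATS c = COUNT(*)",
        };
        for (String q : warmupQueries) {
            // Consume the result so the JIT cannot discard the work.
            if (plan(q).isEmpty()) {
                throw new AssertionError("unexpected empty plan for: " + q);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(plan("FROM test | LIMIT 10"));
    }
}
```

In a real JMH benchmark this could equally live in an @Setup(Level.Trial) method; the point is only that it runs before the measured iterations.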
The setup and querying LGTM, but I'm not very well-versed in microbenchmarks otherwise, so maybe wait for another approval.
I agree. I hope to use this infrastructure for #127500 and possibly other changes. I will update it if I see something behaving unusually. Thanks all!
Overall this looks reasonable; however, it feels narrow. Maybe this is OK for now.
We probably want to add more tests over time that:
- Deal with more complex types
- Deal with more complex queries
- Allow configurations for hardcoded stuff like mapping cardinality.
Maybe it's worth thinking about how we do this going forward.
One thing we can do is craft a base class with all the boilerplate goo - the setup, calling the parser/planner/analyzer, etc. - with things like "mapping" and "query" as parameters.
We can then easily define a ton of new benchmarks with different configurations. Something to consider.
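A rough sketch of that base-class idea, with stand-in types (Map entries instead of the real EsField, and a plain method instead of a JMH @Setup): the base class owns the mapping construction and planning boilerplate, and each concrete benchmark only declares its configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical base class holding the shared boilerplate; subclasses supply
// only the configuration ("mapping" size, query).
abstract class PlanningBenchmarkBase {
    Map<String, String> mapping;

    abstract int fieldCount();   // e.g. mapping cardinality

    abstract String query();     // the ESQL query to plan

    // Would be annotated @Setup in a real JMH benchmark.
    void setUp() {
        mapping = new LinkedHashMap<>(fieldCount());
        for (int i = 0; i < fieldCount(); i++) {
            mapping.put("field" + i, "text");   // stand-in for an EsField
        }
    }

    // Stand-in for parse/analyze/optimize against the mapping.
    String plan() {
        return query() + " over " + mapping.size() + " fields";
    }
}

// One concrete benchmark is then just a configuration.
class ManyFieldsLimit extends PlanningBenchmarkBase {
    int fieldCount() {
        return 10_000;
    }

    String query() {
        return "FROM test | LIMIT 10";
    }
}

public class BaseClassSketch {
    public static void main(String[] args) {
        var benchmark = new ManyFieldsLimit();
        benchmark.setUp();
        System.out.println(benchmark.plan());
    }
}
```

Adding a new benchmark configuration then means adding one small subclass rather than repeating the setup code.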
var fields = 10_000;
var mapping = LinkedHashMap.<String, EsField>newLinkedHashMap(fields);
for (int i = 0; i < fields; i++) {
    mapping.put("field" + i, new EsField("field-" + i, TEXT, emptyMap(), true));
For planning purposes, do we care that all fields are TEXT? Really asking - are field types something else we should vary?
You have to start somewhere I suppose. I don't believe the types make a big difference for the query being tested in this case.
It's pretty normal to build parameters on these things so you run it like:
./gradlew -p benchmark run --args 'QueryPlanning -pfield_count=10000 -pfield_config=long -pquery="FROM test | EVAL a = field1 + field2"'
Or something. There's a bit of tension here, because the default if you run it without any parameters is to run the combinatorial explosion of every configured parameter set - and that can be trouble. We're expecting to be a bit more selective than that about how we run these. But, yeah, I think this PR is a good start and can tell us a lot as is.
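The combinatorial explosion mentioned above can be made concrete: by default, JMH runs every combination of every parameter set. A toy cross product in plain Java, with illustrative parameter names and values (these are not the PR's actual parameters):

```java
import java.util.ArrayList;
import java.util.List;

public class ParamExplosion {
    // JMH's default run matrix is the full cross product of every @Param
    // value set; this computes the same matrix for three illustrative sets.
    static List<String> runMatrix(String[] fieldCounts, String[] fieldTypes, String[] queries) {
        List<String> runs = new ArrayList<>();
        for (String count : fieldCounts) {
            for (String type : fieldTypes) {
                for (String query : queries) {
                    runs.add(count + "/" + type + "/" + query);
                }
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        // Three small sets already yield 2 * 3 * 2 = 12 benchmark runs,
        // each repeated for every warmup and measurement iteration.
        var runs = runMatrix(
            new String[] { "100", "10000" },
            new String[] { "text", "long", "keyword" },
            new String[] { "FROM test | LIMIT 10", "FROM test | EVAL a = field1 + field2" });
        System.out.println(runs.size());
    }
}
```

Passing explicit -p values on the command line, as in the example above, trims this matrix down to the single combination you care about.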
private LogicalPlan plan(String query) {
    var parsed = parser.createStatement(query, new QueryParams(), telemetry);
    var analyzed = analyzer.analyze(parsed);
Not sure if we care, but this can literally be a single line:
return optimizer.optimize(analyzer.analyze(parser.createStatement(query, new QueryParams(), telemetry)));
The purpose of this change is to add an initial benchmark, mostly targeted at #127500. This should evolve over time to cover more interesting types of queries to measure (discussed during a prior team sync), as well as infrastructure to periodically run those and capture results and changes.
As titled, this adds a micro benchmark for query planning (without execution phase).
This should allow us to detect expensive operations during planning that would otherwise be invisible when running a rally benchmark with both planning and execution.
This could be executed using: