docs/query/query-executions.md (8 additions, 3 deletions)
@@ -64,16 +64,21 @@ Query executions on Dune are triggered in four ways:
Dune has three query engine sizes: community, medium, and large. The engine size determines the resources allocated to your query: the larger the engine, the faster your query runs and the less likely it is to time out.
The community engine is the default query engine for all queries on Dune. It is a shared cluster used by all Dune users, so it can be busy at times, which can lead to long loading times for resource-intensive queries. In the worst case, the query may even time out.
-### Medium engine
+### Medium engine (10 credits)
The medium engine is built to handle most queries on Dune. It is cheap, reliable, and fast. The medium engine scales up and down depending on demand; in contrast to the community engine, a query running on the medium engine is not affected by other users' queries.
-### Large engine
+### Large engine (20 credits)
The large engine is built to handle the most resource-intensive queries on Dune. It's blazing fast, reliable, and can easily deal with large amounts of data. The large engine also scales up and down depending on demand. Queries running on the large engine are not affected by other users' queries.
In addition, the large engine is the only engine that can handle queries that require lots of planning time. This mostly happens when you query a large amount of data, use many joins, or use large aggregate window functions.
docs/query/writing-efficient-queries.md (4 additions, 4 deletions)
@@ -25,6 +25,8 @@ Each parquet file has a footer containing min/max values for every column stored
However, the min/max values of strings, such as tx_hash strings and address strings in blockchain systems, are often not helpful, as these values are effectively random rather than sequentially ordered. As a result, queries that reference these strings are inefficient, since all related pages need to be read into memory.
To write efficient queries on DuneSQL, it's crucial to use filter conditions based on columns that are sequentially ordered and correlated with the file's sorting. Columns like `block_time` and `block_number` are suitable for this purpose. For instance, consider the following optimized query:
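A minimal sketch of the pattern, assuming the standard `ethereum.transactions` table on Dune:

```sql
-- Filter on block_time, which correlates with how the parquet files
-- are sorted, so the engine can skip files whose min/max range falls
-- outside the bounds instead of scanning everything.
SELECT
    hash,
    block_number,
    gas_used
FROM ethereum.transactions
WHERE block_time >= TIMESTAMP '2023-01-01 00:00:00'
  AND block_time <  TIMESTAMP '2023-02-01 00:00:00'
```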
@@ -48,7 +50,7 @@ In addition to leveraging the columnar storage format and using sequentially ord
2. **Use the LIMIT clause**: If you're only interested in a specific number of rows, use the LIMIT clause to avoid processing more data than necessary.
-3. **Leverage partition pruning**: If your data is partitioned, use partition keys in the WHERE clause to help the query engine prune unnecessary partitions and reduce the amount of data scanned.
+3. **Leverage partition pruning**: If your data is partitioned, use partition keys in the WHERE clause to help the query engine prune unnecessary partitions and reduce the amount of data scanned. On Dune, almost all tables are partitioned by time and/or block number (see the sketch after this list).
4. **Filter early and use predicate pushdown**: Apply filters as early as possible in the query to reduce the amount of data being processed. This takes advantage of predicate pushdown, which pushes filter conditions down to the storage layer, reducing the amount of data read from storage.
@@ -58,8 +60,6 @@ In addition to leveraging the columnar storage format and using sequentially ord
7. **Optimize subqueries**: Subqueries can sometimes cause performance issues. Consider using Common Table Expressions (CTEs) or rewriting the query using JOINs to optimize subqueries.
-8. **Use the EXPLAIN command**: The EXPLAIN command shows the query execution plan, which can help you understand the underlying operations and optimize your query. Analyze the output of EXPLAIN to identify potential bottlenecks or improvements.
-
-9. **Optimize data types**: Use appropriate data types for your columns, as this can improve query performance by reducing the amount of data processed. For example, varbinary operations are faster than varchar, so be careful about casting back and forth too much.
+8. **Optimize data types**: Use appropriate data types for your columns, as this can improve query performance by reducing the amount of data processed. For example, varbinary operations are faster than varchar, so be careful about casting back and forth too much.
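As a hedged illustration of several of these tips combined (the table and the address below are assumed examples, not part of the original docs):

```sql
-- Sketch assuming the standard ethereum.transactions table on Dune.
WITH recent_txs AS (              -- CTE instead of a nested subquery (tip 7)
    SELECT hash, "from", value
    FROM ethereum.transactions
    WHERE block_time > NOW() - INTERVAL '7' DAY  -- early filter on the partition column (tips 3 and 4)
      AND "from" = 0xc02aaa39b223fe8d0a0e5c4f27ead9083c756cc2  -- example address; compared as varbinary, no varchar cast (tip 8)
)
SELECT hash, value
FROM recent_txs
ORDER BY value DESC
LIMIT 100                         -- cap the rows returned (tip 2)
```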
By following these tips, you can write more efficient queries on DuneSQL, which is built on TrinoSQL, and optimize the performance of your data processing tasks. Remember that DuneSQL's unique structure, such as the parquet file format and columnar storage, should be taken into account when optimizing your queries to fully benefit from the system's capabilities.