Commit 35a5b25

Bug#34666531: Mysqld failure - String::ptr()
The problem is that the non-deterministic property of a UDF is lost when the function is part of a derived table that is merged into the outer query block (or is part of a subquery that is converted to a semi-join).

This happens because fix_after_pullout() forgets about the property. update_used_tables() handles the property with special logic that checks it before updating the used tables information.

The fix is to add a member m_non_deterministic that is set to true during resolving and used in fix_after_pullout() and update_used_tables() to set correct values for used_tables(). The member is used by get_initial_pseudo_tables(), which means that the special update_used_tables() implementation can also be removed.

Change-Id: I26ea348e450acb92062df5b676e2a24110af5dd8
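To make the failure mode concrete, here is a minimal, self-contained C++ sketch. It is not the server code: `UdfItemModel`, `resolve()` and `recompute()` are hypothetical stand-ins, and the bit position chosen for `RAND_TABLE_BIT` is an assumption made for illustration. The sketch only shows why recomputing used tables from the arguments alone can drop the pseudo-table bit, and how seeding the recomputation from a stored flag (the role `m_non_deterministic` plays in this commit) preserves it.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-ins for the server's types; not the real MySQL definitions.
using table_map = std::uint64_t;
constexpr table_map RAND_TABLE_BIT = table_map{1} << 63;  // assumed bit position

// Hypothetical model of a UDF item. Arguments are reduced to their table maps.
struct UdfItemModel {
  std::vector<table_map> arg_tables;  // used tables of each argument
  bool non_deterministic = false;     // plays the role of m_non_deterministic
  table_map used_tables_cache = 0;

  // Resolving: remember the property once, then compute used tables.
  void resolve(bool udf_is_non_deterministic) {
    non_deterministic = udf_is_non_deterministic;
    recompute(/*seed_from_member=*/true);
    // Postcondition: a non-deterministic UDF always carries the pseudo bit.
    assert(!non_deterministic || (used_tables_cache & RAND_TABLE_BIT));
  }

  // Recompute used tables, as happens after a derived table is merged.
  // With seed_from_member == false the computation starts from zero, which
  // mimics the behaviour described as the bug: the pseudo-table bit is lost.
  void recompute(bool seed_from_member) {
    table_map map =
        (seed_from_member && non_deterministic) ? RAND_TABLE_BIT : 0;
    for (table_map t : arg_tables) map |= t;
    used_tables_cache = map;
  }

  // An item that uses no tables at all looks like a constant.
  bool is_const() const { return used_tables_cache == 0; }
};
```

In this model a zero-argument UDF such as `sequence()` has an empty `arg_tables`, so after `recompute(false)` it appears constant, while `recompute(true)` keeps `RAND_TABLE_BIT` set.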
1 parent 7a199e1 commit 35a5b25

4 files changed, +58 -47 lines changed

mysql-test/r/udf.result

Lines changed: 17 additions & 0 deletions
@@ -725,4 +725,21 @@ f1 median
 2 100
 DROP TABLE t1;
 DROP FUNCTION my_median;
+CREATE FUNCTION sequence RETURNS INTEGER SONAME "UDF_EXAMPLE_LIB";
+CREATE TABLE t1 (a INT);
+INSERT INTO t1 VALUES (1),(2),(3),(4);
+SELECT a FROM t1 WHERE a = sequence();
+a
+1
+2
+3
+4
+SELECT a FROM (SELECT sequence() AS seq, a FROM t1) AS dt WHERE a = seq;
+a
+1
+2
+3
+4
+DROP FUNCTION sequence;
+DROP TABLE t1;
 # End of the 8.0 tests

mysql-test/t/udf.test

Lines changed: 15 additions & 0 deletions
@@ -867,4 +867,19 @@ let $query =
 DROP TABLE t1;
 DROP FUNCTION my_median;
 
+#
+# Bug#34666531: Mysqld failure - String::ptr()
+#
+
+--replace_result $UDF_EXAMPLE_LIB UDF_EXAMPLE_LIB
+eval CREATE FUNCTION sequence RETURNS INTEGER SONAME "$UDF_EXAMPLE_LIB";
+CREATE TABLE t1 (a INT);
+INSERT INTO t1 VALUES (1),(2),(3),(4);
+
+SELECT a FROM t1 WHERE a = sequence();
+SELECT a FROM (SELECT sequence() AS seq, a FROM t1) AS dt WHERE a = seq;
+
+DROP FUNCTION sequence;
+DROP TABLE t1;
+
 --echo # End of the 8.0 tests

sql/item_func.cc

Lines changed: 16 additions & 0 deletions
@@ -4461,6 +4461,7 @@ bool Item_udf_func::fix_fields(THD *thd, Item **) {
   if (udf.fix_fields(thd, this, arg_count, args)) return true;
   if (thd->is_error()) return true;
   used_tables_cache = udf.used_tables_cache;
+  m_non_deterministic = is_non_deterministic();
   fixed = true;
   return false;
 }
@@ -4570,6 +4571,21 @@ bool udf_handler::fix_fields(THD *thd, Item_result_field *func, uint arg_count,
 
   if (func->resolve_type(thd)) return true;
 
+  /*
+    Calculation of constness and non-deterministic property of a UDF is done
+    according to this algorithm:
+    - If any argument to the UDF is non-const, the used tables information
+      and constness of the UDF is derived from the aggregated properties of
+      the arguments.
+    - If all arguments to the UDF are const and the init function specifies
+      the UDF to be non-const, the UDF is marked as non-deterministic.
+    Thus, initid.const_item is only considered when all arguments are const,
+    and it's use is thus slightly inconsistent. However, the current behavior
+    seems to work well in most circumstances.
+
+    @todo Clarify the semantics of initid.const_item and make it affect
+          the constness and non-deterministic property more consistently.
+  */
   initid.max_length = func->max_length;
   initid.maybe_null = func->m_nullable;
   initid.const_item = used_tables_cache == 0;
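The two rules in the comment added to udf_handler::fix_fields() can be sketched as a small standalone C++ function. This is an illustrative reading of the comment, not the server implementation: `udf_used_tables()` is a hypothetical helper, arguments are reduced to plain table bitmaps, `init_const_item` stands for the value of `initid.const_item` after the UDF's init function has run, and the bit position of `RAND_TABLE_BIT` is assumed.

```cpp
#include <cassert>  // for checks when trying the helper out
#include <cstdint>
#include <vector>

// Simplified stand-ins; not the real MySQL definitions.
using table_map = std::uint64_t;
constexpr table_map RAND_TABLE_BIT = table_map{1} << 63;  // assumed bit position

// Hypothetical helper that follows the comment's two rules.
//   arg_tables      : used tables of each UDF argument (0 means const)
//   init_const_item : what the init function left in initid.const_item
table_map udf_used_tables(const std::vector<table_map> &arg_tables,
                          bool init_const_item) {
  // Aggregate the used tables of all arguments.
  table_map aggregated = 0;
  for (table_map t : arg_tables) aggregated |= t;

  // Rule 1: any non-const argument decides the result; the init function's
  // const_item answer is not consulted at all in this case.
  if (aggregated != 0) return aggregated;

  // Rule 2: all arguments are const. If the init function says "not const",
  // the UDF is treated as non-deterministic, otherwise it is a constant.
  return init_const_item ? 0 : RAND_TABLE_BIT;
}
```

Under these assumptions, `udf_used_tables({}, false)` yields `RAND_TABLE_BIT` (the `sequence()` case from the test), while `udf_used_tables({0x1, 0x4}, false)` yields `0x5` regardless of the init function's answer, which is the inconsistency the comment points out.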

sql/item_func.h

Lines changed: 10 additions & 47 deletions
@@ -2108,54 +2108,10 @@ class Item_udf_func : public Item_func {
   bool itemize(Parse_context *pc, Item **res) override;
   const char *func_name() const override { return udf.name(); }
   enum Functype functype() const override { return UDF_FUNC; }
-  bool fix_fields(THD *thd, Item **ref) override;
-  void update_used_tables() override {
-    /*
-      TODO: Make a member in UDF_INIT and return if a UDF is deterministic or
-      not.
-      Currently UDF_INIT has a member (const_item) that is an in/out
-      parameter to the init() call.
-      The code in udf_handler::fix_fields also duplicates the arguments
-      handling code in Item_func::fix_fields().
-
-      The lack of information if a UDF is deterministic makes writing
-      a correct update_used_tables() for UDFs impossible.
-      One solution to this would be :
-      - Add a is_deterministic member of UDF_INIT
-      - (optionally) deprecate the const_item member of UDF_INIT
-      - Take away the duplicate code from udf_handler::fix_fields() and
-        make Item_udf_func call Item_func::fix_fields() to process its
-        arguments as for any other function.
-      - Store the deterministic flag returned by <udf>_init into the
-        udf_handler.
-      - Don't implement Item_udf_func::fix_fields, implement
-        Item_udf_func::resolve_type() instead (similar to non-UDF functions).
-      - Override Item_func::update_used_tables to call
-        Item_func::update_used_tables() and add a RAND_TABLE_BIT to the
-        result of Item_func::update_used_tables() if the UDF is
-        non-deterministic.
-      - (optionally) rename RAND_TABLE_BIT to NONDETERMINISTIC_BIT to
-        better describe its usage.
-
-      The above would require a change of the UDF API.
-      Until that change is done here's how the current code works:
-      We call Item_func::update_used_tables() only when we know that
-      the function depends on real non-const tables and is deterministic.
-      This can be done only because we know that the optimizer will
-      call update_used_tables() only when there's possibly a new const
-      table. So update_used_tables() can only make a Item_func more
-      constant than it is currently.
-      That's why we don't need to do anything if a function is guaranteed
-      to return non-constant (it's non-deterministic) or is already a
-      const.
-    */
-    if ((used_tables_cache & ~PSEUDO_TABLE_BITS) &&
-        !(used_tables_cache & RAND_TABLE_BIT))
-      Item_func::update_used_tables();
-
-    not_null_tables_cache = 0;
-    assert(!null_on_null);  // no need to update not_null_tables_cache
+  table_map get_initial_pseudo_tables() const override {
+    return m_non_deterministic ? RAND_TABLE_BIT : 0;
   }
+  bool fix_fields(THD *thd, Item **ref) override;
   void cleanup() override;
   Item_result result_type() const override { return udf.result_type(); }
   bool is_expensive() override { return true; }
@@ -2172,6 +2128,13 @@ class Item_udf_func : public Item_func {
 
  protected:
   bool may_have_named_parameters() const override { return true; }
+
+ private:
+  /**
+    This member is set during resolving and is used by update_used_tables() and
+    fix_after_pullout() to preserve the non-deterministic property.
+  */
+  bool m_non_deterministic{false};
 };
 
 class Item_func_udf_float final : public Item_udf_func {
