-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[libc++][format] Discard contents since null-terminator in character arrays in formatting #116571
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-libcxx Author: A. Jiang (frederick-vs-ja) ChangesCurrently, built-in > otherwise, if The standard wording specifies that character arrays are decayed to pointers. When the null terminator is not the last element or there's no null terminator (the latter case is UB), libc++ currently produces different results. Fixes #115935. Full diff: https://github.com/llvm/llvm-project/pull/116571.diff 2 Files Affected:
diff --git a/libcxx/include/__format/format_arg_store.h b/libcxx/include/__format/format_arg_store.h
index 8b2c95c657c9bd..78b142ebcdb743 100644
--- a/libcxx/include/__format/format_arg_store.h
+++ b/libcxx/include/__format/format_arg_store.h
@@ -101,20 +101,14 @@ consteval __arg_t __determine_arg_t() {
return __arg_t::__long_double;
}
-// Char pointer
+// Char pointer or array
template <class _Context, class _Tp>
- requires(same_as<typename _Context::char_type*, _Tp> || same_as<const typename _Context::char_type*, _Tp>)
+ requires(same_as<typename _Context::char_type*, _Tp> || same_as<const typename _Context::char_type*, _Tp>) ||
+ (is_array_v<_Tp> && same_as<_Tp, typename _Context::char_type[extent_v<_Tp>]>)
consteval __arg_t __determine_arg_t() {
return __arg_t::__const_char_type_ptr;
}
-// Char array
-template <class _Context, class _Tp>
- requires(is_array_v<_Tp> && same_as<_Tp, typename _Context::char_type[extent_v<_Tp>]>)
-consteval __arg_t __determine_arg_t() {
- return __arg_t::__string_view;
-}
-
// String view
template <class _Context, class _Tp>
requires(same_as<typename _Context::char_type, typename _Tp::value_type> &&
@@ -188,15 +182,8 @@ _LIBCPP_HIDE_FROM_ABI basic_format_arg<_Context> __create_format_arg(_Tp& __valu
else if constexpr (__arg == __arg_t::__unsigned_long_long)
return basic_format_arg<_Context>{__arg, static_cast<unsigned long long>(__value)};
else if constexpr (__arg == __arg_t::__string_view)
- // Using std::size on a character array will add the NUL-terminator to the size.
- if constexpr (is_array_v<_Dp>)
- return basic_format_arg<_Context>{
- __arg, basic_string_view<typename _Context::char_type>{__value, extent_v<_Dp> - 1}};
- else
- // When the _Traits or _Allocator are different an implicit conversion will
- // fail.
- return basic_format_arg<_Context>{
- __arg, basic_string_view<typename _Context::char_type>{__value.data(), __value.size()}};
+ return basic_format_arg<_Context>{
+ __arg, basic_string_view<typename _Context::char_type>{__value.data(), __value.size()}};
else if constexpr (__arg == __arg_t::__ptr)
return basic_format_arg<_Context>{__arg, static_cast<const void*>(__value)};
else if constexpr (__arg == __arg_t::__handle)
diff --git a/libcxx/test/std/utilities/format/format.functions/format_tests.h b/libcxx/test/std/utilities/format/format.functions/format_tests.h
index b2ed6775fe8a13..33212fa663e4a5 100644
--- a/libcxx/test/std/utilities/format/format.functions/format_tests.h
+++ b/libcxx/test/std/utilities/format/format.functions/format_tests.h
@@ -3189,6 +3189,10 @@ void format_tests(TestFunction check, ExceptionTest check_exception) {
const CharT* data = buffer;
check(SV("hello 09azAZ!"), SV("hello {}"), data);
}
+ {
+ CharT buffer[] = {CharT('a'), CharT('b'), CharT('c'), 0, CharT('d'), CharT('e'), CharT('f'), 0};
+ check(SV("hello abc"), SV("hello {}"), buffer);
+ }
{
std::basic_string<CharT> data = STR("world");
check(SV("hello world"), SV("hello {}"), data);
|
6b05bb4
to
4424283
Compare
Currently, built-in `char`/`wchar_t` arrays are assumed to be null-terminated sequence with the terminator being the last element in formatting. This doesn't conform to [format.arg]/6.9. > otherwise, if `decay_t<TD>` is `char_type*` or `const char_type*`, > initializes value with `static_cast<const char_type*>(v)`; The standard wording specifies that character arrays are decayed to pointers. When the null terminator is not the last element or there's no null terminator (the latter case is UB), libc++ currently produces different results.
4424283
to
96f2f4c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update! Still a few more comments but I think this should be good to go once they've been addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this. A few comments.
template <class _Arr, class _Elem> | ||
inline constexpr bool __is_bounded_array_of = false; | ||
|
||
template <class _Elem, size_t _Len> | ||
inline constexpr bool __is_bounded_array_of<_Elem[_Len], _Elem> = true; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these really useful? They are only used twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I roughly remember that the old form was related to something weird (possibly for the T[0]
extension), so I added the new internal trait.
Currently, built-in
char
/wchar_t
arrays are assumed to be null-terminated sequence with the terminator being the last element in formatting. This doesn't conform to [format.arg]/6.9.The standard wording specifies that character arrays are decayed to pointers. When the null terminator is not the last element or there's no null terminator (the latter case is UB), libc++ currently produces different results.
Also fixes and hardens
formatter<CharT[N], CharT>::format
in<__format/formatter_string.h>
. These specializations are rarely used.Fixes #115935. Also checks the preconditions in this case, which fixes #116570.