Skip to content

DateTimeComponents.Format.parse() should parse timeZoneId() using the Temporal grammar instead of checking if the time zone exists in the time zone database #532

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 33 commits into from
Jun 2, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
33 commits
Select commit Hold shift + click to select a range
173d1d6
#531: Rename TimeZoneParserOperation to OffsetTimezoneParserOperation
DmitryNekrasov May 29, 2025
4ca3eb4
#531: Refactor `ParserOperation` to introduce abstract timezone valid…
DmitryNekrasov May 29, 2025
e2f587d
#531: Add NamedTimezoneParserOperation and refactor TimeZoneIdDirective
DmitryNekrasov May 29, 2025
1cc0064
#531: Remove unused StringFieldFormatDirective class
DmitryNekrasov May 29, 2025
ee72c03
#531: Refactor timezone parsing and improve validation logic
DmitryNekrasov May 29, 2025
cdbe0a5
#531: Enhance timezone character validation with isAsciiDigit check
DmitryNekrasov May 29, 2025
9f61e22
#531: Remove redundant test case for invalid time zone parsing
DmitryNekrasov May 30, 2025
d839d6e
#531: Refactor timezone to TimeZone naming for consistency
DmitryNekrasov May 30, 2025
a8c87bd
#531: Add isAsciiLetter utility function and update time zone validat…
DmitryNekrasov May 30, 2025
7087b34
#531: Simplify state handling logic in TimeZoneParserOperation
DmitryNekrasov May 30, 2025
835db00
#531: Rename and clean up timezone test data, remove unused identifiers
DmitryNekrasov May 30, 2025
f19e3aa
#531: Add comprehensive unit tests for parsing named time zones
DmitryNekrasov May 30, 2025
38340ae
#531: Simplify TimeZoneParserOperation by removing unnecessary AFTER_…
DmitryNekrasov May 30, 2025
8592f51
#531: Simplify time zone validation logic by removing redundant state…
DmitryNekrasov May 30, 2025
d071b08
#531: Simplify time zone parsing logic in ParserOperation
DmitryNekrasov May 30, 2025
1d4b6c8
#531: Refactor TimeZoneIdDirective to use specific type
DmitryNekrasov May 30, 2025
e6383b2
#531: Refactor TimeZoneIdDirective to use method reference for getter
DmitryNekrasov May 30, 2025
a1d4110
#531: Add the next test case:
DmitryNekrasov May 30, 2025
964c3cd
#531: Expand and clarify timezone identifier handling in DateTimeForm…
DmitryNekrasov May 30, 2025
8f3668c
#531: Simplify TimeZone parsing logic by merging and refactoring oper…
DmitryNekrasov Jun 2, 2025
f874c75
#531: Refactor time zone validation logic into reusable helper functions
DmitryNekrasov Jun 2, 2025
fec0efb
#531: Clarify timezone ID parsing to prefer longest matching ID
DmitryNekrasov Jun 2, 2025
ec80910
#531: Add offset parsing with brackets test and fix prefix validation…
DmitryNekrasov Jun 2, 2025
bac5a11
#531: Refactor parser operations to use `Boolean.onTrue` and `Boolean…
DmitryNekrasov Jun 2, 2025
0f0a76b
#531: Simplify null check logic with `onNotNull` utility method integ…
DmitryNekrasov Jun 2, 2025
744d90f
#531: Fix type parameter usage in `onNotNull` to improve code clarity…
DmitryNekrasov Jun 2, 2025
ce4a5a4
#531: Refactor non-terminal state check
DmitryNekrasov Jun 2, 2025
6d6c0f9
#531: Remove unused `onNotNull` function and simplify prefix validation
DmitryNekrasov Jun 2, 2025
c5a97ed
#531: Fix punctuation in DateTimeFormatBuilder documentation
DmitryNekrasov Jun 2, 2025
9b75a71
#531: Update DateTimeFormatBuilder to reference RFC 9557 grammar for …
DmitryNekrasov Jun 2, 2025
8a21c75
#531: Clarify offset-based timezone format documentation in DateTimeF…
DmitryNekrasov Jun 2, 2025
0dfe0e2
#531: Refactor to reuse `assertParseableAsNamedTimeZone` for time zon…
DmitryNekrasov Jun 2, 2025
19af264
#531: Add tests for parsing time zones with delimiters and remove red…
DmitryNekrasov Jun 2, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 13 additions & 16 deletions core/common/src/format/DateTimeComponents.kt
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,8 @@ import kotlinx.datetime.*
import kotlinx.datetime.DayOfWeek
import kotlinx.datetime.internal.*
import kotlinx.datetime.internal.format.*
import kotlinx.datetime.internal.format.formatter.FormatterStructure
import kotlinx.datetime.internal.format.formatter.StringFormatterStructure
import kotlinx.datetime.internal.format.parser.Copyable
import kotlinx.datetime.internal.format.parser.ParserStructure
import kotlinx.datetime.internal.format.parser.TimeZoneParserOperation
Expand Down Expand Up @@ -565,27 +567,22 @@ internal class DateTimeComponentsContents internal constructor(

internal val timeZoneField = GenericFieldSpec(PropertyAccessor(DateTimeComponentsContents::timeZoneId))

internal class TimeZoneIdDirective(private val knownZones: Set<String>) :
StringFieldFormatDirective<DateTimeComponentsContents>(timeZoneField, knownZones) {
internal class TimeZoneIdDirective() : FieldFormatDirective<DateTimeComponentsContents> {
override val field: FieldSpec<DateTimeComponentsContents, String>
get() = timeZoneField

override val builderRepresentation: String
get() =
"${DateTimeFormatBuilder.WithDateTimeComponents::timeZoneId.name}()"
get() = "${DateTimeFormatBuilder.WithDateTimeComponents::timeZoneId.name}()"

override fun formatter(): FormatterStructure<DateTimeComponentsContents> {
return StringFormatterStructure(field.accessor::getterNotNull)
}

override fun parser(): ParserStructure<DateTimeComponentsContents> =
ParserStructure(
emptyList(),
listOf(
super.parser(),
ParserStructure(
listOf(TimeZoneParserOperation(timeZoneField.accessor)),
emptyList()
)
)
listOf(TimeZoneParserOperation(timeZoneField.accessor)),
emptyList()
)

override fun equals(other: Any?): Boolean = other is TimeZoneIdDirective && other.knownZones == knownZones
override fun hashCode(): Int = knownZones.hashCode()
}

internal class DateTimeComponentsFormat(override val actualFormat: CachedFormatStructure<DateTimeComponentsContents>) :
Expand All @@ -609,7 +606,7 @@ internal class DateTimeComponentsFormat(override val actualFormat: CachedFormatS
}

override fun timeZoneId() =
actualBuilder.add(BasicFormatStructure(TimeZoneIdDirective(TimeZone.availableZoneIds)))
actualBuilder.add(BasicFormatStructure(TimeZoneIdDirective()))

@Suppress("NO_ELSE_IN_WHEN")
override fun dateTimeComponents(format: DateTimeFormat<DateTimeComponents>) = when (format) {
Expand Down
25 changes: 22 additions & 3 deletions core/common/src/format/DateTimeFormatBuilder.kt
Original file line number Diff line number Diff line change
Expand Up @@ -287,10 +287,29 @@ public sealed interface DateTimeFormatBuilder {
*/
public sealed interface WithDateTimeComponents : WithDateTime, WithUtcOffset {
/**
* The IANA time zone identifier, for example, "Europe/Berlin".
* A timezone identifier, either offset-based or a region-based IANA timezone ID.
*
* When formatting, the timezone identifier is supplied as is, without any validation.
* On parsing, [TimeZone.availableZoneIds] is used to validate the identifier.
* Offset-based timezones:
* - `Z` or `z` - UTC
* - Optional prefix (`UTC`, `GMT`, `UT`) followed by offset
* - Offset in one of the formats: `+H`, `+HH`, `+HHMM`, `+HHMMSS`, `+HH:MM`, `+HH:MM:SS`
*
* Region-based IANA timezone IDs:
* Parsed according to RFC 9557 grammar (section 4.1 of https://datatracker.ietf.org/doc/rfc9557/):
* ```
* time-zone-initial = ALPHA / "." / "_"
* time-zone-char = time-zone-initial / DIGIT / "-" / "+"
* time-zone-part = time-zone-initial *time-zone-char
* time-zone-name = time-zone-part *("/" time-zone-part)
* ```
*
* Note: This implementation doesn't follow the RFC 9557 grammar fully and allows
* "." and ".." as the time-zone-part.
*
* When formatting, outputs the identifier as-is. When parsing, validates syntax only;
* actual timezone validation is deferred until creating a [TimeZone] object.
*
* If more than one way to read a valid timezone ID matches the string, we always take the longest one.
*
* @sample kotlinx.datetime.test.samples.format.DateTimeComponentsFormatSamples.timeZoneId
*/
Expand Down
21 changes: 0 additions & 21 deletions core/common/src/internal/format/FieldFormatDirective.kt
Original file line number Diff line number Diff line change
Expand Up @@ -147,27 +147,6 @@ internal abstract class NamedEnumIntFieldFormatDirective<in Target, Type>(
)
}

internal abstract class StringFieldFormatDirective<in Target>(
final override val field: FieldSpec<Target, String>,
private val acceptedStrings: Set<String>,
) : FieldFormatDirective<Target> {

init {
require(acceptedStrings.isNotEmpty()) {
"The set of accepted strings is empty"
}
}

override fun formatter(): FormatterStructure<Target> =
StringFormatterStructure(field.accessor::getterNotNull)

override fun parser(): ParserStructure<Target> =
ParserStructure(
listOf(StringSetParserOperation(acceptedStrings, field.accessor, field.name)),
emptyList()
)
}

internal abstract class SignedIntFieldFormatDirective<in Target>(
final override val field: FieldSpec<Target, Int>,
private val minDigits: Int?,
Expand Down
113 changes: 62 additions & 51 deletions core/common/src/internal/format/parser/ParserOperation.kt
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
package kotlinx.datetime.internal.format.parser

import kotlinx.datetime.internal.isAsciiDigit
import kotlinx.datetime.internal.isAsciiLetter

internal interface ParserOperation<in Output> {
fun consume(storage: Output, input: CharSequence, startIndex: Int): ParseResult
Expand Down Expand Up @@ -142,9 +143,8 @@ internal class UnconditionalModification<Output>(
internal class TimeZoneParserOperation<Output>(
private val setter: AssignableField<Output, String>
) : ParserOperation<Output> {

override fun consume(storage: Output, input: CharSequence, startIndex: Int): ParseResult {
val lastMatch = validateTimezone(input, startIndex)
val lastMatch = validateTimeZone(input, startIndex)
return if (lastMatch > startIndex) {
setter.setWithoutReassigning(storage, input.substring(startIndex, lastMatch), startIndex, lastMatch)
ParseResult.Ok(lastMatch)
Expand All @@ -158,95 +158,106 @@ internal class TimeZoneParserOperation<Output>(
START,
AFTER_PREFIX,
AFTER_SIGN,
AFTER_INIT_SIGN,
AFTER_HOUR,
AFTER_INIT_HOUR,
AFTER_MINUTE,
AFTER_COLON_MINUTE,
END,
INVALID
IN_PART,
AFTER_SLASH,
END
}

private fun validateTimezone(input: CharSequence, startIndex: Int): Int {
private inline fun Boolean.onTrue(action: () -> Unit): Boolean = if (this) { action(); true } else false

private inline fun Boolean.onFalse(action: () -> Unit): Boolean = if (this) true else { action(); false }

private fun validateTimeZone(input: CharSequence, startIndex: Int): Int {
var index = startIndex
var lastValidIndex = startIndex

fun validatePrefix(validValues: List<String>): Boolean =
validValues.firstOrNull { input.startsWith(it) }?.let {
index += it.length
lastValidIndex = index
true
} ?: false

fun validateTimeComponent(length: Int): Boolean {
if ((index..<(index + length)).all { input.getOrNull(it)?.isAsciiDigit() ?: false }) {
index += length
lastValidIndex = index
return true
}
return false
}
validValues.firstOrNull { input.startsWith(it, index) }?.also { index += it.length } != null

fun validateSign(): Boolean = (input[index] in listOf('+', '-')).onTrue { index++ }

fun validateTimeComponent(length: Int): Boolean =
(index..<(index + length))
.all { input.getOrNull(it)?.isAsciiDigit() ?: false }
.onTrue { index += length }

fun validateTimeComponentWithColon(): Boolean =
(input[index] == ':').onTrue { index++ } && validateTimeComponent(2).onFalse { index-- }

fun Char.isTimeZoneInitial(): Boolean = isAsciiLetter() || this == '.' || this == '_'
fun Char.isTimeZoneChar(): Boolean = isTimeZoneInitial() || isAsciiDigit() || this == '-' || this == '+'

fun validateTimeZoneInitial(): Boolean = input[index].isTimeZoneInitial().onTrue { index++ }
fun validateTimeZoneChar(): Boolean = input[index].isTimeZoneChar().onTrue { index++ }
fun validateSlash(): Boolean = (input[index] == '/').onTrue { index++ }

var state = State.START
while (index < input.length) {
state = when (state) {
State.START -> when {
input[index] == 'Z' || input[index] == 'z' -> {
index++
State.END
}

input[index] in listOf('+', '-') -> {
index++
State.AFTER_SIGN
}

validatePrefix(listOf("UTC", "GMT", "UT")) -> State.AFTER_PREFIX
else -> State.INVALID
validateSign() -> State.AFTER_INIT_SIGN
validateTimeZoneInitial() -> State.IN_PART
else -> break
}

State.AFTER_PREFIX -> when {
input[index] in listOf('+', '-') -> {
index++
State.AFTER_SIGN
}

else -> State.INVALID
validateSign() -> State.AFTER_SIGN
else -> State.IN_PART
}

State.AFTER_SIGN -> when {
validateTimeComponent(2) -> State.AFTER_HOUR
else -> State.IN_PART
}

State.AFTER_INIT_SIGN -> when {
validateTimeComponent(2) -> State.AFTER_INIT_HOUR
validateTimeComponent(1) -> State.END
else -> State.INVALID
else -> break
}

State.AFTER_HOUR -> when {
input[index] == ':' -> {
index++
if (validateTimeComponent(2)) State.AFTER_COLON_MINUTE else State.INVALID
}
validateTimeComponentWithColon() -> State.AFTER_COLON_MINUTE
else -> State.IN_PART
}

State.AFTER_INIT_HOUR -> when {
validateTimeComponentWithColon() -> State.AFTER_COLON_MINUTE
validateTimeComponent(2) -> State.AFTER_MINUTE
else -> State.INVALID
else -> break
}

State.AFTER_MINUTE -> when {
validateTimeComponent(2) -> State.END
else -> State.INVALID
else -> break
}

State.AFTER_COLON_MINUTE -> when {
input[index] == ':' -> {
index++
if (validateTimeComponent(2)) State.END else State.INVALID
}
validateTimeComponentWithColon() -> State.END
else -> break
}

State.IN_PART -> when {
validateTimeZoneChar() -> State.IN_PART
validateSlash() -> State.AFTER_SLASH
else -> break
}

else -> State.INVALID
State.AFTER_SLASH -> when {
validateTimeZoneInitial() -> State.IN_PART
else -> break
}

State.END, State.INVALID -> break
State.END -> break
}
}

return if (state == State.END) index else lastValidIndex
return index - if (state == State.AFTER_SLASH || state == State.AFTER_INIT_SIGN) 1 else 0
}
}
}
Expand Down
2 changes: 2 additions & 0 deletions core/common/src/internal/util.kt
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,8 @@ package kotlinx.datetime.internal

internal fun Char.isAsciiDigit(): Boolean = this in '0'..'9'

internal fun Char.isAsciiLetter(): Boolean = this in 'A'..'Z' || this in 'a'..'z'

internal fun Char.asciiDigitToInt(): Int = this - '0'

/** Working around the JSR-310 behavior of failing to parse long year numbers even when they start with leading zeros */
Expand Down
Loading