DateTimeComponents.Format.parse() should parse timeZoneId() using the Temporal grammar instead of checking if the time zone exists in the time zone database #532

DmitryNekrasov · 2025-05-29T13:10:53Z

Summary

This PR changes how DateTimeComponents.Format.parse() handles time zone IDs. Instead of checking parsed IDs against the system's time zone database, the parser now accepts any identifier that matches the Temporal specification's time zone grammar. The same grammar-based approach is applied to offset time zones.

The implementation follows the Temporal grammar for time zones, which defines syntactic rules for valid time zone identifiers:

time-zone-initial = ALPHA / "." / "_"
time-zone-char    = time-zone-initial / DIGIT / "-" / "+"
time-zone-part    = time-zone-initial *time-zone-char
time-zone-name    = time-zone-part *("/" time-zone-part)

Previously, when a string containing a time zone ID was parsed using DateTimeComponents.Format.parse() with timeZoneId(), the parser required the time zone to exist in the system's database. Offset time zones were validated by a separate finite state automaton (FSA) that followed different parsing rules.

This PR unifies the two paths by replacing the previous offset FSA with a single implementation that handles both offset formats and named time zones according to the Temporal grammar. The new finite state automaton processes the input character by character and transitions between states based on the grammar productions, whether the identifier is a named zone like "America/New_York" or an offset like "+01:00".
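The named-zone half of such an automaton can be sketched as follows. This is only an illustration of the grammar above, not the code from this PR: `longestTimeZoneNameEnd`, the `State` enum, and the character helpers are hypothetical names, and offset formats such as "+01:00" are deliberately left out.

```kotlin
// Hypothetical sketch of a recognizer for the time-zone-name production.
// It is NOT the library's internal ParserOperation; offsets are not handled.

private fun Char.isAsciiLetter() = this in 'a'..'z' || this in 'A'..'Z'
private fun Char.isAsciiDigit() = this in '0'..'9'

// time-zone-initial = ALPHA / "." / "_"
private fun Char.isTimeZoneInitial() = isAsciiLetter() || this == '.' || this == '_'

// time-zone-char = time-zone-initial / DIGIT / "-" / "+"
private fun Char.isTimeZoneChar() =
    isTimeZoneInitial() || isAsciiDigit() || this == '-' || this == '+'

private enum class State { START, IN_PART, AFTER_SLASH }

/**
 * Returns the exclusive end index of the longest time-zone-name that starts
 * at [start] in [input], or null if no time-zone-name starts there.
 */
fun longestTimeZoneNameEnd(input: CharSequence, start: Int = 0): Int? {
    var state = State.START
    var lastAccepted: Int? = null
    var i = start
    scan@ while (i < input.length) {
        val c = input[i]
        state = when (state) {
            // A part must begin with a time-zone-initial character.
            State.START, State.AFTER_SLASH ->
                if (c.isTimeZoneInitial()) State.IN_PART else break@scan
            // Inside a part: continue it, or start a new part after "/".
            State.IN_PART -> when {
                c.isTimeZoneChar() -> State.IN_PART
                c == '/' -> State.AFTER_SLASH
                else -> break@scan
            }
        }
        i++
        // Only IN_PART is an accepting state; a trailing "/" is not consumed.
        if (state == State.IN_PART) lastAccepted = i
    }
    return lastAccepted
}
```

Remembering the last accepting position lets the sketch return the longest valid prefix, so a trailing "/" or "]" is left for whatever directive follows in the format.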

The unified parser accepts any syntactically valid time zone identifier according to the Temporal specification and defers actual time zone validation to the point of usage, such as when creating a TimeZone object. Named time zones and offset formats therefore follow the same validation principles.

This change relaxes parsing constraints across all time zone formats: input that parsed before still parses, and syntactically valid time zone IDs that were previously rejected now parse as well. Code that relied on parse-time validation of time zone existence will need to handle validation separately, because errors for non-existent time zones now surface later, when the time zone is actually used.

The test suite has been expanded to verify parsing of various time zone formats according to the Temporal grammar, including both named zones and offset formats. The tests check that previously valid time zones continue to parse correctly, cover edge cases, and confirm that malformed identifiers are rejected.

Example

// `format` is assumed to be a DateTimeComponents.Format that contains timeZoneId() in brackets.
// Before: parse() would fail if "Custom/TimeZone" doesn't exist in the database.
// After: parse() succeeds because the ID is syntactically valid.
val parsed = format.parse("2024-01-01T12:00:00[Custom/TimeZone]")

// Validation happens here: TimeZone.of throws IllegalTimeZoneException for unknown IDs.
val timeZone = TimeZone.of(parsed.timeZoneId!!)
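Code that relied on parse-time validation can reintroduce the check at the point of use. A minimal sketch, assuming kotlinx-datetime is on the classpath; `timeZoneOrNull` is a hypothetical helper and not part of the library:

```kotlin
import kotlinx.datetime.IllegalTimeZoneException
import kotlinx.datetime.TimeZone

// Hypothetical helper: resolves a parsed time zone ID, returning null when
// the ID is absent or does not name a time zone known to the database.
fun timeZoneOrNull(id: String?): TimeZone? =
    if (id == null) {
        null
    } else {
        try {
            TimeZone.of(id)
        } catch (e: IllegalTimeZoneException) {
            null
        }
    }
```

A caller could then write `timeZoneOrNull(parsed.timeZoneId)` and treat a null result as the validation failure that parse() used to report.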

Introduced NamedTimezoneParserOperation to handle named time zone validation within parsing structures. Refactored TimeZoneIdDirective to simplify its implementation by removing the knownZones property and incorporating the new parser operation for named time zones, enhancing flexibility and maintainability.

dkhalanskyjb

The documentation for kotlinx.datetime.format.DateTimeFormatBuilder.WithDateTimeComponents#timeZoneId is no longer valid with this. Please update it.

dkhalanskyjb

Good job! The documentation could be improved a bit, and I have a few stylistic suggestions, but other than that, the state machine looks correct and well implemented.

Replace existing validation with a unified finite state automaton that implements RFC 9557 grammar for all time zone identifiers (both named zones and offsets). Parse-time validation is removed; time zones are now validated only when used. Syntactically valid but non-existent time zones will now parse successfully. Validation errors occur when creating TimeZone objects, not during parsing. Fixes #531

DmitryNekrasov added 6 commits May 29, 2025 15:52

#531: Rename TimeZoneParserOperation to OffsetTimezoneParserOperation

173d1d6

#531: Refactor ParserOperation to introduce abstract timezone valid…

4ca3eb4

…ation. Extracted common timezone validation logic into an abstract `TimezoneParserOperation` class. `OffsetTimezoneParserOperation` now extends this class

#531: Remove unused StringFieldFormatDirective class

1cc0064

The `StringFieldFormatDirective` class was removed as it was no longer being used

#531: Refactor timezone parsing and improve validation logic

ee72c03

Replaced `TODO` implementation in `NamedTimezoneParserOperation` with a state-based validation logic to handle complex timezone formats

#531: Enhance timezone character validation with isAsciiDigit check

cdbe0a5

Replaced `isDigit` with `isAsciiDigit` in timezone validation logic to ensure stricter control over accepted characters.

DmitryNekrasov self-assigned this May 29, 2025

DmitryNekrasov added the "timezone" label (The model and API of timezones) May 29, 2025

DmitryNekrasov added 6 commits May 30, 2025 11:38

#531: Remove redundant test case for invalid time zone parsing

9f61e22

The removed test case was unnecessary as it tested an invalid time zone scenario that the core parsing logic already handles elsewhere

#531: Refactor timezone to TimeZone naming for consistency

d839d6e

#531: Add isAsciiLetter utility function and update time zone validat…

a8c87bd

…ion to use it

#531: Simplify state handling logic in TimeZoneParserOperation

7087b34

#531: Rename and clean up timezone test data, remove unused identifiers

835db00

#531: Add comprehensive unit tests for parsing named time zones

f19e3aa

DmitryNekrasov marked this pull request as ready for review May 30, 2025 09:55

DmitryNekrasov requested a review from dkhalanskyjb May 30, 2025 09:55

dkhalanskyjb requested changes May 30, 2025

DmitryNekrasov added 7 commits May 30, 2025 16:43

#531: Simplify TimeZoneParserOperation by removing unnecessary AFTER_…

38340ae

…SLASH state

#531: Simplify time zone validation logic by removing redundant state…

8592f51

… management.

#531: Simplify time zone parsing logic in ParserOperation

d071b08

#531: Refactor TimeZoneIdDirective to use specific type

1d4b6c8

#531: Refactor TimeZoneIdDirective to use method reference for getter

e6383b2

#531: Add the next test case:

a1d4110

DateTimeComponents.Format { timeZoneId(); char('/') }.parse("$zoneId/")

#531: Expand and clarify timezone identifier handling in DateTimeForm…

964c3cd

…atBuilder documentation.

DmitryNekrasov requested a review from dkhalanskyjb May 30, 2025 14:02

DmitryNekrasov added 3 commits June 2, 2025 12:39

#531: Simplify TimeZone parsing logic by merging and refactoring oper…

8f3668c

…ations

#531: Refactor time zone validation logic into reusable helper functions

f874c75

#531: Clarify timezone ID parsing to prefer longest matching ID

fec0efb

dkhalanskyjb reviewed Jun 2, 2025

dkhalanskyjb reviewed Jun 2, 2025

dkhalanskyjb self-requested a review June 2, 2025 11:00

#531: Add offset parsing with brackets test and fix prefix validation…

ec80910

… in parser

dkhalanskyjb reviewed Jun 2, 2025

dkhalanskyjb self-requested a review June 2, 2025 11:08

dkhalanskyjb approved these changes Jun 2, 2025

DmitryNekrasov added 8 commits June 2, 2025 15:39

#531: Refactor parser operations to use Boolean.onTrue and `Boolean…

bac5a11

….onFalse` for cleaner and more concise logic.

#531: Simplify null check logic with onNotNull utility method integ…

0f0a76b

…ration

#531: Fix type parameter usage in onNotNull to improve code clarity…

744d90f

… and consistency.

#531: Refactor non-terminal state check

ce4a5a4

#531: Remove unused onNotNull function and simplify prefix validation

6d6c0f9

#531: Fix punctuation in DateTimeFormatBuilder documentation

c5a97ed

#531: Update DateTimeFormatBuilder to reference RFC 9557 grammar for …

9b75a71

…timezone parsing

#531: Clarify offset-based timezone format documentation in DateTimeF…

8a21c75

…ormatBuilder.

DmitryNekrasov requested a review from dkhalanskyjb June 2, 2025 12:06

dkhalanskyjb approved these changes Jun 2, 2025

DmitryNekrasov added 2 commits June 2, 2025 16:56

#531: Refactor to reuse assertParseableAsNamedTimeZone for time zon…

0dfe0e2

…e parsing tests

#531: Add tests for parsing time zones with delimiters and remove red…

19af264

…undant bracket parsing test

DmitryNekrasov merged commit 3e68613 into master Jun 2, 2025
1 check passed

DmitryNekrasov deleted the dmitry.nekrasov/feature/531 branch June 2, 2025 14:22
