DateTimeComponents.Format parse throws an exception when timeZoneId() is UTC #528

DmitryNekrasov · 2025-05-16T11:41:01Z

TimeZone parser implementation using Finite State Automaton

This PR introduces a parser for timezone identifiers, built as a finite state automaton that consumes the input character by character through explicit state transitions.

The parser recognizes three categories of timezone formats: named identifiers (UTC, GMT, UT, Z, z); fixed offsets, in notations ranging from a compact single digit (+1, -5) to a fully specified form with colons (+01:02:03, -05:30:15); and combined formats that attach an offset to a named prefix (UTC+01:00, GMT-05:30:45).

The automaton uses six parsing states plus a terminal END state. Parsing begins in START, which recognizes Z/z (completing immediately), a sign (moving to AFTER_SIGN), or a timezone prefix (moving to AFTER_PREFIX). From AFTER_PREFIX, only a sign is valid, and it leads to AFTER_SIGN. AFTER_SIGN consumes either a two-digit hour, moving to AFTER_HOUR, or a single digit, which completes parsing immediately. AFTER_HOUR consumes either colon-separated minutes, moving to AFTER_COLON_MINUTE, or two-digit minutes without a colon, moving to AFTER_MINUTE. Both AFTER_MINUTE and AFTER_COLON_MINUTE can consume a further time component before reaching END; AFTER_COLON_MINUTE specifically expects the seconds to be preceded by a colon.
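The transitions described above can be sketched as a small recognizer. This is only an illustration reconstructed from the description, not the code in ParserOperation.kt: the function name `acceptsZoneId`, the two digit helpers, and the choice to require a match of the whole string are assumptions made for the sketch.

```kotlin
// Illustrative sketch only; names and structure are assumptions, not the PR's code.
private enum class State {
    START, AFTER_PREFIX, AFTER_SIGN, AFTER_HOUR, AFTER_MINUTE, AFTER_COLON_MINUTE, END
}

// Hypothetical helper: true if input[i] and input[i + 1] are both ASCII digits.
private fun twoDigitsAt(input: String, i: Int): Boolean =
    i + 1 < input.length && input[i] in '0'..'9' && input[i + 1] in '0'..'9'

// Hypothetical helper: true if input[i] is ':' and two ASCII digits follow it.
private fun colonAndTwoDigitsAt(input: String, i: Int): Boolean =
    i < input.length && input[i] == ':' && twoDigitsAt(input, i + 1)

// Returns true if the whole string follows the described grammar.
fun acceptsZoneId(input: String): Boolean {
    var state = State.START
    var i = 0
    while (i < input.length) {
        when (state) {
            State.START -> when {
                input[i] == 'Z' || input[i] == 'z' -> { state = State.END; i += 1 }
                input[i] == '+' || input[i] == '-' -> { state = State.AFTER_SIGN; i += 1 }
                else -> {
                    // "UTC" is tried before "UT" so that the longer prefix wins.
                    val prefix = listOf("UTC", "GMT", "UT")
                        .firstOrNull { input.startsWith(it, i) } ?: return false
                    state = State.AFTER_PREFIX
                    i += prefix.length
                }
            }
            State.AFTER_PREFIX -> when {
                input[i] == '+' || input[i] == '-' -> { state = State.AFTER_SIGN; i += 1 }
                else -> return false
            }
            State.AFTER_SIGN -> when {
                twoDigitsAt(input, i) -> { state = State.AFTER_HOUR; i += 2 }
                input[i] in '0'..'9' -> { state = State.END; i += 1 } // single-digit shortcut
                else -> return false
            }
            State.AFTER_HOUR -> when {
                colonAndTwoDigitsAt(input, i) -> { state = State.AFTER_COLON_MINUTE; i += 3 }
                twoDigitsAt(input, i) -> { state = State.AFTER_MINUTE; i += 2 }
                else -> return false
            }
            State.AFTER_MINUTE -> when {
                twoDigitsAt(input, i) -> { state = State.END; i += 2 } // seconds without a colon
                else -> return false
            }
            State.AFTER_COLON_MINUTE -> when {
                colonAndTwoDigitsAt(input, i) -> { state = State.END; i += 3 } // ":ss"
                else -> return false
            }
            State.END -> return false // trailing characters after a complete identifier
        }
    }
    // The input may end after a bare prefix, after the hour, after the minutes, or in END.
    return state != State.START && state != State.AFTER_SIGN
}
```

In this sketch, `acceptsZoneId("UTC+01:00")` and `acceptsZoneId("+1")` return true, while `acceptsZoneId("UTC+")` returns false because a sign must be followed by at least one digit.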

A significant design decision is the removal of range validation: the parser accepts syntactically correct but out-of-range values such as +99:99:99. The parser is responsible only for recognizing the format; checking that the values form a valid offset is left to downstream components.
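To illustrate this split between syntax and semantics, here is a minimal sketch of what a downstream range check could look like. The function `offsetSecondsOrNull` is hypothetical and not part of this PR, and the ±18:00 limit is an assumption of the sketch.

```kotlin
// Hypothetical downstream check; not part of the PR.
// Assumes offsets must lie within ±18:00 and that minutes and seconds are in 0..59.
fun offsetSecondsOrNull(sign: Char, hours: Int, minutes: Int = 0, seconds: Int = 0): Int? {
    // Reject components that are syntactically fine but out of range, e.g. 99:99:99.
    if (hours !in 0..18 || minutes !in 0..59 || seconds !in 0..59) return null
    val total = hours * 3600 + minutes * 60 + seconds
    // 18:00:01 passes the per-component check but exceeds the assumed limit.
    if (total > 18 * 3600) return null
    return if (sign == '-') -total else total
}
```

With this sketch, a string such as +99:99:99 is still recognized by the format parser, but `offsetSecondsOrNull('+', 99, 99, 99)` yields null, so the rejection happens after parsing rather than during it.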

The tests cover malformed inputs, boundary conditions, and format variations, confirming that valid timezone identifiers are accepted and invalid formats are rejected.

DmitryNekrasov · 2025-05-16T11:44:37Z

@dkhalanskyjb Dmitry, hello! Could you please look at the added tests? Do they correctly document the expected behavior and fully cover the possible scenarios? Thank you!

core/common/test/TimeZoneTest.kt

core/common/src/internal/format/parser/ParserOperation.kt

dkhalanskyjb

Yes, this should do the trick, I only have minor suggestions. Please squash when merging and mention Fixes #444 in the full commit message.

core/common/test/format/DateTimeComponentsFormatTest.kt

core/common/src/internal/format/parser/ParserOperation.kt

dkhalanskyjb · 2025-05-26T12:02:23Z

When squash-merging, we also leave the (#528) at the end of the initial commit message line so that it's easier to find out what PR introduced the change. Here, #444 is enough of a clue, so it's fine, but in the future, please keep the PR number.

DmitryNekrasov added 10 commits May 16, 2025 14:55

#444: Add testSpecialNamedTimezones

d38a5a6

#444: Add testFixedOffsets

3714a12

#444: Add testUTCGMTWithOffsets

8ba6552

#444: Add testTimezoneDBIdentifiers

7b6f6e5

#444: Remove import

93ba0db

#444: Update date on license

d087dbb

#444: Rollback formatting

1229eee

#444: Fix compilation

e55e74a

#444: Fix formatting

5f6e5c4

#444: Fix formatting

67e0d49

DmitryNekrasov requested a review from dkhalanskyjb May 16, 2025 11:41

DmitryNekrasov marked this pull request as draft May 16, 2025 11:41

dkhalanskyjb reviewed May 16, 2025

core/common/test/TimeZoneTest.kt (outdated)

DmitryNekrasov self-assigned this May 16, 2025

DmitryNekrasov added the bug Something isn't working label May 16, 2025

DmitryNekrasov added 10 commits May 19, 2025 12:26

#444: Fix timeZoneId check in assertTimeZoneIdCanBeParsed

ffe3097

#444: Remove "SYSTEM" from test

6fe71ba

#444: Add reject tests

5ef14dc

#444: Add "Z" parsing

801ebf7

#444: Add validateTimezone method

672ffdd

#444: Refactor validateTimezone

a90171d

#444: Refactoring

04706c3

#444: Add validateTimezone call

ea983ee

#444: Refactoring

7950367

#444: Add test testParseUntilRightBound, fix bugs

d8a1cd7

DmitryNekrasov requested review from fzhinkin and dkhalanskyjb May 19, 2025 13:26

DmitryNekrasov marked this pull request as ready for review May 19, 2025 13:26

#444: Refactoring

5b5efe3

DmitryNekrasov added 2 commits May 23, 2025 12:55

#444: Fix test

7798db6

#444: Fix test

612e218

dkhalanskyjb requested changes May 23, 2025

DmitryNekrasov added 11 commits May 23, 2025 14:48

#444: Refactor tests

bf48fb1

#444: Refactor tests

66bc22c

#444: Move tests from TimeZoneTest.kt to DateTimeComponentsFormatTest.kt

dcd39ef

#444: Rollback TimeZoneTest.kt

7b4efee

#444: Refactoring

3a66257

#444: Simplify validateTimeComponent

748bb64

#444: Simplify validatePrefix

4587cf0

#444: Made enum class State private

59ce3ac

#444: Add +12:3456 and +1234:56 test cases

989a30e

#444: Handle +12:3456 and +1234:56 test cases

5a000d7

#444: Refactoring

9c412b9

DmitryNekrasov requested a review from dkhalanskyjb May 23, 2025 11:54

DmitryNekrasov added 3 commits May 23, 2025 16:18

#444: Fix license

4791998

#444: Fix tests

ce340d8

#444: Add assertFailsWith<IllegalTimeZoneException> { TimeZone.of(zoneId) } to SHOULD_PARSE_INCORRECTLY case

c9d7b59

dkhalanskyjb approved these changes May 26, 2025

DmitryNekrasov added 2 commits May 26, 2025 14:58

#444: Fix comment

28fe1bb

#444: Fix comments in tests

1940587

dkhalanskyjb mentioned this pull request May 26, 2025

TimeZone.of("z") should work the same as calling from capital "Z" #530

Merged

DmitryNekrasov added 4 commits May 26, 2025 15:03

#444: Remove startIndex check

5c6d18b

#444: Convert if to singre expression fr return

5c14bbe

#444: Refactor tests

a80dda5

#444: Refactor tests

63b82ef

DmitryNekrasov merged commit 2094289 into master May 26, 2025
1 check passed

DmitryNekrasov deleted the dmitry.nekrasov/bugfix/444 branch May 26, 2025 11:57

dkhalanskyjb mentioned this pull request Jun 2, 2025

DateTimeComponents.Format.parse() should parse timeZoneId() using the Temporal grammar instead of checking if the time zone exists in the time zone database #532

Merged
