Skip to content

DateTimeComponents.Format parse throws an exception when timeZoneId() is UTC #528

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 51 commits into from
May 26, 2025

Conversation

DmitryNekrasov
Copy link
Contributor

@DmitryNekrasov DmitryNekrasov commented May 16, 2025

TimeZone parser implementation using Finite State Automaton

This PR introduces a comprehensive parser for timezone identifiers built around a finite state automaton that processes input character by character through well-defined state transitions.

The parser recognizes several categories of timezone formats. Named timezone identifiers (UTC, GMT, UT, Z, z). Fixed offset formats support multiple notation styles ranging from compact single-digit representations (+1, -5) to fully specified formats with colons (+01:02:03, -05:30:15). Combined formats could merge named prefixes with offsets (UTC+01:00, GMT-05:30:45).

Automata

The finite state automaton architecture employs six distinct parsing states that guide the recognition process. The parser begins in the START state, recognizing either immediate completion tokens (Z/z), sign indicators leading to AFTER_SIGN, or timezone prefixes transitioning to AFTER_PREFIX. From AFTER_PREFIX, only sign indicators are valid, advancing to AFTER_SIGN. The AFTER_SIGN state processes either two-digit hour components leading to AFTER_HOUR or single-digit shortcuts that complete parsing immediately. The AFTER_HOUR state handles either colon-separated minute components transitioning to AFTER_COLON_MINUTE or direct two-digit minute parsing advancing to AFTER_MINUTE. Both AFTER_MINUTE and AFTER_COLON_MINUTE states can process additional time components, with AFTER_COLON_MINUTE specifically handling colon-prefixed seconds before reaching the END state.

A significant architectural decision involves the removal of temporal validation constraints, allowing the parser to accept mathematically invalid but syntactically correct formats such as +99:99:99. This design choice prioritizes format recognition over semantic validation, enabling downstream components to handle temporal logic validation as needed.

Comprehensive test coverage validates the parser's behavior across diverse scenarios, including edge cases involving malformed inputs, boundary conditions, and format variations. The tests confirm correct acceptance of valid timezone identifiers while appropriately rejecting invalid formats.

@DmitryNekrasov DmitryNekrasov marked this pull request as draft May 16, 2025 11:41
@DmitryNekrasov
Copy link
Contributor Author

@dkhalanskyjb Dmitry, hello! Could you please look at the added tests, do they correctly document the expected behavior and fully cover possible scenarios? Thank you!

@DmitryNekrasov DmitryNekrasov self-assigned this May 16, 2025
@DmitryNekrasov DmitryNekrasov added the bug Something isn't working label May 16, 2025
@DmitryNekrasov DmitryNekrasov marked this pull request as ready for review May 19, 2025 13:26
Copy link
Collaborator

@dkhalanskyjb dkhalanskyjb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this should do the trick, I only have minor suggestions. Please squash when merging and mention Fixes #444 in the full commit message.

@DmitryNekrasov DmitryNekrasov merged commit 2094289 into master May 26, 2025
1 check passed
@DmitryNekrasov DmitryNekrasov deleted the dmitry.nekrasov/bugfix/444 branch May 26, 2025 11:57
@dkhalanskyjb
Copy link
Collaborator

When squash-merging, we also leave the (#528) at the end of the initial commit message line so that it's easier to find out what PR introduced the change. Here, #444 is enough of a clue, so it's fine, but in the future, please keep the PR number.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants