Skip to content

Parse error on old UTCTimes #123

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
snoyberg opened this issue Oct 22, 2014 · 7 comments
Closed

Parse error on old UTCTimes #123

snoyberg opened this issue Oct 22, 2014 · 7 comments

Comments

@snoyberg
Copy link
Contributor

Example program:

{-# LANGUAGE OverloadedStrings #-}
import Database.PostgreSQL.Simple
import Data.Time

main :: IO ()
main = do
    conn <- connectPostgreSQL "host=localhost port=5432 user=test dbname=test password=test"
    execute_ conn "BEGIN TRANSACTION;"
    execute_ conn "CREATE TABLE old_dates(tz TIMESTAMP WITH TIME ZONE NOT NULL);"
    execute conn "INSERT INTO old_dates(tz) VALUES(?);"
        [ UTCTime (fromGregorian 1900 1 1) 0
        ]

    rows <- query_ conn "SELECT * FROM old_dates"
    mapM_ print (rows :: [Only UTCTime])

The result is:

foo.hs: ConversionFailed {errSQLType = "timestamptz", errSQLTableOid = Just (Oid 43633), errSQLField = "tz", errHaskellType = "UTCTime", errMessage = "endOfInput"}

Looking into this more closely, it appears that- for some older dates- PostgreSQL returns a funky timezone along the lines of +02:03:04.

snoyberg added a commit to yesodweb/persistent that referenced this issue Oct 22, 2014
@snoyberg
Copy link
Contributor Author

It's possible that this is a timezone-specific issue. For the record, my system is set to Asia/Jerusalem.

@snoyberg
Copy link
Contributor Author

To demonstrate this a little more easily, here's some example output from the psql program:

postgres=# select now() - interval '100 years';
              ?column?               
-------------------------------------
 1914-10-22 14:40:54.872265+02:20:40
(1 row)


postgres=# select timestamp '1900-01-01' AT TIME ZONE 'IST';
           timezone           
------------------------------
 1900-01-01 00:20:40+02:20:40
(1 row)

@lpsmith
Copy link
Owner

lpsmith commented Oct 22, 2014

I am aware of this issue, it is timezone specific. See issues #69 and #122. It appears as though these timestamps are almost invariably tied to when a locality moved from a local time standard (usually based on astronomical noon of a large nearby city) to some larger and more encompassing time standard. It isn't the easiest issue to fix in the context of the time package, and I haven't been motivated to work on this issue because I doubt it affects very many people.

A simple workaround for UTCTime is that, assuming you never care about using PostgreSQL's (admittedly very slick) time zone handling, just SET timezone TO 'utc', so that every timestamptz on that connection is returned with offset +00.

snoyberg added a commit to snoyberg/postgresql-simple that referenced this issue Oct 22, 2014
@snoyberg
Copy link
Contributor Author

Fixing the parser to handle that isn't too difficult. I've sent a pull request with that tweak, that also modifies the time for the number of seconds in the timezone, since time's timezone does not allow for second-level resolution.

@lpsmith
Copy link
Owner

lpsmith commented Oct 27, 2014

Ok, I finally did fix this. ZonedTime still chokes on these timestamps, as there really isn't any good fix. But you can now use UTCTime to retrieve these very old timestamps without the SET timezone TO 'utc' workaround. Take a look at b4624db and tell me what you think.

@snoyberg
Copy link
Contributor Author

Looks good!

@lpsmith
Copy link
Owner

lpsmith commented Oct 28, 2014

Regarding ZonedTime, GHC does turn out to have one way to access the tz database, at least on Ubuntu 14.04:

$ ghci-6.8.3
GHCi, version 6.8.3: http://www.haskell.org/ghc/  :? for help
> utcToLocalZonedTime (read "1883-11-18 17:44:59")
1883-11-18 11:59:59 LMT
> utcToLocalZonedTime (read "1883-11-18 17:59:59")
1883-11-18 12:14:59 LMT
> utcToLocalZonedTime (read "1883-11-18 18:00:00")
1883-11-18 12:00:00 CST

I'm not sure exactly what is going on here, but somehow the ZonedTime getting returned is the correct value from the tz database, with the offset rounded using some rule that ended up with the nearest minute in this case. I'll have to do a bit more digging to find out how GHC is arriving at this answer; perhaps it's basically just using some library call or syscall? Compare to psql:

$ psql
psql (9.3.5)
Type "help" for help.

lpsmith=> select '1883-11-18 17:44:59+00'::timestamptz;
         timestamptz          
------------------------------
 1883-11-18 12:00:21-05:44:38
(1 row)

lpsmith=> select '1883-11-18 17:59:59+00'::timestamptz;
         timestamptz          
------------------------------
 1883-11-18 12:15:21-05:44:38
(1 row)

lpsmith=> select '1883-11-18 18:00:00+00'::timestamptz;
      timestamptz       
------------------------
 1883-11-18 12:00:00-06
(1 row)

But of course the time package's conception of TimeZone is still hopelessly naive on a lot of counts, and AFAICT the only (possibly indirect) access it provides to the tz data is via utcToLocalZonedTime. So it seems to me that there isn't any way of dealing with other timezones through the time package, other than changing the timezone your system is set to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants