new adventures in date parsing

Besides the famous getYear, ECMAScript date parsing is so badly specified that implementations are full of quirks. Opera has a good share of them.

Some random testing in IE7, Firefox 2 and Opera 9 shows that a simple new Date(somestring) dishes up several surprises:

  • Opera seems to be the only one that tries to parse European date formats, though it defaults to U.S. format for ambiguous dates. Hence, "1/2/2008" is January 2nd while "30/10/2008" is October 30th in Opera.
  • An out-of-range value for one part of the date rolls over into the next larger part in IE and Firefox. In other words, passing in January 32nd returns February 1st.
  • ..but this is more thoroughly implemented in Firefox than in IE: "1/1/2008 23:59:60" gives you January 2nd in the former and NaN in the latter.
  • Opera falls back to values from the computer's current timestamp for any part of the input that doesn't make sense. In other words, if you pass in January 32nd on April 11th, January 11th is the not-quite-expected outcome..
  • The IE and Firefox parsers support both m-d-y and y-m-d formats. So if the input is ambiguous, how do they know which is which? Elementary, dear Watson: if the first integer is 70 or greater it is assumed to be a year. If it is less than 70 it is assumed to be a month. "69-1-1" is obviously a 1906 date while "70-1-1" is new year's day 1970. Ah, the magic of arbitrary heuristics.
  • Speaking of the magic number "70", why can Firefox parse "1 january 70" but say "invalid date" for "1 january 69"??
  • ..and why does Opera refuse to recognise single digits as valid years, just substituting the current year instead? Opera, pretending "1 january 9" is today's date is NOT the correct fix for those Y2K bugs.. 😥
  • Finally, a quirk whose inner workings I'll leave my readers to ponder: if I tell Opera to parse "january 100 2007" on this very day, April 11th 2007, I get "Thu, 11 Jan 2007", which is consistent with the observation that it replaces out-of-range values with current values. Now, who can explain what Opera does if you parse "january 101 2007"?
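The roll-over results in the list are easier to follow with the arithmetic spelled out. The sketch below does not touch any browser's string parser. It feeds numeric parts to Date.UTC, whose overflow behaviour is defined by the specification, to show what "rolling over" means. The helper name utcParts is made up for illustration, and reading "69-1-1" as month 69, day 1, year 1901 is an assumption about how that input could land in 1906.

```javascript
// Build a UTC date from numeric parts. Out-of-range parts roll over
// into the next larger unit, as the spec defines for Date.UTC.
function utcParts(year, monthIndex, day, h = 0, m = 0, s = 0) {
  return new Date(Date.UTC(year, monthIndex, day, h, m, s)).toISOString();
}

// January 32nd rolls over into February.
console.log(utcParts(2008, 0, 32));            // 2008-02-01T00:00:00.000Z

// 23:59:60 on January 1st rolls over into January 2nd.
console.log(utcParts(2008, 0, 1, 23, 59, 60)); // 2008-01-02T00:00:00.000Z

// Assumption: "69-1-1" was read as month 69, day 1, year 1901.
// Month 69 is month index 68, i.e. 5 years and 9 months into 1901.
console.log(utcParts(1901, 68, 1));            // 1906-09-01T00:00:00.000Z
```

Whether a given browser applies the same roll-over to strings is exactly what differs between the three engines tested here.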

Executive summary: good thing we're writing browsers and not time machines!
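To repeat the experiment in another engine, a small harness along these lines may help. It is only a sketch: the name probeDates and the sample strings are illustrative choices (the "1/32/2008" spelling of January 32nd in particular is a guess at the format), and results depend on the engine, the time zone and the system clock, so no expected output is shown.

```javascript
// Run each input through a parser and report an ISO timestamp or
// "Invalid Date". The parser defaults to the engine's new Date(string).
function probeDates(inputs, parse = (s) => new Date(s)) {
  return inputs.map((input) => {
    const date = parse(input);
    const valid = !Number.isNaN(date.getTime());
    return { input, result: valid ? date.toISOString() : "Invalid Date" };
  });
}

// Sample strings taken from, or modelled on, the cases in this post.
const samples = [
  "1/2/2008",
  "30/10/2008",
  "1/32/2008",
  "1/1/2008 23:59:60",
  "69-1-1",
  "70-1-1",
  "1 january 70",
  "1 january 69",
  "1 january 9",
  "january 100 2007",
  "january 101 2007",
];

for (const { input, result } of probeDates(samples)) {
  console.log(JSON.stringify(input), "->", result);
}
```

Passing the parser in as a parameter makes it possible to compare the built-in behaviour against a stricter parser of your own.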

16 thoughts on “new adventures in date parsing”

  1. Can't test, but I would assume that "january 101 2007" would be a day that didn't make sense, therefore being replaced with the current date as well. But for some reason I get the feeling that it does not react this way…haha. Do tell, do tell. Also, did you file a bug report about this?

    [EDIT] getYear seems to be returning a string of length 4 for me in Opera 9, FF2, FF3, and IE7. I'm not using any conditionals in my code. Did something change since your getYear blog? [/EDIT]

  2. @hallvors:

    january 100 2007 -> Wed, 10 Jan 2007 15:00:00 GMT-0800

    january 101 2007 -> Wed, 10 Jan 2007 14:59:00 GMT-0800

    But it's the 11th!! And the time is about 23:20 so the time doesn't make sense…

    You know, you could always write your own implementation of Date to replace the browser's 😉

  3. kyleabaker: it is simple to test. Just type this into the address bar:

    javascript:alert(new Date('January 100 2007'));

    and increase/decrease the "100" to see what happens for different values.

    _Grey_: indeed, observations depend on system time (should have thought of that). You are onto something when you observe the hour/minute part change though. Can you tell what is happening?

  4. Oh, and for getYear: yes, we have decided to go with IE's quirk. The grapevine says Mozilla might too soon, and AFAIK WebKit already does.

  5. @hallvors: I figured it out (as much as possible). It replaced "day of week" with current one (now "saturday"). Then it kind-of decrements the day by one (e.g. 10 instead of 11 on wednesday and now 13 instead of 14 on saturday). The "100" always gives a "15:00" time, the "99" a "15:01", the "101" a "14:59" and so on. One thing I noticed: Timezone is wrong. Not -8, but -7 (DST). "new Date()" gives the "correct" value (according to system time). Weird. Was that the bug Jon introduced last time he coded? 😛

  6. Weird indeed. So what is Opera going to do now? Try to be the first browser to implement javascript date functionality somewhat consistently? 😉

  7. _Grey_: impressive, you are right on there. The nonsense date value gets parsed as a time zone value somewhere! So "January 100" is seen as a date from UTC+10 or something like that (I have not experimented enough to see how it is parsed exactly). Hence, for many time zones Opera thinks it needs to subtract a suitable number of hours to get your local time, and at the right time of the day that will pass midnight, so that overly large dates make Opera randomly default to yesterday's date, depending on your location..

  8. @hallvors: Interesting theory. Indeed makes sense (well, not really, but you know what I mean). Unfortunately, I don't think I can confirm it. It has been wrong <24:00 and >0:00. Possible would be somewhere in the middle of the day, but I really doubt that. Looks like you're going to have to poke a dev with a stick to get more info *hint*.

  9. StoneCypher writes: Also, your blog's captcha is broken; I had to try to post six times before that would go through, on grounds that I was supposedly using the wrong security code when in fact I was not.

  10. Anonymous writes:

    This is incorrect. ECMA 262 page 122, 15.9.4.2 clearly defines string values passed to Date constructors as UTC. Coordinated Universal Time specifies, unambiguously, the format YYYY-MM-DD. Attempting to use any other format is undefined by standard. Opera is dead wrong here. The correct parsing for 01021999 is "undefined". I wish Opera programmers would read standards. 😦 You guys have so much ECMA wrong it's unbelievable.

  11. Attempting to use any other format is undefined by standard. Opera is dead wrong here. The correct parsing for 01021999 is "undefined".

    ..which is exactly why we're in such a mess. When the standard says behaviour is "undefined" it means "do whatever you think makes sense with this input", which for real-web compatibility means "try to do what the others do, somehow". It takes hours and hours to figure that out, believe me. If the standard didn't weasel out by saying the behaviour is "undefined" we wouldn't have such ridiculous date parsers in browsers.

    You guys have so much ECMA wrong it's unbelievable.

    Specifics please. Preferably failing test cases or web pages. Vague claims do not impress me – I know pretty well how compatible we are with the ECMA-262 3rd edition ECMAScript specification.

    Sorry about the captcha issues BTW. The whole site was being updated – perhaps something was not yet in place after the big changes.

  12. @hallvors: regarding this post: I think you were right. I misunderstood what you were saying. As far as I can see, "January 100" defaults to 00:00 on GMT+1. Thus, when the time zone is GMT-8… local time will be 15:00.

    Other observations (times @ GMT+1):

    100-159 = 00:00 – 23:01
    200-259 = 23:00 – 22:01
    300-359 = 22:00 – 21:01
    400-459 = 21:00 – 20:01
    500-559 = 20:00 – 19:01
    600-659 = 19:00 – 18:01
    700-759 = 18:00 – 17:01
    800-859 = 17:00 – 16:01
    900-959 = 16:00 – 15:01

    Anything above 999 will be interpreted as year. This is on Opera 9.6x, may have been different in Merlin builds.
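The ranges in the last comment fit a simple reading: a three-digit "day" is treated as a +hhmm time zone offset, and the result is midnight in that zone converted to the viewer's zone. The sketch below is a hypothetical model of that reading, not Opera's actual code. The function name and parameters are invented for illustration, and it only covers the values listed in the comment (100-159, 200-259 and so on up to 900-959).

```javascript
// Hypothetical model, not Opera's code: read a three-digit "day" token
// as a +hhmm zone offset and return the viewer's local time of day for
// midnight in that zone. viewerOffsetMinutes is minutes east of UTC.
function localTimeForToken(token, viewerOffsetMinutes) {
  const hours = Math.floor(token / 100);
  const minutes = token % 100;
  if (token < 100 || token > 999 || minutes > 59) {
    throw new RangeError("model only covers 100-159, 200-259, ... 900-959");
  }
  const zoneOffset = hours * 60 + minutes;
  // Midnight in the parsed zone, shifted into the viewer's zone and
  // wrapped into the range 0..1439 minutes.
  const local = (((viewerOffsetMinutes - zoneOffset) % 1440) + 1440) % 1440;
  const hh = String(Math.floor(local / 60)).padStart(2, "0");
  const mm = String(local % 60).padStart(2, "0");
  return `${hh}:${mm}`;
}

console.log(localTimeForToken(100, 60));   // 00:00 (viewer at GMT+1)
console.log(localTimeForToken(159, 60));   // 23:01
console.log(localTimeForToken(900, 60));   // 16:00
console.log(localTimeForToken(100, -480)); // 15:00 (viewer at GMT-8)
console.log(localTimeForToken(101, -480)); // 14:59
```

The model reproduces the GMT+1 table and the 15:00 and 14:59 results reported at GMT-8, but it says nothing about which calendar day comes out, nor about two-digit values such as the "99" mentioned earlier in the thread.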
