Consistently parsing a time both before and after the DST switch
Hello.
I would like to use java.text.SimpleDateFormat in order to parse a String containing a date, time, and timezone without needing to specify a different timezone depending on whether I'm using daylight savings time or not. That is, consider the following program:
import java.text.*;

public class TZ
{
    public static void main(String[] args) throws Exception
    {
        DateFormat format = new SimpleDateFormat("yyyyMMdd HHmmss z");
        System.out.println(format.parse(args[0]));
    }
}
Now take a look at the following output (the program was run on a computer in the US/Eastern timezone, though that probably isn't relevant):
> java TZ "20070310 120000 EST"
Sat Mar 10 12:00:00 EST 2007
The above example looks fine. The time in question was during standard time, and we print out a Date that seems to be the same moment in time.
> java TZ "20070312 120000 EST"
Mon Mar 12 13:00:00 EDT 2007
In this example, we've moved forward two days into the daylight savings time period. It appears that SimpleDateFormat, upon seeing the "EST" in the string to parse, is assuming that I really mean standard time here, despite the fact that the day should be on daylight savings time. This is a little weird, but could be reasonable, since the String is using "EST" explicitly.
> java TZ "20070312 120000 EDT"
Mon Mar 12 12:00:00 EDT 2007
In the above example, we switched the String to use EDT explicitly, and this did what we wanted. However, it's unfortunate that we had to modify the String in order to get the Date we wanted. So let's try again below.
> java TZ "20070312 120000 EST5EDT"
Mon Mar 12 13:00:00 EDT 2007
This last example is really where I was disappointed. I was hoping that by using "EST5EDT" here, the SimpleDateFormat would realize that I wanted it to parse the String using whatever DST behavior was correct at that particular moment. Unfortunately, it seems to treat "EST5EDT" identically to "EST" here.
The only way I can see to fix this is to avoid using a TimeZone format parameter in the SimpleDateFormat entirely, and instead parse the TimeZone independently. That is, consider this program:
import java.text.*;
import java.util.TimeZone;

public class TZFixed
{
    public static void main(String[] args) throws Exception
    {
        DateFormat format = new SimpleDateFormat("yyyyMMdd HHmmss");
        format.setTimeZone(TimeZone.getTimeZone(args[1]));
        System.out.println(format.parse(args[0]));
    }
}
Now look at these runs:
> java TZFixed "20070310 120000" EST5EDT
Sat Mar 10 12:00:00 EST 2007
> java TZFixed "20070312 120000" EST5EDT
Mon Mar 12 12:00:00 EDT 2007
Perfect! The code parses the time using the correct DST behavior both before and after the DST switch. However, the code is uglier than simply using SimpleDateFormat. And this difference seems somewhat inconsistent. Why would SimpleDateFormat behave differently if the TimeZone is set programmatically versus being set by the "z" format character?
Can anyone shed any insight on this? Any way to get this to work just using the SimpleDateFormat String?
Thanks,
-Neil
[4171 byte] By [katzna] at [2007-11-26 21:29:50]

> Now take a look at the following output (the program
> was run on a computer in the US/Eastern timezone,
> though that probably isn't relevant):
>
> > java TZ "20070310 120000 EST"
> Sat Mar 10 12:00:00 EST 2007
>
> The above example looks fine. The time in question
> was during standard time, and we print out a Date
> that seems to be the same moment in time.
>
> > java TZ "20070312 120000 EST"
> Mon Mar 12 13:00:00 EDT 2007
>
> In this example, we've moved forward two days into
> the daylight savings time period. It appears that
> SimpleDateFormat, upon seeing the "EST" in the string
> to parse, is assuming that I really mean standard
> time here, despite the fact that the day should be on
> daylight savings time. This is a little weird, but
> could be reasonable, since the String is using "EST"
> explicitly.
>
You took an EST time, parsed it, and then printed it using the default timezone.
The fact that your default timezone is EDT is entirely irrelevant to how the string was parsed.
> > java TZ "20070312 120000 EDT"
> Mon Mar 12 12:00:00 EDT 2007
>
> In the above example, we switched the String to use
> EDT explicitly, and this did what we wanted.
> However, it's unfortunate that we had to modify the
> String in order to get the Date we wanted. So let's
> try again below.
>
Did what you wanted?
You took a different timestamp value (in the string). Parsed it. Then printed it using the default time zone.
As a guess your problem is actually with the following line.
System.out.println(format.parse(args[0]));
That line takes a Date and converts it to the local (default) timezone.
If you want to print to a specific timezone then use a specific timezone.
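To illustrate the point about printing, here is a minimal sketch that formats a Date in an explicitly chosen zone instead of relying on Date.toString(), which always uses the JVM default zone. The class name PrintInZone and the helper formatIn are made up for this example, and Locale.US is an assumption chosen only so the zone abbreviation prints predictably.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class PrintInZone
{
    // Formats the instant held by 'date' as wall-clock time in the named zone.
    // Note: TimeZone.getTimeZone() falls back to GMT for IDs it does not know.
    public static String formatIn(Date date, String zoneId)
    {
        DateFormat out = new SimpleDateFormat("yyyyMMdd HHmmss z", Locale.US);
        out.setTimeZone(TimeZone.getTimeZone(zoneId));
        return out.format(date);
    }

    public static void main(String[] args)
    {
        // 2007-03-12 16:00:00 UTC, i.e. noon in New York on that day.
        Date date = new Date(1173715200000L);
        System.out.println(formatIn(date, "America/New_York"));
        System.out.println(formatIn(date, "UTC"));
    }
}
```

The same Date prints as two different wall-clock times, which is why the output of System.out.println(date) alone says nothing about how the string was parsed.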
I don't think I communicated my desire clearly, so let me try again.
I want to be able to parse a date with SimpleDateFormat that is in US/Eastern time, without explicitly saying whether it's in standard or daylight savings mode. That is, for the date "20070310 12:00:00 EST5EDT", I want SimpleDateFormat to parse this as if the timezone was specified as "EST", but for "20070312 12:00:00 EST5EDT", I want SimpleDateFormat to parse this as if the timezone was specified as "EDT". My original post shows this to be impossible, at least when using the "z" character in the SimpleDateFormat format string to specify the location of the TimeZone. The only way to do it is to parse the TimeZone separately from the rest of the date, as shown in the TZTest class in the original post.
So this leaves me with two questions:
1. Is there a way to get the behavior of TZTest without having to read the TimeZone separately from the format string? That is, I would really like to use "yyyyMMdd HHmmss z" as my format string and get SimpleDateFormat to behave like TZTest, but I can't figure out how to do that.
2. Why have this inconsistency between the TZ and TZTest classes? This seems inconsistent to me.
-Neil
I apologize -- in the previous post, I meant "TZFixed" where I wrote "TZTest".
-Neil
> I don't think I communicated my desire clearly, so
> let me try again.
>
> I want to be able to parse a date with
> SimpleDateFormat that is in US/Eastern time, without
> explicitly saying whether it's in standard or
> daylight savings mode.
Ok. But that doesn't have anything to do with EDT versus EST.
Those represent two different timezone indicators.
A location can be set up to use EST where they do NOT adjust for DST. That is the differentiation.
So how are you going to differentiate that if you are trying to change the meaning of EST?
Have you considered that your source is simply using the wrong indicator? If so then you would need to map it perhaps with an external configuration file that indicates that they are inconsistent.
> 2. Why have this inconsistency between the TZ and
> TZTest classes? This seems inconsistent to me.
What is 'TZTest'?
By the way, those three-character codes like EST have been deprecated for some time. (See the API documentation for TimeZone.) So you don't have any grounds for complaint if they don't work the way you like.
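For anyone who wants to see how a given JDK treats these IDs, the following sketch asks java.util.TimeZone whether each zone observes DST at all and whether it is in DST at two instants on either side of the 2007 switch. The class name ZoneIds and the helper inDst are invented for the example. What "EST" resolves to has varied between JDK releases, so no output is shown for it here.

```java
import java.util.Date;
import java.util.TimeZone;

public class ZoneIds
{
    // True if the named zone is observing daylight saving time at 'when'.
    // Unknown IDs silently resolve to GMT, which never reports DST.
    public static boolean inDst(String zoneId, Date when)
    {
        return TimeZone.getTimeZone(zoneId).inDaylightTime(when);
    }

    public static void main(String[] args)
    {
        Date march10 = new Date(1173546000000L); // 2007-03-10 17:00:00 UTC
        Date march12 = new Date(1173715200000L); // 2007-03-12 16:00:00 UTC
        String[] ids = {"EST", "EST5EDT", "America/New_York"};
        for (String id : ids) {
            TimeZone tz = TimeZone.getTimeZone(id);
            System.out.println(id
                + ": observes DST=" + tz.useDaylightTime()
                + ", in DST on Mar 10=" + inDst(id, march10)
                + ", in DST on Mar 12=" + inDst(id, march12));
        }
    }
}
```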
I think the core issue may be being confused here, as is wont to happen with any discussion involving timezones. Let me try to boil it down again.
I'm writing a program which consumes strings of the form "day time timezone", and I want to convert those strings to java.util.Date objects. Some of the strings have timezones of "EST5EDT". In this case, I'm supposed to interpret the time as if it's what would ordinarily be reported on the day in question, adjusted for daylight savings time if necessary.
That is, the string "20070310 120000 EST5EDT" should be interpreted as if it's noon EST, because 20070310 is a standard time day. However, the string "20070312 120000 EST5EDT" should be interpreted as if it's noon EDT, because 20070312 is a daylight savings time day.
Unfortunately, SimpleDateFormat doesn't do that. Running the "TZ" class from the original post in this thread, we see this:
> java TZ "20070310 120000 EST5EDT"
Sat Mar 10 12:00:00 EST 2007
> java TZ "20070312 120000 EST5EDT"
Mon Mar 12 13:00:00 EDT 2007
Notice how the second string is interpreted as 1:00pm, instead of noon. That is, SimpleDateFormat is pretending I used a string timezone of "EST", when in fact, I used "EST5EDT".
This seems especially bizarre when you consider that when you set the TimeZone programmatically (that is, using DateFormat.setTimeZone() rather than the "z" format character), the code seems to work properly. This is demonstrated by the TZFixed class in the original post in this thread:
> java TZFixed "20070310 120000" EST5EDT
Sat Mar 10 12:00:00 EST 2007
> java TZFixed "20070312 120000" EST5EDT
Mon Mar 12 12:00:00 EDT 2007
Note that now, both dates are interpreted as noon, as I would expect.
I think there are two possibilities:
1. There is a bug in SimpleDateFormat.
2. This is the desired behavior of SimpleDateFormat. If so, it would be interesting to hear the rationale.
In any case, TZFixed presents the solution to this problem. I just don't think it's a very good solution, since using the "z" format character when constructing my SimpleDateFormat is much easier than having to extract the TimeZone from the String myself and parse it separately. Does anyone know of a way to make the TZ class behave like the TZFixed class without having to parse the TimeZone separately?
Thanks,
-Neil
What happens if you do this:
java TZ "20070312 120000 America/New_York"
Seems to me if you expect it to work for EST5EDT then it should equally well work for America/New_York. The documentation for SimpleDateFormat doesn't have enough information for me to guess what I expect to see as the output.
> I think the core issue may be being confused here, as
> is wont to happen with any discussion involving
> timezones. Let me try to boil it down again.
Ok... Let's try this... Use setLenient(false) before you parse any of the strings.
Responding to Dr. Clap's question:
It appears that the "z" format won't handle America/New_York:
> java TZ "20070312 120000 America/New_York"
Exception in thread "main" java.text.ParseException: Unparseable date: "20070312 120000 America/New_York"
at java.text.DateFormat.parse(DateFormat.java:337)
at TZ.main(TZ.java:8)
This is also somewhat odd, since it's certainly possible to select America/New_York programmatically.
-Neil
Responding to jschell's question:
Using setLenient(false) doesn't seem to affect the behavior:
import java.text.*;

public class TZNonLenient
{
    public static void main(String[] args) throws Exception
    {
        DateFormat format = new SimpleDateFormat("yyyyMMdd HHmmss z");
        format.setLenient(false);
        System.out.println(format.parse(args[0]));
    }
}
> java TZNonLenient "20070312 120000 EST5EDT"
Mon Mar 12 13:00:00 EDT 2007
> I think there are two possibilities:
>
> 1. There is a bug in SimpleDateFormat.
> 2. This is the desired behavior of SimpleDateFormat.
I think we have a third possibility. You can't call it a bug because the documentation doesn't say how it's going to work. The documentation doesn't even say what you can parse as a timezone name apart from RFC-whatever-it-was standard names. But on the other hand I wouldn't call it desired behaviour either.
> Responding to jschell's question:
>
> Using setLenient(false) doesn't seem to affect the
> behavior:
>
If you use ESTXXX instead of EST5EDT, you should find that it returns the same result. setLenient() makes no difference.
So it only parses what it needs, just the first three chars, and ignores the rest.
So in terms of a solution you need to do one of the following.
1. Have the source change the way they create timestamp strings.
2. Parse out the timezone and interpret it yourself correctly. I suspect a simple mapping might be sufficient.
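Option 2 might look roughly like the sketch below. It assumes the zone is always the last space-separated token of the input, as in the strings shown in this thread. The class name TZSplit and the contents of the ZONE_MAP table are hypothetical; the table only shows where a source-specific mapping could go.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;
import java.util.TimeZone;

public class TZSplit
{
    // Hypothetical mapping from labels used by the source to Java zone IDs.
    private static final Map<String, String> ZONE_MAP = new HashMap<String, String>();
    static {
        ZONE_MAP.put("EST5EDT", "America/New_York");
    }

    // Parses "yyyyMMdd HHmmss <zone>" by handling the zone token separately.
    public static Date parse(String text)
    {
        int split = text.lastIndexOf(' ');
        if (split < 0) {
            throw new IllegalArgumentException("No timezone token in: " + text);
        }
        String label = text.substring(split + 1);
        String zoneId = ZONE_MAP.containsKey(label) ? ZONE_MAP.get(label) : label;
        TimeZone zone = TimeZone.getTimeZone(zoneId);
        // getTimeZone() returns GMT for unknown IDs, so reject that case explicitly.
        if (zone.getID().equals("GMT") && !zoneId.equals("GMT")) {
            throw new IllegalArgumentException("Unknown timezone: " + label);
        }
        DateFormat format = new SimpleDateFormat("yyyyMMdd HHmmss");
        format.setLenient(false);
        format.setTimeZone(zone); // DST is resolved for the parsed date, as in TZFixed
        try {
            return format.parse(text.substring(0, split));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + text, e);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(parse(args[0]));
    }
}
```

With this, "20070310 120000 EST5EDT" and "20070312 120000 EST5EDT" should both come out as noon New York time, matching the TZFixed runs, while the caller still passes a single string.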