CDate - designing another C++ date class

Overview

A fast, small date class that uses 4 bytes and allows calculation of and conversion from:

Download here

Status

Caveats

Introduction

I needed to develop a date class for the database system I was working on. There were a couple of important criteria that the date class needed to meet:

Thinking about this, there are a couple of standard ways to go

Let's look at the range of each of the above items.

Item          Range   Bits required
Day           1-31    5
Month         1-12    4
Year          ?       ?
Day of year   1-366   9
Week of year  1-53    6
Day of week   1-7     3

Adding these up we get 27 bits, which leaves only 5 bits for a year value if everything is to fit in 4 bytes. That is really too small. Of course, there is overlap between these values. For instance, we could calculate 'day of year' fairly easily from the year, month and day, or vice versa, which would free up 9 bits and give a total of 14 bits for the year. That could easily represent all the years we'd be interested in.

Before we head down any of these roads, let's look more closely at how the class is used.

Thus we might decide the following is the best:

Let's look at what epoch we can use for our integer representation.

Julian Day Numbers

Any search for date calculations on the web will quickly reveal 'Julian Day Numbers'. These have nothing to do with the Julian calendar, but instead are a representation of days and time since a given epoch, namely January 1st, 4713 B.C. Fortunately, Julian day numbers are used heavily in astronomy, and quite a bit of work has gone into generating formulas for converting between Gregorian dates and Julian day numbers. These are taken directly from the calendar FAQ part 2 and part 3 (all divisions are integer divisions that discard the remainder):

a = (14-month)/12
y = year+4800-a
m = month + 12*a - 3

JD = day + (153*m+2)/5 + y*365 + y/4 - y/100 + y/400 - 32045
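
As a minimal sketch, the formula above translates almost line for line into C++. The function name JulianDayFromDate is a hypothetical helper chosen for illustration, not necessarily what the downloadable class uses, and the code assumes years later than -4800 so that every intermediate value stays positive and truncating division matches the formula.

```cpp
#include <cstdint>

// Hypothetical helper (name not taken from the original class).
// Converts a Gregorian year/month/day to a Julian Day number.
// Every '/' below is integer division, exactly as the formula requires.
std::int32_t JulianDayFromDate(std::int32_t year, std::int32_t month, std::int32_t day) {
    const std::int32_t a = (14 - month) / 12;   // 1 for January and February, otherwise 0
    const std::int32_t y = year + 4800 - a;     // shifted year, counted from March
    const std::int32_t m = month + 12 * a - 3;  // March = 0 ... February = 11
    return day + (153 * m + 2) / 5 + y * 365 + y / 4 - y / 100 + y / 400 - 32045;
}
```

For example, JulianDayFromDate(2000, 1, 1) evaluates to 2451545, and JulianDayFromDate(1858, 11, 17) to 2400001.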

There is also a formula for going the other way:

a = JD + 32044
b = (4*a+3)/146097
c = a - (b*146097)/4

d = (4*c+3)/1461
e = c - (1461*d)/4
m = (5*e+2)/153

Day = e - (153*m+2)/5 + 1
Month = m + 3 - 12*(m/10)
Year = b*100 + d - 4800 + m/10
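
The reverse conversion can be sketched the same way. The DateParts struct and the function name DateFromJulianDay are assumptions made for this example only. The sketch assumes a non-negative Julian Day small enough that 4*a does not overflow a 32-bit integer, which comfortably covers any realistic date.

```cpp
#include <cstdint>

// Hypothetical result type for illustration only.
struct DateParts {
    std::int32_t year;
    std::int32_t month;
    std::int32_t day;
};

// Hypothetical helper: converts a Julian Day number to a Gregorian date.
// All divisions are integer divisions, as in the formula above.
DateParts DateFromJulianDay(std::int32_t jd) {
    const std::int32_t a = jd + 32044;
    const std::int32_t b = (4 * a + 3) / 146097;   // number of whole 400-year cycle quarters
    const std::int32_t c = a - (b * 146097) / 4;   // days remaining within the century
    const std::int32_t d = (4 * c + 3) / 1461;     // year within the century
    const std::int32_t e = c - (1461 * d) / 4;     // day within the March-based year
    const std::int32_t m = (5 * e + 2) / 153;      // March-based month, 0..11

    DateParts result;
    result.day = e - (153 * m + 2) / 5 + 1;
    result.month = m + 3 - 12 * (m / 10);
    result.year = b * 100 + d - 4800 + m / 10;
    return result;
}
```

For Julian Day 2451545 this yields year 2000, month 1, day 1.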

There is also a formula for calculating week numbers (as defined by ISO 8601 rather than the traditional US week number):

d4 = (JD+31741 - (JD mod 7)) mod 146097 mod 36524 mod 1461
L = d4/1460
d1 = ((d4-L) mod 365) + L
WeekOfYear = d1/7+1
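
A short sketch of the week number formula in C++ follows. The function name is hypothetical, and the code assumes a non-negative Julian Day so that the % operator behaves like the 'mod' in the formula.

```cpp
#include <cstdint>

// Hypothetical helper: ISO 8601 week number (1..53) for a Julian Day number.
std::int32_t IsoWeekFromJulianDay(std::int32_t jd) {
    // The chained modulo steps follow the formula above term by term.
    const std::int32_t d4 = (jd + 31741 - (jd % 7)) % 146097 % 36524 % 1461;
    const std::int32_t l = d4 / 1460;               // 'L' in the formula
    const std::int32_t d1 = ((d4 - l) % 365) + l;
    return d1 / 7 + 1;
}
```

As a check, Saturday 1 January 2000 (Julian Day 2451545) falls in week 52, while Monday 3 January 2000 (Julian Day 2451547) starts week 1.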

Note that there is one extra complication here. The 'year' for a given 'week of year' may not be the same as the year calculated above. Sometimes the last days of December belong to the first week of the next year, and likewise the first days of January can belong to the last week of the previous year. These cases are easy to detect:

// Week number 1 and 52/53 of a year may actually be in the
// next/previous year. Adjust the year number for those cases
UInt32 WeekYear = Year + ((WeekNumber == 1) && (Month == 12)) - ((WeekNumber > 51) && (Month == 1));

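Given the above, we can easily work out some other formulas. To calculate the Julian Day from a year and a 'day of year', just treat the month as January and the 'day of year' as the 'day of January'. With month = 1 we get a = 1 and m = 10, so the (153*m+2)/5 term is a constant 306, which folds into the final offset (306 - 32045 = -31739):

y = Year + 4799
JD = DayOfYear + y*365 + y/4 - y/100 + y/400 - 31739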

To go the other way, if we calculate the year from the Julian day to date formula, and then subtract the latter part of the formula above from the Julian day, we get the 'day of year':

y = Year+4799
DayOfYear = JD - (y*365 + y/4 - y/100 + y/400 - 31739)
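
Both 'day of year' conversions share the same year-dependent term, so a sketch can factor it into one helper. All three function names below are hypothetical. The constant 31739 is 32045 minus the 306 that (153*m+2)/5 contributes when the month is treated as January (m = 10).

```cpp
#include <cstdint>

// Hypothetical helper: the Julian Day of "day 0" of the given year,
// i.e. the value that a 'day of year' is added to or subtracted from.
std::int32_t YearBase(std::int32_t year) {
    const std::int32_t y = year + 4799;
    // 31739 = 32045 - 306, where 306 is (153*10+2)/5 for January (m = 10).
    return y * 365 + y / 4 - y / 100 + y / 400 - 31739;
}

// Year plus day of year (1..366) to Julian Day number.
std::int32_t JulianDayFromDayOfYear(std::int32_t year, std::int32_t dayOfYear) {
    return dayOfYear + YearBase(year);
}

// Julian Day number to day of year. The caller supplies the year, which is
// assumed to have been obtained from the Julian-day-to-date conversion.
std::int32_t DayOfYearFromJulianDay(std::int32_t jd, std::int32_t year) {
    return jd - YearBase(year);
}
```

For example, day 1 of 2000 maps to Julian Day 2451545, and Julian Day 2453371 (31 December 2004) maps back to day 366 of 2004.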

The 'day of week' is just the Julian day plus some offset, modulo 7. The offset is easily pre-calculated to make sure the days of the week are correct. In fact, if we use the ISO 8601 definition that the first day of the week is Monday, and represent Monday-Sunday as 1-7, the formula is:

DayOfWeek = JD % 7 + 1
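
In C++ this is a one-liner. The function name is a hypothetical choice for this sketch, and a non-negative Julian Day is assumed so that % never returns a negative remainder.

```cpp
#include <cstdint>

// Hypothetical helper: ISO 8601 day of week, Monday = 1 ... Sunday = 7.
std::int32_t DayOfWeekFromJulianDay(std::int32_t jd) {
    return jd % 7 + 1;
}
```

Julian Day 2451545 (1 January 2000) gives 6, a Saturday, and Julian Day 2451547 gives 1, a Monday.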

Interface

Now that we've got all the calculations out of the way, we have to decide on our interface. We'll have two standard constructors: one to construct from a Julian Day when reading from the DB, and one to construct from year/month/day components when working from newly parsed text data.

We also want some way of retrieving the Julian day component to store back in the DB.

class CDate {
public:

    // Construct new date either from Julian day number, or year/month/day components
    CDate(Int32 julianDay);
    CDate(Int32 year, Int32 month, Int32 day);

    // Retrieve Julian day number of the date
    Int32 GetJulianDay() const;

    // ...
};

At this point, though, we have to decide how we retrieve the day/month/year components and the day of week/week of year/year components. There are a couple of alternatives:

  1. Retrieve all components at once as a struct with one method call
  2. Retrieve a pointer to either an allocated struct or static struct with all components with one method call
  3. As in 1 above, but with separate calls for the day/month/year components and the day of week/week of year/year components
  4. As in 2 above, but with separate calls for the day/month/year components and the day of week/week of year/year components
  5. Have separate calls to retrieve any component individually

Each has advantages and disadvantages. Method 2 is much like the Unix gmtime() and localtime() calls, which are classic C interfaces. Methods 1 and 2 mean you only have to make one call to get all the components. On the other hand, if you only want one component, all of them still have to be calculated. Returning a struct means the entire struct must be copied, while having individual calls means many more function calls must be made. Also, a pointer to a static buffer is not thread-safe, while a pointer to an allocated buffer is considerably slower, and you then have to make sure it's deleted.

Methods 3 and 4 suffer the same problems as 1 and 2, but at least divide the problem into a finer grain. Method 5 is more like a 'properties' interface, where you can retrieve the day property of a date, whether it be a calculated value or not.

Given the above, I went for option 5 as being the most desirable object interface.

class CDate {
public:

    Int32 GetYear() const;
    Int32 GetMonth() const;
    Int32 GetDay() const;
    Int32 GetDayOfYear() const;

    Int32 GetWeekOfYear() const;
    Int32 GetYearForWeekOfYear() const;
    Int32 GetDayOfWeek() const;

};

Internal representation

So now we have an interface, and a way to convert between day offsets and date components. The question we still haven't answered is which internal representation we'll use. The decision I went with was to allow the internal representation to change between the two. That is, the representation could be either Julian Day number, or date components as bit fields.

To make this work, we have to use one bit to record which representation is currently in use. I decided that the best bit to use would be the highest one. If it's zero, the remaining 31 bits are the Julian Day number, so we can just use the value as an integer; if it's one, the remaining bits are used as bit fields describing each date component. Using the highest bit also means we can check which representation is in use simply by checking whether the signed 32-bit value is non-negative (Julian day) or negative (date components). The code for each method then checks whether we currently have a Julian day or date component representation, switches to the appropriate one if necessary, and returns the result.
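
The sign test described above might look something like the following sketch. The names kComponentsFlag, HoldsJulianDay, HoldsComponents and TagAsComponents are all hypothetical and only illustrate the idea; the real class may organise this differently.

```cpp
#include <cstdint>

// Hypothetical names for illustration; the original class may differ.
constexpr std::uint32_t kComponentsFlag = 0x80000000u;  // bit 31, the storage-type bit

// Bit 31 clear: the value is a plain Julian Day number.
inline bool HoldsJulianDay(std::int32_t raw) { return raw >= 0; }

// Bit 31 set: the remaining 31 bits hold packed date components.
inline bool HoldsComponents(std::int32_t raw) { return raw < 0; }

// Tag a set of packed component bits so the sign tests above recognise them.
// The unsigned-to-signed conversion wraps on two's-complement targets
// (guaranteed by the language since C++20).
inline std::int32_t TagAsComponents(std::uint32_t fieldBits) {
    return static_cast<std::int32_t>(fieldBits | kComponentsFlag);
}
```

The appeal of this layout is that a value read straight from the DB needs no decoding at all: any non-negative integer is already a valid Julian Day representation.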

Item                  Range      Bits required  Bit offset
Storage type          0-1        1              31
Is leap year?         0-1        1              30
Week year difference  -1 to 1    2              28-29
Month                 1-12       4              24-27
Day of week           1-7        3              21-23
Day                   1-31       5              16-20
Week of year          1-53       6              10-15
Year                  1500-2500  10             0-9
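
A rough sketch of packing and unpacking this layout with shifts and masks is shown below. The Components struct and the Pack/Unpack names are hypothetical. Two encoding details are assumptions, since the table does not state them: the year is stored as an offset from 1500, and the week year difference is stored biased by +1 so that -1..1 becomes 0..2.

```cpp
#include <cstdint>

// Hypothetical unpacked form of the bit fields in the table above.
struct Components {
    std::int32_t year;          // 1500..2500
    std::int32_t month;         // 1..12
    std::int32_t day;           // 1..31
    std::int32_t dayOfWeek;     // 1..7
    std::int32_t weekOfYear;    // 1..53
    std::int32_t weekYearDiff;  // -1..1
    bool isLeapYear;
};

constexpr std::int32_t kBaseYear = 1500;  // assumption: year stored as offset from 1500

std::uint32_t Pack(const Components& c) {
    return (1u << 31)                                               // storage type: components
         | (static_cast<std::uint32_t>(c.isLeapYear ? 1 : 0) << 30)
         | (static_cast<std::uint32_t>(c.weekYearDiff + 1) << 28)   // assumption: biased by +1
         | (static_cast<std::uint32_t>(c.month) << 24)
         | (static_cast<std::uint32_t>(c.dayOfWeek) << 21)
         | (static_cast<std::uint32_t>(c.day) << 16)
         | (static_cast<std::uint32_t>(c.weekOfYear) << 10)
         | static_cast<std::uint32_t>(c.year - kBaseYear);
}

Components Unpack(std::uint32_t bits) {
    Components c;
    c.isLeapYear   = ((bits >> 30) & 0x1u) != 0;
    c.weekYearDiff = static_cast<std::int32_t>((bits >> 28) & 0x3u) - 1;
    c.month        = static_cast<std::int32_t>((bits >> 24) & 0xFu);
    c.dayOfWeek    = static_cast<std::int32_t>((bits >> 21) & 0x7u);
    c.day          = static_cast<std::int32_t>((bits >> 16) & 0x1Fu);
    c.weekOfYear   = static_cast<std::int32_t>((bits >> 10) & 0x3Fu);
    c.year         = static_cast<std::int32_t>(bits & 0x3FFu) + kBaseYear;
    return c;
}
```

Packing 1 January 2000 (Saturday, ISO week 52 of the previous year, leap year) and unpacking the result returns the same field values, with bit 31 set in the packed word.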

Note that we've left out the 'day of year' component, which can easily be calculated from the day plus a lookup table indexed by the month and 'is leap year' components. I've also tried to organise the fields into 16-bit and 8-bit groups. This allows a good optimiser to turn certain shift and mask operations (e.g. shift by 16) into just loading the high word and masking.
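
The 'day of year' lookup mentioned above could be sketched as follows. The table and function names are hypothetical, and the table simply holds the number of days that precede each month in a common year and in a leap year.

```cpp
#include <cstdint>

// Hypothetical lookup table: days elapsed before the first of each month.
// Row 0 is a common year, row 1 is a leap year.
constexpr std::int32_t kDaysBeforeMonth[2][12] = {
    {0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334},
    {0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335},
};

// Day of year (1..366) from the month, day and 'is leap year' components.
std::int32_t DayOfYear(std::int32_t month, std::int32_t day, bool isLeapYear) {
    return kDaysBeforeMonth[isLeapYear ? 1 : 0][month - 1] + day;
}
```

With this table, 1 March is day 60 in a common year and day 61 in a leap year.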

So now we have all the components in place. I implemented all the Julian day/date conversion functions in a separate class with static member functions, and used the date class only to hide the representation. And that's about it. See the top of this page for how to download the class.