A fast, small date class that uses 4 bytes and supports calculation with, and conversion between:

- integer number of days from some epoch
- 'day', 'month', 'year' and 'day of year' components
- 'weekday', 'week of year' and 'week year' components

Status

- Compiled and tested on MSVC 6 and GCC 2.96
- Includes 'date', 'time' and 'datetime' classes

Caveats

- Gregorian calendar only
- Be careful in multi-threaded code. The internal representation can be changed by const methods. Read the class comments for details

I needed to develop a date class for the database system I was working on. There were several important criteria that the date class needed to meet:

- Small representation (at most 4 bytes in size)
- Ability to convert any date to some integer number easily
- Ability to quickly get day, month, year, 'day of year', 'week of year' or 'day of week' components
- Ability to calculate number of days between 2 dates easily
- Only need to worry about Gregorian calendar dates
- Only need to worry about reasonable year ranges, say about 1800 to 2200 at a minimum

Thinking about this, there are a couple of standard ways to go:

- Store a single number that represents the days since some *epoch*
- Store a bit structure which holds a representation of each of the above values

Let's look at the range of each of the above items:

| Item | Range | Bits required |
|---|---|---|
| Day | 1-31 | 5 |
| Month | 1-12 | 4 |
| Year | ? | ? |
| Day of year | 1-366 | 9 |
| Week of year | 1-53 | 6 |
| Day of week | 1-7 | 3 |

Adding these up we get 27 bits, which leaves 5 bits for a year value to fit in 4 bytes, which is really too small. Of course, there is overlap between these values. For instance, we could calculate 'day of year' fairly easily from the year, month and day, or vice versa. Dropping it would free 9 bits, leaving a total of 14 bits for the year, which could easily represent all the years we'd be interested in.
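The bit counts above can be double-checked with a small compile-time sketch. The helper `BitsFor` and the constant names are illustrative assumptions, not part of the date class itself.

```cpp
#include <cstdint>

// Number of bits needed to store any value in 0..maxValue.
constexpr int BitsFor(std::uint32_t maxValue)
{
    int bits = 0;
    while (maxValue != 0) {
        ++bits;
        maxValue >>= 1;
    }
    return bits;
}

constexpr int kDayBits        = BitsFor(31);   // day 1-31
constexpr int kMonthBits      = BitsFor(12);   // month 1-12
constexpr int kDayOfYearBits  = BitsFor(366);  // day of year 1-366
constexpr int kWeekOfYearBits = BitsFor(53);   // week of year 1-53
constexpr int kDayOfWeekBits  = BitsFor(7);    // day of week 1-7

// Everything except the year.
constexpr int kFieldBits = kDayBits + kMonthBits + kDayOfYearBits
                         + kWeekOfYearBits + kDayOfWeekBits;

// Bits left for the year in a 32-bit word, with and without 'day of year'.
constexpr int kYearBitsLeft             = 32 - kFieldBits;
constexpr int kYearBitsWithoutDayOfYear = 32 - (kFieldBits - kDayOfYearBits);
```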

Before we head down either of these roads, let's look more closely at how the class is used.

- Data is stored as integer values in the database. For each block of values stored in the DB, the range of values being stored is calculated and the results are then bit-packed. Thus for best compression, similar dates should have numerical representations that are close to each other
- When a date is used, it tends to be in one of three ways, usually used distinctly from each other:
  - Calculate a date range by subtracting two dates
  - Use the year, month and day components
  - Use the year, 'week of year' and weekday components

Thus we might decide the following is the best:

- Use a 'days since epoch' representation when storing in the DB. This means dates close to each other are stored with a minimum bit count, and date objects can be constructed quickly
- This also means that date span calculations are easy because we can just subtract the two integer representations to get the number of days between two dates
- We then want some way of converting either to day/month/year components, or week of year/day of week/year components

Let's look at what *epoch* we can use for our integer representation.

Any search for date calculations on the web will quickly reveal 'Julian Day Numbers'. These have nothing to do with the Julian calendar, but instead are a representation of days and time since a given epoch, namely January 1st, 4713 B.C. Fortunately, Julian day numbers are used heavily in astronomy, and quite a bit of work has gone into generating formulas for converting between Gregorian dates and Julian day numbers. These are taken directly from the calendar FAQ part 2 and part 3. To convert a Gregorian date to a Julian day number (all divisions are integer divisions that discard the remainder):

```
a = (14-month)/12
y = year+4800-a
m = month + 12*a - 3
JD = day + (153*m+2)/5 + y*365 + y/4 - y/100 + y/400 - 32045
```
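As a rough C++ sketch of this formula (the function names `ToJulianDay` and `DaysBetween` are illustrative assumptions, not the names used in the class), relying on C++ integer division to discard remainders:

```cpp
#include <cstdint>

// Gregorian date -> Julian Day number, following the formula above.
// Inputs are assumed to be valid Gregorian dates in a reasonable year range,
// so every intermediate value stays positive and division truncates cleanly.
constexpr std::int32_t ToJulianDay(std::int32_t year, std::int32_t month, std::int32_t day)
{
    const std::int32_t a = (14 - month) / 12;   // 1 for January/February, else 0
    const std::int32_t y = year + 4800 - a;     // year of a calendar starting in March
    const std::int32_t m = month + 12 * a - 3;  // March = 0 ... February = 11
    return day + (153 * m + 2) / 5 + y * 365 + y / 4 - y / 100 + y / 400 - 32045;
}

// The span between two dates is just the difference of their day numbers.
constexpr std::int32_t DaysBetween(std::int32_t fromJulianDay, std::int32_t toJulianDay)
{
    return toJulianDay - fromJulianDay;
}
```

For example, `ToJulianDay(2000, 1, 1)` gives 2451545, and the span from 28 February 2000 to 1 March 2000 comes out as 2 days because 2000 is a leap year.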

There is also a formula for going the other way:

```
a = JD + 32044
b = (4*a+3)/146097
c = a - (b*146097)/4
d = (4*c+3)/1461
e = c - (1461*d)/4
m = (5*e+2)/153
Day = e - (153*m+2)/5 + 1
Month = m + 3 - 12*(m/10)
Year = b*100 + d - 4800 + m/10
```
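The reverse conversion can be sketched in C++ along the same lines. The struct `DateParts` and the function `FromJulianDay` are hypothetical names chosen for this example.

```cpp
#include <cstdint>

// Hypothetical plain struct for the result.
struct DateParts
{
    std::int32_t year;
    std::int32_t month;  // 1-12
    std::int32_t day;    // 1-31
};

// Julian Day number -> Gregorian date, following the formula above.
// Assumes a non-negative Julian Day so integer division truncates as intended.
constexpr DateParts FromJulianDay(std::int32_t julianDay)
{
    const std::int32_t a = julianDay + 32044;
    const std::int32_t b = (4 * a + 3) / 146097;  // whole centuries
    const std::int32_t c = a - (b * 146097) / 4;  // days into that century
    const std::int32_t d = (4 * c + 3) / 1461;    // whole years into that century
    const std::int32_t e = c - (1461 * d) / 4;    // days into the March-based year
    const std::int32_t m = (5 * e + 2) / 153;     // March = 0 ... February = 11
    return DateParts{
        b * 100 + d - 4800 + m / 10,  // year
        m + 3 - 12 * (m / 10),        // month
        e - (153 * m + 2) / 5 + 1     // day
    };
}
```

Julian Day 2451545 decodes to 1 January 2000, and 2451604 decodes to 29 February 2000.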

As well as a formula for calculating week numbers (as defined by ISO-8601 rather than the traditional US week number):

```
d4 = (JD+31741 - (JD mod 7)) mod 146097 mod 36524 mod 1461
L = d4/1460
d1 = ((d4-L) mod 365) + L
WeekOfYear = d1/7+1
```
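A minimal C++ sketch of the week number formula follows. The function name `IsoWeekOfYear` is an assumption made for this example.

```cpp
#include <cstdint>

// ISO-8601 week number (1-53) from a Julian Day number, per the formula above.
// Assumes a non-negative Julian Day. Note that julianDay % 7 is 0 on a Monday,
// so subtracting it moves the date back to the Monday of its week.
constexpr std::int32_t IsoWeekOfYear(std::int32_t julianDay)
{
    const std::int32_t d4 = (julianDay + 31741 - (julianDay % 7)) % 146097 % 36524 % 1461;
    const std::int32_t l  = d4 / 1460;
    const std::int32_t d1 = ((d4 - l) % 365) + l;
    return d1 / 7 + 1;
}
```

Saturday 1 January 2000 (Julian Day 2451545) lands in week 52, Friday 31 December 2004 (2453371) in week 53, and Monday 3 January 2005 (2453374) in week 1.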

Note that there is one extra complication here. The 'year' for a given 'week of year' may not be the same as the year calculated above. The last days of December can belong to the first week of the next year, and the first days of January can belong to the last week of the previous year. This is pretty easy to determine by looking for those cases:

```
// Week number 1 and 52/53 of a year may actually be in the
// next/previous year. Adjust the year number for those cases
UInt32 WeekYear = Year + ((WeekNumber == 1) && (Month == 12)) -
                  ((WeekNumber > 51) && (Month == 1));
```
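The same adjustment can be written with explicit branches, which some readers find easier to follow than arithmetic on boolean results. This is only a sketch, and the function name `WeekYear` and its parameter names are assumptions.

```cpp
#include <cstdint>

// Year that an ISO week belongs to, given the calendar year and month of the
// date together with its ISO week number.
constexpr std::int32_t WeekYear(std::int32_t year, std::int32_t month, std::int32_t weekNumber)
{
    if (weekNumber == 1 && month == 12) {
        return year + 1;  // late December that already counts as week 1
    }
    if (weekNumber > 51 && month == 1) {
        return year - 1;  // early January that still counts as week 52/53
    }
    return year;
}
```

For 1 January 2000 (week 52) this gives 1999, and for 31 December 2001 (week 1) it gives 2002.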

Given the above, we can easily work out some other formulas. To calculate the Julian Day from a year and a 'day of year', just treat the month as January and the 'day of year' as the 'day of January'. With month = 1 we get a = 1 and m = 10, so (153*m+2)/5 = 306 and the constant becomes 306 - 32045 = -31739:

```
y = Year + 4799
JD = DayOfYear + y*365 + y/4 - y/100 + y/400 - 31739
```

To go the other way, first calculate the year using the Julian day to date formula, then subtract the year-dependent part of the formula above from the Julian day to get the 'day of year':

```
y = Year+4799
DayOfYear = JD - (y*365 + y/4 - y/100 + y/400 - 31739)
```
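Both directions share the same year-dependent term, so a sketch might factor it out. The constant 31739 is 32045 - 306, where 306 is the value of (153*m+2)/5 for January (m = 10). All three function names here are assumptions made for illustration.

```cpp
#include <cstdint>

// Julian Day number of the day before 1 January of `year` ("day 0" of the year).
constexpr std::int32_t YearBase(std::int32_t year)
{
    const std::int32_t y = year + 4799;
    return y * 365 + y / 4 - y / 100 + y / 400 - 31739;
}

// Year and 'day of year' (1-366) -> Julian Day number.
constexpr std::int32_t JulianDayFromDayOfYear(std::int32_t year, std::int32_t dayOfYear)
{
    return dayOfYear + YearBase(year);
}

// Julian Day number and its already-known year -> 'day of year' (1-366).
constexpr std::int32_t DayOfYearFromJulianDay(std::int32_t julianDay, std::int32_t year)
{
    return julianDay - YearBase(year);
}
```

Day 1 of 2000 maps to Julian Day 2451545, and Julian Day 2451604 (29 February 2000) maps back to day 60 of 2000.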

The 'day of week' is just the Julian day plus some offset, modulo 7. The offset is easily pre-calculated to make sure the days of the week are correct. In fact, if we use the ISO 8601 definition, in which the first day of the week is Monday, and represent Monday-Sunday as 1-7, the formula is simply:

```
DayOfWeek = JD % 7 + 1
```
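In C++ this is a one-liner. The `Weekday` enum below is a hypothetical convenience for readability and is not taken from the class.

```cpp
#include <cstdint>

// ISO-8601 weekday numbering: Monday = 1 ... Sunday = 7.
enum Weekday : std::int32_t
{
    Monday = 1, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday
};

// Day of week from a non-negative Julian Day number.
constexpr std::int32_t DayOfWeek(std::int32_t julianDay)
{
    return julianDay % 7 + 1;
}
```

Julian Day 2451545 (1 January 2000) comes out as 6, a Saturday.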

We also want some way of retrieving the Julian day number to store back in the DB, so the class starts out like this:

```
class CDate {
public:
    // Construct a new date either from a Julian day number, or
    // from day/month/year components
    CDate(Int32 julianDay);
    CDate(Int32 year, Int32 month, Int32 day);

    // Retrieve the Julian day number of the date
    Int32 GetJulianDay() const;

    // ...
};
```

At this point though, we have to decide how we retrieve the day/month/year components and the day of week/week of year/year components. There are several alternatives:

1. Retrieve all components at once as a struct with one method call
2. Retrieve a pointer to either an allocated struct or a static struct with all components with one method call
3. As 1, but with separate calls for the day/month/year components and the day of week/week of year/year components
4. As 2, but with separate calls for the day/month/year components and the day of week/week of year/year components
5. Have separate calls to retrieve each component individually

Each has advantages and disadvantages. Method 2 is much like the Unix gmtime() and localtime() calls, which are classic C interfaces. Methods 1 and 2 mean you only have to make one call to get all the components. On the other hand, if you only want one component, all of them still have to be calculated. Returning a struct means the entire struct must be copied, while having individual calls means that many more function calls must be made. Also, a pointer to a static buffer is not thread-safe, but a pointer to an allocated buffer is considerably slower, and you then have to make sure it's deleted.

Methods 3 and 4 suffer the same problems as 1 and 2, but at least divide the problem into a finer grain. Method 5 is more like a 'properties' interface, where you can retrieve the day property of a date, whether it be a calculated value or not.

Given the above, I went for method 5 as being the most desirable object interface.

```
class CDate {
public:
    Int32 GetYear() const;
    Int32 GetMonth() const;
    Int32 GetDay() const;

    Int32 GetDayOfYear() const;
    Int32 GetWeekOfYear() const;
    Int32 GetYearForWeekOfYear() const;
    Int32 GetDayOfWeek() const;
};
```

So now we have an interface, and a way to convert between day offsets and date components. The question we still haven't answered is which internal representation we'll use. The decision I went with was to allow the internal representation to change between the two. That is, the representation could be either Julian Day number, or date components as bit fields.

To make this work, we have to use a bit to determine which representation is currently being used. I decided that the best one to use would be the highest bit. If it's zero, the remaining 31 bits are the Julian Day number, so we can just use the value as an integer; if it's one, the remaining bits are used as bit fields describing each date component. Using the highest bit also means we can check which representation is being used simply by checking whether the signed 32-bit value is positive (Julian day) or negative (date components). The code for each method then checks whether we've currently got a Julian day or date component representation, switches to the appropriate one, and returns the result.

| Item | Range | Bits required | Bit offset |
|---|---|---|---|
| Storage type | 0-1 | 1 | 31 |
| Is leap year? | 0-1 | 1 | 30 |
| Week year difference | -1 to 1 | 2 | 28-29 |
| Month | 1-12 | 4 | 24-27 |
| Day of week | 1-7 | 3 | 21-23 |
| Day | 1-31 | 5 | 16-20 |
| Week of year | 1-53 | 6 | 10-15 |
| Year | 1500-2500 | 10 | 0-9 |

Note that we've left out the 'day of year' component, which can be easily calculated by adding the day to a value from a lookup table indexed by the month and 'is leap year' components. I've also tried to organise the fields into 16-bit and 8-bit groups. This allows a good optimiser to turn certain shift and mask operations (e.g. shift by 16) into just loading the high word followed by a mask operation.
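To make the layout concrete, here is a minimal sketch of packing and unpacking the fields at the bit offsets in the table. It is not the class's actual code: the struct and function names are hypothetical. It also makes two assumptions that the text does not spell out: the year is stored as an offset from 1500, and the week year difference is stored biased by one so that -1, 0 and +1 become 0, 1 and 2.

```cpp
#include <cstdint>

// Hypothetical holder for the fields in the table above.
struct PackedFields
{
    int  year;          // 1500-2500
    int  month;         // 1-12
    int  day;           // 1-31
    int  dayOfWeek;     // 1-7, Monday = 1
    int  weekOfYear;    // 1-53
    int  weekYearDiff;  // -1, 0 or +1
    bool leapYear;
};

constexpr std::uint32_t kComponentsFlag = 1u << 31;  // storage type bit
constexpr int kYearBias = 1500;                      // assumption: year stored as an offset

// Extract `bits` bits starting at bit `shift`.
constexpr std::uint32_t Field(std::uint32_t rep, int shift, int bits)
{
    return (rep >> shift) & ((1u << bits) - 1u);
}

constexpr std::uint32_t Pack(const PackedFields& f)
{
    return kComponentsFlag
         | ((f.leapYear ? 1u : 0u) << 30)
         | (static_cast<std::uint32_t>(f.weekYearDiff + 1) << 28)  // assumption: biased by 1
         | (static_cast<std::uint32_t>(f.month) << 24)
         | (static_cast<std::uint32_t>(f.dayOfWeek) << 21)
         | (static_cast<std::uint32_t>(f.day) << 16)
         | (static_cast<std::uint32_t>(f.weekOfYear) << 10)
         | static_cast<std::uint32_t>(f.year - kYearBias);
}

// Top bit clear means the word is a plain Julian Day number. Read as a signed
// 32-bit value, this is the same as testing for a non-negative number.
constexpr bool IsJulianDay(std::uint32_t rep) { return (rep & kComponentsFlag) == 0; }

constexpr int  PackedYear(std::uint32_t rep)  { return static_cast<int>(Field(rep, 0, 10)) + kYearBias; }
constexpr int  PackedMonth(std::uint32_t rep) { return static_cast<int>(Field(rep, 24, 4)); }
constexpr int  PackedDay(std::uint32_t rep)   { return static_cast<int>(Field(rep, 16, 5)); }
constexpr bool PackedIsLeap(std::uint32_t rep) { return Field(rep, 30, 1) != 0; }

// Days before the first of each month, indexed by [is leap year][month - 1].
constexpr int kDaysBeforeMonth[2][12] = {
    { 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 },
    { 0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335 },
};

// 'Day of year' is not stored; it is rebuilt from month, day and the leap flag.
constexpr int PackedDayOfYear(std::uint32_t rep)
{
    return kDaysBeforeMonth[PackedIsLeap(rep) ? 1 : 0][PackedMonth(rep) - 1] + PackedDay(rep);
}
```

Packing 29 February 2000 (a Tuesday in week 9 of a leap year) sets the top bit, and the day of year recovered from the packed word is 60.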

So now we have all the components in place. I implemented all the Julian day/date conversion functions in a separate class with static member functions, and used the date class only to hide the representation. And that's about it. See the top of this page for how to download the class.