The central goal of any database system is to allow the user to query the data to answer business questions, and the most important of these questions always involve dates. Querying date/time data in SQL Server has some interesting pitfalls, but for the most part they’re easily avoidable if you understand how the database management system (DBMS) treats temporal data. – Adapted from Expert SQL 2005 Development (Apress, 2007), Chapter 10, “Temporal Data”.
To start things off, create the following table:
USE TempDB GO CREATE TABLE VariousDates ( ADate DATETIME NOT NULL, PRIMARY KEY (ADate) WITH (IGNORE_DUP_KEY = ON) ) GONote that you should create the table in tempdb because it is just a throw-away table that will be used for these examples. Now insert some data into the table. The following T-SQL code inserts 2,048 rows into the table, with dates spanning from February 13, 2006, through February 20, 2006, in 5-minute increments:
INSERT VariousDates SELECT DATEADD(minute, number*5, ‘20060213’) FROM master..spt_values WHERE type = ‘p’ GOAfter you’ve inserted the data, the next logical step is, of course, to query it. First, you might want to ask the question “What is the minimum date value in the table?” The following query uses the MIN aggregate to answer that question:
SELECT MIN(ADate) FROM VariousDatesThis query returns one row, with the value 2006-02-13 00:05:00.000. But perhaps you’d like to know what other times from February 13, 2006, are in the table. A first shot at that query might look something like the following:
SELECT * FROM VariousDates WHERE ADate = ‘20060213’ But you might be surprised to find that instead of returning all rows for February 13, 2006, this query returns zero rows. The reason is that all dates in SQL Server must include a time component. When this query is evaluated and the search argument ADate = ‘20060213’ is processed, SQL Server sees that the ADate column, which is typed as DATETIME, is being compared to the date string, which is typed as VARCHAR. Based on SQL Server’s rules for data type precedence, the string is converted to DATETIME before being compared; and because the string includes no time portion, the default time of 00:00:00.000 is used. To see this conversion in action, try the following T-SQL code:
SELECT CONVERT(DATETIME, ‘20060213’) When you run this code, the default time portion is automatically added, and the output of the SELECT statement is the value 2006-02-13 00:00:00.000. Clearly, querying based on the implicit conversion between this string and the DATETIME type is ineffective-unless you only want values for midnight.To solve this problem, you must control the conversion in a slightly different way. Many developers’ first reaction is to try to avoid the conversion of the string to an instance of DATETIME altogether by converting the column itself and using a conversion style that eliminates the time portion. The following query shows one way of doing this:
SELECT * FROM VariousDates WHERE CONVERT(VARCHAR(20), ADate, 112) = ‘20060213’ You’ll find that this query returns the correct data: all rows from February 13, 2006. Although getting back correct results is wonderful, there is a major problem that might not be obvious with the small sample data used in this example. The table’s index on the ADate column is based on ADate as it is natively typed-in other words, as DATETIME. The table does not have an index for ADate converted to VARCHAR(20) using style 112 (or any other style or conversion, for that matter). As a result, this query is unable to seek an index, and SQL Server is forced to scan every row of the table, convert each ADate value to a string, and then compare it to the date string. To be able to effectively use the index to satisfy the query, you need to supply SQL Server with a search argument that doesn’t involve converting the data already in the table. The solution is to think slightly outside of the realm of using an exact match to satisfy the query. Another way to ask for dates from February 13, 2006, is to ask for any rows where the date/time value is between midnight February 13, 2006, and midnight February 14, 2006. Any instant recorded within that interval is a valid date for February 13, 2006.The following query is a first attempt at querying a range instead of an exact match:
SELECT * FROM VariousDates WHERE ADate BETWEEN ‘20060213’ AND ‘20060214’ This plan has an estimated cost of 0.0035983-less than one-half the estimated cost of the previous version; but remember that this is a very small table and that ratio will shoot up as more rows are inserted.Although this query performs a lot better than the version that uses string conversion, it has a subtle problem. The BETWEEN operator is inclusive on either end, meaning that X BETWEEN Y AND Z expands to X >= Y AND X <= Z. For the sake of this query, that means that the search argument, with times included (the implicit conversion will add those back in), will become the following:
ADate >= ‘20060213 00:00:00.000’ AND ADate <= ‘20060214 00:00:00.000’ That means this query can potentially return an incorrect result if there happens to be a row for midnight February 14, 2006 (and the data in the sample table includes such a row). Luckily, solving this problem is easy: Simply don’t use BETWEEN. Instead, always use the fully expanded version, inclusive of the start of the interval and exclusive of the end value:
SELECT * FROM VariousDates WHERE ADate >= ‘20060213’ AND ADate < ‘20060214’You can use this pattern, which is quite flexible, to query any kind of date range. Now, let’s see how to extend this pattern to find all of “today’s” rows, “this month’s” rows, and other similar requirements.
Reference Dates and Date/Time Calculations
The query pattern above works and returns the correct results, but it’s overly static. Expecting all date-range queries to have hard-coded values for the input dates is neither a realistic nor very maintainable solution. By using SQL Server’s date calculation functions, you can manipulate input dates to dynamically come up with whatever ranges are necessary for a given query.The two primary functions commonly used to perform date/time calculations are DATEDIFF and DATEADD. The first returns the difference between two dates; the second adds (or subtracts) time from an existing date. Each of these functions takes granularity as a parameter and can operate at any level between milliseconds and years.DATEDIFF takes three parameters: the time granularity that should be used to compare the two input dates, the start date, and the end date. For example, to find out how many hours elapsed between midnight on February 13, 2006, and midnight on February 14, 2006, you can use the following query:
SELECT DATEDIFF(hour, ‘20060113’, ‘20060114’) As expected, the result is 24. I mentioned that this query compares the two dates, both at midnight, even though neither of the input strings contains a time. Again, I want to stress that any time you use a string as an input where a date/time type is expected, SQL Server will implicitly convert the string to a date/time type. Also note that DATEDIFF maintains the idea of “start” and “end” times, and the result will change if you reverse the two. Changing the previous query so that February 14 is passed in before February 13 results in the output of -24.The DATEADD function takes three parameters: the time granularity, the amount of time to add, and the input date. For example, the following query adds 24 hours to midnight on February 13, 2006, resulting in an output of 2006-01-14 00:00:00.000:
SELECT DATEADD(hour, 24, ‘20060113’) DATEADD can also handle negative amounts and will subtract instead of add in that case.
The first step in doing date/time calculations in SQL Server is to learn to combine DATEDIFF and DATEADD to truncate date/time values. Understanding how it’s done takes a bit of a logical jump, so I break the process down into its four component parts:
- Truncation is essentially the same as rounding down, so you must first decide which date/time component you’d like to use to do the rounding. The component you’ll use to do the rounding should be one level of granularity above whatever data you need to truncate. For instance, if you want to remove the seconds and milliseconds, you’d round down using minutes. Likewise, to remove the entire time portion, you’d round down using days.
- After you decide on a level of granularity, you need to choose a reference date/time. For basic truncation, this date/time can be any within the range of the data type you are working with. I generally use 0, which corresponds to 1900-01-01 at midnight for both SMALLDATETIME and DATETIME.
- Using the DATEDIFF function, find the difference between the reference date/time and the date/time you want to truncate, at the level of granularity you’ve selected.
- Finally, use DATEADD to add the output from the DATEDIFF function to the same reference date/time that you used to find the difference. The result will be the truncated value of the original date/time.
Walking through an example should clarify this process. Assume that you want to start with 2006-04-23 13:45:43.233 and truncate the time portion (in other words, come out with 2006-04-23 at midnight). You will use a granularity of days because that is the lowest level of granularity above the time granularities (milliseconds, seconds, minutes, and hours). You can use the following T-SQL code to determine the number of days between the reference date of 0 and the input date:
DECLARE @InputDate DATETIME SET @InputDate = ‘2006-04-23 13:45:43.233’ SELECT DATEDIFF(day, 0, @InputDate) Upon running this code, you discover that 38,828 days passed between the reference date and the input date. Using DATEADD, you can add that number to the reference date:
SELECT DATEADD(day, 38828, 0) The result of this operation is the desired truncation: 2006-04-23 00:00:00.000. Because you added back to the reference date only the number of days-with no time portion-SQL Server rounded down the date and eliminated the time portion. Of course, you don’t have to run this T-SQL code step by step; in a stored procedure, you’d probably combine everything into one inline statement, such as
SELECT DATEADD(day, DATEDIFF(day, 0, @InputDate), 0) Once you understand this basic pattern, you can modify it to come up with any combination of dates. For instance, finding the first of the month is a simple matter of changing the granularity:
SELECT DATEADD(month, DATEDIFF(month, 0, @InputDate), 0)
Finding the last day of the month requires just a bit more thought: First, find the first day of the month, add an additional month, and then subtract 1 day:
SELECT DATEADD(day, -1, DATEADD(month, DATEDIFF(month, 0, @InputDate)+1, 0)) Another way to find the last day of the month is to use as a reference date any date that is the last day of a month-for instance, 1900-12-31:
SELECT DATEADD(month, DATEDIFF(month, ‘19001231’, @InputDate), ‘19001231’) Note that the month you choose is important in this case. I chose a 31-day month; what this T-SQL code is actually doing is finding the same day as the reference date in the month in question. But if the month does not have 31 days, SQL Server will automatically round down to the closest date. Had I used February 28 instead of December 31 for the reference date, the output would have been the 28th of the month any time this query was run. Other more interesting combinations are also possible. For example, a common requirement in many applications is to do calculations based on time periods, such as “every day between last Friday and today.” By modifying the truncation pattern a bit, finding “last Friday” is fairly simple. The main trick is to start with a well-known reference date. In this case, to find a day on Friday, the reference date should be any Friday. We know that the number of days between any Friday and any other Friday is divisible by seven, and we can use that knowledge to truncate the current date down to the nearest Friday. The following T-SQL example finds the number of days between the reference Friday, January 7, 2000, and the input date, February 9, 2006:
DECLARE @Friday DATETIME SET @Friday = ‘20000107’ SELECT DATEDIFF(day, @Friday, ‘20060209’)The result is 2225, which is an integer. Taking advantage of SQL Server’s integer math properties, dividing the result by 7 and then multiplying it by 7 again will round it down to the nearest number divisible by 7: 2219:
SELECT (2225 / 7) * 7Adding 2219 days to the original reference date of January 7, 2000, results in the desired output, the “last Friday” before February 9, 2006, which was February 3, 2006:
SELECT DATEADD(day, 2219, ‘20000107’) As with the previous example, this process can be simplified (and clarified) by combining the operations inline:
DECLARE @Friday DATETIME SET @Friday = ‘20000107’ SELECT DATEADD(day, ((DATEDIFF(day, @Friday, @InputDate) / 7) * 7), @Friday)You can also further simplify the second statement. After the result of the inner DATEDIFF function is divided by 7, its granularity is in weeks until it is multiplied by 7 again to produce days. But there is no reason to do the second multiplication, and the code becomes a bit cleaner if the result is simply treated as weeks for the final calculation, as in the following statement:
SELECT DATEADD(week, (DATEDIFF(day, @Friday, @InputDate) / 7), @Friday) Note that these examples return the input date itself (with the time portion truncated) if the input date is a Friday. If you want to return the “last” Friday every time and never the input date itself-even if it is a Friday-a small modification is required. To accomplish this, you must use two reference dates: one for any Friday and one for any day within a week of the original reference date (for simplicity, I recommend using the next day). By using the next day as the inner reference date, the result of the division will be one week lower if the input date is a Friday, meaning that the result will be the previous Friday. The following T-SQL example does this for a given input date:
DECLARE @Friday DATETIME SET @Friday = ‘20000107’ DECLARE @Saturday DATETIME SET @Saturday = ‘20000108’ SELECT DATEADD(week, (DATEDIFF(day, @Saturday, @InputDate) / 7), @Friday)By using this pattern and switching the reference date, you can easily find the last of any day of the week given an input date. To find the “next” one of a given day (e.g., “next Friday”), simply add one week to the result of the inner calculation before adding it to the reference date:
SELECT DATEADD(week, (DATEDIFF(day, @Friday, GETDATE()) / 7)+1, @Friday) As a final example of what you can do with date/time calculations, let’s look at a slightly more complex requirement. Say that you’re visiting the Boston area and want to attend a meeting of the New England SQL Server Users Group. The group meets on the second Thursday of each month. Given an input date, how do you find the date of the next meeting? To find the second Thursday of the month of the input date, first find the fourteenth day of the month, and then use it to find “last Thursday.” The fourteenth has significance because if the first of the month is a Thursday, the second Thursday will be the eighth day of the month. On the other hand, if the first of the month is a Friday, the second Thursday will be the fourteenth. So for any given month, the “last Thursday” (as of and including the fourteenth) will be the second Thursday of the month. The following T-SQL code performs this calculation:
DECLARE @Thursday DATETIME SET @Thursday = ‘20000914’ DECLARE @FourteenthOfMonth DATETIME SET @FourteenthOfMonth = DATEADD(month, DATEDIFF(month, @Thursday, @InputDate), @Thursday) SELECT DATEADD(week, (DATEDIFF(day, @Thursday, @FourteenthOfMonth) / 7), @Thursday)Of course, this example doesn’t find the next meeting; it finds the meeting for the month of the input date. To find the next meeting, you’ll need to use a CASE expression and note the following about second Thursdays: If the second Thursday of a month falls on the eighth, ninth, or tenth, the next month’s second Thursday is five weeks away; otherwise, the next month’s second Thursday is four weeks away. To find the day of the month represented by a given date/time instance, use T-SQL’s DATEPART function, which takes the same date granularity inputs as DATEADD and DATEDIFF. Given an input date, the following T-SQL code combines all these techniques to find the next date for the New England SQL Server Users Group meeting:
DECLARE @Thursday DATETIME SET @Thursday = ‘20000914’ DECLARE @FourteenthOfMonth DATETIME SET @FourteenthOfMonth = DATEADD(month, DATEDIFF(month, @Thursday, @InputDate), @Thursday) DECLARE @SecondThursday DATETIME SET @SecondThursday = DATEADD(week, (DATEDIFF(day, @Thursday, @FourteenthOfMonth) / 7), @Thursday) SELECT CASE WHEN @InputDate <= @SecondThursday THEN @SecondThursday ELSE DATEADD( week, CASE WHEN DATEPART(day, @SecondThursday) <= 10 THEN 5 ELSE 4 END, @SecondThursday) ENDFinding complex dates such as the second Thursday of a month is not a very common requirement unless you’re writing a scheduling application. More common are requirements along the lines of “find all of today’s rows.” Combining the range techniques we first looked at with the date/time math seen here, you can easily design stored procedures that both efficiently and dynamically query for required time periods. For instance, the following T-SQL code always returns rows for the given day, no matter what the day is:
SELECT * FROM VariousDates WHERE ADate >= DATEADD(day, DATEDIFF(day, 0, GETDATE()), 0) AND ADate < DATEADD(day, DATEDIFF(day, 0, GETDATE())+1, 0)The first search argument uses the calculation to find “today” with the time part truncated, and the second search argument adds 1 to the difference in days between the current day and the reference date to return “tomorrow” with the time part truncated. The result is all rows between today at midnight, inclusive, and tomorrow at midnight, exclusive.As a final example of date/time calculations in T-SQL, consider a seemingly simple task: finding out how many years old you are as of today. You might think the following code would return the correct answer:
SELECT DATEDIFF(year, @YourBirthday, GETDATE()) Unfortunately, this result-depending on the current day-is wrong. Consider someone born on March 25, 1965. On March 25, 2006, that person’s 41st birthday should be celebrated. Yet, according to SQL Server, that person was already 41 on March 24, 2006:
SELECT DATEDIFF(year, ‘19650325’, ‘20060324’) As a matter of fact, according to SQL Server, this person was turned 41 on January 1, 2006. Happy New Year and Happy Birthday-combined, thanks to the magic of SQL Server! The discrepancy is due to the way SQL Server calculates date differences. Only the date/time component for which you’re calculating the difference is considered, and any components below are truncated. This feature makes the previous date/time truncation examples work but makes age calculations fail because when comparing years, days and months are not taken into account.To get around this problem, you must add a CASE expression that subtracts one year if the day and month of the current date is less than the day and month of the input date. The following T-SQL example, which accomplishes the primary goal and takes leap years into consideration, was created by SQL Server MVP Steve Kass:
SELECT DATEDIFF ( YEAR, @YourBirthday, GETDATE()) – CASE WHEN 100 * MONTH(GETDATE()) + DAY(GETDATE()) < 100 * MONTH(@YourBirthday) + DAY(@YourBirthday) THEN 1 ELSE 0 END This is just a brief introduction to some of the things you can do with SQL Server’s date/time data types. Understanding how the types work lets you easily and efficiently query and calculate to find the data you need.