Extracting Year, Month, and Day from Dates in Azure Data Factory
In Azure Data Factory (ADF), working with dates is a common task, especially when dealing with data transformations and scheduling tasks. ADF allows you to handle dates in different formats, such as timestamps and strings. This blog post will guide you on how to extract the year, month, and day from two types of dates: a string-based date and a UTC timestamp. We’ll use two variables to demonstrate this process:
date_to_process_str: A string representing a date in the formatyyyyMMdd.date_to_process_utc: A string representing a UTC timestamp in ISO 8601 format.
Variables Definition
First, let’s define our variables:
{
"date_to_process_str": {
"type": "string",
"defaultValue": "20240801"
},
"date_to_process_utc": {
"type": "string",
"defaultValue": "2024-08-01T00:00:00Z"
}
}
Extracting Year, Month, and Day from a String Date (date_to_process_str)
The date_to_process_str variable is in the format yyyyMMdd. To extract the year, month, and day, we can use ADF’s substring function. Here’s how you can extract each part:
- Year: Extract the first four characters.textCopy code
@substring(variables('date_to_process_str'), 0, 4) - Month: Extract the fifth and sixth characters.textCopy code
@substring(variables('date_to_process_str'), 4, 2) - Day: Extract the seventh and eighth characters.textCopy code
@substring(variables('date_to_process_str'), 6, 2)
Extracting Year, Month, and Day from a UTC Timestamp (date_to_process_utc)
The date_to_process_utc variable is an ISO 8601 timestamp. ADF provides the formatDateTime function, which can be used to format and extract parts of a date. Here’s how to get the year, month, and day:
Putting It All Together
To summarize, here are the expressions you would use to extract year, month, and day from both types of date formats in Azure Data Factory:
- String Date (
date_to_process_str):- Year:
@substring(variables('date_to_process_str'), 0, 4) - Month:
@substring(variables('date_to_process_str'), 4, 2) - Day:
@substring(variables('date_to_process_str'), 6, 2)
- Year:
- UTC Timestamp (
date_to_process_utc):- Year:
@formatDateTime(variables('date_to_process_utc'), 'yyyy') - Month:
@int(formatDateTime(variables('date_to_process_utc'), 'MM')) - Day:
@int(formatDateTime(variables('date_to_process_utc'), 'dd'))
- Year:
Conclusion
Handling dates in Azure Data Factory is straightforward with the right functions. By using substring for string dates and formatDateTime combined with int for UTC timestamps, you can easily extract the year, month, and day in the format you need. Whether you’re orchestrating complex data pipelines or simply organizing your data, mastering these functions will make your life easier.