Geek Logbook

Tech sea log book

Extracting Year, Month, and Day from Dates in Azure Data Factory

In Azure Data Factory (ADF), working with dates is a common task, especially when dealing with data transformations and scheduling tasks. ADF allows you to handle dates in different formats, such as timestamps and strings. This blog post will guide you on how to extract the year, month, and day from two types of dates: a string-based date and a UTC timestamp. We’ll use two variables to demonstrate this process:

  1. date_to_process_str: A string representing a date in the format yyyyMMdd.
  2. date_to_process_utc: A string representing a UTC timestamp in ISO 8601 format.

Variables Definition

First, let’s define our variables:

{
    "date_to_process_str": {
        "type": "string",
        "defaultValue": "20240801"
    },
    "date_to_process_utc": {
        "type": "string",
        "defaultValue": "2024-08-01T00:00:00Z"
    }
}

Extracting Year, Month, and Day from a String Date (date_to_process_str)

The date_to_process_str variable is in the format yyyyMMdd. To extract the year, month, and day, we can use ADF’s substring function. Here’s how you can extract each part:

  1. Year: Extract the first four characters.textCopy code@substring(variables('date_to_process_str'), 0, 4)
  2. Month: Extract the fifth and sixth characters.textCopy code@substring(variables('date_to_process_str'), 4, 2)
  3. Day: Extract the seventh and eighth characters.textCopy code@substring(variables('date_to_process_str'), 6, 2)

Extracting Year, Month, and Day from a UTC Timestamp (date_to_process_utc)

The date_to_process_utc variable is an ISO 8601 timestamp. ADF provides the formatDateTime function, which can be used to format and extract parts of a date. Here’s how to get the year, month, and day:

Putting It All Together

To summarize, here are the expressions you would use to extract year, month, and day from both types of date formats in Azure Data Factory:

  • String Date (date_to_process_str):
    • Year: @substring(variables('date_to_process_str'), 0, 4)
    • Month: @substring(variables('date_to_process_str'), 4, 2)
    • Day: @substring(variables('date_to_process_str'), 6, 2)
  • UTC Timestamp (date_to_process_utc):
    • Year: @formatDateTime(variables('date_to_process_utc'), 'yyyy')
    • Month: @int(formatDateTime(variables('date_to_process_utc'), 'MM'))
    • Day: @int(formatDateTime(variables('date_to_process_utc'), 'dd'))

Conclusion

Handling dates in Azure Data Factory is straightforward with the right functions. By using substring for string dates and formatDateTime combined with int for UTC timestamps, you can easily extract the year, month, and day in the format you need. Whether you’re orchestrating complex data pipelines or simply organizing your data, mastering these functions will make your life easier.

Tags: