Geek Logbook

Tech sea log book

Generating a Calendar Table in Power Query (M Language)

When working with Power BI or other Power Query-supported tools, having a well-structured calendar table is essential for time-based analysis. In this blog post, we will walk through an M Language function that generates a comprehensive calendar table. Why Use a Calendar Table? A calendar table provides essential time-based fields such as year, quarter, month,

Handling Schema Changes in a Data Warehouse

When building and maintaining a Data Warehouse (DWH), handling schema changes without breaking existing processes is a crucial challenge for data engineers. As new requirements emerge, we often need to add new fields, modify existing structures, or adjust data models while ensuring smooth operation for reporting and analytics. This blog post explores best practices and

Delta Lake vs. Traditional Data Lakes: Key Differences and Vendor Options

Introduction As data-driven organizations scale their analytics and machine learning workloads, the limitations of traditional data lakes become more apparent. Delta Lake is an open-source storage layer that enhances data lakes with ACID transactions, schema enforcement, and time travel, making them more reliable for big data workloads. In this post, we will explore how Delta

Why OLTP Systems Don’t Retain Historical Changes

Online Transaction Processing (OLTP) systems are designed for high-speed transactions and efficient data management. However, one of their characteristics is that they do not retain historical changes by default. In this post, we will explore why this happens and provide an example to illustrate the concept. OLTP Systems: Focused on Current Data OLTP databases are

Understanding Slowly Changing Dimensions (SCD) in Data Warehousing

When dealing with data warehouses, handling changes in dimension data over time is crucial. Unlike operational databases where updates are straightforward, data warehouses require preserving historical data for accurate analysis. This is where Slowly Changing Dimensions (SCDs) come into play. What are Slowly Changing Dimensions (SCD)? SCDs are a technique in data warehousing used to

Modes and Examples of KPIs in Data Analysis Expressions (DAX)

Last Year Comparison When analyzing sales performance, it is often useful to compare the current year’s sales with the same period in the previous year. To do this, we create several calculated measures: DAX expressions to calculate similar measures for Units instead of Revenue: Year-to-Date (YTD) Comparison Year-to-date (YTD) measures track performance from the beginning

Understanding the Relationship Between Database Replication and the CAP Theorem

Introduction Database replication is a fundamental strategy in distributed systems that ensures data is duplicated across multiple nodes. However, when designing a replicated database, one must consider the CAP theorem, which defines the fundamental trade-offs in distributed computing. In this post, we will explore how the CAP theorem applies to database replication and what trade-offs

Understanding Surrogate Keys in Databases

When designing relational databases, one crucial decision is how to uniquely identify each record in a table. This is where surrogate keys come into play. Unlike natural keys, which derive from existing attributes in the data, surrogate keys are system-generated identifiers, typically using numeric sequences or UUIDs. What is a Surrogate Key? A surrogate key