Geek Logbook

Tech sea log book

Adding Custom Columns to Your Date Table in Power BI

Introduction A Date Table is an integral part of building robust and insightful Power BI reports. While a basic Date Table allows for time-based filtering and analysis, custom columns can add even more depth and flexibility. This blog post will guide you through adding custom columns to your Date Table using DAX. 1. Why Add

Grouping Data in PySpark with Aliases for Aggregated Columns

When working with large datasets in PySpark, grouping data and applying aggregations is a common task. In this post, we’ll explore how to group data by a specific column and use aliases for the resulting aggregated columns to improve readability and clarity. Problem Statement Consider the following sample dataset: IdCompra Fecha IdProducto Cantidad Precio IdProveedor

Handling Offset-Naive and Offset-Aware Datetimes in Python

When working with datetime objects in Python, you may encounter the error: This error occurs when comparing two datetime objects where one contains timezone information (offset-aware) and the other does not (offset-naive). To resolve this, you must ensure both datetime objects are either offset-aware or offset-naive before making the comparison. Making a Datetime Offset-Aware in

Automating SQL Script Execution with Cron

In this blog post, we’ll explore how to automate the execution of SQL scripts using cron, a powerful scheduling tool available on Unix-based systems. This approach is ideal for database administrators and developers who need to run SQL scripts at specific intervals without manual intervention. Overview Cron jobs allow you to schedule tasks to run

Counting Word Frequency in a SQL Column

Sometimes, you may need to analyze text data stored in a database, such as counting the frequency of words in a text column. This blog post demonstrates how to achieve this in SQL using a practical example. Problem Overview Let’s assume you have a table named feedback with a column comentarios that contains text data.

Are Indexes a Good Strategy for Analytical Databases?

Indexes are a well-known optimization technique in database management, often associated with improving query performance. However, whether they are a good strategy for analytical databases depends on the specific use case and database architecture. Let’s delve into the topic to understand where indexes shine and where they may fall short in analytical workloads. Indexes: Designed

Orchestrating SQL Files: Efficiently Managing Multiple Scripts

When working on database projects, you often find yourself managing and executing multiple SQL files. Whether these files are for creating schemas, seeding data, or running migrations, orchestrating them efficiently can save you time and reduce errors. In this post, we’ll explore different ways to orchestrate SQL files, catering to various levels of complexity and

Understanding the Evolution of Data Warehousing: From Codd’s Relational Model to Modern Data Warehouses

Data management has undergone significant transformations since the advent of the relational model by Edgar F. Codd. Today, data warehouses stand as a cornerstone of modern data analytics. This blog post explores the differences between Codd’s relational model and data warehouses, highlighting their unique roles and applications in data management. The Relational Model: A Brief

Troubleshooting Import Errors in Python: A Case Study

Python’s modular design allows developers to break their code into smaller, reusable components. However, import errors can often disrupt the flow, especially in complex projects. In this post, we’ll discuss a real-world example of resolving an import error while working on a Python project. The Scenario The project’s directory structure is as follows: The file