Geek Logbook

Tech sea log book

What Does an Exploratory Data Analysis (EDA) Evaluate?

An Exploratory Data Analysis (EDA) is a critical step in the data analysis process that focuses on evaluating and examining data to uncover its main characteristics. It is performed before delving deeper into analysis or building predictive models. The primary purpose of an EDA is to understand the dataset, identify issues, and gain insights that …

Exploring Free Resources to Learn AWS and Azure Cloud Platforms

Cloud computing is an essential skill in today’s tech landscape. Among the major players, AWS and Azure stand out as leading cloud platforms, offering a wealth of free resources to help individuals learn and experiment. This blog post outlines some of the most valuable free tools, learning paths, and tips for getting started with AWS …

Adding Custom Columns to Your Date Table in Power BI

Introduction: A Date Table is an integral part of building robust and insightful Power BI reports. While a basic Date Table allows for time-based filtering and analysis, custom columns can add even more depth and flexibility. This blog post will guide you through adding custom columns to your Date Table using DAX. 1. Why Add …

Grouping Data in PySpark with Aliases for Aggregated Columns

When working with large datasets in PySpark, grouping data and applying aggregations is a common task. In this post, we’ll explore how to group data by a specific column and use aliases for the resulting aggregated columns to improve readability and clarity. Problem Statement: Consider a sample dataset with the columns IdCompra, Fecha, IdProducto, Cantidad, Precio, IdProveedor …

Handling Offset-Naive and Offset-Aware Datetimes in Python

When working with datetime objects in Python, you may encounter the error "TypeError: can't compare offset-naive and offset-aware datetimes". This error occurs when comparing two datetime objects where one contains timezone information (offset-aware) and the other does not (offset-naive). To resolve this, you must ensure both datetime objects are either offset-aware or offset-naive before making the comparison. Making a Datetime Offset-Aware in …
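
The mismatch described in this excerpt can be reproduced in a few lines. What follows is a minimal sketch, not the post's own code: the variable names and the sample date are made up for illustration, and it assumes the naive value is meant to represent UTC.

```python
from datetime import datetime, timezone

# Hypothetical sample values: same wall-clock time, one without and one with tzinfo.
naive = datetime(2024, 1, 1, 12, 0)                       # offset-naive
aware = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)  # offset-aware

# Ordering comparisons between the two kinds raise TypeError.
try:
    naive < aware
except TypeError as exc:
    error_message = str(exc)

# Option 1: attach a timezone to the naive value.
# Assumption: the naive value was recorded in UTC.
naive_as_utc = naive.replace(tzinfo=timezone.utc)

# Option 2: drop the timezone from the aware value instead.
aware_as_naive = aware.replace(tzinfo=None)

print(error_message)
print(naive_as_utc == aware, aware_as_naive == naive)
```

Note that `replace(tzinfo=...)` only relabels the value and does not shift the clock time, so it is appropriate only when the timezone of the naive value is already known.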

Automating SQL Script Execution with Cron

In this blog post, we’ll explore how to automate the execution of SQL scripts using cron, a powerful scheduling tool available on Unix-based systems. This approach is ideal for database administrators and developers who need to run SQL scripts at specific intervals without manual intervention. Overview: Cron jobs allow you to schedule tasks to run …

Counting Word Frequency in a SQL Column

Sometimes, you may need to analyze text data stored in a database, such as counting the frequency of words in a text column. This blog post demonstrates how to achieve this in SQL using a practical example. Problem Overview: Let’s assume you have a table named feedback with a column comentarios that contains text data. …
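
One possible way to count words in the comentarios column is sketched below. It is not necessarily the approach the post takes: it assumes SQLite (run here through Python's built-in sqlite3 module), splits on single spaces with a recursive CTE, and uses made-up sample rows. The names QUERY, tokens, and word_counts are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feedback (comentarios TEXT)")
# Hypothetical sample rows, not taken from the post.
conn.executemany(
    "INSERT INTO feedback (comentarios) VALUES (?)",
    [("great product",), ("great service",), ("slow service",)],
)

# Recursive CTE: peel one space-delimited word off the front of `rest`
# on each step until nothing is left, then group and count the words.
QUERY = """
WITH RECURSIVE tokens(word, rest) AS (
    SELECT '', lower(comentarios) || ' ' FROM feedback
    UNION ALL
    SELECT substr(rest, 1, instr(rest, ' ') - 1),
           substr(rest, instr(rest, ' ') + 1)
    FROM tokens
    WHERE rest <> ''
)
SELECT word, COUNT(*) AS freq
FROM tokens
WHERE word <> ''
GROUP BY word
ORDER BY freq DESC, word
"""

word_counts = conn.execute(QUERY).fetchall()
conn.close()
print(word_counts)
```

This sketch ignores punctuation and treats only the space character as a separator. Other database engines offer their own string-splitting functions, so the query would look different there.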

Are Indexes a Good Strategy for Analytical Databases?

Indexes are a well-known optimization technique in database management, often associated with improving query performance. However, whether they are a good strategy for analytical databases depends on the specific use case and database architecture. Let’s delve into the topic to understand where indexes shine and where they may fall short in analytical workloads. Indexes: Designed …

Orchestrating SQL Files: Efficiently Managing Multiple Scripts

When working on database projects, you often find yourself managing and executing multiple SQL files. Whether these files are for creating schemas, seeding data, or running migrations, orchestrating them efficiently can save you time and reduce errors. In this post, we’ll explore different ways to orchestrate SQL files, catering to various levels of complexity and …
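
As a rough illustration of the simplest kind of orchestration, the sketch below runs every .sql file in a directory in filename order. It is an assumption-laden example rather than the post's method: it uses an in-memory SQLite database, and the helper run_sql_files, the file names, and the users table are all hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


def run_sql_files(conn, directory):
    """Execute every .sql file in `directory` in filename order; return the names run."""
    executed = []
    for path in sorted(Path(directory).glob("*.sql")):
        conn.executescript(path.read_text(encoding="utf-8"))
        executed.append(path.name)
    return executed


with tempfile.TemporaryDirectory() as tmp:
    # Numeric prefixes make the intended order explicit: schema first, then seed data.
    Path(tmp, "01_schema.sql").write_text(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);", encoding="utf-8"
    )
    Path(tmp, "02_seed.sql").write_text(
        "INSERT INTO users (name) VALUES ('ana'), ('luis');", encoding="utf-8"
    )

    conn = sqlite3.connect(":memory:")
    executed = run_sql_files(conn, tmp)
    user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    conn.close()

print(executed, user_count)
```

Relying on sorted filenames keeps the ordering visible in the directory listing itself. Larger projects may prefer an explicit manifest or a dedicated migration tool.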