Geek Logbook

By - Geek Logbook
Posted on 2025-07-19
Posted in Architectures

Versioning Terraform Resources to Meet CIS Security Standards

Infrastructure as Code (IaC) has become a foundational practice for modern DevOps and cloud-native teams. Terraform, as one of the most widely adopted IaC tools, enables infrastructure automation, consistency, and repeatability. However, when working in regulated environments or organizations with strict compliance requirements, it’s not enough to just automate. You must also govern and secure

By - Geek Logbook
Posted on 2025-07-13
Posted in Programming

Handling Python datetime Objects in Amazon DynamoDB

When developing data pipelines or applications that store time-based records in Amazon DynamoDB, developers frequently encounter serialization errors when working with Python’s datetime objects. Understanding how to properly store temporal data in DynamoDB is essential to avoid runtime issues and to enable meaningful queries. The Problem DynamoDB, as a NoSQL database, supports a limited set
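One common way to sidestep the serialization errors described in this excerpt is to convert datetime values to ISO 8601 strings before writing an item. The sketch below is a minimal illustration, not the post's own method: the helper names `to_dynamodb_item` and `from_iso` are hypothetical, and it assumes naive datetimes are meant as UTC.

```python
from datetime import datetime, timezone


def to_dynamodb_item(record: dict) -> dict:
    """Return a copy of record with datetime values replaced by ISO 8601 strings."""
    item = {}
    for key, value in record.items():
        if isinstance(value, datetime):
            # Assumption of this sketch: naive datetimes are treated as UTC.
            if value.tzinfo is None:
                value = value.replace(tzinfo=timezone.utc)
            # Normalize to UTC and use a fixed-width format so the strings
            # compare in chronological order.
            item[key] = value.astimezone(timezone.utc).isoformat(timespec="microseconds")
        else:
            item[key] = value
    return item


def from_iso(text: str) -> datetime:
    """Parse a string produced by to_dynamodb_item back into an aware datetime."""
    return datetime.fromisoformat(text)


record = {"id": "order-1", "created_at": datetime(2025, 7, 13, 12, 30)}
item = to_dynamodb_item(record)
print(item["created_at"])  # 2025-07-13T12:30:00.000000+00:00
```

Storing every timestamp in UTC with the same width keeps string comparison consistent with time order, which matters if the attribute is used in range conditions.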

By - Geek Logbook
Posted on 2025-07-13 (updated 2025-07-19)
Posted in Architectures

Choosing Between DynamoDB and Cassandra for a Crypto Exchange

When designing the backend of a crypto exchange, selecting the right database architecture is crucial. Two common NoSQL databases often considered for this type of application are Amazon DynamoDB and Apache Cassandra. Both offer horizontal scalability and high availability, but they shine in different use cases. This post explores their differences using concrete examples from

By - Geek Logbook
Posted on 2025-07-12
Posted in Data

AWS Glue Workflow vs Apache Airflow: A Professional Comparison

While both serve the common purpose of managing and automating data workflows, they differ significantly in architecture, flexibility, integration capabilities, and operational control. This article offers a comprehensive and professional comparison of AWS Glue Workflow and Apache Airflow to help data engineers, architects, and decision-makers choose the most suitable tool for their use case. 1.

By - Geek Logbook
Posted on 2025-07-09
Posted in Data

Reducing AWS Costs: How to Temporarily Stop an Aurora Serverless v2 Cluster

When managing cloud infrastructure, minimizing costs without compromising data integrity is a continuous priority. Amazon Aurora Serverless v2 offers scalability and high availability, but unlike traditional RDS instances, it introduces nuances in how compute resources are billed. One common question arises: Can an Aurora Serverless v2 database be stopped to save costs? Understanding Aurora Serverless

By - Geek Logbook
Posted on 2025-07-08
Posted in Notes

The Enduring Relevance of Peter Chen’s Entity-Relationship Model

In the landscape of data modeling, few contributions have had the long-lasting impact of Peter Chen’s Entity-Relationship (E-R) Model, introduced in 1976. More than four decades later, it remains a foundational framework for conceptualizing and designing data systems—bridging the gap between abstract business understanding and concrete database implementation. A Unified View of Data Chen’s model

By - Geek Logbook
Posted on 2025-07-06
Posted in Notes

How Hadoop Made Specialized Storage Hardware Obsolete

In the early 2000s, enterprise data processing was dominated by high-end hardware. Organizations relied heavily on centralized storage systems such as SAN (Storage Area Networks) and NAS (Network Attached Storage), typically connected to symmetric multiprocessing (SMP) servers or high-performance computing (HPC) clusters. These environments were expensive to scale, difficult to manage, and designed to avoid

By - Geek Logbook
Posted on 2025-07-06
Posted in Notes

EMR vs AWS Glue: Choosing the Right Data Processing Tool on AWS

When working with big data on AWS, two commonly used services for data processing are Amazon EMR and AWS Glue. Although both support scalable data transformation and analytics, they differ significantly in architecture, control, use cases, and cost models. Choosing the right tool depends on your specific workload, performance needs, and operational preferences. In this

By - Geek Logbook
Posted on 2025-07-05
Posted in Programming

Why You Should Use the -out Option with terraform plan

When working with Terraform, a common workflow involves running terraform plan followed by terraform apply. However, you may have come across the following warning: “You didn’t use the -out option to save this plan, so Terraform can’t guarantee to take exactly these actions if you run ‘terraform apply’ now.” This message is more than a

By - Geek Logbook
Posted on 2025-07-05
Posted in Notes

When Should You Use Iceberg with Athena? Partitioning Strategies and Best Practices

As data lakes grow in size and complexity, tools like Amazon Athena combined with table formats like Apache Iceberg become essential for scalability, data governance, and performance. In this post, we’ll explore: Athena + S3: How far does the classic approach go? The typical pattern when querying data in S3 using Athena is: This approach
