Geek Logbook

Tech sea log book

Estimating the Cost of an AWS Glue Workflow

When working with AWS Glue, one of the most common questions data engineers ask is: How much will this job cost me? If you have a workflow that runs for 13 minutes, understanding the cost model of AWS Glue helps you avoid surprises on your AWS bill.

How AWS Glue Pricing Works

AWS Glue pricing

AWS EventBridge Rules vs EventBridge Scheduler: Which One Should You Use?

In the AWS ecosystem, there are two main ways to schedule and automate tasks: EventBridge Rules (scheduled rules) and the newer EventBridge Scheduler, which introduces Schedule Groups. While both can trigger actions at defined times, their design, scalability, and flexibility differ significantly. Choosing the right option depends on your workload requirements.

1. What Are EventBridge

Running Production Servers on AWS: EC2 vs RDS Cost Breakdown

When planning to run production workloads in the cloud, cost is one of the most important considerations. In this post, we will explore the monthly expenses of running two application servers and a database server on AWS, and compare two deployment approaches: EC2-only vs EC2 + RDS.

Infrastructure Requirements

Our baseline infrastructure looks like this:

Modern Table Formats: Iceberg, Delta Lake, and Hudi

Data lakes made it possible to store raw data at scale, but they lacked the reliability and governance of data warehouses. Files could be dropped into storage (S3, HDFS, MinIO), but analysts struggled with schema changes, updates, and deletes. To solve these issues, the community created modern table formats that brought ACID transactions, schema evolution,