Geek Logbook

Tech sea log book

Daily Failure Reporting in DynamoDB Using Lambda, EventBridge Scheduler, and SES

Operational monitoring requires structured visibility into failures. If your processes write execution logs to DynamoDB and mark failed executions with status = FAILED, you can implement a deterministic daily reporting pipeline using AWS Lambda, EventBridge Scheduler, and Amazon SES. This article describes a single, production-grade implementation. Objective Architecture This solution is fully serverless and horizontally
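The pipeline this post describes can be sketched roughly as follows. The table name, attribute names, and email addresses below are assumptions for illustration, not the article's actual values; the handler is what EventBridge Scheduler would invoke daily:

```python
from datetime import date

def build_report(items):
    """Format failed-execution records into a plain-text email body."""
    if not items:
        return "No failed executions recorded today."
    lines = [f"Failed executions for {date.today().isoformat()}:"]
    for item in items:
        name = item.get("process_name", "unknown")   # assumed attribute name
        ts = item.get("timestamp", "n/a")            # assumed attribute name
        lines.append(f"- {name} at {ts}")
    return "\n".join(lines)

def lambda_handler(event, context):
    """Invoked daily by EventBridge Scheduler: query DynamoDB, email via SES."""
    import boto3  # imported here so build_report stays testable without AWS
    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("execution_logs")  # assumed table name
    resp = table.scan(
        FilterExpression="#s = :failed",
        ExpressionAttributeNames={"#s": "status"},  # status is a reserved word
        ExpressionAttributeValues={":failed": "FAILED"},
    )
    body = build_report(resp.get("Items", []))
    ses = boto3.client("ses")
    ses.send_email(
        Source="alerts@example.com",                      # assumed verified sender
        Destination={"ToAddresses": ["ops@example.com"]},  # assumed recipient
        Message={
            "Subject": {"Data": "Daily failure report"},
            "Body": {"Text": {"Data": body}},
        },
    )
```

A production version would paginate the scan (or, better, query a GSI keyed on status and date rather than scanning the whole table).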

Understanding ip-api Batch Limits and Effective Throughput

When integrating IP geolocation into a data pipeline, understanding rate limits and batching constraints is essential. This post analyzes the practical limits of the ip-api free tier and how to compute effective throughput. 1. Free Tier Constraints The ip-api free plan imposes the following restrictions: These limits apply globally per source IP address. 2. Maximum

From OLTP to OLAP: How Data Moves from 3NF to a Dimensional Data Warehouse

Modern data architectures typically separate operational systems from analytical systems. This separation is not accidental—it reflects fundamentally different workloads, data models, and optimization strategies. This article explains the conceptual transition: Operational Systems (OLTP) and 3rd Normal Form Transactional systems—CRM platforms, payment processors, ERPs, application databases—are designed for: These systems are usually modeled in Third Normal

How PostHog Uses ClickHouse for High-Performance Product Analytics

Modern product analytics platforms must process billions of events while still delivering low-latency queries for dashboards, funnels, and retention analysis. PostHog addresses this requirement by building its analytics engine on top of ClickHouse, a column-oriented OLAP database designed for large-scale analytical workloads. This article focuses exclusively on how ClickHouse is used within PostHog, from data

Google Bigtable vs. Amazon DynamoDB: Understanding the Differences

When choosing a NoSQL database for scalable, low-latency applications, two major options stand out: Google Cloud Bigtable and Amazon DynamoDB. While both are managed, highly available, and horizontally scalable, they are designed with different models and use cases in mind. 1. Data Model Google Bigtable: Amazon DynamoDB: 2. Query Capabilities Bigtable: DynamoDB: 3. Scalability and

Designing a Semantic Layer for Athena + Power BI

Modern data architectures benefit from a clear separation of layers: Ingestion, Staging, and Semantic (Presentation). When using Amazon Athena as the query engine and Power BI as the visualization tool, this layered approach enables scalability, governance, and cost control. 1. Ingestion (Raw Layer) Purpose: Store data exactly as it arrives from source systems, preserving fidelity.

Versioning Terraform Resources to Meet CIS Security Standards

Infrastructure as Code (IaC) has become a foundational practice for modern DevOps and cloud-native teams. Terraform, as one of the most widely adopted IaC tools, enables infrastructure automation, consistency, and repeatability. However, when working in regulated environments or organizations with strict compliance requirements, it’s not enough to just automate. You must also govern and secure

Choosing Between DynamoDB and Cassandra for a Crypto Exchange

When designing the backend of a crypto exchange, selecting the right database architecture is crucial. Two common NoSQL databases often considered for this type of application are Amazon DynamoDB and Apache Cassandra. Both offer horizontal scalability and high availability, but they shine in different use cases. This post explores their differences using concrete examples from

How to Implement MVC in CodeIgniter to Clean Up Your Views

When building web applications, it’s easy to end up with PHP logic mixed directly into your HTML views, especially in smaller projects. However, this can lead to messy, hard-to-maintain code. The Model-View-Controller (MVC) pattern is a great solution to separate concerns and make your application cleaner and more maintainable. In this post, we’ll explore how

Why Choose a Data Lake?

There are several reasons to choose a Data Lake as the solution for your data operations. The most important are: A common trait of traditional databases is that storage and processing are tied together, making it less flexible to scale storage without also scaling processing. On the other hand, the data