Geek Logbook

Tech sea log book

Window Functions vs JOIN in Spark: A Physical Plan Perspective

When solving analytical queries in Spark SQL, there are often multiple correct formulations. However, they do not produce equivalent execution plans. This article compares two approaches to the same problem: “Find the second highest salary per department, but only in departments with at least two employees.” We analyze which approach is more efficient and why,

Resolving the Node.js Error: Cannot find module jsonwebtoken

When developing backend services with Node.js, especially APIs that implement authentication, it is common to rely on JSON Web Tokens (JWT). One frequent runtime error encountered in this context is "Cannot find module jsonwebtoken". This article explains the root cause of this error and provides a precise, production-oriented solution. Problem Description: The error indicates that Node.js cannot resolve the

Extracting and Managing Access Tokens in Postman

When working with APIs that use OAuth 2.0 or token-based authentication, a common requirement is to extract an access_token from a successful authentication request and reuse it in subsequent API calls. Postman provides a built-in scripting environment that makes this straightforward and repeatable. This article explains how to capture an access token from a POST

Running Scheduled GitHub Actions Locally for Safer Debugging

Overview: When working with scheduled automation jobs in GitHub Actions, it is common to face a simple but critical question: can this workflow be executed locally before pushing to production? The short answer is yes, and in many cases local execution is functionally identical to what GitHub Actions performs in the cloud. This article

Querying JSONB in PostgreSQL Efficiently

In modern applications, it is common to store semi-structured data in JSON format inside a relational database like PostgreSQL. However, to analyze this data properly, you need a way to transform it into a tabular structure that can be queried with standard SQL. In this article, we will demonstrate a real-world example of reading a

Understanding the Strategy Design Pattern

In software design, maintaining flexibility and scalability is crucial. One of the most effective ways to achieve these qualities is by leveraging design patterns. Among the behavioral design patterns, the Strategy Pattern stands out as a powerful tool for selecting algorithms at runtime. What is the Strategy Pattern? The Strategy Pattern allows you

Can You Perform Data Grouping Directly with the yFinance API?

When working with financial data, efficient aggregation and analysis are essential for generating meaningful insights. A common question among developers and data analysts is whether the yFinance Python library, a popular tool for retrieving historical stock market data, allows grouping or aggregation of data directly via its API. The short answer is: no, yFinance does

Handling Python datetime Objects in Amazon DynamoDB

When developing data pipelines or applications that store time-based records in Amazon DynamoDB, developers frequently encounter serialization errors when working with Python’s datetime objects. Understanding how to properly store temporal data in DynamoDB is essential to avoid runtime issues and to enable meaningful queries. The Problem: DynamoDB, as a NoSQL database, supports a limited set