Geek Logbook

Tech sea log book

Understanding ip-api Batch Limits and Effective Throughput

When integrating IP geolocation into a data pipeline, understanding rate limits and batching constraints is essential. This post analyzes the practical limits of the ip-api free tier and how to compute effective throughput.


1. Free Tier Constraints

The ip-api free plan imposes the following restrictions:

  • 45 HTTP requests per minute
  • Maximum 100 IPs per batch request
  • HTTP only (no HTTPS)
  • Non-commercial use only
  • No SLA guarantees

These limits apply globally per source IP address.


2. Maximum Theoretical Throughput

Maximum theoretical throughput = 45 requests/minute × 100 IPs/request = 4,500 IPs per minute

This assumes:

  • Every request contains exactly 100 IPs
  • No request failures
  • No network latency bottlenecks

3. Practical Throughput Considerations

Theoretical capacity does not equal sustained production capacity. In practice, throughput may be lower due to:

  • Network latency
  • Request serialization time
  • Retry logic
  • Rate-limit window timing
  • Temporary throttling

To avoid blocking:

  • Send full batches (100 IPs).
  • Space requests evenly (60 s ÷ 45 requests ≈ 1.33 seconds apart).
  • Monitor rate-limit headers.
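The three rules above can be combined into a minimal batch sender. This is a sketch using only the standard library; it assumes the documented batch endpoint (`POST http://ip-api.com/batch` with a JSON array of IPs) and no retry logic:

```python
import json
import time
import urllib.request

BATCH_URL = "http://ip-api.com/batch"  # free tier is HTTP only
BATCH_SIZE = 100                       # free-tier maximum per request
MIN_INTERVAL = 60 / 45                 # ~1.33 s keeps us under 45 req/min

def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def lookup_all(ips):
    """Query ip-api in full batches, spacing requests to respect the limit."""
    results = []
    for batch in chunked(ips, BATCH_SIZE):
        payload = json.dumps(batch).encode("utf-8")
        req = urllib.request.Request(
            BATCH_URL, data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            results.extend(json.load(resp))
        time.sleep(MIN_INTERVAL)  # even spacing avoids bursts at window edges
    return results
```

Spacing requests evenly, rather than firing 45 at once and sleeping, reduces the chance of tripping the limit at window boundaries.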

4. Monitoring Rate Limits via Headers

ip-api provides operational metadata in response headers:

Header          Meaning
------          -------
X-Rl            Remaining requests in the current minute window
X-Ttl           Time (seconds) until the rate limit resets
Date            Server timestamp
Content-Type    JSON payload format

Example:

X-Rl: 12
X-Ttl: 9

Interpretation:

  • 12 requests remain
  • Window resets in 9 seconds

If X-Rl reaches 0, further requests may result in temporary blocking.
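A header-aware throttle follows directly from this: before sending the next request, check X-Rl and, if the budget is exhausted, sleep until X-Ttl expires. A minimal sketch, assuming `headers` is any mapping of response headers:

```python
import time

def respect_rate_limit(headers) -> float:
    """Sleep until the window resets when no requests remain.

    `headers` is any mapping containing ip-api's X-Rl / X-Ttl values.
    Returns the number of seconds slept (0 if requests remain).
    """
    remaining = int(headers.get("X-Rl", 1))
    reset_in = int(headers.get("X-Ttl", 0))
    if remaining <= 0:
        pause = reset_in + 1  # one second of slack past the reset boundary
        time.sleep(pause)
        return pause
    return 0
```

With the example headers above (X-Rl: 12, X-Ttl: 9), the function returns immediately; only at X-Rl: 0 does it pause.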


5. Data Validation in Responses

Each IP lookup returns a JSON object containing:

  • query (the IP that was looked up)
  • status ("success" or "fail")
  • country, regionName, city
  • lat, lon
  • isp
  • as (autonomous system number and name)

There is no confidence score or precision metric. Latitude/longitude represent approximate ISP-level or city-level resolution.

Always validate:

status == "success"

before using location fields downstream.
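This check amounts to a simple filter over the batch response. The sample records below are illustrative, not real API output:

```python
def valid_locations(records):
    """Keep only successful lookups before using location fields downstream."""
    return [r for r in records if r.get("status") == "success"]

# Hypothetical batch response: one success, one failed lookup.
sample = [
    {"query": "8.8.8.8", "status": "success", "country": "United States"},
    {"query": "10.0.0.1", "status": "fail", "message": "private range"},
]
# valid_locations(sample) keeps only the first record
```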


6. Design Recommendations for Data Pipelines

For controlled ingestion (e.g., Spark, Airflow, or batch ETL):

  1. Implement request throttling.
  2. Cache results for at least X-Ttl seconds.
  3. Avoid duplicate IP lookups.
  4. Persist IP → geo mappings in a dimension table.
  5. Implement exponential backoff on HTTP 429 responses.
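Recommendations 2, 3, and 5 can be sketched together: a TTL cache that deduplicates lookups, plus an exponential backoff schedule for 429 responses. The cache here is in-memory for illustration; in a real pipeline the IP → geo mapping would live in a dimension table, and the TTL would likely be much longer than shown:

```python
import time

def backoff_delays(retries: int, base: float = 1.0, cap: float = 60.0):
    """Exponential backoff schedule (in seconds) for HTTP 429 responses."""
    return [min(base * (2 ** attempt), cap) for attempt in range(retries)]

class GeoCache:
    """In-memory IP -> geo cache; avoids duplicate lookups within the TTL."""

    def __init__(self, ttl_seconds: float = 3600):
        self.ttl = ttl_seconds
        self._store = {}

    def get(self, ip):
        entry = self._store.get(ip)
        if entry and time.monotonic() - entry[1] < self.ttl:
            return entry[0]
        return None  # missing or expired

    def put(self, ip, geo):
        self._store[ip] = (geo, time.monotonic())
```

A lookup loop would consult `GeoCache.get` first, query the API only on a miss, and walk the `backoff_delays` schedule whenever a 429 is returned.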

7. When to Upgrade

If your workload requires:

  • HTTPS
  • Commercial use
  • Higher sustained throughput
  • SLA guarantees

Then a paid plan is required.


Conclusion

Under the free plan, ip-api allows up to 4,500 IPs per minute in theory, but sustainable production throughput will be lower unless rate limiting, batching, and caching are implemented correctly.

For data engineering workflows, the limiting factor is not batch size, but request frequency control.