Geek Logbook

Tech sea log book

Merging Data in PostgreSQL vs. MySQL: How to Handle Upserts

When working with databases, you often need to update existing records or insert new ones based on whether a match is found. In PostgreSQL, this is efficiently handled using the MERGE statement. However, MySQL does not support MERGE, so alternative approaches must be used. This post explores how to achieve the same functionality in MySQL.

Understanding MERGE in PostgreSQL

PostgreSQL (since version 15) supports the MERGE statement, which allows you to perform an upsert (update or insert) operation in a single command. Consider the following example:

MERGE INTO fact_measurements_v3 AS fact
USING stage_measurements AS stg
ON (stg.station_number = fact.station_number
    AND stg.timestamp_measured = fact.timestamp_measured
    AND stg.formula = fact.formula)
WHEN NOT MATCHED THEN
    INSERT (station_number, value, timestamp_measured, formula)
    VALUES (stg.station_number, stg.value, stg.timestamp_measured, stg.formula);

Explanation:

  • The ON clause defines the matching condition between the source (stage_measurements) and target (fact_measurements_v3).
  • If a match is found (WHEN MATCHED), an update can be performed (though not included in this query).
  • If no match is found (WHEN NOT MATCHED), a new row is inserted.

Achieving the Same in MySQL

MySQL does not support MERGE, but you can achieve similar behavior using INSERT ... ON DUPLICATE KEY UPDATE or REPLACE INTO.

Using INSERT ... ON DUPLICATE KEY UPDATE

This approach requires a unique constraint on the target table:

INSERT INTO fact_measurements_v3 (station_number, value, timestamp_measured, formula)
VALUES (stg.station_number, stg.value, stg.timestamp_measured, stg.formula)
ON DUPLICATE KEY UPDATE value = VALUES(value);

Considerations:

  • A unique key (e.g., on station_number, timestamp_measured, and formula) must be defined.
  • If a match is found, the value field is updated.
  • If no match is found, a new row is inserted.

Using REPLACE INTO (⚠️ Risky)

Alternatively, you can use REPLACE INTO, but this deletes the existing row and inserts a new one, which may not be desirable:

REPLACE INTO fact_measurements_v3 (station_number, value, timestamp_measured, formula)
VALUES (stg.station_number, stg.value, stg.timestamp_measured, stg.formula);

Downsides of REPLACE INTO:

  • If a match exists, the row is deleted and reinserted, which can cause unintended data loss.
  • Auto-increment values may be affected.

Key Differences Between PostgreSQL and MySQL

FeaturePostgreSQL (MERGE)MySQL (INSERT ... ON DUPLICATE KEY UPDATE)
Native MERGE support✅ Yes (since v15)❌ No
Requires unique key❌ No✅ Yes
Handles updates✅ Yes✅ Yes
Deletes before insert❌ No⚠️ Yes (if using REPLACE INTO)

Conclusion

While PostgreSQL’s MERGE is a more flexible and expressive way to perform upserts, MySQL users must rely on INSERT ... ON DUPLICATE KEY UPDATE or REPLACE INTO. If you are working with MySQL, always ensure that you have a unique key when using upsert strategies to avoid unintended data loss.

Do you have experience handling upserts in MySQL or PostgreSQL? Share your insights in the comments! 🚀

Tags: