Merging Data in PostgreSQL vs. MySQL: How to Handle Upserts
When working with databases, you often need to update existing records or insert new ones based on whether a match is found. In PostgreSQL, this is efficiently handled using the MERGE
statement. However, MySQL does not support MERGE
, so alternative approaches must be used. This post explores how to achieve the same functionality in MySQL.
Understanding MERGE
in PostgreSQL
PostgreSQL (since version 15) supports the MERGE
statement, which allows you to perform an upsert (update or insert) operation in a single command. Consider the following example:
MERGE INTO fact_measurements_v3 AS fact
USING stage_measurements AS stg
ON (stg.station_number = fact.station_number
AND stg.timestamp_measured = fact.timestamp_measured
AND stg.formula = fact.formula)
WHEN NOT MATCHED THEN
INSERT (station_number, value, timestamp_measured, formula)
VALUES (stg.station_number, stg.value, stg.timestamp_measured, stg.formula);
Explanation:
- The
ON
clause defines the matching condition between the source (stage_measurements
) and target (fact_measurements_v3
). - If a match is found (
WHEN MATCHED
), an update can be performed (though not included in this query). - If no match is found (
WHEN NOT MATCHED
), a new row is inserted.
Achieving the Same in MySQL
MySQL does not support MERGE
, but you can achieve similar behavior using INSERT ... ON DUPLICATE KEY UPDATE
or REPLACE INTO
.
Using INSERT ... ON DUPLICATE KEY UPDATE
This approach requires a unique constraint on the target table:
INSERT INTO fact_measurements_v3 (station_number, value, timestamp_measured, formula)
VALUES (stg.station_number, stg.value, stg.timestamp_measured, stg.formula)
ON DUPLICATE KEY UPDATE value = VALUES(value);
Considerations:
- A unique key (e.g., on
station_number
,timestamp_measured
, andformula
) must be defined. - If a match is found, the
value
field is updated. - If no match is found, a new row is inserted.
Using REPLACE INTO
(⚠️ Risky)
Alternatively, you can use REPLACE INTO
, but this deletes the existing row and inserts a new one, which may not be desirable:
REPLACE INTO fact_measurements_v3 (station_number, value, timestamp_measured, formula)
VALUES (stg.station_number, stg.value, stg.timestamp_measured, stg.formula);
Downsides of REPLACE INTO
:
- If a match exists, the row is deleted and reinserted, which can cause unintended data loss.
- Auto-increment values may be affected.
Key Differences Between PostgreSQL and MySQL
Feature | PostgreSQL (MERGE ) | MySQL (INSERT ... ON DUPLICATE KEY UPDATE ) |
---|---|---|
Native MERGE support | ✅ Yes (since v15) | ❌ No |
Requires unique key | ❌ No | ✅ Yes |
Handles updates | ✅ Yes | ✅ Yes |
Deletes before insert | ❌ No | ⚠️ Yes (if using REPLACE INTO ) |
Conclusion
While PostgreSQL’s MERGE
is a more flexible and expressive way to perform upserts, MySQL users must rely on INSERT ... ON DUPLICATE KEY UPDATE
or REPLACE INTO
. If you are working with MySQL, always ensure that you have a unique key when using upsert strategies to avoid unintended data loss.
Do you have experience handling upserts in MySQL or PostgreSQL? Share your insights in the comments! 🚀