Generate Test Sales Data
To quickly get test data for a BigQuery table partitioned by sales_date, I exported one day of rows from the public thelook_ecommerce dataset with the following command
bq query \
--use_legacy_sql=false \
--format=csv \
'SELECT order_id, product_id, user_id, inventory_item_id, sale_price, status, DATE(created_at) AS sales_date
FROM `bigquery-public-data.thelook_ecommerce.order_items`
WHERE DATE(created_at) = "2025-09-23"' \
> sales-20250923.csv
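If you want to script this export for arbitrary days, the date filter and the output file name can both be derived from a single variable. A minimal sketch, assuming you then substitute $DT into the WHERE clause and redirect to $OUT:

```shell
#!/bin/sh
# Minimal sketch: derive the export file name from one date variable.
# DT would feed the WHERE clause; OUT is the redirect target.
DT="2025-09-23"
OUT="sales-$(echo "$DT" | tr -d -).csv"   # strip dashes from the date
echo "$OUT"   # sales-20250923.csv
```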
There are some references here if you would like to know more about partitioned tables.
Creating the partitioned table
Then I created the partitioned table
bq mk \
--table \
--time_partitioning_field=sales_date \
thelooker.sales_order_items \
order_id:INT64,product_id:INT64,user_id:INT64,inventory_item_id:INT64,sale_price:FLOAT,status:STRING,sales_date:DATE
Using SQL
CREATE TABLE `secondary-471605.thelooker.sales_order_items`
(
order_id INT64,
product_id INT64,
user_id INT64,
inventory_item_id INT64,
sale_price FLOAT64,
status STRING,
sales_date DATE
)
PARTITION BY sales_date
OPTIONS (
description = "Partitioned sales order items table (daily partition by sales_date)"
);
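If daily partitions only need to be kept for a fixed window, you can also set a partition expiration on the table. A hedged example; the 90-day retention is an arbitrary value, not something the pipeline above requires:

```sql
-- Optional: expire partitions automatically after 90 days (arbitrary value)
ALTER TABLE `secondary-471605.thelooker.sales_order_items`
SET OPTIONS (partition_expiration_days = 90);
```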
Creating the staging table
Using the bq utility
bq mk \
--table \
thelooker.staging_sales_order_items \
order_id:INT64,product_id:INT64,user_id:INT64,inventory_item_id:INT64,sale_price:FLOAT,status:STRING,sales_date:DATE
Using SQL
CREATE TABLE `secondary-471605.thelooker.staging_sales_order_items`
(
order_id INT64,
product_id INT64,
user_id INT64,
inventory_item_id INT64,
sale_price FLOAT64,
status STRING,
sales_date DATE
)
OPTIONS (
description = "Temporary staging table for daily sales data load"
);
Truncate it first, just in case
bq query --use_legacy_sql=false \
'TRUNCATE TABLE `thelooker.staging_sales_order_items`'
Next, load the data from Google Cloud Storage with the following command. Note that the CSV exported by bq query --format=csv includes a header row, so skip it on load; --autodetect is unnecessary since the staging table already has an explicit schema.
bq load \
--source_format=CSV \
--skip_leading_rows=1 \
--replace \
thelooker.staging_sales_order_items \
gs://bucket/thelooker/sales-data/20250922/sales-20250922.csv
Then move the data from staging to production (an enterprise-safe swap). A MERGE statement runs atomically: readers never see a half-updated table, and other partitions are untouched. The first variant is an upsert keyed on sales_date and order_id; it updates existing rows for that date and inserts the new ones from staging.
MERGE `secondary-471605.thelooker.sales_order_items` T
USING `secondary-471605.thelooker.staging_sales_order_items` S
ON T.sales_date = S.sales_date AND T.order_id = S.order_id
WHEN MATCHED THEN
UPDATE SET
product_id = S.product_id,
user_id = S.user_id,
inventory_item_id = S.inventory_item_id,
sale_price = S.sale_price,
status = S.status
WHEN NOT MATCHED THEN
INSERT (order_id, product_id, user_id, inventory_item_id, sale_price, status, sales_date)
VALUES (S.order_id, S.product_id, S.user_id, S.inventory_item_id, S.sale_price, S.status, S.sales_date);
Or, to fully replace a single partition (insert everything from staging, delete target rows no longer present there):
MERGE `secondary-471605.thelooker.sales_order_items` T
USING (
SELECT * FROM `secondary-471605.thelooker.staging_sales_order_items`
WHERE sales_date = '2024-06-01'
) S
ON FALSE -- Never match → always go to WHEN NOT MATCHED
WHEN NOT MATCHED THEN
INSERT ROW
WHEN NOT MATCHED BY SOURCE AND T.sales_date = '2024-06-01' THEN
DELETE;
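If you prefer the delete-and-insert to be explicit, a BigQuery multi-statement transaction is an equivalent sketch using the same tables as above; the whole block commits or rolls back as a unit:

```sql
-- Explicit replace of one partition inside a multi-statement transaction
BEGIN TRANSACTION;

DELETE FROM `secondary-471605.thelooker.sales_order_items`
WHERE sales_date = '2024-06-01';

INSERT INTO `secondary-471605.thelooker.sales_order_items`
SELECT *
FROM `secondary-471605.thelooker.staging_sales_order_items`
WHERE sales_date = '2024-06-01';

COMMIT TRANSACTION;
```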
In Azure Synapse / SQL Server, the equivalent is a partition switch, something like:
-- Conceptual; pf_date is a hypothetical partition function
ALTER TABLE staging_sales_order_items
SWITCH PARTITION $PARTITION.pf_date('2024-06-01')
TO sales_order_items
PARTITION $PARTITION.pf_date('2024-06-01');
This is a metadata-only, instantaneous, zero-copy operation — it just swaps pointers under the hood. Super fast, no data movement, no locks.
To remove the partitioned table
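One way to do it, assuming the table name from earlier (DROP works the same for partitioned and unpartitioned tables):

```sql
DROP TABLE IF EXISTS `secondary-471605.thelooker.sales_order_items`;
```

The bq equivalent is bq rm -t thelooker.sales_order_items.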
To show table partitions
You can use the following to list the partitions in your table.
SELECT
partition_id,
total_rows,
last_modified_time
FROM `projectid.thelooker.INFORMATION_SCHEMA.PARTITIONS`
WHERE table_name = 'sales_order_items'
ORDER BY partition_id;
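To inspect the contents of a single partition, just filter on the partitioning column; BigQuery prunes the scan to that one partition:

```sql
-- Scans only the 2025-09-22 partition thanks to the filter on sales_date
SELECT COUNT(*) AS rows_loaded
FROM `projectid.thelooker.sales_order_items`
WHERE sales_date = '2025-09-22';
```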