Using bq to create a partitioned sales order table


Generate Test Sales Data

To quickly generate test data for my BigQuery table, partitioned by sales_date, I exported one day of rows from a public dataset to CSV with the following command:

bq query \
  --use_legacy_sql=false \
  --format=csv \
  'SELECT
     order_id, product_id, user_id, inventory_item_id,
     sale_price, status, DATE(created_at) AS sales_date
   FROM `bigquery-public-data.thelook_ecommerce.order_items`
   WHERE DATE(created_at) = "2025-09-23"' \
  > sales-20250923.csv

There are some references here if you would like to know more about partitioned tables.

Creating the partitioned table

Then I created the partitioned table with the bq utility:

bq mk \
  --table \
  --time_partitioning_field=sales_date \
  thelooker.sales_order_items \
  order_id:INT64,product_id:INT64,user_id:INT64,inventory_item_id:INT64,sale_price:FLOAT,status:STRING,sales_date:DATE

Using SQL 


CREATE TABLE `secondary-471605.thelooker.sales_order_items`
(
  order_id INT64,
  product_id INT64,
  user_id INT64,
  inventory_item_id INT64,
  sale_price FLOAT64,
  status STRING,
  sales_date DATE
)
PARTITION BY sales_date
OPTIONS (
  description = "Partitioned sales order items table (daily partition by sales_date)"
);
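As an optional extra (not part of the original setup), bq mk can also set a partition expiration so old daily partitions age out automatically; the table name below is hypothetical:

```shell
# Same schema, but each partition is auto-deleted 90 days after its date
# (7776000 seconds = 90 * 86400)
bq mk \
  --table \
  --time_partitioning_field=sales_date \
  --time_partitioning_expiration=7776000 \
  thelooker.sales_order_items_ttl \
  order_id:INT64,product_id:INT64,user_id:INT64,inventory_item_id:INT64,sale_price:FLOAT,status:STRING,sales_date:DATE
```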

Creating the staging table

Using the bq utility

bq mk --table thelooker.staging_sales_order_items \
  order_id:INT64,product_id:INT64,user_id:INT64,inventory_item_id:INT64,sale_price:FLOAT,status:STRING,sales_date:DATE

Using SQL

CREATE TABLE `secondary-471605.thelooker.staging_sales_order_items`
(
  order_id INT64,
  product_id INT64,
  user_id INT64,
  inventory_item_id INT64,
  sale_price FLOAT64,
  status STRING,
  sales_date DATE
)
OPTIONS (
  description = "Temporary staging table for daily sales data load"
);
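Since the staging table is throwaway, one option (an assumption on my part, not from the original setup) is to give it an expiration so BigQuery drops it automatically; the `_tmp` name is hypothetical:

```sql
-- Hypothetical staging table that BigQuery deletes automatically after 1 day
CREATE TABLE `secondary-471605.thelooker.staging_sales_order_items_tmp`
(
  order_id INT64,
  product_id INT64,
  user_id INT64,
  inventory_item_id INT64,
  sale_price FLOAT64,
  status STRING,
  sales_date DATE
)
OPTIONS (
  expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
);
```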


Truncate the staging table first, just in case it still holds data from a previous load:

bq query --use_legacy_sql=false \
'TRUNCATE TABLE `thelooker.staging_sales_order_items`'


Next, load the data from Google Cloud Storage using the following command:

bq load \
  --source_format=CSV \
  --autodetect \
  --replace \
  thelooker.staging_sales_order_items \
  gs://bucket/thelooker/sales-data/20250922/sales-20250922.csv


Then move the data from staging to production. The MERGE below is an upsert keyed on order_id within the partition: matching rows are updated in place and new rows are inserted. It does not lock or rewrite other partitions, but note that it also does not delete rows that exist in production and are missing from staging.


MERGE `secondary-471605.thelooker.sales_order_items` T
USING `secondary-471605.thelooker.staging_sales_order_items` S
ON T.sales_date = S.sales_date AND T.order_id = S.order_id
WHEN MATCHED THEN
  UPDATE SET
    product_id = S.product_id,
    user_id = S.user_id,
    inventory_item_id = S.inventory_item_id,
    sale_price = S.sale_price,
    status = S.status
WHEN NOT MATCHED THEN
  INSERT (order_id, product_id, user_id, inventory_item_id, sale_price, status, sales_date)
  VALUES (order_id, product_id, user_id, inventory_item_id, sale_price, status, sales_date);


Alternatively, to replace the whole partition atomically (delete the existing rows for that date and insert the new ones from staging in a single statement), you can use:

MERGE `secondary-471605.thelooker.sales_order_items` T
USING (
  SELECT * FROM `secondary-471605.thelooker.staging_sales_order_items`
  WHERE sales_date = '2024-06-01'
) S
ON FALSE  -- Never match → always go to WHEN NOT MATCHED
WHEN NOT MATCHED THEN
  INSERT ROW
WHEN NOT MATCHED BY SOURCE AND T.sales_date = '2024-06-01' THEN
  DELETE;
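If the daily file is already in Cloud Storage, you can also skip the staging table entirely: loading into the partition decorator (table$YYYYMMDD) with --replace truncates and reloads just that one partition, leaving the others untouched. A sketch, assuming the same bucket layout as above:

```shell
# Replace only the 2025-09-22 partition; quoting the target stops the
# shell from expanding the $ as a variable. The header row is skipped
# because the schema comes from the existing table, not autodetect.
bq load \
  --source_format=CSV \
  --skip_leading_rows=1 \
  --replace \
  'thelooker.sales_order_items$20250922' \
  gs://bucket/thelooker/sales-data/20250922/sales-20250922.csv
```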

In Azure Synapse, you can do something like:

-- Azure Synapse / SQL Server style (conceptual)
ALTER TABLE staging_sales_order_items
SWITCH PARTITION $PARTITION.pf_date('2024-06-01')
TO sales_order_items PARTITION $PARTITION.pf_date('2024-06-01');

This is a metadata-only, instantaneous, zero-copy operation — it just swaps pointers under the hood. Super fast, no data movement, no locks.






To remove a partition
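bq rm accepts the same partition decorator, so a single day can be dropped without touching the rest of the table (the date below is just an example):

```shell
# Delete only the 2025-09-23 partition
bq rm --table 'thelooker.sales_order_items$20250923'

# Or drop the entire partitioned table
bq rm --table thelooker.sales_order_items
```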

To show table partitions

You can use the following query to list the partitions in your table:


SELECT
  partition_id,
  total_rows,
  last_modified_time
FROM `projectid.thelooker.INFORMATION_SCHEMA.PARTITIONS`
WHERE table_name = 'sales_order_items'
ORDER BY partition_id;
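To confirm that queries actually prune down to a single partition, a dry run reports the bytes a query would scan without running it; a sketch, assuming the table above:

```shell
# --dry_run prints the bytes the query would process; with a filter on
# sales_date it should be roughly one day's worth, not the whole table.
bq query \
  --use_legacy_sql=false \
  --dry_run \
  'SELECT COUNT(*)
   FROM `thelooker.sales_order_items`
   WHERE sales_date = "2025-09-23"'
```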



