# TPC-DS Query Reference The [TPC-DS](https://www.tpc.org/tpcds/) benchmark defines 99 decision-support queries covering three retail channels (store, catalog, web) and associated returns. This page documents all 99 queries as implemented in Bexhoma, including the TPC-DS reference SQL and any DBMS-specific dialect variants. > **Reference version:** All SQL templates and parameter distributions on this page are taken from > **TPC-DS v4.0.0** (`DSGen-software-code-4.0.0/query_templates/` and `query_variants/`). ## Configuration File Queries are stored in [`experiments/tpcds/queries-tpcds.config`](../experiments/tpcds/queries-tpcds.config). The format is defined by the [`dbmsbenchmarker`](https://github.com/perdelt/dbmsbenchmarker) Python package. Each entry is a Python dict: | Key | Description | |-----|-------------| | `'title'` | Label used in result tables | | `'query'` | SQL string or list of SQL strings (multi-statement queries) | | `'DBMS'` | Optional dict: DBMS name → dialect-specific SQL | | `'parameter'` | Dict of runtime substitution parameters | | `'active'` | Whether this query runs | | `'numWarmup'` / `'numCooldown'` | Warm-up / cool-down runs | | `'numRun'` | Number of measured runs | | `'timer'` | Timer configuration | ## Placeholder Convention | Notation | Used in | Meaning | |----------|---------|---------| | `[PARAM]` | TPC-DS template files (`.tpl`) | Substitution placeholder | | `{PARAM}` | Bexhoma config SQL | Runtime parameter, substituted by `dbmsbenchmarker` | | `[_LIMITA] select [_LIMITB] … [_LIMITC]` | TPC-DS template | Row-limit markers | | `select … limit N` | Bexhoma config SQL | Concrete row limit (typically 100) | ## Parameter Types | Config type | Meaning | |-------------|---------| | `"integer"` with `"range": [min, max]` | Uniform random integer in [min, max] | | `"list"` with `"range": [v1, v2, …]` | One value chosen uniformly from the list | | `"integer"` or `"list"` with `"size": N` | N independent draws, substituted as `{PARAM1}`, `{PARAM2}`, …, `{PARAMN}` | ## Multi-Statement Queries Q14, Q23, Q24, and Q39 each consist of two SQL statements executed in sequence. In the config, `'query'` is a Python list `[sql_a, sql_b]`. Both statements share the same parameter substitution. ## DBMS Dialect Overrides Sixteen queries carry a `'DBMS'` key with dialect-specific SQL for one or more systems: | Query | DBMS overrides | Reason | |-------|---------------|--------| | Q5 | MariaDB | Nested CTE workaround (MariaDB <10.4 cannot have CTEs defined inside CTE bodies) | | Q14a+b | MariaDB | Nested CTE workaround + `INTERSECT` substitute | | Q18 | MariaDB, MonetDB, PostgreSQL | `GROUP BY ROLLUP` syntax differences | | Q22 | MariaDB, MonetDB, PostgreSQL | `GROUP BY ROLLUP` syntax differences | | Q27 | MonetDB, PostgreSQL | `GROUP BY ROLLUP` syntax differences | | Q36 | MonetDB, PostgreSQL | `GROUP BY ROLLUP` syntax differences | | Q51 | MySQL | `FULL OUTER JOIN` not supported; rewritten with `UNION ALL` | | Q54 | MySQL | `CAST(… AS INT)` not supported; replaced by `CAST(… AS SIGNED)` | | Q66 | MonetDB, PostgreSQL | String concatenation `\|\|` → `CONCAT()` | | Q67 | MonetDB, PostgreSQL | `GROUP BY ROLLUP` syntax differences | | Q70 | MonetDB, PostgreSQL, Exasol, MemSQL | `GROUP BY ROLLUP` syntax differences | | Q77 | MariaDB, MonetDB, PostgreSQL | Nested CTE workaround + `GROUP BY ROLLUP` | | Q80 | MariaDB, MonetDB, PostgreSQL, MemSQL | Nested CTE workaround + `GROUP BY ROLLUP` | | Q84 | MonetDB, PostgreSQL | String concatenation `\|\|` → `CONCAT()` | | Q86 | MariaDB, MonetDB, PostgreSQL, Exasol, MemSQL | `GROUP BY ROLLUP` syntax differences | | Q97 | MySQL | `FULL OUTER JOIN` not supported; rewritten with `UNION ALL + GROUP BY` | --- ## Deviations from the TPC-DS Reference Templates Not every query in bexhoma is a verbatim translation of the TPC-DS template. Differences fall into two categories: - **Syntactic** — different SQL dialect or notation, but the query returns the same result set on the same data. - **Semantic** — the query logic or filter differs, so a different result set is expected compared to a conforming TPC-DS implementation. ### Semantic Differences (different result set expected) | Query | What bexhoma does | What the template specifies | Effect | |-------|-------------------|-----------------------------|--------| | **Q2** | `d_year = {YEAR}` and `d_year = {YEAR}+1` now substituted correctly (previously hardcoded 1998/1999) | Draws `[YEAR]` and `[YEAR]+1` from a uniform distribution (1998–2001) | **Fixed** — all year pairs now exercised | | **Q5** | Date window corrected to `interval '30' day` (previously `interval '14' day`) | `+ 30 days` | **Fixed** | | **Q13** | `d_year = {YEAR}` now substituted (previously hardcoded 2001); `YEAR` parameter added (1998–2002) | Draws `[YEAR]` from uniform distribution | **Fixed** | | **Q30** | `STATE` expanded to all 43 US state codes (previously only `["TN"]`) | Draws `[STATE]` from `fips_county` distribution (any US state) | **Fixed** | | **Q49** | `d_moy` filter removed from all three channel subqueries; `MONTH` parameter removed | No month filter; the template aggregates an entire year | **Fixed** | | **Q98** | Start date uses `{DAY}` parameter (integer 1–28) instead of hardcoded `01`; `DAY` parameter added | Start date drawn from a sales-weighted date distribution within Jan–Jul of `[YEAR]` | **Approximated** — uniform day distribution used; sales-weighted draw not expressible in dbmsbenchmarker | ### Syntactic Differences (same result set, dialect adaptation) | Difference | Queries affected | Notes | |------------|-----------------|-------| | `GROUP BY … WITH ROLLUP` (MySQL/MariaDB default) vs. `GROUP BY ROLLUP(…)` (standard SQL) | Q5, Q18, Q22, Q27, Q36, Q67, Q70, Q77, Q80, Q86 | Per-DBMS overrides in the `'DBMS'` key handle each target system | | `CONCAT('a', ',', 'b')` vs. standard `'a' \|\| ',' \|\| 'b'` | Q5, Q66, Q84 | MySQL/MariaDB default; `\|\|` used in PostgreSQL/MonetDB/Exasol overrides | | `interval 'N' day` vs. `+ N days` date arithmetic | Q5, Q21, Q32, Q72, Q77, Q80, Q92 | Both express the same time offset; dialect normalised for portability | | `ORDER BY col IS NOT NULL, col` (explicit NULL-last sort) vs. plain `ORDER BY col` | Q5, Q65, Q72 | Added to make NULL ordering deterministic across DBMS that differ in default NULL position | | `FULL OUTER JOIN` rewritten as `UNION ALL + GROUP BY` for MySQL | Q51, Q97 | MySQL/MariaDB override; the rewrite is logically equivalent | | `CAST(… AS SIGNED)` vs. `CAST(… AS INT)` | Q54 (MySQL override) | MySQL does not support `INT` as a cast target | | `CAST(inv_after AS FLOAT) / inv_before` vs. plain division | Q21 | Forces floating-point division; semantically equivalent on non-integer inputs | | Nested CTE workaround (outer CTE wraps everything) | Q5 (MariaDB), Q14a+b (MariaDB), Q77 (MariaDB), Q80 (MariaDB) | MariaDB <10.4 cannot reference a CTE from inside another CTE body | --- ## Q1 — Store Returns: High Return Rate Customers Finds customers whose total returns at a given store exceed 120% of the average return for that store, filtered to a specific state and year. | Parameter | Type | Range / Values | |-----------|------|---------------| | `AGG_FIELD` | list | `SR_RETURN_AMT`, `SR_FEE`, `SR_REFUNDED_CASH`, `SR_RETURN_AMT_INC_TAX`, `SR_REVERSED_CHARGE`, `SR_STORE_CREDIT`, `SR_RETURN_TAX` | | `STATE` | list | US state abbreviations (51 values, AK–WY) | | `YEAR` | integer | 1998–2002 |
TPC-DS Reference SQL (query1.tpl) ```sql with customer_total_return as (select sr_customer_sk as ctr_customer_sk ,sr_store_sk as ctr_store_sk ,sum([AGG_FIELD]) as ctr_total_return from store_returns ,date_dim where sr_returned_date_sk = d_date_sk and d_year =[YEAR] group by sr_customer_sk ,sr_store_sk) [_LIMITA] select [_LIMITB] c_customer_id from customer_total_return ctr1 ,store ,customer where ctr1.ctr_total_return > (select avg(ctr_total_return)*1.2 from customer_total_return ctr2 where ctr1.ctr_store_sk = ctr2.ctr_store_sk) and s_store_sk = ctr1.ctr_store_sk and s_state = '[STATE]' and ctr1.ctr_customer_sk = c_customer_sk order by c_customer_id [_LIMITC]; ```
--- ## Q2 — Weekly Sales Ratio Year-over-Year Computes the ratio of sales per day-of-week between two consecutive years across web and catalog channels. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2001 |
TPC-DS Reference SQL (query2.tpl) ```sql with wscs as (select sold_date_sk ,sales_price from (select ws_sold_date_sk sold_date_sk ,ws_ext_sales_price sales_price from web_sales union all select cs_sold_date_sk sold_date_sk ,cs_ext_sales_price sales_price from catalog_sales)), wswscs as (select d_week_seq, sum(case when (d_day_name='Sunday') then sales_price else null end) sun_sales, sum(case when (d_day_name='Monday') then sales_price else null end) mon_sales, sum(case when (d_day_name='Tuesday') then sales_price else null end) tue_sales, sum(case when (d_day_name='Wednesday') then sales_price else null end) wed_sales, sum(case when (d_day_name='Thursday') then sales_price else null end) thu_sales, sum(case when (d_day_name='Friday') then sales_price else null end) fri_sales, sum(case when (d_day_name='Saturday') then sales_price else null end) sat_sales from wscs ,date_dim where d_date_sk = sold_date_sk group by d_week_seq) select d_week_seq1 ,round(sun_sales1/sun_sales2,2) ,round(mon_sales1/mon_sales2,2) ,round(tue_sales1/tue_sales2,2) ,round(wed_sales1/wed_sales2,2) ,round(thu_sales1/thu_sales2,2) ,round(fri_sales1/fri_sales2,2) ,round(sat_sales1/sat_sales2,2) from (select wswscs.d_week_seq d_week_seq1 ,sun_sales sun_sales1 ,mon_sales mon_sales1 ,tue_sales tue_sales1 ,wed_sales wed_sales1 ,thu_sales thu_sales1 ,fri_sales fri_sales1 ,sat_sales sat_sales1 from wswscs,date_dim where date_dim.d_week_seq = wswscs.d_week_seq and d_year = [YEAR]) y, (select wswscs.d_week_seq d_week_seq2 ,sun_sales sun_sales2 ,mon_sales mon_sales2 ,tue_sales tue_sales2 ,wed_sales wed_sales2 ,thu_sales thu_sales2 ,fri_sales fri_sales2 ,sat_sales sat_sales2 from wswscs ,date_dim where date_dim.d_week_seq = wswscs.d_week_seq and d_year = [YEAR]+1) z where d_week_seq1=d_week_seq2-53 order by d_week_seq1; ```
> **Bexhoma note:** The template uses `[YEAR]` and `[YEAR]+1` for consecutive years. Bexhoma hardcodes > `d_year = 1998` and `d_year = 1999` in the subqueries instead of using the `{YEAR}` parameter for > the year-plus-one case. --- ## Q3 — Brand Sales by Month Finds the top-selling brands in the store channel for a given manufacturer and month, with selectable aggregation. | Parameter | Type | Range / Values | |-----------|------|---------------| | `MONTH` | integer | 11–12 | | `MANUFACT` | integer | 1–1000 | | `AGGC` | list | `ss_ext_sales_price`, `ss_sales_price`, `ss_ext_discount_amt`, `ss_net_profit` |
TPC-DS Reference SQL (query3.tpl) ```sql [_LIMITA] select [_LIMITB] dt.d_year ,item.i_brand_id brand_id ,item.i_brand brand ,sum([AGGC]) sum_agg from date_dim dt ,store_sales ,item where dt.d_date_sk = store_sales.ss_sold_date_sk and store_sales.ss_item_sk = item.i_item_sk and item.i_manufact_id = [MANUFACT] and dt.d_moy=[MONTH] group by dt.d_year ,item.i_brand ,item.i_brand_id order by dt.d_year ,sum_agg desc ,brand_id [_LIMITC]; ```
--- ## Q4 — Multi-Channel Customer Year-over-Year Growth Finds customers whose spending growth in catalog and web channels exceeds their store spending growth between two consecutive years. Uses a selectable customer attribute in the output. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2001 | | `SELECTONE` | list | `t_s_secyear.customer_preferred_cust_flag`, `t_s_secyear.customer_birth_country`, `t_s_secyear.customer_login`, `t_s_secyear.customer_email_address` |
TPC-DS Reference SQL (query4.tpl) ```sql with year_total as ( select c_customer_id customer_id ,c_first_name customer_first_name ,c_last_name customer_last_name ,c_preferred_cust_flag customer_preferred_cust_flag ,c_birth_country customer_birth_country ,c_login customer_login ,c_email_address customer_email_address ,d_year dyear ,sum(((ss_ext_list_price-ss_ext_wholesale_cost-ss_ext_discount_amt)+ss_ext_sales_price)/2) year_total ,'s' sale_type from customer, store_sales, date_dim where c_customer_sk = ss_customer_sk and ss_sold_date_sk = d_date_sk group by c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address, d_year union all select c_customer_id customer_id ,c_first_name customer_first_name ,c_last_name customer_last_name ,c_preferred_cust_flag customer_preferred_cust_flag ,c_birth_country customer_birth_country ,c_login customer_login ,c_email_address customer_email_address ,d_year dyear ,sum((((cs_ext_list_price-cs_ext_wholesale_cost-cs_ext_discount_amt)+cs_ext_sales_price)/2)) year_total ,'c' sale_type from customer, catalog_sales, date_dim where c_customer_sk = cs_bill_customer_sk and cs_sold_date_sk = d_date_sk group by c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address, d_year union all select c_customer_id customer_id ,c_first_name customer_first_name ,c_last_name customer_last_name ,c_preferred_cust_flag customer_preferred_cust_flag ,c_birth_country customer_birth_country ,c_login customer_login ,c_email_address customer_email_address ,d_year dyear ,sum((((ws_ext_list_price-ws_ext_wholesale_cost-ws_ext_discount_amt)+ws_ext_sales_price)/2)) year_total ,'w' sale_type from customer, web_sales, date_dim where c_customer_sk = ws_bill_customer_sk and ws_sold_date_sk = d_date_sk group by c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address, d_year) [_LIMITA] select [_LIMITB] t_s_secyear.customer_id ,t_s_secyear.customer_first_name ,t_s_secyear.customer_last_name ,[SELECTONE] from year_total t_s_firstyear, year_total t_s_secyear, year_total t_c_firstyear, year_total t_c_secyear, year_total t_w_firstyear, year_total t_w_secyear where t_s_secyear.customer_id = t_s_firstyear.customer_id and t_s_firstyear.customer_id = t_c_secyear.customer_id and t_s_firstyear.customer_id = t_c_firstyear.customer_id and t_s_firstyear.customer_id = t_w_firstyear.customer_id and t_s_firstyear.customer_id = t_w_secyear.customer_id and t_s_firstyear.sale_type = 's' and t_c_firstyear.sale_type = 'c' and t_w_firstyear.sale_type = 'w' and t_s_secyear.sale_type = 's' and t_c_secyear.sale_type = 'c' and t_w_secyear.sale_type = 'w' and t_s_firstyear.dyear = [YEAR] and t_s_secyear.dyear = [YEAR]+1 and t_c_firstyear.dyear = [YEAR] and t_c_secyear.dyear = [YEAR]+1 and t_w_firstyear.dyear = [YEAR] and t_w_secyear.dyear = [YEAR]+1 and t_s_firstyear.year_total > 0 and t_c_firstyear.year_total > 0 and t_w_firstyear.year_total > 0 and case when t_c_firstyear.year_total > 0 then t_c_secyear.year_total / t_c_firstyear.year_total else null end > case when t_s_firstyear.year_total > 0 then t_s_secyear.year_total / t_s_firstyear.year_total else null end and case when t_c_firstyear.year_total > 0 then t_c_secyear.year_total / t_c_firstyear.year_total else null end > case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total / t_w_firstyear.year_total else null end order by t_s_secyear.customer_id, t_s_secyear.customer_first_name, t_s_secyear.customer_last_name, [SELECTONE] [_LIMITC]; ```
--- ## Q5 — Net Revenue by Sales Channel with Rollup Reports sales, returns, and net profit for store, catalog, and web channels with subtotals using `ROLLUP`. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DAY` | integer | 1–30 | | `MONTH` | integer | 8–8 | | `YEAR` | integer | 1998–2002 |
TPC-DS Reference SQL (query5.tpl) ```sql with ssr as ( select s_store_id, sum(sales_price) as sales, sum(profit) as profit, sum(return_amt) as returns, sum(net_loss) as profit_loss from ( select ss_store_sk as store_sk, ss_sold_date_sk as date_sk, ss_ext_sales_price as sales_price, ss_net_profit as profit, cast(0 as decimal(7,2)) as return_amt, cast(0 as decimal(7,2)) as net_loss from store_sales union all select sr_store_sk as store_sk, sr_returned_date_sk as date_sk, cast(0 as decimal(7,2)) as sales_price, cast(0 as decimal(7,2)) as profit, sr_return_amt as return_amt, sr_net_loss as net_loss from store_returns) salesreturns, date_dim, store where date_sk = d_date_sk and d_date between cast('[SALES_DATE]' as date) and (cast('[SALES_DATE]' as date) + 14 days) and store_sk = s_store_sk group by s_store_id), csr as (... catalog channel analog ...), wsr as (... web channel analog ...) [_LIMITA] select [_LIMITB] channel, id, sum(sales) as sales, sum(returns) as returns, sum(profit) as profit from ( select 'store channel' as channel, 'store' || s_store_id as id, sales, returns, (profit - profit_loss) as profit from ssr union all select 'catalog channel' as channel, 'catalog_page' || cp_catalog_page_id as id, sales, returns, (profit - profit_loss) as profit from csr union all select 'web channel' as channel, 'web_site' || web_site_id as id, sales, returns, (profit - profit_loss) as profit from wsr ) x group by rollup (channel, id) order by channel, id [_LIMITC]; ```
> **Bexhoma note:** The template uses `SALES_DATE` (a generated date in month `{MONTH}`/year `{YEAR}`); > Bexhoma exposes `{YEAR}`, `{MONTH}`, and `{DAY}` as separate integer parameters and constructs > `cast('{YEAR}-{MONTH}-{DAY}' as date)` in the SQL. The template uses standard SQL `|| ` for > string concatenation and `GROUP BY ROLLUP(channel, id)` syntax; the bexhoma default SQL uses > MySQL-compatible `GROUP BY channel, id WITH ROLLUP` and `CONCAT(...)` instead.
MariaDB dialect (nested CTE workaround) MariaDB older versions cannot reference one CTE body from another within the same `WITH` clause. Bexhoma wraps `ssr`, `csr`, and `wsr` in a nested CTE structure using `with total as (...)`.
--- ## Q6 — State-Level Customer Purchase Behavior Counts customers per state who bought items priced above 120% of their category average in a specific year-month. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `MONTH` | integer | 1–7 |
TPC-DS Reference SQL (query6.tpl) ```sql [_LIMITA] select [_LIMITB] a.ca_state state, count(*) cnt from customer_address a, customer c, store_sales s, date_dim d, item i where a.ca_address_sk = c.c_current_addr_sk and c.c_customer_sk = s.ss_customer_sk and s.ss_sold_date_sk = d.d_date_sk and s.ss_item_sk = i.i_item_sk and d.d_month_seq = (select distinct (d_month_seq) from date_dim where d_year = [YEAR] and d_moy = [MONTH]) and i.i_current_price > 1.2 * (select avg(j.i_current_price) from item j where j.i_category = i.i_category) group by a.ca_state having count(*) >= 10 order by cnt, a.ca_state [_LIMITC]; ```
--- ## Q7 — Store Sales by Item and Demographics Averages quantity, list price, coupon amount, and sales price per item in the store channel, filtered by customer gender, marital status, and education status. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `MS` | list | M, S, D, W, U | | `GEN` | list | M, F | | `ES` | list | Primary, Secondary, College, 2 yr Degree, 4 yr Degree, Advanced Degree, Unknown |
TPC-DS Reference SQL (query7.tpl) ```sql [_LIMITA] select [_LIMITB] i_item_id, avg(ss_quantity) agg1, avg(ss_list_price) agg2, avg(ss_coupon_amt) agg3, avg(ss_sales_price) agg4 from store_sales, customer_demographics, date_dim, item, promotion where ss_sold_date_sk = d_date_sk and ss_item_sk = i_item_sk and ss_cdemo_sk = cd_demo_sk and ss_promo_sk = p_promo_sk and cd_gender = '[GEN]' and cd_marital_status = '[MS]' and cd_education_status = '[ES]' and (p_channel_email = 'N' or p_channel_event = 'N') and d_year = [YEAR] group by i_item_id order by i_item_id [_LIMITC]; ```
--- ## Q8 — Net Profit per Store by ZIP Code Finds net profit per store for customers whose ZIP codes appear in a large IN-list of 400 codes (intersected with preferred customers), filtered by quarter and year. | Parameter | Type | Range / Values | |-----------|------|---------------| | `ZIP` | integer (×400) | 10000–99999 — 400 independent draws, substituted as `{ZIP.1}`…`{ZIP.400}` | | `YEAR` | integer | 1998–2002 | | `QOY` | integer | 1–2 |
TPC-DS Reference SQL (query8.tpl) ```sql [_LIMITA] select [_LIMITB] s_store_name, sum(ss_net_profit) from store_sales, date_dim, store, (select ca_zip from ( SELECT substr(ca_zip,1,5) ca_zip FROM customer_address WHERE substr(ca_zip,1,5) IN ('[ZIP.1]','[ZIP.2]',...,'[ZIP.400]') intersect select ca_zip from ( SELECT substr(ca_zip,1,5) ca_zip, count(*) cnt FROM customer_address, customer WHERE ca_address_sk = c_current_addr_sk and c_preferred_cust_flag='Y' group by ca_zip having count(*) > 10) A1) A2) V1 where ss_store_sk = s_store_sk and ss_sold_date_sk = d_date_sk and d_qoy = [QOY] and d_year = [YEAR] and (substr(s_zip,1,2) = substr(V1.ca_zip,1,2)) group by s_store_name order by s_store_name [_LIMITC]; ```
> **Bexhoma note:** The 400-element IN-list uses `{ZIP.1}` through `{ZIP.400}`, generated from a > single `ZIP` parameter with `'size': 400`. --- ## Q9 — Conditional Bucket Aggregation Computes conditional averages over store sales quantity buckets (1–20, 21–40, 41–60, 61–80, 81–100), choosing between two aggregation columns based on a count threshold. | Parameter | Type | Range / Values | |-----------|------|---------------| | `RC` | integer (×5) | 1–567080 — 5 draws, substituted as `{RC.1}`…`{RC.5}` | | `AGGCTHEN` | list | `ss_ext_discount_amt`, `ss_ext_sales_price`, `ss_ext_list_price`, `ss_ext_tax` | | `AGGCELSE` | list | `ss_net_paid`, `ss_net_paid_inc_tax`, `ss_net_profit` |
TPC-DS Reference SQL (query9.tpl) ```sql select case when (select count(*) from store_sales where ss_quantity between 1 and 20) > [RC.1] then (select avg([AGGCTHEN]) from store_sales where ss_quantity between 1 and 20) else (select avg([AGGCELSE]) from store_sales where ss_quantity between 1 and 20) end bucket1, case when (select count(*) from store_sales where ss_quantity between 21 and 40) > [RC.2] then (select avg([AGGCTHEN]) from store_sales where ss_quantity between 21 and 40) else (select avg([AGGCELSE]) from store_sales where ss_quantity between 21 and 40) end bucket2, case when (select count(*) from store_sales where ss_quantity between 41 and 60) > [RC.3] then (select avg([AGGCTHEN]) from store_sales where ss_quantity between 41 and 60) else (select avg([AGGCELSE]) from store_sales where ss_quantity between 41 and 60) end bucket3, case when (select count(*) from store_sales where ss_quantity between 61 and 80) > [RC.4] then (select avg([AGGCTHEN]) from store_sales where ss_quantity between 61 and 80) else (select avg([AGGCELSE]) from store_sales where ss_quantity between 61 and 80) end bucket4, case when (select count(*) from store_sales where ss_quantity between 81 and 100) > [RC.5] then (select avg([AGGCTHEN]) from store_sales where ss_quantity between 81 and 100) else (select avg([AGGCELSE]) from store_sales where ss_quantity between 81 and 100) end bucket5 from reason where r_reason_sk = 1; ```
--- ## Q10 — Customer Demographics Multi-Channel Analysis Reports demographic breakdown of customers who shopped in store sales and also in web or catalog sales, filtered by a 10-county set over a 3-month window. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1999–2002 | | `MONTH` | integer | 1–4 | | `COUNTY` | list (×10) | 10 random draws from US county list (1870+ values) — substituted as `{COUNTY.1}`…`{COUNTY.10}` |
TPC-DS Reference SQL (query10.tpl) ```sql [_LIMITA] select [_LIMITB] cd_gender, cd_marital_status, cd_education_status, count(*) cnt1, cd_purchase_estimate, count(*) cnt2, cd_credit_rating, count(*) cnt3, cd_dep_count, count(*) cnt4, cd_dep_employed_count, count(*) cnt5, cd_dep_college_count, count(*) cnt6 from customer c, customer_address ca, customer_demographics where c.c_current_addr_sk = ca.ca_address_sk and ca_county in ('[COUNTY.1]','[COUNTY.2]','[COUNTY.3]','[COUNTY.4]','[COUNTY.5]', '[COUNTY.6]','[COUNTY.7]','[COUNTY.8]','[COUNTY.9]','[COUNTY.10]') and cd_demo_sk = c.c_current_cdemo_sk and exists (select * from store_sales, date_dim where c.c_customer_sk = ss_customer_sk and ss_sold_date_sk = d_date_sk and d_year = [YEAR] and d_moy between [MONTH] and [MONTH]+3) and (exists (select * from web_sales, date_dim where c.c_customer_sk = ws_bill_customer_sk and ws_sold_date_sk = d_date_sk and d_year = [YEAR] and d_moy between [MONTH] and [MONTH]+3) or exists (select * from catalog_sales, date_dim where c.c_customer_sk = cs_ship_customer_sk and cs_sold_date_sk = d_date_sk and d_year = [YEAR] and d_moy between [MONTH] and [MONTH]+3)) group by cd_gender, cd_marital_status, cd_education_status, cd_purchase_estimate, cd_credit_rating, cd_dep_count, cd_dep_employed_count, cd_dep_college_count order by cd_gender, cd_marital_status, cd_education_status, cd_purchase_estimate, cd_credit_rating, cd_dep_count, cd_dep_employed_count, cd_dep_college_count [_LIMITC]; ```
--- ## Q11 — Web vs. Store Year-over-Year Customer Growth Finds customers whose web spending growth exceeds their store spending growth between two consecutive years. Uses list price minus discount as the spending measure (vs. Q4 which uses a more complex average). | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2001 | | `SELECTONE` | list | `t_s_secyear.customer_preferred_cust_flag`, `t_s_secyear.customer_birth_country`, `t_s_secyear.customer_login`, `t_s_secyear.customer_email_address` |
TPC-DS Reference SQL (query11.tpl) ```sql with year_total as ( select c_customer_id customer_id, c_first_name customer_first_name, c_last_name customer_last_name, c_preferred_cust_flag customer_preferred_cust_flag, c_birth_country customer_birth_country, c_login customer_login, c_email_address customer_email_address, d_year dyear, sum(ss_ext_list_price-ss_ext_discount_amt) year_total, 's' sale_type from customer, store_sales, date_dim where c_customer_sk = ss_customer_sk and ss_sold_date_sk = d_date_sk group by c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address, d_year union all select c_customer_id customer_id, c_first_name customer_first_name, c_last_name customer_last_name, c_preferred_cust_flag customer_preferred_cust_flag, c_birth_country customer_birth_country, c_login customer_login, c_email_address customer_email_address, d_year dyear, sum(ws_ext_list_price-ws_ext_discount_amt) year_total, 'w' sale_type from customer, web_sales, date_dim where c_customer_sk = ws_bill_customer_sk and ws_sold_date_sk = d_date_sk group by c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address, d_year) [_LIMITA] select [_LIMITB] t_s_secyear.customer_id, t_s_secyear.customer_first_name, t_s_secyear.customer_last_name, [SELECTONE] from year_total t_s_firstyear, year_total t_s_secyear, year_total t_w_firstyear, year_total t_w_secyear where t_s_secyear.customer_id = t_s_firstyear.customer_id and t_s_firstyear.customer_id = t_w_secyear.customer_id and t_s_firstyear.customer_id = t_w_firstyear.customer_id and t_s_firstyear.sale_type = 's' and t_w_firstyear.sale_type = 'w' and t_s_secyear.sale_type = 's' and t_w_secyear.sale_type = 'w' and t_s_firstyear.dyear = [YEAR] and t_s_secyear.dyear = [YEAR]+1 and t_w_firstyear.dyear = [YEAR] and t_w_secyear.dyear = [YEAR]+1 and t_s_firstyear.year_total > 0 and t_w_firstyear.year_total > 0 and case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total / t_w_firstyear.year_total else 0.0 end > case when t_s_firstyear.year_total > 0 then t_s_secyear.year_total / t_s_firstyear.year_total else 0.0 end order by t_s_secyear.customer_id, t_s_secyear.customer_first_name, t_s_secyear.customer_last_name, [SELECTONE] [_LIMITC]; ```
--- ## Q12 — Web Sales Revenue Ratio by Item Class Computes revenue contribution (as a percentage of class total) per item in up to 3 selected categories for a 30-day sales window. | Parameter | Type | Range / Values | |-----------|------|---------------| | `CATEGORY` | list (×3) | 3 draws from: Books, Children, Electronics, Home, Jewelry, Men, Music, Shoes, Sports, Women |
TPC-DS Reference SQL (query12.tpl) ```sql [_LIMITA] select [_LIMITB] i_item_id, i_item_desc, i_category, i_class, i_current_price, sum(ws_ext_sales_price) as itemrevenue, sum(ws_ext_sales_price)*100/sum(sum(ws_ext_sales_price)) over (partition by i_class) as revenueratio from web_sales, item, date_dim where ws_item_sk = i_item_sk and i_category in ('[CATEGORY.1]', '[CATEGORY.2]', '[CATEGORY.3]') and ws_sold_date_sk = d_date_sk and d_date between cast('[SDATE]' as date) and (cast('[SDATE]' as date) + 30 days) group by i_item_id, i_item_desc, i_category, i_class, i_current_price order by i_category, i_class, i_item_id, i_item_desc, revenueratio [_LIMITC]; ```
> **Bexhoma note:** The template uses `[SDATE]` (a random date sampled from sales data). Bexhoma > does not include a date parameter in the config; instead the date filter is replaced with a fixed > date range in the SQL. --- ## Q13 — Store Sales Aggregates by Demographics and Geography Computes simple aggregates on store sales filtered by three demographic combinations paired with three geographic regions, each with different price/profit constraints. | Parameter | Type | Range / Values | |-----------|------|---------------| | `STATE` | list (×9) | 9 draws from US state codes (44 values) — substituted as `{STATE.1}`…`{STATE.9}` | | `MS` | list (×3) | 3 draws from: M, S, D, W, U — substituted as `{MS.1}`…`{MS.3}` | | `ES` | list (×3) | 3 draws from: Primary, Secondary, College, 2 yr Degree, 4 yr Degree, Advanced Degree, Unknown |
TPC-DS Reference SQL (query13.tpl) ```sql select avg(ss_quantity), avg(ss_ext_sales_price), avg(ss_ext_wholesale_cost), sum(ss_ext_wholesale_cost) from store_sales, store, customer_demographics, household_demographics, customer_address, date_dim where s_store_sk = ss_store_sk and ss_sold_date_sk = d_date_sk and d_year = 2001 and ((ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = '[MS.1]' and cd_education_status = '[ES.1]' and ss_sales_price between 100.00 and 150.00 and hd_dep_count = 3) or (ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = '[MS.2]' and cd_education_status = '[ES.2]' and ss_sales_price between 50.00 and 100.00 and hd_dep_count = 1) or (ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = '[MS.3]' and cd_education_status = '[ES.3]' and ss_sales_price between 150.00 and 200.00 and hd_dep_count = 1)) and ((ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('[STATE.1]','[STATE.2]','[STATE.3]') and ss_net_profit between 100 and 200) or (ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('[STATE.4]','[STATE.5]','[STATE.6]') and ss_net_profit between 150 and 300) or (ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('[STATE.7]','[STATE.8]','[STATE.9]') and ss_net_profit between 50 and 250)); ```
> **Bexhoma note:** The template hardcodes `d_year = 2001`, so date filtering uses a fixed year. --- ## Q14a+b — Cross-Channel Item Sales (Multi-Statement) Two-statement query. **Part a** finds items sold in November across all three channels in a given year whose cross-channel sales exceed the average, then rolls up with `UNION`. **Part b** compares store sales in a specific December week against the prior year for the same items. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2000 | | `DAY` | integer | 1–28 |
TPC-DS Reference SQL (query_variants/query14a.tpl — Part a) ```sql with cross_items as ( select i_item_sk ss_item_sk from item, (select iss.i_brand_id brand_id, iss.i_class_id class_id, iss.i_category_id category_id from store_sales, item iss, date_dim d1 where ss_item_sk = iss.i_item_sk and ss_sold_date_sk = d1.d_date_sk and d1.d_year between 1999 and 1999+2 intersect select ics.i_brand_id, ics.i_class_id, ics.i_category_id from catalog_sales, item ics, date_dim d2 where cs_item_sk = ics.i_item_sk and cs_sold_date_sk = d2.d_date_sk and d2.d_year between 1999 and 1999+2 intersect select iws.i_brand_id, iws.i_class_id, iws.i_category_id from web_sales, item iws, date_dim d3 where ws_item_sk = iws.i_item_sk and ws_sold_date_sk = d3.d_date_sk and d3.d_year between 1999 and 1999+2) x where i_brand_id = brand_id and i_class_id = class_id and i_category_id = category_id), avg_sales as ( select avg(quantity*list_price) average_sales from (select ss_quantity quantity, ss_list_price list_price from store_sales, date_dim where ss_sold_date_sk = d_date_sk and d_year between 1999 and 2001 union all select cs_quantity quantity, cs_list_price list_price from catalog_sales, date_dim where cs_sold_date_sk = d_date_sk and d_year between [YEAR] and [YEAR]+2 union all select ws_quantity quantity, ws_list_price list_price from web_sales, date_dim where ws_sold_date_sk = d_date_sk and d_year between [YEAR] and [YEAR]+2) x), results AS (...) [_LIMITA] select [_LIMITB] channel, i_brand_id, i_class_id, i_category_id, sum_sales, number_sales from (select channel, i_brand_id, i_class_id, i_category_id, sum_sales, number_sales from results union select channel, i_brand_id, i_class_id, null, sum(sum_sales), sum(number_sales) from results group by channel, i_brand_id, i_class_id union select channel, i_brand_id, null, null, sum(sum_sales), sum(number_sales) from results group by channel, i_brand_id union select channel, null, null, null, sum(sum_sales), sum(number_sales) from results group by channel union select null, null, null, null, sum(sum_sales), sum(number_sales) from results) z order by channel, i_brand_id, i_class_id, i_category_id [_LIMITC]; ```
TPC-DS Reference SQL (query_variants/query14a.tpl — Part b) ```sql with cross_items as (...), avg_sales as (...) [_LIMITA] select [_LIMITB] * from (select 'store' channel, i_brand_id, i_class_id, i_category_id, sum(ss_quantity*ss_list_price) sales, count(*) number_sales from store_sales, item, date_dim where ss_item_sk in (select ss_item_sk from cross_items) and ss_item_sk = i_item_sk and ss_sold_date_sk = d_date_sk and d_week_seq = (select d_week_seq from date_dim where d_year = [YEAR]+1 and d_moy = 12 and d_dom = [DAY]) group by i_brand_id, i_class_id, i_category_id having sum(ss_quantity*ss_list_price) > (select average_sales from avg_sales)) this_year, (select 'store' channel, i_brand_id, i_class_id, i_category_id, sum(ss_quantity*ss_list_price) sales, count(*) number_sales from store_sales, item, date_dim where ss_item_sk in (select ss_item_sk from cross_items) and ss_item_sk = i_item_sk and ss_sold_date_sk = d_date_sk and d_week_seq = (select d_week_seq from date_dim where d_year = [YEAR] and d_moy = 12 and d_dom = [DAY]) group by i_brand_id, i_class_id, i_category_id having sum(ss_quantity*ss_list_price) > (select average_sales from avg_sales)) last_year where this_year.i_brand_id = last_year.i_brand_id and this_year.i_class_id = last_year.i_class_id and this_year.i_category_id = last_year.i_category_id order by this_year.channel, this_year.i_brand_id, this_year.i_class_id, this_year.i_category_id [_LIMITC]; ```
MariaDB dialect MariaDB cannot use `INTERSECT` and cannot reference CTEs from within sibling CTE bodies. Bexhoma rewrites using a nested `with total as (...)` wrapper and replaces `INTERSECT` with `EXISTS` subqueries. Additionally, the `UNION`-based rollup in Part a is replaced by `GROUP BY ... WITH ROLLUP`.
--- ## Q15 — Catalog Sales by ZIP Code Summarizes catalog sales amounts by customer ZIP code, filtered by specific ZIP codes, states, or high-value sales. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `QOY` | integer | 1–2 |
TPC-DS Reference SQL (query15.tpl) ```sql [_LIMITA] select [_LIMITB] ca_zip, sum(cs_sales_price) from catalog_sales, customer, customer_address, date_dim where cs_bill_customer_sk = c_customer_sk and c_current_addr_sk = ca_address_sk and (substr(ca_zip,1,5) in ('85669','86197','88274','83405','86475', '85392','85460','80348','81792') or ca_state in ('CA','WA','GA') or cs_sales_price > 500) and cs_sold_date_sk = d_date_sk and d_qoy = [QOY] and d_year = [YEAR] group by ca_zip order by ca_zip [_LIMITC]; ```
--- ## Q16 — Catalog Order Count by County and State Counts distinct catalog orders shipped to a given state over a 60-day window with cross-warehouse fulfillment and no returns. | Parameter | Type | Range / Values | |-----------|------|---------------| | `COUNTY` | list (×5) | 5 random US counties — substituted as `{COUNTY.1}`…`{COUNTY.5}` | | `MONTH` | integer | 2–5 | | `STATE` | list | US state codes (44 values) | | `YEAR` | integer | 1999–2002 |
TPC-DS Reference SQL (query16.tpl) ```sql [_LIMITA] select [_LIMITB] count(distinct cs_order_number) as "order count", sum(cs_ext_ship_cost) as "total shipping cost", sum(cs_net_profit) as "total net profit" from catalog_sales cs1, date_dim, customer_address, call_center where d_date between '[YEAR]-[MONTH]-01' and (cast('[YEAR]-[MONTH]-01' as date) + 60 days) and cs1.cs_ship_date_sk = d_date_sk and cs1.cs_ship_addr_sk = ca_address_sk and ca_state = '[STATE]' and cs1.cs_call_center_sk = cc_call_center_sk and cc_county in ('[COUNTY.1]','[COUNTY.2]','[COUNTY.3]','[COUNTY.4]','[COUNTY.5]') and exists (select * from catalog_sales cs2 where cs1.cs_order_number = cs2.cs_order_number and cs1.cs_warehouse_sk <> cs2.cs_warehouse_sk) and not exists (select * from catalog_returns cr1 where cs1.cs_order_number = cr1.cr_order_number) order by count(distinct cs_order_number) [_LIMITC]; ```
--- ## Q17 — Item Return and Re-Purchase Analysis Computes counts, averages, and coefficient of variation for quantities in store sales, store returns, and subsequent catalog re-purchases, within the same three quarters. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 |
TPC-DS Reference SQL (query17.tpl) ```sql [_LIMITA] select [_LIMITB] i_item_id, i_item_desc, s_state, count(ss_quantity) as store_sales_quantitycount, avg(ss_quantity) as store_sales_quantityave, stddev_samp(ss_quantity) as store_sales_quantitystdev, stddev_samp(ss_quantity)/avg(ss_quantity) as store_sales_quantitycov, count(sr_return_quantity) as store_returns_quantitycount, avg(sr_return_quantity) as store_returns_quantityave, stddev_samp(sr_return_quantity) as store_returns_quantitystdev, stddev_samp(sr_return_quantity)/avg(sr_return_quantity) as store_returns_quantitycov, count(cs_quantity) as catalog_sales_quantitycount, avg(cs_quantity) as catalog_sales_quantityave, stddev_samp(cs_quantity) as catalog_sales_quantitystdev, stddev_samp(cs_quantity)/avg(cs_quantity) as catalog_sales_quantitycov from store_sales, store_returns, catalog_sales, date_dim d1, date_dim d2, date_dim d3, store, item where d1.d_quarter_name = '[YEAR]Q1' and d1.d_date_sk = ss_sold_date_sk and i_item_sk = ss_item_sk and s_store_sk = ss_store_sk and ss_customer_sk = sr_customer_sk and ss_item_sk = sr_item_sk and ss_ticket_number = sr_ticket_number and sr_returned_date_sk = d2.d_date_sk and d2.d_quarter_name in ('[YEAR]Q1','[YEAR]Q2','[YEAR]Q3') and sr_customer_sk = cs_bill_customer_sk and sr_item_sk = cs_item_sk and cs_sold_date_sk = d3.d_date_sk and d3.d_quarter_name in ('[YEAR]Q1','[YEAR]Q2','[YEAR]Q3') group by i_item_id, i_item_desc, s_state order by i_item_id, i_item_desc, s_state [_LIMITC]; ```
--- ## Q18 — Catalog Sales Aggregates by Geography and Demographics (Rollup) Averages catalog sales quantities, prices, and demographics by item, country, state, county using `ROLLUP`. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `STATE` | list (×7) | 7 draws from US state codes (44 values) — substituted as `{STATE.1}`…`{STATE.7}` | | `MONTH` | integer (×6) | 6 draws from 1–12 — substituted as `{MONTH.1}`…`{MONTH.6}` | | `GEN` | list | M, F | | `ES` | list | Primary, Secondary, College, 2 yr Degree, 4 yr Degree, Advanced Degree, Unknown |
TPC-DS Reference SQL (query18.tpl) ```sql [_LIMITA] select [_LIMITB] i_item_id, ca_country, ca_state, ca_county, avg(cast(cs_quantity as decimal(12,2))) agg1, avg(cast(cs_list_price as decimal(12,2))) agg2, avg(cast(cs_coupon_amt as decimal(12,2))) agg3, avg(cast(cs_sales_price as decimal(12,2))) agg4, avg(cast(cs_net_profit as decimal(12,2))) agg5, avg(cast(c_birth_year as decimal(12,2))) agg6, avg(cast(cd1.cd_dep_count as decimal(12,2))) agg7 from catalog_sales, customer_demographics cd1, customer_demographics cd2, customer, customer_address, date_dim, item where cs_sold_date_sk = d_date_sk and cs_item_sk = i_item_sk and cs_bill_cdemo_sk = cd1.cd_demo_sk and cs_bill_customer_sk = c_customer_sk and cd1.cd_gender = '[GEN]' and cd1.cd_education_status = '[ES]' and c_current_cdemo_sk = cd2.cd_demo_sk and c_current_addr_sk = ca_address_sk and c_birth_month in ([MONTH.1],[MONTH.2],[MONTH.3],[MONTH.4],[MONTH.5],[MONTH.6]) and d_year = [YEAR] and ca_state in ('[STATE.1]','[STATE.2]','[STATE.3]', '[STATE.4]','[STATE.5]','[STATE.6]','[STATE.7]') group by rollup (i_item_id, ca_country, ca_state, ca_county) order by ca_country, ca_state, ca_county, i_item_id [_LIMITC]; ```
MariaDB / MonetDB / PostgreSQL dialects MariaDB uses `GROUP BY i_item_id, ca_country, ca_state, ca_county WITH ROLLUP`. MonetDB and PostgreSQL use standard `GROUP BY ROLLUP(i_item_id, ca_country, ca_state, ca_county)`.
--- ## Q19 — Store Sales by Brand and Manager Reports total external sales price grouped by brand and manufacturer for a given store manager, month, and year, filtering out customers who share ZIP code prefix with the store. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `MONTH` | integer | 11–12 | | `MANAGER` | integer | 1–100 |
TPC-DS Reference SQL (query19.tpl) ```sql [_LIMITA] select [_LIMITB] i_brand_id brand_id, i_brand brand, i_manufact_id, i_manufact, sum(ss_ext_sales_price) ext_price from date_dim, store_sales, item, customer, customer_address, store where d_date_sk = ss_sold_date_sk and ss_item_sk = i_item_sk and i_manager_id = [MANAGER] and d_moy = [MONTH] and d_year = [YEAR] and ss_customer_sk = c_customer_sk and c_current_addr_sk = ca_address_sk and substr(ca_zip,1,5) <> substr(s_zip,1,5) and ss_store_sk = s_store_sk group by i_brand, i_brand_id, i_manufact_id, i_manufact order by ext_price desc, i_brand, i_brand_id, i_manufact_id, i_manufact [_LIMITC]; ```
--- ## Q20 — Catalog Sales Revenue Ratio by Item Class Analogous to Q12 but for the catalog channel. Computes revenue contribution per item in selected categories over a 30-day window. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `CATEGORY` | list (×3) | 3 draws from: Books, Children, Electronics, Home, Jewelry, Men, Music, Shoes, Sports, Women |
TPC-DS Reference SQL (query20.tpl) ```sql [_LIMITA] select [_LIMITB] i_item_id, i_item_desc, i_category, i_class, i_current_price, sum(cs_ext_sales_price) as itemrevenue, sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over (partition by i_class) as revenueratio from catalog_sales, item, date_dim where cs_item_sk = i_item_sk and i_category in ('[CATEGORY.1]', '[CATEGORY.2]', '[CATEGORY.3]') and cs_sold_date_sk = d_date_sk and d_date between cast('[SDATE]' as date) and (cast('[SDATE]' as date) + 30 days) group by i_item_id, i_item_desc, i_category, i_class, i_current_price order by i_category, i_class, i_item_id, i_item_desc, revenueratio [_LIMITC]; ```
> **Bexhoma note:** `[SDATE]` is replaced by a date derived from `{YEAR}` and `{MONTH}` parameters > in the bexhoma SQL. --- ## Q21 — Inventory Levels Before and After a Sale Date Checks whether inventory level ratios (before vs. after a key date) remain within 2/3 to 3/2, for items priced between $0.99 and $1.49. | Parameter | Type | Range / Values | |-----------|------|---------------| | `MONTH` | integer | 1–12 | | `DAY` | integer | 1–28 | | `YEAR` | integer | 1998–2002 |
TPC-DS Reference SQL (query21.tpl) ```sql [_LIMITA] select [_LIMITB] * from (select w_warehouse_name, i_item_id, sum(case when (cast(d_date as date) < cast('[SALES_DATE]' as date)) then inv_quantity_on_hand else 0 end) as inv_before, sum(case when (cast(d_date as date) >= cast('[SALES_DATE]' as date)) then inv_quantity_on_hand else 0 end) as inv_after from inventory, warehouse, item, date_dim where i_current_price between 0.99 and 1.49 and i_item_sk = inv_item_sk and inv_warehouse_sk = w_warehouse_sk and inv_date_sk = d_date_sk and d_date between (cast('[SALES_DATE]' as date) - 30 days) and (cast('[SALES_DATE]' as date) + 30 days) group by w_warehouse_name, i_item_id) x where (case when inv_before > 0 then inv_after / inv_before else null end) between 2.0/3.0 and 3.0/2.0 order by w_warehouse_name, i_item_id [_LIMITC]; ```
> **Bexhoma note:** `[SALES_DATE]` is replaced with a date constructed from `{YEAR}`, `{MONTH}`, > `{DAY}` integer parameters. --- ## Q22 — Average Inventory by Product with Rollup Averages inventory on hand per product (name, brand, class, category) over a 12-month period, using `ROLLUP` to generate subtotals. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DMS` | integer | 1176–1224 (month sequence number) |
TPC-DS Reference SQL (query22.tpl) ```sql [_LIMITA] select [_LIMITB] i_product_name, i_brand, i_class, i_category, avg(inv_quantity_on_hand) qoh from inventory, date_dim, item where inv_date_sk = d_date_sk and inv_item_sk = i_item_sk and d_month_seq between [DMS] and [DMS]+11 group by rollup(i_product_name, i_brand, i_class, i_category) order by qoh, i_product_name, i_brand, i_class, i_category [_LIMITC]; ```
MariaDB / MonetDB / PostgreSQL dialects MariaDB uses `GROUP BY i_product_name, i_brand, i_class, i_category WITH ROLLUP`. MonetDB and PostgreSQL use standard `GROUP BY ROLLUP(...)`.
--- ## Q23a+b — Frequent Store Buyers in Catalog and Web (Multi-Statement) Two-statement query. Identifies "best store sales customers" (top 5% by spend) who also purchased frequently-sold items via catalog or web in a given month. **Part a** returns total sales; **Part b** returns customers with their names and totals. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2000 | | `MONTH` | integer | 1–7 | | `TOPPERCENT` | integer | 95–95 (fixed at 95th percentile) |
TPC-DS Reference SQL (query23.tpl — Part a) ```sql with frequent_ss_items as ( select substr(i_item_desc,1,30) itemdesc, i_item_sk item_sk, d_date solddate, count(*) cnt from store_sales, date_dim, item where ss_sold_date_sk = d_date_sk and ss_item_sk = i_item_sk and d_year in ([YEAR],[YEAR]+1,[YEAR]+2,[YEAR]+3) group by substr(i_item_desc,1,30), i_item_sk, d_date having count(*) > 4), max_store_sales as ( select max(csales) tpcds_cmax from ( select c_customer_sk, sum(ss_quantity*ss_sales_price) csales from store_sales, customer, date_dim where ss_customer_sk = c_customer_sk and ss_sold_date_sk = d_date_sk and d_year in ([YEAR],[YEAR]+1,[YEAR]+2,[YEAR]+3) group by c_customer_sk)), best_ss_customer as ( select c_customer_sk, sum(ss_quantity*ss_sales_price) ssales from store_sales, customer where ss_customer_sk = c_customer_sk group by c_customer_sk having sum(ss_quantity*ss_sales_price) > ([TOPPERCENT]/100.0) * (select * from max_store_sales)) [_LIMITA] select [_LIMITB] sum(sales) from (select cs_quantity*cs_list_price sales from catalog_sales, date_dim where d_year = [YEAR] and d_moy = [MONTH] and cs_sold_date_sk = d_date_sk and cs_item_sk in (select item_sk from frequent_ss_items) and cs_bill_customer_sk in (select c_customer_sk from best_ss_customer) union all select ws_quantity*ws_list_price sales from web_sales, date_dim where d_year = [YEAR] and d_moy = [MONTH] and ws_sold_date_sk = d_date_sk and ws_item_sk in (select item_sk from frequent_ss_items) and ws_bill_customer_sk in (select c_customer_sk from best_ss_customer)) [_LIMITC]; ```
TPC-DS Reference SQL (query23.tpl — Part b) ```sql ... same CTEs ... [_LIMITA] select [_LIMITB] c_last_name, c_first_name, sales from (select c_last_name, c_first_name, sum(cs_quantity*cs_list_price) sales from catalog_sales, customer, date_dim where d_year = [YEAR] and d_moy = [MONTH] and cs_sold_date_sk = d_date_sk and cs_item_sk in (select item_sk from frequent_ss_items) and cs_bill_customer_sk in (select c_customer_sk from best_ss_customer) and cs_bill_customer_sk = c_customer_sk group by c_last_name, c_first_name union all select c_last_name, c_first_name, sum(ws_quantity*ws_list_price) sales from web_sales, customer, date_dim where d_year = [YEAR] and d_moy = [MONTH] and ws_sold_date_sk = d_date_sk and ws_item_sk in (select item_sk from frequent_ss_items) and ws_bill_customer_sk in (select c_customer_sk from best_ss_customer) and ws_bill_customer_sk = c_customer_sk group by c_last_name, c_first_name) order by c_last_name, c_first_name, sales [_LIMITC]; ```
--- ## Q24a+b — Store Sales Paid Amount by Color (Multi-Statement) Two-statement query. Each part finds customers who paid more than 5% of average net paid per store for items in a specific color (Part a uses `{COLOR.1}`, Part b uses `{COLOR.2}`), within a given market. | Parameter | Type | Range / Values | |-----------|------|---------------| | `MARKET` | integer | 5–10 | | `COLOR` | list (×2) | 2 draws from 67 color names — substituted as `{COLOR.1}`, `{COLOR.2}` | | `AMOUNTONE` | list | `ss_net_paid`, `ss_net_paid_inc_tax`, `ss_net_profit`, `ss_sales_price`, `ss_ext_sales_price` |
TPC-DS Reference SQL (query24.tpl — Part a) ```sql with ssales as ( select c_last_name, c_first_name, s_store_name, ca_state, s_state, i_color, i_current_price, i_manager_id, i_units, i_size, sum([AMOUNTONE]) netpaid from store_sales, store_returns, store, item, customer, customer_address where ss_ticket_number = sr_ticket_number and ss_item_sk = sr_item_sk and ss_customer_sk = c_customer_sk and ss_item_sk = i_item_sk and ss_store_sk = s_store_sk and c_current_addr_sk = ca_address_sk and c_birth_country <> upper(ca_country) and s_zip = ca_zip and s_market_id = [MARKET] group by c_last_name, c_first_name, s_store_name, ca_state, s_state, i_color, i_current_price, i_manager_id, i_units, i_size) select c_last_name, c_first_name, s_store_name, sum(netpaid) paid from ssales where i_color = '[COLOR.1]' group by c_last_name, c_first_name, s_store_name having sum(netpaid) > (select 0.05*avg(netpaid) from ssales) order by c_last_name, c_first_name, s_store_name; ```
TPC-DS Reference SQL (query24.tpl — Part b) Same structure as Part a but filtering on `i_color = '[COLOR.2]'`.
--- ## Q25 — Store-Return-Catalog Profit Analysis Reports net profit aggregation (with selectable aggregate function) for items that appear in store sales, get returned, and are re-purchased in the catalog, all in the same year and adjacent months. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `AGG` | list | `sum`, `min`, `max`, `avg`, `stddev_samp` |
TPC-DS Reference SQL (query25.tpl) ```sql [_LIMITA] select [_LIMITB] i_item_id, i_item_desc, s_store_id, s_store_name, [AGG](ss_net_profit) as store_sales_profit, [AGG](sr_net_loss) as store_returns_loss, [AGG](cs_net_profit) as catalog_sales_profit from store_sales, store_returns, catalog_sales, date_dim d1, date_dim d2, date_dim d3, store, item where d1.d_moy = 4 and d1.d_year = [YEAR] and d1.d_date_sk = ss_sold_date_sk and i_item_sk = ss_item_sk and s_store_sk = ss_store_sk and ss_customer_sk = sr_customer_sk and ss_item_sk = sr_item_sk and ss_ticket_number = sr_ticket_number and sr_returned_date_sk = d2.d_date_sk and d2.d_moy between 4 and 10 and d2.d_year = [YEAR] and sr_customer_sk = cs_bill_customer_sk and sr_item_sk = cs_item_sk and cs_sold_date_sk = d3.d_date_sk and d3.d_moy between 4 and 10 and d3.d_year = [YEAR] group by i_item_id, i_item_desc, s_store_id, s_store_name order by i_item_id, i_item_desc, s_store_id, s_store_name [_LIMITC]; ```
--- ## Q26 — Catalog Sales Averages by Item and Demographics Reports average quantities, list prices, coupon amounts, and sales prices for catalog sales filtered by gender, marital status, education, year, and promotional channels. | Parameter | Type | Range / Values | |-----------|------|---------------| | `GEN` | list | M, F | | `MS` | list | S, M, D, W, U | | `ES` | list | Primary, Secondary, College, 2 yr Degree, 4 yr Degree, Advanced Degree, Unknown | | `YEAR` | integer | 1998–2002 |
TPC-DS Reference SQL (query26.tpl) ```sql [_LIMITA] select [_LIMITB] i_item_id, avg(cs_quantity) agg1, avg(cs_list_price) agg2, avg(cs_coupon_amt) agg3, avg(cs_sales_price) agg4 from catalog_sales, customer_demographics, date_dim, item, promotion where cs_sold_date_sk = d_date_sk and cs_item_sk = i_item_sk and cs_bill_cdemo_sk = cd_demo_sk and cs_promo_sk = p_promo_sk and cd_gender = '[GEN]' and cd_marital_status = '[MS]' and cd_education_status = '[ES]' and (p_channel_email = 'N' or p_channel_event = 'N') and d_year = [YEAR] group by i_item_id order by i_item_id [_LIMITC]; ```
--- ## Q27 — Store Sales Aggregates by State and Demographics (Rollup) Similar to Q26 but for store sales, with `ROLLUP` over item and state to produce subtotals. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `GEN` | list | M, F | | `MS` | list | S, M, D, W, U | | `ES` | list | education distribution values | | `STATE` | list (×6) | 6 draws from state codes — substituted as `{STATE.1}`…`{STATE.6}` |
TPC-DS Reference SQL (query27.tpl) ```sql [_LIMITA] select [_LIMITB] i_item_id, s_state, grouping(s_state) g_state, avg(ss_quantity) agg1, avg(ss_list_price) agg2, avg(ss_coupon_amt) agg3, avg(ss_sales_price) agg4 from store_sales, customer_demographics, date_dim, store, item where ss_sold_date_sk = d_date_sk and ss_item_sk = i_item_sk and ss_store_sk = s_store_sk and ss_cdemo_sk = cd_demo_sk and cd_gender = '[GEN]' and cd_marital_status = '[MS]' and cd_education_status = '[ES]' and d_year = [YEAR] and s_state in ('[STATE_A]','[STATE_B]','[STATE_C]','[STATE_D]','[STATE_E]','[STATE_F]') group by rollup (i_item_id, s_state) order by i_item_id, s_state [_LIMITC]; ```
MariaDB / MonetDB / PostgreSQL dialects MariaDB uses `GROUP BY i_item_id, s_state WITH ROLLUP`. MonetDB and PostgreSQL use standard `GROUP BY ROLLUP(i_item_id, s_state)`.
--- ## Q28 — Multi-Band List Price and Discount Crosstab Cross-tabulates list price averages, counts, and distinct counts across six quantity bands (0–5, 6–10, 11–15, 16–20, 21–25, 26–30) simultaneously. | Parameter | Type | Range / Values | |-----------|------|---------------| | `LISTPRICE` | integer (×6) | 6 independent draws from 0–190 | | `COUPONAMT` | integer (×6) | 6 independent draws from 0–18000 | | `WHOLESALECOST` | integer (×6) | 6 independent draws from 0–80 |
TPC-DS Reference SQL (query28.tpl) ```sql [_LIMITA] select [_LIMITB] * from (select avg(ss_list_price) B1_LP, count(ss_list_price) B1_CNT, count(distinct ss_list_price) B1_CNTD from store_sales where ss_quantity between 0 and 5 and (ss_list_price between [LISTPRICE.1] and [LISTPRICE.1]+10 or ss_coupon_amt between [COUPONAMT.1] and [COUPONAMT.1]+1000 or ss_wholesale_cost between [WHOLESALECOST.1] and [WHOLESALECOST.1]+20)) B1, (select avg(ss_list_price) B2_LP, count(ss_list_price) B2_CNT, count(distinct ss_list_price) B2_CNTD from store_sales where ss_quantity between 6 and 10 and (ss_list_price between [LISTPRICE.2] and [LISTPRICE.2]+10 or ss_coupon_amt between [COUPONAMT.2] and [COUPONAMT.2]+1000 or ss_wholesale_cost between [WHOLESALECOST.2] and [WHOLESALECOST.2]+20)) B2, ... (bands B3-B6 analogous) [_LIMITC]; ```
--- ## Q29 — Store-Return-Catalog Quantity Analysis Similar to Q25 but computes aggregate quantities rather than profit, with a shorter return-window (month + 3) and a wider re-purchase window (3 years). | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2000 | | `MONTH` | integer | 4 (fixed) | | `AGG` | list | sum, min, max, avg, stddev_samp |
TPC-DS Reference SQL (query29.tpl) ```sql [_LIMITA] select [_LIMITB] i_item_id, i_item_desc, s_store_id, s_store_name, [AGG](ss_quantity) as store_sales_quantity, [AGG](sr_return_quantity) as store_returns_quantity, [AGG](cs_quantity) as catalog_sales_quantity from store_sales, store_returns, catalog_sales, date_dim d1, date_dim d2, date_dim d3, store, item where d1.d_moy = [MONTH] and d1.d_year = [YEAR] and d1.d_date_sk = ss_sold_date_sk and i_item_sk = ss_item_sk and s_store_sk = ss_store_sk and ss_customer_sk = sr_customer_sk and ss_item_sk = sr_item_sk and ss_ticket_number = sr_ticket_number and sr_returned_date_sk = d2.d_date_sk and d2.d_moy between [MONTH] and [MONTH] + 3 and d2.d_year = [YEAR] and sr_customer_sk = cs_bill_customer_sk and sr_item_sk = cs_item_sk and cs_sold_date_sk = d3.d_date_sk and d3.d_year in ([YEAR],[YEAR]+1,[YEAR]+2) group by i_item_id, i_item_desc, s_store_id, s_store_name order by i_item_id, i_item_desc, s_store_id, s_store_name [_LIMITC]; ```
--- ## Q30 — Web Returns by State vs. Average Finds customers whose total web return amount is 20% above the average for their state, in a given year. | Parameter | Type | Range / Values | |-----------|------|---------------| | `STATE` | list | US state codes (44 values) | | `YEAR` | integer | 1999–2002 |
TPC-DS Reference SQL (query30.tpl) ```sql with customer_total_return as ( select wr_returning_customer_sk as ctr_customer_sk, ca_state as ctr_state, sum(wr_return_amt) as ctr_total_return from web_returns, date_dim, customer_address where wr_returned_date_sk = d_date_sk and d_year = [YEAR] and wr_returning_addr_sk = ca_address_sk group by wr_returning_customer_sk, ca_state) [_LIMITA] select [_LIMITB] c_customer_id, c_salutation, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_day, c_birth_month, c_birth_year, c_birth_country, c_login, c_email_address, c_last_review_date_sk, ctr_total_return from customer_total_return ctr1, customer_address, customer where ctr1.ctr_total_return > (select avg(ctr_total_return)*1.2 from customer_total_return ctr2 where ctr1.ctr_state = ctr2.ctr_state) and ca_address_sk = c_current_addr_sk and ca_state = '[STATE]' and ctr1.ctr_customer_sk = c_customer_sk order by c_customer_id, c_salutation, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_day, c_birth_month, c_birth_year, c_birth_country, c_login, c_email_address, c_last_review_date_sk, ctr_total_return [_LIMITC]; ```
--- ## Q31 — Quarter-over-Quarter Sales Growth by County Compares web vs. store sales growth between consecutive quarters for counties where the web/store growth ratios are consistently positive. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `AGG` | list | one of 6 column expressions used in ORDER BY |
TPC-DS Reference SQL (query31.tpl) ```sql with ss as ( select ca_county, d_qoy, d_year, sum(ss_ext_sales_price) as store_sales from store_sales, date_dim, customer_address where ss_sold_date_sk = d_date_sk and ss_addr_sk = ca_address_sk group by ca_county, d_qoy, d_year), ws as ( select ca_county, d_qoy, d_year, sum(ws_ext_sales_price) as web_sales from web_sales, date_dim, customer_address where ws_sold_date_sk = d_date_sk and ws_bill_addr_sk = ca_address_sk group by ca_county, d_qoy, d_year) select ss1.ca_county, ss1.d_year, ws2.web_sales/ws1.web_sales web_q1_q2_increase, ss2.store_sales/ss1.store_sales store_q1_q2_increase, ws3.web_sales/ws2.web_sales web_q2_q3_increase, ss3.store_sales/ss2.store_sales store_q2_q3_increase from ss ss1, ss ss2, ss ss3, ws ws1, ws ws2, ws ws3 where ss1.d_qoy = 1 and ss1.d_year = [YEAR] and ss1.ca_county = ss2.ca_county and ss2.d_qoy = 2 and ss2.d_year = [YEAR] and ss2.ca_county = ss3.ca_county and ss3.d_qoy = 3 and ss3.d_year = [YEAR] and ss1.ca_county = ws1.ca_county and ws1.d_qoy = 1 and ws1.d_year = [YEAR] and ws1.ca_county = ws2.ca_county and ws2.d_qoy = 2 and ws2.d_year = [YEAR] and ws1.ca_county = ws3.ca_county and ws3.d_qoy = 3 and ws3.d_year = [YEAR] and case when ws1.web_sales > 0 then ws2.web_sales/ws1.web_sales else null end > case when ss1.store_sales > 0 then ss2.store_sales/ss1.store_sales else null end and case when ws2.web_sales > 0 then ws3.web_sales/ws2.web_sales else null end > case when ss2.store_sales > 0 then ss3.store_sales/ss2.store_sales else null end order by [AGG]; ```
--- ## Q32 — Catalog Discount Excess Over Average Finds total excess discount amounts in catalog sales where items from a given manufacturer have discount amounts above 1.3× their 90-day average, during a date window. | Parameter | Type | Range / Values | |-----------|------|---------------| | `IMID` | integer | 1–1000 | | `YEAR` | integer | 1998–2002 | | `CSDATE` | date | derived: Jan 1 – Apr 1 of `{YEAR}` |
TPC-DS Reference SQL (query32.tpl) ```sql [_LIMITA] select [_LIMITB] sum(cs_ext_discount_amt) as "excess discount amount" from catalog_sales, item, date_dim where i_manufact_id = [IMID] and i_item_sk = cs_item_sk and d_date between '[CSDATE]' and (cast('[CSDATE]' as date) + 90 days) and d_date_sk = cs_sold_date_sk and cs_ext_discount_amt > ( select 1.3 * avg(cs_ext_discount_amt) from catalog_sales, date_dim where cs_item_sk = i_item_sk and d_date between '[CSDATE]' and (cast('[CSDATE]' as date) + 90 days) and d_date_sk = cs_sold_date_sk) [_LIMITC]; ```
> **Bexhoma note:** `[CSDATE]` is constructed from `{YEAR}`, `{MONTH}`, `{DAY}` integer parameters > in the bexhoma SQL. --- ## Q33 — Cross-Channel Sales by Manufacturer in a Region Sums external sales price across all three channels (store, catalog, web) for manufacturers whose items belong to a specified category, in a given GMT timezone, year, and month. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `MONTH` | integer | 1–7 | | `COUNTY` | integer | 1–N (active store counties) | | `GMT` | list | GMT offset derived from county | | `CATEGORY` | list | Books, Home, Electronics, Jewelry, Sports |
TPC-DS Reference SQL (query33.tpl) ```sql with ss as ( select i_manufact_id, sum(ss_ext_sales_price) total_sales from store_sales, date_dim, customer_address, item where i_manufact_id in (select i_manufact_id from item where i_category in ('[CATEGORY]')) and ss_item_sk = i_item_sk and ss_sold_date_sk = d_date_sk and d_year = [YEAR] and d_moy = [MONTH] and ss_addr_sk = ca_address_sk and ca_gmt_offset = [GMT] group by i_manufact_id), cs as (... same for catalog_sales ...), ws as (... same for web_sales ...) [_LIMITA] select [_LIMITB] i_manufact_id, sum(total_sales) total_sales from (select * from ss union all select * from cs union all select * from ws) tmp1 group by i_manufact_id order by total_sales [_LIMITC]; ```
--- ## Q34 — Store Customer Ticket Frequency in High-Purchase-Potential Households Finds customers with 15–20 store purchases in high-purchase-potential households across specific counties and years. | Parameter | Type | Range / Values | |-----------|------|---------------| | `BPONE` | list | `1001-5000`, `>10000`, `501-1000` | | `BPTWO` | list | `0-500`, `Unknown`, `5001-10000` | | `YEAR` | integer | 1998–2000 | | `COUNTY` | list (×8) | 8 draws from active store counties — substituted as `{COUNTY.1}`…`{COUNTY.8}` |
TPC-DS Reference SQL (query34.tpl) ```sql select c_last_name, c_first_name, c_salutation, c_preferred_cust_flag, ss_ticket_number, cnt from (select ss_ticket_number, ss_customer_sk, count(*) cnt from store_sales, date_dim, store, household_demographics where store_sales.ss_sold_date_sk = date_dim.d_date_sk and store_sales.ss_store_sk = store.s_store_sk and store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk and (date_dim.d_dom between 1 and 3 or date_dim.d_dom between 25 and 28) and (household_demographics.hd_buy_potential = '[BPONE]' or household_demographics.hd_buy_potential = '[BPTWO]') and household_demographics.hd_vehicle_count > 0 and (case when household_demographics.hd_vehicle_count > 0 then household_demographics.hd_dep_count / household_demographics.hd_vehicle_count else null end) > 1.2 and date_dim.d_year in ([YEAR],[YEAR]+1,[YEAR]+2) and store.s_county in ('[COUNTY.1]','[COUNTY.2]','[COUNTY.3]','[COUNTY.4]', '[COUNTY.5]','[COUNTY.6]','[COUNTY.7]','[COUNTY.8]') group by ss_ticket_number, ss_customer_sk) dn, customer where ss_customer_sk = c_customer_sk and cnt between 15 and 20 order by c_last_name, c_first_name, c_salutation, c_preferred_cust_flag desc, ss_ticket_number; ```
--- ## Q35 — Customer Demographics with Aggregate Dependency Counts Reports demographic aggregates for customers who purchased in store (and also via web or catalog) within a given year, using three independent aggregate functions. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1999–2002 | | `AGGONE` | list | sum, min, max, avg, stddev_samp | | `AGGTWO` | list | sum, min, max, avg, stddev_samp | | `AGGTHREE` | list | sum, min, max, avg, stddev_samp |
TPC-DS Reference SQL (query35.tpl) ```sql [_LIMITA] select [_LIMITB] ca_state, cd_gender, cd_marital_status, cd_dep_count, count(*) cnt1, [AGGONE](cd_dep_count) aggone1, [AGGTWO](cd_dep_count) aggtwo1, [AGGTHREE](cd_dep_count) aggthree1, cd_dep_employed_count, count(*) cnt2, [AGGONE](cd_dep_employed_count) aggone2, [AGGTWO](cd_dep_employed_count) aggtwo2, [AGGTHREE](cd_dep_employed_count) aggthree2, cd_dep_college_count, count(*) cnt3, [AGGONE](cd_dep_college_count) aggone3, [AGGTWO](cd_dep_college_count) aggtwo3, [AGGTHREE](cd_dep_college_count) aggthree3 from customer c, customer_address ca, customer_demographics where c.c_current_addr_sk = ca.ca_address_sk and cd_demo_sk = c.c_current_cdemo_sk and exists (select * from store_sales, date_dim where c.c_customer_sk = ss_customer_sk and ss_sold_date_sk = d_date_sk and d_year = [YEAR] and d_qoy < 4) and (exists (select * from web_sales, date_dim where c.c_customer_sk = ws_bill_customer_sk and ws_sold_date_sk = d_date_sk and d_year = [YEAR] and d_qoy < 4) or exists (select * from catalog_sales, date_dim where c.c_customer_sk = cs_ship_customer_sk and cs_sold_date_sk = d_date_sk and d_year = [YEAR] and d_qoy < 4)) group by ca_state, cd_gender, cd_marital_status, cd_dep_count, cd_dep_employed_count, cd_dep_college_count order by ca_state, cd_gender, cd_marital_status, cd_dep_count, cd_dep_employed_count, cd_dep_college_count [_LIMITC]; ```
--- ## Q36 — Store Gross Margin by Category with Rollup Computes gross margin ratio (net profit / ext sales price) grouped by item category and class using `ROLLUP`, then ranks within each hierarchical level. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `STATE` | list (×8) | 8 draws from active store states — substituted as `{STATE.1}`…`{STATE.8}` |
TPC-DS Reference SQL (query36.tpl) ```sql [_LIMITA] select [_LIMITB] sum(ss_net_profit)/sum(ss_ext_sales_price) as gross_margin, i_category, i_class, grouping(i_category)+grouping(i_class) as lochierarchy, rank() over ( partition by grouping(i_category)+grouping(i_class), case when grouping(i_class) = 0 then i_category end order by sum(ss_net_profit)/sum(ss_ext_sales_price) asc) as rank_within_parent from store_sales, date_dim d1, item, store where d1.d_year = [YEAR] and d1.d_date_sk = ss_sold_date_sk and i_item_sk = ss_item_sk and s_store_sk = ss_store_sk and s_state in ('[STATE_A]','[STATE_B]','[STATE_C]','[STATE_D]', '[STATE_E]','[STATE_F]','[STATE_G]','[STATE_H]') group by rollup(i_category, i_class) order by lochierarchy desc, case when lochierarchy = 0 then i_category end, rank_within_parent [_LIMITC]; ```
MariaDB / MonetDB / PostgreSQL dialects MariaDB uses `GROUP BY i_category, i_class WITH ROLLUP`. MonetDB and PostgreSQL use standard `GROUP BY ROLLUP(i_category, i_class)`.
--- ## Q37 — Catalog Items in Stock with Matching Sales Finds items with current price in a range, manufactured by a set of 4 manufacturers, having between 100 and 500 units on hand in inventory, and actually sold via catalog. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `INVDATE` | date | derived from `{YEAR}` (Jan 1 – Jul 24) | | `MANUFACT_ID` | integer (×4) | 4 draws from 667–1000 | | `PRICE` | integer | 10–70 |
TPC-DS Reference SQL (query37.tpl) ```sql [_LIMITA] select [_LIMITB] i_item_id, i_item_desc, i_current_price from item, inventory, date_dim, catalog_sales where i_current_price between [PRICE] and [PRICE] + 30 and inv_item_sk = i_item_sk and d_date_sk = inv_date_sk and d_date between cast('[INVDATE]' as date) and (cast('[INVDATE]' as date) + 60 days) and i_manufact_id in ([MANUFACT_ID.1],[MANUFACT_ID.2],[MANUFACT_ID.3],[MANUFACT_ID.4]) and inv_quantity_on_hand between 100 and 500 and cs_item_sk = i_item_sk group by i_item_id, i_item_desc, i_current_price order by i_item_id [_LIMITC]; ```
--- ## Q38 — Customers Who Bought in All Three Channels Counts distinct customers (by name and date) who appear in all three channels (store, catalog, web) within the same 12-month sequence. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DMS` | integer | 1176–1224 |
TPC-DS Reference SQL (query38.tpl) ```sql [_LIMITA] select [_LIMITB] count(*) from ( select distinct c_last_name, c_first_name, d_date from store_sales, date_dim, customer where store_sales.ss_sold_date_sk = date_dim.d_date_sk and store_sales.ss_customer_sk = customer.c_customer_sk and d_month_seq between [DMS] and [DMS] + 11 intersect select distinct c_last_name, c_first_name, d_date from catalog_sales, date_dim, customer where catalog_sales.cs_sold_date_sk = date_dim.d_date_sk and catalog_sales.cs_bill_customer_sk = customer.c_customer_sk and d_month_seq between [DMS] and [DMS] + 11 intersect select distinct c_last_name, c_first_name, d_date from web_sales, date_dim, customer where web_sales.ws_sold_date_sk = date_dim.d_date_sk and web_sales.ws_bill_customer_sk = customer.c_customer_sk and d_month_seq between [DMS] and [DMS] + 11 ) hot_cust [_LIMITC]; ```
--- ## Q39a+b — Inventory Coefficient of Variation (Multi-Statement) Two-statement query. Computes coefficient of variation (stddev/mean) of inventory by warehouse and item per month, then joins adjacent months. **Part a** returns all pairs where CoV > 1; **Part b** additionally requires CoV > 1.5. | Parameter | Type | Range / Values | |-----------|------|---------------| | `MONTH` | integer | 1–4 | | `YEAR` | integer | 1998–2002 |
TPC-DS Reference SQL (query39.tpl — Part a) ```sql with inv as ( select w_warehouse_name, w_warehouse_sk, i_item_sk, d_moy, stdev, mean, case mean when 0 then null else stdev/mean end cov from (select w_warehouse_name, w_warehouse_sk, i_item_sk, d_moy, stddev_samp(inv_quantity_on_hand) stdev, avg(inv_quantity_on_hand) mean from inventory, item, warehouse, date_dim where inv_item_sk = i_item_sk and inv_warehouse_sk = w_warehouse_sk and inv_date_sk = d_date_sk and d_year = [YEAR] group by w_warehouse_name, w_warehouse_sk, i_item_sk, d_moy) foo where case mean when 0 then 0 else stdev/mean end > 1) select inv1.w_warehouse_sk, inv1.i_item_sk, inv1.d_moy, inv1.mean, inv1.cov, inv2.w_warehouse_sk, inv2.i_item_sk, inv2.d_moy, inv2.mean, inv2.cov from inv inv1, inv inv2 where inv1.i_item_sk = inv2.i_item_sk and inv1.w_warehouse_sk = inv2.w_warehouse_sk and inv1.d_moy = [MONTH] and inv2.d_moy = [MONTH]+1 order by inv1.w_warehouse_sk, inv1.i_item_sk, inv1.d_moy, inv1.mean, inv1.cov, inv2.d_moy, inv2.mean, inv2.cov; ```
TPC-DS Reference SQL (query39.tpl — Part b) Same CTE structure as Part a, but the main query adds `and inv1.cov > 1.5`.
--- ## Q40 — Catalog Sales Before and After a Key Date Computes item-level net sales (price minus any refund) split by before/after a key date, for items priced between $0.99 and $1.49 over a ±30-day window. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `SALES_DATE` | date | derived: Jan 31 – Jul 1 of `{YEAR}` |
TPC-DS Reference SQL (query40.tpl) ```sql [_LIMITA] select [_LIMITB] w_state, i_item_id, sum(case when (cast(d_date as date) < cast('[SALES_DATE]' as date)) then cs_sales_price - coalesce(cr_refunded_cash,0) else 0 end) as sales_before, sum(case when (cast(d_date as date) >= cast('[SALES_DATE]' as date)) then cs_sales_price - coalesce(cr_refunded_cash,0) else 0 end) as sales_after from catalog_sales left outer join catalog_returns on (cs_order_number = cr_order_number and cs_item_sk = cr_item_sk), warehouse, item, date_dim where i_current_price between 0.99 and 1.49 and i_item_sk = cs_item_sk and cs_warehouse_sk = w_warehouse_sk and cs_sold_date_sk = d_date_sk and d_date between (cast('[SALES_DATE]' as date) - 30 days) and (cast('[SALES_DATE]' as date) + 30 days) group by w_state, i_item_id order by w_state, i_item_id [_LIMITC]; ```
--- ## Q41 — Items with Many Attribute Combinations Finds distinct product names for items with a manufacturer ID in a 40-unit range, where more than 0 items match a complex set of color/unit/size combinations across Women's and Men's categories. | Parameter | Type | Range / Values | |-----------|------|---------------| | `MANUFACT` | integer | 667–1000 | | `COLOR` | list (×16) | 16 draws from 67 color names | | `UNIT` | list (×16) | 16 draws from unit types | | `SIZE` | list (×6) | 6 draws from size values |
TPC-DS Reference SQL (query41.tpl) ```sql [_LIMITA] select [_LIMITB] distinct(i_product_name) from item i1 where i_manufact_id between [MANUFACT] and [MANUFACT]+40 and (select count(*) as item_cnt from item where (i_manufact = i1.i_manufact and ((i_category = 'Women' and (i_color = '[COLOR.1]' or i_color = '[COLOR.2]') and (i_units = '[UNIT.1]' or i_units = '[UNIT.2]') and (i_size = '[SIZE.1]' or i_size = '[SIZE.2]')) or (i_category = 'Women' and ... ) or (i_category = 'Men' and ...) or (i_category = 'Men' and ...))) or (i_manufact = i1.i_manufact and ((i_category = 'Women' and ... ) or ... ))) > 0 order by i_product_name [_LIMITC]; ```
--- ## Q42 — Store Sales by Category and Manager Summarizes total extended sales price per year, category, and category ID for store manager ID = 1, in a given month and year. | Parameter | Type | Range / Values | |-----------|------|---------------| | `MONTH` | integer | 11–12 | | `YEAR` | integer | 1998–2002 |
TPC-DS Reference SQL (query42.tpl) ```sql [_LIMITA] select [_LIMITB] dt.d_year, item.i_category_id, item.i_category, sum(ss_ext_sales_price) from date_dim dt, store_sales, item where dt.d_date_sk = store_sales.ss_sold_date_sk and store_sales.ss_item_sk = item.i_item_sk and item.i_manager_id = 1 and dt.d_moy = [MONTH] and dt.d_year = [YEAR] group by dt.d_year, item.i_category_id, item.i_category order by sum(ss_ext_sales_price) desc, dt.d_year, item.i_category_id, item.i_category [_LIMITC]; ```
--- ## Q43 — Store Sales by Day-of-Week per Store Pivots store sales by day of the week (Sun–Sat), for stores in a given GMT offset and year. | Parameter | Type | Range / Values | |-----------|------|---------------| | `COUNTY` | integer | random active county | | `GMT` | list | GMT offset derived from county | | `YEAR` | integer | 1998–2002 |
TPC-DS Reference SQL (query43.tpl) ```sql [_LIMITA] select [_LIMITB] s_store_name, s_store_id, sum(case when d_day_name='Sunday' then ss_sales_price else null end) sun_sales, sum(case when d_day_name='Monday' then ss_sales_price else null end) mon_sales, sum(case when d_day_name='Tuesday' then ss_sales_price else null end) tue_sales, sum(case when d_day_name='Wednesday' then ss_sales_price else null end) wed_sales, sum(case when d_day_name='Thursday' then ss_sales_price else null end) thu_sales, sum(case when d_day_name='Friday' then ss_sales_price else null end) fri_sales, sum(case when d_day_name='Saturday' then ss_sales_price else null end) sat_sales from date_dim, store_sales, store where d_date_sk = ss_sold_date_sk and s_store_sk = ss_store_sk and s_gmt_offset = [GMT] and d_year = [YEAR] group by s_store_name, s_store_id order by s_store_name, s_store_id, sun_sales, mon_sales, tue_sales, wed_sales, thu_sales, fri_sales, sat_sales [_LIMITC]; ```
--- ## Q44 — Best and Worst Performing Items in a Store Ranks items in a given store by net profit and joins the top-10 ascending with top-10 descending to find the best-performing vs. worst-performing item pairs. | Parameter | Type | Range / Values | |-----------|------|---------------| | `STORE` | integer | random store ID | | `NULLCOLSS` | list | `ss_customer_sk`, `ss_cdemo_sk`, `ss_hdemo_sk`, `ss_addr_sk`, `ss_promo_sk` |
TPC-DS Reference SQL (query44.tpl) ```sql [_LIMITA] select [_LIMITB] asceding.rnk, i1.i_product_name best_performing, i2.i_product_name worst_performing from (select * from (select item_sk, rank() over (order by rank_col asc) rnk from (select ss_item_sk item_sk, avg(ss_net_profit) rank_col from store_sales ss1 where ss_store_sk = [STORE] group by ss_item_sk having avg(ss_net_profit) > 0.9*(select avg(ss_net_profit) rank_col from store_sales where ss_store_sk = [STORE] and [NULLCOLSS] is null group by ss_store_sk)) V1) V11 where rnk < 11) asceding, (select * from (select item_sk, rank() over (order by rank_col desc) rnk from (...same avg logic...) V2) V21 where rnk < 11) descending, item i1, item i2 where asceding.rnk = descending.rnk and i1.i_item_sk = asceding.item_sk and i2.i_item_sk = descending.item_sk order by asceding.rnk [_LIMITC]; ```
--- ## Q45 — Web Sales by ZIP and Geography Sums web sales for customers in specific ZIP codes or for items with specific item IDs, grouped by customer ZIP and a geographic breakdown column. | Parameter | Type | Range / Values | |-----------|------|---------------| | `GBOBC` | list | `ca_city`, `ca_county`, `ca_state` | | `YEAR` | integer | 1998–2002 | | `QOY` | integer | 1–2 |
TPC-DS Reference SQL (query45.tpl) ```sql [_LIMITA] select [_LIMITB] ca_zip, [GBOBC], sum(ws_sales_price) from web_sales, customer, customer_address, date_dim, item where ws_bill_customer_sk = c_customer_sk and c_current_addr_sk = ca_address_sk and ws_item_sk = i_item_sk and (substr(ca_zip,1,5) in ('85669','86197','88274','83405','86475', '85392','85460','80348','81792') or i_item_id in (select i_item_id from item where i_item_sk in (2,3,5,7,11,13,17,19,23,29))) and ws_sold_date_sk = d_date_sk and d_qoy = [QOY] and d_year = [YEAR] group by ca_zip, [GBOBC] order by ca_zip, [GBOBC] [_LIMITC]; ```
--- ## Q46 — Customer Coupon and Profit by City (Weekend Purchases) Finds customers whose current city differs from the city where they bought, for weekend purchases in high-vehicle-count or specific dependent-count households. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DEPCNT` | integer | 0–9 | | `YEAR` | integer | 1998–2000 | | `VEHCNT` | integer | -1–4 | | `CITY` | list (×5) | 5 draws from active store cities — substituted as `{CITY.1}`…`{CITY.5}` |
TPC-DS Reference SQL (query46.tpl) ```sql [_LIMITA] select [_LIMITB] c_last_name, c_first_name, ca_city, bought_city, ss_ticket_number, amt, profit from (select ss_ticket_number, ss_customer_sk, ca_city bought_city, sum(ss_coupon_amt) amt, sum(ss_net_profit) profit from store_sales, date_dim, store, household_demographics, customer_address where store_sales.ss_sold_date_sk = date_dim.d_date_sk and store_sales.ss_store_sk = store.s_store_sk and store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk and store_sales.ss_addr_sk = customer_address.ca_address_sk and (household_demographics.hd_dep_count = [DEPCNT] or household_demographics.hd_vehicle_count = [VEHCNT]) and date_dim.d_dow in (6,0) and date_dim.d_year in ([YEAR],[YEAR]+1,[YEAR]+2) and store.s_city in ('[CITY_A]','[CITY_B]','[CITY_C]','[CITY_D]','[CITY_E]') group by ss_ticket_number, ss_customer_sk, ss_addr_sk, ca_city) dn, customer, customer_address current_addr where ss_customer_sk = c_customer_sk and customer.c_current_addr_sk = current_addr.ca_address_sk and current_addr.ca_city <> bought_city order by c_last_name, c_first_name, ca_city, bought_city, ss_ticket_number [_LIMITC]; ```
--- ## Q47 — Month-over-Month Sales Volatility by Store and Brand Computes monthly sales vs. 12-month average (within the same category, brand, store) and selects rows where monthly deviation exceeds 10% of the annual average. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1999–2001 | | `SELECTONE` | list | various projection column combinations | | `SELECTTWO` | list | `,v1.d_year` or `,v1.d_year, v1.d_moy` | | `ORDERBY` | list | `avg_monthly_sales`, `sum_sales`, `psum`, `nsum` |
TPC-DS Reference SQL (query47.tpl) ```sql with v1 as ( select i_category, i_brand, s_store_name, s_company_name, d_year, d_moy, sum(ss_sales_price) sum_sales, avg(sum(ss_sales_price)) over (partition by i_category, i_brand, s_store_name, s_company_name, d_year) avg_monthly_sales, rank() over (partition by i_category, i_brand, s_store_name, s_company_name order by d_year, d_moy) rn from item, store_sales, date_dim, store where ss_item_sk = i_item_sk and ss_sold_date_sk = d_date_sk and ss_store_sk = s_store_sk and (d_year = [YEAR] or (d_year = [YEAR]-1 and d_moy = 12) or (d_year = [YEAR]+1 and d_moy = 1)) group by i_category, i_brand, s_store_name, s_company_name, d_year, d_moy), v2 as ( select [SELECTONE][SELECTTWO], v1.avg_monthly_sales, v1.sum_sales, v1_lag.sum_sales psum, v1_lead.sum_sales nsum from v1, v1 v1_lag, v1 v1_lead where v1.rn = v1_lag.rn + 1 and v1.rn = v1_lead.rn - 1 and same category/brand/store conditions) [_LIMITA] select [_LIMITB] * from v2 where d_year = [YEAR] and avg_monthly_sales > 0 and abs(sum_sales - avg_monthly_sales)/avg_monthly_sales > 0.1 order by sum_sales - avg_monthly_sales, [ORDERBY] [_LIMITC]; ```
--- ## Q48 — Store Sales Quantity by Demographic and State Sums quantities from store sales combining three groups of demographic filters (marital status, education, price range) with three groups of state and profit filters. | Parameter | Type | Range / Values | |-----------|------|---------------| | `MS` | list (×3) | 3 draws from marital status — substituted as `{MS.1}`…`{MS.3}` | | `ES` | list (×3) | 3 draws from education — substituted as `{ES.1}`…`{ES.3}` | | `STATE` | list (×9) | 9 draws from state codes — substituted as `{STATE.1}`…`{STATE.9}` | | `YEAR` | integer | 1998–2002 |
TPC-DS Reference SQL (query48.tpl) ```sql select sum(ss_quantity) from store_sales, store, customer_demographics, customer_address, date_dim where s_store_sk = ss_store_sk and ss_sold_date_sk = d_date_sk and d_year = [YEAR] and ((cd_demo_sk = ss_cdemo_sk and cd_marital_status = '[MS.1]' and cd_education_status = '[ES.1]' and ss_sales_price between 100.00 and 150.00) or (cd_demo_sk = ss_cdemo_sk and cd_marital_status = '[MS.2]' and cd_education_status = '[ES.2]' and ss_sales_price between 50.00 and 100.00) or (cd_demo_sk = ss_cdemo_sk and cd_marital_status = '[MS.3]' and cd_education_status = '[ES.3]' and ss_sales_price between 150.00 and 200.00)) and ((ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('[STATE.1]','[STATE.2]','[STATE.3]') and ss_net_profit between 0 and 2000) or (ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('[STATE.4]','[STATE.5]','[STATE.6]') and ss_net_profit between 150 and 3000) or (ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('[STATE.7]','[STATE.8]','[STATE.9]') and ss_net_profit between 50 and 25000)); ```
--- ## Q49 — Top-10 High-Return-Ratio Items by Channel Ranks items in each of three channels (web, catalog, store) by return-to-sales quantity ratio and currency ratio, returning items in the top 10 by either metric. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `MONTH` | integer | 11–12 |
TPC-DS Reference SQL (query49.tpl) ```sql [_LIMITA] select [_LIMITB] channel, item, return_ratio, return_rank, currency_rank from (select 'web' as channel, web.item, web.return_ratio, web.return_rank, web.currency_rank from (select item, return_ratio, currency_ratio, rank() over (order by return_ratio) as return_rank, rank() over (order by currency_ratio) as currency_rank from (select ws.ws_item_sk as item, (sum(coalesce(wr.wr_return_quantity,0)) / sum(coalesce(ws.ws_quantity,0))) as return_ratio, (sum(coalesce(wr.wr_return_amt,0)) / sum(coalesce(ws.ws_net_paid,0))) as currency_ratio from web_sales ws left outer join web_returns wr on (ws.ws_order_number = wr.wr_order_number and ws.ws_item_sk = wr.wr_item_sk), date_dim where wr.wr_return_amt > 10000 and ws.ws_net_profit > 1 and ws_sold_date_sk = d_date_sk and d_year = [YEAR] and d_moy = [MONTH] group by ws.ws_item_sk) in_web) web where web.return_rank <= 10 or web.currency_rank <= 10 union ... same for catalog and store ...) order by 1, 4, 5, 2 [_LIMITC]; ```
--- ## Q50 — Store Returns Aged by Return Latency Reports counts of returns in each time bucket (≤30 days, 31-60, 61-90, 91-120, >120 days) per store address, for returns in a given month and year. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `MONTH` | integer | 8–10 |
TPC-DS Reference SQL (query50.tpl) ```sql [_LIMITA] select [_LIMITB] s_store_name, s_company_id, s_street_number, s_street_name, s_street_type, s_suite_number, s_city, s_county, s_state, s_zip, sum(case when (sr_returned_date_sk - ss_sold_date_sk <= 30) then 1 else 0 end) as "30 days", sum(case when (sr_returned_date_sk - ss_sold_date_sk > 30 and sr_returned_date_sk - ss_sold_date_sk <= 60) then 1 else 0 end) as "31-60 days", sum(case when (sr_returned_date_sk - ss_sold_date_sk > 60 and sr_returned_date_sk - ss_sold_date_sk <= 90) then 1 else 0 end) as "61-90 days", sum(case when (sr_returned_date_sk - ss_sold_date_sk > 90 and sr_returned_date_sk - ss_sold_date_sk <= 120) then 1 else 0 end) as "91-120 days", sum(case when (sr_returned_date_sk - ss_sold_date_sk > 120) then 1 else 0 end) as ">120 days" from store_sales, store_returns, store, date_dim d1, date_dim d2 where d2.d_year = [YEAR] and d2.d_moy = [MONTH] and ss_ticket_number = sr_ticket_number and ss_item_sk = sr_item_sk and ss_sold_date_sk = d1.d_date_sk and sr_returned_date_sk = d2.d_date_sk and ss_customer_sk = sr_customer_sk and ss_store_sk = s_store_sk group by s_store_name, s_company_id, s_street_number, s_street_name, s_street_type, s_suite_number, s_city, s_county, s_state, s_zip order by s_store_name, s_company_id, ... [_LIMITC]; ```
--- ## Q51 — Web vs. Store Cumulative Sales Race Computes running cumulative web and store sales per item (ordered by date), then finds items where cumulative web sales ever exceed cumulative store sales. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DMS` | integer | 1176–1224 |
TPC-DS Reference SQL (query51.tpl) ```sql with web_v1 as ( select ws_item_sk item_sk, d_date, sum(sum(ws_sales_price)) over (partition by ws_item_sk order by d_date rows between unbounded preceding and current row) cume_sales from web_sales, date_dim where ws_sold_date_sk = d_date_sk and d_month_seq between [DMS] and [DMS]+11 and ws_item_sk is not null group by ws_item_sk, d_date), store_v1 as (... same for store_sales ...) [_LIMITA] select [_LIMITB] * from (select item_sk, d_date, web_sales, store_sales, max(web_sales) over (partition by item_sk order by d_date rows between unbounded preceding and current row) web_cumulative, max(store_sales) over (partition by item_sk order by d_date rows between unbounded preceding and current row) store_cumulative from (select coalesce(web.item_sk, store.item_sk) item_sk, coalesce(web.d_date, store.d_date) d_date, web.cume_sales web_sales, store.cume_sales store_sales from web_v1 web full outer join store_v1 store on (web.item_sk = store.item_sk and web.d_date = store.d_date)) x) y where web_cumulative > store_cumulative order by item_sk, d_date [_LIMITC]; ```
MySQL dialect MySQL does not support `FULL OUTER JOIN`. The bexhoma MySQL variant rewrites the join as `UNION ALL + GROUP BY + COALESCE()` to simulate a full outer join.
--- ## Q52 — Store Sales by Brand and Month (Manager 1) Similar to Q42 but groups by brand instead of category, reporting total external sales price for manager ID = 1 in a given month and year. | Parameter | Type | Range / Values | |-----------|------|---------------| | `MONTH` | integer | 11–12 | | `YEAR` | integer | 1998–2002 |
TPC-DS Reference SQL (query52.tpl) ```sql [_LIMITA] select [_LIMITB] dt.d_year, item.i_brand_id brand_id, item.i_brand brand, sum(ss_ext_sales_price) ext_price from date_dim dt, store_sales, item where dt.d_date_sk = store_sales.ss_sold_date_sk and store_sales.ss_item_sk = item.i_item_sk and item.i_manager_id = 1 and dt.d_moy = [MONTH] and dt.d_year = [YEAR] group by dt.d_year, item.i_brand, item.i_brand_id order by dt.d_year, ext_price desc, brand_id [_LIMITC]; ```
--- ## Q53 — Manufacturer Quarterly Sales Deviation Computes quarterly sales sum vs. the 12-month average per manufacturer for specific category, class, and brand combinations, finding those with > 10% deviation. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DMS` | integer | 1176–1224 |
TPC-DS Reference SQL (query53.tpl) ```sql [_LIMITA] select [_LIMITB] * from (select i_manufact_id, sum(ss_sales_price) sum_sales, avg(sum(ss_sales_price)) over (partition by i_manufact_id) avg_quarterly_sales from item, store_sales, date_dim, store where ss_item_sk = i_item_sk and ss_sold_date_sk = d_date_sk and ss_store_sk = s_store_sk and d_month_seq in ([DMS],[DMS]+1,...,[DMS]+11) and ((i_category in ('Books','Children','Electronics') and i_class in ('personal','portable','reference','self-help') and i_brand in ('scholaramalgamalg #14','scholaramalgamalg #7', 'exportiunivamalg #9','scholaramalgamalg #9')) or (i_category in ('Women','Music','Men') and i_class in ('accessories','classical','fragrances','pants') and i_brand in ('amalgimporto #1','edu packscholar #1', 'exportiimporto #1','importoamalg #1'))) group by i_manufact_id, d_qoy) tmp1 where abs(sum_sales - avg_quarterly_sales) / avg_quarterly_sales > 0.1 order by avg_quarterly_sales, sum_sales, i_manufact_id [_LIMITC]; ```
--- ## Q54 — Customer Revenue Segmentation Post-Purchase Finds customers who purchased items in a specific category and class via catalog or web in a given month, then segments them by their subsequent store revenue in the next 3 months. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `MONTH` | integer | 1–7 | | `CATEGORY` | list | derived from categories distribution | | `CLASS` | list | derived from category-specific class distribution |
TPC-DS Reference SQL (query54.tpl) ```sql with my_customers as ( select distinct c_customer_sk, c_current_addr_sk from (select cs_sold_date_sk sold_date_sk, cs_bill_customer_sk customer_sk, cs_item_sk item_sk from catalog_sales union all select ws_sold_date_sk, ws_bill_customer_sk, ws_item_sk from web_sales) cs_or_ws_sales, item, date_dim, customer where sold_date_sk = d_date_sk and item_sk = i_item_sk and i_category = '[CATEGORY]' and i_class = '[CLASS]' and c_customer_sk = cs_or_ws_sales.customer_sk and d_moy = [MONTH] and d_year = [YEAR]), my_revenue as ( select c_customer_sk, sum(ss_ext_sales_price) as revenue from my_customers, store_sales, customer_address, store, date_dim where c_current_addr_sk = ca_address_sk and ca_county = s_county and ca_state = s_state and ss_sold_date_sk = d_date_sk and c_customer_sk = ss_customer_sk and d_month_seq between (select distinct d_month_seq+1 from date_dim where d_year = [YEAR] and d_moy = [MONTH]) and (select distinct d_month_seq+3 from date_dim where d_year = [YEAR] and d_moy = [MONTH]) group by c_customer_sk), segments as (select cast((revenue/50) as int) as segment from my_revenue) [_LIMITA] select [_LIMITB] segment, count(*) as num_customers, segment*50 as segment_base from segments group by segment order by segment, num_customers [_LIMITC]; ```
MariaDB dialect The bexhoma MariaDB variant rewrites the `with` clause because MariaDB cannot reference one CTE from another. The CTEs are inlined as subqueries or wrapped in outer `with total as (...)`.
--- ## Q55 — Store Brand Sales by Month and Manager Reports total external sales price per brand for a given store manager, month, and year. Similar to Q52 but with a configurable manager ID. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `MONTH` | integer | 11–12 | | `MANAGER` | integer | 1–100 |
TPC-DS Reference SQL (query55.tpl) ```sql [_LIMITA] select [_LIMITB] i_brand_id brand_id, i_brand brand, sum(ss_ext_sales_price) ext_price from date_dim, store_sales, item where d_date_sk = ss_sold_date_sk and ss_item_sk = i_item_sk and i_manager_id = [MANAGER] and d_moy = [MONTH] and d_year = [YEAR] group by i_brand, i_brand_id order by ext_price desc, i_brand_id [_LIMITC]; ```
--- ## Q56 — Cross-Channel Sales by Color in a Region Sums total external sales price across all three channels for items matching 3 specific colors, in a given GMT offset, year, and month. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `MONTH` | integer | 1–7 | | `GMT` | list | GMT offset derived from county | | `COLOR` | list (×3) | 3 draws from 67 color names — substituted as `{COLOR.1}`…`{COLOR.3}` |
TPC-DS Reference SQL (query56.tpl) ```sql with ss as ( select i_item_id, sum(ss_ext_sales_price) total_sales from store_sales, date_dim, customer_address, item where i_item_id in (select i_item_id from item where i_color in ('[COLOR.1]','[COLOR.2]','[COLOR.3]')) and ss_item_sk = i_item_sk and ss_sold_date_sk = d_date_sk and d_year = [YEAR] and d_moy = [MONTH] and ss_addr_sk = ca_address_sk and ca_gmt_offset = [GMT] group by i_item_id), cs as (... same for catalog_sales ...), ws as (... same for web_sales ...) [_LIMITA] select [_LIMITB] i_item_id, sum(total_sales) total_sales from (select * from ss union all select * from cs union all select * from ws) tmp1 group by i_item_id order by total_sales, i_item_id [_LIMITC]; ```
--- ## Q57 — Catalog Sales Monthly Volatility by Call Center Analogous to Q47 but for catalog sales grouped by call center instead of store, finding brands with > 10% deviation from monthly average. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1999–2001 | | `SELECTONE` | list | projection column combinations | | `SELECTTWO` | list | `,v1.d_year` or `,v1.d_year, v1.d_moy` | | `ORDERBY` | list | avg_monthly_sales, sum_sales, psum, nsum |
TPC-DS Reference SQL (query57.tpl) ```sql with v1 as ( select i_category, i_brand, cc_name, d_year, d_moy, sum(cs_sales_price) sum_sales, avg(sum(cs_sales_price)) over (partition by i_category, i_brand, cc_name, d_year) avg_monthly_sales, rank() over (partition by i_category, i_brand, cc_name order by d_year, d_moy) rn from item, catalog_sales, date_dim, call_center where cs_item_sk = i_item_sk and cs_sold_date_sk = d_date_sk and cc_call_center_sk = cs_call_center_sk and (d_year = [YEAR] or (d_year = [YEAR]-1 and d_moy = 12) or (d_year = [YEAR]+1 and d_moy = 1)) group by i_category, i_brand, cc_name, d_year, d_moy), v2 as (select [SELECTONE][SELECTTWO], v1.avg_monthly_sales, v1.sum_sales, v1_lag.sum_sales psum, v1_lead.sum_sales nsum from v1, v1 v1_lag, v1 v1_lead where v1.rn = v1_lag.rn + 1 and v1.rn = v1_lead.rn - 1 and same category/brand/call-center conditions) [_LIMITA] select [_LIMITB] * from v2 where d_year = [YEAR] and avg_monthly_sales > 0 and abs(sum_sales - avg_monthly_sales)/avg_monthly_sales > 0.1 order by sum_sales - avg_monthly_sales, [ORDERBY] [_LIMITC]; ```
--- ## Q58 — Cross-Channel Weekly Revenue Comparison Joins store, catalog, and web revenues for the same items in the same fiscal week, then returns items where all three channels have revenues within 10% of each other. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `SALES_DATE` | date | derived from `{YEAR}` (Jan 1 – Jul 24) |
TPC-DS Reference SQL (query58.tpl) ```sql with ss_items as ( select i_item_id item_id, sum(ss_ext_sales_price) ss_item_rev from store_sales, item, date_dim where ss_item_sk = i_item_sk and d_date in (select d_date from date_dim where d_week_seq = (select d_week_seq from date_dim where d_date = '[SALES_DATE]')) and ss_sold_date_sk = d_date_sk group by i_item_id), cs_items as (... same for catalog_sales ...), ws_items as (... same for web_sales ...) [_LIMITA] select [_LIMITB] ss_items.item_id, ss_item_rev, ss_item_rev/((ss_item_rev+cs_item_rev+ws_item_rev)/3)*100 ss_dev, cs_item_rev, cs_item_rev/((ss_item_rev+cs_item_rev+ws_item_rev)/3)*100 cs_dev, ws_item_rev, ws_item_rev/((ss_item_rev+cs_item_rev+ws_item_rev)/3)*100 ws_dev, (ss_item_rev+cs_item_rev+ws_item_rev)/3 average from ss_items, cs_items, ws_items where ss_items.item_id = cs_items.item_id and ss_items.item_id = ws_items.item_id and ss_item_rev between 0.9*cs_item_rev and 1.1*cs_item_rev and ss_item_rev between 0.9*ws_item_rev and 1.1*ws_item_rev and cs_item_rev between 0.9*ws_item_rev and 1.1*ws_item_rev order by item_id, ss_item_rev [_LIMITC]; ```
--- ## Q59 — Weekly Store Sales YoY Day-of-Week Ratios Computes ratios of weekly store sales (by day of week) comparing two consecutive 12-month periods, for stores with matching IDs. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DMS` | integer | 1176–1212 |
TPC-DS Reference SQL (query59.tpl) ```sql with wss as ( select d_week_seq, ss_store_sk, sum(case when d_day_name='Sunday' then ss_sales_price else null end) sun_sales, sum(case when d_day_name='Monday' then ss_sales_price else null end) mon_sales, sum(case when d_day_name='Tuesday' then ss_sales_price else null end) tue_sales, sum(case when d_day_name='Wednesday' then ss_sales_price else null end) wed_sales, sum(case when d_day_name='Thursday' then ss_sales_price else null end) thu_sales, sum(case when d_day_name='Friday' then ss_sales_price else null end) fri_sales, sum(case when d_day_name='Saturday' then ss_sales_price else null end) sat_sales from store_sales, date_dim where d_date_sk = ss_sold_date_sk group by d_week_seq, ss_store_sk) [_LIMITA] select [_LIMITB] s_store_name1, s_store_id1, d_week_seq1, sun_sales1/sun_sales2, mon_sales1/mon_sales2, tue_sales1/tue_sales2, wed_sales1/wed_sales2, thu_sales1/thu_sales2, fri_sales1/fri_sales2, sat_sales1/sat_sales2 from (select s_store_name s_store_name1, wss.d_week_seq d_week_seq1, s_store_id s_store_id1, sun_sales sun_sales1, mon_sales mon_sales1, ... from wss, store, date_dim d where d.d_week_seq = wss.d_week_seq and ss_store_sk = s_store_sk and d_month_seq between [DMS] and [DMS]+11) y, (select ... from wss, store, date_dim d where d_month_seq between [DMS]+12 and [DMS]+23) x where s_store_id1 = s_store_id2 and d_week_seq1 = d_week_seq2-52 order by s_store_name1, s_store_id1, d_week_seq1 [_LIMITC]; ```
--- ## Q60 — Cross-Channel Sales by Category in a Region Analogous to Q33 and Q56 but filters by item category and GMT offset in months 8–10. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `MONTH` | integer | 8–10 | | `GMT` | list | GMT offset derived from county | | `CATEGORY` | list | Children, Men, Music, Jewelry, Shoes |
TPC-DS Reference SQL (query60.tpl) ```sql with ss as ( select i_item_id, sum(ss_ext_sales_price) total_sales from store_sales, date_dim, customer_address, item where i_item_id in (select i_item_id from item where i_category in ('[CATEGORY]')) and ss_item_sk = i_item_sk and ss_sold_date_sk = d_date_sk and d_year = [YEAR] and d_moy = [MONTH] and ss_addr_sk = ca_address_sk and ca_gmt_offset = [GMT] group by i_item_id), cs as (... same for catalog_sales ...), ws as (... same for web_sales ...) [_LIMITA] select [_LIMITB] i_item_id, sum(total_sales) total_sales from (select * from ss union all select * from cs union all select * from ws) tmp1 group by i_item_id order by i_item_id, total_sales [_LIMITC]; ```
--- ## Q61 — Promotional vs. Total Store Sales Ratio Computes the percentage of store sales that came through promotional channels (email, mail, TV) for a given category, GMT offset, month, and year. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `MONTH` | integer | 11–12 | | `GMT` | list | -6 or -7 | | `CATEGORY` | list | Books, Home, Electronics, Jewelry, Sports |
TPC-DS Reference SQL (query61.tpl) ```sql [_LIMITA] select [_LIMITB] promotions, total, cast(promotions as decimal(15,4))/cast(total as decimal(15,4))*100 from (select sum(ss_ext_sales_price) promotions from store_sales, store, promotion, date_dim, customer, customer_address, item where ss_sold_date_sk = d_date_sk and ss_store_sk = s_store_sk and ss_promo_sk = p_promo_sk and ss_customer_sk = c_customer_sk and ca_address_sk = c_current_addr_sk and ss_item_sk = i_item_sk and ca_gmt_offset = [GMT] and i_category = '[CATEGORY]' and (p_channel_dmail = 'Y' or p_channel_email = 'Y' or p_channel_tv = 'Y') and s_gmt_offset = [GMT] and d_year = [YEAR] and d_moy = [MONTH]) promotional_sales, (select sum(ss_ext_sales_price) total from store_sales, store, date_dim, customer, customer_address, item where ... same without promotion filter ...) all_sales order by promotions, total [_LIMITC]; ```
--- ## Q62 — Web Order Shipping Latency Buckets Reports counts of web orders in time buckets (≤30 days to >120 days) from sale to ship date, per warehouse, shipping mode, and web site. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DMS` | integer | 1176–1224 |
TPC-DS Reference SQL (query62.tpl) ```sql [_LIMITA] select [_LIMITB] substr(w_warehouse_name,1,20), sm_type, web_name, sum(case when ws_ship_date_sk - ws_sold_date_sk <= 30 then 1 else 0 end) as "30 days", sum(case when ws_ship_date_sk - ws_sold_date_sk > 30 and ws_ship_date_sk - ws_sold_date_sk <= 60 then 1 else 0 end) as "31-60 days", sum(case when ws_ship_date_sk - ws_sold_date_sk > 60 and ws_ship_date_sk - ws_sold_date_sk <= 90 then 1 else 0 end) as "61-90 days", sum(case when ws_ship_date_sk - ws_sold_date_sk > 90 and ws_ship_date_sk - ws_sold_date_sk <= 120 then 1 else 0 end) as "91-120 days", sum(case when ws_ship_date_sk - ws_sold_date_sk > 120 then 1 else 0 end) as ">120 days" from web_sales, warehouse, ship_mode, web_site, date_dim where d_month_seq between [DMS] and [DMS]+11 and ws_ship_date_sk = d_date_sk and ws_warehouse_sk = w_warehouse_sk and ws_ship_mode_sk = sm_ship_mode_sk and ws_web_site_sk = web_site_sk group by substr(w_warehouse_name,1,20), sm_type, web_name order by substr(w_warehouse_name,1,20), sm_type, web_name [_LIMITC]; ```
--- ## Q63 — Manager-Level Monthly Sales Deviation Analogous to Q53 but groups by manager instead of manufacturer, finding managers with monthly sales deviation > 10% of their personal monthly average. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DMS` | integer | 1176–1224 |
TPC-DS Reference SQL (query63.tpl) ```sql [_LIMITA] select [_LIMITB] * from (select i_manager_id, sum(ss_sales_price) sum_sales, avg(sum(ss_sales_price)) over (partition by i_manager_id) avg_monthly_sales from item, store_sales, date_dim, store where ss_item_sk = i_item_sk and ss_sold_date_sk = d_date_sk and ss_store_sk = s_store_sk and d_month_seq in ([DMS],[DMS]+1,...,[DMS]+11) and ((i_category in ('Books','Children','Electronics') and ...) or (i_category in ('Women','Music','Men') and ...)) group by i_manager_id, d_moy) tmp1 where abs(sum_sales - avg_monthly_sales)/avg_monthly_sales > 0.1 order by i_manager_id, avg_monthly_sales, sum_sales [_LIMITC]; ```
--- ## Q64 — Cross-Year Item and Store Sales with Customer Demographics One of the most complex TPC-DS queries. Joins store sales, returns, catalog cross-sales, customer demographics, household demographics, addresses, income bands, and promotions to compare two consecutive years (syear and fsyear) for items with low return ratios. | Parameter | Type | Range / Values | |-----------|------|---------------| | `COLOR` | list (×6) | 6 draws from 67 color names — substituted as `{COLOR.1}`…`{COLOR.6}` | | `PRICE` | integer | 0–85 | | `YEAR` | integer | 1999–2001 |
TPC-DS Reference SQL (query64.tpl) ```sql with cs_ui as ( select cs_item_sk, sum(cs_ext_list_price) as sale, sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit) as refund from catalog_sales, catalog_returns where cs_item_sk = cr_item_sk and cs_order_number = cr_order_number group by cs_item_sk having sum(cs_ext_list_price) > 2*sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit)), cross_sales as ( select i_product_name, i_item_sk, s_store_name, s_zip, ... demographics, address, date fields ..., count(*) cnt, sum(ss_wholesale_cost) s1, sum(ss_list_price) s2, sum(ss_coupon_amt) s3 from store_sales, store_returns, cs_ui, date_dim d1, date_dim d2, date_dim d3, store, customer, customer_demographics cd1, customer_demographics cd2, promotion, household_demographics hd1, hd2, customer_address ad1, ad2, income_band ib1, ib2, item where ... complex join conditions ... and cd1.cd_marital_status <> cd2.cd_marital_status and i_color in ('[COLOR.1]',... ,'[COLOR.6]') and i_current_price between [PRICE] and [PRICE]+15 group by i_product_name, i_item_sk, s_store_name, s_zip, ...) select cs1.product_name, cs1.store_name, cs1.store_zip, ..., cs1.syear, cs1.cnt, cs1.s1 as s11, cs1.s2 as s21, cs1.s3 as s31, cs2.s1 as s12, cs2.s2 as s22, cs2.s3 as s32, cs2.syear, cs2.cnt from cross_sales cs1, cross_sales cs2 where cs1.item_sk = cs2.item_sk and cs1.syear = [YEAR] and cs2.syear = [YEAR]+1 and cs2.cnt <= cs1.cnt and cs1.store_name = cs2.store_name and cs1.store_zip = cs2.store_zip order by cs1.product_name, cs1.store_name, cs2.cnt, cs1.s1, cs2.s1; ```
--- ## Q65 — Items Underperforming Store Average Finds items whose revenue in a 12-month sequence is at most 10% of the store's average item revenue. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DMS` | integer | 1176–1224 |
TPC-DS Reference SQL (query65.tpl) ```sql [_LIMITA] select [_LIMITB] s_store_name, i_item_desc, sc.revenue, i_current_price, i_wholesale_cost, i_brand from store, item, (select ss_store_sk, avg(revenue) as ave from (select ss_store_sk, ss_item_sk, sum(ss_sales_price) as revenue from store_sales, date_dim where ss_sold_date_sk = d_date_sk and d_month_seq between [DMS] and [DMS]+11 group by ss_store_sk, ss_item_sk) sa group by ss_store_sk) sb, (select ss_store_sk, ss_item_sk, sum(ss_sales_price) as revenue from store_sales, date_dim where ss_sold_date_sk = d_date_sk and d_month_seq between [DMS] and [DMS]+11 group by ss_store_sk, ss_item_sk) sc where sb.ss_store_sk = sc.ss_store_sk and sc.revenue <= 0.1 * sb.ave and s_store_sk = sc.ss_store_sk and i_item_sk = sc.ss_item_sk order by s_store_name, i_item_desc [_LIMITC]; ```
--- ## Q66 — Warehouse Monthly Sales and Net by Ship Mode (Pivot) Pivots monthly sales amounts and net amounts for web and catalog channels across all 12 months, per warehouse, filtered by time window and ship mode carrier. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `TIMEONE` | integer | 1–57597 (seconds from midnight) | | `SMC` | list (×2) | 2 draws from carrier names — substituted as `{SMC.1}`, `{SMC.2}` | | `NETONE` | list | web net price column | | `NETTWO` | list | catalog net price column | | `SALESONE` | list | web sales price column | | `SALESTWO` | list | catalog sales price column |
TPC-DS Reference SQL (query66.tpl) ```sql [_LIMITA] select [_LIMITB] w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, ship_carriers, year, sum(jan_sales) as jan_sales, ..., sum(dec_sales) as dec_sales, sum(jan_sales/w_warehouse_sq_ft) as jan_sales_per_sq_foot, ..., sum(jan_net) as jan_net, ..., sum(dec_net) as dec_net from (select w_warehouse_name, ..., '[SMC.1]'||','||'[SMC.2]' as ship_carriers, d_year as year, sum(case when d_moy=1 then [SALESONE]*ws_quantity else 0 end) as jan_sales, ..., sum(case when d_moy=1 then [NETONE]*ws_quantity else 0 end) as jan_net, ... from web_sales, warehouse, date_dim, time_dim, ship_mode where ws_warehouse_sk = w_warehouse_sk and ws_sold_date_sk = d_date_sk and ws_sold_time_sk = t_time_sk and ws_ship_mode_sk = sm_ship_mode_sk and d_year = [YEAR] and t_time between [TIMEONE] and [TIMEONE]+28800 and sm_carrier in ('[SMC.1]','[SMC.2]') group by w_warehouse_name, ..., d_year union all select ... same for catalog_sales ...) x group by w_warehouse_name, ..., ship_carriers, year order by w_warehouse_name [_LIMITC]; ```
MariaDB / MonetDB / PostgreSQL dialects The string concatenation `'[SMC.1]' || ',' || '[SMC.2]'` is standard SQL. The bexhoma MariaDB variant may use `CONCAT()` instead.
--- ## Q67 — Top-100 Items by Category, Brand, and Rollup Ranks items within category by total sales using `ROLLUP` over multiple dimensions, returning only those with rank ≤ 100. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DMS` | integer | 1176–1224 |
TPC-DS Reference SQL (query67.tpl) ```sql [_LIMITA] select [_LIMITB] * from (select i_category, i_class, i_brand, i_product_name, d_year, d_qoy, d_moy, s_store_id, sum(coalesce(ss_sales_price*ss_quantity,0)) sumsales, rank() over (partition by i_category order by sumsales desc) rk from (select i_category, i_class, i_brand, i_product_name, d_year, d_qoy, d_moy, s_store_id, sum(coalesce(ss_sales_price*ss_quantity,0)) sumsales from store_sales, date_dim, store, item where ss_sold_date_sk = d_date_sk and ss_item_sk = i_item_sk and ss_store_sk = s_store_sk and d_month_seq between [DMS] and [DMS]+11 group by rollup(i_category, i_class, i_brand, i_product_name, d_year, d_qoy, d_moy, s_store_id)) dw1) dw2 where rk <= 100 order by i_category, i_class, i_brand, i_product_name, d_year, d_qoy, d_moy, s_store_id, sumsales, rk [_LIMITC]; ```
MariaDB / MonetDB / PostgreSQL dialects MariaDB uses `GROUP BY i_category, i_class, ... WITH ROLLUP`. MonetDB and PostgreSQL use standard `GROUP BY ROLLUP(...)`.
--- ## Q68 — Customer City vs. Purchase City Mismatch (2 Cities) Simplified version of Q46 with only 2 store cities and purchases on days-of-month 1–2. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DEPCNT` | integer | 0–9 | | `YEAR` | integer | 1998–2000 | | `VEHCNT` | integer | -1–4 | | `CITY` | list (×2) | 2 draws from active store cities — substituted as `{CITY.1}`, `{CITY.2}` |
TPC-DS Reference SQL (query68.tpl) ```sql [_LIMITA] select [_LIMITB] c_last_name, c_first_name, ca_city, bought_city, ss_ticket_number, extended_price, extended_tax, list_price from (select ss_ticket_number, ss_customer_sk, ca_city bought_city, sum(ss_ext_sales_price) extended_price, sum(ss_ext_list_price) list_price, sum(ss_ext_tax) extended_tax from store_sales, date_dim, store, household_demographics, customer_address where store_sales.ss_sold_date_sk = date_dim.d_date_sk and store_sales.ss_store_sk = store.s_store_sk and store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk and store_sales.ss_addr_sk = customer_address.ca_address_sk and date_dim.d_dom between 1 and 2 and (household_demographics.hd_dep_count = [DEPCNT] or household_demographics.hd_vehicle_count = [VEHCNT]) and date_dim.d_year in ([YEAR],[YEAR]+1,[YEAR]+2) and store.s_city in ('[CITY_A]','[CITY_B]') group by ss_ticket_number, ss_customer_sk, ss_addr_sk, ca_city) dn, customer, customer_address current_addr where ss_customer_sk = c_customer_sk and customer.c_current_addr_sk = current_addr.ca_address_sk and current_addr.ca_city <> bought_city order by c_last_name, ss_ticket_number [_LIMITC]; ```
--- ## Q69 — Store-Only Customers Not Buying Web or Catalog Finds customers who shopped in-store in a 3-month window but NOT via web or catalog in the same period, reporting demographic distributions. | Parameter | Type | Range / Values | |-----------|------|---------------| | `MONTH` | integer | 1–4 | | `YEAR` | integer | 1999–2004 | | `STATE` | list (×3) | 3 draws from state codes — substituted as `{STATE.1}`…`{STATE.3}` |
TPC-DS Reference SQL (query69.tpl) ```sql [_LIMITA] select [_LIMITB] cd_gender, cd_marital_status, cd_education_status, count(*) cnt1, cd_purchase_estimate, count(*) cnt2, cd_credit_rating, count(*) cnt3 from customer c, customer_address ca, customer_demographics where c.c_current_addr_sk = ca.ca_address_sk and ca_state in ('[STATE.1]','[STATE.2]','[STATE.3]') and cd_demo_sk = c.c_current_cdemo_sk and exists (select * from store_sales, date_dim where c.c_customer_sk = ss_customer_sk and ss_sold_date_sk = d_date_sk and d_year = [YEAR] and d_moy between [MONTH] and [MONTH]+2) and not exists (select * from web_sales, date_dim where c.c_customer_sk = ws_bill_customer_sk and ws_sold_date_sk = d_date_sk and d_year = [YEAR] and d_moy between [MONTH] and [MONTH]+2) and not exists (select * from catalog_sales, date_dim where c.c_customer_sk = cs_ship_customer_sk and cs_sold_date_sk = d_date_sk and d_year = [YEAR] and d_moy between [MONTH] and [MONTH]+2) group by cd_gender, cd_marital_status, cd_education_status, cd_purchase_estimate, cd_credit_rating order by cd_gender, cd_marital_status, cd_education_status, cd_purchase_estimate, cd_credit_rating [_LIMITC]; ```
--- ## Q70 — Store Net Profit by State and County with Rollup Ranks stores within the top-5 states by net profit, using `ROLLUP` to generate subtotals at state and county level. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DMS` | integer | 1176–1224 |
TPC-DS Reference SQL (query70.tpl) ```sql [_LIMITA] select [_LIMITB] sum(ss_net_profit) as total_sum, s_state, s_county, grouping(s_state)+grouping(s_county) as lochierarchy, rank() over (partition by grouping(s_state)+grouping(s_county), case when grouping(s_county) = 0 then s_state end order by sum(ss_net_profit) desc) as rank_within_parent from store_sales, date_dim d1, store where d1.d_month_seq between [DMS] and [DMS]+11 and d1.d_date_sk = ss_sold_date_sk and s_store_sk = ss_store_sk and s_state in (select s_state from (select s_state, rank() over (partition by s_state order by sum(ss_net_profit) desc) as ranking from store_sales, store, date_dim where d_month_seq between [DMS] and [DMS]+11 and d_date_sk = ss_sold_date_sk and s_store_sk = ss_store_sk group by s_state) tmp1 where ranking <= 5) group by rollup(s_state, s_county) order by lochierarchy desc, case when lochierarchy = 0 then s_state end, rank_within_parent [_LIMITC]; ```
MariaDB / MonetDB / PostgreSQL dialects MariaDB uses `GROUP BY s_state, s_county WITH ROLLUP`. MonetDB and PostgreSQL use standard `GROUP BY ROLLUP(s_state, s_county)`.
--- ## Q71 — Cross-Channel Brand Sales by Time of Day Sums external sales across all three channels for manager ID = 1 brands, sold during breakfast or dinner meal times in a given month and year. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `MONTH` | integer | 11–12 |
TPC-DS Reference SQL (query71.tpl) ```sql select i_brand_id brand_id, i_brand brand, t_hour, t_minute, sum(ext_price) ext_price from item, (select ws_ext_sales_price as ext_price, ws_sold_date_sk as sold_date_sk, ws_item_sk as sold_item_sk, ws_sold_time_sk as time_sk from web_sales, date_dim where d_date_sk = ws_sold_date_sk and d_moy = [MONTH] and d_year = [YEAR] union all select cs_ext_sales_price, cs_sold_date_sk, cs_item_sk, cs_sold_time_sk from catalog_sales, date_dim where d_date_sk = cs_sold_date_sk and d_moy = [MONTH] and d_year = [YEAR] union all select ss_ext_sales_price, ss_sold_date_sk, ss_item_sk, ss_sold_time_sk from store_sales, date_dim where d_date_sk = ss_sold_date_sk and d_moy = [MONTH] and d_year = [YEAR]) tmp, time_dim where sold_item_sk = i_item_sk and i_manager_id = 1 and time_sk = t_time_sk and (t_meal_time = 'breakfast' or t_meal_time = 'dinner') group by i_brand, i_brand_id, t_hour, t_minute order by ext_price desc, i_brand_id; ```
--- ## Q72 — Catalog Orders with Inventory Shortfalls (No Promotion or Returned) Counts and breaks out catalog orders where inventory was insufficient at sale time, shipped after 5+ days, without promotion or with returns. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `BP` | list | `1001-5000`, `>10000`, `501-1000` | | `MS` | list | marital status codes |
TPC-DS Reference SQL (query72.tpl) ```sql [_LIMITA] select [_LIMITB] i_item_desc, w_warehouse_name, d1.d_week_seq, sum(case when p_promo_sk is null then 1 else 0 end) no_promo, sum(case when p_promo_sk is not null then 1 else 0 end) promo, count(*) total_cnt from catalog_sales join inventory on (cs_item_sk = inv_item_sk) join warehouse on (w_warehouse_sk = inv_warehouse_sk) join item on (i_item_sk = cs_item_sk) join customer_demographics on (cs_bill_cdemo_sk = cd_demo_sk) join household_demographics on (cs_bill_hdemo_sk = hd_demo_sk) join date_dim d1 on (cs_sold_date_sk = d1.d_date_sk) join date_dim d2 on (inv_date_sk = d2.d_date_sk) join date_dim d3 on (cs_ship_date_sk = d3.d_date_sk) left outer join promotion on (cs_promo_sk = p_promo_sk) left outer join catalog_returns on (cr_item_sk = cs_item_sk and cr_order_number = cs_order_number) where d1.d_week_seq = d2.d_week_seq and inv_quantity_on_hand < cs_quantity and d3.d_date > d1.d_date + 5 and hd_buy_potential = '[BP]' and d1.d_year = [YEAR] and cd_marital_status = '[MS]' group by i_item_desc, w_warehouse_name, d1.d_week_seq order by total_cnt desc, i_item_desc, w_warehouse_name, d_week_seq [_LIMITC]; ```
--- ## Q73 — Customer Ticket Frequency (1–5 Purchases) in Specific Counties Finds customers with 1–5 store purchases in households with good buy potential, for early-month (dom 1–2) purchases across 4 counties. | Parameter | Type | Range / Values | |-----------|------|---------------| | `BPONE` | list | `1001-5000`, `>10000`, `501-1000` | | `BPTWO` | list | `0-500`, `Unknown`, `5001-10000` | | `YEAR` | integer | 1998–2000 | | `COUNTY` | list (×4) | 4 draws from active store counties — substituted as `{COUNTY.1}`…`{COUNTY.4}` |
TPC-DS Reference SQL (query73.tpl) ```sql select c_last_name, c_first_name, c_salutation, c_preferred_cust_flag, ss_ticket_number, cnt from (select ss_ticket_number, ss_customer_sk, count(*) cnt from store_sales, date_dim, store, household_demographics where store_sales.ss_sold_date_sk = date_dim.d_date_sk and store_sales.ss_store_sk = store.s_store_sk and store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk and date_dim.d_dom between 1 and 2 and (household_demographics.hd_buy_potential = '[BPONE]' or household_demographics.hd_buy_potential = '[BPTWO]') and household_demographics.hd_vehicle_count > 0 and household_demographics.hd_dep_count/household_demographics.hd_vehicle_count > 1 and date_dim.d_year in ([YEAR],[YEAR]+1,[YEAR]+2) and store.s_county in ('[COUNTY.1]','[COUNTY.2]','[COUNTY.3]','[COUNTY.4]') group by ss_ticket_number, ss_customer_sk) dj, customer where ss_customer_sk = c_customer_sk and cnt between 1 and 5 order by cnt desc, c_last_name asc; ```
--- ## Q74 — YoY Growth in Store vs. Web Net Paid Finds customers whose web sales growth year-over-year exceeded their store sales growth, using an aggregate function on net paid amounts. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2001 | | `AGGONE` | list | sum, min, max, avg, stddev_samp | | `ORDERC` | integer (×3) | 3 draws from 1–3 (column positions used in ORDER BY) |
TPC-DS Reference SQL (query74.tpl) ```sql with year_total as ( select c_customer_id customer_id, c_first_name, c_last_name, d_year as year, [AGGONE](ss_net_paid) year_total, 's' sale_type from customer, store_sales, date_dim where c_customer_sk = ss_customer_sk and ss_sold_date_sk = d_date_sk and d_year in ([YEAR],[YEAR]+1) group by c_customer_id, c_first_name, c_last_name, d_year union all select c_customer_id, c_first_name, c_last_name, d_year, [AGGONE](ws_net_paid), 'w' from customer, web_sales, date_dim where c_customer_sk = ws_bill_customer_sk and ws_sold_date_sk = d_date_sk and d_year in ([YEAR],[YEAR]+1) group by c_customer_id, c_first_name, c_last_name, d_year) [_LIMITA] select [_LIMITB] t_s_secyear.customer_id, t_s_secyear.customer_first_name, t_s_secyear.customer_last_name from year_total t_s_firstyear, year_total t_s_secyear, year_total t_w_firstyear, year_total t_w_secyear where t_s_firstyear.customer_id = t_s_secyear.customer_id and t_s_firstyear.customer_id = t_w_firstyear.customer_id and t_s_firstyear.customer_id = t_w_secyear.customer_id and t_s_firstyear.sale_type = 's' and t_w_firstyear.sale_type = 'w' and t_s_secyear.sale_type = 's' and t_w_secyear.sale_type = 'w' and t_s_firstyear.year = [YEAR] and t_s_secyear.year = [YEAR]+1 and t_w_firstyear.year = [YEAR] and t_w_secyear.year = [YEAR]+1 and t_s_firstyear.year_total > 0 and t_w_firstyear.year_total > 0 and t_w_secyear.year_total/t_w_firstyear.year_total > t_s_secyear.year_total/t_s_firstyear.year_total order by [ORDERC.1], [ORDERC.2], [ORDERC.3] [_LIMITC]; ```
--- ## Q75 — Cross-Channel Sales Count and Amount YoY Computes total quantity sold and amount for each item dimension (brand, class, category, manufacturer) across all three channels in two consecutive years, flagging items with more than 10% drop in quantity. | Parameter | Type | Range / Values | |-----------|------|---------------| | `CATEGORY` | list | derived from categories distribution | | `YEAR` | integer | 1999–2002 |
TPC-DS Reference SQL (query75.tpl) ```sql with all_sales as ( select d_year, i_brand_id, i_class_id, i_category_id, i_manufact_id, sum(sales_cnt) as sales_cnt, sum(sales_amt) as sales_amt from (select d_year, i_brand_id, i_class_id, i_category_id, i_manufact_id, cs_quantity - coalesce(cr_return_quantity,0) as sales_cnt, cs_ext_sales_price - coalesce(cr_return_amount,0.0) as sales_amt from catalog_sales join item ... join date_dim ... left join catalog_returns ... where i_category = '[CATEGORY]' union select ... from store_sales ... where i_category = '[CATEGORY]' union select ... from web_sales ... where i_category = '[CATEGORY]') sales_detail group by d_year, i_brand_id, i_class_id, i_category_id, i_manufact_id) [_LIMITA] select [_LIMITB] prev_yr.d_year as prev_year, curr_yr.d_year as year, curr_yr.i_brand_id, curr_yr.i_class_id, curr_yr.i_category_id, curr_yr.i_manufact_id, prev_yr.sales_cnt as prev_yr_cnt, curr_yr.sales_cnt as curr_yr_cnt, curr_yr.sales_cnt - prev_yr.sales_cnt as sales_cnt_diff, curr_yr.sales_amt - prev_yr.sales_amt as sales_amt_diff from all_sales curr_yr, all_sales prev_yr where curr_yr.i_brand_id = prev_yr.i_brand_id and ... and curr_yr.d_year = [YEAR] and prev_yr.d_year = [YEAR]-1 and cast(curr_yr.sales_cnt as decimal(17,2))/cast(prev_yr.sales_cnt as decimal(17,2)) < 0.9 order by sales_cnt_diff, sales_amt_diff [_LIMITC]; ```
--- ## Q76 — Null-Column Sales Count by Channel Finds sales records with a null value in a specific key column (one of many possible nullable FKs) across all three channels, reporting counts and totals by channel, year, quarter, and category. | Parameter | Type | Range / Values | |-----------|------|---------------| | `NULLCOLSS` | list | nullable store sales FK column | | `NULLCOLWS` | list | nullable web sales FK column | | `NULLCOLCS` | list | nullable catalog sales FK column |
TPC-DS Reference SQL (query76.tpl) ```sql [_LIMITA] select [_LIMITB] channel, col_name, d_year, d_qoy, i_category, count(*) sales_cnt, sum(ext_sales_price) sales_amt from (select 'store' as channel, '[NULLCOLSS]' col_name, d_year, d_qoy, i_category, ss_ext_sales_price ext_sales_price from store_sales, item, date_dim where [NULLCOLSS] is null and ss_sold_date_sk = d_date_sk and ss_item_sk = i_item_sk union all select 'web', '[NULLCOLWS]', d_year, d_qoy, i_category, ws_ext_sales_price from web_sales, item, date_dim where [NULLCOLWS] is null and ws_sold_date_sk = d_date_sk and ws_item_sk = i_item_sk union all select 'catalog', '[NULLCOLCS]', d_year, d_qoy, i_category, cs_ext_sales_price from catalog_sales, item, date_dim where [NULLCOLCS] is null and cs_sold_date_sk = d_date_sk and cs_item_sk = i_item_sk) foo group by channel, col_name, d_year, d_qoy, i_category order by channel, col_name, d_year, d_qoy, i_category [_LIMITC]; ```
--- ## Q77 — Cross-Channel Net Profit with Returns (Rollup) Computes sales, returns, and net profit for each channel (store, catalog, web) in a 30-day period, using `ROLLUP` to generate channel-level and grand totals. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `SALES_DATE` | date | derived: Aug 1 – Aug 30 of `{YEAR}` |
TPC-DS Reference SQL (query77.tpl) ```sql with ss as ( select s_store_sk, sum(ss_ext_sales_price) as sales, sum(ss_net_profit) as profit from store_sales, date_dim, store where ss_sold_date_sk = d_date_sk and d_date between '[SALES_DATE]' and '[SALES_DATE]'+30 days and ss_store_sk = s_store_sk group by s_store_sk), sr as (... store returns in same window ...), cs as (... catalog sales ...), cr as (... catalog returns ...), ws as (... web sales ...), wr as (... web returns ...) [_LIMITA] select [_LIMITB] channel, id, sum(sales) as sales, sum(returns) as returns, sum(profit) as profit from (select 'store channel' as channel, ss.s_store_sk as id, sales, coalesce(returns,0), (profit-coalesce(profit_loss,0)) as profit from ss left join sr on ss.s_store_sk = sr.s_store_sk union all select 'catalog channel', cs_call_center_sk, sales, returns, (profit-profit_loss) from cs, cr union all select 'web channel', ws.wp_web_page_sk, sales, coalesce(returns,0), (profit-coalesce(profit_loss,0)) from ws left join wr on ws.wp_web_page_sk = wr.wp_web_page_sk) x group by rollup (channel, id) order by channel, id [_LIMITC]; ```
MariaDB / MonetDB / PostgreSQL dialects MariaDB uses `GROUP BY channel, id WITH ROLLUP`. MonetDB and PostgreSQL use standard `GROUP BY ROLLUP(channel, id)`.
--- ## Q78 — Store-Only Buyers vs. Other-Channel Ratios Computes the ratio of store quantity/cost/price to web+catalog equivalents for customers who did NOT have returns in the respective channel. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `SELECTONE` | list | column(s) used in SELECT and ORDER BY |
TPC-DS Reference SQL (query78.tpl) ```sql with ws as ( select d_year as ws_sold_year, ws_item_sk, ws_bill_customer_sk ws_customer_sk, sum(ws_quantity) ws_qty, sum(ws_wholesale_cost) ws_wc, sum(ws_sales_price) ws_sp from web_sales left join web_returns on wr_order_number = ws_order_number and ws_item_sk = wr_item_sk join date_dim on ws_sold_date_sk = d_date_sk where wr_order_number is null group by d_year, ws_item_sk, ws_bill_customer_sk), cs as (... same for catalog_sales without returns ...), ss as (... same for store_sales without returns ...) [_LIMITA] select [_LIMITB] [SELECTONE], round(ss_qty/(coalesce(ws_qty,0)+coalesce(cs_qty,0)),2) ratio, ss_qty store_qty, ss_wc store_wholesale_cost, ss_sp store_sales_price, coalesce(ws_qty,0)+coalesce(cs_qty,0) other_chan_qty, ... from ss left join ws on (ws_sold_year = ss_sold_year and ws_item_sk = ss_item_sk and ws_customer_sk = ss_customer_sk) left join cs on (cs_sold_year = ss_sold_year and cs_item_sk = ss_item_sk and cs_customer_sk = ss_customer_sk) where (coalesce(ws_qty,0)>0 or coalesce(cs_qty,0)>0) and ss_sold_year = [YEAR] order by [SELECTONE], ss_qty desc, ... [_LIMITC]; ```
--- ## Q79 — Monday Store Purchases in Large Stores Finds customers who bought on Mondays (`d_dow = 1`) in stores with 200–295 employees, across multiple years, reporting coupon amount and net profit. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DEPCNT` | integer | 0–9 | | `YEAR` | integer | 1998–2000 | | `VEHCNT` | integer | -1–4 |
TPC-DS Reference SQL (query79.tpl) ```sql [_LIMITA] select [_LIMITB] c_last_name, c_first_name, substr(s_city,1,30), ss_ticket_number, amt, profit from (select ss_ticket_number, ss_customer_sk, store.s_city, sum(ss_coupon_amt) amt, sum(ss_net_profit) profit from store_sales, date_dim, store, household_demographics where store_sales.ss_sold_date_sk = date_dim.d_date_sk and store_sales.ss_store_sk = store.s_store_sk and store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk and (household_demographics.hd_dep_count = [DEPCNT] or household_demographics.hd_vehicle_count > [VEHCNT]) and date_dim.d_dow = 1 and date_dim.d_year in ([YEAR],[YEAR]+1,[YEAR]+2) and store.s_number_employees between 200 and 295 group by ss_ticket_number, ss_customer_sk, ss_addr_sk, store.s_city) ms, customer where ss_customer_sk = c_customer_sk order by c_last_name, c_first_name, substr(s_city,1,30), profit [_LIMITC]; ```
--- ## Q80 — Channel Profit/Loss by Store/Catalog/Web Page (Rollup) Computes sales, returns, and net profit for each channel (store, catalog, web), filtered to items priced > $50 and not promoted via TV, with `ROLLUP` for totals. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `SALES_DATE` | date | derived: Aug 1 – Aug 30 of `{YEAR}` |
TPC-DS Reference SQL (query80.tpl) ```sql with ssr as ( select s_store_id as store_id, sum(ss_ext_sales_price) as sales, sum(coalesce(sr_return_amt,0)) as returns, sum(ss_net_profit - coalesce(sr_net_loss,0)) as profit from store_sales left outer join store_returns on (ss_item_sk = sr_item_sk and ss_ticket_number = sr_ticket_number), date_dim, store, item, promotion where ss_sold_date_sk = d_date_sk and d_date between '[SALES_DATE]' and '[SALES_DATE]'+30 days and ss_store_sk = s_store_sk and ss_item_sk = i_item_sk and i_current_price > 50 and ss_promo_sk = p_promo_sk and p_channel_tv = 'N' group by s_store_id), csr as (... similar for catalog ...), wsr as (... similar for web ...) [_LIMITA] select [_LIMITB] channel, id, sum(sales) as sales, sum(returns) as returns, sum(profit) as profit from (select 'store channel', 'store'||store_id, sales, returns, profit from ssr union all select 'catalog channel', 'catalog_page'||catalog_page_id, sales, returns, profit from csr union all select 'web channel', 'web_site'||web_site_id, sales, returns, profit from wsr) x group by rollup (channel, id) order by channel, id [_LIMITC]; ```
MariaDB / MonetDB / PostgreSQL dialects MariaDB uses `GROUP BY channel, id WITH ROLLUP`. MonetDB and PostgreSQL use standard `GROUP BY ROLLUP(channel, id)`.
--- ## Q81 — Catalog Returns Above State Average Analogous to Q30 (web returns) but for catalog returns — finds customers whose catalog return total exceeds 120% of the state average in a given year. | Parameter | Type | Range / Values | |-----------|------|---------------| | `STATE` | list | US state codes | | `YEAR` | integer | 1998–2002 |
TPC-DS Reference SQL (query81.tpl) ```sql with customer_total_return as ( select cr_returning_customer_sk as ctr_customer_sk, ca_state as ctr_state, sum(cr_return_amt_inc_tax) as ctr_total_return from catalog_returns, date_dim, customer_address where cr_returned_date_sk = d_date_sk and d_year = [YEAR] and cr_returning_addr_sk = ca_address_sk group by cr_returning_customer_sk, ca_state) [_LIMITA] select [_LIMITB] c_customer_id, c_salutation, c_first_name, c_last_name, ca_street_number, ca_street_name, ca_street_type, ca_suite_number, ca_city, ca_county, ca_state, ca_zip, ca_country, ca_gmt_offset, ca_location_type, ctr_total_return from customer_total_return ctr1, customer_address, customer where ctr1.ctr_total_return > (select avg(ctr_total_return)*1.2 from customer_total_return ctr2 where ctr1.ctr_state = ctr2.ctr_state) and ca_address_sk = c_current_addr_sk and ca_state = '[STATE]' and ctr1.ctr_customer_sk = c_customer_sk order by c_customer_id, c_salutation, ..., ctr_total_return [_LIMITC]; ```
--- ## Q82 — In-Stock Items Sold in Store Analogous to Q37 (catalog) but for store sales — finds items with given price range, 4 specific manufacturer IDs, 100–500 units in inventory, actually sold in-store. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `PRICE` | integer | 0–90 | | `INVDATE` | date | derived: Jan 1 – Jul 24 of `{YEAR}` | | `MANUFACT_ID` | integer (×4) | 4 draws from 1–1000 |
TPC-DS Reference SQL (query82.tpl) ```sql [_LIMITA] select [_LIMITB] i_item_id, i_item_desc, i_current_price from item, inventory, date_dim, store_sales where i_current_price between [PRICE] and [PRICE]+30 and inv_item_sk = i_item_sk and d_date_sk = inv_date_sk and d_date between cast('[INVDATE]' as date) and (cast('[INVDATE]' as date) + 60 days) and i_manufact_id in ([MANUFACT_ID.1],[MANUFACT_ID.2],[MANUFACT_ID.3],[MANUFACT_ID.4]) and inv_quantity_on_hand between 100 and 500 and ss_item_sk = i_item_sk group by i_item_id, i_item_desc, i_current_price order by i_item_id [_LIMITC]; ```
--- ## Q83 — Cross-Channel Return Quantity Comparison Computes return quantities per item for all three channels (store, catalog, web) in weeks that contain any of three given return dates, then joins all three for items present in all channels. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `RETURNED_DATE_ONE` | date | derived: Jan 1 – Jul 24 of `{YEAR}` | | `RETURNED_DATE_TWO` | date | derived: Aug 1 – Oct 24 of `{YEAR}` | | `RETURNED_DATE_THREE` | date | derived: Nov 1 – Nov 24 of `{YEAR}` |
TPC-DS Reference SQL (query83.tpl) ```sql with sr_items as ( select i_item_id item_id, sum(sr_return_quantity) sr_item_qty from store_returns, item, date_dim where sr_item_sk = i_item_sk and d_date in (select d_date from date_dim where d_week_seq in (select d_week_seq from date_dim where d_date in ('[RETURNED_DATE_ONE]', '[RETURNED_DATE_TWO]', '[RETURNED_DATE_THREE]'))) and sr_returned_date_sk = d_date_sk group by i_item_id), cr_items as (... same for catalog_returns ...), wr_items as (... same for web_returns ...) [_LIMITA] select [_LIMITB] sr_items.item_id, sr_item_qty, sr_item_qty/(sr_item_qty+cr_item_qty+wr_item_qty)/3.0*100 sr_dev, cr_item_qty, cr_item_qty/(sr_item_qty+cr_item_qty+wr_item_qty)/3.0*100 cr_dev, wr_item_qty, wr_item_qty/(sr_item_qty+cr_item_qty+wr_item_qty)/3.0*100 wr_dev, (sr_item_qty+cr_item_qty+wr_item_qty)/3.0 average from sr_items, cr_items, wr_items where sr_items.item_id = cr_items.item_id and sr_items.item_id = wr_items.item_id order by sr_items.item_id, sr_item_qty [_LIMITC]; ```
--- ## Q84 — Customers in a City with Store Returns and Income Band Finds customers living in a specific city with income in a $50,000 range who have made store returns, using concatenation for the customer name output. | Parameter | Type | Range / Values | |-----------|------|---------------| | `CITY` | list | from large cities distribution | | `INCOME` | integer | 0–70000 |
TPC-DS Reference SQL (query84.tpl) ```sql [_LIMITA] select [_LIMITB] c_customer_id as customer_id, coalesce(c_last_name,'')||', '||coalesce(c_first_name,'') as customername from customer, customer_address, customer_demographics, household_demographics, income_band, store_returns where ca_city = '[CITY]' and c_current_addr_sk = ca_address_sk and ib_lower_bound >= [INCOME] and ib_upper_bound <= [INCOME] + 50000 and ib_income_band_sk = hd_income_band_sk and cd_demo_sk = c_current_cdemo_sk and hd_demo_sk = c_current_hdemo_sk and sr_cdemo_sk = cd_demo_sk order by c_customer_id [_LIMITC]; ```
MariaDB / MonetDB / PostgreSQL dialects The string concatenation `coalesce(c_last_name,'')||', '||coalesce(c_first_name,'')` is standard SQL. The bexhoma MariaDB variant uses `CONCAT()` instead of `||`.
--- ## Q85 — Web Return Reason Analysis by Demographics Reports average return quantities, refund cash, and fees by return reason, filtered by demographic groups (marital status × education × price range) and state/profit combinations. | Parameter | Type | Range / Values | |-----------|------|---------------| | `MS` | list (×3) | 3 draws from marital status — substituted as `{MS.1}`…`{MS.3}` | | `ES` | list (×3) | 3 draws from education — substituted as `{ES.1}`…`{ES.3}` | | `STATE` | list (×9) | 9 draws from state codes — substituted as `{STATE.1}`…`{STATE.9}` | | `YEAR` | integer | 1998–2002 |
TPC-DS Reference SQL (query85.tpl) ```sql [_LIMITA] select [_LIMITB] substr(r_reason_desc,1,20), avg(ws_quantity), avg(wr_refunded_cash), avg(wr_fee) from web_sales, web_returns, web_page, customer_demographics cd1, customer_demographics cd2, customer_address, date_dim, reason where ws_web_page_sk = wp_web_page_sk and ws_item_sk = wr_item_sk and ws_order_number = wr_order_number and ws_sold_date_sk = d_date_sk and d_year = [YEAR] and cd1.cd_demo_sk = wr_refunded_cdemo_sk and cd2.cd_demo_sk = wr_returning_cdemo_sk and ca_address_sk = wr_refunded_addr_sk and r_reason_sk = wr_reason_sk and ((cd1.cd_marital_status = '[MS.1]' and cd1.cd_marital_status = cd2.cd_marital_status and cd1.cd_education_status = '[ES.1]' and cd1.cd_education_status = cd2.cd_education_status and ws_sales_price between 100.00 and 150.00) or (cd1.cd_marital_status = '[MS.2]' and ... and ws_sales_price between 50.00 and 100.00) or (cd1.cd_marital_status = '[MS.3]' and ... and ws_sales_price between 150.00 and 200.00)) and ((ca_country = 'United States' and ca_state in ('[STATE.1]','[STATE.2]','[STATE.3]') and ws_net_profit between 100 and 200) or (... '[STATE.4]'–'[STATE.6]' and profit between 150 and 300) or (... '[STATE.7]'–'[STATE.9]' and profit between 50 and 250)) group by r_reason_desc order by substr(r_reason_desc,1,20), avg(ws_quantity), avg(wr_refunded_cash), avg(wr_fee) [_LIMITC]; ```
--- ## Q86 — Web Net Paid Rollup by Category and Class Ranks web sales net paid within each category using `ROLLUP` over category and class, producing subtotals and grand total. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DMS` | integer | 1176–1224 |
TPC-DS Reference SQL (query86.tpl) ```sql [_LIMITA] select [_LIMITB] sum(ws_net_paid) as total_sum, i_category, i_class, grouping(i_category)+grouping(i_class) as lochierarchy, rank() over (partition by grouping(i_category)+grouping(i_class), case when grouping(i_class) = 0 then i_category end order by sum(ws_net_paid) desc) as rank_within_parent from web_sales, date_dim d1, item where d1.d_month_seq between [DMS] and [DMS]+11 and d1.d_date_sk = ws_sold_date_sk and i_item_sk = ws_item_sk group by rollup(i_category, i_class) order by lochierarchy desc, case when lochierarchy = 0 then i_category end, rank_within_parent [_LIMITC]; ```
MariaDB / MonetDB / PostgreSQL dialects MariaDB uses `GROUP BY i_category, i_class WITH ROLLUP`. MonetDB and PostgreSQL use standard `GROUP BY ROLLUP(i_category, i_class)`.
--- ## Q87 — Store-Only Customers (EXCEPT Catalog and Web) Counts customers who purchased in-store but not via catalog and not via web in a given 12-month period, using set difference (`EXCEPT`). | Parameter | Type | Range / Values | |-----------|------|---------------| | `DMS` | integer | 1176–1224 |
TPC-DS Reference SQL (query87.tpl) ```sql select count(*) from ((select distinct c_last_name, c_first_name, d_date from store_sales, date_dim, customer where store_sales.ss_sold_date_sk = date_dim.d_date_sk and store_sales.ss_customer_sk = customer.c_customer_sk and d_month_seq between [DMS] and [DMS]+11) except (select distinct c_last_name, c_first_name, d_date from catalog_sales, date_dim, customer where catalog_sales.cs_sold_date_sk = date_dim.d_date_sk and catalog_sales.cs_bill_customer_sk = customer.c_customer_sk and d_month_seq between [DMS] and [DMS]+11) except (select distinct c_last_name, c_first_name, d_date from web_sales, date_dim, customer where web_sales.ws_sold_date_sk = date_dim.d_date_sk and web_sales.ws_bill_customer_sk = customer.c_customer_sk and d_month_seq between [DMS] and [DMS]+11) ) cool_cust; ```
--- ## Q88 — Hourly Store Purchase Counts (Pivot by 30-Min Window) Returns 8 counts (one per 30-minute window from 8:30 AM to 12:30 PM) for customers with specific dependent-count and vehicle-count households, all in one cross-product row. | Parameter | Type | Range / Values | |-----------|------|---------------| | `HOUR` | integer (×3) | 3 draws from -1–4 (used as `hd_dep_count` values) | | `STORE` | list | from store names distribution |
TPC-DS Reference SQL (query88.tpl) ```sql select * from (select count(*) h8_30_to_9 from store_sales, household_demographics, time_dim, store where ss_sold_time_sk = time_dim.t_time_sk and ss_hdemo_sk = household_demographics.hd_demo_sk and ss_store_sk = s_store_sk and time_dim.t_hour = 8 and time_dim.t_minute >= 30 and (... hd_dep_count in ({HOUR.1},{HOUR.2},{HOUR.3} with vehicle_count ...) ...) and store.s_store_name = 'ese') s1, (select count(*) h9_to_9_30 ... t_hour = 9 and t_minute < 30 ...) s2, (select count(*) h9_30_to_10 ... t_hour = 9 and t_minute >= 30 ...) s3, (select count(*) h10_to_10_30 ... t_hour = 10 and t_minute < 30 ...) s4, (select count(*) h10_30_to_11 ... t_hour = 10 and t_minute >= 30 ...) s5, (select count(*) h11_to_11_30 ... t_hour = 11 and t_minute < 30 ...) s6, (select count(*) h11_30_to_12 ... t_hour = 11 and t_minute >= 30 ...) s7, (select count(*) h12_to_12_30 ... t_hour = 12 and t_minute < 30 ...) s8; ```
--- ## Q89 — Monthly Sales Deviation by Category and Brand Finds store items where a month's sales deviate more than 10% from the brand's monthly average, using window function `AVG(SUM(...)) OVER (PARTITION BY ...)`. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `CAT_A`…`CAT_F` | list (×6) | 6 draws of category+class pairs | | `CLASS_A`…`CLASS_F` | list (×6) | matching class draws |
TPC-DS Reference SQL (query89.tpl) ```sql [_LIMITA] select [_LIMITB] * from (select i_category, i_class, i_brand, s_store_name, s_company_name, d_moy, sum(ss_sales_price) sum_sales, avg(sum(ss_sales_price)) over (partition by i_category, i_brand, s_store_name, s_company_name) avg_monthly_sales from item, store_sales, date_dim, store where ss_item_sk = i_item_sk and ss_sold_date_sk = d_date_sk and ss_store_sk = s_store_sk and d_year in ([YEAR]) and ((i_category in ('[CAT_A]','[CAT_B]','[CAT_C]') and i_class in ('[CLASS_A]','[CLASS_B]','[CLASS_C]')) or (i_category in ('[CAT_D]','[CAT_E]','[CAT_F]') and i_class in ('[CLASS_D]','[CLASS_E]','[CLASS_F]'))) group by i_category, i_class, i_brand, s_store_name, s_company_name, d_moy) tmp1 where case when (avg_monthly_sales <> 0) then (abs(sum_sales - avg_monthly_sales) / avg_monthly_sales) else null end > 0.1 order by sum_sales - avg_monthly_sales, s_store_name [_LIMITC]; ```
--- ## Q90 — AM vs. PM Web Sales Ratio Computes the ratio of web sales counts in a morning hour window to an afternoon hour window, for customers with a specific dependent count and high-character-count web pages. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DEPCNT` | integer | 0–9 | | `HOUR_AM` | integer | 6–12 | | `HOUR_PM` | integer | 13–21 |
TPC-DS Reference SQL (query90.tpl) ```sql [_LIMITA] select [_LIMITB] cast(amc as decimal(15,4))/cast(pmc as decimal(15,4)) am_pm_ratio from (select count(*) amc from web_sales, household_demographics, time_dim, web_page where ws_sold_time_sk = time_dim.t_time_sk and ws_ship_hdemo_sk = household_demographics.hd_demo_sk and ws_web_page_sk = web_page.wp_web_page_sk and time_dim.t_hour between [HOUR_AM] and [HOUR_AM]+1 and household_demographics.hd_dep_count = [DEPCNT] and web_page.wp_char_count between 5000 and 5200) at, (select count(*) pmc from web_sales, household_demographics, time_dim, web_page where ws_sold_time_sk = time_dim.t_time_sk and ws_ship_hdemo_sk = household_demographics.hd_demo_sk and ws_web_page_sk = web_page.wp_web_page_sk and time_dim.t_hour between [HOUR_PM] and [HOUR_PM]+1 and household_demographics.hd_dep_count = [DEPCNT] and web_page.wp_char_count between 5000 and 5200) pt order by am_pm_ratio [_LIMITC]; ```
--- ## Q91 — Call Center Returns Loss by Manager Sums catalog return net losses by call center manager, filtered by customer demographics (marital status + education), buy potential, and GMT offset month. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `MONTH` | integer | 11–12 | | `BUY_POTENTIAL` | list | 1001–5000, >10000, 501–1000, 0–500, Unknown, 5001–10000 | | `GMT` | list | `-6`, `-7` |
TPC-DS Reference SQL (query91.tpl) ```sql select cc_call_center_id call_center, cc_name call_center_name, cc_manager manager, sum(cr_net_loss) returns_loss from call_center, catalog_returns, date_dim, customer, customer_address, customer_demographics, household_demographics where cr_call_center_sk = cc_call_center_sk and cr_returned_date_sk = d_date_sk and cr_returning_customer_sk = c_customer_sk and cd_demo_sk = c_current_cdemo_sk and hd_demo_sk = c_current_hdemo_sk and ca_address_sk = c_current_addr_sk and d_year = [YEAR] and d_moy = [MONTH] and ((cd_marital_status = 'M' and cd_education_status = 'Unknown') or (cd_marital_status = 'W' and cd_education_status = 'Advanced Degree')) and hd_buy_potential like '[BUY_POTENTIAL]%' and ca_gmt_offset = [GMT] group by cc_call_center_id, cc_name, cc_manager, cd_marital_status, cd_education_status order by sum(cr_net_loss) desc; ```
--- ## Q92 — Web Items with Excess Discount Amount Finds web sales where the discount amount exceeds 1.3x the average for that item, in a 90-day window, for a specific manufacturer. | Parameter | Type | Range / Values | |-----------|------|---------------| | `IMID` | integer | 1–1000 (manufacturer ID) | | `YEAR` | integer | 1998–2002 | | `WSDATE` | date | derived: Jan 1 – Apr 1 of `{YEAR}` |
TPC-DS Reference SQL (query92.tpl) ```sql [_LIMITA] select [_LIMITB] sum(ws_ext_discount_amt) as "Excess Discount Amount" from web_sales, item, date_dim where i_manufact_id = [IMID] and i_item_sk = ws_item_sk and d_date between '[WSDATE]' and (cast('[WSDATE]' as date) + 90 days) and d_date_sk = ws_sold_date_sk and ws_ext_discount_amt > (select 1.3 * avg(ws_ext_discount_amt) from web_sales, date_dim where ws_item_sk = i_item_sk and d_date between '[WSDATE]' and (cast('[WSDATE]' as date) + 90 days) and d_date_sk = ws_sold_date_sk) order by sum(ws_ext_discount_amt) [_LIMITC]; ```
--- ## Q93 — Customer Net Sales After Returns by Reason Computes effective net sales per customer (quantity minus returns, times price), for store returns with a specific return reason. | Parameter | Type | Range / Values | |-----------|------|---------------| | `REASON` | list | from return reasons distribution |
TPC-DS Reference SQL (query93.tpl) ```sql [_LIMITA] select [_LIMITB] ss_customer_sk, sum(act_sales) sumsales from (select ss_item_sk, ss_ticket_number, ss_customer_sk, case when sr_return_quantity is not null then (ss_quantity - sr_return_quantity) * ss_sales_price else (ss_quantity * ss_sales_price) end act_sales from store_sales left outer join store_returns on (sr_item_sk = ss_item_sk and sr_ticket_number = ss_ticket_number), reason where sr_reason_sk = r_reason_sk and r_reason_desc = '[REASON]') t group by ss_customer_sk order by sumsales, ss_customer_sk [_LIMITC]; ```
--- ## Q94 — Multi-Warehouse Web Orders Not Returned Counts distinct web orders that shipped from more than one warehouse and were never returned, using `EXISTS`/`NOT EXISTS` correlated subqueries. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1999–2002 | | `MONTH` | integer | 2–5 | | `STATE` | list | US state codes |
TPC-DS Reference SQL (query94.tpl) ```sql [_LIMITA] select [_LIMITB] count(distinct ws_order_number) as "order count", sum(ws_ext_ship_cost) as "total shipping cost", sum(ws_net_profit) as "total net profit" from web_sales ws1, date_dim, customer_address, web_site where d_date between '[YEAR]-[MONTH]-01' and (cast('[YEAR]-[MONTH]-01' as date) + 60 days) and ws1.ws_ship_date_sk = d_date_sk and ws1.ws_ship_addr_sk = ca_address_sk and ca_state = '[STATE]' and ws1.ws_web_site_sk = web_site_sk and web_company_name = 'pri' and exists (select * from web_sales ws2 where ws1.ws_order_number = ws2.ws_order_number and ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk) and not exists (select * from web_returns wr1 where ws1.ws_order_number = wr1.wr_order_number) order by count(distinct ws_order_number) [_LIMITC]; ```
--- ## Q95 — Multi-Warehouse Web Orders With Returns (CTE Variant) Same as Q94 but with returns included, using a CTE to find multi-warehouse order numbers and then joining back to filter with those order numbers. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1999–2002 | | `MONTH` | integer | 2–5 | | `STATE` | list | US state codes |
TPC-DS Reference SQL (query95.tpl) ```sql with ws_wh as ( select ws1.ws_order_number, ws1.ws_warehouse_sk wh1, ws2.ws_warehouse_sk wh2 from web_sales ws1, web_sales ws2 where ws1.ws_order_number = ws2.ws_order_number and ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk) [_LIMITA] select [_LIMITB] count(distinct ws_order_number) as "order count", sum(ws_ext_ship_cost) as "total shipping cost", sum(ws_net_profit) as "total net profit" from web_sales ws1, date_dim, customer_address, web_site where d_date between '[YEAR]-[MONTH]-01' and (cast('[YEAR]-[MONTH]-01' as date) + 60 days) and ws1.ws_ship_date_sk = d_date_sk and ws1.ws_ship_addr_sk = ca_address_sk and ca_state = '[STATE]' and ws1.ws_web_site_sk = web_site_sk and web_company_name = 'pri' and ws1.ws_order_number in (select ws_order_number from ws_wh) and ws1.ws_order_number in (select wr_order_number from web_returns, ws_wh where wr_order_number = ws_wh.ws_order_number) order by count(distinct ws_order_number) [_LIMITC]; ```
--- ## Q96 — Store Sales Count in Late Evening Hour Counts store sales in a specific late-hour 30-minute window (minute ≥ 30) for customers with a given dependent count, in the store named 'ese'. | Parameter | Type | Range / Values | |-----------|------|---------------| | `HOUR` | list | 20, 15, 16, 8 | | `DEPCNT` | integer | 0–9 |
TPC-DS Reference SQL (query96.tpl) ```sql [_LIMITA] select [_LIMITB] count(*) from store_sales, household_demographics, time_dim, store where ss_sold_time_sk = time_dim.t_time_sk and ss_hdemo_sk = household_demographics.hd_demo_sk and ss_store_sk = s_store_sk and time_dim.t_hour = [HOUR] and time_dim.t_minute >= 30 and household_demographics.hd_dep_count = [DEPCNT] and store.s_store_name = 'ese' order by count(*) [_LIMITC]; ```
--- ## Q97 — Store vs. Catalog Overlap via Full Outer Join Counts customers/items that appear in only store sales, only catalog sales, or both, in a 12-month period using a `FULL OUTER JOIN` of two aggregated CTEs. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DMS` | integer | 1176–1224 |
TPC-DS Reference SQL (query97.tpl) ```sql with ssci as ( select ss_customer_sk customer_sk, ss_item_sk item_sk from store_sales, date_dim where ss_sold_date_sk = d_date_sk and d_month_seq between [DMS] and [DMS]+11 group by ss_customer_sk, ss_item_sk), csci as ( select cs_bill_customer_sk customer_sk, cs_item_sk item_sk from catalog_sales, date_dim where cs_sold_date_sk = d_date_sk and d_month_seq between [DMS] and [DMS]+11 group by cs_bill_customer_sk, cs_item_sk) [_LIMITA] select [_LIMITB] sum(case when ssci.customer_sk is not null and csci.customer_sk is null then 1 else 0 end) store_only, sum(case when ssci.customer_sk is null and csci.customer_sk is not null then 1 else 0 end) catalog_only, sum(case when ssci.customer_sk is not null and csci.customer_sk is not null then 1 else 0 end) store_and_catalog from ssci full outer join csci on (ssci.customer_sk = csci.customer_sk and ssci.item_sk = csci.item_sk) [_LIMITC]; ```
MySQL / MariaDB dialect MySQL and MariaDB do not support `FULL OUTER JOIN`. The bexhoma implementation rewrites this as a `UNION ALL` of left and right joins: ```sql select sum(case when ssci.customer_sk is not null and csci.customer_sk is null then 1 else 0 end) store_only, sum(case when ssci.customer_sk is null and csci.customer_sk is not null then 1 else 0 end) catalog_only, sum(case when ssci.customer_sk is not null and csci.customer_sk is not null then 1 else 0 end) store_and_catalog from ( select ssci.customer_sk, ssci.item_sk, csci.customer_sk as c_customer_sk, csci.item_sk as c_item_sk from ssci left join csci on (ssci.customer_sk = csci.customer_sk and ssci.item_sk = csci.item_sk) union all select ssci.customer_sk, ssci.item_sk, csci.customer_sk, csci.item_sk from csci left join ssci on (ssci.customer_sk = csci.customer_sk and ssci.item_sk = csci.item_sk) where ssci.customer_sk is null) combined; ```
--- ## Q98 — Item Revenue Share Within Class Reports each item's revenue and its percentage share of the class-level total, using a window function, for 3 item categories in a 30-day window. | Parameter | Type | Range / Values | |-----------|------|---------------| | `YEAR` | integer | 1998–2002 | | `SDATE` | date | derived: Jan 1 – Jul 1 of `{YEAR}` | | `CATEGORY` | list (×3) | 3 draws from categories |
TPC-DS Reference SQL (query98.tpl) ```sql select i_item_id, i_item_desc, i_category, i_class, i_current_price, sum(ss_ext_sales_price) as itemrevenue, sum(ss_ext_sales_price)*100/sum(sum(ss_ext_sales_price)) over (partition by i_class) as revenueratio from store_sales, item, date_dim where ss_item_sk = i_item_sk and i_category in ('[CATEGORY.1]','[CATEGORY.2]','[CATEGORY.3]') and ss_sold_date_sk = d_date_sk and d_date between cast('[SDATE]' as date) and (cast('[SDATE]' as date) + 30 days) group by i_item_id, i_item_desc, i_category, i_class, i_current_price order by i_category, i_class, i_item_id, i_item_desc, revenueratio; ```
--- ## Q99 — Catalog Shipping Latency Pivot by Warehouse, Mode, and Call Center Pivots catalog order shipping latency into 5 buckets (≤30 days, 31–60, 61–90, 91–120, >120) per warehouse × ship mode × call center combination. | Parameter | Type | Range / Values | |-----------|------|---------------| | `DMS` | integer | 1176–1224 |
TPC-DS Reference SQL (query99.tpl) ```sql [_LIMITA] select [_LIMITB] substr(w_warehouse_name,1,20), sm_type, cc_name, sum(case when (cs_ship_date_sk - cs_sold_date_sk <= 30) then 1 else 0 end) as "30 days", sum(case when (cs_ship_date_sk - cs_sold_date_sk > 30) and (cs_ship_date_sk - cs_sold_date_sk <= 60) then 1 else 0 end) as "31-60 days", sum(case when (cs_ship_date_sk - cs_sold_date_sk > 60) and (cs_ship_date_sk - cs_sold_date_sk <= 90) then 1 else 0 end) as "61-90 days", sum(case when (cs_ship_date_sk - cs_sold_date_sk > 90) and (cs_ship_date_sk - cs_sold_date_sk <= 120) then 1 else 0 end) as "91-120 days", sum(case when (cs_ship_date_sk - cs_sold_date_sk > 120) then 1 else 0 end) as ">120 days" from catalog_sales, warehouse, ship_mode, call_center, date_dim where d_month_seq between [DMS] and [DMS]+11 and cs_ship_date_sk = d_date_sk and cs_warehouse_sk = w_warehouse_sk and cs_ship_mode_sk = sm_ship_mode_sk and cs_call_center_sk = cc_call_center_sk group by substr(w_warehouse_name,1,20), sm_type, cc_name order by substr(w_warehouse_name,1,20), sm_type, cc_name [_LIMITC]; ```
---