— SQL — 1 min read
without window function or CTE
I recently had to work on bid_modifier
data for our Search Engine Marketing (SEM) bidding. I have a daily summary table which consist of the bid_modifier
column.
Take the table below as an example:
1date INT, -- yyyymmdd format 2adgroup_id INT, 3bid_modifier DOUBLE, 4cost INT, 5revenue INT
I want to produce a table with adgroup_id dimension, latest bid_modifier value, and SUM of cost & revenue. There are multiple alternatives to do this, but most of them require multiple steps:
WHERE rank = 1
, and then join it with another cte that contains the sum of other columnsI was too lazy, so I tried to come up with a hack. I ended up doing this:
1SELECT 2 adgroup_id3 SUBSTR(MAX(CONCAT(date,bid_modifier)), 9, MAX(CONCAT(date,bid_modifier))+1) AS latest_bid_modifier4 SUM(cost),5 SUM(revenue)6FROM ...7GROUP BY 8 adgroup_id
This particular line
1SUBSTR(MAX(CONCAT(date,bid_modifier)), 9, MAX(CONCAT(date,bid_modifier))+1)
will sort the data based on the max date, and then it will be appended with the value of bid_modifier, and I take only the chars after the date part. Hence, I got the latest value of bid_modifier
.
The MAX(CONCAT(date,bid_modifier))+1
is used because the bid_modifier
column don't have a fixed total chars. It could be 0 (1 char), or 1.5 (3 chars), or 4 chars too. Therefore it's better to have a dynamic bound for the substring.
For each day, I want to know what was the total value from the past week, a week ago, two weeks ago, and so on. But in this case, I need the week to be generated dynamically, instead of the default week grouping that will always start and end on the same day. I want 1 week to be define as a group of 7 day, starting from the last day.
There was an existing solution, but I found it quite ugly, because I would need to define all possible cases. It's something like this:
1CASE 2 WHEN DATEDIFF(date, CURRENT_DATE()) < 7 THEN 03 WHEN DATEDIFF(date, CURRENT_DATE()) < 14 THEN 14 WHEN DATEDIFF(date, CURRENT_DATE()) < 21 THEN 25 ...6END AS num_weeks_ago
I noticed that it could be simplified into:
1DATEDIFF(date, CURRENT_DATE()) DIV 7 AS num_weeks_ago
If you need to run in on a workflow, you can replace the CURRENT_DATE()
with the date parameter, so that you're not restricted to only calculating the value from the current date.
By using DIV
, I will always get the integer value of from the date difference operation result that can be divided by 7, which is the week!