숙제
채널별 월별 매출액 테이블 만들기
Table |
Fields |
session_timestamp |
sessionId (string), ts (timestamp) |
user_session_channel |
userid (integer), sessionid (string), channel (string) |
session_transaction |
sessionid (string), refunded (boolean), amount (integer) |
channel |
channelname (string) |
- adhoc 밑에 CTAS로 본인이름을 포함한 테이블로 만들기
- session_timestamp, user_session_channel, session_transaction 사용아래와 같은 필드로 구성
- month
- channel
- uniqueUsers (총 방문 사용자)
- paidUsers (구매 사용자: refund한 경우도 판매로 고려)
- conversionRate (구매사용자 / 총 방문 사용자)
- grossRevenue (refund 포함)
- netRevenue (refund 제외)
문제 해결
DROP TABLE IF EXISTS adhoc.sanseo_session_summary;
CREATE TABLE adhoc.sanseo_session_summary AS (
SELECT
LEFT(B.ts, 7) AS month,
A.channel AS channel,
COUNT(DISTINCT A.userId) AS uniqueUsers,
COUNT(DISTINCT CASE WHEN C.refunded is NOT NULL THEN A.userId ELSE NULL END) AS paidUsers,
ROUND(CAST(paidUsers AS FLOAT)/CAST(uniqueUsers AS FLOAT), 3) AS conversionRate,
SUM(C.amount) AS grossRevenue,
SUM(CASE WHEN C.refunded is false THEN C.amount ELSE 0 END) AS netRevenue
FROM raw_data.user_session_channel A
JOIN raw_data.session_timestamp B ON A.sessionId = B.sessionId
LEFT JOIN raw_data.session_transaction C ON B.sessionId = C.sessionId
GROUP BY 1, 2
);
SELECT *
FROM adhoc.sanseo_session_summary
ORDER BY month, channel;
/*
month channel uniqueusers paidusers conversionrate grossrevenue netrevenue
2019-05 Facebook 247 14 0.057 1199 997
2019-05 Google 253 10 0.04 580 580
2019-05 Instagram 234 11 0.047 959 770
2019-05 Naver 237 11 0.046 867 844
2019-05 Organic 238 17 0.071 1846 1571
...
*/
모범 답안
실행 결과는 위의 결과와 다르지 않다. 코드 상 달라진 것은 conversionRate에 * 100을 해준 것과 NULLIF를 통해 Divide by zero 예외 처리를 추가해 주었다.
DROP TABLE IF EXISTS adhoc.sanseo_session_summary;
CREATE TABLE adhoc.sanseo_session_summary AS (
SELECT
LEFT(B.ts, 7) AS month,
A.channel AS channel,
COUNT(DISTINCT A.userId) AS uniqueUsers,
COUNT(DISTINCT CASE WHEN C.amount > 0 THEN A.userId END) AS paidUsers,
COALESCE(ROUND(paidUsers*100.0/NULLIF(uniqueUsers, 0), 2), 0.0) AS conversionRate, -- Divide by zero 방지
SUM(C.amount) AS grossRevenue,
SUM(CASE WHEN C.refunded is false THEN C.amount END) AS netRevenue
FROM raw_data.user_session_channel A
JOIN raw_data.session_timestamp B ON A.sessionId = B.sessionId
LEFT JOIN raw_data.session_transaction C ON B.sessionId = C.sessionId
GROUP BY 1, 2
);