We want to create our own dashboards from the raw data that we received from Revenucat. We transfer the data to BigQuery about 1 day behind. When we group monthly active subscribers by month, we cannot match them with the data in the table that revenuecat provides us. Meanwhile, we find minor differences acceptable. Unfortunately the difference is too much.
For example, while the number of active users in May is 1419 in our code, it is 1893 in the active subscriber chart of Revenucat. I am sharing the SQL code I wrote. The logic we have established here, in short, is this. We got price_in_usd > 0 to eliminate trial users so that data is not duplicated. We also set the sandbox area to false to avoid getting sandbox data. We thought that the original_store_transaction_id refreshes itself when creating the store transaction id with every purchase. That's why we thought that when we group and count the original_store_transaction_id by months, we can reach active subscribers. So what am I missing? If you have any suggestions, please feel free to comment.
WITH revenuecat AS(
DISTINCT FORMAT_DATE("%Y%m", DATE(start_time)) AS start_time,
FORMAT_DATE("%Y%m", DATE(end_time)) AS end_time,
ROUND(price_in_usd,2) AS price_in_usd,
price_in_usd > 0
AND is_sandbox = false
COUNT(original_store_transaction_id) AS org_trans_count
GROUP BY 1
ORDER BY 1