Question

Fill missing dates before join


Hello,

 

To provide some context, I’m looking to make an “ARPPU Cohort”, which looks like something like this ;

 

Basically, I’m grouping all the users who made a first payment on each months, and evaluate their revenue behavior over the next months. Then I’m divided the revenue value by the size (count of user), and I get an “ARPPU” value (this part is not in the previous image) “average revenue per paying user”

 

To do that, I have a payment table which looks like this :

event_date event_of_first_payment amount
2021-03-01 2021-01-01  
2021-03-02 2021-01-01  
2021-03-02 2021-01-03  
2021-03-03 2021-01-04  

It’s basically a row per payment, with the date of the event, and the associated date of the first payment for the user concerned by the event. It allows me to calculate a “time since first payment” for each payment, which is usefull to make the cohort I need.

 

And a size table which looks like this

date size
2021-01-01 100
2021-01-02 125
2021-01-03 120
2021-01-04 110

which is the users number, who made a first payment, for each date

 

And my model looks like this ;

explore: payments{
  view_name: payments
  persist_with: daily
  join: size{
    type: left_outer
    sql_on: ${payments.event_of_first_payment_date} = ${size.event_date};;
    relationship: many_to_one
  }

 

The problem is : some “Size” values are missing because the joining is made on the event_of_first_payment date and sometime there aren’t a row for every “event_of_first_payment” date on the payment table.

Like on the previous example, there is no payment with a “event of first payment” on the “2021-01-02”. So I’ll miss the associated size : “120”.

So you can see in the following image that the issue on real values, with the inconsistent sizing depend on the month (should be the same size value for one row)

 

So I’ll rather need to get something like that for the joined table;

event_date event_of_first_payment Size
2021-03-01 2021-01-01 100
2021-03-02 2021-01-01 100
null 2021-01-02 125
2021-03-02 2021-01-03 120
2021-03-03 2021-01-04 110

 

I need to fill the payment table with the missing dates (but the combinaison of the two dates ! event_of_first_payment and event_date). In this way, when the joining is done, I’ll get all the “size” values. Taking in consideration that a “event of first payment” date is always lower or equal to the “event_date”.

Do you have any ideas of a better approach to fix this issue please ?

I already tried to put a full outer join instead of a left join on my model, but it didn’t work (And I really don’t know why)

 

Thank you


0 replies

Be the first to reply!

Reply