Cumulative Sums are a little tricky in MySQL.
The simplest way to tackle this would be to use the running_total measure, or utilize the running_total() function in Table Calculations.
If you need to be able to reference your running total or sum in other fields, these approaches won’t do it. In this case, you could try something like the following:
orders.user_id AS user_id
, DATE_FORMAT(orders.created_at,'%Y-%m') AS orders_created_month
, SUM(orders.amount) AS this_month_sum
FROM orders AS o
WHERE o.user_id = orders.user_id
AND o.id <= orders.id
) AS cumulative_sum
GROUP BY 1,2
ORDER BY 1,2,3
Adapted from this StackOverflow Article.
Grouping by user_id gives flexibility to join users into this to get more flexible cohorting.
Inequality joins are another option that are a bit less flexible, but can be fast.
What happens when even the inequality join is taking a long time? Our performance is completely lost when trying to execute this method. We have other options nowadays?
If you are using a SQL dialect that supports window functions, the much better option is to use a SUM() window function in a derived table. Window functions are really fast. Unfortunately MySQL doesn’t have window functions, but there is another fast method using it, see example below.
For dialects that support Window Functions (redshift, bigquery, postgres, etc)
In MySQL in a derived table, you can use variable to approximate the same thing.
A decent stack overflow article explains it:
@lloydtabb. I will look into utilizing the variable method, as we’re using mysql.