Per-Customer Margins using Stripe and Kubernetes

by Akash Khanolkar
Margin analytics

[The picture above depicts the results of a real-time margin analysis experiment, overlaying AWS infrastructure costs and Stripe revenue]

A simple (naive?) approach to a complicated problem!

Let's set the scene. I'm a subscription-based SaaS company running expensive cloud infrastructure (on Kubernetes) that is shared across all of my customers.

Today, I can approximate my total margin using monthly revenue from Stripe and my infrastructure bill from AWS. However, I have no insights into how each customer contributes to margin. Given margin per customer, I could answer several valuable questions:

  • Who are my good (high-margin) customers and who are my bad (low/negative-margin) customers?
  • How are my margins changing as my business grows?
  • How are changes to my products affecting my profits?

The solution here can be complicated.

Assuming that my cost is primarily cloud infrastructure costs, to get cost per customer*, first I need to set up monitoring layers to properly measure customer usage. Then, I need to enrich the usage data with cloud prices. To complicate matters further, I need to consolidate and store this data long-term to maintain historical views of my business across my distributed infrastructure. Finally, I need to join this data with revenue per customer**.

Fortunately, my customers are segmented by jobs (i.e., pods, namespaces, and(or) clusters). In this case, cost per customer is approximately the cost of a customer's namespace. I can track cost per namespace using the Octane Cost API and join my revenue per customer using the Stripe API. Through these integrations, I can extract real-time margin per customer.

[Important to note that we understand that many SaaS companies will not have customers segmented by namespaces. For purposes of blog post we have chosen to simplify the problem]

Here are main components and steps to complete:

cost per customer:

  1. (pre-requisite) set up Octane and connect cluster(s) to measure infrastructure costs
  2. set up Octane accounts to track namespaces (I've chosen to segment customers by namespace for this example)
  3. fetch costs for the customer accounts

revenue per customer:

  1. (pre-requisite) set up Stripe to track customer subscriptions
  2. fetch Stripe subscriptions and compute the expected charge for a given time range

margin per customer: 

  1. formula: (revenue per customer - cost per customer) / revenue per customer

Getting Started

I did some pre-requisite steps to get the data I needed to get real-time cost per customer (Octane) and revenue per customer (Stripe).

Set up Octane

  • Register / Login
  • Add Kubernetes cluster

Set up Stripe

  • Register / Login
  • Create customers
  • Add subscriptions for customers

Cost per customer

Now, that our tools are setup, I need to start leveraging them to get real-time margin. First let's get cost per customer data. I begin by setting up a mapping from the Stripe customer list to the Kubernetes customer namespaces.

# Import python libraries
import pandas as pd
import pprint
import requests
HOURS_TO_SECONDS = 60.0 * 60.0
# Octane API URL
OCTANE_URL = "https://hasura.cloud.getoctane.io/v1/graphql"
# Set up time range for computing cost and revenue
LOOKBACK_HOURS = 4
START_TIME = pd.Timestamp.utcnow() - pd.Timedelta(hours=LOOKBACK_HOURS)
# Set up mapping from customer to corresponding namespace in Kubernetes clusters
CUSTOMERS = {
    "Slow Sally": {
        "namespace": "slow-sally"
    },
    "Stingy Sid": {
        "namespace": "stingy-sid"
    },
    "Mike SpendALot": {
        "namespace": "mike-spendalot"
    },
    "Jamie Jay": {
        "namespace": "jamie-jay"
    },
    "John Du": {
        "namespace": "john-du"
    }
}

Next, I need to send a login request to the Octane API to fetch an access token for further requests.

login_mutation = """
mutation Login{
    login(args: {username: "EMAIL_ADDRESS", password: "PASSWORD"}){
        accessToken
    }
}
"""
res = requests.post(OCTANE_URL, json={"query": login_mutation})
if res.status_code != 200:
    raise Exception('Failed login request')
access_token = res.json().get('data', {}).get('login', {}).get('accessToken', "")
auth_headers = {"Authorization": f"Bearer {access_token}"}

Once I am authenticated, I need to create an Octane grouping (account) for each customer, specifying the corresponding namespace.

# addAccount creates a logical grouping of costs in Octane
add_account_mutation = """
mutation AddAccount{
    addAccount(args: {accountName: "%s", aggregationType: "namespace", aggregationRegex: "%s"}){
        id
    }
}
"""
# Loops through CUSTOMERS to create groupings in Octane
for name, details in CUSTOMERS.items():
    add_account_request = {
        "query": add_account_mutation % (name, details["namespace"])
    }
    res = requests.post(OCTANE_URL, json=add_account_request, headers=auth_headers)
    if res.status_code != 200:
        raise Exception('Failed add account request')

Now that Octane knows how to group real-time cloud costs by customer, I am ready to query for those costs.

account_cost_query = """
query CostQuery {
    pod_cost_aggregate(where: {end_time: {_gte: "%s"}, pod: {account_pods: {account: {name: {_eq: "%s" } } } } }) {
        aggregate {
            sum {
                cost
            }
        }
    }
}
"""
for name in CUSTOMERS:
    get_costs_request = {
        "query": account_cost_query % (START_TIME, name)
    }
    res = requests.post(OCTANE_URL, json=get_costs_request, headers=auth_headers)
    if res.status_code != 200:
        raise Exception('Failed getting account costs')
    cost = res.json().get('data', {}).get('pod_cost_aggregate', {}).get('aggregate', {}).get('sum', {}).get('cost')
    CUSTOMERS[name]['cost'] = cost
print(pd.DataFrame(CUSTOMERS))

Great! Now, I have a side by side comparison of customers (namespaces) and cost (spend).

[The graph above was pulled from Octane UI]

As you can see in the tables and graphs above, I have real-time visibility into our most expensive customers (Mike SpendALot) and cost-efficient customers (stingy-sid, slow-sally).

Revenue per customer

Now that I have cost per customer, I need to overlay that data with revenues from Stripe to get real-time margins per customer.

First, I start by getting customer subscriptions.

import stripe
stripe.api_key = 'STRIPE_API_KEY'
customers = { customer.id: customer for customer in stripe.Customer.list().data}
subscriptions = { subscription.customer: subscription for subscription in stripe.Subscription.list().data}

Now I need to convert subscriptions to revenue. Revenue is extrapolated from the unit price of a customer subscription. [Note that revenue structure may be more complex when using other Stripe features (e.g. metered billing, discounts, trials, etc.)].

for customer in customers:
    subscription = subscriptions.get(customer)
    if subscription:
        customer_name = customers[customer].name
        customer_subscription_price = subscription.get('items', {}).get('data')[0].price
        interval_seconds = pd.Timedelta(f"{customer_subscription_price.recurring.interval_count} {customer_subscription_price.recurring.interval}").total_seconds()
        price_per_hour = (customer_subscription_price.unit_amount / 100.0) / (interval_seconds / HOURS_TO_SECONDS)
        CUSTOMERS[customer_name]['revenue'] = price_per_hour * LOOKBACK_HOURS
print(pd.DataFrame(CUSTOMERS))

Excellent, now I have customer, cost, and revenue grouped together. All that is left to do is compute margins.

for name, details in CUSTOMERS.items():
    CUSTOMERS[name]['pct_margin'] = 100 * (details['revenue'] - details['cost']) / details['revenue']
print(pd.DataFrame(CUSTOMERS))
Margin analytics

Viola! I have calculated real-time margins per customer. I can quickly see that Mike SpendAlot is bring me the lowest percent margins.

Conclusion

Calculating customer cost for SaaS companies is complex. By leveraging Octane and Stripe, I can capture, segment, and visualize customer costs in real-time to answer valuable business questions. The example above enabled us to differentiate between my low and high margin customers (Stingy Sid vs Mike SpendALot). It allowed us to do it real-time and on an ongoing basis, which is especially useful as my products, customers and business evolve.

I hope to spark discussion around the potential of leveraging infrastructure to inform business performance. I've made several assumptions and simplifications around the complexities of margin as there is no cookie-cutter solution.

If you have any thoughts or questions we would love to hear from you - email me at akash@getoctane.io.

*I define cost per customer as the recurring infrastructure costs attributed to a customer's usage. Please note that infrastructure may only be a part of recurring costs. Cost of revenue can include other significant components not covered here (e.g. DevOps, customer success, support).

**In this walk-through, there is a single revenue stream. However, SaaS companies may have multiple revenue streams including professional services. It is important to compute margin separately for different revenue streams.