Your 80% Margin Died the Day You Shipped AI
By Framesta Fernando · Engineering Manager & Technical Architect · 11 min read · Published June 26, 2026
TL;DR: AI gross margin is the number your board has not repriced. Traditional SaaS runs near 80% gross margin because software COGS is mostly fixed. Inference makes COGS variable, scaling with every request, so a single AI feature can drop an 80% margin to 45%. Price the feature on its marginal cost, or it eats the business.
Software earned its premium valuation on one number: gross margin near 80%. That number is why investors paid 10x revenue for SaaS and 2x for services. It held because once you built the software, serving one more user cost almost nothing. The marginal cost of a customer was a rounding error, so revenue dropped almost straight to gross profit.
AI broke that quietly, in a single release. The moment your product makes an inference call per user action, the marginal cost of a customer stops being a rounding error and becomes a real, recurring, per-request line. The 80% margin did not erode over years. It fell the day the feature shipped, and most teams measured the engagement lift without measuring the margin it cost.
The assumption baked into every SaaS model
Every SaaS financial model assumes COGS is fixed. You spend to build the product and to keep the lights on, and those costs barely move whether you serve a thousand users or a million. That is the entire reason software margins are high and predictable, and it is the assumption behind every board deck, every valuation multiple, every "we'll grow into profitability" plan.
Inference violates that assumption at the root. An LLM call is not a fixed cost amortized across users. It is a variable cost incurred per use, like raw materials in manufacturing. By 2026, inference is a genuine variable cost of goods sold, and seat-based or flat-rate models can destroy gross margin when the AI does the work (per DigitalApplied's usage-based pricing analysis). The software industry spent two decades optimizing around fixed COGS, and AI reintroduced the variable cost that software was supposed to have escaped.
What one feature does to the margin
The arithmetic is brutal and worth doing explicitly. The following is illustrative with round numbers. Take a healthy SaaS seat at $100 per user per month, with $20 of fixed hosting and support cost per user. That is an 80% gross margin, textbook software economics.
Before AI:
revenue $100 - fixed COGS $20 = $80 gross profit
margin = 80 / 100 = 80%Now ship an AI feature that makes, say, 1,500 inference calls per user per month at an average blended cost of, illustratively, $35 in tokens per user per month. You did not raise the price, because the feature was sold as "included."
After AI:
revenue $100 - fixed $20 - inference $35 = $45 gross profit
margin = 45 / 100 = 45%One feature, no price change, and the margin fell from 80% to 45%. Worse, inference is your only line that grows with engagement, so your most active and "successful" users are your least profitable, and a power user can go margin-negative entirely. That inversion, where success costs money, is the same dynamic I traced in why AI cost explodes after scale.
The post-mortem version: a team shipped an AI assistant as a free add-on to drive retention. Retention improved, the launch was celebrated, and two quarters later finance found blended gross margin had fallen, illustratively, from 78% to 51%, with the heaviest-using cohort underwater. The feature worked exactly as designed. The pricing did not exist.
The line that moved from fixed to variable
The reframe that matters: in an AI product, COGS is no longer a fixed cost you amortize, it is a marginal cost you incur per action, and every part of the business that assumed otherwise is now mispriced. Pricing assumed it. Sales comp assumed it. The valuation multiple assumed it. All of them were built on a cost structure the AI feature silently replaced.
This is why "the model will get cheaper" is a dangerous comfort. Even if the per-token rate falls, inference remains a variable cost that scales with usage, so it behaves like COGS in a hardware business, not like the fixed R&D of classic software. A factory does not become a software company because steel got cheaper. Cheaper steel still scales with every unit shipped, and so does inference. The strategic consequence is that AI features have to be priced like consumption, not like access, which is the same conclusion I reached from the pricing side in per-seat pricing is dead.
The teams that survive treat inference COGS as a first-class number with an owner, the same way a manufacturer treats unit cost. The teams that do not keep reporting engagement lift to the board while the margin quietly compresses underneath it.
A framework for pricing the feature
Decide how an AI feature pays for its own marginal cost before you ship it, not after:
| Model | What it protects | When it breaks |
|---|---|---|
| Included in flat price | Simplicity, adoption | Heavy users go margin-negative |
| Usage credits / metered | Margin tracks cost | Friction can dampen adoption |
| Higher AI tier | Segments cost to willing payers | Splits your product surface |
| Bring-your-own-key | Removes COGS entirely | Loses usage data and markup |
The cleanest guardrail is to compute the margin per user with the AI feature on, and watch the heavy-use tail, not the average:
const grossMargin = (
revenue: number, // $/user/month
fixedCogs: number, // $/user/month
callsPerUser: number, // inference calls/user/month
costPerCall: number, // blended $/call
) => {
const inference = callsPerUser * costPerCall;
return (revenue - fixedCogs - inference) / revenue;
};
// The average user looks fine. Run it for your p95 power user,
// where callsPerUser is 5-10x. That is where the margin goes negative.Usage-based and hybrid models exist precisely to keep this line from going negative, which is why hybrid dominates AI-era pricing adoption, a tradeoff I covered in SaaS pricing models, subscription vs usage-based. The unit-economics discipline is the same one behind LTV vs CAC for SaaS: a customer whose marginal cost rises with engagement changes the lifetime-value math entirely.
Decision guidance
The trap is shipping an AI feature on the old fixed-cost mental model and discovering the margin damage at the quarterly review. Inference is COGS, and COGS has to be priced into the product, not absorbed by it.
The rule: If an AI feature adds variable inference cost per user action, then it must carry a pricing mechanism that scales with that usage before launch, or you cap or meter it, because an unpriced variable cost compounds with your own success.
Model the margin at the p95 power user, not the average, before you ship. If the heaviest cohort goes margin-negative, you have built a feature that loses more money the better it performs. Either meter it, gate it behind a higher tier, or cap the usage, but never ship unbounded inference inside a flat price and call it a growth feature.
Software's old advantage, on loan
The 80% margin was never a law of software. It was a consequence of fixed COGS, and AI quietly handed that advantage back. The companies that keep their margins are not the ones with the cheapest model. They are the ones who noticed COGS went variable and repriced before the board did the math for them.
Engagement lift with margin compression underneath it is not a win. It is a slower loss with better dashboards.
FAQ
Q: Why does adding AI features lower SaaS gross margin?
Because inference is a variable cost incurred per request, unlike traditional software COGS which is mostly fixed. Serving one more user used to cost almost nothing, keeping margins near 80%. An AI feature adds a recurring per-action token cost, so margin now falls with usage. A single feature can drop margin from 80% to 45% without any price change, since the cost scales with engagement.
Q: How do I calculate gross margin with AI inference costs?
Subtract both fixed COGS and variable inference cost from revenue, then divide by revenue. Inference cost equals calls per user times blended cost per call. Critically, compute it for your p95 power user, not the average, because heavy users make far more calls and are where margin goes negative. The average can look healthy while the heaviest cohort is unprofitable.
Q: Will cheaper models fix the margin problem?
Not fundamentally. A lower per-token rate reduces the variable cost but does not make it fixed. Inference still scales with every user action, so it behaves like unit cost in a hardware business rather than the fixed R&D of classic software. Cheaper tokens also tend to invite more usage, which can offset the savings. Pricing for the variable cost matters more than chasing a cheaper model.
Q: Should AI features be priced separately?
Usually, yes, in some form. Options include metered usage credits, a higher AI tier, or bring-your-own-key. The goal is a pricing mechanism that scales with the variable inference cost so heavy users do not go margin-negative. Bundling AI into a flat price works only if you cap or gate usage, otherwise your most engaged customers become your least profitable ones.
Q: What gross margin should an AI SaaS target?
There is no single number, but the discipline is to know your margin at the heavy-use tail and ensure it stays positive after inference. Many AI-heavy products run materially below the classic 80% software benchmark because inference is real COGS. The danger is not a lower margin per se, it is an unmeasured one that compresses with adoption while the team reports engagement growth.
Q: How is inference cost different from normal hosting cost?
Hosting is largely fixed: servers and bandwidth that support many users at once and barely move with one more user. Inference is variable: each user action triggers a billable model call, so cost rises in lockstep with usage. That shift from fixed to variable is what reintroduces hardware-style unit economics into software and breaks the fixed-COGS assumption SaaS margins were built on.
Next Read
The pricing-side conclusion of the same shift is that access-based billing no longer fits an AI product: see per-seat pricing is dead.
Sources & Further Reading
- DigitalApplied: SaaS Usage-Based Pricing Models Decision Matrix 2026
- BayTech: How AI Is Reshaping Software Markets in 2026
- CloudZero: LLM API Pricing Comparison 2026
Last updated: June 26, 2026