The Case for Building Cheap

Groq just raised $650M. SpaceX is charging $150M/month for compute. Our internal agents run for $40 a day.

Jun 24, 2026

This week three stories landed within 24 hours of each other, and read together they make one argument.

Groq licensed its LPU chip technology to Nvidia for $20 billion, watched its founding team depart, then raised $650M to reinvent itself as an AI inference cloud provider. SpaceX signed a $6.3 billion compute deal with open-source AI startup Reflection AI: $150 million per month for access to Nvidia GB300 chips at the Colossus 2 data center. And Zhipu AI shipped GLM-5.2, a fully open-source model with a 1M token context window, an MIT license, and SWE-bench Pro scores (62.1%) that put it in the same conversation as closed frontier models.

These are infrastructure stories, not application stories. The people selling the shovels are doing very well. The question for any enterprise building on top of this stack is: how much of that volatility do you actually need to inherit?

We’ve been building agents for ourselves at TribalScale for the past year. Not demos. Production systems.

Four agents: Biscuit (people and culture), Gravy (delivery operations), a Proposal Agent, and personal agents for individual employees. Four months from start to production. The platform runs for $20-40 USD per day across all active workflows. Twenty employees use personal agents for timesheets, scheduling, HR requests, and productivity work. Case study creation went from days to approximately 15 minutes.

The architecture is deliberately simple: AWS Lambda, Claude Sonnet, a shared agent runtime, and a governed tool registry where each agent gets only the tools it actually needs for its specific task. No proprietary inference layer. No exotic compute. Each agent can only do what it’s supposed to do, which is also how you earn trust with the people using it.
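To make the tool registry idea concrete, here is a minimal sketch of deny-by-default tool access. It is an illustration only, not our production code: the ToolRegistry class, the ToolNotPermitted exception, and the two stubbed tools (submit_timesheet, update_project_status) are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet


class ToolNotPermitted(Exception):
    """Raised when an agent calls a tool that is not on its allowlist."""


@dataclass
class ToolRegistry:
    # Every tool the platform knows about, keyed by name.
    tools: Dict[str, Callable[..., object]] = field(default_factory=dict)
    # Per-agent allowlists: agent name -> names of tools it may call.
    grants: Dict[str, FrozenSet[str]] = field(default_factory=dict)

    def register(self, name: str, fn: Callable[..., object]) -> None:
        self.tools[name] = fn

    def grant(self, agent: str, *tool_names: str) -> None:
        # Refuse to grant tools that were never registered.
        unknown = set(tool_names) - set(self.tools)
        if unknown:
            raise KeyError(f"unregistered tools: {sorted(unknown)}")
        self.grants[agent] = frozenset(tool_names)

    def call(self, agent: str, tool_name: str, **kwargs: object) -> object:
        # Deny by default: an agent with no grants can call nothing.
        if tool_name not in self.grants.get(agent, frozenset()):
            raise ToolNotPermitted(f"{agent} may not call {tool_name}")
        return self.tools[tool_name](**kwargs)


# Hypothetical tools, stubbed so the sketch runs without external services.
def submit_timesheet(employee: str, hours: float) -> str:
    return f"timesheet recorded: {employee}, {hours}h"


def update_project_status(project: str, status: str) -> str:
    return f"{project} -> {status}"


registry = ToolRegistry()
registry.register("submit_timesheet", submit_timesheet)
registry.register("update_project_status", update_project_status)
registry.grant("biscuit", "submit_timesheet")      # people and culture
registry.grant("gravy", "update_project_status")   # delivery operations

print(registry.call("biscuit", "submit_timesheet", employee="sam", hours=7.5))
```

The point of the design is that permissions live in one place. An agent asking for a tool outside its grant fails loudly instead of quietly succeeding, which is the behavior you want when someone asks what an agent is allowed to touch.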

The lesson isn’t that expensive infrastructure is wrong. It’s that most organizations don’t need it yet. In early adoption, the constraint isn’t compute or model capability. It’s clarity: what does the agent actually do, what tools does it need, who reviews its outputs, and what happens when it’s wrong. Simple, governed architectures answer those questions better than complex ones. And they’re a lot easier to debug at 11pm.

The GLM-5.2 story matters here too. A frontier-adjacent open-source model with an MIT license means enterprises under regulatory or data residency constraints now have a viable option they can run in their own environment. You don’t have to bet everything on one closed model provider. You can build for swap from the start.

The conversation we keep having with clients isn’t about which model to pick. It’s about the foundation: shared runtime, least-privilege tool access, transparent logging, human review at the right checkpoints. Get that right and the model underneath becomes a variable. Get it wrong and every model upgrade is a project.
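Here is one way "the model becomes a variable" can look in code. This is a sketch under assumptions, not a description of our runtime: TextModel, the two stub backends, and run_step are hypothetical names, and the stubs return canned strings instead of calling any real provider.

```python
from typing import Dict, List, Protocol


class TextModel(Protocol):
    """The only surface the runtime depends on. Any backend that has a
    name and a complete() method can be swapped in."""

    name: str

    def complete(self, prompt: str) -> str: ...


class HostedModelStub:
    # Stand-in for a closed, hosted model. No network call is made.
    name = "hosted-stub"

    def complete(self, prompt: str) -> str:
        return f"[hosted] {prompt}"


class SelfHostedModelStub:
    # Stand-in for an open-weights model running in your own environment.
    name = "self-hosted-stub"

    def complete(self, prompt: str) -> str:
        return f"[self-hosted] {prompt}"


def run_step(
    model: TextModel,
    prompt: str,
    audit_log: List[Dict[str, str]],
    needs_review: bool,
) -> Dict[str, str]:
    """Run one agent step, log it, and mark whether a human must review it."""
    output = model.complete(prompt)
    record = {
        "model": model.name,
        "prompt": prompt,
        "output": output,
        "status": "pending_review" if needs_review else "auto_approved",
    }
    audit_log.append(record)  # transparent logging: every step is recorded
    return record


audit_log: List[Dict[str, str]] = []
for model in (HostedModelStub(), SelfHostedModelStub()):
    run_step(model, "Draft a case study outline", audit_log, needs_review=True)

for entry in audit_log:
    print(entry["model"], entry["status"])
```

Logging and review sit in run_step, not in either backend, so changing the model leaves the governance path untouched.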

We wrote up what we built and how: Enterprise Agentic Platform case study.

What I’ve been reading

Groq Raises $650M to Become a Neocloud After Nvidia’s $20B Deal — TechCrunch, June 22

Six months after licensing its inference technology to Nvidia for $20B (and losing its founding team in the process), Groq has raised $650M to pivot from chip startup to AI inference cloud. The bet: enterprises want fast, affordable inference capacity outside the big three hyperscalers. The neocloud category is real and growing. Whether Groq can execute with a largely new team is the open question.

SpaceX Signs $6.3B Compute Deal With Open-Source AI Startup Reflection — CNBC, June 22

$150M per month. That’s what Reflection AI is paying SpaceX for Nvidia GB300 chips at Colossus 2, the same facility that hosts Anthropic, Google, and Cursor compute. Reflection has not yet released a public model and is building for government and national security customers. The commercial compute platform story here is SpaceX’s: Colossus is quickly becoming infrastructure for the entire frontier AI ecosystem.

GLM-5.2 Is the Step Change for Open Agents — interconnects.ai, June 2026

Zhipu AI’s GLM-5.2 hits 62.1% on SWE-bench Pro and 81.0% on Terminal-Bench 2.1, trades blows with select frontier models on sustained planning tasks, ships with MIT weights, and supports up to 1M token context. For regulated industries where “the model runs in our environment” is a requirement, not a preference, this is a meaningful shift. The open-source option just became a serious one.

The BFSI signal worth paying attention to

The Cambridge Centre for Alternative Finance published its 2026 Global AI in Financial Services Report this month. A few numbers stand out.

81% of financial institutions now use AI in some form. 70% of buy-side firms are using it in the front office. Agentic AI adoption has already reached 52% across the industry. But fintechs are three times more likely than traditional banks to have reached advanced adoption stages: 19% versus 6%.

The gap isn’t capability or investment. It’s org structure. Most large financial institutions have AI/ML headcount and board-approved strategies. What they frequently lack is the product studio model: product, engineering, and design working together from day one, not in sequence. The bottleneck is how teams are structured, not how much they’ve budgeted.

For deeper thinking on what AI adoption actually looks like inside financial institutions (the architecture decisions, the governance trade-offs, the patterns that are working), FinScale is worth your time.

Hit reply with what you’re seeing. Are your clients asking about open-source models? Are you building on closed infrastructure and feeling the cost pressure? I read everything.

TribalScale
