welclaiAI·TREND·DIGEST
Models

Open vs closed models: how to choose for a real project

Open weights or a hosted API? The right answer depends on control, cost, and risk — not ideology. Here is a framework that survives contact with production.

models2026-05-11 14:31 KST·Lead Editor·7 min read

The "open versus closed" debate is usually argued as if it were a question of values. In a real project it is a question of trade-offs: who controls the model, where it runs, what it costs at your volume, and what happens when something breaks at 2 a.m. Treated that way, the decision becomes tractable. This guide gives you a way to make it without joining a tribe.

First, a clarification that prevents most confusion. "Open" almost always means open-weights: you can download the model's parameters and run them yourself. It rarely means open-source in the strict sense the Open Source Initiative defines, because the training data and full training code are usually not released. "Closed" means the weights stay with the provider and you reach the model through a hosted API. Both can be excellent. Neither is automatically safer, cheaper, or more capable.

What you are actually choosing between

When you pick an open-weights model, you are taking on the model and the responsibility for running it. You get the parameters, you choose the hardware, you own the uptime, the scaling, the security patching, and the upgrade path. When you pick a closed API, you rent capability and hand the operational burden to the provider. You trade control for convenience.

That single trade — control versus convenience — sits underneath almost every other consideration. Hold it in mind as you read the rest, because most "open is better" or "closed is better" arguments are really just someone weighting control and convenience differently than you should for your project.

Control and data residency

The strongest reason to run open weights is control over where data goes and how the system behaves over time.

  • Data residency. If your inputs cannot legally or contractually leave your infrastructure, a model you host yourself removes the question entirely. Nothing is sent to a third party. For regulated domains this is sometimes the whole decision.
  • Version stability. A model you host does not change underneath you. A hosted model can be updated, deprecated, or retired on the provider's schedule, and a silent change in behavior can break a carefully tuned prompt. Self-hosting freezes the behavior until you choose to move.
  • Offline and air-gapped use. If the system must run without internet access, open weights are effectively the only option.

The cost of this control is that everything the provider used to do — scaling, patching, monitoring — is now your team's job.

Convenience, capability, and speed to ship

The strongest reasons to choose a closed API are the mirror image.

  • You ship faster. A hosted endpoint is available the moment you have a key. No GPUs to procure, no inference server to tune, no autoscaling to design. For a prototype or an early product, this alone can justify the choice.
  • The frontier tends to appear there first. The largest, most capable general-purpose models are often available as hosted services before — or instead of — being released as open weights. If your task genuinely needs the top of the capability range, the closed option may simply be the only one that clears the bar.
  • Operational maturity comes included. Rate limiting, monitoring, safety filtering, and infrastructure scaling are handled for you. You pay for them inside the per-token price, but you do not have to build them.

The cost is dependence: on the provider's pricing, availability, and roadmap, and on a network round-trip you do not control.

The cost question, honestly

Cost comparisons are where most teams fool themselves, in both directions.

A hosted API has near-zero fixed cost and a clear variable cost per unit of use. That is wonderful at low and bursty volume and can become painful at steady high volume, because you pay full price for every single call forever.

Self-hosting inverts the curve. You pay a large, mostly fixed cost — hardware or reserved capacity, plus the engineering time to run it — that is wasted when traffic is low and amortizes beautifully when traffic is high and predictable. The break-even point is real but project-specific.

A few principles that hold regardless of the numbers:

  • Idle capacity is the hidden tax of self-hosting. A GPU you rent by the hour costs the same whether it serves a thousand requests or none. If your traffic is spiky, hosted pricing often wins despite a higher headline per-call rate.
  • Engineering time is part of inference cost. The salary of the people keeping your inference stack healthy is a real line item. Teams routinely forget it and then wonder why "free" weights felt expensive.
  • Total cost depends on utilization, not list price. Do not compare a per-token API price to a per-hour GPU price directly. Convert both to cost-per-useful-request at your expected load.

Licensing is not a footnote

For open-weights models, read the license before you build, not after. The word "open" covers a wide range of terms, and some carry obligations that matter commercially.

Some open-weight releases use permissive licenses close in spirit to MIT or Apache, allowing broad commercial use. Others use custom or community licenses that restrict use above a certain scale, forbid certain applications, or attach conditions to derivative models. The Open Source Initiative maintains the canonical definition of open source if you want a yardstick to measure a license against, and model hubs such as Hugging Face surface the license alongside each model so you can check it early.

The durable rule: a model you can download is not automatically a model you are licensed to use the way you intend. Treat the license as a gating requirement, on par with capability and cost.

A decision framework

Walk these questions in order. The first one that gives a hard answer often decides the matter.

  1. Is there a hard data-residency or offline requirement? If inputs cannot leave your infrastructure, lean strongly to open weights and self-hosting.
  2. Does the task need the very top of the capability range? If yes, check whether an open-weights model actually clears the bar. If only a frontier hosted model does, that narrows your choice fast.
  3. What is your volume profile? Low or spiky favors hosted. High, steady, and predictable favors self-hosting — if you have the team to run it.
  4. Do you have the operational capacity? Running inference reliably is real engineering. If you do not have or want that capability, a hosted API is not a compromise; it is the correct choice.
  5. Does the license permit your intended use at your intended scale? A "no" here overrides every other preference.

Notice that capability is only one of five gates, and rarely the deciding one. Most real projects are decided by data rules, volume, and team capacity long before raw model quality enters the picture.

You do not have to pick just one

The framing of a single binary choice is itself a trap. Mature systems frequently mix both. A common and sensible pattern is to route the bulk of easy, high-volume requests to a cheaper self-hosted open model, and escalate only the genuinely hard cases to a more capable hosted API. Another is to prototype on a hosted API to validate the product, then migrate the steady-state workload to self-hosted weights once volume and requirements are clear. The two options are tools, not teams. Use whichever fits each part of the workload.

The takeaway

Open versus closed is a trade between control and convenience, not a contest of virtue. Open weights give you data residency, version stability, and offline operation at the price of owning the operations and the license review. Closed APIs give you speed, frontier capability, and managed infrastructure at the price of dependence and per-call cost. Decide with the five-gate framework — residency, capability bar, volume profile, operational capacity, license — and let the hard requirements speak first. And remember you can blend the two: the best architecture often routes cheap work to one and hard work to the other.

Sourcing note: license terms and the availability of specific open-weight releases change frequently, so this guide describes durable principles rather than naming current models. Always read the actual license on the model's page before you build.

#open-weights#model-selection#deployment#licensing