Resolution Clarity Grade · Polymarket validation backtest
Do prediction markets resolve fairly?
A 7,155-contract Polymarket backtest.
Most prediction-market contracts on Polymarket and Kalshi resolve cleanly. Some don't. They end in a dispute, or settle against what actually happened. That is the resolution risk traders and funds care about. ClearMarket grades it before a market resolves, from the resolution rules alone, as A, B, or C. This page tests the grade against the public record: do the contracts it flags as unclear actually get disputed?
Key finding. Across 7,155 Polymarket markets, all 26 resolution disputes were on contracts ClearMarket rated C, the lowest of its three Resolution Clarity Grades (RCG). Zero A-rated and zero B-rated markets were disputed, including 627 high-volume A and B markets. Reproducible at github.com/JDSource/clearmarket.
1 · THE FINDING
Across 7,155 Polymarket markets, every market that ended up in a resolution dispute had been rated C by ClearMarket, the lowest clarity tier. Zero A-rated and zero B-rated markets were disputed.
| RCG | Resolution disputes | Markets | Dispute rate |
|---|---|---|---|
| A | 0 | 1,716 | 0.00% |
| B | 0 | 1,461 | 0.00% |
| C | 26 | 3,978 | 0.65% |
Disputes concentrate entirely in the bottom tier, the way loan defaults concentrate in the lowest credit rating. ClearMarket assigns the grade from rules text alone and never reads the dispute history, so the result is not circular.
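The rates in the table can be re-derived from the raw counts. The following is a minimal sketch using only the numbers printed above; the variable and function names are illustrative and are not taken from the ClearMarket repository.

```python
# (disputes, markets) per grade, copied from the table above.
counts = {"A": (0, 1716), "B": (0, 1461), "C": (26, 3978)}


def dispute_rate_pct(disputes: int, markets: int) -> float:
    """Dispute rate as a percentage, rounded to two decimals."""
    return round(100 * disputes / markets, 2)


rates = {grade: dispute_rate_pct(d, m) for grade, (d, m) in counts.items()}

# Sanity checks: the grades should cover the whole universe,
# and every dispute should sit in the C tier.
total_markets = sum(m for _, m in counts.values())
total_disputes = sum(d for d, _ in counts.values())
share_in_c = counts["C"][0] / total_disputes

for grade, rate in rates.items():
    print(f"{grade}: {rate:.2f}%")
print(f"markets={total_markets}, share of disputes in C={share_in_c:.0%}")
```

The three market counts sum to 7,155, which matches the headline universe.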
2 · IT ISN'T JUST THE BUSY MARKETS
The obvious objection: contentious markets draw both more money and vaguer wording, so maybe the grade is only a proxy for attention. It isn't. When the universe is split into quartiles by trading volume, A and B markets dispute at 0% in every band. Only C-rated markets are disputed, and their dispute rate rises with volume.
| Volume band | RCG A | RCG B | RCG C |
|---|---|---|---|
| Lowest quartile | 0 / 658 | 0 / 441 | 2 / 690 · 0.29% |
| Highest quartile | 0 / 310 | 0 / 317 | 21 / 1,161 · 1.81% |
(A and B are 0% across all four quartiles; the middle two are omitted here.)
A dispute needs two things: a vague rule and enough money to be worth fighting over. The money was there. The highest-volume quartile alone held 627 A- and B-rated markets, the most contentious, highest-attention contracts on the platform, and not one was disputed. Clear wording held at every volume.
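The same arithmetic applies per volume band. This sketch covers only the two quartiles shown above, since the middle two are not printed on this page; the names are illustrative.

```python
# (disputes, markets) per grade for the two quartiles shown in the table.
bands = {
    "lowest": {"A": (0, 658), "B": (0, 441), "C": (2, 690)},
    "highest": {"A": (0, 310), "B": (0, 317), "C": (21, 1161)},
}


def rate_pct(disputes: int, markets: int) -> float:
    """Dispute rate as a percentage, rounded to two decimals."""
    return round(100 * disputes / markets, 2)


# C-tier dispute rate in each printed band.
c_rates = {band: rate_pct(*grades["C"]) for band, grades in bands.items()}

# A- and B-rated markets in the highest-volume quartile, and their disputes.
top = bands["highest"]
top_ab_markets = top["A"][1] + top["B"][1]
top_ab_disputes = top["A"][0] + top["B"][0]

print(c_rates)
print(f"top-quartile A+B: {top_ab_disputes} disputes in {top_ab_markets} markets")
```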
3 · WHAT A C LOOKS LIKE
The three largest disputed markets, each rated C by ClearMarket before it resolved, then disputed exactly where its rules were weak:
| Market | Venue | RCG | Volume | The defect |
|---|---|---|---|---|
| MicroStrategy sells any Bitcoin by May 31, 2026 | Polymarket | C | $230M | No committed source of record, only "a consensus of credible reporting"; rules silent on event date vs. confirmation date |
| Netanyahu out by March 31 | Polymarket | C | $104M | Subjective trigger (what counts as "out"), plus two sources with no tiebreak: the subject's own government and "a consensus of credible reporting" |
| US × Iran permanent peace deal by April 22 | Polymarket | C | $26M | A fact controlled by an interested party |
The MicroStrategy market is the most instructive of the three. It shows why the grade is more than a score. It scored 59 on the weighted factors, which bands as a B, but its rules named no real source of record, pointing only to "a consensus of credible reporting." A hard cap held it at C. MicroStrategy (now trading as Strategy) did sell Bitcoin inside the window, between May 26 and 31, but disclosed it on June 1, one day after the deadline. With no source of record committed, Polymarket ruled the late confirmation did not qualify, and the market resolved NO against a sale that had actually happened. One trader reported a loss of $527,000. The cap caught what the score alone would have missed.
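The event-date versus confirmation-date gap can be made concrete with the dates given above. This is only an illustration of the ambiguity, not a model of how Polymarket actually decided; the `basis` parameter and function name are hypothetical.

```python
from datetime import date

# Dates as described above for the MicroStrategy market.
deadline = date(2026, 5, 31)
sale_window = (date(2026, 5, 26), date(2026, 5, 31))  # sale occurred in this range
disclosed = date(2026, 6, 1)


def resolves_yes(basis: str) -> bool:
    """Outcome under a rule keyed to the event date or the confirmation date.

    `basis` is a hypothetical switch; the market's rules did not say
    which of the two dates governs.
    """
    if basis == "event_date":
        # Even the latest possible sale date falls on or before the deadline.
        return sale_window[1] <= deadline
    if basis == "confirmation_date":
        return disclosed <= deadline
    raise ValueError(f"unknown basis: {basis!r}")


print("event date rule:", "YES" if resolves_yes("event_date") else "NO")
print("confirmation date rule:", "YES" if resolves_yes("confirmation_date") else "NO")
```

The two readings give opposite outcomes on the same facts, which is the kind of defect the temporal-precision factor is meant to surface.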
4 · HOW THE GRADE WORKS
The grade is a weighted score across seven factors of the resolution rules text, banded A/B/C, then capped. It never reads the outcome. The methodology page carries the full specification: the scoring rules, how each factor is judged, and the complete cap list. In brief:
| Factor | Weight | What it asks |
|---|---|---|
| Trigger objectivity | 28 | Is the deciding condition objective, or open to interpretation? |
| Contested reality | 22 | Is the underlying fact controlled or disputed by an interested party? |
| Source clarity | 18 | Is the source of record named and verifiable? |
| Arbiter incentive | 12 | Is resolution handled by a regulated, accountable arbiter, or a token-holder vote that whales can sway? |
| Source conflict | 8 | Are there conflicting sources with no rule for which wins? |
| Temporal precision | 7 | Is the deadline precisely defined? |
| Source mutability | 5 | Can the source be changed or edited after the fact? |
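As a rough sketch of how a weighted score of this kind could be computed: the weights below are the ones in the table, but the per-factor rating scale (0.0 for worst, 1.0 for best) and the function name are assumptions made for illustration. The actual scoring rules live on the methodology page.

```python
# Weights from the table above; they sum to 100.
WEIGHTS = {
    "trigger_objectivity": 28,
    "contested_reality": 22,
    "source_clarity": 18,
    "arbiter_incentive": 12,
    "source_conflict": 8,
    "temporal_precision": 7,
    "source_mutability": 5,
}
assert sum(WEIGHTS.values()) == 100


def weighted_score(ratings: dict[str, float]) -> float:
    """Combine per-factor ratings into a 0-100 score.

    ASSUMPTION: each factor is rated from 0.0 (worst) to 1.0 (best).
    The real rating scale is defined in the methodology, not here.
    """
    missing = WEIGHTS.keys() - ratings.keys()
    if missing:
        raise ValueError(f"missing factors: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= ratings[name] <= 1.0:
            raise ValueError(f"{name} rating must be within [0, 1]")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


# Hypothetical market: perfect on everything except source clarity.
example = {name: 1.0 for name in WEIGHTS}
example["source_clarity"] = 0.0
print(weighted_score(example))
```

Under these assumptions, a market that fails only on source clarity loses exactly that factor's 18 points.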
The arbiter-incentive factor reflects a documented risk. A Wall Street Journal investigation (May 2026) found that in most disputed Polymarket markets a majority of the votes came from the ten largest UMA token wallets, and roughly one in five disputes had a voter who held a stake in the outcome they were judging.
Hard caps
Some defects are fatal. A market can score well on the seven factors and still be capped at a lower grade. The final grade is the worse of the score's band and any cap that applies, so a single flaw can't be averaged away:
- No source of record, or only placeholder language ("a consensus of credible reporting," no named authority) → C.
- A source given only as an illustrative example ("for example, Reuters"), never committed to → B.
- Conflicting sources with no rule for which wins → C.
- A subjective trigger with no objective anchor → C.
- An underlying fact controlled by an interested party → B.
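The "worse of the score and any cap" rule can be sketched as follows. The cap table mirrors the list above. The band thresholds are hypothetical: this page states only that a score of 59 bands as a B, so the cut-offs of 75 and 50 are placeholders chosen to be consistent with that one data point. The defect keys are likewise illustrative names.

```python
GRADE_ORDER = "ABC"  # best to worst

# Caps as listed above, keyed by illustrative defect names.
CAPS = {
    "no_source_of_record": "C",
    "illustrative_source_only": "B",
    "conflicting_sources_no_tiebreak": "C",
    "subjective_trigger_no_anchor": "C",
    "fact_controlled_by_interested_party": "B",
}


def band(score: float, a_min: float = 75, b_min: float = 50) -> str:
    """Map a 0-100 score to a grade.

    ASSUMPTION: a_min and b_min are placeholder thresholds, not the
    published ones.
    """
    if score >= a_min:
        return "A"
    if score >= b_min:
        return "B"
    return "C"


def final_grade(score: float, defects: list[str]) -> str:
    """Return the worst of the score band and every applicable cap."""
    candidates = [band(score)] + [CAPS[d] for d in defects]
    return max(candidates, key=GRADE_ORDER.index)


# The MicroStrategy market: score 59 (a B band), but no source of record.
print(final_grade(59, []))
print(final_grade(59, ["no_source_of_record"]))
```

Taking the maximum by position in `GRADE_ORDER` means a cap can only lower a grade, never raise it: a C-banded score with a B-level cap stays at C.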
5 · SCOPE & METHOD
Stated plainly, because the honesty is the point.
- "Disputed" means formally challenged through Polymarket's on-chain UMA process, not necessarily overturned.
- ClearMarket grades Kalshi markets too, and the grade applies identically to both venues. This dispute test is Polymarket-only because Polymarket's disputes are public and on-chain; Kalshi has no comparable public dispute feed, so its known settlement controversies are tracked qualitatively, not in this rate.
- Reproducible. The grading model, dispute labels, and this analysis are open at github.com/JDSource/clearmarket.