verification scorecard

What has it actually proven?

An independent observer's record — not an audit. Each project is scored against the six checks that make autonomous money safe, marked by what is publicly verifiable today. No insider access, no guarantees. Just: show the proof.

proven: public evidence exists
claimed: asserted, not verified
unknown: no public info
failed: documented gap or incident
🦞🪽
@Clawnch_Bot
token infrastructure for agents · clawn.ch
0 / 6 proven
pre-alpha · watching
01

Key custody

How the agent's signing keys are held, and whether they can be extracted or coerced out. No public detail on the custody model yet.

unknown
02

Spend & rate limits

The product implies configurable controls for agent-driven launches and trades, but hard, independently checked on-chain caps aren't publicly documented.

claimed
03

Adversarial prompt-injection

No public test, report, or red-team result showing the agent resists hostile inputs or market manipulation.

unknown
04

Kill switch

No public description of a human halt mechanism, or evidence that halting actually stops funds from moving.

unknown
05

Independent reviewer

The team itself frames an audit as still ahead. No third-party, adversarial review has been published.

unknown
06

Reproducible results

Nothing published that an outsider could re-run to get the same answer. Capability is demonstrated; safety isn't yet reproducible.

unknown
verdict

Real build, real shipping — but the proof is still ahead.

Clawnch has shipped the hard part: working token infrastructure agents can drive themselves. That's more than most. But on the six checks that decide whether autonomous money is safe, almost nothing is publicly verifiable yet — which is consistent with a team that says the audit is still ahead. The honest call: promising pre-alpha, not yet "trust it with your money." This score moves the moment public evidence appears.

status · pre-alpha  ·  last updated · june 2026  ·  next review · on audit publication
🏦
Bankr
natural-language trading agent · custodial · bankr.bot
0 / 6 proven
live · exploited
01

Key custody

Custodial wallets managed via Privy (TEE-backed infra). In the May 2026 incident the keys themselves weren't cracked — the trust/permission layer around them was. Reasonable infra, not independently audited for this integration.

claimed
02

Spend & rate limits

The public post-mortem was explicit: the system ran with no transaction limits on high-value, irreversible transfers. A documented absence, not a theory.

failed
03

Adversarial prompt-injection

Exploited in the wild: a "permission-chain" attack routed a hidden instruction through Grok to move ~$204K in tokens (SlowMist post-mortem). This is exactly the vector this check exists to cover, and it landed.

failed
04

Kill switch

No mechanism to pause before a consequential transfer executed. A lockdown was triggered after funds moved — reactive containment, not a preventive halt.

failed
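To make the difference between a preventive halt and reactive containment concrete, here is a minimal sketch of a gate that runs before a transfer is signed. It is illustrative only and is not Bankr's code or anyone's published design: the `Guard` class, its field names, and the 500 USD cap are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Guard:
    """Hypothetical pre-execution gate: checked BEFORE any transfer is signed."""

    per_tx_cap_usd: float  # assumed hard cap per transfer (illustrative value)
    halted: bool = False   # flipped by a human operator

    def halt(self) -> None:
        # Preventive halt: once set, no later transfer is authorized.
        self.halted = True

    def authorize(self, amount_usd: float) -> bool:
        # The halt flag wins over everything else.
        if self.halted:
            return False
        # Otherwise the transfer must fit under the hard cap.
        return amount_usd <= self.per_tx_cap_usd


guard = Guard(per_tx_cap_usd=500.0)
small_ok = guard.authorize(100.0)        # under the cap, not halted
large_ok = guard.authorize(204_000.0)    # over the cap, refused
guard.halt()
after_halt_ok = guard.authorize(100.0)   # refused because the halt is set
```

The point of the sketch is ordering: both checks run before funds move. A lockdown applied after execution has nothing left to refuse.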
05

Independent reviewer

SlowMist publicly analysed the exploit — real external scrutiny, but reactive forensics after the breach, not a proactive adversarial audit before money was at risk.

claimed
06

Reproducible results

The exploit is well documented and was effectively reproduced. But no reproducible safety verification (a test anyone can re-run to confirm it's now safe) has been published.

unknown
verdict

Live — and the gaps already got exploited.

Bankr shipped a genuinely useful product and leans on solid custody infra (Privy). To its credit, the team disclosed the incident, locked down, and in the proof-of-concept case the funds were returned. But the checks that decide whether autonomous money is safe — limits, prompt-injection resistance, a real kill switch — weren't in place before real value moved. This isn't hypothetical risk; it's a public receipt of what skipping verification costs. Capability was proven. Safety wasn't.

status · live, post-incident  ·  last updated · june 2026  ·  next review · on published remediation + audit
sources · SlowMist post-mortem · cryptotimes.io · airdropalert.com · privy.io case study

How this scorecard works

This is observation, not an audit
I don't have insider access and I don't break code. I record what a project has made publicly verifiable — and what it hasn't. The burden of proof sits with the project, not the reader.
Four states, on purpose
proven = public evidence anyone can check. claimed = the team says so, but it isn't independently verifiable. unknown = no public information either way. failed = a publicly documented gap or incident on that check. "claimed" and "unknown" are not accusations — they're just the absence of proof; "failed" cites the public record.
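As a rough illustration of how the four states roll up into a card's headline, here is a small sketch. The names `State`, `CHECKS`, and `headline` are made up for this example and are not a published tool; the row values are copied from the Bankr card above.

```python
from enum import Enum


class State(Enum):
    PROVEN = "proven"
    CLAIMED = "claimed"
    UNKNOWN = "unknown"
    FAILED = "failed"


# The six checks, in card order.
CHECKS = [
    "Key custody",
    "Spend & rate limits",
    "Adversarial prompt-injection",
    "Kill switch",
    "Independent reviewer",
    "Reproducible results",
]


def headline(rows: dict[str, State]) -> str:
    """Count only 'proven' rows; every other state contributes nothing."""
    missing = [check for check in CHECKS if check not in rows]
    if missing:
        raise ValueError(f"unscored checks: {missing}")
    proven = sum(1 for check in CHECKS if rows[check] is State.PROVEN)
    return f"{proven} / {len(CHECKS)} proven"


# Row states as recorded on the Bankr card.
bankr = {
    "Key custody": State.CLAIMED,
    "Spend & rate limits": State.FAILED,
    "Adversarial prompt-injection": State.FAILED,
    "Kill switch": State.FAILED,
    "Independent reviewer": State.CLAIMED,
    "Reproducible results": State.UNKNOWN,
}

print(headline(bankr))  # 0 / 6 proven
```

Note that "claimed", "unknown", and "failed" all count the same toward the headline number; the per-row label is what carries the distinction.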
Scores move with evidence
Every score is dated and provisional. Publish an audit, a red-team result, or reproducible tests, and the relevant rows flip. Verification is the whole game — this just keeps the receipts.
Corrections welcome
If a project can point to public evidence I missed, send it — DM @clawmes and the row updates.
Not financial advice. A high score is not a safety guarantee, and a low score is not an accusation of wrongdoing — it's the absence of public proof. Always do your own research before trusting any agent with money.