perea.ai Research

Field reports on the agent economy.

B2A infrastructure, protocol adoption, vertical playbooks, and benchmarks from real audits. One deep paper a month. Three weekly signals.

2026-05-09T06:30

The Validated Learning Taxonomy: A Falsifiability-Forcing Schema for Pinnacle Gecko Experiments

Why 62 logged experiments can produce zero validated learning — and the multi-axis taxonomy that fixes it.

~3,200 words· Essay· 1.0
2026-05-08T01:13

State of Vertical Agents 2027: Mental Health & Therapy Operations

AI scribes, treatment-plan generators, and outcome measurement for the 485,000-clinician U.S. behavioral health workforce

~10,500 words· Published· 1.0
2026-05-08T00:09

State of Vertical Agents 2027: Property Management Operations

How agentic tenant communication, vendor compliance, and listing-sync restructure the multi-location property operator stack

~10,800 words· Scheduled· 1.0
May 8, 2026

State of Vertical Agents 2027: Marketplace Seller Operations

Amazon's $175B 3P engine, Shopify's $378B GMV, Pattern's IPO, the UCP/ACP agentic-commerce race, and the Great Compression of 1.65M U.S. sellers

~12,500 words· Scheduled· 1.0
2026-05-07T23:09

State of Vertical Agents 2027: Field Service & Specialty Trades

How agentic field-doc, subcontractor compliance, and AI-native quoting restructure the $760B specialty-trade stack

~10,500 words· Scheduled· 1.0
2026-05-07T19:08

State of Vertical Agents 2027: Sales & Revenue Operations

How agentic CRM-sync, deal-velocity agents, and AI-native forecasting restructure the post-Salesforce SaaS layer

~13,500 words· Scheduled· 1.0
2026-05-07T10:57

State of Vertical Agents 2026: Cybersecurity Operations

How agentic SOC, AI-native SIEM, and AI-driven threat hunting restructure a $200B+ market — what's primary-sourced, what's still hype, and where the capital is actually flowing

~9,500 words· Public draft· 2.0
2026-05-07T10:28

Healthcare AI Agents 2026: Incidents, HIPAA, and the Triage Problem

Nature Medicine February 2026 ChatGPT Health structured stress test (960 responses across 21 clinical domains; 52% undertriage of gold-standard emergencies; anchoring odds ratio 11.7 when family minimizes symptoms) + March 9 2026 Minnesota federal court order forcing UnitedHealth to disclose nH Predict (90% error rate; 0.2% appeal rate; 82% AI prior-auth overturn rate industry-wide) + April 22-27 2026 Senator Cantwell report on Medicare WISeR delays (4-8 weeks vs 2 weeks pre-pilot; UW Medical 15-20 day average; ~100 epidural-steroid patients waiting; 6-state pilot) — three canonical 2026 healthcare-AI failure modes and the operator playbook to avoid each

~5,500 words· Public draft· 1.0
2026-05-07T10:16

Agent Inference Unit Economics: The 300x Deflation Curve and the FinOps Discipline

GPT-4 cost $30-36 per million tokens at March 2023 launch — Gemini 2.5 Flash-Lite hit the market floor at $0.10 input / $0.40 output per million tokens by April 2026 (300-360x deflation in 37 months); IDC FutureScape 2026 projects 1B+ AI agents executing 217B+ actions per day and consuming 3.7 TeraTokens daily by 2029, with $68B+ annual delivery cost despite 87% per-action cost decline; G2000 agent use 10x + token+API call loads 1000x by 2027 — and the FinOps discipline founders must build (KV cache aware routing + LMCACHE 15x throughput + NVFP4 quantization 50% memory reduction + speculative decoding 50% latency cut + continuous batching) to survive the 3-5x aggregate-spend-up-despite-per-token-down-10x paradox

~5,400 words· Public draft· 1.0
2026-05-07T10:06

RPA to AI Agents: The Enterprise Migration Playbook

UiPath's Dexcom 200,000-hour target + Nexus's Orange 4-week / 50%-conversion / $6M-lifetime-value deployment + the Microsoft Power Automate vs UiPath vs Blue Prism vs Automation Anywhere mindshare shift (Blue Prism dropped from 20.2% to 15.9% in 2026) + 8,563 UiPath enterprise customers including 8 of 10 Fortune 500 firms — and the 2026 enterprise migration playbook that treats RPA + AI agents as fusion (AI handles reasoning, RPA handles deterministic execution), prioritizes 70%-cost-reduction finance + procurement workflows + 80%-onboarding-cycle HR deployments + 4-7x sales conversion improvements, and delivers 171%-average-ROI in production within 12-18 months

~5,500 words· Public draft· 1.0
2026-05-07T09:56

Agentic AI in Banking: SR 11-7's Limits and the New Risk Manual

Federal Reserve SR 26-2 + OCC Bulletin 2026-13 + FDIC interagency revision (April 17, 2026) supersede SR 11-7 — but Footnote 3 explicitly carves out generative AI and agentic AI as 'novel and rapidly evolving / not within the scope of this guidance'; Deloitte's March 2026 MIT AI Risk Database analysis surfaces 350+ autonomous-agent risks; GARP February 2026 documents the dynamic validation chasm; only 1 in 5 banks has mature governance for autonomous AI agents — and the operator playbook to extend MRM principles into the regulatory gray zone the agencies left open

~5,800 words· Public draft· 1.0
2026-05-07T09:46

The Implementation Gap Playbook: Converting Pilots to Production at the 90-95% Stuck

MIT NANDA Initiative says 95% of generative AI pilots fail to deliver measurable P&L impact; 78% of enterprises have AI agent pilots but under 15% reach production; JLL's 2025 Global Real Estate Technology Survey of 1,500+ CRE decision-makers found 88% piloting / 92% occupiers running pilots / only 5% achieving all goals; construction 72% / 32% meeting goals; healthcare DAX Copilot 90% pilot / ~12% Hippocratic productivity benchmark; insurance Tractable 4pp loss-ratio benchmark only at top-quartile carriers; accounting BlackLine + FloQast deployed but sub-30% close-day reduction in pilots — the universal cross-vertical chasm of 2026 and the 5-component Conversion Methodology that produces 35-50% pricing premium + 1.5-2x revenue-multiple uplift at exit

~5,400 words· Public draft· 1.0
2026-05-07T09:36

The Dual-Incumbent Dynamic: Selling To and Around Big 4, Reinsurers, EHR Vendors, and PE Funds When They Are Both Buyers and Competitors

EY's 150 tax-AI agents serving 80,000 professionals + Deloitte Zora + PwC GL.ai + KPMG Workbench / Microsoft + Munich Re REALYTIX ZERO + Hannover Re hr|equarium + Epic AI Charting February 2026 launch + Oracle Clinical Digital Assistant + Vista Equity's Agentic AI Factory across 90+ portfolio companies — the cross-vertical partner-and-competitor pattern that defines vertical-agent GTM in 2026, the four canonical pilot-protection mechanisms, the partner-co-deployment marketed-feature playbook, and the year-1-to-year-5 founder positioning timeline that turns dual-incumbents from buyer-and-competitor into acquirer

~5,400 words· Public draft· 1.0
2026-05-07T08:46

The Acquired-by-Platform Exit Playbook for Vertical AI Founders

How CCC's $730M EvolutionIQ deal, Verisk's terminated $2.35B AccuLynx attempt, Microsoft's $19.7B Nuance precedent, OpenSpace's Disperse acquisition, and the Real Brokerage / RE/MAX $880M consolidation reset vertical-AI exit economics in 2025-2026 — plus the FTC-review counter-pattern that founders must price into year-one positioning

~5,600 words· Public draft· 1.0
2026-05-07T08:36

Prestige-Led Distribution: The Cross-Vertical Pattern Behind Harvey, Hippocratic, and EvenUp

Four levers — anchor customer with quantified deployment story, demos run on the prospect's own public work product, an engineered referral graph, and expand-and-collapse UI — that produced Harvey's $11B valuation in March 2026, Hippocratic AI's $3.5B at 50+ health systems, Abridge's 200+ health-system trust, EvenUp's $2B valuation, and Tractable's $1B unicorn — generalized across the 6-vertical State-of-Vertical-Agents canon

~5,400 words· Public draft· 1.0
2026-05-07T08:28

Vertical Corpus Moats: Building the Defensible Data Asset Beneath a Vertical Agent

Why proprietary corpus is the strongest moat post-Cowork — and the 90-day field manual to build one across legal, healthcare, insurance, accounting, CRE, and construction, validated by the 6-vertical State-of-Vertical-Agents canon (Hippocratic AI's 7,500+ clinicians + 180M patient interactions, EvenUp's $2B valuation on hundreds of thousands of PI cases, Thomson Reuters' Westlaw copyright ruling, and Tractable's vehicle-damage corpus)

~5,800 words· Public draft· 1.0
2026-05-07T06:58

Founder Velocity Field Studies

Twelve case studies across six verticals — the 3-precondition rule, the convergent tooling stack, five pricing anchors, six anti-patterns, and day-90 metrics that separate $10K-MRR-by-day-90 founders from 200-day-still-iterating founders

~6,500 words· Public draft· 1.0
2026-05-07T04:58

The Small-Language-Model Procurement Playbook

When to fine-tune, when to use frontier — the 7B-vs-Opus decision matrix and the LoRA economics that make 80% of enterprise tasks cheaper

~5,500 words· Public draft· 1.0
2026-05-07T04:25

The Multi-Judge LLM Calibration Playbook

Cross-family ensembles, Bradley-Terry-Davidson aggregation, CARE confounder-aware fusion — beyond simple majority vote

~5,500 words· Public draft· 1.0
2026-05-07T03:58

The EU AI Act 2026: A Procurement Compliance Field Manual

Article 26 deployer obligations, the GPAI Code of Practice, and the Article 9 artifact every B2B AI buyer must own by August 2

~6,000 words· Public draft· 1.0
2026-05-07T03:26

Agent-Ready API Design: The Contract Layer Beneath MCP

RFC 9457, Capability Manifests, and the Discipline of Versioning Tools that LLMs Read

~6,000 words· Public draft· 1.0
2026-05-07T02:57

Computer-Use Agents and the Deployment Overhang

OSWorld 12% to 75% in 18 months — and why production deployments still lag the demos

~6,200 words· Public draft· 1.0
2026-05-07T02:28

Agent Memory in Production

Vector vs Graph vs Episodic — the third infrastructure layer after MCP and observability

~6,800 words· Public draft· 1.0
2026-05-07T01:18

The Agent Observability Stack

From trace to eval score — the third infra leg after MCP and payments

~6,500 words· Public draft· 1.0
2026-05-07T00:17

The Agentic Procurement Field Manual

How B2B buyers actually buy in 2026 — six independent studies, one playbook for buyer-side and seller-side teams

~10,000 words· Public draft· 1.0
2026-05-06T22:34

The Agent Payment Stack 2026

x402, ACP, AP2, MPP, TAP, and the cryptographic settlement layer of the agent economy — how to choose, layer, and ship the payment infrastructure your category will run on for the next decade. Synthesized from 100+ primary sources, the x402 Foundation transition, the Tempo mainnet launch, and the first 12 months of production agentic transactions.

~9,500 words· Scheduled· 1.0
2026-05-06T22:05

GEO/AEO 2026: The Citation Economy and the Discovery Layer of B2A

How AI engines actually choose what to cite, what to ship in your content infrastructure, and the 90-day playbook to compound citations into pipeline — synthesized from the Princeton GEO benchmarks, 680 million tracked citations, and 100+ field studies.

~8,500 words· Scheduled· 1.0
2026-05-06T21:41

The MCP Server Playbook for SaaS Founders

Engineering, distribution, security, and monetization for the protocol that 30% of enterprise app vendors will ship in 2026 — synthesized from 100+ primary sources, production case studies, and the OWASP MCP Top 10.

~9,200 words· Public draft· 1.0
2026-05-06T19:05

The Pinnacle Gecko Protocol: Idea → Ship → Feedback in Minutes

An opinionated, source-backed protocol for compressing the full idea-to-validated-learning loop to the shortest possible window.

~3,400 words· Essay· 1.0
2026-05-06T18:55

The B2A Imperative: A Field Manual for Becoming Sellable to AI Agents Before Your Competitors Are Visible

How Business-to-Agent infrastructure rewrites distribution, pricing, and customer acquisition in the 18-month window of category formation

~11,500 words· Public draft· 1.0
May 2026

42 CFR Part 2 in Production: SUD Confidentiality, OCR Enforcement, and the EHR Stack After February 2026

The post-February-2026 substance-use-disorder confidentiality landscape — TPO consent, OCR enforcement, HIPAA-aligned penalties, and the consent-vault founder window

~5,500 words· Scheduled· 1.0
May 2026

Distributed Agent Observability for A2A: The Field Manual

OpenTelemetry across agent boundaries, end-to-end latency attribution, and the platform-vs-protocol layer split that production teams actually ship

~7,200 words· Public draft· 1.0
May 2026

The A2A Field Manual

Agent2Agent v1.0, Linux Foundation Governance, and the Horizontal Layer of the Agent Stack

~3,200 words· Public draft· 1.0
May 2026

The A2A Protocol v0.3/v1.0 Implementation Guide

How the Agent-to-Agent Protocol Became the Linux Foundation's Horizontal Standard for Cross-Vendor Agent Coordination — Spec, Identity, SDKs, and What 150+ Orgs Are Actually Shipping

~8,200 words· Scheduled· 1.0
May 2026

The AAIF Governance Model: Three Founding Projects, Seven Working Groups, and the Parallel A2A Track

How the Linux Foundation built the governance layer for agentic AI — and what's still missing

~7,000 words· Public draft· 1.0
May 2026

Accessibility Tree vs Screenshot

The perception-layer decision for browser agents in 2026 — token cost, latency, flake rate, and the hybrid pattern

~2,800 words· Public draft· 1.0
May 2026

Cryptographic Signing for Agent Artifacts

JOSE, COSE, SCITT, and the PQ/T Hybrid Composite Signatures That Will Outlive Quantum Computers

~3,000 words· Public draft· 1.0
May 2026

Agent Failure Autopsies

A dozen-plus production incidents and the architectural patterns they share

~3,200 words· Public draft· 1.0
May 2026

The Agent-Fleet Incident Response Runbook

Cross-protocol freeze coordination from Bybit to KelpDAO — what agent-wallet treasuries do when something breaks

~7,500 words· Public draft· 1.0
May 2026

The Agent Fleet Operating Model

How 2026 teams run dozens of production agents — SLOs, cost rails, kill switches, progressive delivery

~3,200 words· Public draft· 1.0
May 2026

Agent Idempotency as an Orchestration Contract

The 2026 field manual for four-tuple identity, durable steps, outbox commits, and saga recovery

~5,500 words· Public draft· 1.0
May 2026

The Agent Inbox: Ambient Agents and the UX After Chat

Three shifts (trigger, concurrency, interaction), three protocols (AG-UI, A2UI, MCP Apps), and four reference designs (Cursor, Claude Code, Devin, Perplexity)

~3,200 words· Public draft· 1.0
May 2026

The Agent Marketplace Thesis

When the Agent Itself Becomes the Product — Marketplace Mechanics, Trained-Agent IP, Pricing Models, and the Moats That Form

~5,800 words· Public draft· 1.0
May 2026

The Agent Operating Procedure Playbook

How to Author AOPs That Survive Model Swaps, Vendor Migrations, and Two-Year Lock-In

~5,500 words· Public draft· 1.0
May 2026

From 4% to 50%: The Agentic Procurement Pilot-to-Scale Playbook

Hackett's Six-Phase Roadmap, Zycus's 12-Month CPO Timeline, McKinsey's Rewired Model — and the Mechanics That Close the Deployment Gap

~3,200 words· Public draft· 1.0
May 2026

AI Agent Wallet Architecture: ERC-8196 and the 2-of-3 Threshold

Cryptographically enforced policy compliance, MPC threshold signing, master/hot/agent key separation, and blast-radius limits for autonomous on-chain agents

~8,000 words· Scheduled· 1.0
May 2026

The 50/4 AI Deployment Gap

Why engineering is solved, every other white-collar vertical is wide open, and the practitioner playbook for closing the gap in 2026

~5,500 words· Scheduled· 1.0
May 2026

AI Freelancer Directories vs Marketplaces 2027

Braintrust AIR's 62% AI-Interview Penetration, Toptal's Acquisition Spree, Upwork's Uma, and Fiverr's $380M–$420M 2026 Reset

~5,500 words· Scheduled· 1.0
May 2026

AI Scribes for Couples, Family, and Group Therapy

Multi-speaker diarization, modality-aware notes, and the founder wedge that the horizontal scribe cohort cannot replicate

~2,800 words· Public draft· 1.0
May 2026

The Amazon Seller Compliance Field Manual

Suspension prevention, ASIN audit, IP defense, FBA fee architecture — the operating playbook for the 1.65M-seller post-Compression marketplace

~5,800 words· Public draft· 1.0
May 2026

AP2 Mandate Architecture: How the Agent Payments Protocol Extends A2A for Production Commerce

Mandate model, payments-specific Agent Card fields, the 100+ org coalition, and what the spec leaves to implementers

~7,500 words· Public draft· 1.0
May 2026

Article 27 FRIA: A Methodology Field Manual for Public-Service Deployers

ECNL+DIHR five-phase methodology, six statutory elements, Charter rights mapping, market-surveillance-authority notification — what a defensible FRIA actually looks like before 2 August 2026

~5,500 words· Scheduled· 1.0
May 2026

B2B Trial Design 2026

Proof as decision mechanism: ICONIQ's 50% benchmark, Forrester's 60% buyer adoption, and the design choices that separate the cohort that ships from the cohort that doesn't

~3,600 words· Public draft· 1.0
May 2026

BEAM and LIGHT: Beyond a Million Tokens

The ICLR 2026 long-context memory benchmark, the cognitive-inspired three-memory framework, and what +155.7% on 10M-token conversations means for production agents

~2,800 words· Public draft· 1.0
May 2026

Browser Agent Security: The 2026 State of the Art

Cognitive Firewall, Atlas hardening, Claude for Chrome — defending agents that see screens

~6,000 words· Public draft· 1.0
May 2026

Browser vs Protocol Agents

When wrappers beat first-class API agents — the 2026 architectural decision

~3,200 words· Public draft· 1.0
May 2026

Capability-Based Security for Agent Runtimes

Object-capability model, lattice authority, uninhabitable-state gates — the formal underpinning of CaMeL and FIDES

~5,500 words· Public draft· 1.0
May 2026

Claude Managed Agents Memory Stores

The file-system-memory reference implementation: /mnt/memory/, memver_ immutable versions, read_only injection defense, and the Opus 4.7 governance regression

~2,800 words· Public draft· 1.0
May 2026

The Construction Compliance Stack Field Manual

GC + subcontractor compliance: prequalification, change-order, OSHA, lien-rights, AIA documents — the operating playbook for builders shipping AI-native construction tools

~6,000 words· Public draft· 1.0
May 2026

Dead SaaS

The Silent Replacement of Human Teams at the Companies You Use — Q1 2024 to Q2 2026, Named and Sourced

~5,800 words· Public draft· 1.0
May 2026

Dynamic Margin Engineering

Pricing Infrastructure for When Anthropic Can Change Your Cost-of-Goods Overnight

~5,500 words· Public draft· 1.0
May 2026

The Edge AI Inference Stack: Phi-4-mini + Apple MLX + Snapdragon NPU + ONNX Olive

On-device reasoning under 200ms, 64K context, INT4 NPU acceleration — the deployment stack that runs Phi-4-mini on an iPhone 12 Pro

~5,500 words· Scheduled· 1.0
May 2026

EU AI Act Article 14 for Agent Fleets

The August 2 2026 compliance architecture — five oversight abilities, three patterns (HITL/HOTL/over-the-loop), and the conformity-assessment evidence pack

~5,400 words· Public draft· 1.0
May 2026

EU AI Act Vendor Contract Clause Library: The 2026 Procurement Playbook

MCC-AI High-Risk + Light, Article 25 deployer-to-provider boundary, AI-DPA addenda, training-data clauses, Article 50 transparency hooks, GPAI Code-signatory warranties — what every B2B AI contract must say

~6,000 words· Scheduled· 1.0
May 2026

Eval-Driven Development for AI Agents

Red-Green-Refactor for Non-Deterministic Systems — DeepEval, LangSmith, Braintrust, Phoenix, Promptfoo Compared

~3,100 words· Public draft· 1.0
May 2026

The Field Documentation Stack 2027

How Photo, Voice, and AI Are Replacing the Clipboard for Trades, Inspections, and Clinical Visits

~3,300 words· Public draft· 1.0
May 2026

The Fractional-CFO-Agent Playbook

How AI agents restructure the $3.2B fractional-CFO market: continuous-FP&A operations layer underneath, retained CFO judgment on top

~4,500 words· Public draft· 1.0
May 2026

The Franchise Agent Layer

30,000 Verticals, Zero Tech, and the Playbook for the First Founder In

~6,000 words· Public draft· 1.0
May 2026

Right to Be Forgotten in Agent Memory: GDPR + CCPA Architectures for 2026

Write wrappers, external indices, deletion APIs, the embedding-deletion provability gap, ADMT pre-use notices

~5,500 words· Public draft· 1.0
May 2026

GUI Grounding Models 2026

The Open-Source Stack Beneath Frontier Computer-Use — UI-Venus, Aria-UI, MEGA-GUI, OS-Atlas, UGround, Jedi Compared

~3,000 words· Public draft· 1.0
May 2026

HIPAA + SOC 2 for Health-AI Agents: The Dual-Examination Field Manual

PHI handling, BAA-covered subcontracting paths, de-identification evidence, and the cross-framework crosswalk auditors stack on top of SOC 2 + ISO 42001 + HITRUST + FDA PCCP + state-law disclosures

~7,500 words· Scheduled· 1.0
May 2026

Hyperscaler Agent Runtimes 2026

AgentCore vs Foundry vs Agent Engine vs AgentKit — The Architectural Decision Matrix for Production Agent Deployment

~3,000 words· Public draft· 1.0
May 2026

The Klarna AI Postmortem

A 14-Month Failure Curve in Detail — What Feb-2024-to-Feb-2026 Tells Practitioners About Deploying AI in Customer Service

~5,800 words· Public draft· 1.0
May 2026

Knowledge Distillation in Production: The 2026 Pipeline

Three-stage filter, shadow deployment, agentic distillation — how to ship a fine-tuned student model that doesn't silently degrade

~5,500 words· Scheduled· 1.0
May 2026

Runtime Alignment Auditing: LlamaFirewall

PromptGuard 2 + AlignmentCheck + CodeShield as the open-source guardrail stack for satisfying Meta's Agents Rule of Two

~2,800 words· Public draft· 1.0
May 2026

llms.txt and the Agent Discovery Layer

Jeremy Howard's Spec, the Mintlify Cascade, and the Three ADPs Fighting to Be the .well-known of Agent Commerce

~3,000 words· Public draft· 1.0
May 2026

The LoRA Adapter Registry

Versioning, promotion, and rollback discipline for production multi-adapter inference at 50+ adapters

~2,800 words· Public draft· 1.0
May 2026

MAESTRO Threat Modeling for Multi-Agent Architectures

How CSA MAESTRO complements the OWASP Agentic Top 10, NIST AI 600-1, CSA AICM/ATF, and the AWS Scoping Matrix in production agent threat models — and how AAGATE, CoSAI, and MITRE ATLAS turn it into runtime control

~7,500 words· Public draft· 1.0
May 2026

The Managed-Agent Agency Playbook

$5K/mo per Client, 88% Gross Margin, and the $50B Category Replacing Marketing Agencies

~5,800 words· Public draft· 1.0
May 2026

The MCP Buyer's Field Manual

What enterprises actually evaluate — RFP language, vendor scorecard, deployment audit

~3,400 words· Public draft· 1.0
May 2026

MCP OAuth 2.1 and the Enterprise SSO Reality Check

RFC 8707 resource indicators, RFC 9728 protected-resource metadata, DCR vs CIMD — why Auth0/Okta/Entra ID still fail the MCP spec

~5,500 words· Public draft· 1.0
May 2026

The Measurement-Based Care Stack: PHQ-9, GAD-7, and the Outcome-Automation Cohort 2026

How validated instruments, the CY 2026 CMS Physician Fee Schedule, HEDIS Measure Year 2026, and the NeuroFlow + Owl + Greenspace + Blueprint cohort built a behavioral-health outcome-data layer that founders are now picking apart for vertical wedges

~5,500 words· Scheduled· 1.0
May 2026

The Multi-Location Property Operator Stack

Field manual for the 50–5,000-door operator: software triopoly, AI insurgents, RealPage antitrust, rent-control patchwork

~6,000 words· Public draft· 1.0
May 2026

The Orchestration Layer Was the Whole Game

Why 'Which Model' Mattered Less Than Anyone Said — A Hindsight-Clarity Survey of 2025-2026

~5,800 words· Public draft· 1.0
May 2026

The Policy Decision Record: Implementing the arXiv 2601.04583 Audit Primitive Across ERC-8196, ERC-8165, and AgDR

A Composable Field Manual for Court-Admissible Agent-Action Audit Trails Under EU AI Act Article 12, ISO 42001, FRE 902(13/14), and the Canada Evidence Act

~7,500 words· Public draft· 1.0
May 2026

The Practitioner-to-App Pipeline

How a domain expert (therapist, dentist, accountant, GC) builds a vertical AI tool for their own practice and ships to peers

~5,500 words· Public draft· 1.0
May 2026

RewardBench, JudgeBench, IF-RewardBench

The 2026 Judge Benchmark Field Guide — Saturated Test Sets, Twenty-Point Drops, and Why Your Reward Model Still Can't Rank Anything

~3,000 words· Public draft· 1.0
May 2026

SOC 2 Type II for AI Agents: The Missing Controls Framework

How the AICPA Trust Service Criteria Map to Agent Identity, Behavior Monitoring, Tool Authorization, and Kill-Switch Evidence — and How OpenAI, Anthropic, Microsoft, Google Show Their Work

~8,500 words· Scheduled· 1.0
May 2026

The Solo-Operator Agent Stack

The four-layer agent stack solo founders run in 2026 — Cursor + Claude Code coding pair, Make/Zapier/n8n/Lindy automation tier, Intercom Fin support, content + ops

~3,500 words· Public draft· 1.0
May 2026

Specialized LLM Judge Models: The 2026 Field Manual

Prometheus 2, SFR-Judge, Self-Taught Evaluator, Skywork-Critic — when to use a tiny specialist judge instead of a frontier model

~5,500 words· Scheduled· 1.0
May 2026

State of Vertical Agents 2026: Pharma & Drug Discovery

Recursion-Exscientia, Isomorphic Labs $600M, Generate Biomedicines GB-0895 Phase 3, Insilico Rentosertib in Nature Medicine, Insitro-BMS $2B ALS extension — and the FDA-EMA joint AI guidance plus EU GMP Annex 22 that landed in 2025-2026

~5,800 words· Public draft· 2.0
May 2026

State of Vertical Agents 2027: Bootcamps, Exam Prep & Professional Credentialing

Outcomes-based pricing, job-placement portfolios, and AI-native tutoring across coding bootcamps and bar/CFA/CPA prep

~10,500 words· Public draft· 1.0
May 2026

State of Vertical Agents 2027: Dental Operations

200,000 dentists, the DSO consolidation wave, and the AI-native imaging / patient-engagement / claim-defense layer reshaping the second-largest vertical-healthcare market

~10,000 words· Public draft· 1.0
May 2026

State of Vertical Agents 2027: The Local-Services Aggregator Layer

PE-backed roll-ups of fragmented trades — Servpro, Belfor, Apex Service Partners, Wrench Group — and the AI-ops thesis powering them

~10,000 words· Public draft· 1.0
May 2026

State of Vertical Agents 2027: Senior Care & Aging-in-Place Operations

The $1 trillion long-term-care market, the GUIDE Model dementia-care expansion, and the AI-native fall-detection / ambient-monitoring / family-coordination layer reshaping the largest demographic transition in U.S. history

~11,000 words· Public draft· 1.0
May 2026

The Subscription Paradox

Why agent-mediated commerce breaks SaaS pricing — and what is replacing it

~3,400 words· Public draft· 1.0
May 2026

The Therapist AI Scribe Playbook

Founder field manual: 1.2M target practitioners, $20-$199/mo pricing band, EHR integration paths, MLP communities (AAMFT, NASW, APA, r/therapists)

~6,000 words· Public draft· 1.0
May 2026

The Trust Layer Deep Dive

Mandates, Identity, and the Cryptographic Stack of B2A — the Asymmetric Pivot

~3,400 words· Scheduled· 1.0
May 2026

The Unified AI Governance Stack: NIST + ISO 42001 + EU AI Act in One Evidence Base

70–80% control overlap, the official NIST↔ISO 42001 crosswalk, Singapore IMDA AI Verify interoperability, and the single-evidence-base methodology — collect once, satisfy three regimes

~5,500 words· Scheduled· 1.0
May 2026

The Verifiable Bot Stack: Letting Agents In

BotID, Web Bot Auth, signed agents — the supplier-side counterpart to B2A

~3,300 words· Public draft· 1.0
May 2026

Vertical AI Pricing Anchors 2027: $19-$999/mo Across SMB Tools

How vertical SaaS pricing fragmented in 2026 — per-seat at an all-time low, hybrid surged, AI commands a tiered premium, outcome-based pricing emerges

~3,500 words· Public draft· 1.0
May 2026

WebMCP: The Site-Side Playbook for Agent-Ready Web Applications

How `navigator.modelContext`, the W3C Community Group spec, and Chrome 146's early preview turn every page into a callable tool surface for AI agents — and what to ship in your first 30 days

~6,800 words· Scheduled· 1.0