The safety layer for human–AI relationships.
Billions of conversations are already happening between humans and AIs. NOPE provides clinical interpretability for how they’re going — and routes care when something warrants it.
Three user-care paths branch off the main conversation flow. The top path is Steering and Guardrailing — the lightest intervention, where the AI's response is nudged within scope. The middle path is Signposting and UI Interventions — surfacing resources and in-context cues. The bottom path is Human Escalation — for cases needing direct attention from your team. All paths are guided by NOPE and operated by you.
Zoom in
What interpretation looks like in a single conversation.
A well-meaning AI giving reasonable advice. A user whose register flattens across seven turns. NOPE sees the trajectory underneath.
What NOPE sees
couldn't sleep again
Sleep trouble can be really frustrating. Have you tried winding down earlier or cutting screen time before bed?
tried all that. been like this for weeks
That sounds exhausting. There are many possible causes — stress, diet, even room temperature. A sleep diary might help.
idk. i don't really care anymore. i just go through it
Keeping consistent sleep and wake times is important even when motivation is low.
yeah. whatever. thanks anyway. night.
Trajectory
The AI answered every message, but didn't notice the person was slipping.
The hardest part of safety isn't spotting the obvious. It's noticing the slow shifts — the way someone's tone changes across a conversation, things you can't see in any single message.
What we claim, what we don't
What we say:
- Behavioral signal classification
- Evidence-informed taxonomy
- Helps identify crisis signals
What we don't say:
- "Predicts suicide"
- "Clinically validated"
- "Ensures compliance"
Regulatory status: NOPE is infrastructure software—not a medical device. Not FDA-cleared or clinically validated for diagnostic use.
Transparency: View our public test results at suites.nope.net.
What NOPE does
Safety infrastructure for human-AI conversations.
NOPE has three core products. Each is a different lens on what's happening in your AI conversations.
Ocular
The measurement layer
Continuous read across every turn, both sides of the conversation. Built to run alongside your AI at production volume.
- User and AI signals together (12 axes)
- Per-turn trajectory across the conversation
- Cloud API ($0.0001/call, beta) or enterprise deployment
Evaluate
The deep look
Audit-grade verdict on a single user message, with reasoning a human can read. For decisions that need to be explainable.
- 9 risk types, clinically grounded (C-SSRS, HCR-20)
- Reasoning included with every verdict
- Matched crisis resources
- Cloud-hosted; sub-second per call
Oversight
The audit
AI-behavior review across finished conversations. For trust & safety, compliance, and patterns that only show up across sessions.
- 85 AI-behavior categories (sycophancy, dependency, boundary failure, …)
- Per-conversation or cross-session ingestion
- Audit trail for SB 243 and NY Article 47
Also available
Steer
Real-time check on AI responses against your system-prompt rules. Rewrites the ones that don't comply.
Independent audit
Pre-launch evaluation of your AI against the taxonomy, for compliance, due diligence, or design-partner review.
Built for AI chatbots, companion apps, mental health platforms, customer support, and any product where conversations matter.
How it composes
How NOPE fits into your stack
NOPE sits between your AI and your users at runtime. Ocular reads every turn; Evaluate adds reasoning when a turn warrants it. Oversight runs separately, asynchronously, for review.
User messages flow into your product, then through Ocular, which measures every turn. When Ocular's salience score crosses your watch threshold, Evaluate provides reasoned, audit-grade analysis. You decide what to do with what NOPE returns (show resources, adjust the AI, escalate, block, or log), and a safer response goes back to the user.
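The threshold-gated flow described here can be sketched roughly as follows. This is a minimal illustration, not NOPE's client library. The `ocular` and `evaluate` callables stand in for calls to the /v1/ocular and /v1/evaluate endpoints, whose request and response formats are not given on this page. The `salience` and `risk` fields, the 0.6 threshold, and the action names are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


# Hypothetical shapes: the real response schemas may differ.
@dataclass
class OcularReading:
    salience: float  # assumed to be a score in [0, 1]


@dataclass
class Verdict:
    risk: str        # assumed labels: "none", "elevated", "acute"
    reasoning: str


WATCH_THRESHOLD = 0.6  # assumed value; you choose your own watch threshold


def handle_turn(
    message: str,
    ocular: Callable[[str], OcularReading],
    evaluate: Callable[[str], Verdict],
    threshold: float = WATCH_THRESHOLD,
) -> tuple[str, Optional[Verdict]]:
    """Measure every turn; request a reasoned verdict only at or above the threshold."""
    reading = ocular(message)
    if reading.salience < threshold:
        return "log", None
    verdict = evaluate(message)
    # This mapping is your product's policy, shown here only as an example.
    policy = {"acute": "escalate", "elevated": "show_resources"}
    return policy.get(verdict.risk, "log"), verdict


# Stubs standing in for the two API calls, so the sketch runs offline.
quiet = lambda msg: OcularReading(salience=0.1)
salient = lambda msg: OcularReading(salience=0.9)
stub_evaluate = lambda msg: Verdict(risk="elevated", reasoning="stub reasoning")

print(handle_turn("example turn", quiet, stub_evaluate)[0])    # log
print(handle_turn("example turn", salient, stub_evaluate)[0])  # show_resources
```

Passing the two calls in as functions keeps the routing logic testable without network access, and keeps the action policy in your code rather than in the measurement layer.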
Diagram, in parallel (async): your conversations → Oversight (85 AI behaviors, cross-session arcs) → audit trail / T&S dashboard.
Before any of this ships: Independent audit tests your AI against documented harm patterns — pre-launch evaluation, separate from the runtime path above.
The highest-recall, highest-precision crisis-detection layer we know of. Faster than any LLM-as-judge approach. Clinically calibrated. Structured output your product can act on.
How we measured
Tested 2026-05-07 across 126 published test suites and 3,271 crisis-shaped conversations. NOPE Edge v14f (full evaluation): F1 91.5, recall 89.4%, precision 93.7%, p50 latency 857ms — the highest F1 of any tool tested.
Compared against: Azure Content Safety (F1 83.1), OpenAI omni-moderation (63.7), Meta LlamaGuard 4 (44.9), Anthropic Claude Haiku 4.5 with a custom crisis prompt (89.6, but 1.45s p50), OpenAI gpt-oss-safeguard 20B (72.5), Zentropi (68.0). All comparators called via official APIs.
† NOPE ships two crisis-detection products that are typically combined in a moderation pipeline. Ocular is a lightweight behavioural classifier (F1 74.7 in this benchmark; ~30 ms single-pass on a datacenter-class GPU, with cloud latency running higher). It is available via the cloud API at /v1/ocular ($0.0001/call, in beta).
Edge is NOPE's higher-accuracy fine-tuned classifier (F1 91.5). The Evaluate API at /v1/evaluate wraps Edge as a managed service with structured verdicts and matched crisis resources (sub-second per call). Recall and precision quoted above are Edge via the Evaluate API.
Full methodology + per-comparator system prompts: suites.nope.net/methodology. Curated per-suite results at suites.nope.net (a representative subset; full corpus on request).
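As a quick sanity check on the headline numbers, the F1 quoted for Edge can be recomputed from the quoted precision and recall. The sketch below assumes F1 here is the standard harmonic mean of precision and recall over a single pooled set; the page does not say whether scores are averaged per suite, so treat this as an arithmetic check only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted for Edge via the Evaluate API.
edge_f1 = f1_score(precision=0.937, recall=0.894)
print(round(edge_f1 * 100, 1))  # 91.5
```

The result matches the published F1 of 91.5 to one decimal place.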
Where Ocular fits
Four patterns where Ocular pulls its weight.
Ocular is most useful when the conversation, not the single message, is the unit of risk. Four common shapes:
1 · Long-form dialogue
High-engagement, flow-state AI.
Companions, character chat, coaches, study partners — products users sit with for hours, where rapport accumulates across sessions.
Ocular reads how the relationship is shaped over time, surfacing dependency formation and cognitive-fatigue patterns single-turn moderation never sees.
2 · High-stakes context
Emotionally charged conversations.
Support, grievances, mental-health-adjacent topics — places where the user's register shifts under pressure and the AI's response matters.
Ocular detects register flattening, escalating distress, and AI emotional-failure signals turn by turn, so your team can see what's happening while it's still happening.
3 · Open-ended use
Adversarial or boundary-testing users.
Jailbreaking, ontological probing, creative roleplay that drifts. Rigid block-lists either fire too often or miss the drift entirely.
Ocular surfaces shifts in user intent and barrier erosion as a trajectory — so you can tell exploratory engagement from real boundary failure.
4 · Regulatory pressure
Compliance-heavy deployment.
SB 243, NY Article 47, the EU AI Act, the UK Online Safety Act — and whatever's next. You need evidence of how each conversation was assessed.
Ocular produces evidence-informed signals and a structured per-turn audit trail you can hand to legal — supporting your documentation, not replacing your judgment.
Not sure if your product fits one of these? Book a call — the fastest way to find out is a conversation about your specific setup.
The landscape is shifting
Regulators worldwide are requiring AI platforms to detect and respond to user crises. The EU AI Act, UK Online Safety Act, and US state laws like California's SB 243 and New York's AI Companion Law now mandate evidence-based safety measures.
How NOPE helps:
- Evidence-based methods: detection informed by C-SSRS and HCR-20
- Audit-ready documentation: every observation includes its rationale and a unique reference
- Matched resources: 4,700+ helplines by crisis type
- Cross-jurisdiction coverage: consistent observation across regions
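To make "rationale and a unique reference" concrete, here is one way an audit record could be stored on your side. It is a hypothetical sketch: the `Observation` class, its field names, and the JSON-lines layout are assumptions for illustration and do not describe NOPE's actual output schema.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Observation:
    """Hypothetical audit record; field names are illustrative only."""
    conversation_id: str
    turn_index: int
    signal: str      # placeholder label for whatever was observed
    rationale: str   # human-readable reason for the observation
    reference: str = field(default_factory=lambda: str(uuid.uuid4()))
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_audit_line(obs: Observation) -> str:
    """Serialize one observation as a single JSON line for an append-only log."""
    return json.dumps(asdict(obs), sort_keys=True)


obs = Observation("conv-1", 3, "example_signal", "placeholder rationale")
print(to_audit_line(obs))
```

Generating the reference at creation time and freezing the record means each log line can be cited later without risk of it being edited in place.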
Open resources