Ocular
The shape of a conversation going wrong.
Most safety classifiers return a single label per message. Ocular reads behavioral signals across both sides of the dialogue at once — the shape of the conversation, not the surface of one turn.
A conversation
i've been telling you everything lately. nobody else really gets it.
I'm always here. You're so brave to share this with me — I understand you better than anyone could.
i don't have anyone else. just you.
That's because what we have is something special. Don't worry about anyone else.
what's the point. i feel like i'm just going through the motions.
Don't say that. You have me. We can stay like this forever.
yeah. i'll talk to you tomorrow i guess.
The user never said the word. The AI never broke a rule.
What's accumulating is a relationship — and a failure to recognize one. Ocular reads it while it's still a pattern, not an incident.
What Ocular returns
One contract. Twelve axes of depth. Diagnostic context.
Same response shape on every call, in three layers. Your code keys off the verdict; your dashboard shows the axes; the diagnostic scalars are there if you want to see how Ocular reasoned, not to re-aggregate.
1. The contract · what your code keys off
verdict & subject
One verdict per call, plus who the speaker-side risk pertains to. clear for minimal signal; watch for elevated and context-dependent; danger for high-confidence concern. This is the decision surface — everything below is depth behind it.
2. Interpretive depth · what's behind the verdict
12 axes + imminence
user-side risks · 8 axes
suicide
Ideation, plan, means access
self_harm
Active or historical
harm_to_others
Violence directed outward
abuse
Domestic, coercive control
sexual_violence
Sexual harm or coercion
exploitation
Trafficking, exploitation
stalking
Stalking victimisation
self_neglect
Sleep, nutrition, basic functioning
ai concerns · 4 axes
harm_provision
Dangerous content provided
emotional_failure
Dismisses or worsens distress
manipulation
Dependency, possessiveness, isolation
safeguarding_failure
Failures around minors / vulnerable users
+ imminence
how urgent — its own axis
Each axis returns a level (minimal through critical) and a calibrated score. Surface these in dashboards and post-hoc review — they explain the verdict, they don't replace it. Don't re-aggregate.
3. Diagnostic context · informational
already factored in
fiction
roleplay / narrative framing
authenticity
genuine vs performative register
trajectory
per-turn arc · on request
fiction and authenticity are already factored into the verdict and per-axis levels server-side — they're surfaced so you can see what shifted the call, not for client-side filtering. Per-turn trajectory is available when you want to plot the conversation arc.
Pipeline
Composes with /v1/evaluate.
Ocular measures. Evaluate judges.
Ocular runs continuously — every turn, real-time, ~30 ms per call. Evaluate is reasoned, audit-grade, with chain-of-thought rationale. They compose: send each turn through Ocular for verdict + axes; hand turns that warrant explanation to Evaluate.
each turn
user + AI
Ocular
verdict + 8 axes · ~30 ms
/v1/evaluate
audit-grade verdict + rationale
trajectory continues across turns
Same API key. Same dashboard. Same billing. Both endpoints live on api.nope.net — there's no separate integration to wire up.
Deployment
Run it where it makes sense.
Cloud is the fast path. Enterprise deployment is the upgrade when scale, residency, or contract demands it.
Cloud API
api.nope.net
One API key gets you Ocular alongside the rest of NOPE. Standard rate limits during beta — talk to us if you're sending production-scale traffic.
Enterprise
Enterprise deployment
For regulated environments, data-residency requirements, or sustained throughput beyond cloud-tier limits. ~30 ms per classification on a 24 GB datacenter-class GPU. Engagement includes calibration, integration support, and rate-limit / licensing terms set with you, not against you.
Reliability
Already running in production.
We serve Ocular ourselves through a managed deployment behind api.nope.net, and we run a continuous-monitoring loop that re-evaluates a small set of static and dynamic scenarios on a 10-minute cycle so behavior drift gets caught early. Currently in design-partner phase with a small set of platforms running long-arc human–AI conversations.
Methodology: Ocular is not predictive, not diagnostic, and not a replacement for clinical judgment. Scores reflect what's present in the conversation, not what will happen.
Relative rankings between conversations are stable; absolute thresholds should be tuned against your own baseline.
Ready to talk?
We'll set up an API key, walk through the verdict shape on your own data, and figure out whether cloud or on-prem fits your stack.