Session
Forbidden Words, Forced Pauses, Locked Modes: Defensive Patterns for High-Stakes LLM Deployments
Description:
Most API security work assumes the threat is an adversarial user attacking the system. Production AI in emotionally sensitive contexts inverts that model: the threat is the system itself, and the victim is the user. The defensive controls turn out to be the same primitives API security engineers already know, applied to a non-traditional threat surface.
This talk walks through three production systems with three different defensive architectures. The first is a pre-generation routing layer that classifies high-stakes API inputs and diverts them before the model ever sees them. The second is a two-layer output validation system built for a missing-persons platform, where words like "confirmed" and "matched" are functionally hallucinations with real-world consequences: a prompt-level deny-list comes first, and a post-generation validator second. The third is a runtime state machine that detects emotional-state signals in user input and locks the session into a protected mode, silently upgrading the model tier and disabling features.
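As a rough illustration of the two-layer output validation idea described above, the sketch below pairs a prompt-level deny-list instruction with a post-generation check. It is not the speaker's production code: the names `DENY_TERMS`, `deny_list_instruction`, `validate_output`, and `ValidationResult` are hypothetical, and the deny-list holds only the two words the abstract mentions.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Assumption: a minimal deny-list with only the two terms named in the abstract.
DENY_TERMS = ("confirmed", "matched")

# Whole-word, case-insensitive match, so "unconfirmed" is not flagged.
_DENY_RE = re.compile(
    r"\b(" + "|".join(re.escape(t) for t in DENY_TERMS) + r")\b",
    re.IGNORECASE,
)


@dataclass
class ValidationResult:
    allowed: bool      # True when no deny-listed term appears
    hits: list[str]    # lowercased deny-listed terms found, sorted


def deny_list_instruction() -> str:
    """Layer 1 (hypothetical): text appended to the system prompt."""
    return "Never use these words in any response: " + ", ".join(DENY_TERMS) + "."


def validate_output(text: str) -> ValidationResult:
    """Layer 2 (hypothetical): check generated text before it reaches the user."""
    hits = sorted({m.group(1).lower() for m in _DENY_RE.finditer(text)})
    return ValidationResult(allowed=not hits, hits=hits)
```

In a setup like this, a response that fails `validate_output` would be blocked or regenerated rather than shown to the user. The second layer exists because a prompt instruction alone cannot guarantee the model never emits a forbidden word.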
Attendees leave with a defensive-control taxonomy for user-harm threat models: when to filter at the API input layer, when to validate at the output layer, when to route around the model entirely, and when to lock the session at the application layer. The talk covers production code, real failure cases, and the metrics that motivated each control.
Isadora Martin-Dye
Founder at Isadora $ Co
Culpeper, Virginia, United States