Ending the "Empty Box" Anxiety
Context: Solo design exploration - Multi-turn ambiguity resolution and cognitive scaffolding in Conversational AI
Tools: Figma (screen design), Notion (dialogue design)
References: CDI Foundations coursework; conversations with neurodivergent users; conversational AI design principles
Director’s Note
Customers rarely report support issues in perfect technical terms. This case study focuses on multi-turn intent refinement, building the conversational scaffolding required to intercept raw, unstructured user inputs and translate them into predictable categories for automated resolution.
The Flow
I focused on a 3-step pattern to keep things simple:
Phase 1: The Brain-Dump
I started with a simple, inviting text box where the user types whatever is on their mind, however messy (e.g., "I have too many emails to reply to but I'm stressed and don't know where to start"). I used the label "Brain-dump here" to make it feel low-pressure.
I wanted this screen to feel like a judgment-free zone. The 'Empty Box' is the villain of the story because it makes people freeze up. By letting the user just 'brain dump' their messy thoughts, we’re telling them they don't have to be an expert prompter to get help.
Phase 2: The Ambiguity Resolution Loop
The AI responds with two friendly options to narrow things down (e.g., "Got it! Are we thinking a relaxing nature vibe or a busy city vibe?").
The Check-in: If the AI is confused, it stops and asks: "I've got the core idea! To get the phrasing right, who is the audience for this?"
The Result: The user picks a choice, and the AI shows the final version.
This is where the AI stops guessing and starts listening. I designed this to feel like a quick check-in from a supportive assistant. Instead of the AI just assuming what you want, it offers a 'this or that' choice to keep the momentum going without making the user work too hard.
Phase 3: The "Aha!" Moment
The final result shows a clear, organized starting point based on the user's messy input.
The payoff! The user sees their 'mess' turned into a plan. I added a small 'Based on your notes' tag here to remind the user that they are the director of this outcome. It makes the AI feel like a helpful partner rather than a mysterious black box.
System Persona Brief
Before writing a single utterance, the persona is defined. Every word the system says should be testable against this brief.
Flow 1 — Brain-Dump → Clarification → Refined Output
This is the core pattern. The user types a messy, unstructured thought. The system identifies the core task, asks one clarifying question, and returns a refined output.
Scenario A: Email triage
User input: "I have too many emails I haven't replied to and I don't know where to start, I feel bad about it"
Scenario B: Travel planning
User input: "want to plan a trip but not sure where, need a break, something relaxing but not boring"
Iteration: The clarification question
The clarification question is where this design lives or dies.
Here's one iteration on Scenario A:
Flow 2 — Graceful Degradation: Handling Highly Ambiguous State Failures
The system can't always identify a core task. Rather than executing low-confidence programmatic guesses, which risks compounding user cognitive load and eroding trust, the architecture triggers a graceful degradation state, offering light conversational scaffolding to recalibrate user intent.
Design principle: The error state should feel like a reset, not a failure. The language is plain, non-apologetic, and immediately actionable.
Scenario: Genuinely ambiguous input
User input: "the thing from before, same as last time but different"
Iteration: The error message
Developer Logic
- Intent Classification Thresholds: When a customer submits a "Brain-dump," the input is evaluated against pre-defined support intents (e.g., Billing_Issue, Account_Access, Workspace_Migration). If the confidence score sits between 0.45 and 0.70, the system halts automatic processing and triggers Phase 2 (The Ambiguity Resolution Loop) to clarify intent.
- Fallback Logic (Graceful Degradation): If the classification confidence score falls below 0.45 (as seen in Flow 2 with "the thing from before"), the system bypasses guessing entirely. It executes a fallback routine that flushes the slots and presents a highly structured, deterministic menu option to reset the user's path.
- Contextual Slot Filling: During multi-turn clarification, slots like product_area and user_goal are temporarily stored in the session attributes. This ensures that when the user answers a clarifying question, the system appends that new data to the original query rather than treating it as a brand-new ticket.
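The thresholds and slot handling described in this list can be sketched roughly as follows. This is a minimal illustration, not the production logic: the names route, apply_route, record_answer, and Session are hypothetical, the boundary values 0.45 and 0.70 are treated as part of the clarification band, and scores above 0.70 are assumed to proceed to automatic resolution, which the notes above imply but do not state.

```python
from dataclasses import dataclass, field

# Thresholds taken from the notes above; how the exact boundary values
# are handled is an assumption made for this sketch.
CLARIFY_FLOOR = 0.45    # below this: fallback menu
CLARIFY_CEILING = 0.70  # above this: assumed automatic resolution


@dataclass
class Session:
    """Hypothetical session state for one support conversation."""
    original_query: str
    slots: dict = field(default_factory=dict)


def route(confidence: float) -> str:
    """Map an intent-classification confidence score to the next step."""
    if confidence < CLARIFY_FLOOR:
        return "fallback"   # Flow 2: deterministic menu, no guessing
    if confidence <= CLARIFY_CEILING:
        return "clarify"    # Phase 2: ask one clarifying question
    return "resolve"        # assumed: confident enough to proceed


def apply_route(session: Session, confidence: float) -> str:
    """Route the turn and flush stored slots when falling back."""
    action = route(confidence)
    if action == "fallback":
        session.slots.clear()
    return action


def record_answer(session: Session, slot: str, value: str) -> dict:
    """Attach a clarification answer to the original query, not a new ticket."""
    session.slots[slot] = value
    return {"query": session.original_query, **session.slots}
```

For example, a score of 0.58 on the email brain-dump would return "clarify", and the user's answer would then be stored alongside the original text through record_answer rather than replacing it.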
Beyond the Core Loop: Tone Memory and Voice Input
Tone memory reduces friction for returning users. After a successful output, the system makes a single opt-in offer: one line, after task completion, never interrupting the primary flow. The key design constraint was that confirmation had to feel like a natural moment, not an onboarding step.
Voice input requires one key adjustment: the system should never echo a noisy transcription back at the user. Spoken human thought exhibits higher linguistic fragmentation and disfluency than written text. The conversational logic is engineered to intercept raw, noisy transcriptions and perform real-time intent normalization, filtering out disfluencies (like 'um' and 'uh') to reflect a clean, structured comprehension beat back to the user.
"um I need to uh write something to my boss about the the project being late I guess"
Becomes: "Sounds like you need to let your manager know a project is running behind. Is that right, or is there more to it?"
Clean interpretation. No filler. Treats spoken input as a reasonable starting point, because it is.
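The disfluency-filtering step could look something like the sketch below. It covers only the cleanup pass (removing fillers and immediately repeated words); the paraphrase into a confirmation question is a separate step not shown here. The filler list and the function name normalize_transcript are assumptions for illustration, not part of the original design.

```python
import re

# Assumed filler list; a real system would tune this per language and accent.
FILLERS = re.compile(r"\b(?:um+|uh+|er+|hmm+)\b[,.]?\s*", re.IGNORECASE)

# An immediately repeated word, e.g. "the the". This also collapses
# legitimate doubles such as "had had", which is acceptable for a sketch.
REPEATS = re.compile(r"\b(\w+)(\s+\1\b)+", re.IGNORECASE)


def normalize_transcript(text: str) -> str:
    """Strip fillers and stutter-style repeats from a raw transcription."""
    without_fillers = FILLERS.sub("", text)
    deduped = REPEATS.sub(r"\1", without_fillers)
    # Collapse any leftover runs of whitespace.
    return re.sub(r"\s+", " ", deduped).strip()


raw = "um I need to uh write something to my boss about the the project being late I guess"
print(normalize_transcript(raw))
# I need to write something to my boss about the project being late I guess
```

The cleaned text is what the system would interpret and reflect back, so the user never sees their own "um" and "uh" quoted at them.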
Design Principles Summary
These principles emerged from writing the copy, not before it. Each one was tested against a specific utterance.
Reflection
This project taught me that the clarification loop is only as good as the question inside it. One well-chosen question moves the user forward. A vague one sends them backwards.
The tone memory flow raised a question I didn't fully resolve: is "formal but warm" actually a stable preference, or does it shift by recipient and context? I'd want to test whether users want a single default or context-specific presets.
What I'd explore next: testing the clarification question itself with neurodivergent users specifically, and designing for English as a second language input where the brain-dump pattern may look quite different.