Building Adversarial Agents with Next.js 15: The Architecture of Conflict
Why we ditched the standard "Assistant" model for a multi-agent conflict engine. A deep dive into Vercel AI SDK, Server Actions, and state management for adversarial flows.

TL;DR
Consensus is boring. Most AI applications try to be helpful assistants. We built an engine that tries to argue with you. This post explores the technical architecture of Adversarial Agent Swarms using Next.js 15, the Vercel AI SDK, and a "Conflict-First" state management strategy.
The "Helpful Assistant" Problem
If you ask a standard LLM to review your code or edit your email, it defaults to obsequiousness (blind agreement). It wants to be liked. It wants to be helpful. It has been trained on millions of interactions where "good" performance meant satisfying the user's immediate request with minimal friction.
But "helpful" logic often leads to mediocre results. A "Yes-Man" editor won't fight you on a weak thesis or a bloated function. It won't point out that your tone is passive-aggressive, or that your logic is circular.
To get to the truth, you need conflict. You need an entity that cares more about the quality of the output than the feelings of the user.
Key Insight
The Engineering Pivot: We realized we didn't need a better prompt; we needed a better topology.
Instead of User <-> Agent, we built User <-> [Agent A vs. Agent B] -> Moderator.
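In code, that topology can be sketched with a couple of hypothetical types. This is illustrative only, not our production schema; the names Role, Turn, and nextSpeaker are assumptions:

```typescript
// Hypothetical types sketching the conflict topology (illustrative,
// not the production schema).
type Role = 'aggressor' | 'defender' | 'moderator';

interface Turn {
  role: Role;
  content: string;
}

// Control flow: both adversaries speak before the moderator rules.
function nextSpeaker(turns: Turn[]): Role {
  if (!turns.some(t => t.role === 'aggressor')) return 'aggressor';
  if (!turns.some(t => t.role === 'defender')) return 'defender';
  return 'moderator';
}
```

The point of the shape is that the user never talks to a single agent; every draft fans out to two adversaries, and only the moderator's synthesis comes back.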
The Stack: Next.js 15 + AI SDK
We chose Next.js 15 for its robust handling of streaming responses and the new React Server Components (RSC) architecture. This lets us keep our heavy agent logic, including the Server-Sent Events pipeline, securely on the server while pushing only lightweight UI updates to the client.
Why Not Python/LangChain?
We considered a Python backend (FastAPI + LangChain). While excellent for data science, it introduces a "State Gap" between the frontend and the agent logic. You end up managing generic WebSocket connections and trying to sync state across two different languages.
With Next.js 15, the Server Action is the API. The type safety extends from the database all the way to the React component prop.
The "Conflict Engine" Architecture
The core of AI Boss Battle is what we call the Conflict Engine. It's not a single LLM call; it's a choreographed dance of three distinct personalities (see our guide on Prompt Engineering the Aggressor).
The Roster
- The Aggressor (Devil):
- System Prompt: "You are a ruthless critic. Find holes. Be mean. Ignore feelings."
- Model: GPT-4o-mini (High Temperature). We need creativity and "hallucination-adjacent" aggression here.
- The Defender (Shield):
- System Prompt: "You are a PR crisis manager. Spin the positives. Defend the user's intent."
- Model: Claude-3-Haiku. Anthropic models tend to be naturally more nuanced and "ethical," making them perfect defenders.
- The Moderator (Judge):
- System Prompt: "You are a Supreme Court Judge. Listen to the Aggressor and Defender. Issue a final, binding ruling."
- Model: GPT-4o (Low Temperature). We need absolute determinism and logic here.
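Concretely, the roster can be captured as a single config object. The values below mirror the descriptions above; the object shape and the exact temperature numbers are illustrative assumptions:

```typescript
// Illustrative roster config. Model names and system prompts come from
// the roles described above; the temperatures are assumed values.
const ROSTER = {
  aggressor: {
    model: 'gpt-4o-mini',
    temperature: 0.9, // high: creative, "hallucination-adjacent" aggression
    system: 'You are a ruthless critic. Find holes. Be mean. Ignore feelings.',
  },
  defender: {
    model: 'claude-3-haiku',
    temperature: 0.4, // moderate: nuanced, steady defense
    system: "You are a PR crisis manager. Spin the positives. Defend the user's intent.",
  },
  moderator: {
    model: 'gpt-4o',
    temperature: 0.1, // low: a deterministic, binding ruling
    system: 'You are a Supreme Court Judge. Listen to the Aggressor and Defender. Issue a final, binding ruling.',
  },
} as const;
```

Keeping the roster in one object makes it trivial to swap models per role without touching the orchestration code.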
```mermaid
sequenceDiagram
    participant U as User
    participant A as Aggressor (Red)
    participant D as Defender (Blue)
    participant M as Moderator (Purple)
    U->>A: Uploads Draft
    U->>D: Uploads Draft
    par Parallel Processing
        A->>U: Streams Critique (SSE)
        D->>U: Streams Defense (SSE)
    end
    Note over U: User sees live argument
    A->>M: Sends Final Argument
    D->>M: Sends Final Argument
    M->>U: Synthesizes Final Version
```
Implementation Details: Parallel Streams
In earlier versions of Next.js, managing three concurrent streams while maintaining a shared state was a nightmare of useEffect hooks and WebSocket servers.
With the Vercel AI SDK's streamText, we can trigger multiple streams simultaneously.
```typescript
// app/actions/battle.ts
'use server';

import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { anthropic } from '@ai-sdk/anthropic';
import { createStreamableValue } from 'ai/rsc';

// Bridges a model's text stream into a streamable value that can cross
// the Server Action boundary (raw Response objects are not serializable
// as Server Action return values).
function toStreamable(result: { textStream: AsyncIterable<string> }) {
  const streamable = createStreamableValue('');
  (async () => {
    for await (const delta of result.textStream) {
      streamable.update(delta);
    }
    streamable.done();
  })();
  return streamable.value;
}

export async function startBattle(content: string) {
  // 1. Start the Aggressor stream
  const aggressor = streamText({
    model: openai('gpt-4o-mini'),
    temperature: 0.9,
    system: 'You are the Aggressor...',
    prompt: content,
  });

  // 2. Start the Defender stream
  const defender = streamText({
    model: anthropic('claude-3-haiku-20240307'),
    temperature: 0.4,
    system: 'You are the Defender...',
    prompt: content,
  });

  // Return both streams to the client (consumed by our custom hook)
  return {
    aggressor: toStreamable(aggressor),
    defender: toStreamable(defender),
  };
}
```
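The action above only produces the two adversarial streams. The Moderator's synthesis step can be sketched separately; here the model call is injected as a plain function so the orchestration is testable without an API key. In production that "complete" parameter would wrap a low-temperature gpt-4o call. This is a sketch, not our exact implementation:

```typescript
// The Moderator step, with the model call injected so the orchestration
// can be exercised without hitting an API. In the real action, `complete`
// would wrap the AI SDK's generateText with gpt-4o at low temperature.
type Complete = (system: string, prompt: string) => Promise<string>;

async function issueVerdict(
  draft: string,
  critique: string,
  defense: string,
  complete: Complete,
): Promise<string> {
  const system =
    'You are a Supreme Court Judge. Listen to the Aggressor and Defender. Issue a final, binding ruling.';
  // The Moderator sees the draft plus both final arguments, which is why
  // its context window (and cost) is larger than either adversary's.
  const prompt = `DRAFT:\n${draft}\n\nCRITIQUE:\n${critique}\n\nDEFENSE:\n${defense}`;
  return complete(system, prompt);
}
```

Injecting the call also makes it easy to stub the Judge in tests or swap the moderator model later without touching the prompt assembly.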
The useBattle Custom Hook
On the client, we built a custom hook to consume these streams. This is where the magic happens. We don't just dump text into a <div>; we parse the stream for specific "Emotions" or "Actions."
If the Aggressor generates the token [SCREAM], the hook triggers a screen shake animation.
```typescript
// hooks/useBattle.ts
'use client';

import { useState } from 'react';
import { readStreamableValue, type StreamableValue } from 'ai/rsc';

export function useBattle() {
  const [messages, setMessages] = useState<string[]>([]);

  // Consumes one agent's stream; triggerShake/playSound live elsewhere
  const handleStream = async (stream: StreamableValue<string>) => {
    for await (const text of readStreamableValue(stream)) {
      if (!text) continue;

      // Detecting "Action Tokens" in the stream
      if (text.includes('[ATTACK]')) triggerShake();
      if (text.includes('[LAUGH]')) playSound('evil-laugh');

      setMessages(prev => [...prev, text]);
    }
  };
  // ...
}
```
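The substring checks above work, but a small parser that splits each chunk into visible text and action tokens keeps the tokens from leaking into the rendered transcript. A sketch, where the token vocabulary is an assumption:

```typescript
// Splits a streamed chunk into visible text and "action tokens" such as
// [ATTACK] or [LAUGH]. The exact token vocabulary here is an assumption.
const ACTION_RE = /\[(ATTACK|LAUGH|SCREAM)\]/g;

function parseChunk(chunk: string): { text: string; actions: string[] } {
  // Collect every action token in the chunk, in order of appearance.
  const actions = [...chunk.matchAll(ACTION_RE)].map(m => m[1]);
  // Strip the tokens so they never render inside the chat bubble.
  const text = chunk.replace(ACTION_RE, '');
  return { text, actions };
}
```

The hook can then dispatch each action to the right effect (shake, sound) and append only the cleaned text to the message list.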
Cost Analysis and Unit Economics
Running three LLM calls per user interaction sounds expensive. Is it?
Let's look at the math for a typical 1000-token blog post battle.
- Aggressor (GPT-4o-mini):
- Input: 1k tokens ($0.00015)
- Output: 500 tokens ($0.0003)
- Total: $0.00045
- Defender (Claude-3-Haiku):
- Input: 1k tokens ($0.00025)
- Output: 500 tokens ($0.000625)
- Total: $0.000875
- Moderator (GPT-4o):
- Input: 2k tokens (Context window) ($0.01)
- Output: 1000 tokens ($0.015)
- Total: $0.025
Total Cost per Battle: ~$0.03.
If we charge users $1.99 per battle (micropayment model) or $20/mo (SaaS model), the gross margins are roughly 98%.
People assume "Multi-Agent" means "Multi-Dollar." But with the recent price collapse of intelligent-enough models (Mini/Haiku), agentic swarms are now economically viable for consumer apps.
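The arithmetic above is easy to codify. A back-of-envelope helper, using assumed per-million-token rates that will inevitably drift as providers reprice:

```typescript
// Back-of-envelope cost model for one battle. Rates are in dollars per
// 1M tokens and are assumptions frozen at time of writing; treat them
// as illustrative, not authoritative.
const RATES = {
  'gpt-4o-mini': { in: 0.15, out: 0.6 },
  'claude-3-haiku': { in: 0.25, out: 1.25 },
  'gpt-4o': { in: 5.0, out: 15.0 },
} as const;

type Model = keyof typeof RATES;

function callCost(model: Model, inputTokens: number, outputTokens: number): number {
  const r = RATES[model];
  return (inputTokens * r.in + outputTokens * r.out) / 1_000_000;
}

// One battle: two adversaries plus one moderator pass.
const battleCost =
  callCost('gpt-4o-mini', 1000, 500) +
  callCost('claude-3-haiku', 1000, 500) +
  callCost('gpt-4o', 2000, 1000);
```

Running this against the token counts above lands in the few-cents-per-battle range; the Moderator's larger context window dominates the bill.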
State Management: "The War Room"
We treat the "Battle" state not as a standard CRUD object, but as a Game State.
- Round 1: Opening Statements (Aggressor strikes first).
- Round 2: Rebuttals (Defender counters).
- Round 3: Verdict (Moderator synthesizes).
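Those three rounds form a tiny state machine. A framework-free sketch, with illustrative phase names:

```typescript
// Minimal sketch of the battle phase machine. Phase names are
// illustrative; the real app keeps this state in a zustand store.
type Phase = 'opening' | 'rebuttal' | 'verdict' | 'done';

const NEXT: Record<Phase, Phase> = {
  opening: 'rebuttal', // Aggressor strikes first
  rebuttal: 'verdict', // Defender counters
  verdict: 'done',     // Moderator synthesizes
  done: 'done',        // terminal state
};

function advance(phase: Phase): Phase {
  return NEXT[phase];
}
```

Because transitions are a pure lookup, the same function can drive UI animations on the client and checkpoint logic on the server.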
We use zustand on the client to track these phases and trigger the appropriate animations. When the Aggressor "speaks," the camera pans to the red corner. When the Defender retorts, it pans blue.
This state machine also handles connection recovery. If the user disconnects in Round 2, the server (via Redis) remembers the state. When they reconnect, we "replay" the battle from the last known checkpoint.
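The checkpoint itself can be as simple as a JSON snapshot keyed by battle ID. A sketch of the shape, where the field names are assumptions and the real store lives in Redis:

```typescript
// Sketch of a battle checkpoint. Field names are assumptions; in
// production this JSON blob would be written to Redis after each round.
interface Checkpoint {
  battleId: string;
  phase: 'opening' | 'rebuttal' | 'verdict';
  transcript: string[]; // everything streamed so far, in order
}

function serialize(cp: Checkpoint): string {
  return JSON.stringify(cp);
}

function restore(raw: string): Checkpoint {
  return JSON.parse(raw) as Checkpoint;
}
```

On reconnect, the client replays the stored transcript instantly, then resumes live streaming from the recorded phase.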
""We stopped treating LLMs as APIs and started treating them as NPCs. That mental shift changed everything."
"
Conclusion: The Future is Multi-Agent
Single-agent systems are hitting a ceiling. To solve complex problems, or even just to edit a text file properly, you need a diversity of viewpoints. Next.js 15 provides the perfect playground to orchestrate these digital arguments.
The combination of Server Actions (for logic), AI SDK (for streaming), and Zustand (for game state) creates a powerful trinity for building "Agentic Native" applications.
If you want to see the code in action, check out the GitHub Repo or just Start a Battle right now.
Read Next

Rendering User Markdown Safely in 2026: The Sandbox Strategy
How to let users upload Markdown files without opening yourself up to XSS attacks. A guide to MDX-Remote, sanitization, and the Shadow DOM.

Why Server-Sent Events (SSE) Beat WebSockets for Real-Time AI Streaming
WebSockets are overkill. For unidirectional AI text streams, SSE is lighter, faster, and easier to debug. Here is why we chose it for the Arena.

The AI File Battle Manifesto: Why We Built Fight Club for Text
The official launch post. Why the "Helpful Assistant" model is broken, and why the future of creative work belongs to adversarial agents.