## Quick Answer
Use any OpenAI-compatible API (OpenAI, Claude, Assisters) with the `openai` npm package. Stream responses via Server-Sent Events, store conversation history in Postgres, and add function calling for tool use.
- Streaming feels ~5x faster even at the same total latency
- Store every message for debugging and fine-tuning
- Rate-limit per user to prevent abuse
## What You'll Need
- Next.js 15+ app or any Node backend
- OpenAI-compatible API key (Assisters recommended for self-hosted)
- Postgres or Supabase for history
- Vercel AI SDK or raw `openai` client
## Steps
1. **Install dependencies.** `pnpm add openai ai @ai-sdk/openai`
2. **Configure the client.**

   ```ts
   import OpenAI from 'openai';

   const ai = new OpenAI({
     baseURL: 'https://assisters.dev/api/v1',
     apiKey: process.env.ASSISTERS_API_KEY!,
   });
   ```

3. **Create a streaming endpoint.** In `app/api/chat/route.ts`:

   ```ts
   const stream = await ai.chat.completions.create({
     model: 'assisters-chat-v1',
     messages,
     stream: true,
   });
   return new Response(stream.toReadableStream());
   ```

4. **Build the UI.** Use the Vercel AI SDK's `useChat` hook.
5. **Persist messages.** On each exchange, insert into a `messages` table keyed by `conversation_id`.
6. **Add function calling.** Define tools (search the DB, call an API); the model decides when to invoke them.
7. **Moderate input and output.** Call the `/moderate` endpoint before responding.
8. **Rate limit.** Use `@upstash/ratelimit` or self-hosted Redis: 20 messages/min per user.
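Step 8's policy can be sketched in-process before reaching for Redis. A minimal sliding-window limiter, assuming a single-server deployment (`allowMessage` is a hypothetical helper, not part of `@upstash/ratelimit`):

```typescript
// Minimal in-memory sliding-window rate limiter: a sketch of step 8's
// 20 msg/min policy for a single-process server. Use Redis/Upstash in
// production so counts survive restarts and scale across instances.
const WINDOW_MS = 60_000;
const LIMIT = 20;

const hits = new Map<string, number[]>(); // userId -> request timestamps

export function allowMessage(userId: string, now = Date.now()): boolean {
  // Keep only timestamps still inside the window.
  const recent = (hits.get(userId) ?? []).filter((t) => now - t < WINDOW_MS);
  if (recent.length >= LIMIT) {
    hits.set(userId, recent);
    return false; // over budget: respond with HTTP 429 upstream
  }
  recent.push(now);
  hits.set(userId, recent);
  return true;
}
```

Call `allowMessage(userId)` at the top of the route handler and return a 429 when it is false.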
## Common Mistakes
- **Skipping moderation.** A single jailbreak screenshot destroys trust.
- **Infinite context.** Truncate history to the last 20 messages plus a summary of older ones.
- **No retry logic.** Network blips kill UX. Use exponential backoff.
- **Exposing your API key in the client.** Always proxy through your server.
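The truncation and retry fixes above can be sketched in a few lines. `truncateHistory` and `retryWithBackoff` are hypothetical helpers under stated assumptions, not SDK APIs:

```typescript
type Msg = { role: 'system' | 'user' | 'assistant'; content: string };

// Keep the last `keep` messages and fold older ones into a summary stub.
// (In production, generate the summary with a cheap model call instead.)
export function truncateHistory(history: Msg[], keep = 20): Msg[] {
  if (history.length <= keep) return history;
  const older = history.slice(0, history.length - keep);
  const summary: Msg = {
    role: 'system',
    content: `Summary of ${older.length} earlier messages.`,
  };
  return [summary, ...history.slice(-keep)];
}

// Retry a flaky async call with exponential backoff: 100ms, 200ms, 400ms...
export async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseMs = 100,
): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      await new Promise((r) => setTimeout(r, baseMs * 2 ** i));
    }
  }
  throw lastErr;
}
```

Wrap the completion call in `retryWithBackoff` and pass `truncateHistory(messages)` instead of the raw history.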
## Top Tools
| Tool | Use |
|------|-----|
| Vercel AI SDK | Chat UI primitives |
| Assisters | OpenAI-compatible gateway |
| Supabase | History + auth |
| Langfuse | Observability |
| Upstash / Redis | Rate limiting |
## FAQs
**Which model should I use?** Start with `assisters-chat-v1` — cheaper than GPT, comparable quality.
**How much does it cost?** $5-50/mo for a low-volume chatbot. Scales linearly with usage.
**Can I fine-tune?** Yes — see our next article on fine-tuning.
**Does it work on mobile?** Next.js PWA or React Native with EventSource polyfill.
**How do I handle long conversations?** Summarize the first half every 20 turns.
**What about function calling safety?** Always confirm destructive actions with the user before executing.
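The confirm-before-execute rule above can be sketched as a gate in front of the tool dispatcher. The tool names and the `destructive` flag here are illustrative, not an SDK convention:

```typescript
type ToolSpec = { name: string; destructive: boolean };

// Hypothetical tool registry; mark anything irreversible as destructive.
const tools: ToolSpec[] = [
  { name: 'search_docs', destructive: false },
  { name: 'delete_account', destructive: true },
];

// Returns 'execute' for safe tools, 'confirm' when the user must approve
// a destructive action first, and 'reject' for unknown tool names.
export function gateToolCall(
  name: string,
  userConfirmed = false,
): 'execute' | 'confirm' | 'reject' {
  const tool = tools.find((t) => t.name === name);
  if (!tool) return 'reject';
  if (tool.destructive && !userConfirmed) return 'confirm';
  return 'execute';
}
```

On a `'confirm'` result, surface a confirmation prompt in the chat UI and re-run the call with `userConfirmed: true` only after the user approves.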
## Conclusion
A production chatbot is a weekend project in 2026 with OpenAI-compatible APIs and the Vercel AI SDK. Self-host the model gateway (Assisters) to control costs and data. Try [Misar Dev](https://misar.dev) to generate the entire scaffold from a prompt.