Large Language Models (LLMs) are transforming how we build web applications. In this comprehensive guide, you’ll learn practical, production-ready techniques for integrating AI into your JavaScript applications, using tools and APIs that work today.
We’ll cover everything from simple API calls to advanced streaming responses, cost optimization, and building real-world features like chatbots, content generators, and smart assistants.
Table of Contents
- Why Integrate LLMs into Web Apps
- Available LLM Providers
- Getting Started with OpenAI
- Using the Vercel AI SDK
- Building a Smart Chatbot
- Streaming Responses for Better UX
- Working with Different LLM Providers
- Client-Side vs Server-Side Integration
- Advanced Patterns and Best Practices
- Cost Optimization Strategies
- Security and Rate Limiting
- Production-Ready Examples
Why Integrate LLMs into Web Apps
What You Can Build
- Smart Chatbots: Customer support, sales assistants, FAQ bots
- Content Generation: Blog posts, product descriptions, marketing copy
- Code Assistants: Code completion, debugging help, documentation
- Data Analysis: Extract insights, summarize reports, analyze trends
- Personalization: Tailored recommendations, dynamic content
- Translation: Multi-language support with context awareness
- Search Enhancement: Semantic search, Q&A over your data
Real Benefits
- Enhanced User Experience: Conversational interfaces, instant help
- Automation: Reduce manual work, scale support
- Personalization: Adapt to individual user needs
- Innovation: Build features that weren’t possible before
Available LLM Providers
Commercial APIs (Production-Ready)
OpenAI (GPT-4, GPT-4 Turbo, GPT-3.5)
- Best quality, most expensive
- Excellent for complex reasoning
- Strong developer ecosystem
- API: https://api.openai.com/v1/chat/completions
Anthropic Claude (Claude 3 Opus, Sonnet, Haiku)
- Great quality, competitive pricing
- Longer context windows (200K tokens)
- Strong at analysis and writing
- API: https://api.anthropic.com/v1/messages
Google Gemini (Pro, Ultra)
- Multimodal capabilities
- Free tier available
- Good performance
- API: https://generativelanguage.googleapis.com/v1/models
Groq
- Ultra-fast inference (among the fastest available)
- Very cheap
- Limited model selection, but excellent speed
- API: https://api.groq.com/openai/v1/chat/completions
Together AI
- Open-source models
- Affordable pricing
- Good variety (Llama, Mixtral, etc.)
- API: https://api.together.xyz/v1/chat/completions
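Note that Groq and Together expose OpenAI-compatible endpoints, so a single client library can target any of them by swapping the base URL. A minimal sketch (the helper name is ours; the endpoints are the ones listed above):

```javascript
// Map a provider name to its OpenAI-compatible base URL.
function baseURLFor(provider) {
  const urls = {
    openai: 'https://api.openai.com/v1',
    groq: 'https://api.groq.com/openai/v1',
    together: 'https://api.together.xyz/v1',
  };
  if (!urls[provider]) throw new Error(`Unsupported provider: ${provider}`);
  return urls[provider];
}

// Usage sketch (assumes the openai npm package is installed):
// const client = new OpenAI({ baseURL: baseURLFor('groq'), apiKey: process.env.GROQ_API_KEY });
```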
Self-Hosted Options
Ollama (Local development)
- Run models on your machine
- No API costs
- Privacy-first
- Great for development
LM Studio (Local GUI)
- User-friendly interface
- Download and run models locally
- Compatible with OpenAI API format
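Because Ollama and LM Studio speak the OpenAI API format, a local model can be called with plain fetch. A sketch, assuming Ollama is running on its default port (11434) with a model such as llama3 already pulled:

```javascript
// Build an OpenAI-compatible chat payload for a local Ollama server.
function buildOllamaRequest(model, message) {
  return {
    url: 'http://localhost:11434/v1/chat/completions', // Ollama's default endpoint
    body: {
      model,
      messages: [{ role: 'user', content: message }],
    },
  };
}

async function chatWithOllama(message) {
  const { url, body } = buildOllamaRequest('llama3', message);
  const res = await fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```

No API key is needed locally, which makes this a cheap way to develop before switching the URL to a hosted provider.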
Getting Started with OpenAI
Installation
npm install openai
Basic Usage (Node.js)
import OpenAI from 'openai';
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY
});
async function chat(message) {
const completion = await openai.chat.completions.create({
model: "gpt-4-turbo-preview",
messages: [
{
role: "system",
content: "You are a helpful assistant."
},
{
role: "user",
content: message
}
],
temperature: 0.7,
max_tokens: 500
});
return completion.choices[0].message.content;
}
// Usage
const response = await chat("Explain how async/await works in JavaScript");
console.log(response);
Browser-Safe Implementation (Via Backend)
Never expose your API key in client-side code! Always proxy through your backend:
// frontend.js
async function askAI(question) {
const response = await fetch('/api/chat', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({ message: question })
});
const data = await response.json();
return data.response;
}
// Usage
const answer = await askAI('What is machine learning?');
console.log(answer);
// backend/api/chat.js (Express)
import OpenAI from 'openai';
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY
});
app.post('/api/chat', async (req, res) => {
try {
const { message } = req.body;
const completion = await openai.chat.completions.create({
model: "gpt-3.5-turbo", // Cheaper for simple queries
messages: [{ role: "user", content: message }],
max_tokens: 300
});
res.json({
response: completion.choices[0].message.content
});
} catch (error) {
console.error('OpenAI error:', error);
res.status(500).json({ error: 'Failed to generate response' });
}
});
Using the Vercel AI SDK
The Vercel AI SDK provides a unified interface for multiple LLM providers with built-in streaming support.
Installation
npm install ai @ai-sdk/openai
Basic Example (Next.js App Router)
// app/api/chat/route.js
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';
export async function POST(req) {
const { messages } = await req.json();
const result = await streamText({
model: openai('gpt-4-turbo'),
messages,
});
return result.toAIStreamResponse();
}
// app/page.js
'use client';
import { useChat } from 'ai/react';
export default function Chat() {
const { messages, input, handleInputChange, handleSubmit } = useChat();
return (
<div className="chat-container">
<div className="messages">
{messages.map(m => (
<div key={m.id} className={`message ${m.role}`}>
<strong>{m.role}:</strong> {m.content}
</div>
))}
</div>
<form onSubmit={handleSubmit}>
<input
value={input}
onChange={handleInputChange}
placeholder="Ask me anything..."
/>
<button type="submit">Send</button>
</form>
</div>
);
}
That’s it! The useChat hook handles all the complexity:
- Streaming responses
- Message history
- Loading states
- Error handling
Building a Smart Chatbot
Let’s build a production-ready chatbot with conversation history, context, and memory.
Backend Implementation
// chatbot.js
import OpenAI from 'openai';
class SmartChatbot {
constructor(systemPrompt, options = {}) {
this.openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY
});
this.systemPrompt = systemPrompt;
this.model = options.model || 'gpt-3.5-turbo';
this.temperature = options.temperature ?? 0.7; // ?? preserves an explicit 0
this.maxTokens = options.maxTokens ?? 500;
// Store conversations in memory (use database in production)
this.conversations = new Map();
}
getConversation(userId) {
if (!this.conversations.has(userId)) {
this.conversations.set(userId, [
{ role: 'system', content: this.systemPrompt }
]);
}
return this.conversations.get(userId);
}
async chat(userId, message) {
const conversation = this.getConversation(userId);
// Add user message
conversation.push({
role: 'user',
content: message
});
// Keep the system prompt plus the last few turns to control costs
const recentMessages = [conversation[0], ...conversation.slice(1).slice(-9)];
try {
const completion = await this.openai.chat.completions.create({
model: this.model,
messages: recentMessages,
temperature: this.temperature,
max_tokens: this.maxTokens
});
const response = completion.choices[0].message.content;
// Add assistant response to history
conversation.push({
role: 'assistant',
content: response
});
return {
response,
usage: completion.usage
};
} catch (error) {
console.error('Chat error:', error);
throw error;
}
}
clearHistory(userId) {
this.conversations.delete(userId);
}
async streamChat(userId, message) {
const conversation = this.getConversation(userId);
conversation.push({
role: 'user',
content: message
});
const stream = await this.openai.chat.completions.create({
model: this.model,
messages: [conversation[0], ...conversation.slice(1).slice(-9)],
temperature: this.temperature,
max_tokens: this.maxTokens,
stream: true
});
return stream;
}
}
// Usage
const supportBot = new SmartChatbot(
`You are a helpful customer support agent for TechCorp.
Be friendly, concise, and always try to solve the customer's problem.
If you don't know something, admit it and offer to escalate.`,
{
model: 'gpt-3.5-turbo',
temperature: 0.7
}
);
export default supportBot;
Express API Routes
// routes/chat.js
import express from 'express';
import supportBot from './chatbot.js';
const router = express.Router();
router.post('/message', async (req, res) => {
try {
const { userId, message } = req.body;
if (!userId || !message) {
return res.status(400).json({ error: 'Missing userId or message' });
}
const result = await supportBot.chat(userId, message);
res.json({
response: result.response,
tokensUsed: result.usage.total_tokens
});
} catch (error) {
res.status(500).json({ error: 'Failed to process message' });
}
});
router.post('/clear', (req, res) => {
const { userId } = req.body;
supportBot.clearHistory(userId);
res.json({ success: true });
});
export default router;
Frontend Chat UI
// chat-ui.js
class ChatUI {
constructor(containerId, userId) {
this.container = document.getElementById(containerId);
this.userId = userId;
this.messages = [];
this.render();
}
render() {
this.container.innerHTML = `
<div class="chat-window">
<div class="messages" id="messages"></div>
<div class="input-area">
<input
type="text"
id="messageInput"
placeholder="Type your message..."
/>
<button id="sendBtn">Send</button>
</div>
</div>
`;
this.messagesContainer = document.getElementById('messages');
this.input = document.getElementById('messageInput');
this.sendBtn = document.getElementById('sendBtn');
this.sendBtn.addEventListener('click', () => this.sendMessage());
this.input.addEventListener('keypress', (e) => {
if (e.key === 'Enter') this.sendMessage();
});
}
addMessage(role, content) {
const messageEl = document.createElement('div');
messageEl.className = `message ${role}`;
messageEl.innerHTML = `
<div class="message-content">${this.formatMessage(content)}</div>
`;
this.messagesContainer.appendChild(messageEl);
this.messagesContainer.scrollTop = this.messagesContainer.scrollHeight;
}
formatMessage(content) {
// Escape HTML first so model output can't inject markup, then apply
// simple markdown-like formatting
const escaped = content
.replace(/&/g, '&amp;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;');
return escaped
.replace(/\*\*(.*?)\*\*/g, '<strong>$1</strong>')
.replace(/\n/g, '<br>');
}
showTyping() {
const typingEl = document.createElement('div');
typingEl.className = 'message assistant typing';
typingEl.id = 'typing-indicator';
typingEl.innerHTML = '<div class="dots"><span></span><span></span><span></span></div>';
this.messagesContainer.appendChild(typingEl);
}
hideTyping() {
const typingEl = document.getElementById('typing-indicator');
if (typingEl) typingEl.remove();
}
async sendMessage() {
const message = this.input.value.trim();
if (!message) return;
// Add user message
this.addMessage('user', message);
this.input.value = '';
this.sendBtn.disabled = true;
// Show typing indicator
this.showTyping();
try {
const response = await fetch('/api/chat/message', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
userId: this.userId,
message: message
})
});
const data = await response.json();
this.hideTyping();
this.addMessage('assistant', data.response);
} catch (error) {
this.hideTyping();
this.addMessage('system', 'Sorry, something went wrong. Please try again.');
} finally {
this.sendBtn.disabled = false;
this.input.focus();
}
}
}
// Initialize
const chat = new ChatUI('chatContainer', 'user-123');
Streaming Responses for Better UX
Streaming provides a much better user experience by showing responses as they’re generated, like ChatGPT.
Server-Side Streaming (Node.js)
// api/stream-chat.js
import OpenAI from 'openai';
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY
});
export default async function streamChatHandler(req, res) {
const { message } = req.body;
// Set headers for SSE (Server-Sent Events)
res.setHeader('Content-Type', 'text/event-stream');
res.setHeader('Cache-Control', 'no-cache');
res.setHeader('Connection', 'keep-alive');
try {
const stream = await openai.chat.completions.create({
model: 'gpt-3.5-turbo',
messages: [{ role: 'user', content: message }],
stream: true
});
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content || '';
if (content) {
// Send each chunk as SSE
res.write(`data: ${JSON.stringify({ content })}\n\n`);
}
}
res.write('data: [DONE]\n\n');
res.end();
} catch (error) {
console.error('Streaming error:', error);
res.write(`data: ${JSON.stringify({ error: 'Stream failed' })}\n\n`);
res.end();
}
}
Client-Side Streaming Consumer
async function streamChat(message, onChunk, onComplete) {
const response = await fetch('/api/stream-chat', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({ message })
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n\n');
buffer = lines.pop(); // Keep incomplete line in buffer
for (const line of lines) {
if (line.startsWith('data: ')) {
const data = line.slice(6);
if (data === '[DONE]') {
onComplete();
return;
}
try {
const parsed = JSON.parse(data);
onChunk(parsed.content);
} catch (e) {
console.error('Parse error:', e);
}
}
}
}
}
// Usage
const messageDiv = document.getElementById('response');
let fullResponse = '';
await streamChat(
'Explain quantum computing',
(chunk) => {
// Called for each chunk
fullResponse += chunk;
messageDiv.textContent = fullResponse;
},
() => {
// Called when complete
console.log('Streaming complete');
}
);
Streaming with Vercel AI SDK (Easiest)
// Using useChat hook - streaming is automatic!
import { useChat } from 'ai/react';
export default function StreamingChat() {
const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat({
api: '/api/chat'
});
return (
<div>
{messages.map(m => (
<div key={m.id}>
<strong>{m.role}:</strong> {m.content}
</div>
))}
<form onSubmit={handleSubmit}>
<input
value={input}
onChange={handleInputChange}
disabled={isLoading}
/>
<button type="submit" disabled={isLoading}>
{isLoading ? 'Sending...' : 'Send'}
</button>
</form>
</div>
);
}
Working with Different LLM Providers
Anthropic Claude
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic({
apiKey: process.env.ANTHROPIC_API_KEY,
});
async function chatWithClaude(message) {
const response = await anthropic.messages.create({
model: 'claude-3-sonnet-20240229',
max_tokens: 1024,
messages: [
{ role: 'user', content: message }
],
});
return response.content[0].text;
}
// Streaming
async function streamClaude(message) {
const stream = await anthropic.messages.create({
model: 'claude-3-sonnet-20240229',
max_tokens: 1024,
messages: [{ role: 'user', content: message }],
stream: true,
});
for await (const event of stream) {
if (event.type === 'content_block_delta') {
console.log(event.delta.text);
}
}
}
Google Gemini
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY);
async function chatWithGemini(message) {
const model = genAI.getGenerativeModel({ model: 'gemini-pro' });
const result = await model.generateContent(message);
const response = await result.response;
return response.text();
}
// Streaming
async function streamGemini(message) {
const model = genAI.getGenerativeModel({ model: 'gemini-pro' });
const result = await model.generateContentStream(message);
for await (const chunk of result.stream) {
const chunkText = chunk.text();
console.log(chunkText);
}
}
Groq (Ultra-Fast)
import Groq from 'groq-sdk';
const groq = new Groq({
apiKey: process.env.GROQ_API_KEY
});
async function chatWithGroq(message) {
const completion = await groq.chat.completions.create({
messages: [
{ role: 'user', content: message }
],
model: 'mixtral-8x7b-32768', // Very fast!
});
return completion.choices[0].message.content;
}
Provider Abstraction Layer
// llm-provider.js
class LLMProvider {
constructor(provider, apiKey) {
this.provider = provider;
this.apiKey = apiKey;
this.client = this.initializeClient();
}
initializeClient() {
switch (this.provider) {
case 'openai':
return new OpenAI({ apiKey: this.apiKey });
case 'anthropic':
return new Anthropic({ apiKey: this.apiKey });
case 'groq':
return new Groq({ apiKey: this.apiKey });
default:
throw new Error(`Unsupported provider: ${this.provider}`);
}
}
async chat(message, options = {}) {
switch (this.provider) {
case 'openai':
return this.chatOpenAI(message, options);
case 'anthropic':
return this.chatAnthropic(message, options);
case 'groq':
return this.chatGroq(message, options);
}
}
async chatOpenAI(message, options) {
const completion = await this.client.chat.completions.create({
model: options.model || 'gpt-3.5-turbo',
messages: [{ role: 'user', content: message }],
...options
});
return completion.choices[0].message.content;
}
async chatAnthropic(message, options) {
const response = await this.client.messages.create({
model: options.model || 'claude-3-sonnet-20240229',
max_tokens: options.max_tokens || 1024,
messages: [{ role: 'user', content: message }]
});
return response.content[0].text;
}
async chatGroq(message, options) {
const completion = await this.client.chat.completions.create({
model: options.model || 'mixtral-8x7b-32768',
messages: [{ role: 'user', content: message }]
});
return completion.choices[0].message.content;
}
}
// Usage
const llm = new LLMProvider('openai', process.env.OPENAI_API_KEY);
const response = await llm.chat('Hello!');
// Easy to switch providers
const groqLLM = new LLMProvider('groq', process.env.GROQ_API_KEY);
const fastResponse = await groqLLM.chat('Hello!');
Client-Side vs Server-Side Integration
❌ Don’t Do This (Client-Side API Key)
// NEVER do this!
const openai = new OpenAI({
apiKey: 'sk-...' // Exposed to everyone!
});
✅ Do This (Proxy Through Backend)
Frontend:
// frontend.js
async function askAI(question) {
const response = await fetch('/api/ask', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ question })
});
return await response.json();
}
Backend:
// backend.js
app.post('/api/ask', async (req, res) => {
// API key is safe on server
const completion = await openai.chat.completions.create({
model: 'gpt-3.5-turbo',
messages: [{ role: 'user', content: req.body.question }]
});
res.json({ answer: completion.choices[0].message.content });
});
Advanced Patterns and Best Practices
1. Function Calling (Tool Use)
Let the LLM call functions in your code:
async function chatWithFunctions(message) {
const functions = [
{
name: 'get_weather',
description: 'Get the current weather in a location',
parameters: {
type: 'object',
properties: {
location: {
type: 'string',
description: 'City name, e.g., San Francisco'
},
unit: {
type: 'string',
enum: ['celsius', 'fahrenheit']
}
},
required: ['location']
}
}
];
const completion = await openai.chat.completions.create({
model: 'gpt-4-turbo',
messages: [{ role: 'user', content: message }],
functions: functions,
function_call: 'auto'
});
const responseMessage = completion.choices[0].message;
// Check if the model wants to call a function
if (responseMessage.function_call) {
const functionName = responseMessage.function_call.name;
const functionArgs = JSON.parse(responseMessage.function_call.arguments);
// Call your actual function
let functionResponse;
if (functionName === 'get_weather') {
functionResponse = await getWeather(functionArgs.location, functionArgs.unit);
}
// Send function result back to model
const secondCompletion = await openai.chat.completions.create({
model: 'gpt-4-turbo',
messages: [
{ role: 'user', content: message },
responseMessage,
{
role: 'function',
name: functionName,
content: JSON.stringify(functionResponse)
}
]
});
return secondCompletion.choices[0].message.content;
}
return responseMessage.content;
}
// Weather function
async function getWeather(location, unit = 'celsius') {
// Call weather API
return {
location,
temperature: 22,
unit,
condition: 'sunny'
};
}
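Note that newer OpenAI API versions deprecate functions/function_call in favor of tools/tool_choice. The same schema, reshaped for the newer parameter (a sketch; verify against your SDK version):

```javascript
// The get_weather schema above, expressed with the newer `tools` parameter.
const tools = [
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get the current weather in a location',
      parameters: {
        type: 'object',
        properties: {
          location: { type: 'string', description: 'City name, e.g., San Francisco' },
          unit: { type: 'string', enum: ['celsius', 'fahrenheit'] },
        },
        required: ['location'],
      },
    },
  },
];

// Pass `{ tools, tool_choice: 'auto' }` in the request; calls come back
// on `message.tool_calls` instead of `message.function_call`.
```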
2. Prompt Templates
class PromptTemplate {
constructor(template) {
this.template = template;
}
format(variables) {
let result = this.template;
for (const [key, value] of Object.entries(variables)) {
result = result.replace(new RegExp(`{{${key}}}`, 'g'), value);
}
return result;
}
}
// Usage
const summarizeTemplate = new PromptTemplate(`
Summarize the following text in {{max_words}} words or less:
Text: {{text}}
Summary:
`);
const prompt = summarizeTemplate.format({
text: 'Long article here...',
max_words: 50
});
const summary = await openai.chat.completions.create({
model: 'gpt-3.5-turbo',
messages: [{ role: 'user', content: prompt }]
});
3. Conversation Memory with Context Window Management
class ConversationManager {
constructor(maxTokens = 4000) {
this.maxTokens = maxTokens;
this.messages = [];
}
addMessage(role, content) {
this.messages.push({ role, content });
this.trimToTokenLimit();
}
trimToTokenLimit() {
// Rough estimation: 1 token ≈ 4 characters
let totalChars = this.messages.reduce((sum, msg) =>
sum + msg.content.length, 0
);
while (totalChars > this.maxTokens * 4 && this.messages.length > 1) {
// Remove the oldest non-system message (index 0 holds the system prompt)
const removed = this.messages.splice(1, 1)[0];
totalChars -= removed.content.length;
}
}
getMessages() {
return this.messages;
}
clear() {
this.messages = [];
}
}
// Usage
const conversation = new ConversationManager(4000);
conversation.addMessage('system', 'You are a helpful assistant.');
conversation.addMessage('user', 'Hello!');
conversation.addMessage('assistant', 'Hi! How can I help?');
const completion = await openai.chat.completions.create({
model: 'gpt-3.5-turbo',
messages: conversation.getMessages()
});
4. Retry Logic with Exponential Backoff
async function callWithRetry(fn, maxRetries = 3) {
let lastError;
for (let i = 0; i < maxRetries; i++) {
try {
return await fn();
} catch (error) {
lastError = error;
// Don't retry on client errors (400s), except 429 rate limits
if (error.status >= 400 && error.status < 500 && error.status !== 429) {
throw error;
}
// Exponential backoff: 1s, 2s, 4s
const delay = Math.pow(2, i) * 1000;
console.log(`Retry ${i + 1}/${maxRetries} after ${delay}ms`);
await new Promise(resolve => setTimeout(resolve, delay));
}
}
throw lastError;
}
// Usage
const response = await callWithRetry(async () => {
return await openai.chat.completions.create({
model: 'gpt-3.5-turbo',
messages: [{ role: 'user', content: 'Hello' }]
});
});
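Timeout handling pairs naturally with retries, so a stalled request fails fast instead of hanging. A minimal sketch using AbortController (the 30-second default is our assumption; recent OpenAI SDK versions accept a signal request option, but check yours):

```javascript
// Run an async task but abort it if it exceeds `ms` milliseconds.
async function withTimeout(task, ms = 30000) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    // The task receives the signal so it can cancel in-flight work
    return await task(controller.signal);
  } finally {
    clearTimeout(timer);
  }
}

// Usage sketch (confirm your SDK version supports a signal option):
// const completion = await withTimeout((signal) =>
//   openai.chat.completions.create({ /* ... */ }, { signal }), 30000);
```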
Cost Optimization Strategies
1. Choose the Right Model
// Model pricing (approximate, per 1M tokens):
// GPT-4 Turbo: $10 input, $30 output
// GPT-3.5 Turbo: $0.50 input, $1.50 output
// Claude Sonnet: $3 input, $15 output
// Groq (Mixtral): $0.27 input, $0.27 output
function selectModel(taskComplexity) {
if (taskComplexity === 'simple') {
return 'gpt-3.5-turbo'; // Cheap and fast
} else if (taskComplexity === 'medium') {
return 'claude-3-sonnet-20240229'; // Good balance
} else {
return 'gpt-4-turbo'; // Best quality
}
}
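The per-token prices above translate directly into a spend estimate from the usage object each completion returns. A sketch using the approximate figures from the comment (they drift, so treat the table as a placeholder to refresh from the provider's pricing page):

```javascript
// Approximate prices per 1M tokens; update from the provider's pricing page.
const PRICING = {
  'gpt-4-turbo': { input: 10, output: 30 },
  'gpt-3.5-turbo': { input: 0.5, output: 1.5 },
};

// Estimate the dollar cost of one completion from its `usage` field.
function estimateCost(model, usage) {
  const price = PRICING[model];
  if (!price) throw new Error(`No pricing data for: ${model}`);
  return (
    (usage.prompt_tokens / 1e6) * price.input +
    (usage.completion_tokens / 1e6) * price.output
  );
}

// Usage: estimateCost('gpt-3.5-turbo', completion.usage)
```

Logging this per request is the simplest way to spot which endpoints dominate your bill.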
2. Cache Responses
import NodeCache from 'node-cache';
const cache = new NodeCache({ stdTTL: 3600 }); // 1 hour
async function cachedChat(message) {
const cacheKey = `chat:${message}`;
// Check cache first
const cached = cache.get(cacheKey);
if (cached) {
console.log('Cache hit!');
return cached;
}
// Call API
const response = await openai.chat.completions.create({
model: 'gpt-3.5-turbo',
messages: [{ role: 'user', content: message }]
});
const result = response.choices[0].message.content;
// Store in cache
cache.set(cacheKey, result);
return result;
}
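Keying the cache on the raw string misses near-duplicates like "Hello" vs " hello ". Normalizing the message first raises the hit rate (a sketch; whether normalization is safe depends on how case-sensitive your prompts are):

```javascript
// Normalize whitespace and case so trivially different phrasings of the
// same question share one cache entry.
function cacheKey(message) {
  return 'chat:' + message.trim().toLowerCase().replace(/\s+/g, ' ');
}
```

In production you might additionally hash the normalized string (e.g. SHA-256) to keep key length bounded.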
3. Limit Max Tokens
async function costAwareChat(message, budget = 'low') {
const budgetLimits = {
low: { max_tokens: 150, model: 'gpt-3.5-turbo' },
medium: { max_tokens: 500, model: 'gpt-3.5-turbo' },
high: { max_tokens: 1000, model: 'gpt-4-turbo' }
};
const config = budgetLimits[budget];
return await openai.chat.completions.create({
model: config.model,
messages: [{ role: 'user', content: message }],
max_tokens: config.max_tokens
});
}
4. Batch Similar Requests
async function batchProcess(items) {
// Instead of N API calls, make 1 call with all items
const batchPrompt = `
Process the following items and return JSON array:
${items.map((item, i) => `${i + 1}. ${item}`).join('\n')}
Return format: {"results": [{"item": "...", "result": "..."}]}
`;
const response = await openai.chat.completions.create({
model: 'gpt-3.5-turbo',
messages: [{ role: 'user', content: batchPrompt }],
response_format: { type: 'json_object' } // requires the word "JSON" in the prompt
});
return JSON.parse(response.choices[0].message.content).results;
}
Security and Rate Limiting
Rate Limiting
import rateLimit from 'express-rate-limit';
const limiter = rateLimit({
windowMs: 15 * 60 * 1000, // 15 minutes
max: 100, // Limit each IP to 100 requests per window
message: 'Too many requests, please try again later.'
});
app.use('/api/chat', limiter);
Input Validation
function validateInput(message) {
// Length check
if (!message || message.length === 0) {
throw new Error('Message cannot be empty');
}
if (message.length > 4000) {
throw new Error('Message too long (max 4000 characters)');
}
// Content filtering (basic) - avoid blocking everyday words like
// "password", or legitimate questions will be rejected
const forbiddenPatterns = [
/\b(api[_-]?key|secret[_-]?key)\b/i,
/<script/i,
/javascript:/i
];
for (const pattern of forbiddenPatterns) {
if (pattern.test(message)) {
throw new Error('Message contains forbidden content');
}
}
return true;
}
app.post('/api/chat', async (req, res) => {
try {
validateInput(req.body.message);
// Process message...
} catch (error) {
res.status(400).json({ error: error.message });
}
});
User Authentication
import jwt from 'jsonwebtoken';
function authenticateToken(req, res, next) {
const token = req.headers['authorization']?.split(' ')[1];
if (!token) {
return res.status(401).json({ error: 'No token provided' });
}
jwt.verify(token, process.env.JWT_SECRET, (err, user) => {
if (err) {
return res.status(403).json({ error: 'Invalid token' });
}
req.user = user;
next();
});
}
app.post('/api/chat', authenticateToken, async (req, res) => {
// req.user is available here
const userId = req.user.id;
// Process chat with user context...
});
Production-Ready Examples
Example 1: Content Generator API
// content-generator.js
import express from 'express';
import OpenAI from 'openai';
import rateLimit from 'express-rate-limit';
const app = express();
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
app.use(express.json());
const limiter = rateLimit({
windowMs: 60 * 1000,
max: 10
});
app.use('/api/generate', limiter);
app.post('/api/generate/blog-post', async (req, res) => {
try {
const { topic, tone, length } = req.body;
if (!topic) {
return res.status(400).json({ error: 'Topic is required' });
}
const prompt = `Write a ${length || 'medium'}-length blog post about "${topic}"
in a ${tone || 'professional'} tone. Include an introduction, main points, and conclusion.`;
const completion = await openai.chat.completions.create({
model: 'gpt-3.5-turbo',
messages: [
{
role: 'system',
content: 'You are an expert content writer.'
},
{
role: 'user',
content: prompt
}
],
temperature: 0.8,
max_tokens: 1000
});
const content = completion.choices[0].message.content;
const tokensUsed = completion.usage.total_tokens;
res.json({
content,
metadata: {
tokensUsed,
model: 'gpt-3.5-turbo',
topic,
tone
}
});
} catch (error) {
console.error('Generation error:', error);
res.status(500).json({ error: 'Failed to generate content' });
}
});
app.listen(3000, () => {
console.log('Content Generator API running on port 3000');
});
Example 2: Smart FAQ Bot
// faq-bot.js
class FAQBot {
constructor(faqData) {
this.faqData = faqData;
this.openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
}
buildContext() {
return `You are a customer support bot. Use the following FAQ to answer questions:
${this.faqData.map(faq => `Q: ${faq.question}\nA: ${faq.answer}`).join('\n\n')}
If the question is not covered in the FAQ, politely say you don't know and offer to connect them with a human agent.`;
}
async answer(question) {
const completion = await this.openai.chat.completions.create({
model: 'gpt-3.5-turbo',
messages: [
{
role: 'system',
content: this.buildContext()
},
{
role: 'user',
content: question
}
],
temperature: 0.3, // Low for consistent answers
max_tokens: 300
});
return completion.choices[0].message.content;
}
}
// Usage
const faqData = [
{
question: 'What are your business hours?',
answer: 'We are open Monday-Friday, 9 AM - 5 PM EST.'
},
{
question: 'How do I reset my password?',
answer: 'Click "Forgot Password" on the login page and follow the instructions.'
}
// Add more FAQs...
];
const bot = new FAQBot(faqData);
app.post('/api/faq', async (req, res) => {
const { question } = req.body;
const answer = await bot.answer(question);
res.json({ answer });
});
Example 3: Email Response Generator
// email-assistant.js
async function generateEmailResponse(emailContent, tone = 'professional') {
const prompt = `Generate a ${tone} email response to the following email:
${emailContent}
Response:`;
const completion = await openai.chat.completions.create({
model: 'gpt-4-turbo',
messages: [
{
role: 'system',
content: `You are a professional email assistant. Write clear,
concise, and polite email responses.`
},
{
role: 'user',
content: prompt
}
],
temperature: 0.7
});
return completion.choices[0].message.content;
}
app.post('/api/email/respond', async (req, res) => {
try {
const { email, tone } = req.body;
const response = await generateEmailResponse(email, tone);
res.json({ response });
} catch (error) {
res.status(500).json({ error: 'Failed to generate response' });
}
});
Best Practices Checklist
Security ✅
- Never expose API keys in client-side code
- Always proxy API calls through your backend
- Implement rate limiting
- Validate and sanitize user input
- Use authentication for API access
Performance ✅
- Use streaming for better UX
- Cache common responses
- Choose appropriate models for task complexity
- Implement timeout handling
- Use retry logic with backoff
Cost ✅
- Set max_tokens limits
- Use cheaper models when possible
- Monitor usage and set budgets
- Batch similar requests
- Cache responses when appropriate
User Experience ✅
- Show loading indicators
- Stream responses when possible
- Handle errors gracefully
- Provide fallback messages
- Add typing indicators
Code Quality ✅
- Use TypeScript for type safety
- Write unit tests
- Log errors and usage
- Document API endpoints
- Use environment variables
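The "use environment variables" item is easy to get wrong silently: a missing key surfaces only on the first API call. A tiny startup check fails fast instead (the helper name is ours):

```javascript
// Throw at startup if any required environment variable is unset.
function requireEnv(names, env = process.env) {
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
}

// Usage: requireEnv(['OPENAI_API_KEY', 'JWT_SECRET']);
```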
Conclusion
Integrating LLMs into JavaScript web applications is now easier than ever. With the right tools and patterns, you can build production-ready AI features that:
- Enhance user experience with intelligent interactions
- Automate repetitive tasks
- Provide personalized content
- Scale efficiently with proper caching and rate limiting
Key Takeaways
- Always secure your API keys - use backend proxies
- Stream responses for better UX
- Choose the right model for your use case and budget
- Implement proper error handling and retry logic
- Monitor costs and optimize token usage
- Use frameworks like Vercel AI SDK to speed up development
Next Steps
- Choose an LLM provider (OpenAI, Anthropic, Groq)
- Set up a simple backend API
- Build a basic chat interface
- Add streaming for better UX
- Implement caching and rate limiting
- Deploy to production
The AI revolution in web development is here. Start building!