JavaScript Meets AI: Integrating LLMs into Your Web Applications

Large Language Models (LLMs) are transforming how we build web applications. In this comprehensive guide, you’ll learn practical, production-ready techniques for integrating AI into your JavaScript applications, using tools and APIs that work today.

We’ll cover everything from simple API calls to advanced streaming responses, cost optimization, and building real-world features like chatbots, content generators, and smart assistants.

Table of Contents

  1. Why Integrate LLMs into Web Apps
  2. Available LLM Providers
  3. Getting Started with OpenAI
  4. Using the Vercel AI SDK
  5. Building a Smart Chatbot
  6. Streaming Responses for Better UX
  7. Working with Different LLM Providers
  8. Client-Side vs Server-Side Integration
  9. Advanced Patterns and Best Practices
  10. Cost Optimization Strategies
  11. Security and Rate Limiting
  12. Production-Ready Examples

Why Integrate LLMs into Web Apps

What You Can Build

  • Smart Chatbots: Customer support, sales assistants, FAQ bots
  • Content Generation: Blog posts, product descriptions, marketing copy
  • Code Assistants: Code completion, debugging help, documentation
  • Data Analysis: Extract insights, summarize reports, analyze trends
  • Personalization: Tailored recommendations, dynamic content
  • Translation: Multi-language support with context awareness
  • Search Enhancement: Semantic search, Q&A over your data

Real Benefits

  1. Enhanced User Experience: Conversational interfaces, instant help
  2. Automation: Reduce manual work, scale support
  3. Personalization: Adapt to individual user needs
  4. Innovation: Build features that weren’t possible before

Available LLM Providers

Commercial APIs (Production-Ready)

OpenAI (GPT-4, GPT-4 Turbo, GPT-3.5)

  • Best quality, most expensive
  • Excellent for complex reasoning
  • Strong developer ecosystem
  • API: https://api.openai.com/v1/chat/completions

Anthropic Claude (Claude 3 Opus, Sonnet, Haiku)

  • Great quality, competitive pricing
  • Longer context windows (200K tokens)
  • Strong at analysis and writing
  • API: https://api.anthropic.com/v1/messages

Google Gemini (Pro, Ultra)

  • Multimodal capabilities
  • Free tier available
  • Good performance
  • API: https://generativelanguage.googleapis.com/v1/models

Groq

  • Ultra-fast inference (among the fastest on the market)
  • Very cheap
  • Limited models but excellent speed
  • API: https://api.groq.com/openai/v1/chat/completions

Together AI

  • Open-source models
  • Affordable pricing
  • Good variety (Llama, Mixtral, etc.)
  • API: https://api.together.xyz/v1/chat/completions
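Notice that the OpenAI, Groq, and Together endpoints above all end in `/chat/completions`: these providers share the same request shape, so one helper can target any of them by swapping the base URL. A minimal, dependency-free sketch using plain `fetch` (the helper names here are illustrative, not a library API):

```javascript
// Build the request body shared by OpenAI-compatible chat APIs
function buildChatRequest(model, message, maxTokens = 300) {
  return {
    model,
    messages: [{ role: 'user', content: message }],
    max_tokens: maxTokens
  };
}

// Send it to any OpenAI-compatible endpoint (URLs listed above)
async function callChatAPI(endpoint, apiKey, body) {
  const res = await fetch(endpoint, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${apiKey}`
    },
    body: JSON.stringify(body)
  });
  if (!res.ok) throw new Error(`API error: ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}

// The same payload works against OpenAI, Groq, or Together endpoints
const body = buildChatRequest('gpt-3.5-turbo', 'Hello!');
console.log(body.messages[0].content); // prints: Hello!
```

Anthropic and Google use different request shapes, which is covered in the provider-specific sections below.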

Self-Hosted Options

Ollama (Local development)

  • Run models on your machine
  • No API costs
  • Privacy-first
  • Great for development

LM Studio (Local GUI)

  • User-friendly interface
  • Download and run models locally
  • Compatible with OpenAI API format
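Because both tools speak the OpenAI wire format, swapping a paid API for a local model is mostly a matter of changing the base URL. A sketch against Ollama's default local endpoint (assumes `ollama serve` is running and a model such as `llama3` has been pulled; no real API key is required):

```javascript
// Talk to a local Ollama server through its OpenAI-compatible endpoint
async function chatWithLocalModel(message, model = 'llama3') {
  const res = await fetch('http://localhost:11434/v1/chat/completions', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model,
      messages: [{ role: 'user', content: message }]
    })
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```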

Getting Started with OpenAI

Installation

npm install openai

Basic Usage (Node.js)

import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

async function chat(message) {
  const completion = await openai.chat.completions.create({
    model: "gpt-4-turbo-preview",
    messages: [
      {
        role: "system",
        content: "You are a helpful assistant."
      },
      {
        role: "user",
        content: message
      }
    ],
    temperature: 0.7,
    max_tokens: 500
  });

  return completion.choices[0].message.content;
}

// Usage
const response = await chat("Explain how async/await works in JavaScript");
console.log(response);

Browser-Safe Implementation (Via Backend)

Never expose your API key in client-side code! Always proxy through your backend:

// frontend.js
async function askAI(question) {
  const response = await fetch('/api/chat', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ message: question })
  });

  const data = await response.json();
  return data.response;
}

// Usage
const answer = await askAI('What is machine learning?');
console.log(answer);
// backend/api/chat.js (Express)
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

app.post('/api/chat', async (req, res) => {
  try {
    const { message } = req.body;
    
    const completion = await openai.chat.completions.create({
      model: "gpt-3.5-turbo", // Cheaper for simple queries
      messages: [{ role: "user", content: message }],
      max_tokens: 300
    });

    res.json({ 
      response: completion.choices[0].message.content 
    });
  } catch (error) {
    console.error('OpenAI error:', error);
    res.status(500).json({ error: 'Failed to generate response' });
  }
});

Using the Vercel AI SDK

The Vercel AI SDK provides a unified interface for multiple LLM providers with built-in streaming support.

Installation

npm install ai @ai-sdk/openai

Basic Example (Next.js App Router)

// app/api/chat/route.js
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';

export async function POST(req) {
  const { messages } = await req.json();

  const result = await streamText({
    model: openai('gpt-4-turbo'),
    messages,
  });

  return result.toAIStreamResponse();
}
// app/page.js
'use client';

import { useChat } from 'ai/react';

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat();

  return (
    <div className="chat-container">
      <div className="messages">
        {messages.map(m => (
          <div key={m.id} className={`message ${m.role}`}>
            <strong>{m.role}:</strong> {m.content}
          </div>
        ))}
      </div>

      <form onSubmit={handleSubmit}>
        <input
          value={input}
          onChange={handleInputChange}
          placeholder="Ask me anything..."
        />
        <button type="submit">Send</button>
      </form>
    </div>
  );
}

That’s it! The useChat hook handles all the complexity:

  • Streaming responses
  • Message history
  • Loading states
  • Error handling

Building a Smart Chatbot

Let’s build a production-ready chatbot with conversation history, context, and memory.

Backend Implementation

// chatbot.js
import OpenAI from 'openai';

class SmartChatbot {
  constructor(systemPrompt, options = {}) {
    this.openai = new OpenAI({
      apiKey: process.env.OPENAI_API_KEY
    });
    
    this.systemPrompt = systemPrompt;
    this.model = options.model || 'gpt-3.5-turbo';
    this.temperature = options.temperature || 0.7;
    this.maxTokens = options.maxTokens || 500;
    
    // Store conversations in memory (use database in production)
    this.conversations = new Map();
  }

  getConversation(userId) {
    if (!this.conversations.has(userId)) {
      this.conversations.set(userId, [
        { role: 'system', content: this.systemPrompt }
      ]);
    }
    return this.conversations.get(userId);
  }

  async chat(userId, message) {
    const conversation = this.getConversation(userId);
    
    // Add user message
    conversation.push({
      role: 'user',
      content: message
    });

    // Keep the system prompt plus the last nine messages to control costs
    const recentMessages = [
      conversation[0],
      ...conversation.slice(1).slice(-9)
    ];

    try {
      const completion = await this.openai.chat.completions.create({
        model: this.model,
        messages: recentMessages,
        temperature: this.temperature,
        max_tokens: this.maxTokens
      });

      const response = completion.choices[0].message.content;

      // Add assistant response to history
      conversation.push({
        role: 'assistant',
        content: response
      });

      return {
        response,
        usage: completion.usage
      };
    } catch (error) {
      console.error('Chat error:', error);
      throw error;
    }
  }

  clearHistory(userId) {
    this.conversations.delete(userId);
  }

  async streamChat(userId, message) {
    const conversation = this.getConversation(userId);
    
    conversation.push({
      role: 'user',
      content: message
    });

    const stream = await this.openai.chat.completions.create({
      model: this.model,
      messages: [conversation[0], ...conversation.slice(1).slice(-9)],
      temperature: this.temperature,
      max_tokens: this.maxTokens,
      stream: true
    });

    // Note: once the stream finishes, the caller should append the
    // assembled assistant reply back into the conversation history
    return stream;
  }
}

// Usage
const supportBot = new SmartChatbot(
  `You are a helpful customer support agent for TechCorp. 
   Be friendly, concise, and always try to solve the customer's problem.
   If you don't know something, admit it and offer to escalate.`,
  {
    model: 'gpt-3.5-turbo',
    temperature: 0.7
  }
);

export default supportBot;

Express API Routes

// routes/chat.js
import express from 'express';
import supportBot from './chatbot.js';

const router = express.Router();

router.post('/message', async (req, res) => {
  try {
    const { userId, message } = req.body;

    if (!userId || !message) {
      return res.status(400).json({ error: 'Missing userId or message' });
    }

    const result = await supportBot.chat(userId, message);

    res.json({
      response: result.response,
      tokensUsed: result.usage.total_tokens
    });
  } catch (error) {
    res.status(500).json({ error: 'Failed to process message' });
  }
});

router.post('/clear', (req, res) => {
  const { userId } = req.body;
  supportBot.clearHistory(userId);
  res.json({ success: true });
});

export default router;

Frontend Chat UI

// chat-ui.js
class ChatUI {
  constructor(containerId, userId) {
    this.container = document.getElementById(containerId);
    this.userId = userId;
    this.messages = [];
    
    this.render();
  }

  render() {
    this.container.innerHTML = `
      <div class="chat-window">
        <div class="messages" id="messages"></div>
        <div class="input-area">
          <input 
            type="text" 
            id="messageInput" 
            placeholder="Type your message..."
          />
          <button id="sendBtn">Send</button>
        </div>
      </div>
    `;

    this.messagesContainer = document.getElementById('messages');
    this.input = document.getElementById('messageInput');
    this.sendBtn = document.getElementById('sendBtn');

    this.sendBtn.addEventListener('click', () => this.sendMessage());
    this.input.addEventListener('keypress', (e) => {
      if (e.key === 'Enter') this.sendMessage();
    });
  }

  addMessage(role, content) {
    const messageEl = document.createElement('div');
    messageEl.className = `message ${role}`;
    messageEl.innerHTML = `
      <div class="message-content">${this.formatMessage(content)}</div>
    `;
    
    this.messagesContainer.appendChild(messageEl);
    this.messagesContainer.scrollTop = this.messagesContainer.scrollHeight;
  }

  formatMessage(content) {
    // Escape HTML first so model output can't inject markup,
    // then apply simple markdown-like formatting
    const escaped = content
      .replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;');
    return escaped
      .replace(/\*\*(.*?)\*\*/g, '<strong>$1</strong>')
      .replace(/\n/g, '<br>');
  }

  showTyping() {
    const typingEl = document.createElement('div');
    typingEl.className = 'message assistant typing';
    typingEl.id = 'typing-indicator';
    typingEl.innerHTML = '<div class="dots"><span></span><span></span><span></span></div>';
    this.messagesContainer.appendChild(typingEl);
  }

  hideTyping() {
    const typingEl = document.getElementById('typing-indicator');
    if (typingEl) typingEl.remove();
  }

  async sendMessage() {
    const message = this.input.value.trim();
    if (!message) return;

    // Add user message
    this.addMessage('user', message);
    this.input.value = '';
    this.sendBtn.disabled = true;

    // Show typing indicator
    this.showTyping();

    try {
      const response = await fetch('/api/chat/message', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          userId: this.userId,
          message: message
        })
      });

      const data = await response.json();

      this.hideTyping();
      this.addMessage('assistant', data.response);
    } catch (error) {
      this.hideTyping();
      this.addMessage('system', 'Sorry, something went wrong. Please try again.');
    } finally {
      this.sendBtn.disabled = false;
      this.input.focus();
    }
  }
}

// Initialize
const chat = new ChatUI('chatContainer', 'user-123');

Streaming Responses for Better UX

Streaming provides a much better user experience by showing responses as they’re generated, like ChatGPT.

Server-Side Streaming (Node.js)

// api/stream-chat.js
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

app.post('/api/stream-chat', async (req, res) => {
  const { message } = req.body;

  // Set headers for SSE (Server-Sent Events)
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');

  try {
    const stream = await openai.chat.completions.create({
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: message }],
      stream: true
    });

    for await (const chunk of stream) {
      const content = chunk.choices[0]?.delta?.content || '';
      if (content) {
        // Send each chunk as SSE
        res.write(`data: ${JSON.stringify({ content })}\n\n`);
      }
    }

    res.write('data: [DONE]\n\n');
    res.end();
  } catch (error) {
    console.error('Streaming error:', error);
    res.write(`data: ${JSON.stringify({ error: 'Stream failed' })}\n\n`);
    res.end();
  }
});

Client-Side Streaming Consumer

async function streamChat(message, onChunk, onComplete) {
  const response = await fetch('/api/stream-chat', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ message })
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n\n');
    buffer = lines.pop(); // Keep any incomplete event in the buffer

    for (const line of lines) {
      if (line.startsWith('data: ')) {
        const data = line.slice(6);
        
        if (data === '[DONE]') {
          onComplete();
          return;
        }

        try {
          const parsed = JSON.parse(data);
          onChunk(parsed.content);
        } catch (e) {
          console.error('Parse error:', e);
        }
      }
    }
  }
}

// Usage
const messageDiv = document.getElementById('response');
let fullResponse = '';

await streamChat(
  'Explain quantum computing',
  (chunk) => {
    // Called for each chunk
    fullResponse += chunk;
    messageDiv.textContent = fullResponse;
  },
  () => {
    // Called when complete
    console.log('Streaming complete');
  }
);

Streaming with Vercel AI SDK (Easiest)

// Using useChat hook - streaming is automatic!
import { useChat } from 'ai/react';

export default function StreamingChat() {
  const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat({
    api: '/api/chat'
  });

  return (
    <div>
      {messages.map(m => (
        <div key={m.id}>
          <strong>{m.role}:</strong> {m.content}
        </div>
      ))}
      
      <form onSubmit={handleSubmit}>
        <input 
          value={input} 
          onChange={handleInputChange}
          disabled={isLoading}
        />
        <button type="submit" disabled={isLoading}>
          {isLoading ? 'Sending...' : 'Send'}
        </button>
      </form>
    </div>
  );
}

Working with Different LLM Providers

Anthropic Claude

import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

async function chatWithClaude(message) {
  const response = await anthropic.messages.create({
    model: 'claude-3-sonnet-20240229',
    max_tokens: 1024,
    messages: [
      { role: 'user', content: message }
    ],
  });

  return response.content[0].text;
}

// Streaming
async function streamClaude(message) {
  const stream = await anthropic.messages.create({
    model: 'claude-3-sonnet-20240229',
    max_tokens: 1024,
    messages: [{ role: 'user', content: message }],
    stream: true,
  });

  for await (const event of stream) {
    if (event.type === 'content_block_delta') {
      console.log(event.delta.text);
    }
  }
}

Google Gemini

import { GoogleGenerativeAI } from '@google/generative-ai';

const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY);

async function chatWithGemini(message) {
  const model = genAI.getGenerativeModel({ model: 'gemini-pro' });
  
  const result = await model.generateContent(message);
  const response = await result.response;
  
  return response.text();
}

// Streaming
async function streamGemini(message) {
  const model = genAI.getGenerativeModel({ model: 'gemini-pro' });
  
  const result = await model.generateContentStream(message);
  
  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

Groq (Ultra-Fast)

import Groq from 'groq-sdk';

const groq = new Groq({
  apiKey: process.env.GROQ_API_KEY
});

async function chatWithGroq(message) {
  const completion = await groq.chat.completions.create({
    messages: [
      { role: 'user', content: message }
    ],
    model: 'mixtral-8x7b-32768', // Very fast!
  });

  return completion.choices[0].message.content;
}

Provider Abstraction Layer

// llm-provider.js
class LLMProvider {
  constructor(provider, apiKey) {
    this.provider = provider;
    this.apiKey = apiKey;
    this.client = this.initializeClient();
  }

  initializeClient() {
    switch (this.provider) {
      case 'openai':
        return new OpenAI({ apiKey: this.apiKey });
      case 'anthropic':
        return new Anthropic({ apiKey: this.apiKey });
      case 'groq':
        return new Groq({ apiKey: this.apiKey });
      default:
        throw new Error(`Unsupported provider: ${this.provider}`);
    }
  }

  async chat(message, options = {}) {
    switch (this.provider) {
      case 'openai':
        return this.chatOpenAI(message, options);
      case 'anthropic':
        return this.chatAnthropic(message, options);
      case 'groq':
        return this.chatGroq(message, options);
    }
  }

  async chatOpenAI(message, options) {
    const completion = await this.client.chat.completions.create({
      model: options.model || 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: message }],
      ...options
    });
    return completion.choices[0].message.content;
  }

  async chatAnthropic(message, options) {
    const response = await this.client.messages.create({
      model: options.model || 'claude-3-sonnet-20240229',
      max_tokens: options.max_tokens || 1024,
      messages: [{ role: 'user', content: message }]
    });
    return response.content[0].text;
  }

  async chatGroq(message, options) {
    const completion = await this.client.chat.completions.create({
      model: options.model || 'mixtral-8x7b-32768',
      messages: [{ role: 'user', content: message }]
    });
    return completion.choices[0].message.content;
  }
}

// Usage
const llm = new LLMProvider('openai', process.env.OPENAI_API_KEY);
const response = await llm.chat('Hello!');

// Easy to switch providers
const groqLLM = new LLMProvider('groq', process.env.GROQ_API_KEY);
const fastResponse = await groqLLM.chat('Hello!');

Client-Side vs Server-Side Integration

โŒ Don’t Do This (Client-Side API Key)

// NEVER do this!
const openai = new OpenAI({
  apiKey: 'sk-...' // Exposed to everyone!
});

✅ Do This (Proxy Through Backend)

Frontend:

// frontend.js
async function askAI(question) {
  const response = await fetch('/api/ask', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ question })
  });
  
  return await response.json();
}

Backend:

// backend.js
app.post('/api/ask', async (req, res) => {
  // API key is safe on server
  const completion = await openai.chat.completions.create({
    model: 'gpt-3.5-turbo',
    messages: [{ role: 'user', content: req.body.question }]
  });
  
  res.json({ answer: completion.choices[0].message.content });
});

Advanced Patterns and Best Practices

1. Function Calling (Tool Use)

Let the LLM call functions in your code:

async function chatWithFunctions(message) {
  const functions = [
    {
      name: 'get_weather',
      description: 'Get the current weather in a location',
      parameters: {
        type: 'object',
        properties: {
          location: {
            type: 'string',
            description: 'City name, e.g., San Francisco'
          },
          unit: {
            type: 'string',
            enum: ['celsius', 'fahrenheit']
          }
        },
        required: ['location']
      }
    }
  ];

  const completion = await openai.chat.completions.create({
    model: 'gpt-4-turbo',
    messages: [{ role: 'user', content: message }],
    functions: functions,
    function_call: 'auto'
  });

  const responseMessage = completion.choices[0].message;

  // Check if the model wants to call a function
  if (responseMessage.function_call) {
    const functionName = responseMessage.function_call.name;
    const functionArgs = JSON.parse(responseMessage.function_call.arguments);

    // Call your actual function
    let functionResponse;
    if (functionName === 'get_weather') {
      functionResponse = await getWeather(functionArgs.location, functionArgs.unit);
    }

    // Send function result back to model
    const secondCompletion = await openai.chat.completions.create({
      model: 'gpt-4-turbo',
      messages: [
        { role: 'user', content: message },
        responseMessage,
        {
          role: 'function',
          name: functionName,
          content: JSON.stringify(functionResponse)
        }
      ]
    });

    return secondCompletion.choices[0].message.content;
  }

  return responseMessage.content;
}

// Weather function
async function getWeather(location, unit = 'celsius') {
  // Call weather API
  return {
    location,
    temperature: 22,
    unit,
    condition: 'sunny'
  };
}

2. Prompt Templates

class PromptTemplate {
  constructor(template) {
    this.template = template;
  }

  format(variables) {
    let result = this.template;
    for (const [key, value] of Object.entries(variables)) {
      // Use a replacer function so `$` sequences in values aren't
      // interpreted as special replacement patterns
      result = result.replace(new RegExp(`{{${key}}}`, 'g'), () => String(value));
    }
    return result;
  }
}

// Usage
const summarizeTemplate = new PromptTemplate(`
Summarize the following text in {{max_words}} words or less:

Text: {{text}}

Summary:
`);

const prompt = summarizeTemplate.format({
  text: 'Long article here...',
  max_words: 50
});

const summary = await openai.chat.completions.create({
  model: 'gpt-3.5-turbo',
  messages: [{ role: 'user', content: prompt }]
});

3. Conversation Memory with Context Window Management

class ConversationManager {
  constructor(maxTokens = 4000) {
    this.maxTokens = maxTokens;
    this.messages = [];
  }

  addMessage(role, content) {
    this.messages.push({ role, content });
    this.trimToTokenLimit();
  }

  trimToTokenLimit() {
    // Rough estimation: 1 token ≈ 4 characters
    let totalChars = this.messages.reduce((sum, msg) => 
      sum + msg.content.length, 0
    );

    while (totalChars > this.maxTokens * 4 && this.messages.length > 1) {
      // Remove the oldest non-system message (index 0, the system
      // message, is preserved)
      const removed = this.messages.splice(1, 1)[0];
      totalChars -= removed.content.length;
    }
  }

  getMessages() {
    return this.messages;
  }

  clear() {
    this.messages = [];
  }
}

// Usage
const conversation = new ConversationManager(4000);
conversation.addMessage('system', 'You are a helpful assistant.');
conversation.addMessage('user', 'Hello!');
conversation.addMessage('assistant', 'Hi! How can I help?');

const completion = await openai.chat.completions.create({
  model: 'gpt-3.5-turbo',
  messages: conversation.getMessages()
});

4. Retry Logic with Exponential Backoff

async function callWithRetry(fn, maxRetries = 3) {
  let lastError;
  
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      
      // Don't retry on client errors (400s), except 429 rate limits
      if (error.status >= 400 && error.status < 500 && error.status !== 429) {
        throw error;
      }
      
      // Exponential backoff: 1s, 2s, 4s
      const delay = Math.pow(2, i) * 1000;
      console.log(`Retry ${i + 1}/${maxRetries} after ${delay}ms`);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  
  throw lastError;
}

// Usage
const response = await callWithRetry(async () => {
  return await openai.chat.completions.create({
    model: 'gpt-3.5-turbo',
    messages: [{ role: 'user', content: 'Hello' }]
  });
});

Cost Optimization Strategies

1. Choose the Right Model

// Model pricing (approximate, per 1M tokens):
// GPT-4 Turbo: $10 input, $30 output
// GPT-3.5 Turbo: $0.50 input, $1.50 output
// Claude Sonnet: $3 input, $15 output
// Groq (Mixtral): $0.27 input, $0.27 output

function selectModel(taskComplexity) {
  if (taskComplexity === 'simple') {
    return 'gpt-3.5-turbo'; // Cheap and fast
  } else if (taskComplexity === 'medium') {
    return 'claude-3-sonnet'; // Good balance
  } else {
    return 'gpt-4-turbo'; // Best quality
  }
}
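Those per-token prices make back-of-envelope cost estimates straightforward. A small helper using the approximate figures above (prices change often; treat the numbers as illustrative):

```javascript
// Rough cost estimator based on the approximate per-1M-token prices
// quoted above. Check current provider pricing before relying on these.
const PRICES_PER_MILLION = {
  'gpt-4-turbo':   { input: 10.00, output: 30.00 },
  'gpt-3.5-turbo': { input: 0.50,  output: 1.50 }
};

function estimateCostUSD(model, inputTokens, outputTokens) {
  const p = PRICES_PER_MILLION[model];
  if (!p) throw new Error(`Unknown model: ${model}`);
  return (inputTokens * p.input + outputTokens * p.output) / 1_000_000;
}

// 1,000 input tokens + 500 output tokens on gpt-3.5-turbo
console.log(estimateCostUSD('gpt-3.5-turbo', 1000, 500)); // 0.00125
```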

2. Cache Responses

import NodeCache from 'node-cache';

const cache = new NodeCache({ stdTTL: 3600 }); // 1 hour

async function cachedChat(message) {
  const cacheKey = `chat:${message}`;
  
  // Check cache first
  const cached = cache.get(cacheKey);
  if (cached) {
    console.log('Cache hit!');
    return cached;
  }

  // Call API
  const response = await openai.chat.completions.create({
    model: 'gpt-3.5-turbo',
    messages: [{ role: 'user', content: message }]
  });

  const result = response.choices[0].message.content;
  
  // Store in cache
  cache.set(cacheKey, result);
  
  return result;
}
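One caveat with keying the cache on the raw message: trivially different prompts ("Hello world" vs "hello  world") miss the cache. A small normalizing key helper raises the hit rate (a sketch using Node's built-in crypto; the normalization rules are an assumption you should tune):

```javascript
import { createHash } from 'node:crypto';

// Normalize before hashing so near-identical prompts share a cache entry
function cacheKeyFor(message) {
  const normalized = message.trim().replace(/\s+/g, ' ').toLowerCase();
  return 'chat:' + createHash('sha256').update(normalized).digest('hex');
}

console.log(cacheKeyFor('Hello  world') === cacheKeyFor('hello world')); // true
```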

3. Limit Max Tokens

async function costAwareChat(message, budget = 'low') {
  const budgetLimits = {
    low: { max_tokens: 150, model: 'gpt-3.5-turbo' },
    medium: { max_tokens: 500, model: 'gpt-3.5-turbo' },
    high: { max_tokens: 1000, model: 'gpt-4-turbo' }
  };

  const config = budgetLimits[budget];

  return await openai.chat.completions.create({
    model: config.model,
    messages: [{ role: 'user', content: message }],
    max_tokens: config.max_tokens
  });
}

4. Batch Similar Requests

async function batchProcess(items) {
  // Instead of N API calls, make 1 call with all items.
  // JSON mode returns a single object, so ask for the array
  // under a top-level key.
  const batchPrompt = `
Process the following items and return JSON:

${items.map((item, i) => `${i + 1}. ${item}`).join('\n')}

Return format: {"results": [{"item": "...", "result": "..."}]}
`;

  const response = await openai.chat.completions.create({
    model: 'gpt-3.5-turbo',
    messages: [{ role: 'user', content: batchPrompt }],
    response_format: { type: 'json_object' }
  });

  return JSON.parse(response.choices[0].message.content).results;
}

Security and Rate Limiting

Rate Limiting

import rateLimit from 'express-rate-limit';

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // Limit each IP to 100 requests per window
  message: 'Too many requests, please try again later.'
});

app.use('/api/chat', limiter);
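express-rate-limit handles the bookkeeping for you; conceptually it is just a counter per client that resets each window. A dependency-free sketch of that fixed-window idea (the helper names are illustrative, not a real library API):

```javascript
// Fixed-window rate limiter: one counter per client, reset each window
function createRateLimiter({ windowMs, max }) {
  const hits = new Map(); // client id -> { count, windowStart }
  return function isAllowed(clientId, now = Date.now()) {
    const entry = hits.get(clientId);
    if (!entry || now - entry.windowStart >= windowMs) {
      hits.set(clientId, { count: 1, windowStart: now });
      return true; // first request in a fresh window
    }
    entry.count += 1;
    return entry.count <= max;
  };
}

const allow = createRateLimiter({ windowMs: 1000, max: 2 });
console.log(allow('1.2.3.4', 0));    // true
console.log(allow('1.2.3.4', 10));   // true
console.log(allow('1.2.3.4', 20));   // false (third hit inside the window)
console.log(allow('1.2.3.4', 1500)); // true (window reset)
```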

Input Validation

function validateInput(message) {
  // Length check
  if (!message || message.length === 0) {
    throw new Error('Message cannot be empty');
  }
  
  if (message.length > 4000) {
    throw new Error('Message too long (max 4000 characters)');
  }

  // Content filtering (basic)
  const forbiddenPatterns = [
    /\b(password|api[_-]?key|secret)\b/i,
    /<script/i,
    /javascript:/i
  ];

  for (const pattern of forbiddenPatterns) {
    if (pattern.test(message)) {
      throw new Error('Message contains forbidden content');
    }
  }

  return true;
}

app.post('/api/chat', async (req, res) => {
  try {
    validateInput(req.body.message);
    // Process message...
  } catch (error) {
    res.status(400).json({ error: error.message });
  }
});

User Authentication

import jwt from 'jsonwebtoken';

function authenticateToken(req, res, next) {
  const token = req.headers['authorization']?.split(' ')[1];
  
  if (!token) {
    return res.status(401).json({ error: 'No token provided' });
  }

  jwt.verify(token, process.env.JWT_SECRET, (err, user) => {
    if (err) {
      return res.status(403).json({ error: 'Invalid token' });
    }
    req.user = user;
    next();
  });
}

app.post('/api/chat', authenticateToken, async (req, res) => {
  // req.user is available here
  const userId = req.user.id;
  // Process chat with user context...
});

Production-Ready Examples

Example 1: Content Generator API

// content-generator.js
import express from 'express';
import OpenAI from 'openai';
import rateLimit from 'express-rate-limit';

const app = express();
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

app.use(express.json());

const limiter = rateLimit({
  windowMs: 60 * 1000,
  max: 10
});

app.use('/api/generate', limiter);

app.post('/api/generate/blog-post', async (req, res) => {
  try {
    const { topic, tone, length } = req.body;

    if (!topic) {
      return res.status(400).json({ error: 'Topic is required' });
    }

    const prompt = `Write a ${length || 'medium'}-length blog post about "${topic}" 
in a ${tone || 'professional'} tone. Include an introduction, main points, and conclusion.`;

    const completion = await openai.chat.completions.create({
      model: 'gpt-3.5-turbo',
      messages: [
        {
          role: 'system',
          content: 'You are an expert content writer.'
        },
        {
          role: 'user',
          content: prompt
        }
      ],
      temperature: 0.8,
      max_tokens: 1000
    });

    const content = completion.choices[0].message.content;
    const tokensUsed = completion.usage.total_tokens;

    res.json({
      content,
      metadata: {
        tokensUsed,
        model: 'gpt-3.5-turbo',
        topic,
        tone
      }
    });
  } catch (error) {
    console.error('Generation error:', error);
    res.status(500).json({ error: 'Failed to generate content' });
  }
});

app.listen(3000, () => {
  console.log('Content Generator API running on port 3000');
});

Example 2: Smart FAQ Bot

// faq-bot.js
class FAQBot {
  constructor(faqData) {
    this.faqData = faqData;
    this.openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
  }

  buildContext() {
    return `You are a customer support bot. Use the following FAQ to answer questions:

${this.faqData.map(faq => `Q: ${faq.question}\nA: ${faq.answer}`).join('\n\n')}

If the question is not covered in the FAQ, politely say you don't know and offer to connect them with a human agent.`;
  }

  async answer(question) {
    const completion = await this.openai.chat.completions.create({
      model: 'gpt-3.5-turbo',
      messages: [
        {
          role: 'system',
          content: this.buildContext()
        },
        {
          role: 'user',
          content: question
        }
      ],
      temperature: 0.3, // Low for consistent answers
      max_tokens: 300
    });

    return completion.choices[0].message.content;
  }
}

// Usage
const faqData = [
  {
    question: 'What are your business hours?',
    answer: 'We are open Monday-Friday, 9 AM - 5 PM EST.'
  },
  {
    question: 'How do I reset my password?',
    answer: 'Click "Forgot Password" on the login page and follow the instructions.'
  }
  // Add more FAQs...
];

const bot = new FAQBot(faqData);

app.post('/api/faq', async (req, res) => {
  const { question } = req.body;
  const answer = await bot.answer(question);
  res.json({ answer });
});

Example 3: Email Response Generator

// email-assistant.js
async function generateEmailResponse(emailContent, tone = 'professional') {
  const prompt = `Generate a ${tone} email response to the following email:

${emailContent}

Response:`;

  const completion = await openai.chat.completions.create({
    model: 'gpt-4-turbo',
    messages: [
      {
        role: 'system',
        content: `You are a professional email assistant. Write clear, 
concise, and polite email responses.`
      },
      {
        role: 'user',
        content: prompt
      }
    ],
    temperature: 0.7
  });

  return completion.choices[0].message.content;
}

app.post('/api/email/respond', async (req, res) => {
  try {
    const { email, tone } = req.body;
    const response = await generateEmailResponse(email, tone);
    res.json({ response });
  } catch (error) {
    res.status(500).json({ error: 'Failed to generate response' });
  }
});

Best Practices Checklist

Security ✅

  • Never expose API keys in client-side code
  • Always proxy API calls through your backend
  • Implement rate limiting
  • Validate and sanitize user input
  • Use authentication for API access

Performance ✅

  • Use streaming for better UX
  • Cache common responses
  • Choose appropriate models for task complexity
  • Implement timeout handling
  • Use retry logic with backoff

Cost ✅

  • Set max_tokens limits
  • Use cheaper models when possible
  • Monitor usage and set budgets
  • Batch similar requests
  • Cache responses when appropriate

User Experience ✅

  • Show loading indicators
  • Stream responses when possible
  • Handle errors gracefully
  • Provide fallback messages
  • Add typing indicators

Code Quality ✅

  • Use TypeScript for type safety
  • Write unit tests
  • Log errors and usage
  • Document API endpoints
  • Use environment variables

Conclusion

Integrating LLMs into JavaScript web applications is now easier than ever. With the right tools and patterns, you can build production-ready AI features that:

  • Enhance user experience with intelligent interactions
  • Automate repetitive tasks
  • Provide personalized content
  • Scale efficiently with proper caching and rate limiting

Key Takeaways

  1. Always secure your API keys - use backend proxies
  2. Stream responses for better UX
  3. Choose the right model for your use case and budget
  4. Implement proper error handling and retry logic
  5. Monitor costs and optimize token usage
  6. Use frameworks like Vercel AI SDK to speed up development

Next Steps

  1. Choose an LLM provider (OpenAI, Anthropic, Groq)
  2. Set up a simple backend API
  3. Build a basic chat interface
  4. Add streaming for better UX
  5. Implement caching and rate limiting
  6. Deploy to production

The AI revolution in web development is here. Start building!
