Worksona.js Documentation

AI Agent Management - Library & API Server

Core Library Overview

Worksona.js is a single-file JavaScript library (2,241 lines, zero runtime dependencies) for managing AI agents across multiple LLM providers. It runs in two modes: as a direct JavaScript library or as a REST API server.

Zero Dependencies

Single-file architecture with no external runtime dependencies

Multi-Provider

OpenAI, Anthropic, Google with automatic failover

Image Pipeline

Generate, analyze, edit, and create image variations

Observable

17 emitted events for complete visibility

Architecture Overview

graph TB
  subgraph "Worksona Core"
    W[Worksona Class]
    A[Agent Class]
    E[Event System]
    CP[Control Panel]
  end
  subgraph "LLM Providers"
    OAI[OpenAI<br/>GPT-5, GPT-4o, o3]
    ANT[Anthropic<br/>Claude Opus 4.5]
    GOO[Google<br/>Gemini Pro]
  end
  subgraph "Capabilities"
    CH[Chat & Query]
    IM[Image Processing]
    FP[File Processing]
    ME[Metrics & History]
  end
  W --> A
  W --> E
  W --> CP
  A --> OAI
  A --> ANT
  A --> GOO
  W --> CH
  W --> IM
  W --> FP
  W --> ME
  style W fill:#2563eb,color:#fff
  style A fill:#7c3aed,color:#fff

Latest Frontier Models

Worksona.js provides first-class support for the latest and most powerful AI models:

  • GPT-5 series (gpt-5, gpt-5-mini, gpt-5-nano) - OpenAI's latest flagship models
  • Claude Opus 4.5 - Anthropic's most capable model with 200K context
  • o3 & o3-mini - Advanced reasoning models for complex problem-solving
  • GPT-4o - Multimodal model with vision and voice capabilities
  • Full backward compatibility with GPT-4, Claude 3.5, and earlier models

Event-Driven Architecture

The library emits 17 distinct events for complete observability:

// Core events
worksona.on('agent-loaded', (data) => console.log('Agent ready'));
worksona.on('chat-start', (data) => console.log('Processing...'));
worksona.on('chat-complete', (data) => console.log('Response:', data));

// Image events
worksona.on('image-generation-start', (data) => console.log('Generating image...'));
worksona.on('image-generation-complete', (data) => console.log('Image ready:', data));
worksona.on('image-analysis-complete', (data) => console.log('Analysis:', data));

// Error handling
worksona.on('error', (error) => console.error('Error:', error));
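The same listeners can be combined to measure how long a chat takes. The following is a minimal sketch, not part of the library: it relies only on the `.on(name, handler)` shape shown above, uses Node's `EventEmitter` as a stand-in for a Worksona instance, and assumes a single chat is in flight at a time because the event payload fields are not documented here. The `trackChatLatency` helper and its injectable clock are hypothetical names.

```javascript
// Stand-in for a Worksona instance: only the .on(name, handler) shape is
// taken from the documentation. Node's EventEmitter makes the sketch runnable.
const { EventEmitter } = require('node:events');

// Attach listeners that time the interval between 'chat-start' and
// 'chat-complete'. Assumes one chat in flight at a time.
function trackChatLatency(emitter, now = Date.now) {
  const durations = [];
  let startedAt = null;

  emitter.on('chat-start', () => {
    startedAt = now();
  });
  emitter.on('chat-complete', () => {
    if (startedAt !== null) {
      durations.push(now() - startedAt);
      startedAt = null;
    }
  });
  emitter.on('error', () => {
    startedAt = null; // a failed chat should not count toward latency
  });

  return durations;
}

// Dry run with a fake emitter and a fake clock
const fakeWorksona = new EventEmitter();
let clock = 0;
const durations = trackChatLatency(fakeWorksona, () => clock);
fakeWorksona.emit('chat-start', {});
clock = 250;
fakeWorksona.emit('chat-complete', {});
console.log(durations); // [ 250 ]
```

With a real instance, the call would presumably be `trackChatLatency(worksona)`, leaving the default clock in place.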

Image Processing Pipeline

Complete image workflow powered by OpenAI's DALL-E 3 and GPT-4o vision:

  • Generate - Create images from text descriptions
  • Analyze - Extract insights from images using vision AI
  • Edit - Modify existing images with text prompts
  • Variations - Create variations of existing images

Real-Time Control Panel

Built-in visual control panel for development and debugging:

  • View all loaded agents and their status
  • Monitor API provider connectivity
  • Inspect agent metrics and history
  • Test chat and image operations
  • Floating or embedded mode

Delegators Explained

Delegators are the orchestration layer in Worksona.js that enable multi-agent workflows. A delegator agent can break down complex tasks and delegate subtasks to specialized agents, creating powerful AI pipelines.

Multi-Agent Workflow Example

// Content creation pipeline with delegation
const pipeline = async (topic) => {
  // Step 1: Research
  const research = await worksona.chat('research-analyst',
    `Research key points about: ${topic}`);

  // Step 2: Write content
  const draft = await worksona.chat('content-writer',
    `Write article based on: ${research}`);

  // Step 3: Edit
  const edited = await worksona.chat('editor-agent',
    `Edit and improve: ${draft}`);

  // Step 4: Fact-check
  const verified = await worksona.chat('research-analyst',
    `Verify facts in: ${edited}`);

  return verified;
};

// Execute the pipeline
const article = await pipeline('Quantum Computing');
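The four hard-coded steps above can also be expressed as data and run by a small generic loop. This is a sketch under stated assumptions rather than a library feature: `runPipeline` and `echoChat` are hypothetical names, and `chat` stands for any async function with the `(agentId, prompt)` shape of `worksona.chat` that resolves to a string.

```javascript
// Run a list of steps in order, feeding each step's output into the next.
// `chat` is any async function (agentId, prompt) => string; with the library
// it would be something like (id, p) => worksona.chat(id, p).
async function runPipeline(chat, steps, input) {
  let current = input;
  const trace = [];
  for (const { agent, prompt } of steps) {
    current = await chat(agent, prompt(current));
    trace.push({ agent, output: current });
  }
  return { result: current, trace };
}

// The content pipeline from the example above, described as data
const contentSteps = [
  { agent: 'research-analyst', prompt: (t) => `Research key points about: ${t}` },
  { agent: 'content-writer', prompt: (r) => `Write article based on: ${r}` },
  { agent: 'editor-agent', prompt: (d) => `Edit and improve: ${d}` },
  { agent: 'research-analyst', prompt: (e) => `Verify facts in: ${e}` },
];

// Stub chat for a dry run without API keys (hypothetical, for illustration)
const echoChat = async (agent, prompt) => `[${agent}] ${prompt.length} chars`;

runPipeline(echoChat, contentSteps, 'Quantum Computing')
  .then(({ trace }) => console.log(trace.map((s) => s.agent).join(' -> ')));
// research-analyst -> content-writer -> editor-agent -> research-analyst
```

Keeping the steps as data makes it easy to reorder or swap agents, and the returned `trace` records each intermediate output for debugging.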

Common Delegation Patterns

  • Research → Write → Edit - Content creation workflows
  • Analyze → Extract → Summarize - Document processing
  • Classify → Route → Respond - Customer support automation
  • Generate → Review → Refine - Creative workflows
  • Parse → Validate → Transform - Data processing pipelines
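As an illustration of the classify, route, respond pattern from the list above, the sketch below asks one agent for a label and forwards the message to the agent mapped to that label. Everything named here is an assumption for illustration: the `classifier` agent id, the route table, the `legal-agent` id, and the `classifyRouteRespond` helper are hypothetical, and `chat` is any async function with the `(agentId, prompt)` shape of `worksona.chat`.

```javascript
// Classify -> Route -> Respond.
// `chat` is any async (agentId, prompt) => string.
// The 'classifier' agent id and the route table are hypothetical.
async function classifyRouteRespond(chat, routes, fallbackAgent, message) {
  const labels = Object.keys(routes).join(', ');
  const raw = await chat('classifier',
    `Reply with exactly one word from [${labels}]: ${message}`);
  const label = raw.trim().toLowerCase();

  // Unknown labels fall back to a default agent instead of failing
  const known = Object.prototype.hasOwnProperty.call(routes, label);
  const agent = known ? routes[label] : fallbackAgent;

  const reply = await chat(agent, message);
  return { label, agent, reply };
}

// Dry run with a stub chat function (no API calls)
const stubChat = async (agent, prompt) =>
  (agent === 'classifier' ? ' Legal ' : `${agent} handled: ${prompt}`);
const routes = { legal: 'legal-agent', marketing: 'marketing-agent' };

classifyRouteRespond(stubChat, routes, 'research-analyst', 'Is this clause enforceable?')
  .then((r) => console.log(r.agent)); // legal-agent
```

Normalizing the label and falling back to a default agent keeps the router from failing when the classifier returns unexpected text.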

Use Cases for Delegation

Content Production

Research, writing, editing, and fact-checking pipelines

Document Processing

OCR, extraction, analysis, and summarization workflows

Customer Support

Intent detection, routing, response generation, and QA

Data Analysis

Collection, validation, analysis, and reporting chains

Endpoint Agents

Endpoint agents are pre-configured, specialized agents with distinct personalities, knowledge domains, and expertise. Each agent is defined by a JSON configuration file that specifies its traits, system prompt, and conversation examples.

Agent Personality System

Agents have rich personality configurations that shape their behavior:

{
  "id": "marketing-agent",
  "name": "Marketing Strategist",
  "description": "Expert in marketing strategy and brand positioning",
  "config": {
    "provider": "openai",
    "model": "gpt-4o",
    "temperature": 0.7,
    "systemPrompt": "You are a marketing strategist...",
    "traits": {
      "personality": [
        "Creative and strategic",
        "Data-driven decision maker",
        "Brand-focused"
      ],
      "knowledge": [
        "Marketing strategy",
        "Brand positioning",
        "Customer psychology",
        "Digital marketing"
      ],
      "tone": "Professional yet enthusiastic"
    },
    "examples": [
      {
        "user": "How do we improve brand awareness?",
        "assistant": "Let's develop a multi-channel strategy..."
      }
    ]
  }
}
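Before loading a configuration like the one above, it can help to check its shape. The validator below is a hedged sketch: the documentation does not state which fields are required, so treating `id`, `name`, `provider`, `model`, and `systemPrompt` as mandatory is an assumption, and `validateAgentConfig` is a hypothetical helper rather than a library function.

```javascript
// Return a list of problems found in an agent definition (empty if none).
// Which fields are required is an assumption based on the example config.
function validateAgentConfig(agent) {
  const isNonEmptyString = (v) => typeof v === 'string' && v.trim() !== '';
  if (!agent || typeof agent !== 'object') return ['agent must be an object'];

  const errors = [];
  for (const key of ['id', 'name']) {
    if (!isNonEmptyString(agent[key])) errors.push(`${key} must be a non-empty string`);
  }

  const config = agent.config;
  if (!config || typeof config !== 'object') {
    errors.push('config must be an object');
    return errors;
  }
  for (const key of ['provider', 'model', 'systemPrompt']) {
    if (!isNonEmptyString(config[key])) errors.push(`config.${key} must be a non-empty string`);
  }
  if (config.temperature !== undefined &&
      (typeof config.temperature !== 'number' || config.temperature < 0)) {
    errors.push('config.temperature must be a non-negative number');
  }

  const traits = config.traits || {};
  for (const key of ['personality', 'knowledge']) {
    if (traits[key] !== undefined && !Array.isArray(traits[key])) {
      errors.push(`config.traits.${key} must be an array`);
    }
  }

  const examples = Array.isArray(config.examples) ? config.examples : [];
  examples.forEach((ex, i) => {
    if (!ex || !isNonEmptyString(ex.user) || !isNonEmptyString(ex.assistant)) {
      errors.push(`config.examples[${i}] needs "user" and "assistant" strings`);
    }
  });
  return errors;
}

console.log(validateAgentConfig({ id: 'x', name: 'X', config: { provider: 'openai' } }));
// [ 'config.model must be a non-empty string',
//   'config.systemPrompt must be a non-empty string' ]
```

Returning a list of messages instead of throwing on the first problem lets a caller report every issue in a JSON file at once.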

Available Pre-Configured Agents

Research Analyst

Expert in research, analysis, and fact-checking

Content Writer

Specialized in creating engaging written content

Legal Agent

Knowledgeable in legal analysis and compliance

Marketing Agent

Expert in marketing strategy and campaigns

PRD Editor

Specialized in product requirement documents

Interviewer

Conducts structured interviews and assessments

Creating Custom Agents

You can create custom agents programmatically or load from JSON files:

// Programmatic creation
await worksona.loadAgent({
  id: 'support-bot',
  name: 'Customer Support Bot',
  config: {
    provider: 'anthropic',
    model: 'claude-opus-4-5-20251101',
    temperature: 0.7,
    systemPrompt: 'You are a helpful customer support agent...',
    traits: {
      personality: ['Helpful', 'Patient', 'Empathetic'],
      knowledge: ['Product documentation', 'Troubleshooting', 'FAQs'],
      tone: 'Friendly and professional'
    }
  }
});

// Load from JSON file
const response = await fetch('./agents/custom-agent.json');
if (!response.ok) throw new Error(`Failed to load agent config: HTTP ${response.status}`);
await worksona.loadAgent(await response.json());

REST API Server

Worksona.js includes a full-featured Express-based REST API server that exposes all library functionality via HTTP endpoints. The server provides 32+ endpoints organized into logical categories.

API Architecture

graph LR
  C[Client] -->|HTTP| AG[Agent Management]
  C -->|HTTP| QR[Query & Chat]
  C -->|HTTP| FU[File Upload]
  C -->|HTTP| TL[Tools]
  subgraph "API Server"
    AG --> WS[Worksona Library]
    QR --> WS
    FU --> WS
    TL --> WS
  end
  WS --> OAI[OpenAI]
  WS --> ANT[Anthropic]
  WS --> GOO[Google]
  style C fill:#7c3aed,color:#fff
  style WS fill:#2563eb,color:#fff

Key API Features

File Processing

Upload and process images, PDFs, DOCX, XLSX, CSV files

OCR Capabilities

Extract text from images using Tesseract

Document Parsing

Parse PDF, DOCX, XLSX with specialized libraries

Agent-Scoped Routing

Pattern: /api/agents/:agentId/:action/:object

Batch Processing

Process up to 10 queries in parallel

Webhook Support

Integrate with external services via webhooks
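The agent-scoped routing pattern `/api/agents/:agentId/:action/:object` can be filled in on the client with a small helper. This is a sketch only: `agentRoute` is a hypothetical name, and the `analyze` and `document` values in the usage line are placeholders, since the concrete actions and objects the server accepts are not listed here.

```javascript
// Fill the documented pattern /api/agents/:agentId/:action/:object.
// Each segment is URL-encoded so ids with spaces or slashes stay in one segment.
function agentRoute(agentId, action, object) {
  const segments = [agentId, action, object].map((segment) => {
    if (typeof segment !== 'string' || segment === '') {
      throw new TypeError('agentId, action, and object must be non-empty strings');
    }
    return encodeURIComponent(segment);
  });
  return `/api/agents/${segments.join('/')}`;
}

// 'analyze' and 'document' are placeholder values, not confirmed endpoints
console.log(agentRoute('research-analyst', 'analyze', 'document'));
// /api/agents/research-analyst/analyze/document
```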

Endpoint Categories (32+ endpoints)

  • Agent Management - Load, list, get, delete, chat
  • Query & Chat - Generic query, agent query, batch processing
  • File Processing - Upload, analyze, process with schema
  • Image Operations - Analyze with vision, generate with DALL-E
  • Document Processing - OCR, analyze documents
  • Tool System - DALL-E, web scraper, text-to-speech
  • Convenience Endpoints - Quick translate, ask shortcuts
  • Slash Commands - /ocr, /summarize, /translate, /extract-data
  • Webhooks - Incoming webhook handlers
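Because batch processing handles up to 10 queries in parallel, a client holding a longer list needs to split it first. The sketch below shows one way to do that. `chunk`, `runInBatches`, and `sendBatch` are hypothetical names, and `sendBatch` stands for whatever function posts one batch to the server, since the batch endpoint path and payload shape are not given here.

```javascript
const MAX_BATCH = 10; // documented limit: up to 10 queries per batch

// Split an array into consecutive slices of at most `size` items.
function chunk(items, size = MAX_BATCH) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Send batches one after another and collect results in the original order.
// `sendBatch` is any async function: (queries[]) => results[].
async function runInBatches(queries, sendBatch, size = MAX_BATCH) {
  const results = [];
  for (const batch of chunk(queries, size)) {
    results.push(...(await sendBatch(batch)));
  }
  return results;
}

const queries = Array.from({ length: 23 }, (_, i) => `question ${i + 1}`);
console.log(chunk(queries).map((b) => b.length)); // [ 10, 10, 3 ]
```

Sending batches sequentially rather than all at once also helps stay inside the rate limit on /api routes.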

Rate Limiting & Security

  • 100 requests per 15 minutes on /api routes
  • Optional API key authentication (X-API-Key header)
  • File size limit: 10MB per upload
  • Max 5 files per request
  • CORS enabled for development
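A client can track its own request budget to avoid hitting the limit of 100 requests per 15 minutes. The class below is a sketch of a fixed-window counter. The documentation does not say which algorithm the server uses, so the fixed window is an assumption, and `FixedWindowLimiter` is a hypothetical client-side helper rather than part of Worksona.js.

```javascript
// Client-side request budget: at most `limit` requests per `windowMs`.
// Fixed-window counting is an assumption; the server may count differently.
class FixedWindowLimiter {
  constructor(limit = 100, windowMs = 15 * 60 * 1000, now = Date.now) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
    this.windowStart = now();
    this.count = 0;
  }

  // Returns true and consumes one slot if a request may be sent now.
  tryAcquire() {
    const t = this.now();
    if (t - this.windowStart >= this.windowMs) {
      this.windowStart = t; // start a fresh window
      this.count = 0;
    }
    if (this.count >= this.limit) return false;
    this.count += 1;
    return true;
  }

  // Milliseconds until the current window ends.
  retryAfterMs() {
    return Math.max(0, this.windowStart + this.windowMs - this.now());
  }
}

const limiter = new FixedWindowLimiter();
console.log(limiter.tryAcquire()); // true
```

When `tryAcquire()` returns false, `retryAfterMs()` gives a reasonable delay before trying again.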

Quick Example

// Simple query via REST API
const response = await fetch('http://localhost:3000/api/query', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    agent: 'research-analyst',
    query: 'What is quantum computing?',
    options: { temperature: 0.7 }
  })
});

const data = await response.json();
console.log(data.data.response);
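The quick example can be wrapped in a reusable helper that adds the optional `X-API-Key` header and checks the HTTP status. This is a sketch under assumptions: `buildQueryRequest` and `queryAgent` are hypothetical names, the `data.response` path follows the example above, and the error handling reflects a general pattern, not the server's documented error format.

```javascript
// Build the URL and fetch options for POST /api/query.
// The X-API-Key header is added only when a key is supplied.
function buildQueryRequest(baseUrl, { agent, query, options = {} }, apiKey) {
  const headers = { 'Content-Type': 'application/json' };
  if (apiKey) headers['X-API-Key'] = apiKey;
  return {
    url: new URL('/api/query', baseUrl).toString(),
    init: {
      method: 'POST',
      headers,
      body: JSON.stringify({ agent, query, options }),
    },
  };
}

// Send the query and return the response text.
// `fetchImpl` is injectable so the helper can be tested without a server.
async function queryAgent(baseUrl, payload, apiKey, fetchImpl = fetch) {
  const { url, init } = buildQueryRequest(baseUrl, payload, apiKey);
  const response = await fetchImpl(url, init);
  if (!response.ok) {
    throw new Error(`Query failed with HTTP ${response.status}`);
  }
  const body = await response.json();
  return body.data.response; // same path as in the quick example
}

const request = buildQueryRequest(
  'http://localhost:3000',
  { agent: 'research-analyst', query: 'What is quantum computing?' },
  'my-key' // placeholder value
);
console.log(request.url); // http://localhost:3000/api/query
```

A caller would then write `await queryAgent('http://localhost:3000', { agent, query })` and handle the thrown error, for example when the rate limit is exceeded.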

Tooling Ecosystem

Worksona.js includes an extensible tool system that adds specialized capabilities beyond basic chat. Tools can be used standalone or enhanced by agents for improved results.

Tool System Architecture

graph TB
  subgraph "Tool Access"
    DT[Direct Tool Access<br/>/api/tools/:tool/:action]
    AT[Agent-Enhanced Tool<br/>/api/agents/:id/tools/:tool/:action]
  end
  subgraph "Available Tools"
    DALLE[DALL-E 3<br/>Image Generation]
    SCRAPE[Web Scraper<br/>Content Extraction]
    TTS[Text-to-Speech<br/>6 Voices]
  end
  DT --> DALLE
  DT --> SCRAPE
  DT --> TTS
  AT --> |Prompt Enhancement| DALLE
  AT --> |Content Analysis| SCRAPE
  AT --> TTS
  style DALLE fill:#10b981,color:#fff
  style SCRAPE fill:#f59e0b,color:#fff
  style TTS fill:#8b5cf6,color:#fff

Built-In Tools

DALL-E 3 Image Generator

Actions: generate, edit, variations

Agent Enhancement: Marketing agents can transform "logo" into detailed professional prompts

Web Scraper

Actions: fetch (extract text), extract (structured data)

Agent Enhancement: Research agents can analyze and summarize scraped content

Text-to-Speech

Actions: speak, generate

Voices: alloy, echo, fable, onyx, nova, shimmer

Direct vs Agent-Enhanced Tools

// Direct tool access - basic functionality
GET /api/tools/dalle/generate?prompt=sunset
// Returns: Basic sunset image

// Agent-enhanced tool access - improved results
GET /api/agents/marketing-agent/tools/dalle/generate?prompt=logo
// Agent transforms "logo" into:
// "Create a modern, professional logo design that embodies innovation...
//  using a sophisticated color palette of navy blue and silver..."
// Returns: Professional, detailed logo
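Both request forms follow from the two route patterns, `/api/tools/:tool/:action` and `/api/agents/:id/tools/:tool/:action`, so a single helper can build either URL. The sketch below is illustrative: `toolUrl` is a hypothetical name, and the base URL is taken from the REST quick example.

```javascript
// Build a direct or agent-enhanced tool URL.
// With agentId:    /api/agents/:id/tools/:tool/:action
// Without agentId: /api/tools/:tool/:action
function toolUrl(baseUrl, { tool, action, agentId, params = {} }) {
  const enc = encodeURIComponent;
  const path = agentId
    ? `/api/agents/${enc(agentId)}/tools/${enc(tool)}/${enc(action)}`
    : `/api/tools/${enc(tool)}/${enc(action)}`;

  const url = new URL(path, baseUrl);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, String(value)); // handles query-string escaping
  }
  return url.toString();
}

const base = 'http://localhost:3000';
console.log(toolUrl(base, { tool: 'dalle', action: 'generate', params: { prompt: 'sunset' } }));
// http://localhost:3000/api/tools/dalle/generate?prompt=sunset
console.log(toolUrl(base, {
  agentId: 'marketing-agent', tool: 'dalle', action: 'generate', params: { prompt: 'logo' },
}));
// http://localhost:3000/api/agents/marketing-agent/tools/dalle/generate?prompt=logo
```

Using `URLSearchParams` for the query string means prompts containing spaces or punctuation are escaped correctly.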

Document Processing Tools

Specialized tools for document handling:

  • PDF Parser - Extract text and metadata from PDF files
  • DOCX Parser - Process Microsoft Word documents
  • XLSX Parser - Read Excel spreadsheets
  • OCR Engine - Tesseract-powered text extraction from images
  • Markdown Renderer - Convert and render markdown content