
Authentication
Assisters uses API keys for authentication. Include your key in every request via the Authorization header using the Bearer scheme.
curl https://api.assisters.com/v1/models \
-H "Authorization: Bearer YOUR_API_KEY"
Create a key with POST /v1/keys and a body such as { "name": "dev-key-01" }. To rotate a key, delete it with DELETE /v1/keys/{key_id}, then create a new one. Requests that exceed your rate limit return HTTP 429.
Models
List available AI models and their capabilities.
Request
GET /v1/models
Response
{
  "models": [
    {
      "id": "gpt-4.1-mini",
      "name": "GPT-4.1 Mini",
      "max_tokens": 128000,
      "supports": ["chat", "embeddings", "reasoning"]
    }
  ]
}
Use Case: Select a model based on token limits or supported features.
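Programmatically, selecting a model from the documented response shape might look like the sketch below (pick_model is an illustrative helper, not part of the SDK):

```python
def pick_model(models_response, feature, min_tokens=0):
    """Return the id of the first model supporting `feature`
    with a context window of at least `min_tokens` tokens."""
    for model in models_response["models"]:
        if feature in model["supports"] and model["max_tokens"] >= min_tokens:
            return model["id"]
    return None

# Sample payload mirroring the documented /v1/models response:
sample = {
    "models": [
        {"id": "gpt-4.1-mini", "name": "GPT-4.1 Mini",
         "max_tokens": 128000, "supports": ["chat", "embeddings", "reasoning"]}
    ]
}

print(pick_model(sample, "chat", min_tokens=100000))  # gpt-4.1-mini
print(pick_model(sample, "image"))                    # None
```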
Chat Completions
Generate AI responses for chat interactions.
Request
POST /v1/chat/completions
Body
{
  "model": "gpt-4.1-mini",
  "messages": [
    { "role": "user", "content": "Explain quantum computing." }
  ],
  "temperature": 0.7,
  "max_tokens": 1000
}
Response
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Quantum computing..."
      }
    }
  ]
}
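The assistant's reply sits under choices[0].message.content; in Python, extracting it from the documented response shape looks like:

```python
# Response shape copied from the example above.
response = {
    "choices": [
        {"message": {"role": "assistant", "content": "Quantum computing..."}}
    ]
}

reply = response["choices"][0]["message"]["content"]
print(reply)  # Quantum computing...
```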
Parameters
- model: Required. The ID of the model to use.
- messages: Required. Array of { role, content } pairs (roles such as user and assistant).
- temperature: Float (0–1). Lower values are more deterministic.
- max_tokens: Integer. Maximum response length in tokens.
Streaming Responses
Set stream: true to receive chunks as they’re generated.
const response = await fetch("https://api.assisters.com/v1/chat/completions", {
  method: "POST",
  headers: { "Authorization": "Bearer YOUR_KEY", "Content-Type": "application/json" },
  body: JSON.stringify({ model: "gpt-4.1-mini", messages: [{ role: "user", content: "Hello" }], stream: true })
});
// Read chunks as they arrive
const reader = response.body.getReader();
for (let chunk = await reader.read(); !chunk.done; chunk = await reader.read()) {
  console.log(new TextDecoder().decode(chunk.value));
}
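In Python, a stream is typically consumed by iterating over response lines. The exact chunk framing isn't specified above, so this sketch assumes newline-delimited JSON objects, each carrying a delta field:

```python
import json

def parse_stream(lines):
    """Yield text deltas from newline-delimited JSON chunks (assumed framing)."""
    for line in lines:
        if not line.strip():
            continue
        yield json.loads(line).get("delta", "")

# Real usage would look roughly like (untested sketch):
#   resp = requests.post(url, headers=headers, json=payload, stream=True)
#   for delta in parse_stream(resp.iter_lines(decode_unicode=True)): ...

# Local demonstration with sample chunks:
sample = ['{"delta": "Hel"}', '', '{"delta": "lo"}']
text = "".join(parse_stream(sample))
print(text)  # Hello
```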
Embeddings
Convert text into vector embeddings for semantic search or clustering.
Request
POST /v1/embeddings
Body
{
  "model": "text-embedding-3-small",
  "input": "The quick brown fox jumps over the lazy dog."
}
Response
{
  "embedding": [0.0012, -0.0045, ..., 0.0078],
  "model": "text-embedding-3-small",
  "usage": { "tokens": 12 }
}
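A common pattern is ranking texts by cosine similarity between their embedding vectors. A self-contained sketch with stand-in vectors (real vectors come from the endpoint above):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Tiny stand-ins for real embedding vectors:
query = [0.1, 0.2, 0.3]
doc_a = [0.1, 0.2, 0.3]
doc_b = [-0.3, 0.1, -0.2]

print(cosine(query, doc_a))  # ≈ 1.0 (same direction)
print(cosine(query, doc_b) < cosine(query, doc_a))  # True
```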
Use Cases
- Semantic search: embed documents and queries, then rank by vector similarity.
- Clustering: group related texts by distance between their embeddings.
Tool Calling
Extend chat completions with function calling for real-world integrations.
Request
{
  "model": "gpt-4.1-mini",
  "messages": [{ "role": "user", "content": "What’s the weather in Paris?" }],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a location.",
        "parameters": {
          "type": "object",
          "properties": {
            "location": { "type": "string" }
          }
        }
      }
    }
  ]
}
Response
{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": null,
      "tool_calls": [{
        "id": "call_123",
        "type": "function",
        "function": { "name": "get_weather", "arguments": "{\"location\": \"Paris\"}" }
      }]
    }
  }]
}
Handling Tool Calls
When a response contains a tool_calls array, execute each requested function on your side, then send the result back as a tool message referencing the call ID:
{
  "role": "tool",
  "content": "{\"temperature\": 15, \"unit\": \"C\"}",
  "tool_call_id": "call_123"
}
Supported Tools
- web_search: Real-time web search.
- code_interpreter: Execute Python code.
Built-in tools are enabled by including them in the tools parameter.
Reasoning
Enable step-by-step problem-solving for complex queries.
Request
{
  "model": "gpt-4.1-mini",
  "messages": [{ "role": "user", "content": "Solve 2x + 3 = 7." }],
  "reasoning": true,
  "max_tokens": 2000
}
Response
{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "Step 1: Subtract 3 from both sides → 2x = 4.\nStep 2: Divide by 2 → x = 2.",
      "reasoning": "Derived from algebraic manipulation."
    }
  }]
}
Use Cases
- Step-by-step solutions for math and logic problems.
- Complex queries where intermediate reasoning improves accuracy.
Error Handling
Assisters uses standard HTTP status codes. Key errors:
| Code | Error Type | Example |
|---|---|---|
| 400 | Bad Request | Missing model parameter. |
| 401 | Unauthorized | Invalid API key. |
| 404 | Not Found | Unknown model ID. |
| 429 | Too Many Requests | Rate limit exceeded. |
| 500 | Internal Server Error | Model inference failed. |
Error Response Format
{
  "error": {
    "type": "invalid_request_error",
    "message": "Model not found.",
    "param": "model",
    "code": "model_not_found"
  }
}
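Transient errors (429, 500) are good candidates for retries with exponential backoff plus jitter. A minimal delay helper (backoff_delay is illustrative, not part of the SDK; wrap your actual request call in a retry loop that sleeps for these delays):

```python
import random

def backoff_delay(attempt, base=1.0, cap=30.0, jitter=0.5):
    """Exponential backoff: 1s, 2s, 4s, ... capped, plus up to `jitter` seconds."""
    return min(base * (2 ** attempt), cap) + random.uniform(0, jitter)

# The first three delays fall in [1, 1.5), [2, 2.5), [4, 4.5) seconds:
for attempt in range(3):
    print(round(backoff_delay(attempt), 2))
```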
Retry Logic
- On 429, implement exponential backoff (e.g., 1s, 2s, 4s).
- On 500, retry up to 3 times with jitter (e.g., +0.5s).
SDKs
Python
pip install assisters

from assisters import Assisters
client = Assisters(api_key="YOUR_KEY")
response = client.chat.completions.create(model="gpt-4.1-mini", messages=[{"role": "user", "content": "Hello"}])
print(response.choices[0].message.content)
JavaScript/TypeScript
npm install @assisters/sdk

import Assisters from '@assisters/sdk';
const client = new Assisters({ apiKey: "YOUR_KEY" });
const response = await client.chat.completions.create({ model: "gpt-4.1-mini", messages: [{ role: "user", content: "Hello" }] });
console.log(response.choices[0].message.content);
Other SDKs
- Go: github.com/assisters/go-sdk
- Ruby: gem install assisters-ruby
Webhooks
Subscribe to real-time events (e.g., chat completions, errors).
Setup
Register a webhook with POST /v1/webhooks:
{
  "url": "https://your-server.com/events",
  "events": ["chat.completion", "model.failed"]
}
During setup, Assisters verifies your endpoint via GET /webhooks/verify with a challenge token. Delivered event payloads look like:
{
  "event": "chat.completion",
  "data": { "id": "chat_123", "status": "completed" }
}
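Receiving events then reduces to parsing the JSON body and dispatching on the event field. A minimal sketch (handler behavior is illustrative):

```python
import json

def handle_event(raw_body):
    """Dispatch a webhook payload by its `event` field."""
    event = json.loads(raw_body)
    if event["event"] == "chat.completion":
        return f"completion {event['data']['id']}: {event['data']['status']}"
    if event["event"] == "model.failed":
        return f"model failure: {event['data']}"
    return "ignored"

# Payload copied from the documented example:
payload = '{"event": "chat.completion", "data": {"id": "chat_123", "status": "completed"}}'
print(handle_event(payload))  # completion chat_123: completed
```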
Security
Serve your webhook endpoint over HTTPS and verify that incoming events genuinely originate from Assisters before acting on them.
Optimization
Cache embeddings for repeated inputs to avoid redundant API calls:
from cachetools import cached, TTLCache

cache = TTLCache(maxsize=1000, ttl=3600)

@cached(cache)
def get_embedding(text):
    response = client.embeddings.create(model="text-embedding-3-small", input=text)
    return response.embedding
Batching multiple inputs into one embeddings request reduces round trips:
{
  "model": "text-embedding-3-small",
  "input": ["text 1", "text 2", "text 3"]
}
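When you have more inputs than fit in a single request, split them into batches client-side. The maximum batch size is not specified above, so treat the size here as a placeholder:

```python
def batches(items, size):
    """Yield successive slices of `items` of at most `size` elements."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

texts = [f"text {n}" for n in range(7)]
for batch in batches(texts, 3):
    # One API call per batch, e.g.:
    # client.embeddings.create(model="text-embedding-3-small", input=batch)
    print(len(batch))  # 3, 3, 1
```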
For cost-sensitive workloads, prefer smaller models (e.g., gpt-4.1-mini instead of gpt-4.1-ultra).
Privacy and Compliance
- Data deletion: remove stored data with DELETE /v1/data/{id}.
- PII redaction: set mask: true in requests to redact personally identifiable information.
- Audit logs: access logs via GET /v1/audit?start=2024-01-01&end=2024-01-31.
Migration Guide
- The legacy /v1/completions endpoint is replaced by /v1/chat/completions.
- Replace prompt with a messages array:
  - { "prompt": "Hello" }
  + { "messages": [{ "role": "user", "content": "Hello" }] }
- Update model IDs (e.g., gpt-4.1-mini instead of gpt-3.5-turbo).
- temperature now defaults to 1.0 (was 0.5).
- max_tokens includes response tokens (previously excluded).
- Use the Idempotency-Key header for safe retries:
POST /v1/chat/completions
Idempotency-Key: abc123
- Monitor usage via /v1/metrics.
- Use /v1/models/{model}/test for canary deployments.
Conclusion
Assisters’ API empowers you to integrate AI seamlessly into your applications, whether you’re building chatbots, search engines, or automation tools. By leveraging the endpoints, tools, and optimizations outlined here, you can cut development time dramatically while ensuring scalability and reliability. Start with the quickstart guide and experiment with the interactive playground to see what’s possible.