Helios uses concurrency limiting to ensure fair usage and system stability. This guide explains how limits work and how to handle them.

How Concurrency Limits Work

Instead of traditional rate limits (requests per minute), Helios limits the number of concurrent requests your account can have processing at once.
Why concurrency limits? Health data analysis takes several minutes per request. Traditional rate limits don’t work well for long-running operations, so we limit concurrent requests instead.

Default Limits

| Agent Type | Default Limit |
|---|---|
| Light Agent | 7 concurrent requests |
| Deep Agent | 7 concurrent requests |
| Lab Results Agent | 7 concurrent requests |
Limits are tracked separately for each agent type. You can have 7 Light Agent requests, 7 Deep Agent requests, and 7 Lab Results Agent requests running simultaneously.
Need higher limits? Contact support@heliosinc.xyz to discuss custom limits for your account.
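Because limits are tracked separately per agent type, a client can keep an independent in-flight counter for each type. A minimal sketch (the type keys below are illustrative labels, not official API identifiers):

```javascript
// Per-agent-type in-flight tracking, mirroring how Helios counts
// concurrency. The keys here are illustrative, not API values.
const LIMITS = { light: 7, deep: 7, labResults: 7 };
const inFlight = { light: 0, deep: 0, labResults: 0 };

function canSubmit(agentType) {
  // A slot is free only within that agent type's own limit.
  return inFlight[agentType] < LIMITS[agentType];
}

function trackStart(agentType) { inFlight[agentType]++; }
function trackEnd(agentType) { inFlight[agentType]--; }
```

Saturating one agent type this way leaves the others untouched, matching the separate-tracking behavior described above.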

Response Headers

Every API response includes headers showing your current concurrency status:
| Header | Description | Example |
|---|---|---|
| X-Concurrent-Limit | Maximum concurrent requests allowed | 7 |
| X-Concurrent-Active | Currently active requests (including this one) | 3 |
| X-Concurrent-Remaining | Available slots | 4 |
These headers are included on every API response (200 OK, 429, etc.), not just rate limit errors. This allows you to monitor your concurrency usage proactively.
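Since the headers arrive on every response, a small helper can turn them into numbers for monitoring. A sketch (the helper name is ours, not part of the API):

```javascript
// Parse the three concurrency headers from any fetch Response's headers.
// Returns NaN for any header that is missing.
function readConcurrency(headers) {
  return {
    limit: parseInt(headers.get('X-Concurrent-Limit'), 10),
    active: parseInt(headers.get('X-Concurrent-Active'), 10),
    remaining: parseInt(headers.get('X-Concurrent-Remaining'), 10),
  };
}
```

This works on both success and 429 responses, since the headers are present on each.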

Example Response Headers (200 OK)

HTTP/1.1 200 OK
X-Concurrent-Limit: 7
X-Concurrent-Active: 3
X-Concurrent-Remaining: 4
Content-Type: application/json

Handling Rate Limit Errors

When you exceed your concurrency limit, the API returns a 429 Too Many Requests response:
{
  "error": "Too many concurrent requests",
  "message": "You have 7 active light agent requests. Maximum is 7.",
  "activeCount": 7,
  "limit": 7
}
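The body fields can be surfaced in logs before deciding whether to retry. A small sketch assuming the JSON shape shown above (`describeRateLimit` is a name we made up):

```javascript
// Summarize a 429 body for logging. Assumes the error JSON shape
// documented above: { error, message, activeCount, limit }.
async function describeRateLimit(response) {
  const err = await response.json();
  return `${err.error}: ${err.activeCount}/${err.limit} slots in use`;
}
```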

Response Headers on 429

HTTP/1.1 429 Too Many Requests
X-Concurrent-Limit: 7
X-Concurrent-Active: 7
X-Concurrent-Remaining: 0
Retry-After: 60
Content-Type: application/json
The Retry-After header indicates how many seconds to wait before retrying.

Best Practices

Check the X-Concurrent-Remaining header on every response to know when you’re approaching your limit.
const response = await fetch('https://api.heliosai.health/api/v1/agent', {
  method: 'POST',
  headers: { 'x-api-key': apiKey, 'Content-Type': 'application/json' },
  body: JSON.stringify(requestData)
});

const remaining = parseInt(response.headers.get('X-Concurrent-Remaining'), 10);
if (remaining < 2) {
  console.warn('Approaching concurrency limit');
}
When you receive a 429, wait before retrying:
async function submitWithRetry(requestData, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch('https://api.heliosai.health/api/v1/agent', {
      method: 'POST',
      headers: { 'x-api-key': apiKey, 'Content-Type': 'application/json' },
      body: JSON.stringify(requestData)
    });
    
    if (response.status === 429) {
      const retryAfter = response.headers.get('Retry-After') || '60';
      const waitTime = parseInt(retryAfter) * 1000;
      console.log(`Rate limited. Waiting ${retryAfter}s before retry...`);
      await new Promise(r => setTimeout(r, waitTime));
      continue;
    }
    
    return response;
  }
  throw new Error('Max retries exceeded');
}
For high-volume use cases, implement a queue to control how many requests you submit:
class RequestQueue {
  constructor(maxConcurrent = 6) {
    this.maxConcurrent = maxConcurrent;
    this.active = 0;
    this.queue = [];
  }
  
  async submit(requestData) {
    if (this.active >= this.maxConcurrent) {
      await new Promise(resolve => this.queue.push(resolve));
    }
    
    this.active++;
    try {
      return await fetch('https://api.heliosai.health/api/v1/agent', {
        method: 'POST',
        headers: { 'x-api-key': apiKey, 'Content-Type': 'application/json' },
        body: JSON.stringify(requestData)
      });
    } finally {
      this.active--;
      if (this.queue.length > 0) {
        this.queue.shift()();
      }
    }
  }
}
Store the runId of submitted requests so you can track what’s in progress:
const activeRequests = new Map();

// When submitting
const { runId } = await response.json();
activeRequests.set(runId, { submittedAt: new Date(), data: requestData });

// When receiving webhook
app.post('/webhook', (req, res) => {
  const { runId, status } = req.body;
  activeRequests.delete(runId);
  res.status(200).send('OK');
});

Slot Release

Concurrency slots are automatically released when:
  1. Processing completes - Successfully or with an error
  2. Webhook is delivered - After the final webhook is sent
  3. Timeout occurs - After ~14 minutes if processing hangs (safety mechanism)
If your webhook endpoint is slow to respond, slots may take slightly longer to release. Ensure your webhook returns 200 OK quickly.

Next Steps