Agent Webhooks

FleetLM talks to agents through simple HTTP webhooks. Register the endpoint once, then stream AI SDK UI message chunks (newline-delimited JSON) whenever a session message arrives. FleetLM forwards every chunk to connected clients in real time and compacts the full stream into a single assistant message when it finishes.

Register an agent

POST /api/agents
Content-Type: application/json

{
  "agent": {
    "id": "my-agent",
    "name": "My Agent",
    "origin_url": "https://agent.example.com",
    "webhook_path": "/webhook",
    "message_history_mode": "tail",
    "message_history_limit": 20,
    "timeout_ms": 30000,
    "debounce_window_ms": 500,
    "headers": {
      "X-API-Key": "secret"
    }
  }
}
  • message_history_mode: tail (default), last, or entire
  • message_history_limit: positive integer; caps the history sent in tail mode
  • timeout_ms: how long FleetLM waits for the webhook response, in milliseconds
  • debounce_window_ms: debounce window in milliseconds (default: 500)
  • headers: optional map of headers sent with every webhook request
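
From TypeScript, registration is a single fetch call. A sketch, assuming FleetLM is reachable at http://localhost:4000 (the base URL is ours) and a runtime with global fetch (Node 18+):

// Register the agent defined above against a FleetLM instance.
const res = await fetch("http://localhost:4000/api/agents", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    agent: {
      id: "my-agent",
      name: "My Agent",
      origin_url: "https://agent.example.com",
      webhook_path: "/webhook",
      message_history_mode: "tail",
      message_history_limit: 20,
      timeout_ms: 30000,
      debounce_window_ms: 500,
      headers: { "X-API-Key": "secret" },
    },
  }),
});
if (!res.ok) throw new Error(`Agent registration failed: ${res.status}`);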

Webhook Debouncing

FleetLM batches rapid message bursts with a trailing debounce to reduce agent load and webhook traffic:

  • When a user sends a message, FleetLM schedules a webhook dispatch after debounce_window_ms
  • If another message arrives before the timer fires, the timer resets
  • When the timer finally expires, the agent receives all accumulated messages in a single webhook call

Example: a user sends 10 messages 300 ms apart with debounce_window_ms = 500:

  • Messages 1-10 arrive over roughly 2.7 seconds (9 gaps × 300 ms)
  • Each message resets the 500 ms timer
  • 500 ms after the last message arrives, the timer fires
  • The agent receives one webhook with all 10 messages batched together
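
A small sketch of this trailing-debounce behaviour (illustrative only, not FleetLM's implementation; the names and console output are ours):

// Trailing debounce: every new message resets the timer, so a batch is
// dispatched only after a quiet gap of `windowMs` milliseconds.
function makeDebouncedDispatcher<T>(
  windowMs: number,
  dispatch: (batch: T[]) => void
) {
  let pending: T[] = [];
  let timer: ReturnType<typeof setTimeout> | undefined;

  return (message: T) => {
    pending.push(message);
    if (timer) clearTimeout(timer); // a new message resets the window
    timer = setTimeout(() => {
      const batch = pending;
      pending = [];
      dispatch(batch); // one webhook call carries the whole batch
    }, windowMs);
  };
}

// Two messages inside the 500 ms window dispatch together.
const enqueue = makeDebouncedDispatcher<string>(500, (batch) =>
  console.log(`dispatching ${batch.length} message(s)`)
);
enqueue("hello");
enqueue("world");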

Benefits:

  • Reduces webhook calls (10 messages → 1 webhook)
  • Mirrors natural human ↔ agent interaction patterns
  • Configurable per-agent for different use cases

Tuning:

  • debounce_window_ms: 0 — Immediate dispatch (no batching)
  • debounce_window_ms: 500 — Default, good for most cases
  • debounce_window_ms: 2000 — High batching for slow-typing users

Webhook request payload

{
  "session_id": "01HZXAMPLE12345",
  "agent_id": "my-agent",
  "user_id": "alice",
  "messages": [
    {
      "seq": 1,
      "sender_id": "alice",
      "kind": "text",
      "content": { "text": "Hello!" },
      "inserted_at": "2024-10-03T12:00:00Z"
    }
  ]
}

Session metadata is stored internally but not forwarded to the webhook.
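
For reference, a TypeScript shape for this payload, derived from the example above (the interface names are ours, and content fields for non-text kinds are not covered):

interface SessionMessage {
  seq: number;
  sender_id: string;
  kind: string; // "text" in the example above
  content: { text?: string } & Record<string, unknown>;
  inserted_at: string; // ISO 8601 timestamp
}

interface WebhookPayload {
  session_id: string;
  agent_id: string;
  user_id: string;
  messages: SessionMessage[];
}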

Message history modes

Mode     Behaviour
tail     Last N messages (N = message_history_limit)
last     Most recent message; limit must stay > 0
entire   Full conversation; limit must stay > 0

Respond with AI SDK JSONL

Reply with status 200 and newline-delimited JSON. Each line must be a valid UI message chunk. FleetLM validates the payload, broadcasts it to WebSocket subscribers via stream_chunk, and only persists a message when it receives a terminal chunk (finish or abort).

HTTP/1.1 200 OK
Content-Type: application/json

{"type":"start","messageId":"msg_123"}
{"type":"text-start","id":"part_1"}
{"type":"text-delta","id":"part_1","delta":"Thinking"}
{"type":"text-delta","id":"part_1","delta":"..."}
{"type":"text-end","id":"part_1"}
{"type":"finish","messageMetadata":{"latency_ms": 1800}}

Common chunk types include:

  • start, finish, abort – lifecycle markers for the message.
  • text-*, reasoning-* – streaming natural language and reasoning traces.
  • tool-* – tool call arguments/results (static and dynamic tools).
  • data-*, file, source-* – structured attachments such as charts or citations.

FleetLM stores the final, compacted assistant message with kind: "assistant" and the collected parts array. Any chunk the parser cannot recognise is dropped and recorded in telemetry (:invalid_json, :missing_type).
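
Assuming the compacted message reuses the envelope from the request payload above (the exact stored fields are not specified here, so treat this as a sketch), the stream in the previous example might compact to:

{
  "seq": 2,
  "sender_id": "my-agent",
  "kind": "assistant",
  "content": {
    "parts": [
      { "type": "text", "text": "Thinking..." }
    ]
  },
  "inserted_at": "2024-10-03T12:00:02Z"
}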