Agent

POST /v1/agent

Overview

This endpoint uses the Kimi K2 Thinking model for agentic conversations with function calling. It excels at reasoning and tool use, returning tool calls for the caller to execute. Important: the endpoint returns tool calls but does not execute them; the caller is responsible for running the tools and passing the results back in subsequent requests. Use this endpoint when you need:
  • Advanced reasoning and planning
  • Function/tool calling
  • Multi-step problem solving
  • External API integration
  • Streaming or non-streaming responses

Use cases

  • API integration (weather, search, databases)
  • Multi-step task planning
  • Data retrieval and manipulation
  • External system interaction
  • Complex reasoning with tools

Model details

  • Model: Kimi K2 Thinking (via Fireworks)
  • Advanced reasoning capabilities
  • Native tool calling support
  • Caller-executed tools pattern

Request example

curl -X POST "https://api.incredible.one/v1/agent" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What is the weather in San Francisco?"}
    ],
    "system_prompt": "You are a helpful assistant with access to tools.",
    "tools": [
      {
        "name": "get_weather",
        "description": "Get weather information for a location",
        "input_schema": {
          "type": "object",
          "properties": {
            "location": {"type": "string", "description": "City name"}
          },
          "required": ["location"]
        }
      }
    ]
  }'
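
The same request in Python, as a minimal sketch using the requests library (YOUR_API_KEY is a placeholder for your key):

import requests

API_URL = "https://api.incredible.one/v1/agent"

# Tool definition, reused from the curl example above
weather_tool = {
    "name": "get_weather",
    "description": "Get weather information for a location",
    "input_schema": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name"}
        },
        "required": ["location"]
    }
}

resp = requests.post(
    API_URL,
    headers={
        "Authorization": "Bearer YOUR_API_KEY",  # placeholder key
        "Content-Type": "application/json"
    },
    json={
        "messages": [
            {"role": "user", "content": "What is the weather in San Francisco?"}
        ],
        "tools": [weather_tool]
    },
    timeout=60
)
resp.raise_for_status()
print(resp.json())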

Request Body

  • messages array (required) — Conversation history with user and assistant messages
    • role string — Either "user" or "assistant"
    • content string — Message content
  • tools array (required) — List of tool definitions available to the agent
    • name string — Tool name (should be descriptive)
    • description string — What the tool does
    • input_schema object — JSON schema defining tool inputs
  • system_prompt string (optional) — System prompt to guide the agent (default: "You are a helpful assistant with access to tools.")
  • stream boolean (optional) — Enable streaming response via Server-Sent Events (default: false)

Response

Non-streaming (default)

{
  "success": true,
  "response": "I'll check the weather in San Francisco for you.",
  "tool_calls": [
    {
      "id": "call_123",
      "name": "get_weather",
      "inputs": {
        "location": "San Francisco"
      }
    }
  ]
}
If no tools are called:
{
  "success": true,
  "response": "I'm ready to help! Please let me know what you need.",
  "tool_calls": null
}
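
A minimal sketch of branching on the non-streaming response (assumes result is the parsed JSON body from the request above):

def handle_agent_response(result: dict) -> None:
    # tool_calls is null (None in Python) when the model answered directly
    if result.get("tool_calls"):
        for call in result["tool_calls"]:
            # Dispatch each call to your own tool implementation here
            print(f"Run {call['name']} (id={call['id']}) with inputs {call['inputs']}")
    else:
        # No tools requested: response is the final answer
        print(result["response"])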

Streaming (stream=true)

Server-Sent Events format:
data: {"thinking": "The user wants weather information..."}
data: {"content": "I'll check"}
data: {"content": " the weather"}
data: {"tool_call": {"id": "call_123", "name": "get_weather", "inputs": {"location": "San Francisco"}}}
data: {"tokens": 245}
data: {"done": true}

Tool execution pattern

This endpoint follows a "caller-executed tools" pattern:
  1. Agent request → Returns tool calls
  2. Caller executes → Run the tools in your code
  3. Provide results → Send results back in next request as assistant message
  4. Agent responds → Uses tool results to answer
Example multi-turn flow:
# Step 1: Agent decides to call tool
response1 = call_agent({
    "messages": [{"role": "user", "content": "Weather in SF?"}],
    "tools": [weather_tool]
})
# Returns: tool_calls=[{name: "get_weather", inputs: {location: "SF"}}]

# Step 2: Execute tool
weather_result = get_weather("SF")

# Step 3: Provide result back
response2 = call_agent({
    "messages": [
        {"role": "user", "content": "Weather in SF?"},
        {"role": "assistant", "content": f"Tool result: {weather_result}"}
    ],
    "tools": [weather_tool]
})
# Returns: response="The weather in SF is sunny and 72°F"
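
In the flow above, call_agent, weather_tool, and get_weather are the caller's own code, not part of the API. A minimal sketch of the call_agent helper, assuming the requests library and a placeholder YOUR_API_KEY:

import requests

def call_agent(payload: dict) -> dict:
    """POST a payload to /v1/agent and return the parsed JSON body."""
    resp = requests.post(
        "https://api.incredible.one/v1/agent",
        headers={
            "Authorization": "Bearer YOUR_API_KEY",  # placeholder key
            "Content-Type": "application/json"
        },
        json=payload,
        timeout=60
    )
    resp.raise_for_status()
    return resp.json()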

vs Other Endpoints

Feature         /v1/agent                  /v1/conversation     /v1/chat-completion
Tool calling    ✅ Yes (caller-executed)   ❌ No                ✅ Yes (auto-executed)
Reasoning       ✅ Advanced                ⚠️ Basic             ✅ Configurable
Model           Kimi K2 Thinking           DeepSeek v3.1        Configurable
Best for        Tool use + reasoning       Fast conversations   Full agentic

Body

application/json

  • messages object[] (required) — Conversation history with user and assistant messages. Minimum length: 1
  • tools object[] (required) — List of tool definitions available to the agent. Minimum length: 1
  • system_prompt string (optional) — System prompt to guide the agent. Defaults to "You are a helpful assistant with access to tools."
  • stream boolean (optional, default: false) — Enable streaming response via Server-Sent Events

Response

Successful agent response:

  • success boolean (required) — Whether the request was successful
  • response string (required) — The assistant's response
  • tool_calls object[] — List of tool calls made by the agent (null if no tools were called)