Get your Incredible API key
Generate your API key to start using this endpoint

Overview

The Conversation API is designed for multi-turn dialogues where context and conversation history matter. Unlike the Answer API, which handles single questions, Conversation maintains the flow of discussion across multiple exchanges, making it ideal for chatbots, virtual assistants, and interactive applications.

Key characteristics:
  • Stateless but context-aware - You provide the full conversation history with each request
  • Optimized for dialogue - Faster and more cost-effective than the Agent endpoint when you don’t need tool calling
  • Flexible context - Supports alternating user and assistant messages
  • Document-aware - Can reference uploaded files for context-rich conversations
When to use Conversation vs other endpoints:
  • Use Conversation for multi-turn dialogue without needing tools or function calling
  • Use Answer for single-question scenarios or stateless Q&A
  • Use Agent when you need autonomous tool calling and complex workflows

Using Conversation

The Conversation endpoint is designed for multi-turn dialogues where you maintain conversation context across exchanges. You send the full message history with each request, and the API generates contextually aware responses. This gives you complete control over conversation management, allowing features like context pruning or conversation forking.
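Because the endpoint is stateless, your application owns the message history and resends it on every request. A minimal sketch of that append-and-resend pattern in plain Python (no network calls; the actual Incredible client call is elided):

```python
def append_turn(history, role, content):
    """Return a new history list with one more message appended.

    Returning a new list instead of mutating makes conversation
    forking trivial: two branches can share a common prefix.
    """
    return history + [{"role": role, "content": content}]

# Build up a dialogue turn by turn; each API request would send
# the full list as the `messages` parameter.
history = []
history = append_turn(history, "user", "Hello!")
history = append_turn(history, "assistant", "Hi there! How can I help you today?")

fork_a = append_turn(history, "user", "Tell me a joke")        # branch A
fork_b = append_turn(history, "user", "What's the weather?")   # branch B

print(len(fork_a), len(fork_b))  # both branches hold 3 messages
```

Both forks share the first two messages, so a client can explore alternative continuations without copying or re-fetching anything.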

Examples

from incredible_python import Incredible

client = Incredible(api_key="YOUR_API_KEY")

response = client.conversation(
    messages=[
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hi there! How can I help you today?"},
        {"role": "user", "content": "Tell me a joke"}
    ]
)

print(response.response)

What are messages? An array of message objects representing the conversation history. Each message has a role (either "user" or "assistant") and content. Messages are processed in order, so the model can understand the full context of the conversation.

Best practices:
  • Always provide enough context for the model to understand the current query
  • Consider implementing conversation summarization for very long dialogues
  • Alternate between user and assistant messages naturally
  • Don’t include messages that are no longer relevant to the current topic
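One simple way to apply these practices to long dialogues is client-side pruning: keep only the most recent turns and make sure the window still opens with a user message. A hedged sketch in plain Python (no Incredible-specific behavior assumed):

```python
def prune_history(messages, max_messages=10):
    """Keep only the most recent messages, trimmed so the window
    still starts with a user turn (preserving natural alternation)."""
    if len(messages) <= max_messages:
        return messages
    pruned = messages[-max_messages:]
    # Drop a leading assistant message so the window opens with "user".
    while pruned and pruned[0]["role"] != "user":
        pruned = pruned[1:]
    return pruned

# Example: a 6-message dialogue trimmed to a window of at most 3
dialogue = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "Tell me about pruning"},
    {"role": "assistant", "content": "Sure..."},
    {"role": "user", "content": "Go on"},
    {"role": "assistant", "content": "..."},
]
window = prune_history(dialogue, max_messages=3)
print(len(window))  # 2: the trailing user/assistant pair survives
```

For dialogues where old turns still matter, summarization (replacing the pruned prefix with a single summary message) is the usual next step beyond this.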
Optional: System Prompts - You can include a system_prompt to define the AI’s behavior, personality, and constraints. For example: "You are a helpful customer service representative for Acme Corp." or "You are a technical expert who explains concepts in simple terms." The system prompt shapes the assistant’s tone and expertise level throughout the conversation.

Optional: Streaming - Set stream: true to receive responses in real-time as they’re generated. See the Streaming section below for details.
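These optional fields compose into a single request body. A sketch of a payload builder (the field names system_prompt and stream come from this page; everything else here is illustrative, not the official client API):

```python
def build_request(messages, system_prompt=None, stream=False):
    """Assemble a Conversation request body from its optional parts."""
    payload = {"messages": messages}
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    if stream:
        payload["stream"] = True
    return payload

req = build_request(
    [{"role": "user", "content": "Explain APIs simply"}],
    system_prompt="You are a technical expert who explains concepts in simple terms.",
)
print(sorted(req))  # ['messages', 'system_prompt']
```

Omitting unset options keeps the request minimal and lets the server apply its defaults.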

Attaching Files to Conversations

Files can be attached to individual messages within a conversation, allowing the AI to reference document content when generating responses. This is useful for document Q&A, report analysis, and context-specific assistance. To provide document context, upload a file and reference its ID in a message:
from incredible_python import Incredible

client = Incredible(api_key="YOUR_API_KEY")

# Upload a file
with open("report.pdf", "rb") as f:
    file = client.files.upload(file=f)

# Use the file in a conversation
response = client.conversation(
    messages=[
        {"role": "user", "content": "What's in this document?", "file_ids": [file.file_id]},
        {"role": "assistant", "content": "This document contains a quarterly financial report."},
        {"role": "user", "content": "What were the key findings?"}
    ]
)

print(response.response)

Streaming Responses

Streaming creates a more natural, chat-like experience by delivering responses in real-time as they’re generated. Instead of waiting for the complete message, your users see text appearing progressively, similar to how a human types.

Benefits of streaming in conversations:
  • Reduced perceived latency - Users see progress immediately
  • Better engagement - The typing effect feels more natural and interactive
  • Improved UX - Users can start reading while the response is still being generated
  • Cancellation support - Users can interrupt long responses they don’t need
Enable streaming by setting stream: true in your request:
from incredible_python import Incredible

client = Incredible(api_key="YOUR_API_KEY")

# Stream a conversation
stream = client.conversation(
    messages=[
        {"role": "user", "content": "Tell me a story about a robot"},
        {"role": "assistant", "content": "Once upon a time..."},
        {"role": "user", "content": "What happened next?"}
    ],
    stream=True
)

# Process streaming chunks
for chunk in stream:
    if hasattr(chunk, 'content') and chunk.content:
        print(chunk.content, end='', flush=True)
    if hasattr(chunk, 'done') and chunk.done:
        print("\n[Stream complete]")
        break

Stream Event Types:
  • content - Text chunks as they’re generated
  • thinking - Internal reasoning process (if available)
  • done - Signals completion of the stream
  • error - Any errors that occurred during generation
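The event types above can be routed through a small dispatcher. A sketch using plain dicts to stand in for chunk objects (the real SDK's chunk attributes may differ from this shape):

```python
def handle_chunk(chunk, parts):
    """Route one stream event; append text to `parts`, return True when done."""
    kind = chunk.get("type")
    if kind == "content":
        parts.append(chunk["content"])
    elif kind == "thinking":
        pass  # internal reasoning; typically hidden from end users
    elif kind == "error":
        raise RuntimeError(chunk.get("message", "stream error"))
    return kind == "done"

# Simulated stream of events, in arrival order
events = [
    {"type": "content", "content": "Once upon "},
    {"type": "content", "content": "a time..."},
    {"type": "done"},
]
parts = []
for event in events:
    if handle_chunk(event, parts):
        break
print("".join(parts))  # Once upon a time...
```

Accumulating chunks into a list and joining once at the end avoids repeated string concatenation and gives you the full reply for logging after the stream closes.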

Response

{
  "success": true,
  "response": "I'm doing well, thank you for asking! How can I help you today?"
}
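A defensive way to consume this response shape is to check success before reading response (field names taken from the JSON above; the error field in the failure path is an assumption, not documented here):

```python
def extract_response(payload):
    """Return the assistant text, or raise if the call did not succeed."""
    if not payload.get("success"):
        raise RuntimeError(payload.get("error", "conversation request failed"))
    return payload["response"]

payload = {
    "success": True,
    "response": "I'm doing well, thank you for asking! How can I help you today?",
}
print(extract_response(payload))
```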