
Conversation Management

QALITA Studio tracks conversations in the Platform database, linking them to issues, sources, and projects for easy retrieval.

Conversation Storage

Database Model

Conversations are stored in the studio_conversation table:

| Column | Type | Description |
|---|---|---|
| id | int | Primary key |
| conv_id | str | Unique conversation identifier |
| filename | str | Original filename |
| partner_id | int | Organization FK |
| issue_id | int | Linked issue (optional) |
| source_id | int | Linked source (optional) |
| project_id | int | Linked project (optional) |
| line_count | int | Number of messages |
| created_at | datetime | Creation timestamp |
| updated_at | datetime | Last update timestamp |
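
For client code it can help to mirror a row of this table in a small typed structure. The sketch below is an assumption for illustration: the class name `ConversationRecord` and the helper `linked_entities` are hypothetical and not part of the Platform; only the field names and types come from the table above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ConversationRecord:
    """Hypothetical in-memory mirror of one studio_conversation row."""
    id: int
    conv_id: str
    filename: str
    partner_id: int
    line_count: int
    created_at: datetime
    updated_at: datetime
    issue_id: Optional[int] = None    # optional link
    source_id: Optional[int] = None   # optional link
    project_id: Optional[int] = None  # optional link

    def linked_entities(self) -> dict:
        """Return only the links that are actually set."""
        links = {
            "issue_id": self.issue_id,
            "source_id": self.source_id,
            "project_id": self.project_id,
        }
        return {name: value for name, value in links.items() if value is not None}


record = ConversationRecord(
    id=5,
    conv_id="conv_20241020_143022",
    filename="conv_20241020_143022.jsonl",
    partner_id=1,
    line_count=24,
    created_at=datetime(2024, 10, 20, 14, 30, 22),
    updated_at=datetime(2024, 10, 20, 15, 45, 0),
    issue_id=123,
)
print(record.linked_entities())  # {'issue_id': 123}
```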

Conversation Content

Conversation content (messages) may be stored:

  • In S3-compatible object storage (for full content)
  • As metadata only in the database (line count, timestamps)

Message Format (when stored)

Messages follow a standard structure:

{
  "role": "user",
  "content": "Explain this quality issue",
  "timestamp": "2024-10-20T14:30:22.123456Z"
}
{
  "role": "assistant",
  "content": "This issue concerns null values...",
  "timestamp": "2024-10-20T14:30:24.789012Z"
}
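
A stored conversation in this format could be read back as follows. This is a minimal sketch that assumes one JSON object per line, which the `.jsonl` filenames elsewhere on this page suggest but do not confirm. The function name `parse_messages` is hypothetical.

```python
import json
from datetime import datetime


def parse_messages(jsonl_text: str) -> list:
    """Parse JSON Lines text into message dicts with timezone-aware timestamps.

    Assumes one message object per line (not confirmed by the Platform docs).
    """
    messages = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        message = json.loads(line)
        # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
        # so rewrite it as an explicit UTC offset.
        message["timestamp"] = datetime.fromisoformat(
            message["timestamp"].replace("Z", "+00:00")
        )
        messages.append(message)
    return messages


sample = (
    '{"role": "user", "content": "Explain this quality issue", '
    '"timestamp": "2024-10-20T14:30:22.123456Z"}\n'
    '{"role": "assistant", "content": "This issue concerns null values...", '
    '"timestamp": "2024-10-20T14:30:24.789012Z"}\n'
)
messages = parse_messages(sample)
print([m["role"] for m in messages])  # ['user', 'assistant']
```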

Chat Request Messages

During chat, messages are passed in the request:

{
  "message": "Current user message",
  "messages_history": [
    {"role": "user", "content": "First message"},
    {"role": "assistant", "content": "First response"},
    {"role": "user", "content": "Follow-up question"},
    {"role": "assistant", "content": "Follow-up answer"}
  ]
}
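
The history entries in this request carry only `role` and `content`, while stored messages also carry a `timestamp`. A client could bridge the two shapes roughly as below. The helper name `build_chat_request` is hypothetical, and whether the chat endpoint tolerates extra keys in history entries is not stated here, so the sketch drops them to be safe.

```python
def build_chat_request(message: str, stored_messages: list) -> dict:
    """Build a chat request body from a new message and previously stored messages.

    Only role and content are forwarded, matching the request shape shown above.
    """
    history = [
        {"role": m["role"], "content": m["content"]}
        for m in stored_messages
    ]
    return {"message": message, "messages_history": history}


stored = [
    {"role": "user", "content": "First message",
     "timestamp": "2024-10-20T14:30:22.123456Z"},
    {"role": "assistant", "content": "First response",
     "timestamp": "2024-10-20T14:30:24.789012Z"},
]
payload = build_chat_request("Follow-up question", stored)
print(payload["messages_history"][0])  # {'role': 'user', 'content': 'First message'}
```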

Retrieving Conversations

Via Platform Interface

  1. Open Studio in the Platform sidebar
  2. In the context panel, select a Source or Issue
  3. Related conversations appear in the history section
  4. Click on a conversation to load its messages
  5. Continue the conversation or start fresh

Via API

Get Contextual Conversations

Retrieve conversations based on current context (source, issue, project):

Endpoint: GET /api/v1/studio/conversations/context

Query Parameters:

  • source_id: Get conversations from issues linked to this source
  • issue_id: Get conversations from this specific issue
  • project_id: Get all project conversations

Example:

curl -H "Authorization: Bearer YOUR_TOKEN" \
  "https://your-platform/api/v1/studio/conversations/context?source_id=456&project_id=1"

Response:

{
  "source_conversations": [
    {
      "id": 5,
      "conv_id": "conv_20241020_143022",
      "filename": "conv_20241020_143022.jsonl",
      "project_id": 1,
      "project_name": "E-Commerce",
      "issue_id": 123,
      "issue_title": "High null rate in email",
      "source_id": 456,
      "source_name": "customers",
      "line_count": 24,
      "created_at": "2024-10-20T14:30:22Z",
      "updated_at": "2024-10-20T15:45:00Z"
    }
  ],
  "issue_conversations": [],
  "project_conversations": [
    {
      "id": 8,
      "conv_id": "investigation_orders",
      "filename": "investigation_orders.jsonl",
      "project_id": 1,
      "project_name": "E-Commerce",
      "issue_id": 125,
      "issue_title": "Duplicate order IDs",
      "source_id": 457,
      "source_name": "orders",
      "line_count": 36,
      "created_at": "2024-10-19T09:30:00Z",
      "updated_at": "2024-10-19T11:00:00Z"
    }
  ],
  "total_count": 2
}
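
The request URL for this endpoint can also be assembled programmatically. The sketch below uses only the Python standard library and omits any filter that is not set. The function name `context_url` is hypothetical, and `https://your-platform` is the same placeholder host used in the curl example.

```python
from urllib.parse import urlencode


def context_url(base_url, source_id=None, issue_id=None, project_id=None):
    """Build the contextual-conversations URL, skipping filters that are None."""
    params = {
        "source_id": source_id,
        "issue_id": issue_id,
        "project_id": project_id,
    }
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{base_url.rstrip('/')}/api/v1/studio/conversations/context"
    return f"{url}?{query}" if query else url


print(context_url("https://your-platform", source_id=456, project_id=1))
# https://your-platform/api/v1/studio/conversations/context?source_id=456&project_id=1
```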

Conversation Grouping

Conversations are grouped by context:

| Group | Description |
|---|---|
| source_conversations | From issues linked to the selected source |
| issue_conversations | Directly linked to the selected issue |
| project_conversations | From all sources/issues in the project |

This enables users to see related conversations when working on similar topics.
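
A client that wants a single history list could flatten the three groups, for example as sketched below. The function name `merge_conversations` is hypothetical. The de-duplication step is a precaution: this page does not say whether one conversation can appear in more than one group. Sorting relies on the `updated_at` strings sharing one ISO 8601 format, as in the sample response, so that they order correctly as plain strings.

```python
GROUP_PRIORITY = ("issue_conversations", "source_conversations", "project_conversations")


def merge_conversations(response: dict) -> list:
    """Flatten grouped conversations into one list, most recently updated first.

    Each conversation keeps the name of the first (most specific) group it was seen in.
    """
    by_id = {}
    for group in GROUP_PRIORITY:
        for conv in response.get(group, []):
            by_id.setdefault(conv["id"], {**conv, "group": group})
    return sorted(by_id.values(), key=lambda c: c["updated_at"], reverse=True)


response = {
    "source_conversations": [
        {"id": 5, "conv_id": "conv_20241020_143022", "updated_at": "2024-10-20T15:45:00Z"},
    ],
    "issue_conversations": [],
    "project_conversations": [
        {"id": 8, "conv_id": "investigation_orders", "updated_at": "2024-10-19T11:00:00Z"},
        {"id": 5, "conv_id": "conv_20241020_143022", "updated_at": "2024-10-20T15:45:00Z"},
    ],
}
merged = merge_conversations(response)
print([(c["id"], c["group"]) for c in merged])
# [(5, 'source_conversations'), (8, 'project_conversations')]
```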

Conversation Linking

Conversations in Studio are linked to Platform entities for organization and retrieval.

Entity Linking

| Link | Purpose |
|---|---|
| issue_id | Associate with a quality issue |
| source_id | Associate with a data source |
| project_id | Associate with a project |

Automatic Linking

When chatting with context, conversations are automatically linked:

{
  "message": "Analyze this issue",
  "context": {
    "issue_id": 123,
    "source_id": 456,
    "project_id": 1
  }
}

The resulting conversation will be linked to:

  • Issue #123
  • Source #456
  • Project #1
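
On the client side, it can be useful to predict which links a request will produce before sending it. The sketch below is an illustration only: `links_from_context` is a hypothetical helper, and the Platform's actual linking logic is not documented here. It assumes a link is left empty when the corresponding key is absent from the context.

```python
LINK_FIELDS = ("issue_id", "source_id", "project_id")


def links_from_context(context):
    """Return the link fields a chat context would set; absent ones stay None."""
    context = context or {}
    return {field: context.get(field) for field in LINK_FIELDS}


request = {
    "message": "Analyze this issue",
    "context": {"issue_id": 123, "source_id": 456, "project_id": 1},
}
print(links_from_context(request["context"]))
# {'issue_id': 123, 'source_id': 456, 'project_id': 1}
print(links_from_context({"source_id": 456}))
# {'issue_id': None, 'source_id': 456, 'project_id': None}
```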

Conversation Discovery

Users can discover related conversations:

  1. From an Issue: See all conversations about this issue
  2. From a Source: See conversations from all issues on this source
  3. From a Project: See conversations across all project sources

History Continuity

When reopening a context (issue/source), the frontend can:

  1. Query contextual conversations
  2. Load previous messages as history
  3. Continue the conversation with full context

import requests

BASE_URL = "https://your-platform/api/v1"  # placeholder host, as in the curl example
headers = {"Authorization": "Bearer YOUR_TOKEN"}

# Load history for context
convs = requests.get(
    f"{BASE_URL}/studio/conversations/context",
    headers=headers,
    params={"issue_id": 123},
).json()

# Use in next chat request
if convs["issue_conversations"]:
    prev_conv = convs["issue_conversations"][0]
    # Load messages from this conversation
    # Pass as messages_history in chat request

Conversation Metadata

Each conversation tracks:

| Metadata | Description |
|---|---|
| conv_id | Unique identifier |
| filename | Original filename |
| line_count | Number of messages |
| created_at | First message timestamp |
| updated_at | Last message timestamp |

This metadata helps users identify relevant conversations without loading full content.
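
For instance, a history panel could render a one-line label per conversation from these fields alone. This is a minimal sketch: the function name `describe_conversation` and the label layout are assumptions, and timestamps are assumed to be UTC strings ending in `Z`, as in the API response example.

```python
from datetime import datetime


def describe_conversation(conv: dict) -> str:
    """Format a short label from conversation metadata, without loading messages."""
    # Rewrite the trailing "Z" so fromisoformat() also works before Python 3.11.
    updated = datetime.fromisoformat(conv["updated_at"].replace("Z", "+00:00"))
    stamp = updated.strftime("%Y-%m-%d %H:%M")
    return f"{conv['conv_id']}: {conv['line_count']} messages, last updated {stamp} UTC"


conv = {
    "conv_id": "conv_20241020_143022",
    "line_count": 24,
    "updated_at": "2024-10-20T15:45:00Z",
}
print(describe_conversation(conv))
# conv_20241020_143022: 24 messages, last updated 2024-10-20 15:45 UTC
```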

Working with Conversations

Passing Message History

When continuing a conversation, pass previous messages:

import requests

BASE_URL = "https://your-platform/api/v1"  # placeholder host, as in the curl example
headers = {"Authorization": "Bearer YOUR_TOKEN"}

# First message
response1 = requests.post(
    f"{BASE_URL}/studio/chat",
    headers=headers,
    json={
        "message": "What quality issues does this source have?",
        "context": {"source_id": 456},
        "conversation_id": "investigation_001",
    },
).json()

# Follow-up with history
response2 = requests.post(
    f"{BASE_URL}/studio/chat",
    headers=headers,
    json={
        "message": "Which one should I fix first?",
        "context": {"source_id": 456},
        "conversation_id": "investigation_001",
        "messages_history": [
            {"role": "user", "content": "What quality issues does this source have?"},
            {"role": "assistant", "content": response1["response"]},
        ],
    },
).json()

Context Continuity

The agent maintains context through:

  1. Message History: Previous messages passed in request
  2. Analysis State: analysis_completed, analysis_result tracked
  3. Scripts State: scripts_generated for codification workflows
  4. Enriched Context: Issue details, source schema, metrics

Session Management

For long conversations, the frontend typically:

  1. Stores messages locally during the session
  2. Passes relevant history to each request
  3. Saves conversation metadata to Platform on completion
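
The three steps above can be sketched as a small client-side session object. Everything here is an assumption for illustration: the class `ChatSession`, its methods, and the fixed-size history window are hypothetical, and the real frontend may select "relevant" history differently. No network calls are made; the object only builds request bodies and metadata.

```python
class ChatSession:
    """Hypothetical local buffer for one Studio conversation."""

    def __init__(self, conversation_id: str, max_history: int = 20):
        if max_history < 1:
            raise ValueError("max_history must be at least 1")
        self.conversation_id = conversation_id
        self.max_history = max_history
        self.messages = []  # step 1: messages kept locally during the session

    def record_exchange(self, user_message: str, assistant_response: str) -> None:
        """Store one user/assistant turn after the chat call returns."""
        self.messages.append({"role": "user", "content": user_message})
        self.messages.append({"role": "assistant", "content": assistant_response})

    def build_request(self, message: str, context=None) -> dict:
        """Step 2: build a chat request carrying the most recent history."""
        payload = {
            "message": message,
            "conversation_id": self.conversation_id,
            "messages_history": self.messages[-self.max_history:],
        }
        if context:
            payload["context"] = context
        return payload

    def metadata(self) -> dict:
        """Step 3: metadata a client might save when the session completes."""
        return {"conv_id": self.conversation_id, "line_count": len(self.messages)}


session = ChatSession("investigation_001", max_history=2)
session.record_exchange("What quality issues does this source have?", "Two issues...")
session.record_exchange("Which one should I fix first?", "Start with the null rate.")
request_body = session.build_request("How do I fix it?", context={"source_id": 456})
print(len(request_body["messages_history"]))  # 2
print(session.metadata())  # {'conv_id': 'investigation_001', 'line_count': 4}
```

A fixed window is the simplest trimming policy; summarizing early messages is an alternative when older turns still matter.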

Privacy and Security

Data in Conversations

Conversations may include:

  • User prompts about data issues
  • Agent responses with analysis
  • Data samples (if requested)
  • Schema information
  • Quality metrics

Security Considerations

  1. Access Control: Conversations filtered by partner_id
  2. LLM Privacy: Choose providers aligned with data policies
  3. Local Option: Use Ollama for sensitive data
  4. No PII in Prompts: Avoid including personal data in messages

Sensitive Data

When including data samples in context:

  • Mask or exclude PII columns
  • Use aggregates instead of raw values
  • Consider synthetic data for examples
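
The first two points can be sketched in plain Python as shown below. The helper names `mask_columns` and `null_rate`, the column names, and the sample rows are made up for illustration. Deciding which columns count as PII remains a manual choice.

```python
def mask_columns(rows: list, pii_columns, mask: str = "***") -> list:
    """Return copies of the rows with PII values replaced by a mask.

    None is left as-is so that null patterns stay visible to the agent.
    """
    pii = set(pii_columns)
    return [
        {k: (mask if k in pii and v is not None else v) for k, v in row.items()}
        for row in rows
    ]


def null_rate(rows: list, column: str) -> float:
    """Aggregate alternative to raw values: fraction of rows where column is null."""
    if not rows:
        return 0.0
    nulls = sum(1 for row in rows if row.get(column) is None)
    return nulls / len(rows)


rows = [
    {"customer_id": 1, "email": "a@example.com", "country": "FR"},
    {"customer_id": 2, "email": None, "country": "DE"},
]
print(mask_columns(rows, ["email"]))
# [{'customer_id': 1, 'email': '***', 'country': 'FR'},
#  {'customer_id': 2, 'email': None, 'country': 'DE'}]
print(null_rate(rows, "email"))  # 0.5
```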

Best Practices

  1. Context Selection:

    • Always select a source or issue for context
    • Include relevant enriched data (schema, metrics)
    • Use data samples sparingly
  2. Conversation Structure:

    • One conversation per investigation topic
    • Start with clear objectives
    • Use follow-up questions to drill down
  3. History Management:

    • Keep relevant history, trim irrelevant parts
    • For long conversations, summarize early messages
    • Clear history when changing topics
  4. Organization:

    • Link conversations to issues for traceability
    • Use meaningful conversation IDs
    • Review related conversations before starting new ones

Next Steps