Database Schema

The application uses a single SQLite database (evonic.db) with tables for evaluation results, test definitions, and the agent platform.

The evaluation_runs table is the top-level record for each evaluation run.

| Column | Type | Description |
| --- | --- | --- |
| run_id | TEXT PK | UUID identifier |
| started_at | DATETIME | When the run started |
| completed_at | DATETIME | When the run finished (null if in progress) |
| model_name | TEXT | Model label for this run |
| summary | TEXT | Run summary text |
| overall_score | REAL | Aggregate score (0.0-1.0) |
| total_tokens | INTEGER | Total tokens consumed |
| total_duration_ms | INTEGER | Total wall-clock time |
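
The columns above could be expressed as DDL roughly as follows. This is a minimal sketch, not the application's actual migration: the table name evaluation_runs is inferred from the foreign-key descriptions elsewhere in this page, the in-memory connection stands in for evonic.db, and the helper names start_run and in_progress_runs are hypothetical.

```python
import sqlite3
import uuid

# Column names and types follow the table above; the table name
# evaluation_runs is inferred from the foreign-key descriptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS evaluation_runs (
    run_id            TEXT PRIMARY KEY,
    started_at        DATETIME,
    completed_at      DATETIME,
    model_name        TEXT,
    summary           TEXT,
    overall_score     REAL,
    total_tokens      INTEGER,
    total_duration_ms INTEGER
);
"""


def start_run(conn, model_name):
    """Insert a new run with completed_at left NULL and return its run_id."""
    run_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO evaluation_runs (run_id, started_at, model_name) "
        "VALUES (?, datetime('now'), ?)",
        (run_id, model_name),
    )
    return run_id


def in_progress_runs(conn):
    """Return the run_ids whose completed_at is still NULL."""
    rows = conn.execute(
        "SELECT run_id FROM evaluation_runs WHERE completed_at IS NULL"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")  # stand-in for evonic.db
conn.executescript(SCHEMA)
demo_run_id = start_run(conn, "example-model")  # hypothetical model label
```

Because completed_at is null while a run is in progress, a plain IS NULL filter is enough to list unfinished runs.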

Aggregated results per domain/level.

| Column | Type | Description |
| --- | --- | --- |
| id | INTEGER PK | Auto-increment |
| run_id | TEXT FK | References evaluation_runs |
| domain | TEXT | Domain identifier |
| level | INTEGER | Complexity level (1-5) |
| score | REAL | Aggregate score for this cell |
| status | TEXT | passed, failed, running, pending |
| prompt | TEXT | Last prompt (legacy) |
| response | TEXT | Last response (legacy) |

Per-test results with full details.

| Column | Type | Description |
| --- | --- | --- |
| id | INTEGER PK | Auto-increment |
| run_id | TEXT FK | References evaluation_runs |
| test_id | TEXT FK | References tests |
| domain | TEXT | Domain identifier |
| level | INTEGER | Level number |
| prompt | TEXT | Full prompt sent to LLM |
| response | TEXT | Full LLM response |
| expected | TEXT (JSON) | Expected output |
| score | REAL | Score (0.0-1.0) |
| status | TEXT | passed or failed |
| details | TEXT (JSON) | Evaluator-specific details |
| duration_ms | INTEGER | Response time |
| model_name | TEXT | Model used |
| system_prompt | TEXT | Resolved system prompt |

Aggregated scores per domain/level.

| Column | Type | Description |
| --- | --- | --- |
| run_id | TEXT FK | References evaluation_runs |
| domain | TEXT | Domain identifier |
| level | INTEGER | Level number |
| average_score | REAL | Average score across tests |
| total_tests | INTEGER | Number of tests |
| passed_tests | INTEGER | Number of passed tests |
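
The relationship between per-test results and these aggregated rows can be sketched as a GROUP BY query. This is an illustration under stated assumptions: the table name test_results_demo is hypothetical, only the columns needed for aggregation are created, and the average is an unweighted mean (the application may instead apply the weight column from the test definitions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for evonic.db

# Hypothetical table holding a subset of the per-test result columns.
conn.execute(
    "CREATE TABLE test_results_demo ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " run_id TEXT, domain TEXT, level INTEGER, score REAL, status TEXT)"
)
conn.executemany(
    "INSERT INTO test_results_demo (run_id, domain, level, score, status) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("run-1", "math", 1, 1.0, "passed"),
        ("run-1", "math", 1, 0.5, "failed"),
        ("run-1", "math", 2, 0.0, "failed"),
    ],
)


def aggregate_scores(conn, run_id):
    """Return (domain, level, average_score, total_tests, passed_tests) rows."""
    return conn.execute(
        """
        SELECT domain,
               level,
               AVG(score)               AS average_score,
               COUNT(*)                 AS total_tests,
               SUM(status = 'passed')   AS passed_tests
        FROM test_results_demo
        WHERE run_id = ?
        GROUP BY domain, level
        ORDER BY domain, level
        """,
        (run_id,),
    ).fetchall()
```

In SQLite a comparison such as status = 'passed' evaluates to 0 or 1, so summing it counts the passed tests in each domain/level cell.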

The domains table caches domain metadata from domain.json files.

| Column | Type | Description |
| --- | --- | --- |
| id | TEXT PK | Domain identifier |
| name | TEXT | Display name |
| evaluator_id | TEXT | Default evaluator |
| system_prompt | TEXT | Domain system prompt |
| system_prompt_mode | TEXT | overwrite or append |
| enabled | BOOLEAN | Include in evaluations |
| tool_ids | TEXT | JSON array of tool IDs |

Per-level configuration.

| Column | Type | Description |
| --- | --- | --- |
| domain_id | TEXT PK | References domains |
| level | INTEGER PK | Level number |
| system_prompt | TEXT | Level system prompt |
| system_prompt_mode | TEXT | overwrite or append |
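
The system_prompt_mode values suggest a layered resolution of the system prompt. The sketch below shows one plausible reading and is not taken from the application's code: it assumes that overwrite replaces whatever prompt was inherited from the previous layer, that append adds the layer's text after it, and that layers are applied from domain to level. The function names and the blank-line separator are hypothetical.

```python
def apply_layer(inherited, prompt, mode):
    """Combine an inherited prompt with one layer's prompt.

    Assumed semantics: "append" adds the layer's text after the inherited
    prompt; any other mode (including "overwrite") replaces it.
    """
    if not prompt:
        # A layer without its own prompt leaves the inherited one unchanged.
        return inherited
    if mode == "append" and inherited:
        return inherited + "\n\n" + prompt
    return prompt


def resolve_system_prompt(layers):
    """Fold (prompt, mode) pairs, ordered from the domain layer to the level layer."""
    resolved = ""
    for prompt, mode in layers:
        resolved = apply_layer(resolved, prompt, mode)
    return resolved
```

For example, a domain prompt followed by a level prompt in append mode yields both texts joined by a blank line, while a level prompt in overwrite mode yields only the level text.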

The tests table holds individual test definitions.

| Column | Type | Description |
| --- | --- | --- |
| id | TEXT PK | Test identifier |
| domain_id | TEXT FK | References domains |
| level | INTEGER | Level number |
| prompt | TEXT | User prompt |
| expected | TEXT (JSON) | Expected output |
| evaluator_id | TEXT | Override evaluator |
| system_prompt | TEXT | Test system prompt |
| weight | REAL | Score weight (default: 1.0) |
| enabled | BOOLEAN | Include in evaluations |

Evaluator configuration registry.

| Column | Type | Description |
| --- | --- | --- |
| id | TEXT PK | Evaluator identifier |
| type | TEXT | regex, custom, hybrid |
| eval_prompt | TEXT | LLM evaluation prompt template |
| extraction_regex | TEXT | Regex pattern for score extraction |
| uses_pass2 | BOOLEAN | Whether to use two-pass extraction |
| config | TEXT (JSON) | Additional configuration |
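
How an extraction_regex might be applied to an evaluator's output can be sketched as below. This is an assumption-laden illustration rather than the application's evaluator: it assumes the pattern carries the score in its first capture group, and the clamping to 0.0-1.0, the function name extract_score, and the example pattern are all hypothetical.

```python
import re


def extract_score(evaluator_output, extraction_regex):
    """Return the first captured group as a float clamped to 0.0-1.0.

    Returns None when the pattern does not match or the captured text
    is not a number.
    """
    match = re.search(extraction_regex, evaluator_output)
    if match is None:
        return None
    try:
        score = float(match.group(1))
    except (IndexError, TypeError, ValueError):
        # No capture group, an unmatched optional group, or non-numeric text.
        return None
    return max(0.0, min(1.0, score))


# Hypothetical pattern; real evaluators store their own in extraction_regex.
EXAMPLE_REGEX = r"SCORE:\s*([0-9]*\.?[0-9]+)"
```

Returning None instead of raising keeps a malformed evaluator response distinguishable from a legitimate score of 0.0.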

Tool definition registry.

| Column | Type | Description |
| --- | --- | --- |
| id | TEXT PK | Tool identifier |
| name | TEXT | Display name |
| function_def | TEXT (JSON) | OpenAI function schema |
| mock_response | TEXT | Mock response for evaluation |
| mock_response_type | TEXT | json or javascript |
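
As an illustration of the JSON-in-TEXT columns, the sketch below stores one function schema and resolves a JSON array of tool IDs (as kept in the tool_ids column of the domain metadata) into schemas. The table name tools, the search_books tool, and the helper load_function_defs are hypothetical; only the column names come from the table above.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for evonic.db

# Hypothetical table name; columns follow the table above.
conn.execute(
    "CREATE TABLE tools ("
    " id TEXT PRIMARY KEY, name TEXT, function_def TEXT,"
    " mock_response TEXT, mock_response_type TEXT)"
)

# An OpenAI-style function schema: name, description, JSON Schema parameters.
function_def = {
    "name": "search_books",
    "description": "Search the catalogue by title.",
    "parameters": {
        "type": "object",
        "properties": {"title": {"type": "string"}},
        "required": ["title"],
    },
}

conn.execute(
    "INSERT INTO tools VALUES (?, ?, ?, ?, ?)",
    (
        "search_books",
        "Search books",
        json.dumps(function_def),
        json.dumps({"results": []}),  # mock response stored as JSON text
        "json",
    ),
)


def load_function_defs(conn, tool_ids_json):
    """Decode a JSON array of tool IDs and return the matching function schemas."""
    tool_ids = json.loads(tool_ids_json or "[]")
    if not tool_ids:
        return []
    placeholders = ", ".join("?" for _ in tool_ids)
    rows = conn.execute(
        f"SELECT function_def FROM tools WHERE id IN ({placeholders}) ORDER BY id",
        tool_ids,
    )
    return [json.loads(row[0]) for row in rows]
```

Only the placeholder list is interpolated into the SQL string; the IDs themselves are passed as bound parameters.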

The agents table holds agent definitions.

| Column | Type | Description |
| --- | --- | --- |
| id | TEXT PK | Slug identifier (e.g., bookstore_bot) |
| name | TEXT | Display name |
| description | TEXT | Short description |
| system_prompt | TEXT | Agent persona and instructions |
| model | TEXT | Model override (null = use default) |
| created_at | TIMESTAMP | Creation time |
| updated_at | TIMESTAMP | Last update time |

Many-to-many mapping of agents to tools.

| Column | Type | Description |
| --- | --- | --- |
| agent_id | TEXT PK, FK | References agents |
| tool_id | TEXT PK | Tool identifier |

The channels table holds per-agent channel configurations.

| Column | Type | Description |
| --- | --- | --- |
| id | TEXT PK | UUID identifier |
| agent_id | TEXT FK | References agents |
| type | TEXT | telegram, whatsapp, discord |
| name | TEXT | Display name |
| config | TEXT (JSON) | Channel-specific config (e.g., bot token) |
| enabled | BOOLEAN | Whether the channel is active |

The chat_sessions table holds per-user conversation sessions.

| Column | Type | Description |
| --- | --- | --- |
| id | TEXT PK | UUID identifier |
| agent_id | TEXT FK | References agents |
| channel_id | TEXT FK | References channels (nullable for web chat) |
| external_user_id | TEXT | User identifier from the channel |

Sessions are uniquely identified by the tuple (agent_id, channel_id, external_user_id).
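
A lookup by that tuple has to account for the nullable channel_id. In SQLite, NULL values are treated as distinct in UNIQUE constraints and `channel_id = NULL` never matches, so the sketch below uses the null-safe IS operator. It is a minimal illustration, assuming the table is named chat_sessions as the message table's foreign key suggests; the helper name get_or_create_session is hypothetical.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")  # stand-in for evonic.db
conn.execute(
    "CREATE TABLE chat_sessions ("
    " id TEXT PRIMARY KEY,"
    " agent_id TEXT NOT NULL,"
    " channel_id TEXT,"  # NULL for web chat
    " external_user_id TEXT NOT NULL)"
)


def get_or_create_session(conn, agent_id, channel_id, external_user_id):
    """Return the session id for the tuple, creating the row if it is missing."""
    row = conn.execute(
        "SELECT id FROM chat_sessions "
        "WHERE agent_id = ? AND channel_id IS ? AND external_user_id = ?",
        (agent_id, channel_id, external_user_id),
    ).fetchone()
    if row is not None:
        return row[0]
    session_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO chat_sessions (id, agent_id, channel_id, external_user_id) "
        "VALUES (?, ?, ?, ?)",
        (session_id, agent_id, channel_id, external_user_id),
    )
    return session_id
```

Calling the helper twice with the same arguments returns the same session id, including when channel_id is None.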

Conversation message history.

| Column | Type | Description |
| --- | --- | --- |
| id | INTEGER PK | Auto-increment |
| session_id | TEXT FK | References chat_sessions |
| role | TEXT | user, assistant, tool, system |
| content | TEXT | Message content |
| tool_calls | TEXT (JSON) | Tool call objects (for assistant messages) |
| tool_call_id | TEXT | Tool call ID (for tool result messages) |
| created_at | TIMESTAMP | Message timestamp |
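
Rows in this table map naturally onto chat-completion style message dictionaries. The sketch below rebuilds a session's history in insertion order and decodes the tool_calls JSON. It is an illustration only: the table name chat_messages, the helper load_history, and the sample rows are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for evonic.db

# Hypothetical table name; columns follow the table above.
conn.execute(
    "CREATE TABLE chat_messages ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT, session_id TEXT, role TEXT,"
    " content TEXT, tool_calls TEXT, tool_call_id TEXT,"
    " created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)"
)

sample_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "search_books", "arguments": '{"title": "Dune"}'},
}
conn.executemany(
    "INSERT INTO chat_messages (session_id, role, content, tool_calls, tool_call_id) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("s1", "user", "Do you have Dune?", None, None),
        ("s1", "assistant", None, json.dumps([sample_call]), None),
        ("s1", "tool", '{"results": []}', None, "call_1"),
    ],
)


def load_history(conn, session_id):
    """Return the session's messages as dicts, oldest first."""
    rows = conn.execute(
        "SELECT role, content, tool_calls, tool_call_id "
        "FROM chat_messages WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    messages = []
    for role, content, tool_calls, tool_call_id in rows:
        message = {"role": role, "content": content}
        if tool_calls:
            # Stored as JSON text on assistant messages.
            message["tool_calls"] = json.loads(tool_calls)
        if tool_call_id:
            # Present on tool result messages.
            message["tool_call_id"] = tool_call_id
        messages.append(message)
    return messages
```

Ordering by the auto-increment id rather than created_at avoids ties when several messages share the same timestamp.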