Domain ontologies

Domain ontologies are the central design concept in create-context-graph. A single YAML file defines everything about a domain — its entities, relationships, agent behavior, and visualization — and the tool uses that definition to generate an entire application. This page explains how that mechanism works.

What a domain ontology defines

A domain ontology is a structured YAML file that declares:

What exists: entity types and their properties (the nodes in your knowledge graph)
How things relate: relationship types between entities (the edges in your knowledge graph)
What the agent can do: agent tools with Cypher queries and typed parameters
How the agent behaves: system prompt, demo scenarios, and guided chat examples
What data looks like: document templates and decision trace scenarios for data generation
How it is visualized: node colors, sizes, and the default Cypher query for the graph view

The ontology is not code. It is a declarative specification that gets fed into Jinja2 templates at generation time. The templates read the ontology and produce working Python, TypeScript, Cypher, and configuration files.

Two-layer inheritance

Every domain ontology is built on a shared base. The base ontology (_base.yaml) defines the POLE+O entity types that are common across all domains:

Base entity	POLE+O type	Default color	Examples
`Person`	`PERSON`	`#22c55e`	Customers, employees, patients, suspects
`Organization`	`ORGANIZATION`	`#3b82f6`	Companies, agencies, teams, departments
`Location`	`LOCATION`	`#a855f7`	Physical places with optional coordinates
`Event`	`EVENT`	`#f97316`	Time-bound occurrences
`Object`	`OBJECT`	`#eab308`	Domain-specific things that don’t fit above

The base also defines three common relationships: WORKS_FOR (Person → Organization), LOCATED_AT (Organization → Location), and PARTICIPATED_IN (Person → Event).

The `inherits: _base` mechanism

When a domain YAML declares inherits: _base, the ontology.py loader performs a merge:

Base entity types are prepended to the domain’s entity list.
Base relationships are prepended to the domain’s relationship list.
If the domain defines an entity with the same label as a base entity (e.g., its own Person with extra properties), the domain version takes precedence and the base version is skipped.

# Every domain YAML starts with this line
inherits: _base

domain:
  id: healthcare
  name: Healthcare
  description: Patient care, clinical encounters, diagnoses, treatments, and provider networks

This means every domain automatically gets Person, Organization, Location, Event, and Object nodes — plus the three base relationships — without redeclaring them.
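The merge rules above can be sketched in a few lines of Python. This is only an illustrative reading of those three rules, not the actual ontology.py loader: the function name `merge_with_base`, the plain-dict representation, and the sample `badge_id` property are assumptions made for the example.

```python
def merge_with_base(base: dict, domain: dict) -> dict:
    """Merge a base ontology into a domain ontology (illustrative sketch).

    Hypothetical helper: the real loader in ontology.py may differ.
    """
    domain_entities = list(domain.get("entity_types", []))
    domain_labels = {entity["label"] for entity in domain_entities}

    # Skip any base entity that the domain redefines under the same label.
    inherited = [
        entity
        for entity in base.get("entity_types", [])
        if entity["label"] not in domain_labels
    ]

    merged = dict(domain)
    # Base entries come first, followed by the domain's own entries.
    merged["entity_types"] = inherited + domain_entities
    merged["relationships"] = list(base.get("relationships", [])) + list(
        domain.get("relationships", [])
    )
    return merged


base = {
    "entity_types": [{"label": "Person"}, {"label": "Organization"}],
    "relationships": [
        {"type": "WORKS_FOR", "source": "Person", "target": "Organization"}
    ],
}
domain = {
    "inherits": "_base",
    "entity_types": [
        # The domain overrides Person with an extra (made-up) property.
        {"label": "Person", "properties": [{"name": "badge_id", "type": "string"}]},
        {"label": "Patient"},
    ],
    "relationships": [
        {"type": "TREATED_BY", "source": "Patient", "target": "Person"}
    ],
}

merged = merge_with_base(base, domain)
merged_labels = [entity["label"] for entity in merged["entity_types"]]
print(merged_labels)  # ['Organization', 'Person', 'Patient']
```

In this sketch the base Person is dropped because the domain supplies its own, while Organization is inherited unchanged and placed ahead of the domain's entities.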

YAML structure

A complete domain YAML contains these top-level keys:

domain

Metadata about the domain: unique id, display name, description, tagline, and emoji.

domain:
  id: healthcare
  name: Healthcare
  description: Patient care, clinical encounters, diagnoses, treatments, and provider networks
  tagline: "AI-powered Clinical Intelligence"
  emoji: "🏥"

entity_types

The node types in your knowledge graph. Each entry defines a label, POLE+O category, visual appearance, and typed properties.

entity_types:
  - label: Patient
    pole_type: PERSON
    subtype: PATIENT
    color: "#06b6d4"
    icon: user
    properties:
      - name: patient_id
        type: string
        required: true
        unique: true
      - name: name
        type: string
        required: true
      - name: date_of_birth
        type: date
      - name: blood_type
        type: string
        enum: ["A+", "A-", "B+", "B-", "AB+", "AB-", "O+", "O-"]

relationships

Typed, directed edges between entity labels.

relationships:
  - type: DIAGNOSED_WITH
    source: Patient
    target: Diagnosis
  - type: TREATED_BY
    source: Patient
    target: Provider
  - type: OCCURRED_AT
    source: Encounter
    target: Facility

agent_tools

Domain-specific tools the agent can call. Each tool includes a name, description, Cypher query, and typed parameters. The agent template iterates over these entries to produce framework-specific tool definitions.

agent_tools:
  - name: get_patient_history
    description: Get comprehensive history for a patient
    cypher: |
      MATCH (p:Patient {patient_id: $patient_id})
      OPTIONAL MATCH (p)-[:HAD_ENCOUNTER]->(e:Encounter)-[:OCCURRED_AT]->(f:Facility)
      OPTIONAL MATCH (p)-[:DIAGNOSED_WITH]->(d:Diagnosis)
      OPTIONAL MATCH (e)-[:INCLUDES]->(t:Treatment)
      RETURN p, collect(DISTINCT e) AS encounters,
             collect(DISTINCT d) AS diagnoses,
             collect(DISTINCT t) AS treatments,
             collect(DISTINCT f) AS facilities
    parameters:
      - name: patient_id
        type: string
        description: Patient ID
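To make the role of the typed parameters concrete, here is a minimal sketch of how a tool entry could be checked against call arguments before its Cypher query is run. Everything in it is an assumption for illustration: the helper `bind_parameters`, the strict type check, and the rule that every declared parameter is mandatory are not taken from the generated code, and the Cypher string is shortened.

```python
# Strict YAML-type to Python-type map for this sketch (scalar types only).
PYTHON_TYPES = {"string": str, "integer": int, "float": float, "boolean": bool}


def bind_parameters(tool: dict, arguments: dict) -> tuple:
    """Validate call arguments against a tool definition; return (cypher, params)."""
    declared = {param["name"]: param for param in tool.get("parameters", [])}

    unknown = sorted(set(arguments) - set(declared))
    if unknown:
        raise ValueError(f"unknown parameters: {unknown}")

    bound = {}
    for name, spec in declared.items():
        if name not in arguments:
            # Assumption: every declared parameter is mandatory.
            raise ValueError(f"missing parameter: {name}")
        expected = PYTHON_TYPES[spec.get("type", "string")]
        # Exact type match, so True is not accepted where an integer is expected.
        if type(arguments[name]) is not expected:
            raise TypeError(f"{name} must be of type {expected.__name__}")
        bound[name] = arguments[name]
    return tool["cypher"], bound


tool = {
    "name": "get_patient_history",
    "cypher": "MATCH (p:Patient {patient_id: $patient_id}) RETURN p",
    "parameters": [
        {"name": "patient_id", "type": "string", "description": "Patient ID"}
    ],
}

cypher, params = bind_parameters(tool, {"patient_id": "P-001"})

# A wrongly typed argument is rejected before any query would be sent.
try:
    bind_parameters(tool, {"patient_id": 42})
    rejected = False
except TypeError:
    rejected = True
```

The returned pair has the shape a Neo4j driver call typically expects: a query string with `$name` placeholders and a dictionary of parameter values.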

system_prompt

The system prompt is injected verbatim into the agent configuration. It tells the LLM what domain it operates in, what tools are available, and how to behave.

system_prompt: |
  You are an AI clinical intelligence assistant with access to a comprehensive
  knowledge graph of healthcare data. You help clinicians, care coordinators,
  and medical staff analyze patient records, diagnoses, treatments, and
  provider networks.

  Always prioritize patient safety. Flag potential contraindications immediately.
  Provide evidence-based insights grounded in the clinical data available.

document_templates

Templates that drive the synthetic data generation pipeline. Each template specifies what kind of document to produce, how many, and which entities are required.

document_templates:
  - id: discharge_summary
    name: Discharge Summary
    description: Patient discharge summaries after hospital stays
    count: 8
    prompt_template: |
      Write a discharge summary for patient {{patient.name}} (ID: {{patient.patient_id}})
      discharged from {{facility.name}} on {{date}}.
    required_entities: [Patient, Provider, Facility, Diagnosis, Treatment]
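The `{{patient.name}}` placeholders in prompt_template are dotted paths into the entities chosen for a document. How the pipeline fills them is not described here, so the sketch below uses a small regex substitution as a stand-in for a real template engine. The helper `fill_placeholders` and all sample values are made up for the example.

```python
import re

# Matches {{ dotted.path }} placeholders, with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def fill_placeholders(template: str, context: dict) -> str:
    """Replace each {{a.b}} placeholder with the value found at context['a']['b']."""

    def lookup(match: re.Match) -> str:
        value = context
        for part in match.group(1).split("."):
            value = value[part]  # raises KeyError if the path is missing
        return str(value)

    return PLACEHOLDER.sub(lookup, template)


template = (
    "Write a discharge summary for patient {{patient.name}} "
    "(ID: {{patient.patient_id}}) discharged from {{facility.name}} on {{date}}."
)
# Invented sample entities, standing in for generated data.
context = {
    "patient": {"name": "Jane Doe", "patient_id": "P-001"},
    "facility": {"name": "General Hospital"},
    "date": "2024-01-15",
}

prompt = fill_placeholders(template, context)
print(prompt)
```

A missing entity surfaces as a `KeyError` in this sketch, which is one way to see why each template lists its required_entities up front.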

decision_traces

Scenarios used to generate the agent's reasoning memory. Each trace has a task description, a sequence of thought/action steps, and an outcome template.

decision_traces:
  - id: treatment_selection
    task: "Select appropriate treatment plan for {{patient.name}} diagnosed with {{diagnosis.name}}"
    steps:
      - thought: "Review patient history and current diagnoses"
        action: "Query patient entity graph including past encounters, diagnoses, and medications"
      - thought: "Check for medication contraindications and allergies"
        action: "Cross-reference proposed medications with patient allergies and current prescriptions"
    outcome_template: "Treatment plan: {{plan}}. Rationale: {{rationale}}"

demo_scenarios

Pre-built chat prompts that appear as clickable suggestions in the frontend, giving users a guided tour of the agent’s capabilities.

demo_scenarios:
  - name: Patient Lookup
    prompts:
      - "Show me all patients with a chronic diagnosis"
      - "What medications are currently prescribed to patients in the cardiology department?"

visualization

NVL graph visualization configuration: per-label node colors, node sizes (in pixels), and the default Cypher query used when the graph first loads.

visualization:
  node_colors:
    Patient: "#06b6d4"
    Provider: "#14b8a6"
    Diagnosis: "#ef4444"
  node_sizes:
    Patient: 25
    Provider: 25
    Facility: 30
  default_cypher: "MATCH (p:Patient)-[r]-(n) RETURN p, r, n LIMIT 100"

Property types

Entity properties support these types, which map to Neo4j types and Python types in the generated code as follows:

YAML type	Neo4j type	Python type	Notes
`string`	`STRING`	`str`	Default type if omitted
`integer`	`INTEGER`	`int`
`float`	`FLOAT`	`float`
`boolean`	`BOOLEAN`	`bool`	Enum values must be quoted: `["true", "false"]`
`date`	`DATE`	`date`
`datetime`	`DATETIME`	`datetime`
`point`	`POINT`	`str`	Serialized as a string in Python models

Properties that set unique: true generate a Cypher uniqueness constraint. Properties that set required: true become mandatory fields in the generated Pydantic models.

YAML boolean values in enum lists must be quoted strings. Write enum: ["true", "false"], not enum: [true, false]. Unquoted YAML booleans will fail ontology validation.
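The type table and the quoting rule can be expressed as a small lookup plus a check. This is a sketch under assumptions: `PYTHON_TYPES`, `python_type_for`, and `unquoted_enum_values` are illustrative names, and the real validation lives in the DomainOntology model, whose error messages may differ. The example relies on the fact that a YAML parser turns unquoted `true` and `false` into booleans rather than strings.

```python
from datetime import date, datetime

# Mirrors the table above; point is kept as a string in Python models.
PYTHON_TYPES = {
    "string": str,
    "integer": int,
    "float": float,
    "boolean": bool,
    "date": date,
    "datetime": datetime,
    "point": str,
}


def python_type_for(prop: dict) -> type:
    """Return the Python type for a property, defaulting to string when type is omitted."""
    return PYTHON_TYPES[prop.get("type", "string")]


def unquoted_enum_values(prop: dict) -> list:
    """Return enum entries that are not strings, e.g. unquoted YAML booleans."""
    return [value for value in prop.get("enum", []) if not isinstance(value, str)]


# What a YAML parser yields for enum: [true, false] versus enum: ["true", "false"].
bad_prop = {"name": "active", "type": "boolean", "enum": [True, False]}
good_prop = {"name": "active", "type": "boolean", "enum": ["true", "false"]}

print(unquoted_enum_values(bad_prop))   # [True, False]
print(unquoted_enum_values(good_prop))  # []
```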

How the ontology drives everything

The DomainOntology Pydantic model (defined in ontology.py) is loaded from the YAML and passed as context to every Jinja2 template. Each section of the ontology produces a different part of the generated application:

Entity types → Neo4j schema

Each property marked unique: true on an entity_types entry generates a Cypher uniqueness constraint. Every entity type also gets a name index for fast lookups:

// Generated from Patient entity with unique: true on patient_id
CREATE CONSTRAINT patient_patient_id_unique IF NOT EXISTS
  FOR (n:Patient) REQUIRE n.patient_id IS UNIQUE;

CREATE INDEX patient_name IF NOT EXISTS
  FOR (n:Patient) ON (n.name);
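The pattern in the statements above is regular enough to sketch as a function. The real output comes from a Jinja2 template, so this plain-Python version is only a stand-in. The naming scheme is inferred from the single Patient example and may not hold for multi-word labels, and `schema_statements` is a hypothetical name.

```python
def schema_statements(entity: dict) -> list:
    """Build constraint and index statements for one entity type (illustrative)."""
    label = entity["label"]
    prefix = label.lower()  # assumption: names use the lowercased label
    statements = []

    # One uniqueness constraint per property marked unique: true.
    for prop in entity.get("properties", []):
        if prop.get("unique"):
            statements.append(
                f"CREATE CONSTRAINT {prefix}_{prop['name']}_unique IF NOT EXISTS\n"
                f"  FOR (n:{label}) REQUIRE n.{prop['name']} IS UNIQUE;"
            )

    # Every entity type also gets an index on its name property.
    statements.append(
        f"CREATE INDEX {prefix}_name IF NOT EXISTS\n"
        f"  FOR (n:{label}) ON (n.name);"
    )
    return statements


patient = {
    "label": "Patient",
    "properties": [
        {"name": "patient_id", "type": "string", "required": True, "unique": True},
        {"name": "name", "type": "string", "required": True},
    ],
}

statements = schema_statements(patient)
print("\n\n".join(statements))
```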

Entity types → Pydantic models

Each entity label becomes a Python class in backend/app/models.py. enum properties generate Python Enum classes. Required properties become mandatory fields; optional ones default to None:

# Generated from the Diagnosis entity_type
from enum import Enum

from pydantic import BaseModel


class DiagnosisCategoryEnum(str, Enum):
    CHRONIC = "chronic"
    ACUTE = "acute"
    INFECTIOUS = "infectious"
    AUTOIMMUNE = "autoimmune"
    GENETIC = "genetic"
    MENTAL_HEALTH = "mental_health"

# DiagnosisSeverityEnum comes from the severity property's enum (definition not shown).

class Diagnosis(BaseModel):
    """Entity model for Diagnosis."""

    # "= ..." marks a field as required in Pydantic.
    icd_code: str = ...
    name: str = ...
    category: DiagnosisCategoryEnum | None = None
    severity: DiagnosisSeverityEnum | None = None
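The step from an enum list in the YAML to an Enum class like the one above can be sketched with the functional `Enum` API from the standard library. The naming rules here (entity label plus capitalized property name, upper-cased values as member names) are inferred from the DiagnosisCategoryEnum example and are assumptions. Values that are not valid identifiers, such as blood types, would need extra handling that this sketch does not attempt.

```python
from enum import Enum


def build_enum(entity_label: str, prop_name: str, values: list) -> type:
    """Create a str-based Enum class from a property's enum values (illustrative)."""
    # "category" -> "Category", "blood_type" -> "BloodType"
    class_name = f"{entity_label}{prop_name.title().replace('_', '')}Enum"
    # Assumption: every value is a lowercase identifier such as "mental_health".
    members = {value.upper(): value for value in values}
    return Enum(class_name, members, type=str)


DiagnosisCategoryEnum = build_enum(
    "Diagnosis",
    "category",
    ["chronic", "acute", "infectious", "autoimmune", "genetic", "mental_health"],
)

print(DiagnosisCategoryEnum.__name__)             # DiagnosisCategoryEnum
print(DiagnosisCategoryEnum.MENTAL_HEALTH.value)  # mental_health
```

Mixing in `str` means members compare equal to their raw string values, which keeps them easy to serialize.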

Agent tools → framework-specific code

The agent.py.j2 template iterates over the agent_tools list to produce tool definitions in the chosen framework’s idiom. For PydanticAI, each tool becomes an @agent.tool decorated function. For LangGraph, each becomes a @tool function. The Cypher query and parameter definitions come directly from the YAML, so no additional code is needed.

Visualization → NVL frontend

The visualization section (plus the color field on each entity type) populates the NVL component configuration in the Next.js frontend. Node colors, sizes, and the initial Cypher query for the graph view all come from the ontology.
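One way to picture how the two color sources combine is a per-label style lookup. The precedence used here (visualization.node_colors first, then the entity's own color field), the fallback size, and the helper name `node_styles` are all assumptions for illustration. The generated frontend is TypeScript and may resolve styles differently. The Facility color is a made-up value.

```python
DEFAULT_SIZE = 20  # hypothetical fallback; not taken from the tool


def node_styles(ontology: dict) -> dict:
    """Resolve a color and size for each entity label (illustrative sketch)."""
    viz = ontology.get("visualization", {})
    colors = viz.get("node_colors", {})
    sizes = viz.get("node_sizes", {})

    styles = {}
    for entity in ontology.get("entity_types", []):
        label = entity["label"]
        styles[label] = {
            # Assumed precedence: visualization entry, then the entity's color field.
            "color": colors.get(label, entity.get("color")),
            "size": sizes.get(label, DEFAULT_SIZE),
        }
    return styles


ontology = {
    "entity_types": [
        {"label": "Patient", "color": "#06b6d4"},
        {"label": "Facility", "color": "#999999"},
        {"label": "Diagnosis"},
    ],
    "visualization": {
        "node_colors": {"Patient": "#06b6d4", "Diagnosis": "#ef4444"},
        "node_sizes": {"Patient": 25, "Facility": 30},
    },
}

styles = node_styles(ontology)
print(styles["Facility"])  # {'color': '#999999', 'size': 30}
```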

Domain-agnostic templates, data-driven output

There are no per-domain template directories. The same Jinja2 templates produce a healthcare app, a financial services app, a wildlife conservation app, or any of the 22 built-in domains. The templates are parameterized entirely by the ontology context. This means:

Adding a new domain requires only a YAML file, not new templates.
Improvements to templates benefit all 22 domains simultaneously.
The template surface area stays small and maintainable regardless of how many domains exist.

The only template that varies by a non-ontology dimension is agent.py.j2, which has one version per supported agent framework. Even those framework-specific templates read the same ontology context — they just express tool definitions and agent setup in different framework idioms.

Extending with custom domains

Beyond the 22 built-in domains, you can create custom ontologies in two ways:

Write a YAML manually

Follow the schema above, place the file in the domains/ directory (or ~/.create-context-graph/custom-domains/), and it becomes available as a --domain option:

uvx create-context-graph my-app \
  --domain my-custom-domain \
  --framework pydanticai

Generate from a description

Use --custom-domain with a natural language description. The CLI sends the description to an LLM, which generates a complete ontology YAML validated against the DomainOntology Pydantic model:

uvx create-context-graph my-app \
  --custom-domain "A knowledge graph for tracking academic research papers, authors, institutions, and citation networks" \
  --framework pydanticai

Generated domains can be saved to ~/.create-context-graph/custom-domains/ for reuse.

Both paths produce the same result: a DomainOntology object that drives all the templates.

Next steps

Why context graphs?

Ontology YAML schema
