Domain ontologies are the central design concept in create-context-graph. A single YAML file defines everything about a domain — its entities, relationships, agent behavior, and visualization — and the tool uses that definition to generate an entire application. This page explains how that mechanism works.

What a domain ontology defines

A domain ontology is a structured YAML file that declares:

What exists

Entity types and their properties — the nodes in your knowledge graph

How things relate

Relationship types between entities — the edges in your knowledge graph

What the agent can do

Agent tools with Cypher queries and typed parameters

How the agent behaves

System prompt, demo scenarios, and guided chat examples

What data looks like

Document templates and decision trace scenarios for data generation

How it is visualized

Node colors, sizes, and the default Cypher query for the graph view
The ontology is not code. It is a declarative specification that gets fed into Jinja2 templates at generation time. The templates read the ontology and produce working Python, TypeScript, Cypher, and configuration files.

Two-layer inheritance

Every domain ontology is built on a shared base. The base ontology (_base.yaml) defines the POLE+O entity types that are common across all domains:
| Base entity | POLE+O type | Default color | Examples |
| --- | --- | --- | --- |
| Person | PERSON | #22c55e | Customers, employees, patients, suspects |
| Organization | ORGANIZATION | #3b82f6 | Companies, agencies, teams, departments |
| Location | LOCATION | #a855f7 | Physical places with optional coordinates |
| Event | EVENT | #f97316 | Time-bound occurrences |
| Object | OBJECT | #eab308 | Domain-specific things that don’t fit above |

The base also defines three common relationships: WORKS_FOR (Person → Organization), LOCATED_AT (Organization → Location), and PARTICIPATED_IN (Person → Event).

The inherits: _base mechanism

When a domain YAML declares inherits: _base, the ontology.py loader performs a merge:
  1. Base entity types are prepended to the domain’s entity list.
  2. Base relationships are prepended to the domain’s relationship list.
  3. If the domain defines an entity with the same label as a base entity (e.g., its own Person with extra properties), the domain version takes precedence and the base version is skipped.
# Every domain YAML starts with this line
inherits: _base

domain:
  id: healthcare
  name: Healthcare
  description: Patient care, clinical encounters, diagnoses, treatments, and provider networks
This means every domain automatically gets Person, Organization, Location, Event, and Object nodes — plus the three base relationships — without redeclaring them.
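
The merge rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual ontology.py loader: the function name merge_with_base and the BASE and DOMAIN dictionaries are hypothetical stand-ins for the parsed YAML files.

```python
from copy import deepcopy

# Hypothetical in-memory stand-ins for the parsed _base.yaml and a domain YAML.
BASE = {
    "entity_types": [
        {"label": "Person", "pole_type": "PERSON", "color": "#22c55e"},
        {"label": "Organization", "pole_type": "ORGANIZATION", "color": "#3b82f6"},
    ],
    "relationships": [
        {"type": "WORKS_FOR", "source": "Person", "target": "Organization"},
    ],
}

DOMAIN = {
    "inherits": "_base",
    "entity_types": [
        # Redefines the base Person with an extra (hypothetical) property.
        {"label": "Person", "pole_type": "PERSON", "color": "#22c55e",
         "properties": [{"name": "badge_id", "type": "string"}]},
        {"label": "Patient", "pole_type": "PERSON", "color": "#06b6d4"},
    ],
    "relationships": [
        {"type": "TREATED_BY", "source": "Patient", "target": "Person"},
    ],
}


def merge_with_base(domain: dict, base: dict) -> dict:
    """Return a new ontology dict with base definitions prepended."""
    merged = deepcopy(domain)
    if domain.get("inherits") != "_base":
        return merged

    domain_labels = {e["label"] for e in domain.get("entity_types", [])}
    # Skip base entities that the domain redefines under the same label.
    inherited = [deepcopy(e) for e in base["entity_types"]
                 if e["label"] not in domain_labels]
    merged["entity_types"] = inherited + merged.get("entity_types", [])
    merged["relationships"] = (
        deepcopy(base["relationships"]) + merged.get("relationships", [])
    )
    return merged


merged = merge_with_base(DOMAIN, BASE)
labels = [e["label"] for e in merged["entity_types"]]
print(labels)
```

In this sketch the domain's own Person replaces the base Person, while Organization is still inherited and placed ahead of the domain entities.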

YAML structure

A complete domain YAML contains these top-level keys:
Metadata about the domain: unique id, display name, description, tagline, and emoji.
domain:
  id: healthcare
  name: Healthcare
  description: Patient care, clinical encounters, diagnoses, treatments, and provider networks
  tagline: "AI-powered Clinical Intelligence"
  emoji: "🏥"
The node types in your knowledge graph. Each entry defines a label, POLE+O category, visual appearance, and typed properties.
entity_types:
  - label: Patient
    pole_type: PERSON
    subtype: PATIENT
    color: "#06b6d4"
    icon: user
    properties:
      - name: patient_id
        type: string
        required: true
        unique: true
      - name: name
        type: string
        required: true
      - name: date_of_birth
        type: date
      - name: blood_type
        type: string
        enum: ["A+", "A-", "B+", "B-", "AB+", "AB-", "O+", "O-"]
Typed, directed edges between entity labels.
relationships:
  - type: DIAGNOSED_WITH
    source: Patient
    target: Diagnosis
  - type: TREATED_BY
    source: Patient
    target: Provider
  - type: OCCURRED_AT
    source: Encounter
    target: Facility
Domain-specific tools the agent can call. Each tool includes a name, description, Cypher query, and typed parameters. The agent template iterates over these entries to produce framework-specific tool definitions.
agent_tools:
  - name: get_patient_history
    description: Get comprehensive history for a patient
    cypher: |
      MATCH (p:Patient {patient_id: $patient_id})
      OPTIONAL MATCH (p)-[:HAD_ENCOUNTER]->(e:Encounter)-[:OCCURRED_AT]->(f:Facility)
      OPTIONAL MATCH (p)-[:DIAGNOSED_WITH]->(d:Diagnosis)
      OPTIONAL MATCH (e)-[:INCLUDES]->(t:Treatment)
      RETURN p, collect(DISTINCT e) AS encounters,
             collect(DISTINCT d) AS diagnoses,
             collect(DISTINCT t) AS treatments,
             collect(DISTINCT f) AS facilities
    parameters:
      - name: patient_id
        type: string
        description: Patient ID
The system prompt injected verbatim into the agent configuration. Tells the LLM what domain it operates in, what tools are available, and how to behave.
system_prompt: |
  You are an AI clinical intelligence assistant with access to a comprehensive
  knowledge graph of healthcare data. You help clinicians, care coordinators,
  and medical staff analyze patient records, diagnoses, treatments, and
  provider networks.

  Always prioritize patient safety. Flag potential contraindications immediately.
  Provide evidence-based insights grounded in the clinical data available.
Templates that drive the synthetic data generation pipeline. Each template specifies what kind of document to produce, how many, and which entities are required.
document_templates:
  - id: discharge_summary
    name: Discharge Summary
    description: Patient discharge summaries after hospital stays
    count: 8
    prompt_template: |
      Write a discharge summary for patient {{patient.name}} (ID: {{patient.patient_id}})
      discharged from {{facility.name}} on {{date}}.
    required_entities: [Patient, Provider, Facility, Diagnosis, Treatment]
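
To make the placeholder syntax in prompt_template concrete, here is a small sketch of how {{patient.name}}-style placeholders could be filled from a nested context. It assumes simple dotted-path lookup; the fill function, the regular expression, and the sample values are hypothetical, and the real pipeline may render these templates differently.

```python
import re

# Matches {{name}} or {{object.attribute}} with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def fill(template: str, context: dict) -> str:
    """Replace {{a.b}} placeholders with values looked up by dotted path."""
    def lookup(match: re.Match) -> str:
        value = context
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError if the context lacks the key
        return str(value)
    return PLACEHOLDER.sub(lookup, template)


# Hypothetical sample entities, for illustration only.
prompt = fill(
    "Write a discharge summary for patient {{patient.name}} "
    "(ID: {{patient.patient_id}}) discharged from {{facility.name}} on {{date}}.",
    {
        "patient": {"name": "Ada Example", "patient_id": "P-001"},
        "facility": {"name": "Example General"},
        "date": "2024-01-15",
    },
)
print(prompt)
```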
Scenarios used to generate the agent's reasoning memory. Each trace has a task description, a sequence of thought/action steps, and an outcome template.
decision_traces:
  - id: treatment_selection
    task: "Select appropriate treatment plan for {{patient.name}} diagnosed with {{diagnosis.name}}"
    steps:
      - thought: "Review patient history and current diagnoses"
        action: "Query patient entity graph including past encounters, diagnoses, and medications"
      - thought: "Check for medication contraindications and allergies"
        action: "Cross-reference proposed medications with patient allergies and current prescriptions"
    outcome_template: "Treatment plan: {{plan}}. Rationale: {{rationale}}"
Pre-built chat prompts that appear as clickable suggestions in the frontend, giving users a guided tour of the agent’s capabilities.
demo_scenarios:
  - name: Patient Lookup
    prompts:
      - "Show me all patients with a chronic diagnosis"
      - "What medications are currently prescribed to patients in the cardiology department?"
NVL graph visualization configuration: per-label node colors, node sizes (in pixels), and the default Cypher query used when the graph first loads.
visualization:
  node_colors:
    Patient: "#06b6d4"
    Provider: "#14b8a6"
    Diagnosis: "#ef4444"
  node_sizes:
    Patient: 25
    Provider: 25
    Facility: 30
  default_cypher: "MATCH (p:Patient)-[r]-(n) RETURN p, r, n LIMIT 100"

Property types

Entity properties support these types, which map directly to Neo4j types and Python types in the generated code:
| YAML type | Neo4j type | Python type | Notes |
| --- | --- | --- | --- |
| string | STRING | str | Default type if omitted |
| integer | INTEGER | int | |
| float | FLOAT | float | |
| boolean | BOOLEAN | bool | Enum values must be quoted: ["true", "false"] |
| date | DATE | date | |
| datetime | DATETIME | datetime | |
| point | POINT | str | Serialized as a string in Python models |

Properties that set unique: true generate a Cypher uniqueness constraint. Properties that set required: true become mandatory fields in the generated Pydantic models.
YAML boolean values in enum lists must be quoted strings. Write enum: ["true", "false"], not enum: [true, false]. Unquoted YAML booleans will fail ontology validation.
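
As a rough illustration of the type table and the quoting rule, the sketch below resolves a property definition to its Neo4j and Python types and rejects unquoted booleans in an enum list. TYPE_MAP and resolve_property are hypothetical names, not part of create-context-graph; the tool's own validation may be structured differently.

```python
from datetime import date, datetime

# Mapping taken from the table above: YAML type -> (Neo4j type, Python type).
TYPE_MAP = {
    "string": ("STRING", str),
    "integer": ("INTEGER", int),
    "float": ("FLOAT", float),
    "boolean": ("BOOLEAN", bool),
    "date": ("DATE", date),
    "datetime": ("DATETIME", datetime),
    "point": ("POINT", str),  # serialized as a string in Python models
}


def resolve_property(prop: dict) -> tuple[str, type]:
    """Hypothetical helper: look up the Neo4j and Python types for a property."""
    yaml_type = prop.get("type", "string")  # string is the default type
    if yaml_type not in TYPE_MAP:
        raise ValueError(f"Unknown property type: {yaml_type!r}")
    for value in prop.get("enum", []):
        # An unquoted YAML true/false is parsed as a Python bool, not a string.
        if isinstance(value, bool):
            raise ValueError(
                f"Enum value {value!r} on {prop['name']!r} must be a quoted string"
            )
    return TYPE_MAP[yaml_type]


print(resolve_property({"name": "date_of_birth", "type": "date"}))
```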

How the ontology drives everything

The DomainOntology Pydantic model (defined in ontology.py) is loaded from the YAML and passed as context to every Jinja2 template. Each section of the ontology produces a different part of the generated application:

Entity types → Neo4j schema

Each property marked unique: true in an entity_types entry generates a Cypher uniqueness constraint. Every entity type also gets an index on its name property for fast lookups:
// Generated from Patient entity with unique: true on patient_id
CREATE CONSTRAINT patient_patient_id_unique IF NOT EXISTS
  FOR (n:Patient) REQUIRE n.patient_id IS UNIQUE;

CREATE INDEX patient_name IF NOT EXISTS
  FOR (n:Patient) ON (n.name);
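
The mapping from an entity definition to these statements can be approximated as follows. This is a sketch only: the real output comes from a Jinja2 template, the function name schema_statements is hypothetical, and the constraint and index naming pattern is inferred from the Patient example above rather than taken from the tool's source.

```python
def schema_statements(entity: dict) -> list[str]:
    """Build Cypher schema statements for one entity_types entry."""
    label = entity["label"]
    prefix = label.lower()
    statements = []
    for prop in entity.get("properties", []):
        if prop.get("unique"):
            # One uniqueness constraint per property marked unique: true.
            statements.append(
                f"CREATE CONSTRAINT {prefix}_{prop['name']}_unique IF NOT EXISTS\n"
                f"  FOR (n:{label}) REQUIRE n.{prop['name']} IS UNIQUE;"
            )
    # Every entity type gets an index on its name property.
    statements.append(
        f"CREATE INDEX {prefix}_name IF NOT EXISTS\n"
        f"  FOR (n:{label}) ON (n.name);"
    )
    return statements


PATIENT = {
    "label": "Patient",
    "properties": [
        {"name": "patient_id", "type": "string", "required": True, "unique": True},
        {"name": "name", "type": "string", "required": True},
    ],
}

statements = schema_statements(PATIENT)
print("\n\n".join(statements))
```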

Entity types → Pydantic models

Each entity label becomes a Python class in backend/app/models.py. Properties with an enum list generate Python Enum classes. Required properties become mandatory fields; optional ones default to None:
# Generated from the Diagnosis entity_type
from enum import Enum

from pydantic import BaseModel


class DiagnosisCategoryEnum(str, Enum):
    CHRONIC = "chronic"
    ACUTE = "acute"
    INFECTIOUS = "infectious"
    AUTOIMMUNE = "autoimmune"
    GENETIC = "genetic"
    MENTAL_HEALTH = "mental_health"


# DiagnosisSeverityEnum is generated the same way and is omitted from this excerpt.
class Diagnosis(BaseModel):
    """Entity model for Diagnosis."""

    icd_code: str = ...  # Ellipsis marks a required field
    name: str = ...
    category: DiagnosisCategoryEnum | None = None
    severity: DiagnosisSeverityEnum | None = None
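
The enum naming seen in the generated model can be reproduced at runtime with the standard library, which may help when reasoning about what a given enum property will turn into. This sketch is an approximation: the tool emits source code through templates, whereas build_enum (a hypothetical helper) creates the class dynamically, and the naming rules are inferred from the DiagnosisCategoryEnum example.

```python
from enum import Enum


def build_enum(label: str, prop: dict) -> type[Enum]:
    """Hypothetical helper mirroring the naming seen in the generated code."""
    # "category" on "Diagnosis" -> "DiagnosisCategoryEnum"
    prop_part = "".join(word.capitalize() for word in prop["name"].split("_"))
    class_name = f"{label}{prop_part}Enum"
    # "mental_health" -> MENTAL_HEALTH; assumes values are identifier-friendly.
    members = {value.upper(): value for value in prop["enum"]}
    # type=str gives the same str mixin as "class X(str, Enum)".
    return Enum(class_name, members, type=str)


CategoryEnum = build_enum(
    "Diagnosis",
    {"name": "category", "type": "string", "enum": ["chronic", "acute", "mental_health"]},
)
print(CategoryEnum.__name__, [member.name for member in CategoryEnum])
```

Values that are not valid Python identifiers once upper-cased, such as the blood types in the Patient example, would need an extra normalization step that this sketch does not attempt.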

Agent tools → framework-specific code

The agent_tools list is iterated in agent.py.j2 to produce tool definitions in the chosen framework’s idiom. For PydanticAI, each tool becomes an @agent.tool decorated function. For LangGraph, each becomes a @tool function. The Cypher query and parameter definitions come directly from the YAML — no additional code is needed.
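
Stripped of any framework decorator, the idea is that a tool definition carries enough information to build a callable. The sketch below shows one way that could look; make_tool, PY_TYPES, and fake_run are hypothetical names, the parameter type mapping is an assumption, and the generated PydanticAI or LangGraph code will differ in form.

```python
from typing import Any, Callable

# One entry from agent_tools, as it might look after YAML parsing (query shortened).
TOOL = {
    "name": "get_patient_history",
    "description": "Get comprehensive history for a patient",
    "cypher": "MATCH (p:Patient {patient_id: $patient_id}) RETURN p",
    "parameters": [
        {"name": "patient_id", "type": "string", "description": "Patient ID"},
    ],
}

# Assumed mapping from YAML parameter types to Python types.
PY_TYPES = {"string": str, "integer": int, "float": float, "boolean": bool}


def make_tool(spec: dict, run_cypher: Callable[[str, dict], Any]) -> Callable[..., Any]:
    """Build a plain callable from a tool spec (framework decorators omitted)."""
    expected = {p["name"]: PY_TYPES.get(p.get("type", "string"), str)
                for p in spec["parameters"]}

    def tool(**kwargs: Any) -> Any:
        if set(kwargs) != set(expected):
            raise TypeError(f"{spec['name']} expects parameters {sorted(expected)}")
        for name, value in kwargs.items():
            if not isinstance(value, expected[name]):
                raise TypeError(f"{name} must be {expected[name].__name__}")
        # The Cypher text is passed through unchanged; values travel as parameters.
        return run_cypher(spec["cypher"], kwargs)

    tool.__name__ = spec["name"]
    tool.__doc__ = spec["description"]
    return tool


# Hypothetical runner that records calls instead of talking to Neo4j.
calls: list[tuple[str, dict]] = []


def fake_run(query: str, params: dict) -> list:
    calls.append((query, params))
    return []


get_patient_history = make_tool(TOOL, fake_run)
get_patient_history(patient_id="P-001")
print(calls)
```

Passing the runner in as an argument keeps the sketch independent of any Neo4j driver, so the same builder can be exercised in tests without a database.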

Visualization → NVL frontend

The visualization section (plus the color field on each entity type) populates the NVL component configuration in the Next.js frontend. Node colors, sizes, and the initial Cypher query for the graph view all come from the ontology.

Domain-agnostic templates, data-driven output

There are no per-domain template directories. The same Jinja2 templates produce a healthcare app, a financial services app, a wildlife conservation app, or any of the 22 built-in domains. The templates are parameterized entirely by the ontology context. This means:
  • Adding a new domain requires only a YAML file, not new templates.
  • Improvements to templates benefit all 22 domains simultaneously.
  • The template surface area stays small and maintainable regardless of how many domains exist.
The only template that varies by a non-ontology dimension is agent.py.j2, which has one version per supported agent framework. Even those framework-specific templates read the same ontology context — they just express tool definitions and agent setup in different framework idioms.
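
The principle can be shown with a toy example in which a plain Python function stands in for a shared Jinja2 template. Nothing here is taken from the real templates: render_models_header is a hypothetical name, and the second ontology is an invented domain used only to show that the rendering logic does not change between domains.

```python
def render_models_header(ontology: dict) -> str:
    """Stand-in for one shared template: identical logic for every domain."""
    domain = ontology["domain"]
    lines = [f"# Entity labels for the {domain['name']} domain ({domain['id']})"]
    labels = [entity["label"] for entity in ontology["entity_types"]]
    lines.append(f"ENTITY_LABELS = {labels!r}")
    return "\n".join(lines)


HEALTHCARE = {
    "domain": {"id": "healthcare", "name": "Healthcare"},
    "entity_types": [{"label": "Patient"}, {"label": "Diagnosis"}],
}

# Hypothetical second domain, not one of the built-in ontologies.
LIBRARY = {
    "domain": {"id": "library", "name": "Library"},
    "entity_types": [{"label": "Book"}, {"label": "Member"}],
}

outputs = {
    ontology["domain"]["id"]: render_models_header(ontology)
    for ontology in (HEALTHCARE, LIBRARY)
}
for text in outputs.values():
    print(text)
    print()
```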

Extending with custom domains

Beyond the 22 built-in domains, you can create custom ontologies in two ways:
Follow the schema above, place the file in the domains/ directory (or ~/.create-context-graph/custom-domains/), and it becomes available as a --domain option:
uvx create-context-graph my-app \
  --domain my-custom-domain \
  --framework pydanticai
Both paths produce the same result: a DomainOntology object that drives all the templates.

Next steps

Why context graphs?

How graph memory compares to flat stores and vector databases

Ontology YAML schema

Full reference for every field and option in the domain YAML format