Utils

`build_schema_description(schema, header='Feldhinweise und erlaubte Werte (getrennt durch Semikolons):', type_description_prefix='Beschreibung: ', cardinality_prefix='Kardinalität: ', type_prefix='Typ: ', choices_prefix='Zulässige Werte: ', choices_description_prefix='Hinweise zu den Werten: ', component_separator=' | ', choices_separator='; ', indent_step='  ', include_field_descriptions=True, include_type_descriptions=True, indent=0, root_schema=None)`

Build a human‑readable summary for a JSON Schema.

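Output format:

- Optional first line: `<type_description_prefix><schema.description>`, emitted if `type_description_prefix` is not None (and the deprecated `include_type_descriptions` is left at True) and the schema has a description
- Optional header line (only at top level, if `header` is not None)
- One line per property; which components appear depends on which prefix parameters are not None: `<indent>- <name>:[ <description>][<separator><cardinality_prefix><cardinality>][<separator><type_prefix><type>][<separator><choices_prefix><values>][<separator><choices_description_prefix><choices_description>]`
- For nested objects, their properties are included recursively with increased indentation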
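As a rough illustration of how one property line is assembled from its optional components, here is a minimal sketch. The function name `format_field_line` and its signature are hypothetical and not part of the module; the defaults only mirror the documented prefixes and separators.

```python
from __future__ import annotations


def format_field_line(
    name: str,
    description: str | None = None,
    cardinality: str | None = None,
    field_type: str | None = None,
    choices: list[str] | None = None,
    *,
    indent: str = "",
    component_separator: str = " | ",
    cardinality_prefix: str | None = "Kardinalität: ",
    type_prefix: str | None = "Typ: ",
    choices_prefix: str | None = "Zulässige Werte: ",
    choices_separator: str = "; ",
) -> str:
    """Hypothetical helper: join the components of one field line."""
    line = f"{indent}- {name}:"
    if description:
        line += f" {description}"
    # A component is skipped when its value is missing or its prefix is None.
    if cardinality and cardinality_prefix is not None:
        line += f"{component_separator}{cardinality_prefix}{cardinality}"
    if field_type and type_prefix is not None:
        line += f"{component_separator}{type_prefix}{field_type}"
    if choices and choices_prefix is not None:
        line += f"{component_separator}{choices_prefix}" + choices_separator.join(choices)
    return line


print(format_field_line("status", "Bearbeitungsstand", "1", "string", ["offen", "erledigt"]))
# - status: Bearbeitungsstand | Kardinalität: 1 | Typ: string | Zulässige Werte: offen; erledigt
```

Setting a prefix to None drops that component entirely, which is the same switch the real function exposes through its `*_prefix` parameters.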

Cardinality rules:

- type=array ⇒ "0..*"
- non-array with "default" ⇒ "0..1"
- non-array without "default" ⇒ "1"
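The cardinality rules above can be sketched as a small standalone function. The name `cardinality_of` is hypothetical; the logic follows the three rules as stated.

```python
from collections.abc import Mapping
from typing import Any


def cardinality_of(spec: Mapping[str, Any]) -> str:
    """Hypothetical helper: derive the cardinality label for one property spec."""
    if spec.get("type") == "array":
        # Arrays are always reported as "0..*", whether or not a default is set.
        return "0..*"
    # A non-array field with a default may be omitted; without one it is required.
    return "0..1" if "default" in spec else "1"


print(cardinality_of({"type": "array", "items": {"type": "string"}}))  # 0..*
print(cardinality_of({"type": "string", "default": None}))             # 0..1
print(cardinality_of({"type": "integer"}))                             # 1
```

Note that only the presence of the "default" key matters, so a default of `None` still yields "0..1".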

Type extraction:

- Supports inline "type", direct "$ref", and compositions via "allOf"/"anyOf"/"oneOf"
- For arrays, type is taken from "items"

Choices extraction:

- Supports inline enums, direct "$ref", and compositions via "allOf"/"anyOf"/"oneOf"
- For arrays, choices are taken from "items" (including "$ref" or compositions)
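To make the lookup paths concrete, here is a simplified sketch of collecting enum values through "$ref", "items", and compositions. The helpers `resolve_ref` and `collect_choices` are hypothetical stand-ins, not the module's private helpers, and they take a simpler approach: only local `#/...` references are handled, JSON Pointer escaping is ignored, and all composition branches are merged.

```python
from collections.abc import Mapping
from typing import Any


def resolve_ref(root: Mapping[str, Any], ref: str) -> Mapping[str, Any]:
    """Hypothetical helper: follow a local reference such as '#/$defs/Color'."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local refs are handled in this sketch: {ref!r}")
    node: Any = root
    for part in ref[2:].split("/"):
        node = node[part]
    return node


def collect_choices(root: Mapping[str, Any], node: Mapping[str, Any]) -> list[str]:
    """Hypothetical helper: gather enum values reachable from a field schema."""
    if "$ref" in node:
        return collect_choices(root, resolve_ref(root, node["$ref"]))
    if node.get("type") == "array" and isinstance(node.get("items"), Mapping):
        # For arrays, the choices live on the item schema.
        return collect_choices(root, node["items"])
    if "enum" in node:
        return [str(value) for value in node["enum"]]
    choices: list[str] = []
    for key in ("allOf", "anyOf", "oneOf"):
        for branch in node.get(key, []):
            for choice in collect_choices(root, branch):
                if choice not in choices:  # keep first-seen order, no duplicates
                    choices.append(choice)
    return choices


schema = {
    "$defs": {"Color": {"type": "string", "enum": ["red", "green"]}},
    "properties": {
        "colors": {"type": "array", "items": {"$ref": "#/$defs/Color"}},
        "shade": {"anyOf": [{"$ref": "#/$defs/Color"}, {"type": "null"}]},
    },
}
print(collect_choices(schema, schema["properties"]["colors"]))  # ['red', 'green']
print(collect_choices(schema, schema["properties"]["shade"]))   # ['red', 'green']
```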

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `schema` | `Mapping[str, Any]` | The JSON Schema dictionary to process | required |
| `header` | `str \| None` | Header text for field list (only shown at top level, None to omit) | `'Feldhinweise und erlaubte Werte (getrennt durch Semikolons):'` |
| `type_description_prefix` | `str \| None` | Prefix for type descriptions (descriptions for the overall schema and nested schemas, not field descriptions; None to omit type descriptions) | `'Beschreibung: '` |
| `cardinality_prefix` | `str \| None` | Prefix for cardinality information (None to omit cardinality) | `'Kardinalität: '` |
| `type_prefix` | `str \| None` | Prefix for type information (None to omit types) | `'Typ: '` |
| `choices_prefix` | `str \| None` | Prefix for choices value lists (None to omit choices) | `'Zulässige Werte: '` |
| `choices_description_prefix` | `str \| None` | Prefix for choices descriptions (None to omit choices descriptions) | `'Hinweise zu den Werten: '` |
| `component_separator` | `str` | Separator between field components (name, cardinality, type, choices) | `' \| '` |
| `choices_separator` | `str` | Separator between individual choices values | `'; '` |
| `indent_step` | `str` | String used for each indentation level | `'  '` |
| `include_field_descriptions` | `bool` | Whether to include field/property descriptions in the output | `True` |
| `include_type_descriptions` | `bool` | Whether to include schema/type descriptions (top-level and nested) in the output. NOTE: This is deprecated; please set type_description_prefix to None to omit type descriptions instead. | `True` |
| `indent` | `int` | Current indentation level (internal, for recursion) | `0` |
| `root_schema` | `Mapping[str, Any] \| None` | Root schema containing $defs (internal, for recursion) | `None` |

Returns:

| Type | Description |
| --- | --- |
| `str` | Multi-line string summarizing schema structure, fields, and constraints |

Source code in `src/kibad_llm/schema/utils.py`

def build_schema_description(
    schema: Mapping[str, Any],
    header: str | None = "Feldhinweise und erlaubte Werte (getrennt durch Semikolons):",
    type_description_prefix: str | None = "Beschreibung: ",
    cardinality_prefix: str | None = "Kardinalität: ",
    type_prefix: str | None = "Typ: ",
    choices_prefix: str | None = "Zulässige Werte: ",
    choices_description_prefix: str | None = "Hinweise zu den Werten: ",
    component_separator: str = " | ",
    choices_separator: str = "; ",
    indent_step: str = "  ",
    include_field_descriptions: bool = True,
    include_type_descriptions: bool = True,
    # internal args
    indent: int = 0,
    root_schema: Mapping[str, Any] | None = None,
) -> str:
    """
    Build a human‑readable summary for a JSON Schema.

    Output format:
    - Optional first line: "<type_description_prefix><schema.description>" if type_description_prefix is not None
      and the description exists
    - Optional header line (only at top level if header is not None)
    - One line per property with format depending on which prefix parameters are not None:
      "<indent>- <name>:[ <description>][<separator><cardinality_prefix><cardinality>][<separator><type_prefix><type>][<separator><choices_prefix><values>][<separator><choices_description_prefix><choices_description>]"
    - For nested objects, recursively includes their properties with increased indentation

    Cardinality rules:
    - type=array ⇒ "0..*"
    - non-array with "default" ⇒ "0..1"
    - non-array without "default" ⇒ "1"

    Type extraction:
    - Supports inline "type", direct "$ref", and compositions via "allOf"/"anyOf"/"oneOf"
    - For arrays, type is taken from "items"

    Choices extraction:
    - Supports inline enums, direct "$ref", and compositions via "allOf"/"anyOf"/"oneOf"
    - For arrays, choices are taken from "items" (including "$ref" or compositions)

    Args:
        schema: The JSON Schema dictionary to process
        header: Header text for field list (only shown at top level, None to omit)
        type_description_prefix: Prefix for type descriptions (descriptions for the overall schema and nested schemas, not field descriptions)
        cardinality_prefix: Prefix for cardinality information (None to omit cardinality)
        type_prefix: Prefix for type information (None to omit types)
        choices_prefix: Prefix for choices value lists (None to omit choices)
        choices_description_prefix: Prefix for choices descriptions (None to omit choices descriptions)
        component_separator: Separator between field components (name, cardinality, type, choices)
        choices_separator: Separator between individual choices values
        indent_step: String used for each indentation level
        include_field_descriptions: Whether to include field/property descriptions in the output
        include_type_descriptions: Whether to include schema/type descriptions (top-level and nested) in the output.
            NOTE: This is deprecated; please set type_description_prefix to None to omit type descriptions instead.
        indent: Current indentation level (internal, for recursion)
        root_schema: Root schema containing $defs (internal, for recursion)

    Returns:
        Multi-line string summarizing schema structure, fields, and constraints
    """
    if not include_type_descriptions:
        type_description_prefix = None
        warn_once(
            "include_type_descriptions is deprecated; please set type_description_prefix to None instead "
            "of using include_type_descriptions=False."
        )

    if root_schema is None:
        root_schema = schema

    lines = []
    prefix = indent_step * indent

    # Add description
    if type_description_prefix is not None:
        # remove all newlines and extra spaces from the description
        schema_desc = _norm_desc(schema.get("description"))
        if schema_desc:
            lines.append(f"{prefix}{type_description_prefix}{schema_desc}")

    if header:
        lines.append(header)

    props: dict = schema.get("properties", {}) or {}
    for name, spec in props.items():
        # Single check for array vs non-array handling
        is_array = spec.get("type") == "array"
        target = spec.get("items") if is_array else spec
        target_for_hints = _pick_preferred_branch(target, root_schema)

        # Determine cardinality
        has_default = "default" in spec
        cardinality = "0..*" if is_array else ("0..1" if has_default else "1")

        # Extract type and choices from target
        field_type = _extract_type(root_schema, target_for_hints)
        # use choices_separator also to join the enum *descriptions* if needed
        choices_with_description = _extract_choices_with_description(
            root_schema, target_for_hints, description_separator=choices_separator
        )

        # Build field line
        hint = f"{prefix}- {name}:"
        # append the field description only if enabled and present
        if include_field_descriptions:
            # remove all newlines and extra spaces from the description
            desc = _norm_desc(spec.get("description"))
            if desc:
                hint += f" {desc}"
        if cardinality_prefix is not None:
            hint += f"{component_separator}{cardinality_prefix}{cardinality}"
        if field_type and type_prefix is not None:
            hint += f"{component_separator}{type_prefix}{field_type}"
        if choices_with_description and choices_prefix is not None:
            choices, choices_desc = choices_with_description
            hint += f"{component_separator}{choices_prefix}" + choices_separator.join(choices)
            # remove all newlines and extra spaces from the choices description
            choices_desc = _norm_desc(choices_desc)
            if choices_desc and choices_description_prefix is not None:
                hint += f"{component_separator}{choices_description_prefix}{choices_desc}"

        lines.append(hint)

        # Handle nested objects recursively:
        # - $ref objects
        # - inline object schemas with "properties" (needed for metadata wrappers)
        if field_type == "object" and isinstance(target_for_hints, ABCMapping):
            nested_schema: Mapping[str, Any] | None = None

            ref = target_for_hints.get("$ref")
            if isinstance(ref, str):
                nested_schema = _resolve_ref(root_schema, ref)
            elif isinstance(target_for_hints.get("properties"), ABCMapping):
                nested_schema = target_for_hints

            if nested_schema:
                nested_content = build_schema_description(
                    nested_schema,
                    indent=indent + 1,
                    root_schema=root_schema,
                    # no header for nested
                    header=None,
                    type_description_prefix=type_description_prefix,
                    cardinality_prefix=cardinality_prefix,
                    type_prefix=type_prefix,
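                    choices_prefix=choices_prefix,
                    # forward this too, so that None also suppresses choices descriptions in nested objects
                    choices_description_prefix=choices_description_prefix,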
                    component_separator=component_separator,
                    choices_separator=choices_separator,
                    indent_step=indent_step,
                    include_field_descriptions=include_field_descriptions,
                )
                lines.append(nested_content)

    return "\n".join(lines)

`wrap_terminals_with_metadata(schema, metadata_schema, *, content_key=WRAPPED_CONTENT_KEY, content_description=None)`

Wrap every terminal field schema (scalars/enums/const, including nullable unions and refs) into an object containing:

- `<content_key>`: the original terminal schema
- plus metadata fields from `metadata_schema` (default: evidence_anchor: string)

Notes:

- Does NOT wrap the root of $defs entries (important for shared enum defs).
- DOES wrap terminal fields inside object definitions in $defs.
- Returns a deep-copied dict; input is not mutated.
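The wrapping idea can be illustrated with a deliberately reduced sketch. It is not the module's implementation: `wrap_terminal` and `wrap_node` are hypothetical names, the content key `"content"` is an assumption (the actual value of `WRAPPED_CONTENT_KEY` is not shown here), and only inline `type`, `properties`, and `items` are handled, with no `$ref`, `$defs`, or combinators.

```python
from copy import deepcopy
from typing import Any

CONTENT_KEY = "content"  # assumption: stands in for WRAPPED_CONTENT_KEY


def wrap_terminal(terminal: dict[str, Any], metadata_properties: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: put a terminal schema next to its metadata fields."""
    return {
        "type": "object",
        "properties": {CONTENT_KEY: deepcopy(terminal), **deepcopy(metadata_properties)},
    }


def wrap_node(node: dict[str, Any], metadata_properties: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: recurse into objects and arrays, wrap everything else."""
    node = dict(node)  # shallow copy; nested parts are rebuilt below
    if node.get("type") == "object":
        if isinstance(node.get("properties"), dict):
            node["properties"] = {
                name: wrap_node(spec, metadata_properties)
                for name, spec in node["properties"].items()
            }
        return node
    if node.get("type") == "array":
        if isinstance(node.get("items"), dict):
            node["items"] = wrap_node(node["items"], metadata_properties)
        return node
    return wrap_terminal(node, metadata_properties)


schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
}
metadata = {"evidence_anchor": {"type": "string"}}
wrapped = wrap_node(schema, metadata)
print(wrapped["properties"]["title"])
```

In this sketch the array itself stays an array and each item is wrapped, while the input `schema` is left unchanged.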

Source code in `src/kibad_llm/schema/utils.py`

def wrap_terminals_with_metadata(
    schema: Mapping[str, Any],
    metadata_schema: Mapping[str, Any],
    *,
    content_key: str = WRAPPED_CONTENT_KEY,
    content_description: str | None = None,
) -> dict[str, Any]:
    """
    Wrap every terminal field schema (scalars/enums/const, including nullable unions and refs)
    into an object containing:
      - <content_key>: the original terminal schema
      - plus metadata fields from `metadata_schema` (default: evidence_anchor: string)

    Notes:
    - Does NOT wrap the root of $defs entries (important for shared enum defs).
    - DOES wrap terminal fields inside object definitions in $defs.
    - Returns a deep-copied dict; input is not mutated.
    """
    from copy import deepcopy

    root: dict[str, Any] = deepcopy(dict(schema))
    metadata_obj_schema = _normalize_metadata_schema(metadata_schema)

    def transform(node: Any, *, allow_wrap_here: bool = True) -> Any:
        if isinstance(node, list):
            return [transform(x, allow_wrap_here=True) for x in node]
        if not isinstance(node, ABCMapping):
            return node

        node_dict: dict[str, Any] = dict(node)

        if _is_metadata_wrapper(
            node_dict, metadata_obj_schema=metadata_obj_schema, content_key=content_key
        ):
            return node_dict

        if isinstance(node_dict.get("type"), list):
            raise ValueError(
                "Encountered JSON Schema 'type' as a list. "
                "This code expects Pydantic-style unions via anyOf/oneOf. "
                "Please normalize type-lists to anyOf/oneOf first or update the wrapper."
            )

        if allow_wrap_here and _schema_should_be_wrapped(root, node_dict):
            return _wrap_value_schema_with_metadata(
                node_dict,
                metadata_obj_schema=metadata_obj_schema,
                content_key=content_key,
                content_description=content_description,
            )

        # combinators
        for k in ("anyOf", "oneOf", "allOf"):
            v = node_dict.get(k)
            if isinstance(v, list):
                node_dict[k] = [transform(s, allow_wrap_here=True) for s in v]

        # object keywords
        props = node_dict.get("properties")
        if isinstance(props, ABCMapping):
            node_dict["properties"] = {
                name: transform(spec, allow_wrap_here=True) for name, spec in props.items()
            }

        pat_props = node_dict.get("patternProperties")
        if isinstance(pat_props, ABCMapping):
            node_dict["patternProperties"] = {
                pat: transform(spec, allow_wrap_here=True) for pat, spec in pat_props.items()
            }

        add_props = node_dict.get("additionalProperties")
        if isinstance(add_props, ABCMapping):
            node_dict["additionalProperties"] = transform(add_props, allow_wrap_here=True)

        uneval_props = node_dict.get("unevaluatedProperties")
        if isinstance(uneval_props, ABCMapping):
            node_dict["unevaluatedProperties"] = transform(uneval_props, allow_wrap_here=True)

        # array keywords
        items = node_dict.get("items")
        if isinstance(items, ABCMapping):
            node_dict["items"] = transform(items, allow_wrap_here=True)

        prefix_items = node_dict.get("prefixItems")
        if isinstance(prefix_items, list):
            node_dict["prefixItems"] = [transform(s, allow_wrap_here=True) for s in prefix_items]

        # other schema-bearing keywords (common)
        for k in ("not", "if", "then", "else", "contains", "propertyNames", "dependentSchemas"):
            v = node_dict.get(k)
            if isinstance(v, ABCMapping):
                node_dict[k] = transform(v, allow_wrap_here=True)
            elif isinstance(v, list):
                node_dict[k] = [transform(x, allow_wrap_here=True) for x in v]

        # defs: don't wrap the *root* of each def, but do transform inside it
        for defs_key in ("$defs", "definitions"):
            defs = node_dict.get(defs_key)
            if isinstance(defs, ABCMapping):
                node_dict[defs_key] = {
                    n: transform(s, allow_wrap_here=False) for n, s in defs.items()
                }

        return node_dict

    return transform(root, allow_wrap_here=True)