Schema & I/O

BLOGE treats operator inputs and outputs as contracts that should be visible to both humans and tooling. The schema system exists to remove guesswork from graph authoring, validation, testing, and visualization.

Why explicit schemas matter

Without schemas, downstream consumers have to read operator source code to discover an operator's output shape. That makes DSL authoring error-prone, runtime mismatches harder to diagnose, and visual tooling less useful.

With explicit or inferred schemas, BLOGE can:

  • validate field paths during graph build and DSL compilation
  • document operator inputs and outputs automatically
  • surface shape information in Studio and metadata JSON
  • detect mismatches earlier, before production traffic hits a broken path

Core schema model

BLOGE's schema layer is centered around SchemaDescriptor.

| Type | Meaning |
| --- | --- |
| StructuredSchema | Field-level schema with nested structure |
| TypedSchema | Simple type reference when field expansion is unnecessary |
| OpaqueSchema | Escape hatch when shape is unknown or intentionally undeclared |

A structured schema is composed of FieldDescriptor records describing field name, type, required-ness, optional nested schema, and documentation.
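The relationship between the three descriptor kinds and FieldDescriptor can be pictured with a small Java sketch. The record shapes, component names, and the describe helper below are assumptions made for illustration (Java 17+); BLOGE's actual class definitions may differ.

```java
import java.util.List;

// Hypothetical shapes for illustration only; not BLOGE's real classes.
final class SchemaModelSketch {

    sealed interface SchemaDescriptor permits StructuredSchema, TypedSchema, OpaqueSchema {}

    // Field-level schema: a list of described fields, possibly nested.
    record StructuredSchema(List<FieldDescriptor> fields) implements SchemaDescriptor {}

    // Simple type reference with no field expansion.
    record TypedSchema(String typeName) implements SchemaDescriptor {}

    // Shape unknown or intentionally undeclared.
    record OpaqueSchema() implements SchemaDescriptor {}

    // Field name, type, required-ness, optional nested schema, documentation.
    record FieldDescriptor(String name, String type, boolean required,
                           SchemaDescriptor nested, String doc) {}

    // Renders a one-line summary of a descriptor.
    static String describe(SchemaDescriptor schema) {
        if (schema instanceof StructuredSchema s) {
            return "structured(" + s.fields().size() + " fields)";
        } else if (schema instanceof TypedSchema t) {
            return "typed(" + t.typeName() + ")";
        }
        return "opaque";
    }
}
```

Modeling the descriptor as a closed set of kinds means tooling can handle every case explicitly, with OpaqueSchema as the visible fallback rather than a silent null.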

Where schemas come from

BLOGE can discover schema from several sources.

Java-side inference

For Java operators, the runtime can infer schema from the Operator<I, O> generic types.

Typical behavior:

  • Java records -> recursive field extraction
  • POJOs -> getter-based inspection
  • Map<String, Object> -> usually degrades to OpaqueSchema

Operators can override inference by implementing SchemaAware and returning explicit schemas.
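To make the "records -> recursive field extraction" case concrete, the following sketch walks record components with standard Java reflection. It is not BLOGE's introspector: the class name, the Address and UserView records, and the map-based result are all illustrative assumptions (Java 17+).

```java
import java.lang.reflect.RecordComponent;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a simplified stand-in for record-based schema inference.
final class RecordInferenceSketch {

    record Address(String city, String zip) {}
    record UserView(String id, String name, Address address) {}

    // Returns field name -> simple type name, or a nested map for record-typed components.
    static Map<String, Object> infer(Class<?> type) {
        if (!type.isRecord()) {
            throw new IllegalArgumentException(type.getName() + " is not a record");
        }
        Map<String, Object> fields = new LinkedHashMap<>();
        for (RecordComponent component : type.getRecordComponents()) {
            Class<?> componentType = component.getType();
            if (componentType.isRecord()) {
                fields.put(component.getName(), infer(componentType)); // recurse into nested records
            } else {
                fields.put(component.getName(), componentType.getSimpleName());
            }
        }
        return fields;
    }
}
```

Calling RecordInferenceSketch.infer(RecordInferenceSketch.UserView.class) yields a map whose address entry is itself a map of city and zip, which is the kind of nested structure a StructuredSchema would carry.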

DSL-side declarations

The DSL can declare schemas inline or through reusable schema blocks.

bloge
schema UserOutput {
  id: Int
  name: String
  email: String?
}

node fetchUser {
  operator: "FetchUserOperator"
  output: UserOutput
}

Inline declarations are especially useful when a graph is defined externally and the compiler should validate field paths before runtime.

Transform inference

transform fields also participate in the schema system. Their types can come from:

  1. explicit type annotations
  2. expression type inference
  3. an Unknown fallback when inference is incomplete
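Assuming the numbered list above is a precedence order, the resolution can be sketched in a few lines of Java. The class and method names are hypothetical, and types are reduced to plain strings for brevity.

```java
import java.util.Optional;

// Hypothetical sketch of transform field type resolution; not BLOGE's compiler code.
final class TransformTypeSketch {

    // Prefers the explicit annotation, then the inferred type, then "Unknown".
    static String resolve(Optional<String> annotated, Optional<String> inferred) {
        return annotated.or(() -> inferred).orElse("Unknown");
    }
}
```

For example, resolve(Optional.empty(), Optional.of("Int")) picks the inferred type, while two empty inputs fall through to the Unknown fallback.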

Validation stages

BLOGE can validate schemas at three stages.

| Stage | What it checks |
| --- | --- |
| Graph build time | Compatibility across edges and referenced fields |
| DSL compile time | Path validity, declared schema references, and binding compatibility |
| Runtime | Actual values versus declared required fields and types |

Validation strictness is controlled by SchemaValidationLevel:

  • OFF
  • WARN
  • ERROR

This lets teams start with warnings and tighten enforcement as contracts stabilize.
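One plausible reading of the three levels is sketched below. Only the enum name and its constants come from this page; the report method, the warnings list, and the choice of IllegalStateException are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of how a validator might react to each strictness level.
final class ValidationLevelSketch {

    enum SchemaValidationLevel { OFF, WARN, ERROR }

    final List<String> warnings = new ArrayList<>();

    // OFF ignores the problem, WARN records it, ERROR fails fast.
    void report(SchemaValidationLevel level, String problem) {
        switch (level) {
            case OFF -> { }
            case WARN -> warnings.add(problem);
            case ERROR -> throw new IllegalStateException(problem);
        }
    }
}
```

Running a graph at WARN first lets a team collect the recorded problems and fix them before switching the same graph to ERROR.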

Example: inferred Java schema

java
public record UserQuery(String userId) {}
public record UserView(String id, String name, String email) {}

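public final class FetchUserOperator implements Operator<UserQuery, UserView> {
    // UserService is a hypothetical collaborator, injected so the field is defined.
    private final UserService userService;

    public FetchUserOperator(UserService userService) {
        this.userService = userService;
    }

    @Override
    public UserView execute(UserQuery input, OperatorContext ctx) {
        return userService.find(input.userId());
    }
}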

Here BLOGE can infer both input and output schema automatically from record components.

Example: metadata export

The Maven plugin can export operator schema into operator-metadata.json for visual tooling:

json
{
  "name": "FetchUserOperator",
  "inputSchema": {
    "kind": "structured",
    "fields": [{ "name": "userId", "type": "String", "required": true }]
  },
  "outputSchema": {
    "kind": "structured",
    "fields": [
      { "name": "id", "type": "String", "required": true },
      { "name": "name", "type": "String", "required": true },
      { "name": "email", "type": "String", "required": false }
    ]
  }
}

Studio can then use this data for operator catalogs, field completion, and data-flow visualization.

Advanced schema constraints

The schema layer goes beyond field names and types. These constraints are type-checked at compile time and enforced at runtime by SchemaValidator.

Allowed values (oneOf)

Constrain a field to an enum-like set of literals:

bloge
schema OrderState {
  status: String = oneOf("pending", "approved", "rejected")
}

Branch validation reports an error when a case literal falls outside a field's allowed-value set, which catches unreachable or misspelled branches before runtime.
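The check itself amounts to a set-membership test over the branch's case literals. The helper below is a hypothetical sketch, not BLOGE's compiler code; its name and signature are assumptions.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the oneOf branch check.
final class AllowedValuesSketch {

    // Returns the case literals that are not in the field's allowed-value set.
    static List<String> unreachableCases(Set<String> allowed, List<String> caseLiterals) {
        return caseLiterals.stream()
                .filter(literal -> !allowed.contains(literal))
                .toList();
    }
}
```

With the OrderState schema above, a branch containing the misspelled case "aproved" would be flagged, because that literal is not among "pending", "approved", and "rejected".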

Union schemas & branch narrowing

A field may declare a union of named member schemas:

bloge
schema PaymentResult = PaymentSuccess | PaymentFailure

SchemaValidator accepts a value matching any member and rejects values matching none. A union's discriminator is a shared field whose oneOf(...) sets do not overlap across members. When you branch on it, each branch arm is narrowed to the matched member, so downstream nodes can reference fields that exist only on that member.
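Narrowing depends on the discriminator values being disjoint, which can be sketched as building a value-to-member index. Everything here is an illustrative assumption: the class and method names, and the discriminator values used in the usage note.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch of discriminator-based narrowing for a union schema.
final class UnionNarrowingSketch {

    // Input: member schema name -> allowed discriminator values.
    // Output: discriminator value -> member schema name. Overlapping values are rejected.
    static Map<String, String> buildIndex(Map<String, Set<String>> members) {
        Map<String, String> valueToMember = new HashMap<>();
        for (var member : members.entrySet()) {
            for (String value : member.getValue()) {
                String previous = valueToMember.put(value, member.getKey());
                if (previous != null) {
                    throw new IllegalArgumentException(
                            "discriminator value '" + value + "' is shared by "
                                    + previous + " and " + member.getKey());
                }
            }
        }
        return valueToMember;
    }

    // Returns the member a branch arm narrows to, or empty if no member matches.
    static Optional<String> narrow(Map<String, String> index, String discriminatorValue) {
        return Optional.ofNullable(index.get(discriminatorValue));
    }
}
```

If PaymentSuccess allowed the value "ok" and PaymentFailure allowed "declined" and "error", a branch arm on "declined" would narrow to PaymentFailure, and an overlapping value would be rejected when the index is built.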

Generic schema retention

CollectionSchema and MapSchema descriptors preserve element / key / value schemas, and FieldDescriptor.genericType retains the user-facing signature (e.g. List<Order>). The introspector, JSON codec, and validator all operate on the full parameterized types rather than collapsing them to raw collections.
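Standard Java reflection already exposes the information needed to keep parameterized types. The sketch below shows how an element type can be recovered from a record component. It is not BLOGE's introspector, and the Order and Cart records and the elementType helper are hypothetical.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.RecordComponent;
import java.lang.reflect.Type;
import java.util.List;

// Illustrative only: recovering List<Order>'s element type via reflection.
final class GenericRetentionSketch {

    record Order(String id) {}
    record Cart(List<Order> items) {}

    // Returns the first type argument of the record's first component,
    // or null when that component is not parameterized.
    static Type elementType(Class<?> recordType) {
        RecordComponent first = recordType.getRecordComponents()[0];
        if (first.getGenericType() instanceof ParameterizedType parameterized) {
            return parameterized.getActualTypeArguments()[0];
        }
        return null;
    }
}
```

Here getType() on the items component reports only the raw List, while getGenericType() still knows the element is Order, which is the distinction that generic retention preserves.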

Safe navigation (?.)

Path expressions support null-propagating navigation in any input binding:

bloge
input {
  city = ctx.user?.address?.city
  hint = fetch?.output.result?.value
}

Safe-navigation metadata flows through the AST and codegen, so a null mid-path yields a null result instead of an error.
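The runtime effect is comparable to the following Java helper, which treats every step as a safe step. It is a sketch over nested maps, not BLOGE's generated code; the class and method names are assumptions.

```java
import java.util.Map;

// Hypothetical sketch of null-propagating path navigation over nested maps.
final class SafeNavigationSketch {

    // Walks the path one segment at a time. A null, missing, or non-map
    // value in the middle of the path ends navigation with null.
    static Object navigate(Object root, String... path) {
        Object current = root;
        for (String segment : path) {
            if (!(current instanceof Map<?, ?> map)) {
                return null;
            }
            current = map.get(segment);
        }
        return current;
    }
}
```

Navigating "user", "address", "city" over a context whose user has no address returns null rather than throwing, mirroring ctx.user?.address?.city.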

Await signal schemas

await nodes can declare per-event signal_schema constraints, which are compiled into node metadata and enforced at runtime. publishEvent rejects payloads that violate an event-specific schema before they can corrupt AND-mode correlations, and signal validates the aggregated payload before resuming. Violations raise SignalSchemaViolationException.

Schema evolution

VersionedSchema wraps any SchemaDescriptor with a version string, a deprecated-field set, and a reserved-field-name set. SchemaEvolutionChecker compares consecutive versions and classifies each change:

| Classification | Examples |
| --- | --- |
| FullyCompatible | Add an optional field |
| BackwardCompatible | Remove an optional field, required → optional, union add member, deprecation |
| BreakingChange | Add/remove a required field, type change, optional → required, union remove member, reserved-name reuse |

At publish time GraphEngine invokes GraphRegistryStore.checkEvolution(...): breaking changes are rejected with GraphDefinitionException, while backward-compatible changes are logged as warnings. This enforces compatibility before a new graph version ever serves traffic.
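A reduced version of the classification rules can be sketched as follows. It covers only field additions, removals, and required-ness changes, with each schema version flattened to a field name -> required map. The class, enum constants, and method are hypothetical and not the real SchemaEvolutionChecker API.

```java
import java.util.Map;

// Hypothetical, simplified sketch of schema change classification.
final class EvolutionSketch {

    // Declared from least to most severe so ordinal order can rank changes.
    enum Compatibility { FULLY_COMPATIBLE, BACKWARD_COMPATIBLE, BREAKING_CHANGE }

    // Each map is field name -> required flag. Returns the most severe change found.
    static Compatibility classify(Map<String, Boolean> oldFields, Map<String, Boolean> newFields) {
        Compatibility worst = Compatibility.FULLY_COMPATIBLE;
        for (var field : oldFields.entrySet()) {
            boolean wasRequired = field.getValue();
            Boolean nowRequired = newFields.get(field.getKey());
            if (nowRequired == null) {
                // Field removed: breaking if it was required, otherwise backward-compatible.
                worst = max(worst, wasRequired
                        ? Compatibility.BREAKING_CHANGE
                        : Compatibility.BACKWARD_COMPATIBLE);
            } else if (wasRequired && !nowRequired) {
                worst = max(worst, Compatibility.BACKWARD_COMPATIBLE); // required -> optional
            } else if (!wasRequired && nowRequired) {
                worst = max(worst, Compatibility.BREAKING_CHANGE);     // optional -> required
            }
        }
        for (var field : newFields.entrySet()) {
            // Adding a required field is breaking; adding an optional one changes nothing.
            if (!oldFields.containsKey(field.getKey()) && field.getValue()) {
                worst = max(worst, Compatibility.BREAKING_CHANGE);
            }
        }
        return worst;
    }

    private static Compatibility max(Compatibility a, Compatibility b) {
        return a.compareTo(b) >= 0 ? a : b;
    }
}
```

Reporting the most severe change matches the publish-time behavior described above: one breaking change is enough to reject the whole version, regardless of how many compatible changes accompany it.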

Practical guidance

  • Prefer Java records for clean inference whenever possible.
  • Use explicit schemas when a graph is authored in DSL and path validation matters.
  • Treat OpaqueSchema as a temporary escape hatch, not the default destination.
  • Version breaking output changes intentionally instead of mutating contracts invisibly.
  • Keep transform outputs typed when they become shared downstream dependencies.

What schema validation is not

Schema validation does not replace domain validation. It answers questions like:

  • does fetchUser.output.address.city exist?
  • is this field optional or required?
  • does the downstream node expect a compatible shape?

It does not decide whether a value is semantically correct for the business domain.
