Schema & I/O
BLOGE treats operator inputs and outputs as contracts that should be visible to both humans and tooling. The schema system exists to remove guesswork from graph authoring, validation, testing, and visualization.
Why explicit schemas matter
Without schemas, downstream consumers have to read operator source code to discover output shape. That makes DSL authoring weaker, runtime mismatches harder to diagnose, and visual tooling less useful.
With explicit or inferred schemas, BLOGE can:
- validate field paths during graph build and DSL compilation
- document operator inputs and outputs automatically
- surface shape information in Studio and metadata JSON
- detect mismatches earlier, before production traffic hits a broken path
Core schema model
BLOGE's schema layer is centered around SchemaDescriptor.
| Type | Meaning |
|---|---|
| StructuredSchema | Field-level schema with nested structure |
| TypedSchema | Simple type reference when field expansion is unnecessary |
| OpaqueSchema | Escape hatch when shape is unknown or intentionally undeclared |
A structured schema is composed of FieldDescriptor records describing field name, type, required-ness, optional nested schema, and documentation.
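To make the model above concrete, here is a minimal Java sketch of how the three schema kinds and FieldDescriptor could fit together. The record components, the typeName component, and the describe helper are assumptions made for illustration; the actual BLOGE types may be shaped differently.

```java
import java.util.List;

// Hypothetical shapes for illustration only; the real BLOGE types may differ.
final class SchemaModelSketch {

    sealed interface SchemaDescriptor permits StructuredSchema, TypedSchema, OpaqueSchema {}

    // One field: name, type, required-ness, optional nested schema, documentation.
    record FieldDescriptor(String name, String type, boolean required,
                           SchemaDescriptor nested, String doc) {}

    record StructuredSchema(List<FieldDescriptor> fields) implements SchemaDescriptor {}

    record TypedSchema(String typeName) implements SchemaDescriptor {}

    record OpaqueSchema() implements SchemaDescriptor {}

    // Renders a one-line summary; optional fields get a trailing '?'.
    static String describe(SchemaDescriptor schema) {
        if (schema instanceof StructuredSchema s) {
            StringBuilder sb = new StringBuilder("{");
            for (int i = 0; i < s.fields().size(); i++) {
                FieldDescriptor f = s.fields().get(i);
                if (i > 0) sb.append(", ");
                sb.append(f.name()).append(": ").append(f.type());
                if (!f.required()) sb.append('?');
            }
            return sb.append('}').toString();
        }
        if (schema instanceof TypedSchema t) {
            return t.typeName();
        }
        return "opaque";
    }

    public static void main(String[] args) {
        SchemaDescriptor user = new StructuredSchema(List.of(
                new FieldDescriptor("id", "Int", true, null, "User id"),
                new FieldDescriptor("email", "String", false, null, "Contact email")));
        System.out.println(describe(user)); // prints {id: Int, email: String?}
    }
}
```

Modeling the descriptor as a closed set of variants keeps tooling code exhaustive: anything that is neither structured nor typed is treated as opaque.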
Where schemas come from
BLOGE can discover schema from several sources.
Java-side inference
For Java operators, the runtime can infer schema from the Operator<I, O> generic types.
Typical behavior:
- Java records -> recursive field extraction
- POJOs -> getter-based inspection
- Map<String, Object> -> usually degrades to OpaqueSchema
Operators can override inference by implementing SchemaAware and returning explicit schemas.
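The recursive field extraction described for records can be sketched with standard Java reflection. This is not BLOGE's introspector: the infer helper and the nested-map result format are assumptions chosen to keep the example short, and self-referential records are not handled.

```java
import java.lang.reflect.RecordComponent;
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch of record-based inference; not BLOGE's actual introspector.
final class RecordInferenceSketch {

    record Address(String city, String zip) {}

    record UserView(String id, String name, Address address) {}

    // Maps each record component to a type name, recursing into nested records.
    // Note: a record that refers to itself would recurse forever in this sketch.
    static Map<String, Object> infer(Class<?> type) {
        if (!type.isRecord()) {
            throw new IllegalArgumentException(type.getName() + " is not a record");
        }
        Map<String, Object> fields = new LinkedHashMap<>();
        for (RecordComponent component : type.getRecordComponents()) {
            Class<?> componentType = component.getType();
            fields.put(component.getName(),
                    componentType.isRecord() ? infer(componentType) : componentType.getSimpleName());
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(infer(UserView.class));
        // prints {id=String, name=String, address={city=String, zip=String}}
    }
}
```

Records work well here because getRecordComponents returns components in declaration order, so no getter-naming heuristics are needed.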
DSL-side declarations
The DSL can declare schemas inline or through reusable schema blocks.
schema UserOutput {
  id: Int
  name: String
  email: String?
}
node fetchUser {
  operator: "FetchUserOperator"
  output: UserOutput
}

Inline declarations are especially useful when a graph is defined externally and the compiler should validate field paths before runtime.
Transform inference
transform fields also participate in the schema system. Their types can come from:
- explicit type annotations
- expression type inference
- an Unknown fallback when inference is incomplete
Validation stages
BLOGE can validate schema at three different moments.
| Stage | What it checks |
|---|---|
| Graph build time | Compatibility across edges and referenced fields |
| DSL compile time | Path validity, declared schema references, and binding compatibility |
| Runtime | Actual values versus declared required fields and types |
Validation strictness is controlled by SchemaValidationLevel:
- OFF
- WARN
- ERROR
This lets teams start with warnings and tighten enforcement as contracts stabilize.
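One way to picture the three levels is as a small dispatch over a detected problem. The sketch below assumes that OFF ignores the problem, WARN records it, and ERROR fails fast; the report method, the warning sink, and the exception type are hypothetical and not part of the documented API.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: how a strictness level might route one schema problem.
final class ValidationLevelSketch {

    enum SchemaValidationLevel { OFF, WARN, ERROR }

    // Applies the level to a detected problem; warnings go to the supplied sink.
    static void report(SchemaValidationLevel level, String problem, List<String> warnings) {
        switch (level) {
            case OFF -> { }                                          // skip silently
            case WARN -> warnings.add(problem);                      // record and continue
            case ERROR -> throw new IllegalStateException(problem);  // fail fast
        }
    }

    public static void main(String[] args) {
        List<String> warnings = new ArrayList<>();
        report(SchemaValidationLevel.WARN, "fetchUser.output.age is not declared", warnings);
        System.out.println(warnings); // prints [fetchUser.output.age is not declared]
    }
}
```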
Example: inferred Java schema
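public record UserQuery(String userId) {}
public record UserView(String id, String name, String email) {}
public final class FetchUserOperator implements Operator<UserQuery, UserView> {
    // Collaborator assumed for this example; UserService stands in for your own lookup service.
    private final UserService userService;

    public FetchUserOperator(UserService userService) {
        this.userService = userService;
    }

    @Override
    public UserView execute(UserQuery input, OperatorContext ctx) {
        return userService.find(input.userId());
    }
}

Here BLOGE can infer both input and output schema automatically from record components.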
Example: metadata export
The Maven plugin can export operator schema into operator-metadata.json for visual tooling:
{
  "name": "FetchUserOperator",
  "inputSchema": {
    "kind": "structured",
    "fields": [{ "name": "userId", "type": "String", "required": true }]
  },
  "outputSchema": {
    "kind": "structured",
    "fields": [
      { "name": "id", "type": "String", "required": true },
      { "name": "name", "type": "String", "required": true },
      { "name": "email", "type": "String", "required": false }
    ]
  }
}

Studio can then use this data for operator catalogs, field completion, and data-flow visualization.
Advanced schema constraints
The schema layer goes beyond field names and types. These constraints are type-checked at compile time and enforced at runtime by SchemaValidator.
Allowed values (oneOf)
Constrain a field to an enum-like set of literals:
schema OrderState {
  status: String = oneOf("pending", "approved", "rejected")
}

Branch validation reports an error when a case literal falls outside a field's allowed-value set, catching unreachable or misspelled branches before runtime.
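The allowed-value check on branch cases amounts to set membership. The following sketch shows the idea with plain Java collections; the invalidCases helper is hypothetical and only illustrates the rule, not the compiler's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative check: which branch case literals fall outside a oneOf set?
final class OneOfBranchCheckSketch {

    // Returns the case literals that are not in the allowed-value set, in order.
    static List<String> invalidCases(Set<String> allowed, List<String> caseLiterals) {
        List<String> invalid = new ArrayList<>();
        for (String literal : caseLiterals) {
            if (!allowed.contains(literal)) {
                invalid.add(literal);
            }
        }
        return invalid;
    }

    public static void main(String[] args) {
        Set<String> status = Set.of("pending", "approved", "rejected");
        // "aproved" is a misspelling, so its branch could never match at runtime.
        System.out.println(invalidCases(status, List.of("approved", "aproved")));
        // prints [aproved]
    }
}
```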
Union schemas & branch narrowing
A field may declare a union of named member schemas:
schema PaymentResult = PaymentSuccess | PaymentFailure

SchemaValidator accepts a value matching any member and rejects values matching none. When you branch on a union's discriminator (a shared field whose oneOf(...) sets do not overlap across members), each branch arm is narrowed to the matched member, so downstream nodes can reference fields that exist only on that member.
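The discriminator rule can be read as a disjointness test over each member's allowed values. The sketch below assumes each union member contributes one oneOf set for the shared field; isDiscriminator and narrow are hypothetical helper names used only to illustrate the rule.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: disjoint oneOf sets make a shared field a usable discriminator.
final class UnionDiscriminatorSketch {

    // True when no literal appears in more than one member's allowed-value set.
    static boolean isDiscriminator(List<Set<String>> memberAllowedValues) {
        Set<String> seen = new HashSet<>();
        for (Set<String> allowed : memberAllowedValues) {
            for (String value : allowed) {
                if (!seen.add(value)) {
                    return false; // value already claimed by an earlier member
                }
            }
        }
        return true;
    }

    // Index of the member whose set contains the value, or -1 if none matches.
    static int narrow(List<Set<String>> memberAllowedValues, String value) {
        for (int i = 0; i < memberAllowedValues.size(); i++) {
            if (memberAllowedValues.get(i).contains(value)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Member 0 stands for PaymentSuccess, member 1 for PaymentFailure.
        List<Set<String>> members = List.of(Set.of("ok"), Set.of("declined", "error"));
        System.out.println(isDiscriminator(members)); // prints true
        System.out.println(narrow(members, "error")); // prints 1
    }
}
```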
Generic schema retention
CollectionSchema and MapSchema descriptors preserve element / key / value schemas, and FieldDescriptor.genericType retains the user-facing signature (e.g. List<Order>). The introspector, JSON codec, and validator all operate on the full parameterized types rather than collapsing them to raw collections.
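Standard Java reflection already exposes the parameterized types that such retention relies on. The sketch below reads the element type of a List-typed record component; the Order and Cart records and the elementTypeOf helper are hypothetical, and this is not the BLOGE introspector itself.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.RecordComponent;
import java.lang.reflect.Type;
import java.util.List;

// Illustrative only: reading List<Order> instead of collapsing to a raw List.
final class GenericRetentionSketch {

    record Order(String id) {}

    record Cart(List<Order> orders, List rawItems) {}

    // Simple name of the first type argument, or "Object" for raw or non-class arguments.
    static String elementTypeOf(RecordComponent component) {
        Type generic = component.getGenericType();
        if (generic instanceof ParameterizedType p
                && p.getActualTypeArguments()[0] instanceof Class<?> argument) {
            return argument.getSimpleName();
        }
        return "Object";
    }

    public static void main(String[] args) {
        RecordComponent[] components = Cart.class.getRecordComponents();
        System.out.println(elementTypeOf(components[0])); // prints Order
        System.out.println(elementTypeOf(components[1])); // prints Object
    }
}
```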
Safe navigation (?.)
Path expressions support null-propagating navigation in any input binding:
input {
  city = ctx.user?.address?.city
  hint = fetch?.output.result?.value
}

Safe-navigation metadata flows through the AST and codegen, so a null mid-path yields a null result instead of an error.
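The null-propagating behavior can be sketched over nested maps. For brevity this example treats every step as a ?. hop, which is a simplification of the DSL, where safe and strict steps can be mixed; the navigate helper is hypothetical and not the generated code.

```java
import java.util.Map;

// Illustrative only: every hop behaves like "?." in this simplified walker.
final class SafeNavigationSketch {

    // Walks the path segment by segment; a null at any point ends the walk with null.
    static Object navigate(Object root, String... path) {
        Object current = root;
        for (String segment : path) {
            if (current == null) {
                return null; // null propagates instead of raising an error
            }
            if (!(current instanceof Map<?, ?> map)) {
                throw new IllegalArgumentException("Cannot read '" + segment + "' from a non-object value");
            }
            current = map.get(segment);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> ctx = Map.of("user", Map.of("name", "Ada"));
        System.out.println(navigate(ctx, "user", "name"));            // prints Ada
        System.out.println(navigate(ctx, "user", "address", "city")); // prints null
    }
}
```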
Await signal schemas
await nodes can declare per-event signal_schema constraints that are compiled into node metadata and enforced at runtime: publishEvent rejects payloads that violate an event-specific schema before they can corrupt AND-mode correlations, and signal validates the aggregated payload before resuming. Violations raise SignalSchemaViolationException.
Schema evolution
VersionedSchema wraps any SchemaDescriptor with a version string, a deprecated-field set, and a reserved-field-name set. SchemaEvolutionChecker compares consecutive versions and classifies each change:
| Classification | Examples |
|---|---|
| FullyCompatible | Add an optional field |
| BackwardCompatible | Remove an optional field, required → optional, union add member, deprecation |
| BreakingChange | Add/remove a required field, type change, optional → required, union remove member, reserved-name reuse |
At publish time GraphEngine invokes GraphRegistryStore.checkEvolution(...): breaking changes are rejected with GraphDefinitionException, while backward-compatible changes are logged as warnings. This enforces compatibility before a new graph version ever serves traffic.
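The field-level rows of the classification table can be expressed as a small comparison between two versions. This sketch covers only field additions, removals, type changes, and required-ness changes; unions, deprecation, and reserved names are left out. The Field record, the enum constant names, and the classify method are assumptions, not the SchemaEvolutionChecker API.

```java
import java.util.Map;

// Illustrative only: classifies field-level changes between two schema versions.
final class SchemaEvolutionSketch {

    enum Compatibility { FULLY_COMPATIBLE, BACKWARD_COMPATIBLE, BREAKING_CHANGE }

    record Field(String type, boolean required) {}

    static Compatibility classify(Map<String, Field> oldFields, Map<String, Field> newFields) {
        Compatibility result = Compatibility.FULLY_COMPATIBLE;
        for (var entry : oldFields.entrySet()) {
            Field before = entry.getValue();
            Field after = newFields.get(entry.getKey());
            if (after == null) {
                if (before.required()) {
                    return Compatibility.BREAKING_CHANGE;     // removed a required field
                }
                result = Compatibility.BACKWARD_COMPATIBLE;   // removed an optional field
            } else if (!after.type().equals(before.type())) {
                return Compatibility.BREAKING_CHANGE;         // type change
            } else if (!before.required() && after.required()) {
                return Compatibility.BREAKING_CHANGE;         // optional -> required
            } else if (before.required() && !after.required()) {
                result = Compatibility.BACKWARD_COMPATIBLE;   // required -> optional
            }
        }
        for (var entry : newFields.entrySet()) {
            if (!oldFields.containsKey(entry.getKey()) && entry.getValue().required()) {
                return Compatibility.BREAKING_CHANGE;         // added a required field
            }
        }
        return result; // only optional additions (or no changes) leave this fully compatible
    }

    public static void main(String[] args) {
        Map<String, Field> v1 = Map.of("id", new Field("String", true));
        Map<String, Field> v2 = Map.of("id", new Field("String", true),
                                       "nickname", new Field("String", false));
        System.out.println(classify(v1, v2)); // prints FULLY_COMPATIBLE
        System.out.println(classify(v2, v1)); // prints BACKWARD_COMPATIBLE
    }
}
```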
Practical guidance
- Prefer Java records for clean inference whenever possible.
- Use explicit schemas when a graph is authored in DSL and path validation matters.
- Treat OpaqueSchema as a temporary escape hatch, not the default destination.
- Version breaking output changes intentionally instead of mutating contracts invisibly.
- Keep transform outputs typed when they become shared downstream dependencies.
What schema validation is not
Schema validation does not replace domain validation. It answers questions like:
- does fetchUser.output.address.city exist?
- is this field optional or required?
- does the downstream node expect a compatible shape?
It does not decide whether a value is semantically correct for the business domain.
Next steps
- Export schema catalogs in Maven Plugin & Lint
- Visualize them in Bloge Studio
- See data shaping guidance in Design Principles