Generation Pipeline

When you call generator.generate(config), the composer engine runs a fixed pipeline that turns the request into a complete, deterministic composition.

Musical vocabulary in this pipeline

The pipeline treats music as structured data: forms allocate voices, harmony supplies chord targets, candidate search chooses notes, and validation rejects illegal voice interactions. The compact glossary is in Music Primer for Engineers.

The whole pipeline is deterministic: the same config and seed always yield byte-identical output.

Step 1: Compose Request

The raw config is resolved into a compose request. Defaults are filled in, the seed is resolved (a seed of 0 becomes a concrete non-zero value, reported via getInfo().seedUsed), and the request is validated.

Validation is strict: unknown form/character/instrument/scale values and out-of-range bpm are rejected, as are forbidden character/form pairs (chorale_prelude rejects playful/restless; toccata_and_fugue rejects noble). The form fixes the voice count, meter, and natural length; scale/targetBars resolve the final bar count (snapped to the form's granularity and clamped to [min, 128]).
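
The length resolution described above can be sketched as follows. This is a minimal illustration, not the engine's code: the function name resolveBars, its parameters, and the round-to-nearest snapping are assumptions. Only the snap-then-clamp behavior and the upper bound of 128 come from the text, and the sketch further assumes that minBars is itself a multiple of the granularity.

```js
// Hypothetical sketch: snap a requested bar count to the form's granularity,
// then clamp it to [minBars, 128]. Names and rounding mode are assumptions.
const MAX_BARS = 128; // upper bound stated in the text

function resolveBars(targetBars, granularity, minBars) {
  // Snap to the nearest multiple of the granularity (assumed rounding mode).
  const snapped = Math.round(targetBars / granularity) * granularity;
  // Largest multiple of the granularity that does not exceed the cap,
  // so the clamped result still sits on the form's grid.
  const maxSnapped = Math.floor(MAX_BARS / granularity) * granularity;
  return Math.min(Math.max(snapped, minBars), maxSnapped);
}

console.log(resolveBars(30, 8, 16));  // request between grid points
console.log(resolveBars(3, 8, 16));   // request below the minimum
console.log(resolveBars(500, 8, 16)); // request above the cap
```

Clamping against the largest grid-aligned value, rather than against 128 directly, is a design choice of this sketch so that a granularity that does not divide 128 still yields a whole number of units.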

Step 2: Form Director

The form director lays out the piece. For the chosen form it assigns voice intents to bar spans — subjects and answers, ground basses, cantus firmus, figuration, variation material — across the resolved length.

Bar span and material

A bar span is a range of measures. Material is predeclared musical content, such as a fugue subject or repeating ground bass, that the candidate search should respect instead of freely replacing.

The layout follows a design-value arc — establish, develop, climax at roughly 80% of the span, then resolve — that drives the density, register, and velocity tiers used downstream. The arc is fixed by the form, not searched.
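
One way to picture the arc is as a function from bar position to an intensity value that downstream stages map onto tiers. The sketch below is an assumption-laden illustration: the piecewise-linear shape, the 0.3 base level, the tier boundaries, and the names arcIntensity and tierFor are all hypothetical. Only the phase order and the climax near 80% of the span come from the text.

```js
// Hypothetical sketch of a design-value arc: intensity in [0, 1] as a
// function of position in the span. Shape and constants are assumptions.
const CLIMAX_POSITION = 0.8; // climax at roughly 80% of the span (from the text)
const BASE_LEVEL = 0.3;      // assumed starting and ending intensity

function arcIntensity(bar, totalBars) {
  // Normalize: 0 at the first bar, 1 at the last bar.
  const t = totalBars > 1 ? bar / (totalBars - 1) : 0;
  if (t <= CLIMAX_POSITION) {
    // Establish and develop: rise from the base level to the climax.
    return BASE_LEVEL + (1 - BASE_LEVEL) * (t / CLIMAX_POSITION);
  }
  // Resolve: fall back to the base level by the final bar.
  return 1 - (1 - BASE_LEVEL) * ((t - CLIMAX_POSITION) / (1 - CLIMAX_POSITION));
}

// Map intensity onto three coarse tiers (boundaries are illustrative).
function tierFor(intensity) {
  if (intensity < 0.5) return "low";
  if (intensity < 0.8) return "mid";
  return "high";
}
```

Because the function depends only on the bar index and the span length, it is fixed by the layout and involves no search, which matches the behavior described above.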

Step 3: Candidate Search

Against a harmonic plan (chords, modulation, cadences), the candidate search selects notes for each non-fixed voice. Selection is per-beat and chord-tone anchored: at each beat the search prefers pitches that are consonant with the harmony and with the other voices, falling back through ranked alternatives when the first choice violates a constraint. Material assigned by the form director (subjects, grounds, cantus firmus) is fixed and is not re-selected here.

Chords, modulation, cadences

Chords are the vertical targets at each point in time. Modulation means moving the tonal center to another key. A cadence is a phrase ending, usually a dominant-to-tonic arrival such as V to I.
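
The per-beat selection with ranked fallback can be sketched roughly as below. This is not the engine's algorithm: the function names, the ranking order (chord tones first, then proximity to the previous pitch), and the simplified consonance set are all assumptions made for illustration. Pitches are MIDI note numbers.

```js
// Hypothetical sketch of per-beat, chord-tone-anchored selection.
// Simplified consonance set (interval classes mod 12): unison/octave,
// thirds, perfect fifth, sixths. The perfect fourth is treated as dissonant.
const CONSONANT = new Set([0, 3, 4, 7, 8, 9]);

function isConsonant(a, b) {
  return CONSONANT.has(Math.abs(a - b) % 12);
}

// Rank every pitch in [low, high]: chord tones first, then by closeness
// to the previous pitch, with the lower pitch breaking ties deterministically.
function rankCandidates(prevPitch, chordPitchClasses, low, high) {
  const candidates = [];
  for (let p = low; p <= high; p++) candidates.push(p);
  const isChordTone = (p) => chordPitchClasses.includes(p % 12);
  return candidates.sort((a, b) => {
    const chordOrder = Number(isChordTone(b)) - Number(isChordTone(a));
    if (chordOrder !== 0) return chordOrder;
    return Math.abs(a - prevPitch) - Math.abs(b - prevPitch) || a - b;
  });
}

// Take the best-ranked candidate that is consonant with every other voice,
// falling back through the ranking. Returns null if no pitch is legal.
function selectPitch(prevPitch, chordPitchClasses, otherVoices, low, high) {
  const ranked = rankCandidates(prevPitch, chordPitchClasses, low, high);
  for (const pitch of ranked) {
    if (otherVoices.every((other) => isConsonant(pitch, other))) return pitch;
  }
  return null;
}
```

With a C major chord ([0, 4, 7]), a previous pitch of 64, and a range of 60 to 72, a bass on 48 lets the first choice (64) through, while a bass on 53 rejects 64 and 67 and the search falls back to 60.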

Step 4: Rule Validator

The validator checks the assembled voices against counterpoint and structure rules and fails fast on a violation. It reports failures with rule identifiers so the responsible span can be located. Learn the musical ideas in Counterpoint Course, then use the Validator Rule Reference when you need to look up a specific rule ID.

What fail-fast means here

The engine does not keep a list of every possible musical problem in a bad candidate. It stops at the first violated rule for that pass, reports the rule identifier, and lets the search or caller handle the failed candidate.
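
A fail-fast pass can be sketched as an ordered list of rules where the first violation ends the pass. Everything named here is hypothetical: the rule identifiers are invented placeholders (the real IDs are in the Validator Rule Reference), the two checks are simplified, and voices are assumed to be arrays of MIDI pitches with one entry per beat.

```js
// Hypothetical sketch of a fail-fast validation pass over two voices.

// Simplified check: consecutive fifths or octaves/unisons (by interval
// class) where both voices move. Returns the offending beat or -1.
function findParallelPerfect(upper, lower) {
  const beats = Math.min(upper.length, lower.length);
  for (let beat = 1; beat < beats; beat++) {
    const prev = Math.abs(upper[beat - 1] - lower[beat - 1]) % 12;
    const curr = Math.abs(upper[beat] - lower[beat]) % 12;
    const bothMoved =
      upper[beat] !== upper[beat - 1] && lower[beat] !== lower[beat - 1];
    if (bothMoved && prev === curr && (curr === 7 || curr === 0)) return beat;
  }
  return -1;
}

// Simplified check: the upper voice drops below the lower voice.
function findVoiceCrossing(upper, lower) {
  const beats = Math.min(upper.length, lower.length);
  for (let beat = 0; beat < beats; beat++) {
    if (upper[beat] < lower[beat]) return beat;
  }
  return -1;
}

// Placeholder rule IDs, checked in order.
const RULES = [
  { id: "EXAMPLE_PARALLEL_PERFECT", check: findParallelPerfect },
  { id: "EXAMPLE_VOICE_CROSSING", check: findVoiceCrossing },
];

function validate(upper, lower) {
  for (const rule of RULES) {
    const beat = rule.check(upper, lower);
    // Stop at the first violated rule; later rules are never evaluated.
    if (beat >= 0) return { ok: false, ruleId: rule.id, beat };
  }
  return { ok: true };
}
```

A candidate that breaks both rules is reported only under the first one in the list, which is the trade-off of fail-fast: the report locates one problem quickly instead of enumerating all of them.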

Step 5: Renderer

The validated voices are rendered into tracks — one track per voice — with channels, the instrument's General MIDI program, and note timings.

Source tags survive rendering

Rendered notes keep their provenance: "material" for fixed source material, "compose" for notes selected by candidate search, and "ornament" for notes added later. This is useful when debugging why a rule could or could not rewrite a note.

Step 6: Ornament & Expression (post-pass)

Deterministic post-passes decorate the rendered tracks. At the C++ library level the ornament pass is a separate function (applyOrnamentPass) deliberately kept out of Composer::run(); the public generation path bach_generate_from_json — used by the JS API, CLI, and this demo — always invokes it after validation.

  • Ornaments — trills, mordents, and Nachschlag, with density depending on character and instrument. Ground-bass and cantus-firmus lines are never ornamented.
  • Expression — organ registration as a CC#7/#11 curve following the form's energy arc, and closing ritardando tempo events.

Notes added by these passes carry the source: "ornament" provenance tag (versus "material" and "compose").

Ornament

An ornament is a small decorative figure added around a structural note. It is not the source melody itself, so event data marks it separately with source: "ornament".

Step 7: MIDI Writer

The internal representation is composed entirely in the key of C. The MIDI writer is the only place where the requested key is applied: pitches are transposed on the way out, time signatures are written (3/4 for passacaglia and chaconne, 4/4 otherwise), and the result is a Type 1 Standard MIDI File.

```js
const midi = generator.getMidi()       // Uint8Array (transposed to your key)
const events = generator.getEvents()   // Event data (pitches stay in C)
```

TIP

The events JSON from getEvents() reports pitches in C and tags every note with its source. See the JavaScript API for the full type definitions.
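
When debugging, it can help to tally notes by provenance and to shift the C-based pitches for comparison with the MIDI output. The sketch below assumes a simplified note shape of { pitch, source } with pitch as a MIDI number. That shape and both helper names are hypothetical; the actual schema is defined in the JavaScript API, and the engine's own transposition may make different octave choices than a plain semitone shift.

```js
// Hypothetical sketch: a note is assumed to be { pitch, source }.

// Count notes per provenance tag named in the text.
function countBySource(notes) {
  const counts = { material: 0, compose: 0, ornament: 0 };
  for (const note of notes) {
    if (note.source in counts) counts[note.source] += 1;
  }
  return counts;
}

// Shift C-based pitches by a semitone offset (for example 2 for D).
// Returns new objects and leaves the input untouched.
function transposeNotes(notes, semitones) {
  return notes.map((note) => ({ ...note, pitch: note.pitch + semitones }));
}

// Illustrative data, not real engine output.
const sampleNotes = [
  { pitch: 60, source: "material" },
  { pitch: 64, source: "compose" },
  { pitch: 65, source: "ornament" },
  { pitch: 67, source: "compose" },
];
```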

Beyond the pipeline: how quality is measured

The seven steps above are everything that happens at runtime. The pipeline never searches for a "more Bach-like" candidate while generating — the layout is fixed by design values, and the validator only rejects illegal results. So how do we know those design values actually produce Bach-like output?

The answer is development-side quality gates. Whenever a form or figuration in the engine changes, the generated output is run through two families of automated checks. Both ship as public Python tooling in the engine repository (the texture-gate and closure commands of bach_tools.py).

The corpus statistics model

Five distributions are extracted from a generated piece's event data and scored by their distance (negative log-likelihood) from reference distributions estimated over a corpus of real works. A distance beyond a calibrated threshold fails the gate.

| Feature | What it captures |
| --- | --- |
| Pitch class | How often each note name is used |
| Melodic interval | The distribution of distances between adjacent notes within a voice |
| Duration | The distribution of note lengths |
| Beat position | Where in the bar notes are attacked |
| Vertical interval class | The distribution of intervals between simultaneously sounding notes |

The most heavily weighted of these is the melodic interval. In Bach's actual writing the dominant melodic interval is overwhelmingly the step (a move of one or two semitones to a neighboring scale tone — see the primer), so a leap-heavy line widens this distance immediately.
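
To make the scoring concrete, here is a minimal sketch of one feature: melodic intervals of a single voice scored by average negative log-likelihood against a reference distribution. The real tooling is Python (bach_tools.py) and is not reproduced here. The function names, the probability floor, and the reference values below are invented for illustration and are not corpus statistics.

```js
// Hypothetical sketch of one feature score (melodic interval).

// Absolute semitone distances between adjacent notes of a voice.
function melodicIntervals(pitches) {
  const intervals = [];
  for (let i = 1; i < pitches.length; i++) {
    intervals.push(Math.abs(pitches[i] - pitches[i - 1]));
  }
  return intervals;
}

// Average negative log-likelihood of the observations under the reference.
// Values missing from the reference get a small floor probability.
function averageNll(observations, reference, floor = 1e-6) {
  if (observations.length === 0) return 0;
  let total = 0;
  for (const value of observations) {
    const p = reference[value] ?? floor;
    total += -Math.log(p);
  }
  return total / observations.length;
}

// Made-up reference distribution, dominated by steps (1 and 2 semitones).
const REFERENCE = { 0: 0.05, 1: 0.3, 2: 0.4, 3: 0.1, 4: 0.07, 5: 0.05, 7: 0.03 };

const stepwiseLine = [60, 62, 64, 65, 67, 65, 64, 62];
const leapyLine = [60, 67, 60, 72, 64, 76, 62, 74];

const stepwiseScore = averageNll(melodicIntervals(stepwiseLine), REFERENCE);
const leapyScore = averageNll(melodicIntervals(leapyLine), REFERENCE);
```

Under this toy reference the stepwise line scores lower (closer to the reference) than the leap-heavy line, which is the effect the paragraph above describes.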

The tension between horizontal steps and vertical consonance

This is where the core tension of figuration design lives.

  • Horizontally: a line matches the corpus better the more stepwise it is — scalar runs, wave figures.
  • Vertically: beat onsets sound more consonant against a sustained bass the more closely they are anchored to the bar's chord tones (see also Leaps need recovery).

Implemented naively, the two collide: a freely running scale line steps onto non-chord tones at beat onsets, while jumping every beat to the nearest chord tone produces a leap-riddled line. The engine's figures (scalar wave, sawtooth, broken chord) are written to keep both — landing on chord tones that lie ahead in the running direction, and choosing oscillation partners from bass-consonant neighbor tones before resorting to a leap.
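
The idea of landing on chord tones that lie ahead in the running direction can be sketched as follows. This is a guess at the general technique, not the engine's figuration code: both function names are hypothetical, the scale is hard-coded to C major, and the sketch assumes the target chord tone belongs to that scale.

```js
// Hypothetical sketch: find the nearest chord tone ahead of the current
// pitch in the running direction (+1 up, -1 down), then approach it by step.
const C_MAJOR = [0, 2, 4, 5, 7, 9, 11];

const pitchClass = (p) => ((p % 12) + 12) % 12;

function nextChordToneAhead(current, direction, chordPitchClasses, maxDistance = 12) {
  for (let d = 1; d <= maxDistance; d++) {
    const candidate = current + direction * d;
    if (chordPitchClasses.includes(pitchClass(candidate))) return candidate;
  }
  return null; // no chord tone within reach
}

// Scale tones strictly after `from`, up to and including `to`.
function scalarRun(from, to) {
  const direction = to > from ? 1 : -1;
  const run = [];
  for (let p = from + direction; p !== to + direction; p += direction) {
    if (C_MAJOR.includes(pitchClass(p))) run.push(p);
  }
  return run;
}
```

Starting on 64 and running upward over a C major chord, the next target is 67, and the run 65, 67 reaches it by step, so the beat onset lands on a chord tone without a leap.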

Texture gates

Alongside the statistics model, a seed × form sweep checks structural metrics directly — per-voice activity, silence ratios, repeated-note run lengths, parallel perfect intervals, and the model score threshold above. If a single case fails, the change does not merge.
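
Two of the structural metrics named above can be sketched like this. The real gates live in the Python tooling; the function names, the thresholds, and the voice representation (an array of per-beat MIDI pitches with null for silence) are assumptions for illustration only.

```js
// Hypothetical sketch of two texture metrics over a single voice.

// Fraction of beats on which the voice is silent.
function silenceRatio(voice) {
  if (voice.length === 0) return 1;
  return voice.filter((p) => p === null).length / voice.length;
}

// Length of the longest run of the same pitch on consecutive beats.
// A silent beat ends the current run.
function longestRepeatedRun(voice) {
  let longest = 0;
  let run = 0;
  let prev = null;
  for (const pitch of voice) {
    if (pitch === null) run = 0;
    else if (pitch === prev) run += 1;
    else run = 1;
    longest = Math.max(longest, run);
    prev = pitch;
  }
  return longest;
}

// The gate passes only if every case in the sweep passes.
// Threshold values are illustrative, not the calibrated ones.
function gatePasses(cases, limits = { maxSilence: 0.5, maxRepeat: 4 }) {
  return cases.every(
    (voice) =>
      silenceRatio(voice) <= limits.maxSilence &&
      longestRepeatedRun(voice) <= limits.maxRepeat
  );
}
```

Using every() mirrors the all-or-nothing policy stated above: one failing case is enough to fail the whole sweep.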

These are not part of generate()

Neither the corpus model nor the texture gates run at runtime. Generation is always a deterministic single shot; quality is secured at development time, in the design values that generation relies on.

Dual-licensed: AGPL-3.0 · commercial licensing available. Generated MIDI is yours to use freely.