MIDI to Musical Notation: Experiments with Claude
It was entertaining to watch Claude attempt to implement, on its own, a program to convert MIDI into musical notation. This is an incredibly complex task - there are entire academic papers on MIDI quantisation alone.
The best approach is undoubtedly to use an open source package (e.g. MuseScore). Claude Code attempted to convert the MIDI file into ABC notation and then render that as musical notation. It did attempt quantisation, which was kinda cool.
I went along with this just as an experiment - a bit of fun and a little learning - but obviously the approach was flawed from the start.
The Journey: From Ambitious to Humbled
The session started with Prototype #6: Context-Aware Critique, a music composition analysis tool built on the not_finale project (branch: prototype-6-wip). The initial goal was straightforward: add MIDI file upload support to complement the existing ABC notation and MusicXML input methods.
What followed was an increasingly complex spiral into the depths of music information retrieval - a domain where decades of research papers exist for good reason.
Phase 1: MIDI File Upload (The Easy Part)
The initial implementation went smoothly:
Added to index.html:
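The exact markup isn't preserved in this write-up; it amounted to a new upload control plus the parsing libraries, roughly like this (element IDs and comments are illustrative, not the session's exact code):

```html
<!-- Illustrative sketch of the index.html additions, not the exact markup -->
<!-- New tab/panel for MIDI upload alongside the ABC and MusicXML inputs -->
<input type="file" id="midiFileInput" accept=".mid,.midi" />
<div id="midiInfo"><!-- parsed track/tempo summary rendered here --></div>
<!-- plus <script> tags pulling in @tonejs/midi and JSZip -->
```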
MIDI parsing with Tone.js:
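A sketch of the parsing step using the @tonejs/midi browser API (function and variable names are illustrative, not the session's exact code):

```javascript
// Sketch of the parsing step. @tonejs/midi exposes a Midi class that takes a raw ArrayBuffer.
async function handleMidiFile(file) {
  const arrayBuffer = await file.arrayBuffer();
  const midi = new Midi(arrayBuffer);            // parse the binary MIDI data

  console.log('Tracks:', midi.tracks.length);
  console.log('Tempos:', midi.header.tempos);
  console.log('Time signatures:', midi.header.timeSignatures);

  // Each track exposes its notes with name, MIDI number, start time and duration (seconds)
  midi.tracks.forEach((track, i) => {
    console.log(`Track ${i}: ${track.notes.length} notes`, track.instrument.name);
  });
  return midi;
}
```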
Libraries added:
- @tonejs/midi for MIDI parsing
- JSZip for compressed MusicXML files (.mxl)
Phase 2: The Conversion Attempt (Where Things Got Interesting)
The plan was simple: convert MIDI → ABC notation → render with ABC.js. What could go wrong?
Everything.
Challenge 1: Time Signature Detection
The issue: Tone.js stores time signatures as { timeSignature: [4, 4] }, not as separate numerator/denominator fields.
Fix:
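Roughly (a sketch, assuming the file declares at least one time signature):

```javascript
// Sketch of the fix: read the [numerator, denominator] pair that @tonejs/midi actually stores
const ts = midi.header.timeSignatures[0];        // e.g. { ticks: 0, timeSignature: [4, 4] }
const [numerator, denominator] = ts ? ts.timeSignature : [4, 4];  // fall back to 4/4
const meter = `${numerator}/${denominator}`;     // "4/4", used for the ABC "M:" field
```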
Challenge 2: Track Selection
MIDI files often have multiple tracks (bass, melody, percussion, etc.). Which one to display?
Smart track selection algorithm:
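In essence: skip percussion and empty tracks, then take the busiest remaining track. A sketch of that heuristic (not the exact code):

```javascript
// Sketch of the track-selection heuristic: prefer the busiest pitched track,
// skip drum tracks (General MIDI channel 10 is channel index 9 here).
function pickDisplayTrack(midi) {
  const candidates = midi.tracks.filter(
    (t) => t.notes.length > 0 && t.channel !== 9
  );
  if (candidates.length === 0) return midi.tracks[0];

  // Score by note count; ties broken implicitly by track order.
  return candidates.reduce((best, t) =>
    t.notes.length > best.notes.length ? t : best
  );
}
```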
Challenge 3: Octave Transposition
Some MIDI files have tracks in extreme registers. Solution: automatically transpose to middle C range:
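A sketch of the idea - shift by whole octaves so the track's average pitch lands near MIDI 60 (middle C):

```javascript
// Sketch: shift a track by whole octaves so its average pitch lands near middle C (MIDI 60)
function octaveShift(notes) {
  const avg = notes.reduce((sum, n) => sum + n.midi, 0) / notes.length;
  const shift = Math.round((60 - avg) / 12) * 12;  // whole octaves only, so key and spelling survive
  return notes.map((n) => ({ ...n, midi: n.midi + shift }));
}
```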
Challenge 4: Quantization (The Real Problem)
Raw MIDI timing is continuous. Musical notation is discrete. Enter: quantization.
First attempt - time-based:
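Roughly this shape - hard-coded second thresholds mapped straight to ABC length suffixes (a sketch, not the exact thresholds used):

```javascript
// Sketch of the first, flawed attempt: map a duration in seconds straight to an
// ABC length suffix with fixed thresholds (no tempo awareness at all).
// Assumes an ABC unit note length of a quarter note (L:1/4).
function durationToAbc(seconds) {
  if (seconds < 0.15) return '/4';   // treat as a sixteenth
  if (seconds < 0.3)  return '/2';   // eighth
  if (seconds < 0.6)  return '';     // quarter (the unit length)
  if (seconds < 1.2)  return '2';    // half
  return '4';                        // whole
}
```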
Problems:
- No tempo awareness
- Arbitrary thresholds
- Durations don’t sum correctly to fill measures
Second attempt - quantize to grid:
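A sketch of the grid approach - convert seconds to beats via the tempo and snap to the nearest sixteenth (names and the grid resolution are illustrative):

```javascript
// Sketch of the grid-based attempt: snap note start and length to the nearest
// sixteenth of a beat, using the file's tempo to convert seconds to beats.
function quantizeToGrid(note, bpm, division = 4) {   // division 4 => sixteenth-note grid
  const secondsPerBeat = 60 / bpm;
  const grid = secondsPerBeat / division;

  const startBeats = Math.round(note.time / grid) / division;
  const lengthBeats = Math.max(1, Math.round(note.duration / grid)) / division;

  return { startBeats, lengthBeats };   // still no guarantee a measure sums to 4 beats
}
```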
Problem: Still didn’t guarantee measures add up to exactly 4 beats.
Challenge 5: Measure Bar Lines (The Final Boss)
ABC.js won’t render bar lines unless measures are mathematically exact.
Console output:
Measure 1: g z d/2g/2 z | (3.50 units, should be 4)
Measure 2: d/2z/2g/4 z/2d/2 g/2b/2 d' | (2.25 units, should be 4)
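For contrast, ABC.js renders bar lines without complaint when the note lengths in each measure sum exactly to the declared meter - a tiny hand-written example, not output from the converter:

```
X:1
M:4/4
L:1/4
K:C
G A B c | d e f g |]
```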
Attempts made:
- Beat-by-beat padding - track each beat’s duration and pad to 1 unit
- Measure-level padding - add rests at the end
- Unit-based arithmetic - convert to ABC duration units instead of seconds
The complexity spiral:
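The padding logic ended up looking something like this (a reconstruction of the shape of the approach, not the session's exact code):

```javascript
// Reconstruction of the beat-by-beat padding approach: track how many ABC duration
// units each beat has consumed, pad shortfalls with rests, then check the measure.
function padAndCheckMeasure(beats) {          // beats: [{ tokens: ['g'], units: 0.5 }, ...]
  let measureUnits = 0;

  beats.forEach((beat, i) => {
    const shortfall = 1 - beat.units;         // each beat should total exactly 1.00 units
    if (Math.abs(shortfall - 0.5) < 0.01) {
      beat.tokens.push('z/2');                // pad with an eighth rest
      beat.units += 0.5;
    } else if (Math.abs(shortfall - 0.25) < 0.01) {
      beat.tokens.push('z/4');                // pad with a sixteenth rest
      beat.units += 0.25;
    }                                         // other shortfalls fell through the cracks

    console.log(`Beat ${i + 1}: ${beat.units.toFixed(2)} units`);
    measureUnits += beat.units;
  });

  // In the session, the per-beat and per-measure totals still disagreed (see below).
  console.log(`Measure: ${measureUnits.toFixed(2)} units (should be 4)`);
  return beats.flatMap((b) => b.tokens).join(' ');
}
```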
Result: Beat 1 = 1.00 units, Beat 3 = 1.00 units, but measure still = 3.50 units. 🤔
The Bugs We Chased
- Compressed MusicXML files - .mxl files are ZIP archives, needed JSZip decompression
- Wrong file handler called - MIDI files being processed by the MusicXML handler due to missing handler distinctions
- Extreme ledger lines - Notes rendering way above/below staff (fixed with octave transposition)
- Missing bar lines - ABC.js silently refuses to render bars when measures don’t sum correctly
- Rhythm fragmentation - Too many /4 and /2 modifiers creating visual noise
What We Learned
The Hard Way:
Music notation is a solved problem in computer music, with sophisticated algorithms developed over decades:
- Onset detection - identifying when notes actually start
- Beat tracking - finding the underlying pulse
- Meter inference - determining time signature from audio/MIDI
- Voice separation - splitting polyphonic music into individual parts
- Quantization - mapping continuous time to discrete rhythmic values
- Notation rendering - proper beaming, grouping, and engraving rules
Papers like “MIDI Quantisation: Integrating Tempo, Meter and Rhythm” exist because this is legitimately difficult research.
The Right Approach:
Use existing tools:
- MuseScore - open source notation software with excellent MIDI import
- music21 (Python) - comprehensive music analysis toolkit
- LilyPond - text-based notation with MIDI support
Modified Files
prototypes/06-context-aware-critique/
├── frontend/
│ ├── index.html # Added MIDI tab, JSZip, Tone.js
│ ├── app.js # 400+ lines of MIDI conversion attempts
│ └── style.css # MIDI info display styles
├── backend/
│ ├── analyzer.py # (unchanged in this session)
│ └── main.py # (unchanged in this session)
└── sample-mozart.mid # Test file copied for debugging
Tech Stack
- Frontend: Vanilla JavaScript, HTML, CSS
- Libraries:
- ABC.js (notation rendering)
- Tone.js Midi (MIDI parsing)
- JSZip (compressed file handling)
- Backend: Python/FastAPI (existing, not modified this session)
- Repository: not_finale (private)
Claude’s Perspective
Note: These observations are verbatim as generated and were not edited by a human.
This session was a fascinating journey through increasing complexity - and a valuable lesson in knowing when to stop digging.
What Went Well
The incremental debugging approach was effective. Each issue was isolated, logged, and fixed systematically:
- Started with file upload infrastructure
- Added MIDI parsing
- Attempted basic conversion
- Added track selection heuristics
- Implemented octave transposition
- Tried multiple quantization approaches
The debugging tools (console logging, unit tracking, beat-by-beat analysis) provided good visibility into what was failing. When measure 1 showed “3.50 units”, we knew exactly where to focus.
The Warning Signs
Several red flags appeared early that, in retrospect, should have triggered a different approach:
Academic papers on the topic - When the first Google result is a research paper from the Turing Institute, that’s a hint this isn’t a weekend project.
Quantization complexity - The fact that we needed to consider tempo, meter, note onset detection, and beat tracking simultaneously suggested this was beyond a simple conversion function.
Off-by-one errors in measure arithmetic - When simple addition (Beat 1: 1.00 + Beat 2: 1.00 + Beat 3: 1.00 + Beat 4: 1.00 ≠ 4.00) fails, the underlying model is wrong.
The Fundamental Issue
The approach of MIDI → ABC → visual notation has a critical flaw: ABC notation assumes human-readable input with explicit rhythmic values. MIDI data is continuous-time performance data that captures what was played, not how it should be notated.
Consider this:
- A human playing eighth notes at 120 BPM doesn’t produce perfectly uniform 0.25-second durations
- Rubato, swing, and expressive timing are everywhere in MIDI
- Musical notation is prescriptive (how to play), MIDI is descriptive (what was played)
The conversion requires music information retrieval techniques:
- Statistical analysis of inter-onset intervals
- Bayesian inference for meter detection
- Template matching for common rhythmic patterns
- Hierarchical beat tracking
What I Would Do Differently
Option 1: Use the MuseScore API - Clone MuseScore, use its MIDI import, export to MusicXML, then render with existing tools. This leverages decades of engineering.
Option 2: Python music21
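Those two lines would be something like this, using music21's converter module (filenames are illustrative):

```python
# Sketch using music21's converter API; input and output filenames are illustrative.
from music21 import converter
converter.parse('sample-mozart.mid').write('musicxml', fp='score.musicxml')
```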
Done. In 2 lines.
Option 3: Constrain the problem - Instead of general MIDI conversion, support only:
- Single voice melodies
- Fixed tempo (no rubato)
- Pre-quantized input (from a DAW with snap-to-grid)
- Limit to common time signatures (4/4, 3/4)
This turns the impossible into merely difficult.
The Broader Lesson
This session exemplifies a common pattern in AI-assisted development: the AI will enthusiastically try to solve any problem, even when it shouldn’t.
I should have said much earlier: “This is a research-level problem. Here are three better approaches.” Instead, I kept iterating on a fundamentally flawed strategy because each individual bug was fixable.
The user’s final comment was perfect: “Let’s abandon this approach completely… The best approach is undoubtedly to use an open source package (e.g. musescore).”
Sometimes the best code is the code you don’t write.
What Made This Valuable
Despite being a “failed” experiment, this session had clear value:
- Learning by doing - Understanding why MIDI conversion is hard is more valuable than having working code
- Domain appreciation - Gained respect for music information retrieval as a field
- Debug skills - The systematic approach to tracking units, logging beat durations, and isolating failures was sound
- Prototype iteration - The MIDI file upload infrastructure is solid and can be reused with a proper conversion backend
The conversation could have been:
- User: “Add MIDI support”
- Me: “Use MuseScore API”
- User: “OK done”
But instead we explored the problem space, hit real walls, and learned exactly where the complexity lies. That has long-term value.
Built with Claude Code and a healthy dose of hubris about music information retrieval complexity