web synth docs

2021-03-20 - new [midi-editor] design

Going to try out the daily note feature of this thing. I've usually done written notes for this kind of thing in the past, but my notebook is almost full and I do not have a new one right now.

The [fm-synth] demo is more or less done. The main thing left to do is to write blog posts to interest the programmer crowd to prepare for broader posts on reddit/ycombinator. That is something that I will address separately.

Right now, my focus is moving up the chain of music production. The [fm-synth] is capable of producing some actually good sounds; it has very rough edges and lacks a lot of nuance and advanced capabilities, but I really feel that it's able to function as a strong instrument. I want to move from making sounds to making demos or clips; riffs, maybe. Combining multiple layers of sound together in a controllable way.

The first step towards achieving that is to re-build the MIDI editor. The existing one is a total loss. It is the original piece of this project that I built, and it is pretty much unusable. I am going to build the new one in PIXI.js without doingany of the manual SVG madness.

high level design

From the start, here are some features that this thing needs to have:

variable granularity note locking and grid rendering. Should be a very simple interface to the top level MIDI editor to set the note lock granularity and have everything re-render just like that
good zoom support via scroll wheel. Zoom should be there from the start; no hard-coded pixel lengths for anything.
note dragging to move AND to resize. We will of course need to do intelligent note "collision detection" for both of these things, which will be handled by:
data structure to handle note positions and mutations. We can write this in Rust/Wasm if we want to and expose a simple API; might be nice to make use of Rust data structures for this since I think it's going to involve trees (because fuck skip lists). Will go into more on this later.
support multiple note selection and multi-note operations from the start, I think. I was thinking about doing something like submitting a batch of actions to the engine to execute and then getting back a list of state transitions or something like that, but I think I want to keep the border between the Rust and the JS as thin as possible and do more of the logic in the JS than in the Rust. The Rust can solely be responsible for managing the data structures and applying operations to them

I think we can focus on the UI and implementation before worrying about integration with the [patch-network] MIDI nodes or anything like that, and I feel like that will not be a huge problem once we get there anyway.

We've already solved this problem once, and a lot of that will transfer over. One of the mistakes that we made last time was trying to generalize over the grid that we used to back this for the [track-compositor] as well. That was doomed to fail; there aren't enough things similar between them to make that effort even remotely worth it. The [track-compositor] will be a standaone thing that we will handle later as a totally separate project. If there is some code we can share, it will be shared at a much more granular level.

Perhaps it would be a good idea to make the backing data structures generic enough to be re-used there. I think that is a functionality that will actually be useful. Just the collision detection and mutation operations (add, delete, move, resize) should be good. Yeah, I really like that idea.

"backend" data structures design

As mentioned before, the main purpose of the backend will be to manage the positions of the notes (perhaps we should come up with a better name to handle the fact that they're generic) and handle all of the mutations. The API will be entirely imperative; there will be no callbacks or actions originating from the backend. It will simply be created, modified, queried, and deleted from the JS MIDI editor code.

The data structures are probably going to be some kind of tree, the BTreeMap from Rusts's standard library will probably do the trick just fine. I'm not interested in dealing with the whole custom skip list thing from before. Performance is likely going to be a non-factor for this and we should focus on making the API and implementation as clean + simple as possible over everything else.

The main operations that we need to support:

creation
adding notes
deleting notes
moving notes horizontally
resizing notes horizontally

I honestly think that that's enough. Note that our MIDI editor is going to consist of an array of these trees for each of the different note levels that we support. Moving notes vertically between levels can be implemented using removal and insertion operations. If we want to add utility methods to support those higher-level operations we can do so, but yeah I think that's all we need.

Notes should have associated unique IDs, but no metadata should be stored in the backend. The backend should focus entirely on the position aspect and not deal with the higher-level things at all so that it can remain generic and be re-used for the [track-compositor] and any other use cases we have.

OK - I like this so far. The very simple backend is a big improvement over the convoluted coupled mess from the previous implementation. Using PIXI.js for the UI will be a massive improvement as well; being able to use its hierarchical rendering system rather than doing horrific offset calculations everywhere.

handling beats to pixels conversion for zooming

Oh, one last thing I wanted to touch on is beats to pixel calculation. The old MIDI editor suffered badly from really messy conversions between beats and pixels everywhere. We're going to have a single beats to pixels constant that represents the current zoom value which will be stored in the top-level MIDI editor state.

I have been trying to think about the best way to handle zooming. As far as I can think, zooming is the only thing that is going to require re-rendering everything. I was trying to think about if there was some kind of observable thing in pixi that we could use to automatically re-render and update all of the points just by updating a single variable. However, a (very) cursory look seems to make it seem that's not the case. We may have to expose re-render functions along the PIXI component hierarchy to handle changes to zoom or other "global" parameters that require re-rendering things.

--Actually, looking at this further there is a scale param on PIXI sprites and graphics which might just do the trick. Will have to look in a bit closer to see if this does stuff like make the border wider or cause other funky issues like that, in which case we can simply fall back to manually triggering re-renders or updates. As long as we are aware of this requirement from the start, it should be no problem.

OK, after a bit more thinking I really should write those blog posts for the [fm-synth] before I go ahead and start a whole new massive multi-week (let's be real - it's going to be multi-month) project in the form of the MIDI editor v2.

SO, what blog posts do we want to write?

There needs to be one initial post which will be the main one going over the [fm-synth] at a high level, explaining the largest pieces of its architecture and how they fit together, and including lots of images as well. What are those high-level pieces that want to include?

What [fm-synth] is, the basics of how it's implemented via phase modulation. I want to avoid including much code here, if any. Tell them what operators are, what carrier vs. modulator is, what a modulation index is. Maybe include the modulation index formula.
Polyphonic voice manager
UI with react-[control-panel]
Visualizers (spectrogram and oscilloscope)
- can go into a small explanation of time domain vs. frequency domain visualization
Envelope generators
Effects
Plumbing with AudioWorkletProcessor, Wasm, message passing, gating/ungating, param sources, etc.
SIMD
Filter
MIDI device support

OK wow that's a big list; it helps me realize/understand just how large of an undertaking this thing was. We obviously can't/shouldn't talk about all of these things in just the one blog post. I think one or two of those things would be good standalone blog posts; I still want to do that "what I wish I learned about digital filters" article at some point regardless.

Envelope generators is something I definitely want to go over. Besides being an important concept for synthesis and the dynamic capabilities of the synth, the UI aspect is engaging and involved in and of itself.

The visualizers would similarly be very easy to write about and I think the would be good to at least mention. I can't remember if I wrote an article about the spectrum viz already; I think I did.

SIMD is a MUST; it's going to be a big drawing point for people to care about this thing.

Should include info about why the SIMD was used and what benefit it provides
Tell about how the Wasm feature detection is done and how we build two versions with and without SIMD
Can maybe pull in some info about the Wasm tooling that I use (wasm2wat, wasm-opt, etc.) here as well

Plumbing can be included in there somewhere as well, tying in WebAudio, Wasm, Rust, and the param source stuff together. We can mention the UI in there as well maybe, but I don't want to put too much focus on it.

OK, so here's apparently what we're settling on for structure of the main article:

High-level overview of [fm-synthesis], explaining our implementation and some of the terminology used in the context of the [fm-synth]
Go into the technical implementation further, talking about the whole Wasm SIMD thing, WebAudio plumbing, and other code-level stuff. Maybe mention some of the browser bugs we ran into that made things difficult (firefox's built-in exponential ramper being completely busted, Linux's buffer size being fucked and broken and how we debugged it)
Talk about the envelope generates and how they are integrated with the rest of the synth. We should def. mention how we use SharedArrayBuffer here.
To close it off, we can go over the visualizations (briefly) and touch on frequency domain vs. time domain

I like that. I really would like to do that filter writeup at some point, but I don't think that's a blocker to the release and it's a huge undertaking so I don't want to focus on it right now.

It will be very important to include good images and visualizations; I'm convinced that's vital to getting people to care about blog posts. I think it would be good to include some small videos with actual sound demos.

OH, and very important is that we need to have a demo/tutorial video to go along with all of this; it should be featured at the top of the blog post.

I think this is enough to go off of, cool. Let's get started.

2021-03-20 - new [[midi-editor]] design