Chandan Singh — Blog

Building imgplex: part 7

Sat, 04 Jul 2026 00:00:00 GMT

This is a series of posts on building imgplex, best read in order:
Part 1 - The why, what, and how of imgplex
Part 2 - Getting things up and running
Part 3 - The node definition system
Part 4 - Executing the node graph, making it fast
Part 5 - Two graphs in one
Part 6 - Multiple inputs and outputs, processing images as sets
Part 7 - The small, measured optimizations beneath the big ones

The layer beneath

Back in Part 4 I covered the big structural performance decisions - command fusion, parallel workers, the fast-path/slow-path split. Those are the load-bearing ones, the choices that decide whether a 2000-image batch takes seconds or minutes. But underneath them sits a second layer: a pile of small, individually unglamorous optimizations that each shave a little off, and compound into a lot.

The thing worth saying about this layer up front is that none of it was guessed. In game dev you don’t optimize a frame you haven’t captured in a profiler - intuition about what’s slow is often wrong enough that acting on it blind is how you spend a day speeding up something that was never the bottleneck. Same rule here. Every trick below came from actually measuring where the time went, which is why the back half of this post is about the measuring, not the tricks.

MIFF: stop compressing files you’re about to delete

I touched on this in Part 4, but it’s the cleanest example of the mindset so it’s worth restating. When a processing chain has to break and write an intermediate file - because the image branches, or the format changes mid-graph - that file exists for a few milliseconds before the next stage reads it back and it gets thrown away.

Writing that throwaway file as a PNG means paying to compress it on the way out and decompress it on the way in. For a file nobody will ever look at, that compression is pure waste. Intermediates are now written as MIFF, ImageMagick’s own uncompressed native format, which skips the encode/decode entirely. Only the final output - the thing the user actually keeps - gets encoded to the real target format.

It feels backwards the first time: you’re deliberately writing bigger files to go faster. But it’s the same logic as an intermediate render target in a render pipeline. You don’t compress a buffer you’re going to read back next stage; the compression costs more time than the extra bytes ever will.
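A rough sketch of that rule, not lifted from the imgplex source: the helper name `stagePath`, the `Stage` shape, and the temp-file naming are all made up for illustration. The only real behaviour it leans on is that ImageMagick picks the output format from the file extension.

```typescript
// Hypothetical description of one step in a processing chain.
type Stage = { index: number; isFinal: boolean };

// Decide where a stage writes its result.
// Intermediates use ImageMagick's native .miff; only the final stage
// is encoded to the format the user asked for.
function stagePath(
  baseName: string,
  stage: Stage,
  tempDir: string,
  outDir: string,
  targetExt: string,
): string {
  if (stage.isFinal) {
    // The file the user keeps: pay for the real encode once, here.
    return outDir + "/" + baseName + "." + targetExt;
  }
  // Throwaway file: read back by the next stage, then deleted.
  return tempDir + "/" + baseName + ".stage" + stage.index + ".miff";
}
```

With this shape, a three-stage chain writes two `.miff` scratch files and exactly one encoded output.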

WebP thumbnails: pay a cheap decode to save memory

The filmstrip can hold thousands of thumbnails at once, and every one of them sits in memory for as long as the image is loaded in. Generated as PNG, that adds up fast. Thumbnails are now WebP instead, which is dramatically smaller on disk and in memory for the same visual quality, at the cost of a slightly more expensive decode - a trade that’s obviously worth it when you’re holding thousands of them.

The thumbnail resolution is also configurable per workflow, from the Input node’s inspector - 256px by default. Dense filmstrip on a big monitor and want more detail? Turn it up. Working with an enormous folder and want imports snappy and memory low? Turn it down. It’s the same knob as picking a texture resolution - the right answer depends on the budget you’re working against, so it’s exposed rather than hardcoded. That same thumbnail is what the preview pipeline reuses as its input, so the number you pick here sets preview latency too.

JPEG DCT hints: don’t decode pixels you’re going to throw away

This is my favorite one because it’s genuinely clever and it’s entirely ImageMagick doing the work - you just have to ask.

Making a 256px thumbnail from a 6000px JPEG, the naive path fully decodes all six thousand pixels of width, then throws away 95% of them in the downscale. But JPEG’s compression is built on the DCT, and libjpeg’s decoder can run that step at a reduced internal scale - 1/2, 1/4, 1/8 - decoding straight to a smaller image without ever reconstructing the full-resolution one. Passing -define jpeg:size=NxN tells it the target size so it picks the coarsest scale that still covers what you need.

For a big JPEG headed for a small thumbnail, that cuts the decode cost substantially, because you’re skipping most of the decompression rather than doing it and discarding the result. It’s mip levels, basically - you don’t sample the full-resolution mip to shade a distant object, and you don’t fully decode a JPEG to make a thumbnail of it.
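As a sketch of what the thumbnail spawn's arguments could look like: `-define jpeg:size=` and `-thumbnail` are real ImageMagick options, but the helper name `thumbnailArgs` and the exact argument order imgplex uses are assumptions on my part.

```typescript
// Hypothetical helper: build the argument list for one thumbnail spawn.
function thumbnailArgs(input: string, output: string, sizePx: number): string[] {
  const box = sizePx + "x" + sizePx;
  return [
    // The hint has to appear before the input file so it is in effect
    // while the JPEG is being read.
    "-define", "jpeg:size=" + box,
    input,
    // Final resize into the requested bounding box (aspect ratio kept).
    "-thumbnail", box,
    output,
  ];
}
```

The array would be handed to a `magick` spawn as-is. For non-JPEG inputs the define is simply irrelevant, so the same argument list can be reused.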

Batched channel means: draw-call batching for metadata

Some nodes don’t process an image so much as measure it - the mean value of the red channel, say, or all four channel means to check whether an alpha channel carries real data. The obvious implementation reads each channel with its own magick identify call: four spawns to answer one question, and spawns are the expensive thing.

Those reads are now combined. A single magick identify with a compound format string pulls all the channel means back at once - up to four spawns collapsed into one. If you’ve ever batched draw calls, this is exactly that move: the per-call overhead dwarfs the work inside the call, so you stop making N calls and make one that does N things.

There’s a sharper version of the same idea for a specific case. When a channel-split node’s outputs feed only mean-value analysis - nobody’s looking at the actual split channels, they’re just being measured - the four channel image files never get written at all. The means are gathered directly in a single spawn, and the temp file I/O for images no one will ever see is skipped entirely. The cheapest work is the work you prove you never have to do.
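A minimal sketch of the batching, under two assumptions: that the compound format string is built from `%[fx:mean.r]`-style escapes (that is my guess at a workable string, not a quote from the imgplex source), and that the values come back whitespace-separated on stdout. Both function names are hypothetical.

```typescript
// Build one format string that asks for every requested channel mean.
function meansFormat(channels: string[]): string {
  return channels.map((c) => "%[fx:mean." + c + "]").join(" ");
}

// Parse the single line of output back into a per-channel record.
function parseMeans(channels: string[], stdout: string): Record<string, number> {
  const parts = stdout.trim().split(/\s+/);
  if (parts.length !== channels.length) {
    throw new Error("expected " + channels.length + " values, got " + parts.length);
  }
  const means: Record<string, number> = {};
  channels.forEach((c, i) => {
    means[c] = parseFloat(parts[i]);
  });
  return means;
}
```

One spawn of `magick identify -format <that string> <file>` then answers the whole question, instead of one spawn per channel.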

Thread limits: the bug hiding in global state

Part 4 mentioned that ImageMagick has its own internal multithreading, so running N worker processes that each grab every core oversubscribes the CPU and makes the whole batch slower - the fix being to divide the thread budget across the workers so the total stays sane. That’s the MAGICK_THREAD_LIMIT environment variable, and setting it is the easy part.

The subtle part is how you set it. The first implementation wrote it into the process’s shared environment before each spawn. That works fine when one thing runs at a time. It stops working the moment two things overlap - a preview firing while a batch is running, thumbnails generating during an import - because they’re all reading and writing the same global variable, and whoever wrote last wins. One workload clobbers another’s thread budget and the careful division falls apart.

The fix was to attach the thread limit to each individual spawn’s own environment instead of mutating the shared one. Every magick process now carries its own budget, and concurrent previews, thumbnails, and batches stop stepping on each other. It’s the classic shared-mutable-global bug - the kind that looks fine in every single-threaded test and only misbehaves when things run at once - and the fix is the classic one too: stop sharing the state.
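The shape of that fix, as a hedged sketch: `MAGICK_THREAD_LIMIT` is the real variable, but the helper names and the "cores divided by workers, minimum one" rule are my illustration of dividing the budget, not necessarily the exact formula in the app.

```typescript
type Env = Record<string, string | undefined>;

// Split the machine's cores across workers; never drop below one thread each.
function threadsPerWorker(cpuCount: number, workers: number): number {
  return Math.max(1, Math.floor(cpuCount / Math.max(1, workers)));
}

// Return a fresh environment object for a single spawn.
// The shared environment is only read here, never written.
function spawnEnv(shared: Env, threadLimit: number): Env {
  return { ...shared, MAGICK_THREAD_LIMIT: String(threadLimit) };
}
```

The returned object would be passed as the `env` option of the spawn call, so a preview spawned with a budget of 8 and a batch worker spawned with a budget of 2 can no longer overwrite each other.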

Keeping the main process responsive

Throughput is one half of feeling fast; the other half is never freezing. Electron’s main process runs a single event loop - the same shape as a game’s main thread - and any synchronous work you put on it blocks everything else until it returns. Stall it and the UI stops responding, IPC messages queue up behind the stall, and the app feels locked even though it’s technically busy working.

Two places were quietly doing exactly that. Importing a folder walked the directory tree with synchronous recursion - the kind of readdirSync walk that’s fine on ten files and janks hard on a deep tree of thousands, because nothing else on the event loop gets a turn until the entire walk finishes. Converting it to async filesystem calls lets the scan yield between steps so IPC keeps flowing; the import takes as long as it takes, but the app stays alive and responsive while it happens.
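A sketch of the async walk. To keep it self-contained, the directory read is injected as a `listDir` function; in Node it could wrap `fs.promises.readdir(dir, { withFileTypes: true })`. The `Entry` and `ListDir` types and the `walk` name are illustrative assumptions.

```typescript
type Entry = { name: string; isDirectory: boolean };
type ListDir = (dir: string) => Promise<Entry[]>;

// Iterative directory walk that awaits every directory read.
async function walk(root: string, listDir: ListDir): Promise<string[]> {
  const files: string[] = [];
  const pending: string[] = [root];
  while (pending.length > 0) {
    const dir = pending.pop() as string;
    // With a real filesystem call behind listDir, the event loop is free
    // to handle IPC and UI work while this read is in flight.
    const entries = await listDir(dir);
    for (const entry of entries) {
      const full = dir + "/" + entry.name;
      if (entry.isDirectory) {
        pending.push(full);
      } else {
        files.push(full);
      }
    }
  }
  return files;
}
```

The explicit `pending` stack also avoids deep recursion on very nested folders, which is a side benefit rather than the point.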

Startup housekeeping had the same shape. On launch the app sweeps its temp directory for intermediate files orphaned by a previous crash - the debris a hard quit leaves mid-batch - and it now also ages out cached thumbnails and preview files older than two weeks so the cache folder doesn’t grow without bound. That sweep used to run synchronously and added its cost directly to launch time; it’s now async, so it happens in the background and never holds up the window appearing. The still-valid recent cache is left untouched; only the genuinely stale files go.

It’s the same discipline as refusing to do a giant synchronous asset load on the game thread. The work still has to happen - you just don’t let it hold everything else hostage while it does.

You can’t optimize what you can’t see

Here’s the honest part. Every optimization above is small, and several of them are non-obvious. The JPEG DCT trick and the thread-limit race in particular are the sort of thing you do not find by staring at code. You find them by measuring, being surprised, and going to look at why.

So there’s a timing system built in, toggled from the Debug > Enable Performance Timers menu option. With it on, every batch breaks its time down by phase: setup, the ImageMagick startup cost (from kicking off the run to the first image actually being touched), the per-image magick time, the file-existence check, and the final copy. The breakdown is printed to the console after each run and written to a perf.log in the output directory, so you can compare runs across code changes instead of trusting your memory of “that felt faster.”
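A per-phase timer of this kind can be very small. The sketch below is an assumption about the general shape, not the imgplex implementation: the class name, the injected clock, and the report format are all hypothetical. In the app the clock would be something like `performance.now()`.

```typescript
class PhaseTimer {
  private totals: Record<string, number> = {};
  private order: string[] = [];
  private now: () => number;

  constructor(now: () => number) {
    this.now = now;
  }

  // Run one synchronous step and add its duration to the named phase.
  measure<T>(phase: string, fn: () => T): T {
    const start = this.now();
    try {
      return fn();
    } finally {
      this.add(phase, this.now() - start);
    }
  }

  // Accumulate time for a phase, remembering first-seen order.
  add(phase: string, ms: number): void {
    if (!Object.prototype.hasOwnProperty.call(this.totals, phase)) {
      this.totals[phase] = 0;
      this.order.push(phase);
    }
    this.totals[phase] += ms;
  }

  // One line per phase, in the order the phases first ran.
  report(): string {
    return this.order
      .map((p) => p + ": " + this.totals[p].toFixed(1) + " ms")
      .join("\n");
  }
}
```

Injecting the clock keeps the timer testable with a fake time source, and the same `report()` string can go to both the console and the log file.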

When the timers are on, it also asks ImageMagick itself for its -verbose per-image stats - format, dimensions, colorspace, file size, elapsed time - and folds those into the same perf.log. That’s how you tell the difference between “the batch is slow” and “the batch is slow because three specific 12,000px PSDs are dominating the whole run.” The phase breakdown tells you which stage; the per-image stats tell you which file. Between them you’re never guessing.

(One small cross-platform gotcha that cost some time: -verbose placed as a per-image option gets silently ignored by some Windows ImageMagick builds. It has to sit in the global option position, before the input, to be honoured everywhere. The kind of thing that works on your machine and quietly does nothing on someone else’s.)

The log window

The timing system tells you about batches. For everything else there’s a dedicated log window - Help > View Log - that streams entries live from the main process with timestamps and severity levels, in its own window so you’re not squinting at a terminal behind the app. This is also available in release builds, so you can ask your artist to send you the log if something didn’t work on their system as expected.

The reason it’s useful is that the pipeline logs generously. Every magick spawn records its arguments, how long it took, and how many bytes it wrote to stdout and stderr. Batch and import runs log their start and end. Thumbnail generation, node registry hot-reloads, IPC handler entry points - all of it lands in the same stream. When a node misbehaves or a batch stalls, the answer is almost always sitting right there in the log: the exact magick command that ran, and what it said back. It turns “it doesn’t work” into “this specific command failed with this specific message,” which is most of debugging.

A couple of small practical touches keep it usable rather than overwhelming. The in-memory log is capped at a thousand entries so a long session doesn’t slowly eat memory, there’s a clear button to reset the view without reopening the window, and the window drops the main app’s menu bar since File/Edit/View mean nothing in a log viewer.
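The cap is the kind of thing that fits in a dozen lines. A sketch, assuming the oldest entries are the ones dropped once the limit is hit (the text above only says the log is capped); the class and field names are hypothetical.

```typescript
type LogEntry = { time: number; level: "info" | "warn" | "error"; message: string };

class CappedLog {
  private entries: LogEntry[] = [];
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Append an entry, dropping the oldest ones once the cap is exceeded.
  push(entry: LogEntry): void {
    this.entries.push(entry);
    if (this.entries.length > this.capacity) {
      this.entries.splice(0, this.entries.length - this.capacity);
    }
  }

  // Backing for a "clear" button: reset without recreating the log.
  clear(): void {
    this.entries = [];
  }

  // Copy of the current contents, oldest first.
  snapshot(): LogEntry[] {
    return this.entries.slice();
  }
}
```

`new CappedLog(1000)` would match the thousand-entry limit described above.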

The learnings

While this tech stack is new to me, the principles are the same ones I’ve always worked by, and they come down to two things: measure before you fix, and don’t do work you’ll only throw away.

The first is why I’ve leaned on profilers for most of my career. Some of them are clunky to use, but I love them for the one thing they do - tell you what’s really happening instead of what you assumed was. So when I sat down to build imgplex I made performance metrics a core part of the app from the start: a clock on every phase, a log on every spawn. It’s already paid off in finding and fixing the issues above.

The second is the thread running through every trick in this post - the cheapest work is the work you never do. Don’t compress a scratch file, don’t decode pixels you’re about to discard, don’t spawn four processes for one question, don’t render channels nobody will look at, don’t block the event loop with work that could yield.

The present and the future

imgplex is in a stable shape now, and a number of people across different game studios are using it in production. Their feedback, bug reports, and discussion have been a huge help in improving it. Along the way I’ve researched and learnt a lot about architecture and about writing code bases that are scalable, performant, and manageable.

AI accelerated this many times over - I honestly wouldn’t have attempted a code base this complex on my own, and definitely not in the small slice of time left over after my day job.

Parts of the application have been through big refactors, and the pipeline design has changed in meaningful ways more than once. All of it led to the current state: stable, and hopefully easy to extend. It’s in no way done - it’ll keep evolving with the needs and feedback of the people using it - but it’s already been a genuinely valuable learning experience, and those lessons have paid off in other tools I’ve built since.

If you end up using imgplex and have thoughts about it, I’d love to hear from you - reach out directly or on the project GitHub repo.

Standalone E-Ink Picture Frame

Mon, 08 Jun 2026 00:00:00 GMT

I came across this blog post by Guy Sie detailing a Spectra 6 based e-ink display to use as a picture frame. The premise is quite simple and enticing: a dynamic picture frame that doesn’t look like a display and can show images relatively well as long as the images are encoded properly for it. I ended up getting one and set it up to work with Home Assistant.

Disclaimer: The Arduino firmware for this project and parts of Ink Frame Lab have been developed with help from AI tools. The design decisions, architecture, and hardware debugging were done manually, with AI assisting primarily in code generation and iteration.

It’s nice, but...

After living with the Home Assistant setup for a while, a few friction points became clear:

It wasn’t standalone. The frame needed a Home Assistant instance running somewhere on the network to serve images. That’s fine for my setup, but I wanted this to be something I could give as a gift to someone who doesn’t have a Home Assistant setup or a NAS. A picture frame shouldn’t need infrastructure.
No battery visibility. I had to open Home Assistant to check the battery level. For something that sits on a shelf and sleeps most of the time, I wanted a glanceable indicator on the display itself.
The image processing workflow was rough. There are tools online that can dither images to the Spectra 6 color palette, but you still have to manually crop and resize each image before dithering. That’s not something I can ask a non-technical person to do if they want to add new photos. Also, most web-based tools lack batch processing and can only handle one image at a time - fine for the odd experiment, but impractical when you want to process a bunch of images in one go.
No way to preview the end result. E-ink panels have quite muted colors - for example white is more of a light bluish grey on the display, and there’s no backlight, so images look very different depending on ambient lighting. I wanted to be able to visualize how a processed image would actually look on the panel in the real world in different lighting conditions before committing to it.

Ideas and dead ends

My first idea was to expose the SD card as a USB drive when the device is plugged in, so you could just drag and drop images like a thumb drive. This turned out to be a hardware dead end: on the reTerminal E1002, the USB-C port is routed through a CH341 UART bridge chip, which can only do serial communication. The ESP32-S3 does have native USB OTG that could theoretically do mass storage, but those GPIO pins (19/20) are repurposed for the I2C bus on this board. There’s no way to present a storage device to the host without physically modifying the PCB.

The fallback was Wi-Fi. The ESP32-S3 has Wi-Fi built in, so the device could host a small web server with a drag-and-drop upload page. No app needed, works from any phone or laptop browser. The question was how to make this accessible to someone who has never configured a microcontroller. The answer turned out to be using the device’s own Wi-Fi access point - the frame creates its own network, the e-ink screen shows the network name, password, and URL in large text, and you just follow the steps. No router configuration, no IP address hunting.

The solution

I ended up building two things: custom Arduino firmware for the reTerminal E1002, and Ink Frame Lab — a browser-based tool for preparing images for e-ink displays.

Firmware for reTerminal E1002

[Images: web server mode, web server interface, battery level bar]

The firmware replaces the stock ESPHome setup with standalone Arduino code that doesn’t need Wi-Fi or Home Assistant during normal operation. The device reads PNG images from the SD card, picks one (randomly or sequentially based on a config file), renders it to the e-ink display with a single-pixel horizontal battery indicator bar at the bottom, and goes into deep sleep until it’s time to change.

The interesting engineering challenges were all around the shared SPI bus. The SD card and e-ink display share the same SPI pins (MOSI, MISO, SCK) with separate chip selects, which means they can’t talk at the same time. My first approach was to decode the PNG and draw to the display simultaneously inside GxEPD2’s paged drawing loop - re-reading the PNG from SD for each page. This worked for the first 40-pixel strip and then the rest of the screen was white. The display’s SPI context was active during the page loop, so the SD card reads silently failed.

The fix was a two-pass approach using the ESP32-S3’s 8MB PSRAM: first, decode the entire PNG from SD into a 384KB buffer in PSRAM, close the SD card, then initialize the display and draw from the buffer. This also meant being deliberate about initialization order - if the display driver sent its init sequence on the shared SPI bus before the SD card was mounted, the card’s internal SPI state machine would get confused and reject subsequent mount attempts. Splitting initDisplay() into a pin-setup phase and a deferred driver-init phase fixed this, ensuring the SD card always gets a clean bus.

Another entertaining bug: the first successful render had all the colors wrong. Green foliage showed as red, blue sky showed as green. The PNGdec library’s getLineAsRGB565() function was being called with PNG_RGB565_BIG_ENDIAN, which byte-swaps each pixel for big-endian displays - but the ESP32 is little-endian. The bit extraction was pulling the wrong channels from each swapped uint16_t. A one-line fix to PNG_RGB565_LITTLE_ENDIAN and the colors were correct.
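The color shift is easy to reproduce on paper. This is a standalone arithmetic sketch in TypeScript rather than the firmware's C++, and the function names are made up; it only shows what happens when a standard RGB565 value (5 bits red, 6 bits green, 5 bits blue) has its two bytes swapped before the channels are extracted.

```typescript
// Pack 5-bit red, 6-bit green, 5-bit blue into one 16-bit RGB565 value.
function packRgb565(r5: number, g6: number, b5: number): number {
  return ((r5 & 0x1f) << 11) | ((g6 & 0x3f) << 5) | (b5 & 0x1f);
}

// Swap the high and low bytes of a 16-bit value.
function swapBytes16(v: number): number {
  return ((v & 0xff) << 8) | ((v >> 8) & 0xff);
}

// Pull the three channels back out, assuming the normal bit layout.
function unpackRgb565(v: number): { r5: number; g6: number; b5: number } {
  return { r5: (v >> 11) & 0x1f, g6: (v >> 5) & 0x3f, b5: v & 0x1f };
}

// Pure green is 0x07E0; swapped it becomes 0xE007.
const greenSeenAs = unpackRgb565(swapBytes16(packRgb565(0, 63, 0)));
// Pure blue is 0x001F; swapped it becomes 0x1F00.
const blueSeenAs = unpackRgb565(swapBytes16(packRgb565(0, 0, 31)));
```

Pure green comes out as red 28 of 31, green 0, blue 7, so foliage reads as red. Pure blue comes out as red 3, green 56 of 63, blue 0, so sky reads as green. That matches the symptoms exactly.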

The web server mode is a secondary boot mode activated by holding the green button during power-on. The e-ink screen shows step-by-step instructions with the Wi-Fi credentials and URL, and the web interface lets you upload, delete, and manage photos, configure the rotation interval, and choose between random or sequential display order. In sequential mode, the two white buttons on the device navigate forward and backward through the images. When you’re done, the device enters deep sleep for a couple of seconds and wakes up as a clean cold boot into slideshow mode - I learned the hard way that ESP.restart() is a software reset that doesn’t properly reinitialize the SPI peripheral, so the SD card would fail to mount after every restart from setup mode.

As a side effect of running standalone and not needing to maintain a Wi-Fi connection, battery life improved significantly. My device is set to change images randomly every 4 hours and loses about 10% over a week.

Here are the firmware files for reTerminal E1002 and installation instructions.

Ink Frame Lab

The image preparation side of the problem needed its own tool. Existing dithering tools handle the palette conversion, but none of them solve the full workflow: crop to the panel’s aspect ratio, resize to 800×480, dither to the Spectra 6 palette, and preview how it will actually look on the muted, non-backlit display.

Ink Frame Lab is a browser-based tool that handles all of this. You import images, crop them with a locked aspect ratio, preview the dithered result, and - the part I’m most pleased with - inspect it in a 3D view that simulates different lighting conditions and angles. This matters more than you’d think: an image that looks great on your monitor can look muddy or washed out on the actual panel, and being able to preview that before exporting saves a lot of trial and error. Oh, and you can bulk edit and export images - no need to process images one at a time.

[Images: crop view, processed view, 3D view]

You can read more about this tool here and a functional web version is here.

Credits

Three.js - 3D rendering, used for the frame viewer, IBL lighting, and post-processing pipeline (OrbitControls, RoomEnvironment, RGBELoader, EffectComposer, GTAOPass, OutputPass).
OpenDithering - the image adjustments pipeline (DRC, tone mapping, S-curve, saturation, exposure) was ported from this project by Guy Sie.
epdoptimize - inspiration and references for image processing and measured values
JSZip - client-side ZIP archive creation for the Export ZIP feature.
Inter - UI typeface by Rasmus Andersson, served via Google Fonts.
Dithering algorithms - error diffusion kernels (Floyd-Steinberg, Atkinson, False Floyd-Steinberg, Jarvis-Judice-Ninke, Stucki, Burkes, Sierra-3, Sierra-2, Sierra-2-4A), ordered Bayer matrix, and random noise dithering are original implementations of published public-domain techniques.
Polyhaven - images used for image based lighting in the 3D view.

Future plans

Improve the 3D viewer of Ink Frame Lab - the lighting simulation works but could be more realistic.
Add more presets for different devices and panel types - the only verified device preset here is the reTerminal E1002; the presets for other devices are based on specs I found online.

Building imgplex: part 6

Sun, 31 May 2026 00:00:00 GMT

One in, one out - until it wasn’t

The earlier posts described the pipeline as if it had a single entrance and a single exit: images come in, flow through the nodes, and land at the Output node. That was true for a while, and it was the right place to start. But real image work almost never fits that shape.

You want to pull frames from one folder and a set of masks from another. You want the same processed image written out as a full-resolution PNG, its computed dimensions dumped to a text file, and the whole batch tiled into a single contact sheet for review. One source and one destination stops being enough almost immediately.

If you’ve built graphs in Substance Designer, you already know the shape this wants to take. A Substance graph doesn’t have one output - it has a basecolor output, a normal output, a roughness output, all fed from shared upstream nodes, each producing a different deliverable from the same network. imgplex ended up in the same place: any number of inputs, any number of typed outputs, all on one canvas.

Multiple inputs

The first half of that is straightforward to describe, but it was a real change under the hood. Instead of a single implicit source, the canvas supports any number of Input nodes, and each one owns its own image list and its own filmstrip. Click one Input node and the filmstrip shows its images; click another and you see that one’s. They’re independent sources that happen to share a canvas.

This matters the moment a graph branches. A compositing workflow might take a base image from one input and an overlay from another. Keeping them as separate nodes, each with its own queue, means the graph reflects what’s actually happening - two distinct sources meeting downstream - rather than forcing everything through one funnel.

The wrinkle this introduces is that the engine can no longer assume it knows where an image “came from.” With one input that question is trivial; with several, every output needs to know which source feeds it. So when a run starts, the engine walks backwards from each output node - following the wires upstream, hop by hop, until it reaches an Input node. That trace is what tells it which queue of images to push through that particular branch of the graph. Multiple inputs and multiple outputs are really the same feature viewed from two ends: the graph became a many-to-many network, and the engine has to resolve which end connects to which.
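The backwards walk can be sketched in a few lines. The node and wire shapes here are simplified stand-ins, not the real imgplex data model, and this version returns the first Input node it reaches; a branch fed by several inputs would need to collect all of them.

```typescript
type GraphNode = { id: string; kind: "input" | "process" | "output" };
type Wire = { from: string; to: string }; // from = upstream node id

// Follow wires upstream from an output until an Input node is found.
// Returns the Input node's id, or null if the output is not fed by one.
function findInput(outputId: string, nodes: GraphNode[], wires: Wire[]): string | null {
  const visited: string[] = [];
  const stack: string[] = [outputId];
  while (stack.length > 0) {
    const id = stack.pop() as string;
    if (visited.indexOf(id) !== -1) continue;
    visited.push(id);
    const node = nodes.filter((n) => n.id === id)[0];
    if (node && node.kind === "input") return id;
    // Queue every node wired into this one.
    for (const w of wires) {
      if (w.to === id) stack.push(w.from);
    }
  }
  return null;
}
```

A `null` result is also exactly the "this output isn't connected to an input" case that run validation needs to report.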

Multiple typed outputs

The other half is outputs, and this is where the typed-wire system from the last post pays off again. There are three kinds of output node, and they’re genuinely different things:

Image Output - writes processed image files. The one you’d expect.
Text Output - writes computed values to a text file. Wire an image’s dimensions, filename, or any value-graph result into it and it produces a per-image .txt record. Useful for generating manifests, sidecar metadata, or just inspecting what the value graph computed without running a full batch.
Flipbook Output - tiles a whole set of images into a single contact sheet.

Each is a first-class node on the canvas with its own inspector and its own settings. Interestingly, this wasn’t the original design - the output started life as a single node with a mode switch, and the flipbook was folded in as just another mode. That worked until it didn’t: each output type accumulated enough of its own settings that cramming them behind a dropdown on one node got awkward. Splitting them into separate typed nodes let each one have exactly the inspector it needed, and made a graph with several outputs readable at a glance - you can see the three destinations sitting on the canvas instead of having to click into one node to discover it’s secretly doing three jobs.

When you run the workflow, every valid output node is processed in turn. One graph, one Run, several deliverables - the same network feeding a PNG, a manifest, and a contact sheet in a single pass.

All three output types share the same “skip existing or overwrite” choice, too. That started as an Image Output feature and later got extended to the other two, because re-running a workflow over files that already exist should behave predictably no matter which kind of output produced them.

Processing in sets

The features so far treat each image as an independent unit. But a lot of image work comes in groups that belong together and have to be processed as a unit: PBR texture sets (diffuse, normal, channel-packed mask), the six faces of a cubemap, a run of frames that will become a sprite sheet. Processing those one loose image at a time loses the thing that makes them a set.

This is what the Set Input node is for. Instead of treating a folder as a flat list of images, it groups them by a naming convention - a shared prefix or suffix - so that T_Example_D.png and T_Example_N.png are understood as a PBR texture set, or Frame_01, Frame_02, Frame_03 as one flipbook sequence set. The group travels through the pipeline together as a unit, and nodes that operate on sets (the flipbook being the obvious one) receive the whole group rather than a single frame.
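One way the grouping could work, as a sketch: treat everything before the last underscore of the file name (minus its extension) as the set key. That specific rule and the function name are assumptions for illustration; the Set Input node's actual matching is configurable.

```typescript
// Group file names into sets keyed by the text before the final "_suffix".
function groupBySuffix(fileNames: string[]): Record<string, string[]> {
  const sets: Record<string, string[]> = {};
  for (const name of fileNames) {
    // Strip the extension, if there is one.
    const dot = name.lastIndexOf(".");
    const stem = dot > 0 ? name.slice(0, dot) : name;
    // Everything before the last underscore is the shared prefix.
    const cut = stem.lastIndexOf("_");
    const key = cut > 0 ? stem.slice(0, cut) : stem;
    if (!Object.prototype.hasOwnProperty.call(sets, key)) {
      sets[key] = [];
    }
    sets[key].push(name);
  }
  return sets;
}
```

Given `T_Example_D.png`, `T_Example_N.png`, `Frame_01.png`, and `Frame_02.png`, this yields two sets keyed `T_Example` and `Frame`, each holding its two files.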

For a technical artist this is a familiar mental model. It’s the same grouping you rely on when a tool ingests a texture set - basecolor, normal, roughness, metallic - by matching filename suffixes, and treats the four maps as one material rather than four unrelated images. The naming convention is the grouping logic, and it’s the convention art teams already follow.

The prefix that drives the grouping is itself a wireable port, which is a small thing with an outsized payoff. Because it’s typed as a string input, you can drive it from the value graph instead of typing it in by hand - compute a prefix from some other property and the grouping adapts automatically. It’s the two-graph model from the previous post showing up exactly where you’d want it.

Flipbooks

The Flipbook Output deserves its own mention because the name is not a coincidence. In real-time VFX, a flipbook is a grid of animation frames packed into one texture that a shader steps through over time - a staple of particle systems. Tiling a set of images into a single sheet is the same operation, whether you’re building an actual flipbook texture for a particle effect or just want a contact sheet to eyeball a hundred processed images at once.

The node takes a set, lays the frames out in a grid, and composites them into one image. The background between and behind tiles is a configurable color, which matters more than it sounds - a transparent versus a solid background is the difference between a usable particle flipbook and one with fringing artifacts, and the choice gets passed straight through to the composite step.

Running it without nasty surprises

Once a graph can have several inputs and several outputs, “run the workflow” is no longer a simple instruction. Some outputs might be ready; others might be missing a required setting or not wired to a source at all. The worst possible behavior here is a silent partial run - you hit Run, some outputs quietly get skipped, and you don’t find out until you go looking for files that were never written.

So a run doesn’t just start. First every output node is validated independently, and a dialog lists each one with its status and, for any that can’t run, a plain reason why - this output isn’t connected to an input, that one has no file path set. You see exactly what will and won’t run before committing to it, rather than discovering the gaps afterward.
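A sketch of the per-output check that would feed such a dialog. The two failure reasons are the ones named above; the `OutputNode` shape, the `hasInput` callback, and the function name are hypothetical.

```typescript
type OutputNode = { id: string; label: string; filePath: string };
type Validation = { label: string; ok: boolean; reason: string };

// Validate every output independently, so one broken output never hides
// the state of the others.
function validateOutputs(
  outputs: OutputNode[],
  hasInput: (outputId: string) => boolean,
): Validation[] {
  return outputs.map((o) => {
    if (!hasInput(o.id)) {
      return { label: o.label, ok: false, reason: "not connected to an input" };
    }
    if (o.filePath.trim() === "") {
      return { label: o.label, ok: false, reason: "no file path set" };
    }
    return { label: o.label, ok: true, reason: "" };
  });
}
```

Returning a full list rather than throwing on the first failure is what lets the dialog show every output with its status at once.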

A couple of related behaviors back this up. Nodes with no path to any output are skipped entirely during both preview and batch, so a half-wired experiment left sitting on the canvas doesn’t throw errors or slow anything down - it’s simply inert until you connect it. And saved workflows carry the app version that wrote them, so opening a file made by a newer build warns you about the mismatch instead of silently misreading a format that has since changed. When a run is underway, the progress modal names the file it’s currently on, so on a long batch you can see where the pipeline actually is rather than watching a bar inch along with no context.

None of this is glamorous, but it’s the difference between a tool you trust with a 2000-image job and one you have to babysit.

Next post in this series: Part 7 - The small, measured optimizations beneath the big ones

Building imgplex: part 5

Sun, 24 May 2026 00:00:00 GMT

Two graphs in one

The previous post covered how the node graph executes. So far, this series has talked about the node graph as if it does one thing: push images through a chain of ImageMagick operations. But there’s a second graph living inside the same canvas, and it never touches an image at all.

If you’ve used Substance Designer or Blender’s geometry nodes, this will feel familiar. Some nodes carry the actual thing being processed - the image, the mesh, the texture. Other nodes just compute values: a number, a comparison, a bit of math whose result feeds into a parameter somewhere else. Both live in the same graph and connect with the same wires, but only one of them is doing the heavy lifting.

imgplex has exactly this split. There are image nodes - anything with an image or mask port, executed as ImageMagick operations - and pure-value nodes - math, logic, and constants that get evaluated in JavaScript and never spawn a single magick process. The thing that lets these two coexist cleanly on one canvas is that every port is typed.

Typed wires

Every port in the graph has a type, and every wire connecting two ports has to respect it. You can’t wire a color into a slider that expects a number - the canvas checks the connection as you make it and simply refuses the ones that don’t fit.

This is the same idea as a shader graph refusing to plug a float3 into a float input. Types kill off a whole category of mistakes before they can happen, and they make a dense graph readable at a glance. The validation covers a few cases: a port can’t connect to itself, a connection that would form a cycle is rejected, and the core types - image, number, boolean, string, path - only connect to compatible ports. There’s also an any type for nodes that are type-agnostic; once one wire lands on an any port, its siblings on that node are constrained to match, so a generic node can’t end up with a nonsensical mix of input types.
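
The connection checks described above can be sketched roughly as follows. The Port and Wire shapes and the function names are hypothetical, and the sibling-constraint behavior of any ports is left out to keep the sketch short.

```typescript
type PortType = "image" | "mask" | "number" | "string" | "bool" | "path" | "any";

interface Port { nodeId: string; portId: string; type: PortType; }
interface Wire { from: Port; to: Port; }

// Would adding from -> to let us walk downstream from `to` and arrive back at `from`?
function wouldCycle(wires: Wire[], from: Port, to: Port): boolean {
  const stack = [to.nodeId];
  const seen = new Set<string>();
  while (stack.length > 0) {
    const current = stack.pop()!;
    if (current === from.nodeId) return true;
    if (seen.has(current)) continue;
    seen.add(current);
    for (const w of wires) {
      if (w.from.nodeId === current) stack.push(w.to.nodeId);
    }
  }
  return false;
}

// Reject self-connections, cycles, and mismatched types; "any" accepts everything.
function canConnect(wires: Wire[], from: Port, to: Port): boolean {
  if (from.nodeId === to.nodeId && from.portId === to.portId) return false;
  if (wouldCycle(wires, from, to)) return false;
  return from.type === to.type || from.type === "any" || to.type === "any";
}
```

Because the check runs while the wire is being dragged, an invalid graph never exists in the first place, so the pipeline does not need to re-validate types at run time.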

To make all of this legible, each type has its own wire color:

image - orange (the main pixel pipeline)
mask - purple (a grayscale image used as a mask)
path - mint green (a filesystem path)
number - cyan
string - green
bool - yellow
color - pink
vector2 / vector3 / vector4 - amber / indigo / teal

When you drag a wire out of a port, the preview wire is already tinted with the source type’s color, so before you’ve connected anything you can see what kind of value is flowing. In a busy graph this ends up being the fastest way to trace what connects to what - you just follow the color. The palette isn’t arbitrary either; each color was checked for AAA contrast against the dark node background, to ensure easy legibility.
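
For reference, a contrast check of this kind can be written in a few lines. This is a sketch using the WCAG 2.x relative luminance and contrast ratio formulas; the function names are my own, and it assumes six-digit hex colors like "#ffaa00".

```typescript
// Relative luminance of an sRGB color, per the WCAG 2.x definition.
function relativeLuminance(hex: string): number {
  const channels = [1, 3, 5].map((i) => parseInt(hex.slice(i, i + 2), 16) / 255);
  const [r, g, b] = channels.map((c) =>
    c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4)
  );
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio is (lighter + 0.05) / (darker + 0.05), ranging from 1 to 21.
function contrastRatio(fg: string, bg: string): number {
  const l1 = relativeLuminance(fg);
  const l2 = relativeLuminance(bg);
  const [lighter, darker] = l1 >= l2 ? [l1, l2] : [l2, l1];
  return (lighter + 0.05) / (darker + 0.05);
}

// AAA for normal-size text requires a ratio of at least 7:1.
function passesAAA(fg: string, bg: string): boolean {
  return contrastRatio(fg, bg) >= 7;
}
```

Running each wire color against the node background through passesAAA is enough to turn the palette choice into a repeatable check.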

The pure-value graph

Pure-value nodes are the ones with no image ports at all - their input and output lists are empty. Instead of pixels they deal in numbers, strings, booleans, and vectors. They come in a few families:

Value nodes - a constant. A Float node outputs a number you set; a Color node outputs a color. These are the leaves of the value graph.
Math nodes - Add, Subtract, Multiply, Divide, Power, Lerp. Ordinary arithmetic on wired-in values.
Vector nodes - build a vector from components, split one back apart, dot product, length, normalize.
Logic nodes - AND, OR, NOT, comparisons, and a Branch node that picks between two values based on a condition.
Properties nodes - these read a fact about the current image (its width, height, name, file size, bit depth) and output it as a value.

That last family is the bridge between the two graphs. A Properties node doesn’t process the image - it reads a fact about it and hands that fact to the value graph as a plain number or string.

How the two graphs connect

The connection point is parameters. Every editable parameter on an image node - the sigma on a Blur, the angle on a Rotate - exposes an input handle on its left side. That handle accepts a wire from any compatible value output.

So you can do things like: read an image’s width with a Dimensions node, halve it with a Math node, and wire the result straight into a Crop node’s parameter. The crop now adapts to each image’s size on its own, with no fixed numbers anywhere in the graph.

When the pipeline runs, it evaluates the whole value graph first, in dependency order, so that by the time an image node executes, every one of its parameters is already a concrete resolved value. The value graph is essentially a precomputation pass that front-loads all the arithmetic before any ImageMagick work begins - which is exactly why the image side of the engine can stay so simple. Every image node just needs its parameters handed to it as finished values; it never has to know they came from a chain of math nodes.
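
As a rough illustration of that precomputation pass, here is a memoized evaluator over a tiny value graph. The ValueNode shape and evaluateValueGraph are hypothetical, values are limited to numbers for brevity, and the graph is assumed to be acyclic (cycles are already rejected at connection time).

```typescript
// Hypothetical value-node shape: each node names its inputs and a pure function.
interface ValueNode {
  id: string;
  inputs: string[];                    // ids of upstream value nodes
  compute: (args: number[]) => number; // pure, no image work
}

// Evaluate every value node once, upstream first, memoizing results.
function evaluateValueGraph(nodes: ValueNode[]): Map<string, number> {
  const byId = new Map(nodes.map((n) => [n.id, n] as const));
  const results = new Map<string, number>();

  const evaluate = (id: string): number => {
    const cached = results.get(id);
    if (cached !== undefined) return cached;
    const node = byId.get(id);
    if (!node) throw new Error(`Unknown value node: ${id}`);
    const value = node.compute(node.inputs.map((input) => evaluate(input)));
    results.set(id, value);
    return value;
  };

  nodes.forEach((n) => evaluate(n.id));
  return results;
}

// Width 1024, halved, ready to hand to a Crop parameter as a plain number.
const resolved = evaluateValueGraph([
  { id: "half", inputs: ["width", "two"], compute: ([a, b]) => a / b },
  { id: "width", inputs: [], compute: () => 1024 },
  { id: "two", inputs: [], compute: () => 2 },
]);
```

The image node that consumes "half" only ever sees the number in the result map, never the nodes that produced it.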

Three ways a node computes

Early on, every node was just a command_template - a fixed ImageMagick argument string with {{placeholder}} slots filled in from parameter values. That covers a surprising number of nodes: a Blur is -blur 0x{{sigma}} and nothing more.

But plain substitution hits a wall fast. What if a flag should only appear when a checkbox is on? What if two parameters need combining with a bit of math first? What if the node computes a value and never touches an image at all? So the model grew into three ways to define what a node does:

command_template - the simple case. A fixed argument string with placeholders, no logic. Most image nodes are this.
command_js - for image nodes that need logic. A small JavaScript snippet that receives the resolved parameters and returns an array of ImageMagick arguments. This is what lets a node conditionally add a flag, do arithmetic on a parameter, or format a value correctly before handing it to magick.
compute_js - the same idea for pure-value nodes. Instead of returning ImageMagick arguments it returns computed output values, which is how a custom math or logic node does its work without ever spawning a process.

All three are still just fields in a JSON file. Adding a node with conditional logic doesn’t mean touching the app’s source - you write the snippet inside the node definition, drop the file in, and it hot-reloads like any other node.

It’s worth being honest about what those snippets are, though. command_js and compute_js are real JavaScript, and they run in the app’s main process - the privileged side that can touch the filesystem and spawn processes. That power is the point, but it’s also a responsibility. Keeping executable logic inside node definitions is fine, because those are part of the app or files you’ve deliberately added. Workflow files are a different matter: they’re built to be shared and double-click-opened, which makes their contents untrusted input. Drawing a hard line so that executable code can only ever come from a trusted node definition, and never be smuggled in through a shared workflow, turned out to be worth a whole post of its own - that one’s coming later in the series.

The handful of genuinely complex nodes - the ones that split an image into four channel outputs, or resize with a mode switch - still use hardcoded executors in the app’s TypeScript. Multiple output ports and variable port shapes are the two things JSON alone can’t express. But a lot of behavior I initially assumed would need real code turned out to be expressible as a JSON node with a small command_js snippet.

Data-driven niceties

Two smaller things fall out of this typed, data-driven approach, and both make the node library feel less mechanical.

The first is conditional Inspector UI. A node can declare visibility rules in its JSON - show this parameter row only when that dropdown has a particular value. A Resize node in “percent” mode shows a scale slider; in “pixels” mode it shows a pixel-count box instead. This used to be hardcoded per node; moving it into a declarative rule means any node can opt into it without a line of application code.
The second is search. Node definitions can list alternate names, so typing “sharpen” or “blur” finds the right node even when its formal label is something you wouldn’t have guessed. It’s a tiny feature, but it’s the difference between a library you browse and one you can actually search.
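
To make the visibility rules concrete, here is one way such a rule could be evaluated. The visibleWhen field name and its shape are assumptions for illustration; the actual JSON keys are not shown in this post.

```typescript
// Hypothetical rule shape: show a row only when another parameter has a given value.
interface VisibleWhen { param: string; equals: string | number | boolean; }
interface ParamDef { name: string; visibleWhen?: VisibleWhen; }

// A row is shown when it has no rule, or when its rule matches the current values.
function visibleParams(
  params: ParamDef[],
  values: Record<string, string | number | boolean>
): string[] {
  return params
    .filter((p) => !p.visibleWhen || values[p.visibleWhen.param] === p.visibleWhen.equals)
    .map((p) => p.name);
}

const resizeParams: ParamDef[] = [
  { name: "mode" },
  { name: "scale", visibleWhen: { param: "mode", equals: "percent" } },
  { name: "pixels", visibleWhen: { param: "mode", equals: "pixels" } },
];
```

The Inspector would re-run the filter whenever a parameter changes, so switching the mode dropdown swaps the rows without any per-node code.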

Where this leaves the architecture

The two-graph model - a typed value graph feeding parameters into an image processing graph - is the part of the design I’m happiest with. It keeps the image pipeline dead simple, because every image node only ever sees finished parameter values, while making the graph itself genuinely programmable. Complex adaptive behavior emerges from wiring simple typed nodes together, rather than from any one node being clever.

It also happens to be why the workflow builder runs in a browser with no ImageMagick at all. The value graph is pure JavaScript, so you can build and wire an entire workflow in the web version and only need the real backend when it’s time to actually process pixels.

Next post in this series: Building imgplex: part 6 - Multiple inputs and outputs, processing images as sets

Building imgplex: part 4

Sat, 16 May 2026 00:00:00 GMT

From graph to commands

With nodes defined as JSON and a working graph editor as described in part 3, the next challenge was the pipeline engine: taking a connected graph of nodes and turning it into actual ImageMagick operations applied to actual images.

The first step is always a topological sort. Before executing anything, the engine needs to know the correct order to process nodes - each node has to run after all of its inputs are ready. This is the same problem a shader graph or a Houdini network solves: evaluate the leaf nodes first and work toward the output. Kahn’s algorithm handles the ordering cleanly, and it catches cycles as a side effect - a feedback loop in the graph is rejected before any processing starts.
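
For readers who have not met it, Kahn’s algorithm fits in a short function. This is a generic sketch over node ids, not the engine’s actual code; edges are assumed to point from upstream to downstream.

```typescript
// Kahn's algorithm over node ids; each edge is [upstream, downstream].
function topologicalSort(nodeIds: string[], edges: [string, string][]): string[] {
  const inDegree = new Map<string, number>(nodeIds.map((id) => [id, 0] as [string, number]));
  const downstream = new Map<string, string[]>(
    nodeIds.map((id) => [id, []] as [string, string[]])
  );
  for (const [from, to] of edges) {
    downstream.get(from)!.push(to);
    inDegree.set(to, inDegree.get(to)! + 1);
  }

  // Start with the nodes that have no inputs at all.
  const ready = nodeIds.filter((id) => inDegree.get(id) === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift()!;
    order.push(id);
    for (const next of downstream.get(id)!) {
      const remaining = inDegree.get(next)! - 1;
      inDegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }

  // A node that never reaches in-degree 0 sits on a cycle.
  if (order.length !== nodeIds.length) throw new Error("Graph contains a cycle");
  return order;
}
```

The cycle check costs nothing extra: if the sorted list comes out shorter than the node list, some nodes were waiting on each other.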

Once the order is established, the engine walks the sorted list and resolves each node’s parameters. If a parameter has a wire coming into it from another node, the upstream value wins. If not, the value from the Inspector is used. The pure-value nodes - floats, math, string constants - are evaluated first, so that by the time an image node runs, every one of its parameters is already a concrete value.

For image nodes, those resolved parameters are turned into ImageMagick arguments and handed to a magick process. Image in, image out. That’s the whole loop.

The preview pipeline

The preview runs constantly - every parameter edit, every new connection, every time you click a different image in the filmstrip. The whole point is real-time feedback, so it has to be fast, and a few things make it fast.

First, it only does the work it needs to. The graph is trimmed to just the ancestors of the node you’re currently looking at - nothing downstream of the selected node, and nothing on an unrelated branch, gets evaluated. And it only runs on the single image you’ve got selected in the filmstrip, not the whole queue.
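
Trimming to ancestors is a plain reverse walk over the wires. A minimal sketch, with a hypothetical ancestorsOf helper and edges written as [upstream, downstream] pairs:

```typescript
// Collect the selected node plus everything upstream of it.
function ancestorsOf(selected: string, edges: [string, string][]): Set<string> {
  const keep = new Set<string>([selected]);
  const stack = [selected];
  while (stack.length > 0) {
    const current = stack.pop()!;
    for (const [from, to] of edges) {
      if (to === current && !keep.has(from)) {
        keep.add(from);
        stack.push(from);
      }
    }
  }
  return keep;
}
```

Anything not in the returned set is simply left out of the preview run.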

Second, the input is small. Rather than spawning a fresh process to produce a preview-resolution copy of the source, the preview pipeline reuses the thumbnail that was already generated when the image was imported - a WebP capped at 256px (user configurable) on its longest edge. The WebP format was chosen for the thumbnail for two reasons: it is very fast to write to, and even at 70–80% lossy compression it looks almost the same as the source image. That thumbnail already exists in the cache, so the preview starts from it directly and skips a magick spawn entirely on every single preview cycle. ImageMagick operations on a 256px image are trivially cheap compared to a 4K source, and for judging a brightness tweak or a crop the result is visually identical to the full-res output.

On top of that, the preview caches each node’s output, keyed by a hash of that node’s inputs and parameters. When you change something, only that node and its actual downstream dependents get invalidated and re-run; everything upstream, and every unrelated branch, serves its cached result immediately. It’s the same principle as incremental compilation or a shader cache - never recompute what hasn’t changed. There’s an 80ms debounce on top, so dragging a slider settles before it fires a run rather than launching a hundred of them a second.
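
One way to build such a key is sketched below. The hash is a simple 32-bit FNV-1a variant over UTF-16 code units, used purely as a stand-in; which hash imgplex actually uses is not stated here, and the function names are hypothetical.

```typescript
// 32-bit FNV-1a over UTF-16 code units, used here only as a stand-in hash.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

// A node's key covers its own parameters and the keys of its inputs,
// so changing anything upstream changes every downstream key too.
function cacheKey(
  nodeType: string,
  params: Record<string, unknown>,
  inputKeys: string[]
): string {
  // Sort parameter names so key order in the object does not affect the hash.
  const sortedParams = Object.keys(params).sort().map((k) => [k, params[k]]);
  return fnv1a(JSON.stringify([nodeType, sortedParams, inputKeys]));
}
```

Folding the input keys into each node’s key is what makes invalidation automatic: a changed node gets a new key, and so does everything downstream of it, while untouched branches keep hitting the cache.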

The spawn-cost problem

Here’s the thing that shaped most of the batch engine’s design. On Windows, launching a magick process costs a fixed overhead no matter how trivial the actual operation is - and at scale that overhead dominates everything else. Run it once and you won’t notice. Run it ten thousand times and it’s most of your runtime.

The naive approach (and the thing I did first) - one magick spawn per node, per image - falls apart quickly. A five-node graph over 2000 images is 10,000 process launches, and the spawn overhead alone would have the computer spending most of its time starting and stopping processes before touching a single pixel. Getting rid of that overhead took a few techniques stacked together.

The first is command fusion. ImageMagick can apply many operations in a single invocation - a resize, then a brightness adjustment, then a format conversion, all in one command. So the batch engine doesn’t spawn per node. It walks a chain of consecutive standard operations and accumulates them lazily into one argument list, only actually spawning magick when it hits something that forces a break: a branch where the image feeds two consumers, or a format change. A long linear chain of nodes collapses into a single process launch. Even channel splitting, which pulls the R, G, B, and A channels out as separate images, is done in one magick call rather than four.
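
The accumulation logic can be sketched like this. The Step shape and fuseChain are hypothetical names, and the sketch only handles a linear chain; each returned group stands for one magick spawn.

```typescript
interface Step {
  args: string[];       // ImageMagick arguments for this node
  forcesBreak: boolean; // true at a branch point or a format change
}

// Group consecutive steps into fused argument lists; each group is one spawn.
function fuseChain(steps: Step[]): string[][] {
  const groups: string[][] = [];
  let pending: string[] = [];
  for (const step of steps) {
    pending.push(...step.args);
    if (step.forcesBreak) {
      groups.push(pending);
      pending = [];
    }
  }
  if (pending.length > 0) groups.push(pending);
  return groups;
}

const spawns = fuseChain([
  { args: ["-resize", "50%"], forcesBreak: false },
  { args: ["-brightness-contrast", "10x5"], forcesBreak: false },
  { args: ["-blur", "0x2"], forcesBreak: true }, // image feeds two consumers after this
  { args: ["-negate"], forcesBreak: false },
]);
```

Here four nodes turn into two process launches instead of four, and a chain with no break points would turn into one.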

The second is about the moments when a chain does have to break and write an intermediate file to disk. Those intermediates used to be written as PNG, which means every break point paid the cost of PNG-compressing the image on the way out and decompressing it on the way back in - pure overhead for a file that only exists for a few milliseconds. They’re now written as MIFF, ImageMagick’s own uncompressed native format, which skips the encode/decode entirely. Only the final output - the thing the user actually keeps - gets encoded to the real target format.

The third is parallelism. Instead of processing images one at a time, the batch runs several concurrently, with the worker count derived from the CPU core count. Each worker pulls the next image off the queue and processes it independently. The one subtlety here is that ImageMagick has its own internal multithreading, so if you run N workers and let each one spin up a full thread pool, you oversubscribe the CPU and everything gets slower. The fix is to divide ImageMagick’s thread budget across the workers so the total stays sensible.
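
A sketch of the budget arithmetic, under the assumption that the worker count is capped at the core count and the remaining cores are split evenly (the exact policy imgplex uses is not spelled out here). The function names are hypothetical; `-limit thread` is ImageMagick’s own resource limit option.

```typescript
// Split a fixed thread budget across workers so workers * threads <= cores.
function planConcurrency(coreCount: number, imageCount: number) {
  const workers = Math.max(1, Math.min(coreCount, imageCount));
  const threadsPerWorker = Math.max(1, Math.floor(coreCount / workers));
  return { workers, threadsPerWorker };
}

// One way to apply the budget: prepend ImageMagick's thread limit to each command.
function threadLimitArgs(threadsPerWorker: number): string[] {
  return ["-limit", "thread", String(threadsPerWorker)];
}
```

With a large batch on an 8-core machine this gives 8 workers running single-threaded magick processes; with only two images it gives 2 workers with 4 threads each.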

Fast path, slow path

There’s one more optimization worth explaining, because it’s a nice example of letting the graph’s shape drive the strategy.

Most of the time, the operations applied to every image are identical - the same resize, the same adjustment, the same conversion. In that case the engine builds the operation plan once and reuses it verbatim for every image in the batch. That’s the fast path, and it skips per-image parameter evaluation entirely.

But some nodes read facts about the specific image they’re processing. The moment a graph contains one of those Properties nodes, the shared plan can’t be reused, because the plan genuinely differs per image. So the engine detects that case and switches to a slow path where it re-evaluates per image.

The interesting part is being precise about what “reading a fact about the image” actually costs. Some facts - the filename, the path, the file size - come straight from the filesystem and don’t require decoding the image at all. Others - width, height, bit depth - need ImageMagick to actually open the pixels. Early on, the file-size node was mistakenly flagged as needing full image metadata, which meant every image in a batch spawned a pair of magick identify processes just to answer a question the operating system already knew. Fixing that one flag took a 3,720-image batch from around two minutes down to about two-tenths of a second. The lesson generalizes: the engine should only pay for the information a node truly needs, and a surprising amount of what looks like image metadata is really just filesystem metadata wearing a costume.
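
The path selection and the cost classification can be sketched together. The property names, the PROPERTY_COST table, and choosePlan are illustrative assumptions; the split between filesystem facts and pixel facts follows the description above.

```typescript
type PropertyCost = "filesystem" | "pixels";

// Which facts the OS can answer vs. which need ImageMagick to open the image.
const PROPERTY_COST: Record<string, PropertyCost> = {
  filename: "filesystem",
  path: "filesystem",
  fileSize: "filesystem",
  width: "pixels",
  height: "pixels",
  bitDepth: "pixels",
};

interface Plan { perImage: boolean; needsIdentify: boolean; }

// No property reads means one shared plan (fast path). Otherwise evaluate per
// image, and only spawn identify when some property truly needs decoded pixels.
function choosePlan(propertiesRead: string[]): Plan {
  return {
    perImage: propertiesRead.length > 0,
    needsIdentify: propertiesRead.some((p) => PROPERTY_COST[p] === "pixels"),
  };
}
```

The file-size bug amounts to a single wrong entry in a table like this: marking fileSize as "pixels" would set needsIdentify for every image in the batch.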

Import performance

Loading a big folder into the filmstrip had the exact same spawn-cost problem - generating a thumbnail and reading dimensions for every image, one magick spawn at a time, is painfully slow for a folder of any real size. The fix had a few parts.

For the common formats - PNG, JPEG, BMP, WebP, TGA - the image dimensions can be read straight out of the file header in a handful of bytes, with no process spawn at all. That covers most of the texture formats that show up in a game dev folder.
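
PNG is the simplest of these to show. The sketch below reads the size from the IHDR chunk, which the PNG format places directly after the 8-byte signature; readPngSize is a hypothetical name, and the other formats need their own header parsers.

```typescript
// Read width/height from a PNG's IHDR chunk without decoding any pixels.
// Layout: 8-byte signature, 4-byte chunk length, "IHDR", then width and height
// as big-endian 32-bit integers at byte offsets 16 and 20.
function readPngSize(bytes: Uint8Array): { width: number; height: number } | null {
  const signature = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
  if (bytes.length < 24) return null;
  if (!signature.every((b, i) => bytes[i] === b)) return null;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // DataView reads big-endian by default, which matches PNG's byte order.
  return { width: view.getUint32(16), height: view.getUint32(20) };
}
```

Only the first 24 bytes of the file are needed, so the read is effectively free next to a process spawn.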

Thumbnails are generated in batches: a single magick invocation produces thumbnails for eight images at once, so the launch cost is paid off across the group instead of paid per image. The thumbnails themselves are WebP, which keeps them small on disk and in memory. Import also runs with concurrent workers, same idea as the batch pipeline. And every generated thumbnail is cached to disk, keyed by the source path and its modification time, so re-importing the same folder in a later session skips the work entirely and loads straight from cache.
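
The batching and the cache key are both small helpers. A sketch with hypothetical names (chunk, thumbnailKey); the key format is an assumption, only its ingredients (source path and modification time) come from the description above.

```typescript
// Split a list into fixed-size groups; each group becomes one magick invocation.
function chunk<T>(items: T[], size: number): T[][] {
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Key from source path + modification time, so an edited file misses the cache.
function thumbnailKey(sourcePath: string, mtimeMs: number): string {
  return `${sourcePath}|${mtimeMs}`;
}
```

With groups of eight, a 2000-image folder needs 250 thumbnail spawns rather than 2000.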

Together these took importing ~2000 mixed PNG/TGA/JPG/PSD images from around 44 seconds down to about 3 seconds - roughly a 15× speedup. The difference between that being an interruption and it being instant is the difference between a tool you reach for and one you avoid.

Non-fatal batch errors

One deliberate design decision: errors during a batch are non-fatal. If ImageMagick chokes on one image - a corrupt file, an unusual format edge case, a weird character in a path - the batch keeps going. The failure is recorded and shown in a summary dialog at the end, alongside the counts of images processed and skipped, and written to a timestamped log next to the output so there’s a permanent record of exactly what happened.
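
The control flow is just a try/catch inside the loop rather than around it. A synchronous sketch with hypothetical names; the real batch is concurrent and also writes the log file, which is left out here.

```typescript
interface BatchSummary {
  processed: number;
  failures: { file: string; message: string }[];
}

// Run every file; a failure is recorded and the loop moves on.
function runBatch(files: string[], processOne: (file: string) => void): BatchSummary {
  const summary: BatchSummary = { processed: 0, failures: [] };
  for (const file of files) {
    try {
      processOne(file);
      summary.processed++;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      summary.failures.push({ file, message });
    }
  }
  return summary;
}
```

The summary object is what would feed both the end-of-run dialog and the timestamped log.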

This is just how you’d want a batch tool to behave. Halting a 2000-image run because one file had a problem would be maddening. The summary gives you enough to go and investigate the failures afterward without ever interrupting the work that succeeded.

Next post in this series: Building imgplex: part 5 - Two graphs in one

Building imgplex: part 3

Sun, 03 May 2026 00:00:00 GMT

At the end of part 2 I had an Electron window with a working Svelte Flow canvas and a single node I could drag around - the stack was wired, but nothing actually did anything yet. The obvious next question was how nodes should work in the first place. imgplex is a node-based tool on top of ImageMagick, so the node system is the heart of the whole thing, and the decision I made here shaped everything that came after.

The node definition system

One of the earliest architectural decisions was how to define nodes. The question was: should nodes be hardcoded in TypeScript, or should they be described by data that the app loads at runtime?

The hardcoded approach is simpler to start with. But it means every new node requires a code change, a recompile, and a new release. For a tool where the node library is expected to grow - and where users might eventually define their own nodes - that’s a significant constraint.

The data-driven approach means each node is described by a JSON file that the app loads at startup. Adding a new node is as simple as dropping a new file into the node-definitions/ folder. No recompile required. In development, the registry even hot-reloads when a file changes, so you can iterate on a node definition and see the result in the running app immediately.

If you’ve used Unity’s ScriptableObjects to define data-driven game content - enemy stats, item definitions, ability configs - this is the same idea. The application code defines how to interpret the data; the data files define what exists.

What a node definition looks like

Here’s a simplified example - the Brightness/Contrast node:

{
  "id": "brightness_contrast",
  "version": "1.0.0",
  "label": "Brightness / Contrast",
  "description": "Adjust image brightness and contrast",
  "aliases": ["exposure", "levels adjustment"],
  "category": "Color",
  "inputs": [
    {
      "type": "image",
      "label": "Input"
    }
  ],
  "outputs": [
    {
      "type": "image",
      "label": "Output"
    }
  ],
  "params": [
    {
      "name": "brightness",
      "label": "Brightness",
      "type": "int",
      "widget": "slider",
      "default": 0,
      "min": -100,
      "max": 100
    },
    {
      "name": "contrast",
      "label": "Contrast",
      "type": "int",
      "widget": "slider",
      "default": 0,
      "min": -100,
      "max": 100
    }
  ],
  "command_template": "-brightness-contrast {{brightness}}x{{contrast}}"
}

The command_template field is an ImageMagick command with parameter placeholders in curly braces. At execution time the pipeline substitutes the actual parameter values and passes the result to magick. That’s the entire connection between a node and the image processing backend for the majority of nodes.
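
The substitution step itself is tiny. A minimal sketch, assuming a hypothetical renderTemplate helper and naive whitespace splitting (a real implementation would need to handle values that contain spaces):

```typescript
// Fill {{name}} placeholders from resolved parameter values, then split into argv.
function renderTemplate(
  template: string,
  params: Record<string, string | number>
): string[] {
  const filled = template.replace(/\{\{(\w+)\}\}/g, (_match: string, name: string) => {
    if (!(name in params)) throw new Error(`Missing parameter: ${name}`);
    return String(params[name]);
  });
  return filled.split(/\s+/).filter((part) => part.length > 0);
}

const brightnessArgs = renderTemplate(
  "-brightness-contrast {{brightness}}x{{contrast}}",
  { brightness: 10, contrast: -5 }
);
```

For the definition above, brightnessArgs comes out as the two arguments "-brightness-contrast" and "10x-5", ready to append to a magick command line.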

The params array drives three things simultaneously: the widget rows shown in the inspector panel, the input handles on the left side of the node card for wiring values from other nodes, and the CLI export - the same parameter values get substituted into the command when exporting a shell script.

The param-wire system

Each parameter with a writable value exposes an input handle on the left side of the node. That handle can accept a wire from any compatible output port elsewhere in the graph.

This is how pure-value nodes integrate with image processing nodes. A Float node outputs a constant number. A Math node can add two floats together. The result can be wired directly into the brightness param of the Brightness/Contrast node. The pipeline resolves the full value chain before executing any ImageMagick commands.

The type system enforces compatibility at connection time - you can’t wire a color value into a float input. Each type also has a distinct wire color, making it easy to visually trace what’s flowing where. The color choices were all validated against WCAG AAA contrast requirements on the dark node background, ensuring text is easily readable at any point.

The node registry

At startup the main process scans the node-definitions/ folder and loads every JSON file into a registry. The renderer requests the full list via IPC and uses it to populate the node library sidebar and to construct node cards in the graph editor.

The registry watches the folder for changes. Edit a JSON file, save it, and the node library updates in the running app without a restart. This made iterating on node definitions very fast - the feedback loop for tweaking parameter ranges or adding a new node was just a file save away. This has also been extended to packaged/production builds, so users can add nodes to the shipped app without a restart.

The registry also validates definitions on load. Missing required fields, unknown executor keys, or malformed parameter types are caught early and logged, rather than surfacing as cryptic runtime errors later.
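
A load-time check of this kind might look like the sketch below. Which fields are actually required is my assumption (id, label, category, and a params array), and validateDefinition is a hypothetical name; nodes backed by hardcoded executors would need a different rule than the executor check shown here.

```typescript
// Returns a list of problems; an empty list means the definition looks usable.
function validateDefinition(def: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const field of ["id", "label", "category"]) {
    if (typeof def[field] !== "string") problems.push(`Missing required field: ${field}`);
  }
  // Assumed rule: a JSON-only node needs at least one way to compute.
  const executors = ["command_template", "command_js", "compute_js"].filter((k) => k in def);
  if (executors.length === 0) problems.push("No executor field present");
  if (!Array.isArray(def["params"])) problems.push("params must be an array");
  return problems;
}
```

Returning every problem at once, instead of throwing on the first, makes the log entry for a broken file useful in a single read.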

What this enabled

The payoff of this approach became clear as the node library grew. Adding the entire Transform, Colors, and Filters categories was a matter of writing JSON files, not TypeScript. The logic for rendering node cards, building the inspector, handling wire connections, and generating CLI output was written once and works for every node automatically.

The categories that did require custom TypeScript are the ones with non-trivial behavior: the Channel Split node forks the image stream into multiple outputs, the Properties node reads per-image metadata, and so on. Everything else is just data.

Currently there are over 60 nodes across 12 categories, and the vast majority of them are pure JSON with a command_template. That ratio is the whole point of the system.

The file export formats are defined the same way - each format is just a new JSON file in the format-definitions/ folder.

Next post in this series: Building imgplex: part 4 - Executing the node graph, making it fast

Building imgplex: part 2

Sat, 04 Apr 2026 00:00:00 GMT

In part 1 I covered why I’m building imgplex - a node-based batch image tool on top of ImageMagick - and how I landed on a stack of Electron, Vite, Svelte 5, and Svelte Flow. This post is where the plan meets reality: getting all four of those to actually run together, and setting up an architecture I wouldn’t have to fight later.

Setting up a web dev environment

With the tech stack decided, the first practical challenge was getting Electron, Vite, Svelte 5, and Svelte Flow all working together in a single project. Each of these has opinions about how they want to be configured, and they don’t always agree with each other out of the box.

This is where Claude came in handy - figuring out all the dependencies and setting up a development environment. The starting point was electron-vite - a scaffolding tool that pre-wires the project structure and build pipeline. Vite here is essentially the build system and dev server, similar to how Unity handles compilation and hot-reloading in the editor. electron-vite extends it to handle Electron’s requirements: the main process, the preload script, and the renderer all need to be compiled separately, and electron-vite sets all of that up so you can focus on building the actual app.

The available templates didn’t include Svelte, so I went with the Vanilla template and added Svelte manually. This was the right call anyway - there was nothing to rip out, just things to add.

The first dependency conflict came up immediately: the scaffold shipped with Vite 5, but the latest Svelte plugin requires Vite 6+. The fix was pinning to an older version of the plugin that supports Vite 5. Minor, but a preview of the kind of version-negotiation that comes with assembling a stack from several fast-moving libraries. Not unlike trying to get a specific version of a Unity package to work with a specific editor version.

The three-process mental model

This is the thing most worth understanding early. Electron applications have two completely separate JavaScript environments that cannot share memory or imports directly - and getting this wrong wastes a lot of debugging time.

The main process is essentially a Node.js application running on your machine. It can access the filesystem, spawn processes, and use any native library. Think of it as the backend, or the editor scripts side of things.

The renderer process is basically a Chrome browser tab. It renders the UI. It has no access to the filesystem or native APIs by default. Think of it as the runtime game side - it only knows what you explicitly give it.

The preload script is a small bridge between the two. It runs before the renderer’s page loads, in a context that can reach a limited set of Electron and Node.js APIs, and its only job is to expose a controlled, safe API from the main process to the renderer. It does that through Electron’s contextBridge API.

In practice: whenever the UI needs to do something that touches the filesystem or spawns an ImageMagick process, it calls through IPC (inter-process communication) to the main process, which does the actual work and sends the result back. The renderer never touches files directly.

This is actually a clean architectural separation once you internalize it. The UI just asks for things, the backend does them. It maps reasonably well to how you’d separate gameplay logic from engine systems in a well-structured game codebase.

Folder structure

Rather than letting the scaffold dictate the structure, I set up the folder layout from the spec upfront:

electron/          - main process and preload script
src/shared/        - types and constants shared by both processes
src/main/          - Node.js business logic (pipeline, registry, IPC handlers)
src/renderer/      - Svelte application (all the UI)
node-definitions/  - JSON node descriptor files (loaded at runtime)

The src/shared/ folder is particularly important. Any type that crosses the IPC boundary - node definitions, graph state, pipeline progress events - is defined there so both sides of the app stay in sync. TypeScript catches mismatches at compile time, which saves a lot of runtime debugging.

First working state

The milestone for this phase was an Electron window showing a working Svelte Flow canvas with a node I could drag around. Nothing processed images yet - the goal was just confirming the full stack was correctly wired and hot-reload was working across all three bundles.

Once that was running, the foundation was solid enough to start building real features on top of.

Next post in this series: Building imgplex: part 3 - The node definition system

Building imgplex: part 1

Fri, 20 Mar 2026 00:00:00 GMT

imgplex is a node-based batch image workflow creator and processor. All the processing is handled by ImageMagick; imgplex just builds the string of commands to pass on to ImageMagick.

Disclaimer: imgplex has been developed with help from AI tools. But this isn’t a ‘vibe coded’ project - AI didn’t write all of the code, and a lot of thought, research, and planning went into development of this application to keep the development properly planned and organized. A lot of effort has also gone into keeping the UX and batch processing performance optimal. This series of blog posts will describe the development process of imgplex.

The why

In game development we do a lot of texture processing as part of the content pipeline, whether it is creating textures or processing textures from asset packs. As a technical artist I’ve made a bunch of tools of various sorts to automate the process as much as possible, but the use cases are far too different to be covered effectively by a few tools.

With the advent of AI tools it has become easier to make purpose-built tools that each solve a specific problem, but the result rapidly becomes a jumble of single-job tools that are scattered and undocumented.

I’ve used ImageMagick as a solution for some of these problems. It is incredibly powerful, and you can pipe commands to make complex operations happen in one go. But it has the limitations of being a command line tool: scary for non-technical people, and no preview stage before processing. A lot of image processing we do is visual in nature, and command line tools don’t give interactive preview controls that artists love to use. I also want something artists can use independently, without needing a technical person or an AI tool.

The what

I decided to build the tool on top of ImageMagick since it can do literally everything we need in terms of image processing. The idea was to use a node based editor to generate ImageMagick commands. Node based editors are common in game development workflows: we have them for shaders, VFX, and procedural content, and artists are already comfortable with them (Shader Graph and VFX Graph, Houdini, Substance Designer, Blender’s geometry nodes, etc.).

I also happen to have a lot of experience with image editing and node based tools as a result of my game development career, and it seemed like a natural fit.

A node based image processing workflow editor would be easy to extend as well: a node is just a representation of an ImageMagick command. I also decided that the node definitions should be kept as JSON files, so that users can add nodes without the application needing a recompile.
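
To make the idea concrete, here is a minimal sketch of how a JSON node definition could be turned into ImageMagick arguments. The schema (`id`, `label`, `args`, `params`) and the `buildArgs` helper are hypothetical, invented for illustration; the actual imgplex format is covered in Part 3. The only real ImageMagick detail used is the `-gaussian-blur radiusxsigma` option.

```typescript
// Hypothetical node definition shape - NOT imgplex's actual schema.
interface ParamDef {
  name: string;
  default: string | number;
}

interface NodeDef {
  id: string;
  label: string;
  // ImageMagick argument template; {name} placeholders are filled from params.
  args: string[];
  params: ParamDef[];
}

// What a user-supplied JSON file could contain (assumed layout).
const blurJson = `{
  "id": "blur",
  "label": "Gaussian Blur",
  "args": ["-gaussian-blur", "{radius}x{sigma}"],
  "params": [
    { "name": "radius", "default": 0 },
    { "name": "sigma", "default": 2 }
  ]
}`;

// Fill the argument template with user values, falling back to defaults.
function buildArgs(
  def: NodeDef,
  values: Record<string, string | number> = {}
): string[] {
  const resolved: Record<string, string | number> = {};
  for (const p of def.params) {
    resolved[p.name] = values[p.name] ?? p.default;
  }
  return def.args.map((arg) =>
    arg.replace(/\{(\w+)\}/g, (_match: string, name: string) => {
      if (!(name in resolved)) throw new Error(`Unknown parameter: ${name}`);
      return String(resolved[name]);
    })
  );
}

// Loading the definition is just JSON.parse - no recompile involved.
const blurDef = JSON.parse(blurJson) as NodeDef;
const blurArgs = buildArgs(blurDef, { sigma: 4 });
```

With a layout like this, adding a new node means dropping in another JSON file: the application only needs to know how to fill templates, not what each ImageMagick option does.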

There’s another advantage to this approach: since the tool only defines the workflow and doesn’t handle the image processing itself, I can run the processing headlessly in automated workflows by exporting the commands from the workflow to a script.
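
As a rough sketch of what such an export could look like, the snippet below turns a list of input/output pairs and a set of ImageMagick arguments into a POSIX shell script. The `Job` type and the `shQuote` and `exportScript` helpers are hypothetical names for illustration, not imgplex’s actual exporter; it assumes ImageMagick 7, where the CLI entry point is `magick`.

```typescript
// Quote a value for a POSIX shell by wrapping it in single quotes;
// embedded single quotes are closed, escaped, and reopened.
function shQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Hypothetical description of one file to process.
interface Job {
  input: string;
  output: string;
}

// Emit one `magick` invocation per job, all sharing the same arguments.
function exportScript(jobs: Job[], args: string[]): string {
  const lines = jobs.map((job) =>
    ["magick", shQuote(job.input), ...args.map(shQuote), shQuote(job.output)].join(" ")
  );
  // `set -e` stops the script at the first failing command.
  return ["#!/bin/sh", "set -e", ...lines].join("\n") + "\n";
}

const script = exportScript(
  [
    { input: "textures/rock albedo.png", output: "out/rock albedo.png" },
    { input: "textures/moss.png", output: "out/moss.png" },
  ],
  ["-resize", "50%"]
);
```

Quoting every argument matters here because texture file names often contain spaces, and an exported script has to survive being run unattended.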

The how

With the plan decided, I started research on the tech stack to make the tool. The core tension was native performance vs development speed: I wanted to build the tool, not fight the tooling.

I evaluated several approaches:

C++ with Qt was the obvious professional choice: great node editor libraries, excellent performance. But learning a new language and framework simultaneously (I’m experienced with C# and Python, but not much with C++) alongside a complex project was a recipe for never shipping anything.
Tauri looked promising: lightweight, Rust backend, web frontend. But two concrete blockers killed it: an IPC bottleneck, where serialising image data as strings benchmarked at ~200ms per 3MB, and the lifecycle complexity of managing a separate sidecar process for the image processing.
I gave serious thought to using Godot: GPU shaders for preview would be great. It was rejected because desktop UI infrastructure (inputs, dropdowns, file dialogs) would take weeks to build.
Electron won. It does get criticism for bundle size and memory overhead, but for a professional tool that will handle gigabytes of image data, a 200MB install and some extra RAM usage are irrelevant. What matters is: consistent rendering across platforms, Node.js built in (no sidecar needed), and a mature ecosystem. I could get something working really quickly and iterate from there.
For the node graph editor, Svelte Flow beat out LiteGraph.js - the library used by ComfyUI - because LiteGraph’s original repo hasn’t been maintained in years and ComfyUI operates off a divergent fork. Svelte Flow is MIT-licensed, actively maintained, and integrates natively with the UI framework.
For the UI framework, I chose Svelte 5 over React: less boilerplate, no virtual DOM, and native integration with Svelte Flow’s $state.raw requirement for graph state.

Next post in this series: Building imgplex: part 2 - Getting things up and running