Shell: Spawn a Program in Background that Reads from STDIN Indefinitely

April 26, 2022

Categories: Technical Tags: hugo linux shell

This wasn't a contrived requirement, rather one that came up in real life.

This blog used to be a 100% JavaScript-free static site, generated with Hugo and styled with 100% "normal" CSS. Why? Because I hated JavaScript, and I hated external dependencies. There was also an ideological angle to it: back then I was a full-on minimalism hipster and everything was bloat. Now I don't really hate JavaScript any more. I think modern JS is a much better language than, say, Python. But I still hate the tooling and the ecosystem. That is beginning to change, however, with things like esbuild, which is genuinely incredible (and, ironically, written in Go). I have also maybe drunk the Kool-Aid and switched to Tailwind. I wouldn't want to justify that choice beyond saying that Tailwind is still semantically very close to CSS, and for me it's purely a productivity choice.

One great aspect of esbuild is that it's a single binary. This has allowed me to completely ditch npm/yarn as build tools; I would much rather deal in shell scripts and Makefiles. You can couple them with any off-the-shelf inotify and live-reload tools and have a complete dev experience targeting the browser without involving nodejs. For example, I personally use these two sister projects: modd (inotify) and devd (livereload and basic reverse proxy). Some other big projects like Phoenix have ditched nodejs on the basis of esbuild. Phoenix even has a semi-official plugin for Tailwind, which is basically a wrapper around the official standalone tailwind binary. And that official tailwind binary is basically tailwind and some plugins bundled with nodejs using Vercel's pkg.

Is that really nodejs-free if we are bundling nodejs? Well, this post isn't about standalone native binaries; it's about generic CLI programs that run in a shell. That can include programs that run on nodejs and are invoked from npm or whatever, so technically all of this is irrelevant. But it's the setup for explaining what I have been doing. I was trying to write a shell script that i) spawns esbuild in watch mode in the background, ii) spawns tailwind in watch and JIT mode in the background, iii) starts hugo server in the foreground, and iv) once hugo is manually exited with C-c, cleans up the background processes.

Three different shell sessions for these three programs would make that preamble redundant, but I do want to see errors from each program in the same place, hence the same shell session. But it still shouldn't be a problem, right?

Why not just spawn them in background normally?

Like,

esbuild js/index.jsx --target=es2020 --outdir=dist --sourcemap=inline --watch &

tailwindcss --config=tailwind.config.js --input=css/tailwind.css --output=dist/app.css --watch &

hugo server --watch # runs in foreground as you work

# cleanup
pkill esbuild
pkill tailwindcss

This doesn't work because of special semantics around STDIN. These daemon programs must be able to read from STDIN (the why will be explained shortly). But in a shell, commands that read from the terminal cannot usefully be sent to the background, because only the foreground process group is allowed to read from the controlling terminal. If a background process even tries, the kernel sends it SIGTTIN, whose default action is to stop the process. Here is a good read:

http://curiousthing.org/sigttin-sigttou-deep-dive-linux

In short, doing it like the above won't work, because the esbuild and tailwind processes immediately end up in the stopped state (T).

Okay, why not just specify no STDIN, either by redirecting it from /dev/null or by running under nohup?

Like,

esbuild js/index.jsx --target=es2020 --outdir=dist --sourcemap=inline --watch </dev/null &

tailwindcss --config=tailwind.config.js --input=css/tailwind.css --output=dist/app.css --watch </dev/null &

# and so on

Using /dev/null as input is interesting. Any read from it returns EOF immediately, so the program's input is no longer tied to the terminal. This is better than before because the program is still running. It is also used a lot when launching programs in the background, so that they never compete for terminal input (without </dev/null the program could still try to read from the terminal despite being backgrounded).
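
The immediate EOF is easy to check from a prompt. This is only a minimal sketch using `wc` and the shell's `read`; the variable name `bytes` is illustrative.

```shell
# Reading from /dev/null hits end-of-file on the very first read,
# so nothing is ever counted.
bytes=$(wc -c </dev/null | tr -d ' ')
echo "bytes read: $bytes"

# `read` returns a non-zero status when it reaches EOF before any input.
if read -r line </dev/null; then
    echo "got a line"
else
    echo "EOF immediately"
fi
```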

Unfortunately, when it comes to esbuild and tailwind (and probably many other daemons like them), we can't do that, because to them STDIN has a special meaning. As long as STDIN stays open (no EOF), they take it to mean that their parent process is still alive. This convention came up because quite often a parent dies without managing its children's life cycle, which leaves dangling processes behind. As far as I know, there is no standard (e.g. POSIX-specified) mechanism for notifying a child of its parent's death. A common solution involves maintaining a pipe between parent and child. When the parent's end closes, the child receives EOF when reading. This way child processes can clean themselves up knowing the parent is dead, if the convention is followed.
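
The convention can be imitated in plain shell with an anonymous pipe. In this sketch, `child` is a hypothetical stand-in for a daemon that follows the convention, and `sleep 1` plays a parent that holds the write end of the pipe for a second and then dies; neither is how esbuild or tailwind implement it internally.

```shell
# Stand-in for a daemon that follows the convention: it blocks reading
# stdin and treats EOF as "my parent is gone".
child() {
    cat >/dev/null      # returns only once every writer of the pipe has exited
    echo "stdin closed, cleaning up"
}

# Stand-in parent: holds the write end of the pipe for one second, writes
# nothing, then exits. Only then does the child's read see EOF.
sleep 1 | child
```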

It's a common convention, though by no means universal. In fact neither esbuild nor tailwind followed it until José Valim's recent PRs to both of these projects, made to get them to work reliably under Phoenix. The PR for esbuild is a particularly great read on this matter:

https://github.com/evanw/esbuild/pull/1449

In short, these processes need to be able to read from STDIN indefinitely without ever seeing EOF. So </dev/null doesn't work.

What about </dev/zero?

This actually works. But I also get terrible CPU usage from both of them. I think that's because they keep trying to read, and /dev/zero happily keeps sending nonsense, which is immediately discarded because the content itself doesn't really matter. The whole thing is still a lot of unnecessary work. It's best if there is nothing to read, so that the reading thread stays blocked.
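
To see why a read loop on /dev/zero would spin, note that it never blocks and never reports EOF: a reader gets as many zero bytes as it asks for. A minimal sketch (the byte count is arbitrary):

```shell
# Ask for a million bytes; /dev/zero hands them over without ever waiting.
count=$(head -c 1000000 /dev/zero | wc -c | tr -d ' ')
echo "read $count bytes without blocking"
```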

My Solution

Not sure how you would do it, but blocking on read immediately reminded me of FIFOs (named pipes). However, opening a FIFO for reading blocks until some process has it open for writing, so a program started with its STDIN redirected from a writer-less FIFO doesn't even start. We need a bogus writer process first.
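
That blocking open can be observed directly. The sketch below assumes GNU coreutils' `timeout` is available; the temporary paths and variable names are illustrative.

```shell
# Throwaway FIFO in a temporary directory.
dir=$(mktemp -d)
fifo="$dir/demo"
mkfifo "$fifo"

# No writer: the inner shell blocks while opening the FIFO for the
# redirection, so `cat` is never even started. GNU `timeout` gives up
# after one second and reports exit status 124.
no_writer=0
timeout 1 sh -c 'cat <"$1"' sh "$fifo" || no_writer=$?
echo "no writer: exit status $no_writer"

# A writer that holds the other end open for a second and writes nothing.
sleep 1 >"$fifo" &

# Now the open succeeds, `cat` starts, blocks in read(), and sees EOF
# once the writer exits.
with_writer=0
timeout 5 sh -c 'cat <"$1"' sh "$fifo" || with_writer=$?
echo "with writer: exit status $with_writer"

wait
rm -r "$dir"
```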

fifo=/tmp/silly

[ -p "$fifo" ] || mkfifo "$fifo"
/usr/bin/sleep infinity </dev/null >"$fifo" &

Notice the absolute path: we don't want a shell builtin. We also definitely want sleep to read from /dev/null, so that it has no claim on the terminal's input.

And now, all we need to do is to read that FIFO from the background processes instead.

esbuild js/index.jsx --target=es2020 --outdir=dist --sourcemap=inline --watch <"$fifo" &

tailwindcss --config=tailwind.config.js --input=css/tailwind.css --output=dist/app.css --watch <"$fifo" &

hugo server --watch # runs in foreground as you work

# cleanup
# just killing the sleep closes the pipe's write end, which in turn sends EOF to the reading processes
# esbuild and tailwind now know to clean themselves up when that happens!
kill %1
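
As an aside, the same idea can be sketched with the holder's PID taken from `$!` and the cleanup moved into a trap, so it also runs if the script exits early. This is a sketch under assumptions, not the script used for this blog: `reader` is a hypothetical stand-in for esbuild/tailwindcss in watch mode, `sleep 1` stands in for `hugo server --watch`, and the function names are made up.

```shell
# Stand-in for a watcher that exits when its stdin reaches EOF.
reader() {
    cat >/dev/null                      # blocks until the FIFO has no writer left
    echo "$1: stdin closed, exiting"
}

# The body runs in a subshell (note the parentheses), so the traps and
# background jobs stay local to one session.
dev_session() (
    workdir=$(mktemp -d)
    fifo="$workdir/stdin"
    mkfifo "$fifo"

    # Holder: keeps the write end open and never writes. `env` forces the
    # external sleep binary without hard-coding its path.
    env sleep infinity </dev/null >"$fifo" &
    holder=$!

    cleanup() {
        kill "$holder" 2>/dev/null      # last writer gone, readers get EOF
        wait                            # let the readers finish up
        rm -rf "$workdir"
    }
    trap cleanup EXIT
    trap 'exit 130' INT TERM            # route signals through the EXIT trap

    reader esbuild  <"$fifo" &
    reader tailwind <"$fifo" &

    sleep 1                             # the foreground program goes here
)

dev_session
```

With the real commands swapped in for `reader` and `sleep 1`, the two "stdin closed" lines correspond to esbuild and tailwind noticing EOF and shutting themselves down.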

The cascade after kill %1 is a nice touch. In general one might want to play it safe and do:

kill $(jobs -p)

# or

jobs -p | xargs kill

And while that works in bash, I notice it doesn't work in (m)ksh. Neat reminder that life sucks.
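
A variation that avoids `jobs` entirely is to record each PID from `$!` as the job is started. `$!` is specified by POSIX, so this should not depend on how a given shell implements `jobs -p`. The `sleep 30` commands below are stand-ins for the real background programs, and `pids` is an illustrative name.

```shell
pids=""

# Start each background program and remember its PID right away.
sleep 30 & pids="$pids $!"
sleep 30 & pids="$pids $!"

# $pids is left unquoted on purpose so it splits into one argument per PID.
kill $pids
wait    # reap the children; both are gone after this
```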