Short Answer: Not one, but two wiped drives.
Yesterday I met someone who was about to board a flight to somewhere thousands of miles away, but not before they could extract data from an ancient-looking drive which turned out to have some bad sectors. They thanked me profusely as I broke out ddrescue and then rsync-ed the data to safety. But my smile was strained and my eyes haunted, as I reminisced about that fateful January day almost exactly two years ago. Time healed most of the tragedy, so I guess what's left is hilarity.
The title is sorta clickbait-y. It's true this wouldn't have happened if I hadn't added a trailing slash after the directory name (which is a perfectly normal thing to do). But the disaster also owes to at least two other questionable choices, none of them fatal on their own, that only became dangerous when combined.
The premise is similar. I had an external drive containing a folder which I wanted to sync with a local folder in my home dir. For full disclosure, it was 7 in the morning and I was sleep deprived, which is a recurring theme in all mess-ups. Here is how it unfolded:
I mounted the drive at ~/usb/ with gid=1000,uid=1000
Frankly, I don't remember why I did this. At home I am not root (I use sudo, or doas recently), and sometimes that gets in the way of convenience. I do remember doing something more than just syncing, so maybe that's why. Also note how the mount point is right in the home dir. If it had been somewhere isolated and sane like /media/usb/ then the disaster could have been partially avoided (though not fully, as you will see).
I used rsync with the --delete flag
Once again, this by itself was not the problem; it was even by design. The folder on the external drive (let's call it A) was the up-to-date version, to which we wanted to sync the local folder (B). This meant not just updating files that were old, but also deleting files that were in B but not in A.
No, the problem was blindly using a function from my ~/.bashrc that I must have written some time ago. When I wrote it, I probably thought either that it was self-explanatory or that I would always remember the pitfalls. But afterwards, not only did I not remember, I took the fact that "I wrote it" to be enough of a reason to completely turn off my paranoia.
my_sync ()
{
rsync -auP --exclude=.git --inplace --delete "$@"
}
The Crime Scene
I only did this:
cd usb/
my_sync folder/ ~/
What I expected: ~/usb/folder/ would replace or sync with ~/folder/, consistent with how mv or cp works.
What happened: Sheer carnage.
Actually, we can replicate the crime scene safely thanks to containers:
λ podman run --rm -it alpine:edge
/ # export PS1='\n\w $ '
/ $ apk update && apk add rsync tree
fetch http://dl-cdn.alpinelinux.org/alpine/edge/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/edge/community/x86_64/APKINDEX.tar.gz
v20200117-76-g414239ec20 [http://dl-cdn.alpinelinux.org/alpine/edge/main]
v20200117-77-g4963c038fa [http://dl-cdn.alpinelinux.org/alpine/edge/community]
OK: 11463 distinct packages available
(1/4) Installing libacl (2.2.53-r0)
(2/4) Installing popt (1.16-r7)
(3/4) Installing rsync (3.1.3-r3)
(4/4) Installing tree (1.8.0-r0)
Executing busybox-1.31.1-r8.trigger
OK: 6 MiB in 18 packages
/ $ my_sync() { rsync -aPu --inplace --delete "$@" ; }
/ $ cd ~
~ $ seq 1 3 | xargs touch # Create random files in $HOME
~ $ mkdir foo && touch foo/bar
~ $ mkdir usb && cd usb
~/usb $ mkdir foo && touch foo/baz
~/usb $ tree ~/
/root/
├── 1
├── 2
├── 3
├── foo
│   └── bar
└── usb
    └── foo
        └── baz
3 directories, 5 files
If one omits the slash, it works as intended:
~/usb $ my_sync foo ~/
sending incremental file list
deleting foo/bar
foo/baz
0 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=0/2)
~/usb $ tree ~/
/root/
├── 1
├── 2
├── 3
├── foo
│   └── baz
└── usb
    └── foo
        └── baz
3 directories, 5 files
But here is what happened instead with the slash (which your shell auto-completes to, I should add):
~/usb $ my_sync foo/ ~/
sending incremental file list
deleting usb/foo/baz
deleting usb/foo/
deleting usb/
deleting foo/bar
deleting foo/
deleting 3
deleting 2
deleting 1
deleting .ash_history
./
file has vanished: "/root/usb/foo/baz"
rsync warning: some files vanished before they could be transferred (code 24) at main.c(1189) [sender=3.1.3]
sh: getcwd: No such file or directory
…Just wow. Everything in the $HOME dir got recursively nuked, which incidentally also included the drive I had just mounted with user permissions. Both data and backup gone in one stroke. Double kill!
I can only thank my lucky stars that the original function included the -P flag, so I got an inkling of what was happening and could hit C-c… after recovering enough motor function from sheer mind-numbing disbelief and panic (which left ample time for significant damage).
But WHY?
Well, Unix may not be Plan9, but a directory is just a file. What's not a file is a process. Each process is king of its own meta realm, where it can make up its own rules. That said, almost all of them have agreed to follow certain conventions (often for defensive reasons), and one of them is to treat a trailing slash as an intent to differentiate a directory from a plain file. For example, say you want to do something like:
mv file dir/
But say you forgot to add the slash: if dir doesn't exist then file will be renamed to dir; it's not what you want, but the result is not catastrophic. If you added the slash and dir doesn't exist, then you will get a nice error like: mv: can't rename 'file': Not a directory. A slash on the source doesn't matter, and on the destination mv uses it only as a safety check.
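The mv behaviour described above is easy to check in a throwaway directory. This is a minimal sketch, assuming a POSIX shell; the names file1, file2 and nodir are made up for the demo, and the exact error text differs between mv implementations, so it is suppressed here.

```shell
#!/bin/sh
# Work in a scratch directory so nothing real is touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

touch file1 file2

# No trailing slash and "dir" does not exist:
# file1 is silently renamed to a plain file called "dir".
mv file1 dir
[ -f dir ] && echo "without slash: file1 became a plain file named 'dir'"

# Trailing slash and "nodir" does not exist:
# mv is expected to refuse, leaving file2 where it was.
if mv file2 nodir/ 2>/dev/null; then
    echo "with slash: mv succeeded (not what this post expects)"
else
    echo "with slash: mv refused, file2 is untouched"
fi
```

The point is only that the slash on the destination turns a silent rename into a loud failure.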
But our boy rsync uses this convention to add functional, not necessarily safety-related, semantics: a trailing slash on the destination doesn't matter, but on the source it does. A trailing slash means the contents of the source dir should be synced with the destination; only without one is the source dir itself (with its contents) synced.
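What happened above was simple: with the trailing slash, rsync treated the contents of ~/usb/foo/ (just baz) as what $HOME itself should contain. Nothing else in $HOME existed in the source, so --delete removed everything that's not ~/baz, which means everything, including the original ~/usb/ folder.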
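To see the two interpretations side by side without risking anything, the same layout can be rebuilt in a scratch directory and fed to rsync with --dry-run. This is a rough sketch, assuming rsync is installed; fake_home is a hypothetical stand-in for $HOME, and -v is only there so that rsync lists each planned deletion.

```shell
#!/bin/sh
# Scratch re-creation of the container layout; nothing outside $scratch is touched.
scratch=$(mktemp -d)
fake_home="$scratch/home"    # hypothetical stand-in for $HOME
mkdir -p "$fake_home/foo" "$fake_home/usb/foo"
touch "$fake_home/1" "$fake_home/2" "$fake_home/3"
touch "$fake_home/foo/bar" "$fake_home/usb/foo/baz"

cd "$fake_home/usb" || exit 1

# -n is --dry-run: rsync only reports what it would do.
no_slash=$(rsync -anv --delete foo "$fake_home/")
with_slash=$(rsync -anv --delete foo/ "$fake_home/")

echo "== source 'foo' (the directory itself) =="
printf '%s\n' "$no_slash" | grep '^deleting'

echo "== source 'foo/' (only its contents) =="
printf '%s\n' "$with_slash" | grep '^deleting'
```

The first listing should stay confined to foo/, while the second should cover everything in the stand-in home directory, the mount point included. Because of the dry run, all files are still there afterwards.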
Post-Mortem
It was quite traumatic for me. What I took away from it was:
- Rsync is not a mv or cp replacement for me (though the progress indicator is admittedly a real improvement over those two). And for the use cases where it shines, well, I use syncthing for dropbox purposes and restic for backup.
- Mistakes will happen, try as I might to avert them. But that's why rsync has a --dry-run flag; it was idiotic not to try that first.
- The notion that "I wrote it so it must be fine" is a nice way to bypass your useful scepticism. Don't fall for that.
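One way to bake the dry-run habit into the function itself is sketched below. This is not what I had back then; my_sync_checked is a hypothetical name, and the flags are simply the ones from the original function with a preview pass and a confirmation prompt added in front.

```shell
# Hypothetical variant of my_sync: preview first, then ask before doing it for real.
my_sync_checked() {
    # First pass: same flags as the original function plus --dry-run, so rsync
    # only reports what it would transfer or delete. stdin is redirected so the
    # preview cannot swallow the answer to the prompt below.
    rsync -auP --exclude=.git --inplace --delete --dry-run "$@" </dev/null || return 1

    printf 'Run this for real? [y/N] '
    read -r answer
    case "$answer" in
        y|Y) rsync -auP --exclude=.git --inplace --delete "$@" ;;
        *)   echo "Aborted; nothing was changed." ;;
    esac
}
```

Usage is the same as before, e.g. my_sync_checked folder ~/. A wall of "deleting" lines in the preview is the cue to answer n. It does not make the trailing slash any less surprising, it only puts a speed bump in front of it.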