An Old Question
What’s the difference between aspell, ispell, myspell, gspell, gtkspell, hunspell, uspell, hspell? The answer always was: I dread to even care to know. Just one of those things it’s better to be willfully ignorant of.
Then suddenly I needed to know.
Spell checking for my native chat client
Lately I have grown very fond of weechat. I am not exactly like this guy, but I have always had an affinity for IRC. Is it dying? Sure. But a lot of the chat services coming into vogue are just well-embellished IRC, so it probably doesn’t take much to bridge the gap. Gitter works seamlessly here. Slack offers a gateway too, I think. If you use BitlBee then suddenly you have an array of protocols supported through plugins. And with the much anticipated Matrix protocol in the offing, the possibilities are vast and weechat has a role to play right now.
But that’s IRC; I love weechat in particular. It is software after my own heart, very modular and customisable, and it can be extended through the fantastic scripting support (with Perl, Python, Ruby, Lua, Guile and Tcl). I spent a bit of last night getting notifications to work through Dunst (and libnotify). There are a fair number of options, but I went with notifym.pl. It’s simple, powerful and even kinda fun.
It’s no wonder that the spell checking support is also delegated to a plugin. Now, I tried weechat in Manjaro and it worked out of the box, but my daily driver is Void Linux, and one of the things I love it for is its package granularity. It was hardly surprising that the plugins are neatly separated into subpackages. You only install what you need, and the dependencies are automatically accounted for. So I installed the plugin and idly observed that a new package called enchant was also installed.
Except it didn’t work. Doing /aspell listdict shows that it can’t even find a dictionary, but I could swear that I have the packages aspell and aspell-en installed. Doing aspell dicts in the shell reveals that I do have the dict files in the system and aspell can find them. In fact I use them in Emacs, so why can’t weechat find them? This reeks of a path issue, but my system places the dict files in /usr/share/dict, which is pretty standard, right?
The thick plottens
So what the hell? Instead of being logical, I went after a lot of red herrings, but that’s only obvious in retrospect. Not much was clear in the source of the plugin, except it seemed to be delegating the task of finding dictionaries to libenchant. More reading revealed enchant to be not a spellchecker in itself, but more of an intermediate layer that unifies all other incompatible programs under a stable API. Nice, now what’s wrong here?
Enchant was definitely not finding my aspell dicts. To drive home the point:
$ echo 'ths is misspllt' | enchant -a
@(#) International Ispell Version 3.1.20 (but really Enchant 1.6.0)
Couldn't create a dictionary for en_GB.UTF-8
$
Or in Python:
>>> import enchant
>>> enchant.list_dicts()
[]
>>>
I still didn’t know what to make of that. Also, why RTFM when you have got strace? /s
$ strace 2> log weechat
$ vim log
Strace notes down all the system calls a program makes during runtime. I asked the aspell plugin to list dictionaries once again, and promptly exited weechat. Then I went backward through the log and immediately found something:
open("/home/natrys/.config/enchant/myspell", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/home/natrys/.enchant/myspell", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/local/share/myspell/dicts", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/share/myspell/dicts", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/share/enchant/myspell", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/share/hunspell", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
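Scrolling backward through a long strace log works, but the failed lookups can also be filtered out mechanically. Here is a minimal sketch, assuming the log lines follow the open(...) = -1 ENOENT shape shown above (newer strace versions tend to show openat instead, so the pattern accepts both). The helper name missing_paths is made up for this post and is not part of strace or enchant.

```python
import re

# Matches open()/openat() calls that failed with ENOENT and captures the path.
# Assumption: lines look like the strace output quoted in this post.
ENOENT_CALL = re.compile(r'^(?:open|openat)\((?:[A-Z_]+, )?"([^"]+)".*= -1 ENOENT')


def missing_paths(log_lines):
    """Return the paths of failed open/openat calls, in order, without duplicates."""
    seen = []
    for line in log_lines:
        match = ENOENT_CALL.match(line.strip())
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


sample = [
    'open("/usr/share/myspell/dicts", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)',
    'open("/usr/share/hunspell", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)',
    'open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3',
]
for path in missing_paths(sample):
    print(path)
```

On a real log you would pass the open file object instead of the sample list, since iterating over a file yields its lines.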
So those are where it looked. I don’t know what myspell is, though I have heard of hunspell. The enchant manual said it supports aspell and myspell, but made no mention of hunspell. And also:
$ enchant-lsmod
myspell (Myspell Provider)
But my package manager shows there is no such thing as myspell in the Void repo. I hit Google; the entry in the Debian repo for it is a dummy transitional package for ... hunspell?! Wiki clarified that myspell was a predecessor of hunspell and the latter is backward compatible, so things are kinda rotten. I installed a hunspell dict and now everything works. Now I see that the Void wiki even has a related entry, but it’s typically a bit threadbare so I didn’t think of looking into it at first.
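In hindsight, a quick look at those directories would have answered the question. A hunspell (or myspell) dictionary ships as a pair of files, one .dic and one .aff, so a rough check is to see which of the probed directories exist and hold such pairs. This is only a sketch: the directory list is copied from the strace output earlier in this post, and find_dictionaries is a hypothetical helper, not anything enchant provides.

```python
import os

# System-wide directories taken from the strace output earlier in this post.
SEARCH_DIRS = [
    "/usr/local/share/myspell/dicts",
    "/usr/share/myspell/dicts",
    "/usr/share/enchant/myspell",
    "/usr/share/hunspell",
]


def find_dictionaries(directories):
    """Map each existing directory to the names that have both a .dic and an .aff file."""
    found = {}
    for directory in directories:
        if not os.path.isdir(directory):
            continue  # this is the ENOENT case from the strace log
        names = set(os.listdir(directory))
        found[directory] = sorted(
            name[:-4]
            for name in names
            if name.endswith(".dic") and name[:-4] + ".aff" in names
        )
    return found


if __name__ == "__main__":
    for directory, languages in find_dictionaries(SEARCH_DIRS).items():
        print(directory, languages or "(no dictionaries)")
```

An empty result means none of the directories exist, which is exactly the state my system was in before I installed a hunspell dict.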
So, just two questions remain. Why do the weechat people call it the ‘aspell’ plugin then? It makes no particular sense. I guess the answer is that initially they only supported aspell, hence the name, and at some later point they switched to libenchant.
And secondly, enchant should still work with aspell. Why does it not? The answer turned out to be in the template that builds enchant, specifically here:
configure_args="--disable-zemberek --disable-ispell --disable-aspell --with-myspell-dir=/usr/share/hunspell"
Ah, so my distro explicitly disables it. I found the exact pull request where that happened; it’s a couple of years old already. I suppose the reasoning provided does make sense.
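Flags like these are easy to miss in a long build template. As a small sketch, the line can be split apart to list what has been switched off. It assumes the configure_args="..." shell-assignment form quoted above and the usual --disable-NAME convention; disabled_features is a made-up helper name.

```python
import shlex


def disabled_features(template_line):
    """Return the NAME part of every --disable-NAME flag in a configure_args line."""
    # Drop the variable name; shlex removes the surrounding shell quotes.
    _, _, value = template_line.partition("=")
    flags = " ".join(shlex.split(value)).split()
    prefix = "--disable-"
    return [flag[len(prefix):] for flag in flags if flag.startswith(prefix)]


line = 'configure_args="--disable-zemberek --disable-ispell --disable-aspell --with-myspell-dir=/usr/share/hunspell"'
print(disabled_features(line))  # ['zemberek', 'ispell', 'aspell']
```

The remaining flag, --with-myspell-dir, also explains why /usr/share/hunspell showed up among the directories in the strace log.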
Here is a reddit thread that I wish I found sooner:
And for an alternative perspective which I am not sure what to make of:
In the end, things are still sometimes problematic. But solutions are usually also in plain sight.