Hunspell is a free spell checker and morphological analyzer library
and command-line tool, released under an LGPL/GPL/MPL tri-license.

Hunspell is used by the LibreOffice office suite, by free browsers such
as Mozilla Firefox and Google Chrome, and by other tools and operating
systems, such as Linux distributions and macOS. It is also a
command-line tool for Linux, Unix-like and other operating systems.

It is designed for fast, high-quality spell checking and correction
for languages with word-level writing systems, including languages
with rich morphology, complex word compounding and character encoding.

Hunspell interfaces: an Ispell-like terminal interface using the Curses
library, an Ispell pipe interface, and C++/C APIs with a shared library;
language bindings also exist for other programming languages.

Hunspell's code base comes from OpenOffice.org's MySpell library,
developed by Kevin Hendricks (originally a from-scratch C++
reimplementation of the spell checking and affixation of Geoff
Kuenning's International Ispell, later extended with, e.g., n-gram
suggestions). See http://lingucomponent.openoffice.org/MySpell-3.zip
and its README, CONTRIBUTORS and license.readme (here: license.myspell)
files.

Main features of Hunspell library, developed by László Németh:

- Highly customizable suggestions: word-part replacement tables and
  stem-level phonetic and other alternative transcriptions to recognize
  and fix all typical misspellings, avoid suggesting offensive words, etc.
- Complex morphology: dictionary and affix homonyms; twofold affix
  stripping to handle inflectional and derivational morpheme groups for
  agglutinative languages, like Azeri, Basque, Estonian, Finnish, Hungarian,
  Turkish; 64 thousand affix classes with an arbitrary number of affixes;
  conditional affixes, circumfixes, fogemorphemes, zero morphemes,
  virtual dictionary stems, forbidden words to avoid overgeneration, etc.
- Handling complex compounds (for example, for Finno-Ugric, German and
  Indo-Aryan languages): recognizing compounds made of an arbitrary
  number of words, handling affixation within compounds, etc.
- Custom dictionaries with affixation
- Morphological analysis (in custom item and arrangement style)
- Morphological generation
- SPELLML XML API over the plain spell() API function for easier integration
  of stemming, morphological generation and custom dictionaries with affixation
- Language-specific algorithms, like special casing of Azeri or Turkish
  dotted i and German sharp s, and special compound rules of Hungarian.

Main features of Hunspell command line tool, developed by László Németh:

- Reimplementation of the quick interactive interface of Geoff Kuenning's Ispell
- Parsing formats: text, OpenDocument, TeX/LaTeX, HTML/SGML/XML, nroff/troff
- Custom dictionaries with optional affixation, specified by a model word
- Multiple dictionary usage (for example `hunspell -d en_US,de_DE,de_medical`)
- Various filtering options (bad or good words/lines)
- Morphological analysis (option `-m`)
- Stemming (option `-s`)

See `man hunspell`, `man 3 hunspell`, `man 5 hunspell` for the complete manual.

Build-only dependencies:

    g++ make autoconf automake autopoint libtool

|               | Mandatory        | Optional         |
|---------------|------------------|------------------|
|hunspell tool  | libiconv gettext | ncurses readline |

# Compiling on GNU/Linux and Unixes

We first need to download the dependencies. On Linux, `gettext` and
`libiconv` are part of the standard library. On other Unixes we
need to manually install them.

    sudo apt install autoconf automake autopoint libtool

Then run the following commands:

For dictionary development, use the `--with-warnings` option of

For interactive user interface of Hunspell executable, use the

Optional developer packages:

- ncurses (needed for `--with-ui`), e.g. libncursesw5 for UTF-8
- readline (for fancy input line editing, configure parameter:

In Ubuntu, the packages are:

    libncurses5-dev libreadline-dev

# Compiling on OSX and macOS

On macOS, always use `clang` as the compiler and not `g++`, because
Homebrew dependencies are built with it.

    brew install autoconf automake libtool gettext
    brew link gettext --force

Then run autoreconf, configure, make. See above.

# Compiling on Windows

## Compiling with Mingw64 and MSYS2

Download Msys2, update everything and install the following

    pacman -S base-devel mingw-w64-x86_64-toolchain mingw-w64-x86_64-libtool

Open Mingw-w64 Win64 prompt and compile the same way as on Linux, see

## Compiling in Cygwin environment

Download and install Cygwin environment for Windows with the following

- gcc-g++ development package
- ncurses, readline (for user interface)
- iconv (character conversion)

Then compile the same way as on Linux. Cygwin builds depend on

It is recommended to install a debug build of the standard library:

For debugging we need to create a debug build and then we need to start

    ./configure CXXFLAGS='-g -O0 -Wall -Wextra'
    ./libtool --mode=execute gdb src/tools/hunspell

You can also pass the `CXXFLAGS` directly to `make` without calling
`./configure`, but we don't recommend this way during long development

If you would like to develop and debug with an IDE, see the documentation at
https://github.com/hunspell/hunspell/wiki/IDE-Setup

Testing Hunspell (see tests in tests/ subdirectory):

or with Valgrind debugger:

    VALGRIND=[Valgrind_tool] make check

    VALGRIND=memcheck make check

features and dictionary format:

http://hunspell.github.io/

After compiling and installing (see INSTALL) you can run the Hunspell
spell checker (compiled with user interface) with a Hunspell or Myspell

    hunspell -d en_US text.txt

or without interface:

    hunspell -d en_GB -l <text.txt

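Because `-l` prints only the misspelled words it finds, the result can be post-processed with ordinary text tools. The following is a minimal sketch, assuming one word per line on standard input; the helper name `rank_misspellings` is hypothetical and not part of Hunspell.

```shell
# rank_misspellings: hypothetical helper, not part of Hunspell.
# Reads one word per line on stdin (the assumed shape of `hunspell -l`
# output) and prints "COUNT WORD" lines, most frequent first.
rank_misspellings() {
  sort | uniq -c | awk '{ print $1, $2 }' | sort -k1,1nr -k2,2
}
```

It could be used as `hunspell -d en_GB -l <text.txt | rank_misspellings`, provided the `en_GB` dictionary is installed where Hunspell can find it.
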
Dictionaries consist of an affix (.aff) and a dictionary (.dic) file. For
example, download the American English dictionary files of LibreOffice
(an older version, but with stemming and morphological generation) with

    wget -O en_US.aff 'https://cgit.freedesktop.org/libreoffice/dictionaries/plain/en/en_US.aff?id=a4473e06b56bfe35187e302754f6baaa8d75e54f'
    wget -O en_US.dic 'https://cgit.freedesktop.org/libreoffice/dictionaries/plain/en/en_US.dic?id=a4473e06b56bfe35187e302754f6baaa8d75e54f'

and with command line input and output, it's possible to check its work quickly,
for example with the input words "example", "examples", "teached" and
"verybaaaaaaaaaaaaaaaaaaaaaad":

    & teached 9 0: taught, teased, reached, teaches, teacher, leached, beached

    verybaaaaaaaaaaaaaaaaaaaaaad
    # verybaaaaaaaaaaaaaaaaaaaaaad 0

In the output, `*` and `+` mean correct (accepted) words (`*` = dictionary stem,
`+` = affixed form of the following dictionary stem), and
`&` and `#` mean bad (rejected) words (`&` = with suggestions, `#` = without suggestions).

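The four markers can be tallied mechanically. The sketch below is one possible way to do it, assuming result lines shaped like the sample output (`& WORD COUNT OFFSET: suggestions` and `# WORD OFFSET`); the helper name `summarize_results` is hypothetical and not part of Hunspell.

```shell
# summarize_results: hypothetical helper, not part of Hunspell.
# Reads result lines on stdin and interprets only the four markers
# described above; any other line (banner, blank line) is ignored.
summarize_results() {
  awk '
    /^[*]/  { ok++; next }            # accepted: dictionary stem
    /^[+] / { ok++; next }            # accepted: affixed form of a stem
    /^& /   {                         # rejected, with suggestions
      bad++
      list = $0
      sub(/^[^:]*: /, "", list)       # keep the text after "COUNT OFFSET: "
      print "rejected: " $2 " -> " list
      next
    }
    /^# /   {                         # rejected, without suggestions
      bad++
      print "rejected: " $2 " (no suggestions)"
      next
    }
    END { printf "%d accepted, %d rejected\n", ok, bad }
  '
}
```

For example, `printf 'example\nteached\n' | hunspell -d en_US -a | summarize_results` would feed the Ispell pipe interface output into the helper, assuming the `en_US` dictionary is available.
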
Example for stemming:

    $ hunspell -d en_US -s

Example for morphological analysis (very limited with this English dictionary):

    $ hunspell -d en_US -m
    cats st:cat ts:0 is:Ns
    cats st:cat ts:0 is:Vs

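Each analysis line carries whitespace-separated `key:value` fields after the word, so a single field can be pulled out with a short filter. This is only a sketch that assumes the layout shown in the sample output; the helper name `morph_field` is hypothetical, and the meaning of each key is described in `man 5 hunspell`.

```shell
# morph_field: hypothetical helper, not part of Hunspell.
# Usage: morph_field KEY   (for example: morph_field st)
# Reads "WORD key:value ..." lines on stdin and prints "WORD VALUE"
# for every field whose key matches KEY.
morph_field() {
  awk -v key="$1" '
    {
      prefix = key ":"
      for (i = 2; i <= NF; i++)
        if (index($i, prefix) == 1)
          print $1, substr($i, length(prefix) + 1)
    }
  '
}
```

Applied to the two `cats` lines above, `morph_field is` would print `cats Ns` and `cats Vs`.
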
The src/tools directory contains the following executables after compiling.

- The main executable:
    - hunspell: main program for spell checking and others (see
    - analyze: example of spell checking, stemming and morphological
    - chmorph: example of automatic morphological generation and
    - example: example of spell checking and suggestion
- Tools for dictionary development:
    - affixcompress: dictionary generation from large (millions of
    - makealias: alias compression (Hunspell only, not back compatible
    - wordforms: word generation (Hunspell version of unmunch)
    - hunzip: decompressor of hzip format
    - hzip: compressor of hzip format
    - munch (DEPRECATED, use affixcompress): dictionary generation
      from vocabularies (it needs an affix file, too).
    - unmunch (DEPRECATED, use wordforms): list all recognized words
      of a MySpell dictionary

Example for morphological generation:

    $ ~/hunspell/src/tools/analyze en_US.aff en_US.dic /dev/stdin
    generate(cat, mice) = cats
    generate(mouse, cats) = mice
    generate(mouse, cats) = mouses

# Using Hunspell library with GCC

Including in your program:

    #include <hunspell.hxx>

Linking with Hunspell static library (libraries go after the source
file, because the linker resolves symbols in command-line order):

    g++ example.cxx -lhunspell-1.7
    # or better, use pkg-config
    g++ example.cxx $(pkg-config --cflags --libs hunspell)

Hunspell (MySpell) dictionaries:

- https://wiki.documentfoundation.org/Language_support_of_LibreOffice
- http://cgit.freedesktop.org/libreoffice/dictionaries
- http://extensions.libreoffice.org
- http://extensions.openoffice.org
- http://wiki.services.openoffice.org/wiki/Dictionaries

Aspell dictionaries (conversion: man 5 hunspell):

- ftp://ftp.gnu.org/gnu/aspell/dict

László Németh, nemeth at numbertext org