3rdparty/hunspell/1.7.0/NEWS

   1 2018-11-12: Hunspell 1.7.0 release:
   2
   3   New features and bug fixes by László Németh, supported by FSF.hu Foundation:
   4
   5   - No more annoyingly long suggestion times, especially in languages with
   6     compound word handling and complex morphology. With balanced
   7     multi-level time limits, suggestions are now guaranteed
   8     within half a second, not seconds (or dozens of seconds or more
   9     in extreme cases), even for longer misspellings.
  10
  11   - add SPELLML support for run-time dictionary extension with optional
  12     affixation of user words. See new "Grammar By" feature of
  13     language-specific user dictionaries of LibreOffice 6.0:
  14
  15     News: https://wiki.documentfoundation.org/ReleaseNotes/6.0#.E2.80.9CGrammar_By.E2.80.9D_spell_checking
  16
  17     Screencast with English example: https://www.youtube.com/watch?v=EsS3gaBTfOo
  18
  19     Screencast with German example: https://www.youtube.com/watch?v=aYVFDqCUb6I
  20
  21   - Improved, highly customizable suggestions at the level of dictionary words:
  22     pronunciations and typical misspellings defined by optional "ph:" fields of
  23     the dictionary words are used not only in n-gram suggestions, but also as
  24     elements of the REP replacement list, getting the highest priority in normal
  25     suggestions and giving the best suggestions for short words, too.
  26     More information: see "ph:" in man 5 hunspell.
  27
  28   - Handling multiple word suggestions is much easier. As in a
  29     traditional spelling dictionary, for example, to get the correct suggestion
  30     "a lot" for the typical misspelling "alot" in first place, it's now
  31     enough to add the following line to the dic(tionary) file:
  32
  33     a lot
  34
  35   - Limit compound overgeneration by dictionary based word pairs:
  36     Now it's possible to filter bad compound words by listing
  37     the correct word pairs with space in the dictionary, as in a traditional
  38     spelling dictionary.
  39
  40   - clean-up suggestion:
  41
  42     - no n-gram and compound word suggestions, if a "good" suggestion
  43       exists, i.e. uppercase, REP, ph: or dictionary word pair suggestions
  44
  45     - word pairs are always suggested, if they exist in the dic file
  46
  47     - word pairs have top priority in suggestions, and
  48       these are the only suggestions if there is no other good suggestion.
  49
  50     - dictionary word pairs separated by a dash instead of a space
  51       are also handled specially in two-word suggestions (depending on the
  52       language)
  53
  54    - limit bad suggestions by improved n-gram suggestion rules:
  55
  56      don't suggest capitalized dictionary words for lower
  57      case misspellings in n-gram suggestions, except
  58
  59      - PHONE usage, or
  60      - in the case of German, where not only proper
  61        nouns are capitalized, or
  62      - the capitalized word has special pronunciation
  63
  64      and don't suggest a word if its length differs from that of the
  65      misspelling by 5 or more characters.
  66
  67   - Extend dotless i and dotted I rules to the Crimean Tatar language:
  68     allow dotted I in the dictionary, and disable bad capitalization of i.
  69
  70   - BREAK: extended recursive word breaking algorithm to handle words or
  71     words with suffixes when they already contain word break characters,
  72     for example, "e-mail" is a dictionary word with a word break character, and
  73     it wasn't accepted before in compounds in some languages.
  74
  75   - FORBIDDENWORD precedes BREAK: Now it's possible to forbid compound
  76     forms recognized by BREAK word breaking by adding the bad compounds to
  77     the dictionary with FORBIDDENWORD flags.
  78
  79   - lower limit for "doubletwochars" suggestion algorithm:
  80     one of the typical misspellings recognized by the Hunspell suggestion
  81     mechanism is syllable duplication. Alongside the old pattern
  82     ABABA -> ABA, for example nutrITITIon -> nutrITIon, now also the
  83     simpler ABAB -> AB pattern is recognized in non-starting position,
  84     for example, regretTETEd -> regretTEd.
  85
  86   - lower limit for longswapchar and movechar: only distances of
  87     max. 4 characters are recognized, to avoid slow and bad suggestions.
  88
  89   - fix compound handling for new Hungarian orthography reform
  90
  91   - Allow suggestion search for prefix + *two suffixes*:
  92     Remove artificial performance limit to get correct
  93     suggestions for relatively simple misspellings in
  94     Hungarian, etc., when the word form contains prefix
  95     and both derivative and inflectional suffixes, too:
  96
  97     lefikszálása -> lefixálása
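
The "doubletwochars" patterns described above can be sketched in a few
lines of Python. This is only an illustration, not Hunspell's C++ code:
the function name is hypothetical, and reading "non-starting position" as
"index greater than 0" is an assumption.

```python
def doubletwochars_candidates(word):
    """Return candidates made by deleting one duplicated two-character block.

    Covers ABABA -> ABA at any position and the simpler ABAB -> AB only in
    non-starting position (assumed here to mean index > 0). The caller still
    has to check the candidates against the dictionary.
    """
    candidates = []
    for i in range(len(word) - 3):
        # Does word[i:i+4] have the shape ABAB?
        if word[i:i + 2] != word[i + 2:i + 4]:
            continue
        is_ababa = i + 4 < len(word) and word[i + 4] == word[i]
        if is_ababa or i > 0:
            candidate = word[:i] + word[i + 2:]
            if candidate not in candidates:
                candidates.append(candidate)
    return candidates


print(doubletwochars_candidates("nutritition"))
print(doubletwochars_candidates("regretteted"))
```

Both example words from the item above are handled by the same loop; a
word that consists only of an ABAB block at its start yields no candidate.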
  98
  99   Improvements for command-line Hunspell:
 100
 101   - Remove false alarms during checking OpenDocument (ODF)
 102     documents by ignoring <text:span> elements. (LibreOffice
 103     creates a lot of <text:span> elements also within words
 104     during text reediting, which often resulted in a huge number of
 105     broken words before this fix.)
 106
 107   - List filenames during filtering multiple files in command-line:
 108
 109     Examples:
 110
 111     $ hunspell -l *.odt
 112     a.odt: mispelling
 113     b.odt: egzample
 114
 115     $ hunspell -l -G *.odt
 116     a.odt: good
 117     b.odt: words
 118
 119   - Dictionary search by option -D doesn't wait for the standard input
 120     (fixed by Siva Mahadevan)
 121
 122   Other improvements:
 123
 124   - makealias dictionary compression: add option --minimize-diff
 125     to reuse free positions of alias lists to create minimal and
 126     readable diffs for alias compressed dictionaries stored in
 127     revision control systems, such as the dictionaries of LibreOffice.
 128
 129   - Brazilian-Portuguese translation by Rafael Fontenelle
 130
 131   - Catalan translation by robert dot buj at gmail
 132
 133   - Minor bug fixes by several contributors, see git log
 134
 135 2017-09-03: Hunspell 1.6.2 release:
 136   - Library changes: no. Same as 1.6.1.
 137   - Command line tool:
 138       - Added German translation
 139       - Fixed bug with wrong output encoding, not respecting system locale.
 140
 141 2017-03-25: Hunspell 1.6.1 release:
 142   - Library changes:
 143       - Performance improvements in suggest()
 144       - Fixes regressions for Hungarian related to compounding.
 145       - Fixes regressions for Korean related to ICONV.
 146   - Command line tool:
 147       - Added Tajik translation
 148       - Fix regarding searching of OOo dicts installed in user folder
 149   - Manpages:
 150       - Fix microsoft-cp1251 to cp1251. Dicts should not use the former.
 151       - Typos.
 152
 153 2016-12-22: Hunspell 1.6.0 release:
 154   - Library changes:
 155       - Performance improvement in ngsuggest(), suggestions should be faster.
 156       - Revert MAXWORDLEN to 100 as in 1.3.3 for performance reasons.
 157       - MAXWORDLEN can be set during build time with -D defines.
 158       - Fix crash when word with 102 consecutive X is spelled.
 159   - Command line tool:
 160       - -D shows all loaded dictionaries instead of only the first.
 161       - -D properly lists all available dictionaries on Windows.
 162
 163 2016-11-30: Hunspell 1.5.4 release:
 164   - Fixes the command COMPOUNDSYLLABLE used in Hungarian dictionary.
 165
 166 2016-11-28: Hunspell 1.5.3 release:
 167   - Removed a #include from hunspell.hxx that was creating trouble
 168
 169 2016-11-27: Hunspell 1.5.2 release:
 170   - Reverted full backward compatibility with 1.4 public API, again
 171
 172 2016-11-27: Hunspell 1.5.1 release:
 173   - Reverted full backward compatibility with 1.4 public API
 174
 175 2016-11-18: Hunspell 1.5.0 release:
 176   - Lots of stability fixes
 177   - Fixed compilation errors on various systems (Windows, FreeBSD)
 178   - Small performance improvement compared to 1.4.0
 179   - The C++ API is updated to use modern C++ types (string, vector).
 180     Backward compatibility is kept for most of the functions except for
 181     the following:
 182       - get_wordchars();
 183       - get_version();
 184       - input_conv(string, string);
 185       - removed get_csconv();
 186
 187 2016-04-15: Hunspell 1.4.0 release:
 188   - various ABI changes due to moving away from char* to std::string
 189
 190 2014-06-02: Hunspell 1.3.3 release:
 191   - OpenDocument (ODF and Flat ODF) support (ODF needs unzip program)
 192   - various bug fixes
 193
 194 2011-02-02: Hunspell 1.3.2 release:
 195   - fix library versioning
 196   - improved manual
 197
 198 2011-02-02: Hunspell 1.3.1 release:
 199   - bug fixes
 200
 201 2011-01-26: Hunspell 1.2.15/1.3 release:
 202   - new features: MAXDIFF, ONLYMAXDIFF, MAXCPDSUGS, FORBIDWARN, see manual
 203   - bug fixes
 204
 205 2011-01-21:
 206   - new features: FORCEUCASE and WARN, see manual
 207   - new options: -r to filter potential mistakes (rare words
 208     marked with the WARN flag in the dictionary)
 209   - limited and optimized suggestions
 210
 211 2011-01-06: Hunspell 1.2.14 release:
 212   - bug fix
 213 2011-01-03: Hunspell 1.2.13 release:
 214   - bug fixes
 215   - improved compound handling and
 216     other improvements supported by OpenTaal Foundation, Netherlands
 217 2010-07-15: Hunspell 1.2.12 release
 218 2010-05-06: Hunspell 1.2.11 release:
 219   - Maintenance release bug fixes
 220 2010-04-30: Hunspell 1.2.10 release:
 221   - Maintenance release bug fixes
 222 2010-03-03: Hunspell 1.2.9 release:
 223   - Maintenance release bug fixes and warnings
 224   - MAP support for composed characters or character sequences
 225 2008-11-01: Hunspell 1.2.8 release:
 226   - Default BREAK feature and better hyphenated word suggestion to accept
 227     and fix (compound) words with hyphen characters by the spell checker
 228     instead of by the word breaking code of OpenOffice.org. With this feature
 229     it's possible to accept hyphenated compound words, such as "scot-free",
 230     where "scot" is not a correct English word.
 231
 232   - ICONV & OCONV: input and output conversion tables for optional character
 233     handling or using special inner format. Example:
 234
 235   # Accepting de facto replacements of the Romanian comma acuted letters
 236   SET UTF-8
 237   ICONV 4
 238   ICONV ş ș
 239   ICONV ţ ț
 240   ICONV Ş Ș
 241   ICONV Ţ Ț
 242
 243     Typical usage of ICONV/OCONV is to manage an inner format for a segmental
 244     writing system, like the Ethiopic script of the Amharic language.
 245
 246   - Extended CHECKCOMPOUNDPATTERN to handle compound word alternations, like
 247     the sandhi feature of Telugu and other writing systems.
 248
 249   - SIMPLIFIEDTRIPLE compound word feature: allow simplified Swedish and
 250     Norwegian compound word forms, like tillåta (till|låta) and
 251     bussjåfør (buss|sjåfør)
 252
 253   - wordforms: word generator script for dictionary developers (Hunspell
 254     version of unmunch).
 255
 256   - bug fixes
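
The SIMPLIFIEDTRIPLE examples above, tillåta (till|låta) and bussjåfør
(buss|sjåfør), can be illustrated with a small Python sketch. It is not
Hunspell's implementation: the function name and the plain set used as a
lexicon are hypothetical, and only two-part compounds are considered.

```python
def split_simplified_triple(word, lexicon):
    """Find (left, right) pairs whose simplified join equals `word`.

    A join is "simplified" when `left` ends in a doubled letter, `right`
    starts with the same letter, and one of the three letters is dropped.
    """
    pairs = []
    for cut in range(2, len(word)):
        left = word[:cut]
        # left must end in a doubled letter, e.g. "till" or "buss"
        if left[-1] != left[-2]:
            continue
        # restore the letter that was dropped at the word boundary
        right = left[-1] + word[cut:]
        if left in lexicon and right in lexicon:
            pairs.append((left, right))
    return pairs


LEXICON = {"till", "låta", "buss", "sjåfør"}
print(split_simplified_triple("tillåta", LEXICON))
print(split_simplified_triple("bussjåfør", LEXICON))
```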
 257
 258 2008-08-15: Hunspell 1.2.7 release:
 259   - FULLSTRIP: new option for affix handling. With FULLSTRIP, affix rules can
 260     strip full words, not just all but one of their characters.
 261   - COMPOUNDRULE works with all flag types. (COMPOUNDRULE is for pattern
 262     matching. For example, en_US dictionary of OpenOffice.org uses COMPOUNDRULE
 263     for ordinal number recognition: 1st, 2nd, 11th, 12th, 22nd, 112th, 1000122nd
 264     etc.).
 265   - optimized suggestions:
 266     - modified 1-character distance suggestion algorithms: search a TRY character
 267       in all positions instead of all TRY characters in one character position
 268       (it can give a more readable suggestion order and better suggestions
 269       in the first positions, when TRY characters are sorted by frequency.)
 270       For example, suggestions for "moze":
 271       ooze, doze, Roze, maze, more etc. (Hunspell 1.2.6),
 272       maze, more, mote, ooze, mole etc. (Hunspell 1.2.7).
 273     - extended compound word checking for better COMPOUNDRULE related
 274       suggestions, for example English ordinal numbers: 121323th -> 121323rd
 275       (it needs also a th->rd REP definition).
 276   - bug fixes
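
The change in loop order for the 1-character distance suggestions above
can be shown with a short Python sketch. It is an illustration under
assumptions, not Hunspell's code: the function name, the shortened TRY
string and the six-word lexicon are all made up for the "moze" example.

```python
def badchar_candidates(word, try_chars, lexicon, char_major=True):
    """Suggest lexicon words that differ from `word` in exactly one character.

    char_major=True  : outer loop over TRY characters (the 1.2.7 order)
    char_major=False : outer loop over character positions (the older order)
    """
    positions = range(len(word))
    if char_major:
        order = [(c, i) for c in try_chars for i in positions]
    else:
        order = [(c, i) for i in positions for c in try_chars]
    found = []
    for c, i in order:
        if word[i] == c:
            continue  # replacing a character with itself changes nothing
        candidate = word[:i] + c + word[i + 1:]
        if candidate in lexicon and candidate not in found:
            found.append(candidate)
    return found


TRY = "esianrtolcd"  # illustrative, frequency-ordered TRY characters
LEXICON = {"ooze", "doze", "maze", "more", "mote", "mole"}
new_order = badchar_candidates("moze", TRY, LEXICON)
old_order = badchar_candidates("moze", TRY, LEXICON, char_major=False)
print(new_order)
print(old_order)
```

With the character-major loop, words reachable through frequent TRY
characters come first regardless of where the replacement happens, which
is why the order depends on TRY being sorted by frequency.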
 277
 278 2008-07-15: Hunspell 1.2.6 release:
 279   - bug fix release (fix affix rule condition checking of sk_SK dictionary,
 280     iconv support in stemming and morphological analysis of the Hunspell
 281     utility, see also Changelog)
 282
 283 2008-07-09: Hunspell 1.2.5 release:
 284   - bug fix release (fix affix rule condition checking of en_GB dictionary,
 285     also morphological analysis by dictionaries with two-level suffixes)
 286
 287 2008-06-18: Hunspell 1.2.4-2 release:
 288   - fix GCC compiler warnings
 289
 290 2008-06-17: Hunspell 1.2.4 release:
 291   - add free_list() for C, C++ interfaces to deallocate suggestion lists
 292
 293   - bug fixes
 294
 295 2008-06-17: Hunspell 1.2.3 release:
 296   - extended XML interface to use morphological functions by standard
 297     spell checking interface, spell() and suggest(). See hunspell.3 manual page.
 298
 299   - default dash suggestions for compound words: newword-> new word and new-word
 300
 301   - new manual pages: hunspell.3, hzip.1, hunzip.1.
 302
 303   - bug fixes
 304
 305 2008-04-12: Hunspell 1.2.2 release:
 306   - extended dictionary (dic file) support to use multiple base and
 307     special dictionaries.
 308
 309   - new and improved options of command line hunspell:
 310     -m: morphological analysis or flag debug mode (without affix
 311         rule data it shows the flags of the affix rules)
 312     -s: stemming mode
 313     -D: list available dictionaries and search path
 314     -d: support extra dictionaries by comma separated list. Example:
 315
 316     hunspell -d en_US,en_med,de_DE,de_med,de_geo UNESCO.txt
 317
 318     - forbidding in personal dictionary (with asterisk, / marks affixation)
 319
 320   - optional compressed dictionary format "hzip" for aff and dic files
 321     usage:
 322     hzip example.aff example.dic
 323     mv example.aff example.dic /tmp
 324     hunspell -d example
 325     hunzip example.aff.hz >example.aff
 326     hunzip example.dic.hz >example.dic
 327
 328   - new affix compression tool "affixcompress": compression tool for
 329     large (millions of words) dictionaries.
 330
 331   - support encrypted dictionaries for closed OpenOffice.org extensions or
 332     other commercial programs
 333
 334   - improved manual
 335
 336   - bug fixes
 337
 338 2007-11-01: Hunspell 1.2.1 release:
 339   - new memory efficient condition checking algorithm for affix rules
 340
 341   - new morphological functions:
 342     - stem() for stemming
 343     - analyze() for morphological analysis
 344     - generate() for morphological generation
 345
 346   - new demos:
 347     - analyze: stemming, morphological analysis and generation
 348     - chmorph: morphological conversion of texts
 349
 350 2007-09-05: Hunspell 1.1.12 release:
 351   - dictionary based phonetic suggestion for words with
 352     special or foreign pronunciation or alternative (bad) transliteration
 353     (see Changelog, tests/phone.* and manual).
 354
 355   - improved data structure and memory optimization for dictionaries
 356     with variable count fields
 357
 358   - bug fixes for Unicode encoding dictionaries and ngram suggestions
 359
 360   - improved REP suggestions with space: it works without dictionary
 361     modification
 362
 363   - updated and new project files for Windows API
 364
 365 2007-08-27: Hunspell 1.1.11 release:
 366   - portability fixes
 367
 368 2007-08-23: Hunspell 1.1.10 release:
 369   - pronunciation based suggestion using Björn Jacke's original Aspell
 370     phonetic transcription algorithm (http://aspell.net), relicensed under
 371     GPL/LGPL/MPL tri-license with the permission of the author
 372
 373   - keyboard based suggestion by KEY (see manual)
 374
 375   - better time limits for suggestion search
 376
 377   - test environment for suggestion based on Wikipedia data
 378
 379   - bug fixes for non standard Mozilla platforms etc.
 380
 381 2007-07-25: Hunspell 1.1.9 release:
 382   - better tokenization:
 383     - for URLs, mail addresses and directory paths (default: skip these tokens)
 384     - for colons in words (for Finnish and Swedish)
 385
 386   - new examples:
 387     - affixation of personal dictionary words
 388     - digits in words
 389
 390   - bug fixes (see ChangeLog)
 391
 392 2007-07-16: Hunspell 1.1.8 release:
 393   - better Mac OS X/Cygwin and Windows compatibility
 394
 395   - fix Hunspell's Valgrind environment and memory handling errors
 396     detected by Valgrind
 397
 398   - other bug fixes (see ChangeLog)
 399
 400 2007-07-06: Hunspell 1.1.7 release:
 401   - fix warning messages of OpenOffice.org build
 402
 403 2007-06-29: Hunspell 1.1.6 release:
 404   - check capitalization of the following word forms
 405     - words with mixed capitalisation: OpenOffice.org - OPENOFFICE.ORG
 406     - allcap words and suffixes: UNICEF's - UNICEF'S
 407     - prefixes with apostrophe and proper names: Sant'Elia - SANT'ELIA
 408
 409   - suggestion for missing sentence spacing: something.The -> something. The
 410
 411   - Hunspell executable: improved locale support
 412     - -i option: custom input encoding
 413     - use locale data for default dictionary names.
 414     - tools/hunspell.cxx: fix 8-bit tokenization (letters without
 415       casing, like ß or Hebrew characters now are handled well)
 416     - dictionary search path (automatic detection of OpenOffice.org directories)
 417     - DICPATH environmental variable
 418     - -D option: show directory path of loaded dictionary
 419
 420   - patches and bug fixes for Mozilla, OpenOffice.org.
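
The "missing sentence spacing" suggestion above (something.The ->
something. The) can be sketched as follows. This is a simplified Python
illustration, not the library's code: the function name is hypothetical,
and `is_word` is a caller-supplied predicate standing in for the spell
checker.

```python
def missing_sentence_space(token, is_word):
    """Suggest "left. Right" for a token like "left.Right".

    A split is proposed only at a period that is followed by an uppercase
    letter, and only if both parts are accepted by `is_word`.
    """
    suggestions = []
    for i in range(1, len(token) - 1):
        if token[i] != ".":
            continue
        left, right = token[:i], token[i + 1:]
        if right[0].isupper() and is_word(left) and is_word(right):
            suggestions.append(left + ". " + right)
    return suggestions


WORDS = {"something", "The", "OpenOffice", "org"}
print(missing_sentence_space("something.The", WORDS.__contains__))
print(missing_sentence_space("OpenOffice.org", WORDS.__contains__))
```

Requiring an uppercase letter after the period keeps names such as
OpenOffice.org from being split in this sketch.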
 421
 422 2007-03-19: Hunspell 1.1.5 release:
 423   - optimizations: 10-100% speed up, smaller code size and memory footprint
 424     (conditional experimental code and warning messages)
 425
 426   - extended Unicode support:
 427     - non BMP Unicode characters in dictionary words and affixes (except
 428       affix rules and conditions)
 429     - support BOM sequence in aff and dic files
 430
 431   - IGNORE feature for Arabic diacritics and other optional characters
 432
 433   - New edit distance suggestion methods:
 434     - capitalisation: nasa -> NASA
 435     - long swap: permenant -> permanent
 436     - long move: Ghandi -> Gandhi, greatful -> grateful
 437     - double two characters: vacacation -> vacation
 438     - spaces in REP sug.: REP alot a_lot (NOTE: "a lot" must be a dictionary word)
 439
 440   - patches and bug fixes for Mozilla, OpenOffice.org, Emacs, MinGW, Aqua,
 441     German and Arabic language, etc.
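
The "long swap" and "long move" methods listed above can be sketched in
Python as simple candidate generators. This is not Hunspell's
implementation: the function names are hypothetical, and the default
`max_distance=4` only mirrors the 4-character limit mentioned in the 1.7.0
notes; how Hunspell measures that distance is an assumption here.

```python
def longswap_candidates(word, max_distance=4):
    """Swap two non-adjacent characters at most `max_distance` apart."""
    chars = list(word)
    out = []
    for i in range(len(chars)):
        for j in range(i + 2, min(i + max_distance, len(chars) - 1) + 1):
            if chars[i] == chars[j]:
                continue  # swapping equal characters changes nothing
            swapped = chars[:]
            swapped[i], swapped[j] = swapped[j], swapped[i]
            out.append("".join(swapped))
    return out


def longmove_candidates(word, max_distance=4):
    """Move one character 2..max_distance places to the left or right."""
    out = []
    for i, ch in enumerate(word):
        rest = word[:i] + word[i + 1:]
        lo = max(0, i - max_distance)
        hi = min(len(rest), i + max_distance)
        for j in range(lo, hi + 1):
            if abs(j - i) < 2:
                continue  # distance 0 is the word itself, 1 is a plain swap
            out.append(rest[:j] + ch + rest[j:])
    return out


print("permanent" in longswap_candidates("permenant"))
print("Gandhi" in longmove_candidates("Ghandi"))
print("grateful" in longmove_candidates("greatful"))
```

A real checker would keep only the candidates found in the dictionary;
the sketch returns all of them so the filtering step stays visible.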
 442
 443 2006-02-01: Hunspell 1.1.4 release:
 444   - Improved suggestion for typical OCR bugs (missing spaces between
 445     capitalized words). For example: "aNew" -> "a New".
 446     http://qa.openoffice.org/issues/show_bug.cgi?id=58202
 447
 448   - tokenization fixes (fix incomplete tokenization of input texts on big-endian
 449     platforms, and locale-dependent tokenization of dictionary entries)
 450
 451 2006-01-06: Hunspell 1.1.3.2 release:
 452   - fix Visual C++ compiling errors
 453
 454 2006-01-05: Hunspell 1.1.3 release:
 455   - GPL/LGPL/MPL tri-license for Mozilla integration
 456
 457   - Alias compression of flag sets and morphological descriptions.
 458     (For example, 16 MB Arabic dic file can be compressed to 1 MB.)
 459
 460   - Improved suggestion.
 461
 462   - Improved, language independent German sharp s casing with CHECKSHARPS
 463     declaration.
 464
 465   - Unicode tokenization in Hunspell program.
 466
 467   - Bug fixes (at new and old compound word handling methods), etc.
 468
 469 2005-11-11: Hunspell 1.1.2 release:
 470
 471   - Bug fixes (MAP Unicode, COMPOUND pattern matching, ONLYINCOMPOUND
 472     suggestions)
 473
 474   - Checked with 51 regression tests in Valgrind debugging environment,
 475     and tested with 52 OOo dictionaries on i686-pc-linux platform.
 476
 477 2005-11-09: Hunspell 1.1.1 release:
 478
 479   - Compound word patterns for complex compound word handling and
 480     simple word-level lexical scanning. Ideal for checking
 481     Arabic and Roman numbers, ordinal numbers in English, affixed
 482     numbers in agglutinative languages, etc.
 483     http://qa.openoffice.org/issues/show_bug.cgi?id=53643
 484
 485   - Support ISO-8859-15 encoding for French (French oe ligatures are
 486     missing from the latin-1 encoding).
 487     http://qa.openoffice.org/issues/show_bug.cgi?id=54980
 488
 489   - Implemented a flag to forbid obscene word suggestion:
 490     http://qa.openoffice.org/issues/show_bug.cgi?id=55498
 491
 492   - Checked with 50 regression tests in Valgrind debugging environment,
 493     and tested with 52 OOo dictionaries.
 494
 495   - other improvements and bug fixes (see ChangeLog)
 496
 497 2005-09-19: Hunspell 1.1.0 release
 498
 499 * complete comparison with MySpell 3.2 (from OpenOffice.org 2 beta)
 500
 501 * improved ngram suggestion with swap character detection and
 502   case insensitivity
 503
 504 ------ examples for ngram improvement (input word and suggestions) -----
 505
 506 1. pernament (instead of permanent)
 507
 508 MySpell 3.2: tournaments, tournament, ornaments, ornament's, ornamenting, ornamented,
 509         ornament, ornamentals, ornamental, ornamentally
 510
 511 Hunspell 1.0.9: ornamental, ornament, tournament
 512
 513 Hunspell 1.1.0: permanent
 514
 515 Note: swap character detection
 516
 517
 518 2. PERNAMENT (instead of PERMANENT)
 519
 520 MySpell 3.2: -
 521
 522 Hunspell 1.0.9: -
 523
 524 Hunspell 1.1.0: PERMANENT
 525
 526
 527 3. Unesco (instead of UNESCO)
 528
 529 MySpell 3.2: Genesco, Ionesco, Genesco's, Ionesco's, Frescoing, Fresco's,
 530              Frescoed, Fresco, Escorts, Escorting
 531
 532 Hunspell 1.0.9: Genesco, Ionesco, Fresco
 533
 534 Hunspell 1.1.0: UNESCO
 535
 536
 537 4. siggraph's (instead of SIGGRAPH's)
 538
 539 MySpell 3.2: serigraph's, photograph's, serigraphs, physiography's,
 540              physiography, digraphs, serigraph, stratigraphy's, stratigraphy
 541              epigraphs
 542
 543 Hunspell 1.0.9: serigraph's, epigraph's, digraph's
 544
 545 Hunspell 1.1.0: SIGGRAPH's
 546
 547 --------------- end of examples --------------------
 548
 549 * improved testing environment with suggestion checking and memory debugging
 550
 551   memory debugging of all tests with a simple command:
 552
 553   VALGRIND=memcheck make check
 554
 555 * lots of other improvements and bug fixes (see ChangeLog)
 556
 557
 558 2005-08-26: Hunspell 1.0.9 release
 559
 560 * improved related character map suggestion
 561
 562 * improved ngram suggestion
 563
 564 ------ examples for ngram improvement (O=old, N = new ngram suggestions) --
 565
 566 1. Permenant (instead of Permanent)
 567
 568 O: Endangerment, Ferment, Fermented, Deferment's, Empowerment,
 569         Ferment's, Ferments, Fermenting, Countermen, Weathermen
 570
 571 N: Permanent, Supermen, Preferment
 572
 573 Note: Ngram suggestions were case sensitive.
 574
 575 2. permenant (instead of permanent)
 576
 577 O: supermen, newspapermen, empowerment, endangerment, preferments,
 578         preferment, permanent, preferment's, permanently, impermanent
 579
 580 N: permanent, supermen, preferment
 581
 582 Note: new suggestions are also weighted with longest common subsequence,
 583 first letter and common character positions
 584
 585 3. pernemant (instead of permanent)
 586
 587 O: pimpernel's, pimpernel, pimpernels, permanently, permanents, permanent,
 588         supernatant, impermanent, semipermanent, impermanently
 589
 590 N: permanent, supernatant, pimpernel
 591
 592 Note: the new method also prefers root words over irrelevant
 593 affixes ('s, s and ly)
 594
 595
 596 4. pernament (instead of permanent)
 597
 598 O: tournaments, tournament, ornaments, ornament's, ornamenting, ornamented,
 599         ornament, ornamentals, ornamental, ornamentally
 600
 601 N: ornamental, ornament, tournament
 602
 603 Note: Both ngram methods miss here.
 604
 605
 606 5. obvus (instead of obvious):
 607
 608 O: obvious, Corvus, obverse, obviously, Jacobus, obtuser, obtuse,
 609         obviates, obviate, Travus
 610
 611 N: obvious, obtuse, obverse
 612
 613 Note: new method also prefers common first letters.
 614
 615
 616 6. unambigus (instead of unambiguous)
 617
 618 O: unambiguous, unambiguity, unambiguously, ambiguously, ambiguous,
 619         unambitious, ambiguities, ambiguousness
 620
 621 N: unambiguous, unambiguity, unambitious
 622
 623
 624
 625 7. consecvence (instead of consequence)
 626
 627 O: consecutive, consecutively, consecutiveness, nonconsecutive, consequence,
 628         consecutiveness's, convenience's, consistences, consistence
 629
 630 N: consequence, consecutive, consecrates
 631
 632
 633 An example in a language with rich morphology:
 634
 635 8. Misisipiben (instead of Mississippiben [`in Mississippi' in Hungarian]):
 636
 637 O: Misikédéiben, Pisisedéiben, Misikéiéiben, Pisisekéiben, Misikéiben,
 638         Misikéidéiben, Misikékéiben, Misikéikéiben, Misikéiméiben, Mississippiiben
 639
 640 N: Mississippiben, Mississippiiben, Misiiben
 641
 642 Note: Suggesting irrelevant affixes was the biggest fault in ngram
 643    suggestion for languages with a lot of affixes.
 644
 645 --------------- end of examples --------------------
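
The notes in the examples above say that the new suggestions are also
weighted with the longest common subsequence, the first letter and common
character positions. A toy Python sketch of such a re-ranking step follows.
The function names and the equal weighting of the three terms are
assumptions; the actual weights used by Hunspell are not given here.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence (classic dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if ca == cb else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rerank_score(misspelling, candidate):
    """Toy score: LCS length + same-position matches + first-letter bonus."""
    a, b = misspelling.lower(), candidate.lower()
    same_positions = sum(1 for x, y in zip(a, b) if x == y)
    first_letter = 1 if a[:1] == b[:1] else 0
    return lcs_length(a, b) + same_positions + first_letter


def rerank(misspelling, candidates):
    """Sort candidates by descending score; ties keep their input order."""
    return sorted(candidates, key=lambda c: -rerank_score(misspelling, c))


print(rerank("permenant", ["supermen", "permanent", "preferment"]))
```

Lowercasing both words before scoring reflects the case-insensitive
comparison mentioned in the notes.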
 646
 647 * support twofold prefix cutting
 648
 649 * lots of other improvements and bug fixes (see ChangeLog)
 650
 651 * test Hunspell with 54 OpenOffice.org dictionaries:
 652
 653 source: ftp://ftp.services.openoffice.org/pub/OpenOffice.org/contrib/dictionaries
 654
 655 testing shell script:
 656 -------------------------------------------------------
 657 for i in `ls *zip | grep '^[a-z]*_[A-Z]*[.]'`
 658 do
 659         dic=`basename $i .zip`
 660         mkdir $dic
 661         echo unzip $dic
 662         unzip -d $dic $i 2>/dev/null
 663         cd $dic
 664         echo unmunch and test $dic
 665         unmunch $dic.dic $dic.aff 2>/dev/null | awk '{print$0"\t"}' |
 666         hunspell -d $dic -l -1 >$dic.result 2>$dic.err || rm -f $dic.result
 667         cd ..
 668 done
 669 --------------------------------------------------------
 670
 671 test result (0 size is o.k.):
 672
 673 $ for i in *_*/*.result; do wc -c $i; done
 674 0 af_ZA/af_ZA.result
 675 0 bg_BG/bg_BG.result
 676 0 ca_ES/ca_ES.result
 677 0 cy_GB/cy_GB.result
 678 0 cs_CZ/cs_CZ.result
 679 0 da_DK/da_DK.result
 680 0 de_AT/de_AT.result
 681 0 de_CH/de_CH.result
 682 0 de_DE/de_DE.result
 683 0 el_GR/el_GR.result
 684 6 en_AU/en_AU.result
 685 0 en_CA/en_CA.result
 686 0 en_GB/en_GB.result
 687 0 en_NZ/en_NZ.result
 688 0 en_US/en_US.result
 689 0 eo_EO/eo_EO.result
 690 0 es_ES/es_ES.result
 691 0 es_MX/es_MX.result
 692 0 es_NEW/es_NEW.result
 693 0 fo_FO/fo_FO.result
 694 0 fr_FR/fr_FR.result
 695 0 ga_IE/ga_IE.result
 696 0 gd_GB/gd_GB.result
 697 0 gl_ES/gl_ES.result
 698 0 he_IL/he_IL.result
 699 0 hr_HR/hr_HR.result
 700 200694989 hu_HU/hu_HU.result
 701 0 id_ID/id_ID.result
 702 0 it_IT/it_IT.result
 703 0 ku_TR/ku_TR.result
 704 0 lt_LT/lt_LT.result
 705 0 lv_LV/lv_LV.result
 706 0 mg_MG/mg_MG.result
 707 0 mi_NZ/mi_NZ.result
 708 0 ms_MY/ms_MY.result
 709 0 nb_NO/nb_NO.result
 710 0 nl_NL/nl_NL.result
 711 0 nn_NO/nn_NO.result
 712 0 ny_MW/ny_MW.result
 713 0 pl_PL/pl_PL.result
 714 0 pt_BR/pt_BR.result
 715 0 pt_PT/pt_PT.result
 716 0 ro_RO/ro_RO.result
 717 0 ru_RU/ru_RU.result
 718 0 rw_RW/rw_RW.result
 719 0 sk_SK/sk_SK.result
 720 0 sl_SI/sl_SI.result
 721 0 sv_SE/sv_SE.result
 722 0 sw_KE/sw_KE.result
 723 0 tet_ID/tet_ID.result
 724 0 tl_PH/tl_PH.result
 725 0 tn_ZA/tn_ZA.result
 726 0 uk_UA/uk_UA.result
 727 0 zu_ZA/zu_ZA.result
 728
 729 In the en_AU dictionary, there is an abbreviation with two dots (`eqn..'), but
 730 `eqn.' is missing. Presumably it is a dictionary bug. MySpell
 731 didn't accept it either.
 732
 733 The Hungarian dictionary contains pseudoroots and forbidden words.
 734 Unmunch doesn't support these features yet, so it generates bad words, too.
 735
 736 * check affix rules and OOo dictionaries (detected bugs in the cs_CZ,
 737 es_ES, es_NEW, es_MX, lt_LT, nn_NO, pt_PT, ro_RO, sk_SK and sv_SE dictionaries).
 738
 739 Details:
 740 --------------------------------------------------------
 741 cs_CZ
 742 warning - incompatible stripping characters and condition:
 743 SFX D   us          ech        [^ighk]os
 744 SFX D   us          y          [^i]os
 745 SFX Q   os          ech        [^ghk]es
 746 SFX M   o           ech        [^ghkei]a
 747 SFX J   ém          ej         ám
 748 SFX J   ém          ejme       ám
 749 SFX J   ém          ejte       ám
 750 SFX A   oužit       up         oupit
 751 SFX A   oužit       upme       oupit
 752 SFX A   oužit       upte       oupit
 753 SFX A   nout        l          [aeiouyáéíóúýůěr][^aeiouyáéíóúýůěrl][^aeiouy
 754 SFX A   nout        l          [aeiouyáéíóúýůěr][^aeiouyáéíóúýůěrl][^aeiouy
 755
 756 es_ES
 757 warning - incompatible stripping characters and condition:
 758 SFX W umar úse [ae]husar
 759 SFX W emir iñáis eñir
 760
 761 es_NEW
 762 warning - incompatible stripping characters and condition:
 763 SFX I unan únen unar
 764
 765 es_MX
 766 warning - incompatible stripping characters and condition:
 767 SFX A a ote e
 768 SFX W umar úse [ae]husar
 769 SFX W emir iñáis eñir
 770
 771 lt_LT
 772 warning - incompatible stripping characters and condition:
 773 SFX U ti      siuosi          tis
 774 SFX U ti      siuosi          tis
 775 SFX U ti      siesi           tis
 776 SFX U ti      siesi           tis
 777 SFX U ti      sis             tis
 778 SFX U ti      sis             tis
 779 SFX U ti      simės           tis
 780 SFX U ti      simės           tis
 781 SFX U ti      sitės           tis
 782 SFX U ti      sitės           tis
 783
 784 nn_NO
 785 warning - incompatible stripping characters and condition:
 786 SFX D   ar  rar  [^fmk]er
 787 SFX U   Øre  orde  ere
 788 SFX U   Øre  ort  ere
 789
 790 pt_PT
 791 warning - incompatible stripping characters and condition:
 792 SFX g   ãos        oas        ão
 793 SFX g   ãos        oas        ão
 794
 795 ro_RO
 796 warning - bad field number:
 797 SFX L   0          le         [^cg] i
 798 SFX L   0          i          [cg] i
 799 SFX U   0          i          [^i] ii
 800 warning - incompatible stripping characters and condition:
 801 SFX P   l          i          l (<- there is an unnecessary tabulator here)
 802 SFX I   a          ii         [gc] a
 803 warning - bad field number:
 804 SFX I   a          ii         [gc] a
 805 SFX I   a          ei         [^cg] a
 806
 807 sk_SK
 808 warning - incompatible stripping characters and condition:
 809 SFX T   ľať         olú        klať
 810 SFX T   ľať         olúc       klať
 811 SFX T   sľať        šlú        slať
 812 SFX T   sľať        šlúc       slať
 813 SFX R   ľcť         lčiem      ĺcť
 814 SFX R   iásť        ätie       miasť
 815 SFX R   iezť        iem        [^i]ezť
 816 SFX R   iezť        ieš        [^i]ezť
 817 SFX R   iezť        ie         [^i]ezť
 818 SFX R   iezť        eme        [^i]ezť
 819 SFX R   iezť        ete        [^i]ezť
 820 SFX R   iezť        ú          [^i]ezť
 821 SFX R   iezť        úc         [^i]ezť
 822 SFX R   iezť        z          [^i]ezť
 823 SFX R   iezť        me         [^i]ezť
 824 SFX R   iezť        te         [^i]ezť
 825
 826 sv_SE
 827 warning - bad field number:
 828 SFX  C  0  net  nets [^e]n
 829 --------------------------------------------------------
 830
 831 2005-08-01: Hunspell 1.0.8 release
 832
 833 - improved compound word support
 834 - fix German S handling
 835 - port MySpell files and MAP feature
 836
 837 2005-07-22: Hunspell 1.0.7 release
 838
 839 2005-07-21: new home page: http://hunspell.sourceforge.net