1 2016-04-29 Caolán McNamara <caolanm at LibO>:
2 * deprecate old api and add new one
3 old one remains implemented in terms of new one
4 and will eventually be removed
5 * shrink exposed api down to just hunspell.hxx
6 * next major release is likely to require C++11
8 2016-04-15 Caolán McNamara <caolanm at LibO>:
9 * generally using std::string and std::vector internally
11 2016-04-13 Caolán McNamara <caolanm at LibO>:
12 * gh#371 drop experimental code
14 2015-09-11 Caolán McNamara <caolanm at LibO>:
15 * rhbz#1261421 crash on mashing hangul korean keyboard
17 2014-12-03 Németh László <nemeth at numbertext dot org>:
18 * tools/hunspell.cxx: security fixes of the Hunspell executable
19 - secure file name handling, the problem (checking
20 OpenDocument files with malicious file names)
21 reported by Eric Sesterhenn
22 - using tmpnam() only with system("mkdir tempname && ...")
24 2014-10-17 Caolán McNamara <caolanm at LibO>:
25 * sf#245 Feature from Anish Patil -S mode
26 to show suggestions for completion of
27 correctly spelled words
28 * sf#248 Fix manpage about how to include
30 2014-10-16 Caolán McNamara <caolanm at LibO>:
31 * rhbz#915448, sf#57, sf#185 report character offset
32 and not byte offset in ispell mode
33 * sf#56 segv in experimental mode
34 * sf#228 don't translate init string
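
The offset fix above (rhbz#915448, sf#57, sf#185) turns on the difference between bytes and characters in UTF-8 input. A minimal sketch of the conversion follows. It assumes the line is valid UTF-8, and char_offset is a hypothetical helper written for illustration, not the code used in the Hunspell tool.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: number of UTF-8 code points contained in the first
// `byte_offset` bytes of `line`. Continuation bytes (bit pattern 10xxxxxx)
// belong to the preceding character and are therefore not counted.
std::size_t char_offset(const std::string& line, std::size_t byte_offset) {
    assert(byte_offset <= line.size());
    std::size_t chars = 0;
    for (std::size_t i = 0; i < byte_offset; ++i) {
        unsigned char c = static_cast<unsigned char>(line[i]);
        if ((c & 0xC0) != 0x80) {
            ++chars;  // ASCII byte or lead byte of a multi-byte sequence
        }
    }
    return chars;
}
```

For the input "műv szo" the second word starts at byte 5 (ű takes two bytes) but at character 4, which is the kind of value a character-based report would give.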
36 2014-09-22 Németh László <nemeth at numbertext dot org>:
37 * fix crash in morphological analysis of the Hungarian
38 compound word 'művészegyéniség', reported by Gáspár Sinai
40 2014-08-26 Németh László <nemeth at numbertext dot org>:
41 * unmunch separates flags of prefixes from the word,
42 bug reported by Daniel Naber
44 2014-08-05 Németh László <nemeth at numbertext dot org>:
45 * moz#318040 Mozilla accepts abbreviations without dots
46 * myfopen(): add _wfullpath to expand relative parts of absolute paths
48 2014-07-16 Caolán McNamara <caolanm at LibO>:
49 * moz#675553 Switch from PRBool to bool
50 * moz#690892 replace PR_TRUE/PR_FALSE with true/false
51 * Silence the warning about empty while body loop in clang
52 * moz#777292 Make nsresult an enum
53 * moz#579517 Use stdint types in gecko
54 * moz#784776 consistently use FLAG_NULL
55 * moz#927728 Convert PRUnichar to char16_t
56 * moz#943268 Remove nsCharsetAlias and nsCharsetConverterManager
57 * Don't include config.h in license.hunspell if MOZILLA_CLIENT is set
59 2014-06-26 Caolán McNamara <caolanm at LibO>:
60 * clang scan-build: Allocator sizeof operand mismatch
61 * clang scan-build: other low hanging warnings
62 * clang scan-build: significant warnings
64 2014-06-02 Németh László <nemeth at numbertext dot org>:
65 * escape spaces in paths of ODF files
67 2014-05-28 Németh László <nemeth at numbertext dot org>:
68 * add long path/Unicode path support in WIN32 environment:
69 - hunspell#233 (reported by mahak gark) and LibreOffice fdo#48017
70 * flat ODF support, eg.:
72 cat doc.fodt | hunspell -l -O
74 - -X (XML) input format
75 - -O (ODF or flat ODF) input format
76 - --check-apostrophe: check and force Unicode apostrophe usage
77 (ASCII or Unicode apostrophe has to be in the
78 WORDCHARS section of the affix file)
80 - break 1-line XML of ODT documents at </style:style>, too,
81 not only at </text:p> (limiting tokenization problems, when
82 fgets stops within an XML tag)
83 - show ODF file path on the UI instead of the temporary file
85 - ', ", &, < and > in replacements converted to XML entities
86 - recognize &apos; at tokenization, depending on WORDCHARS
87 - &apos; in tokens converted to ' before spell checking and
88 in the output of the pipe interface
89 * better apostrophe usage:
90 - when WORDCHARS contains only one of the Unicode or ASCII apostrophes,
91 word tokenization is extended: both of them will be part of
92 the words (if they are inside: eg. word's, but not words').
93 - convert Unicode apostrophes to ASCII ones for 8-bit dictionaries
94 (eg. English dictionaries), or for UTF-8 dictionaries that support
95 only the ASCII apostrophe (eg. French dictionaries).
97 - hunspell.4 renamed to hunspell.5, see
98 hunspell#241 reported by Cristopher Yeleighton
99 - updated translations
100 - note about long/Unicode paths in WIN32 (hunspell.3)
102 2014-04-25 Németh László <nemeth at numbertext dot org>:
103 * OpenDocument support, eg.
106 * always load default personal dictionary (fix
107 filtering bad words - reduce this word list - using
108 it as a personal dictionary workflow)
109 * fix parsing/URL recognition problem (bad tokens
112 2013-07-25 pchang9@cs.wisc.edu
113 * moz#897255 Wasted work in line_uniq
114 * moz#897780 Wasted work in SuggestMgr::twowords
116 2013-07-25 Caolán McNamara <caolanm at LibO>:
117 * hunspell#167 layout problems with long lines
118 - based on the original fix by xorho
120 * rhbz#925562 upgrade config.guess for aarch64
122 2013-07-24 pchang9@cs.wisc.edu
123 * moz#896301 Wasted work in SfxEntry::checkword
124 * moz#896844 Wasted work in AffixMgr::defcpd_check
126 2013-06-13 Konstantin Khlebniko
127 * #49 HashMgr::add_word computes wrong size for struct hentry
129 2013-06-13 Ville Skyttä
130 * #53 Man page syntax fixes
132 2013-04-19 John Thomson <john thomson at SIL>
133 * win_api: add remove() of Hunspell API (hun#3606435)
135 2013-04-19 Rouslan Solomokhin <at sf.net>
136 * fix crash in suggestions for 99-character long words
137 by extending arrays of SuggestMgr::forgotchar_*
138 (hun#3595024, also http://crbug.com/130128),
139 thanks also to Paweł Hajdan for reporting the patch
141 2013-04-01 Caolán McNamara <caolanm at LibO>:
142 * hunspell: -Werror=undef
144 2013-03-13 Caolán McNamara <caolanm at LibO>:
145 * rhbz#918938 crash in interaction with danish thesaurus
147 2012-09-18 Németh László <nemeth at numbertext dot org>:
148 * src/hunspell/affixmgr.*: - fix morphological analysis of
149 compound words (hun#3544994, reported by Dávid Nemeskey, fdo#55045)
151 2012-06-29 Caolán McNamara <caolanm at LibO>:
152 * fix various coverity warnings
154 2012-01-10 Ehsan Akhgari <ehsan at mozilla dot com>
155 * moz#710940 Firefox Crash [@ AffixMgr::parse_file(char const*, char
158 2011-12-16 Jared Wein <jwein at mozilla dot com>
159 * moz#710967 Incorrect argument passed to strncmp in
160 AffixMgr::parse_convtable
162 2011-12-06 Caolán McNamara <caolanm at LibO>:
163 * rhbz#759647 fixed tempname of hunSPELL.bak collides with other users
164 when multiple edits in one dir
166 2011-10-13 Caolán McNamara <caolanm at LibO>:
167 * moz#694002 crash in hunspell affixmgr on exit with bad .aff
168 * leak in hunspell affixmgr with bad .aff
170 2011-09-19 Caolán McNamara <caolanm at LibO>:
171 * make libparsers.a not installed thanks to Tomáš Chvátal
173 2011-06-23 Caolán McNamara <caolanm at LibO>:
174 * fix some windows compiler warnings
176 2011-05-24 Németh László <nemeth at numbertext dot org>:
177 * src/hunspell/affixmgr.*: allow twofold suffixes in compounds
178 by extended version of Arno Teigseth's patch, see hun#3288562.
179 - new option for this feature: COMPOUNDMORESUFFIXES
181 2011-02-16 Németh László <nemeth at numbertext dot org>:
182 * src/*/Makefile.am: fix library versioning, the problem reported by
183 Rene Engerhald and Simon Brouwer.
185 * man/hunspell.4: new version based on the revised version of Ruud Baars
187 2011-02-02 Németh László <nemeth at OOo>:
188 * suggestmgr.cxx: fix ngram PHONE suggestion for input words with
189 diacritics using UTF-8 encoded dictionaries (add byte length to the
190 8-bit phonet() argument instead of character length)
192 * suggestmgr.cxx: fix missing csconv problem with UTF-8 encoded
193 dictionaries, when the input contains non-BMP characters
194 - tests/utf8_nonbmp.sug: test file
196 * suggestmgr.cxx: mixed and keyboard based character suggestions
197 don't forbid ngram suggestion search (optimized tests/suggestiontest)
199 * affixmgr.cxx: fix hun#2999225: interfering compounding mechanisms,
200 tested on Dutch word list and reported by Ruud Baars
202 * affixmgr.cxx: allomorph fix for hun#2970240 (Hungarian
203 compound "vadász+gép" was analyzed as vad+ász+gép, and rejected
204 by the ss->s rep rule (verb "vadássz"), but the analysis
205 didn't continue for the longer word parts (vadász+gép)).
207 * csutil.cxx: add lang code "az_AZ", "hu_HU", "tr_TR" for back
208 compatibility (fixing Azeri and Turkish casing conversion, also
209 Hungarian compound handling)
211 * affixmgr.cxx: fix morphological analysis
213 2011-01-26 Németh László <nemeth at OOo>:
214 * affixmgr.cxx: fix for moz#626195 (memcheck problem with FULLSTRIP).
216 * affixmgr.*, suggestmgr.cxx: FORBIDWARN parameter (see manual)
218 2011-01-24 Németh László <nemeth at OOo>:
219 * suffixmgr.cxx: fix bad suggestion of forbidden compound words, eg.
220 "termijndoel" with the Dutch dictionary. Reported by Ruud Baars.
222 * latexparser.cxx: fix double apostrophe TeX quotation mark tokenization
223 (hun#3119776), reported by Wybodekker at SF.net.
225 * tests/suggestiontest/*: multilanguage and single Hunspell version, see README
226 * tests/suggestiontest/prepare2: for make -f Makefile.orig single
228 2011-01-22 Németh László <nemeth at OOo>:
229 * affixmgr.*, suggestmgr.*: new features
230 ONLYMAXDIFF: remove all bad ngram suggestions (default mode keeps one)
231 NONGRAMSUGGEST: similar to NOSUGGEST, but it only forbids using the word
232 in ngram based (more than 1-character distance) suggestions.
234 2011-01-21 Németh László <nemeth at OOo>:
235 * suggestmgr.*: limit wild suggestions (hun#2970237 by Ruud Baars)
236 - limited compound word suggestions
237 - improved and limited ngram based suggestions
238 * tests/*.sug: modified test files
239 - feature MAXCPDSUGS:
240 MAXCPDSUGS 0 : no compound suggestion, suggested by
241 Finn Gruwier Larsen in hunfeat#2836033
242 MAXCPDSUGS n : max. ~n compound suggestions
243 - feature MAXDIFF: difference limit for ngram suggestions: 0-10
244 eg. MAXDIFF 5: normal (default) limit
245 MAXDIFF 0: only one ngram suggestion
246 MAXDIFF 10: ~maxngramsugs ngram suggestions
248 * affixmgr.*, hunspell.*: add flag FORCEUCASE (hun#2999228) to force
249 capitalization of compound words (see Hunspell 4 manual),
250 suggested by Ruud Baars
251 tests/forceucase.*: test files
253 * affixmgr.*, hunspell.*: add flag WARN (hun#1808861), optional warning feature
254 for rare words, suggested by Ruud Baars
255 tests/warn: test files
256 * tools/hunspell.cxx: add option -r for optional filtering of rare words
258 * affixmgr.cxx: fix hun#3161359 (gcc warnings) reported by Ryan VanderMeulen.
260 2011-01-17 Németh László <nemeth at OOo>:
261 * suggestmgr.cxx: fix hun#3158994 and hun#3159027 (missing csconv table
262 using awkward 8bit capitalization of UTF-8 encoded dictionary words with PHONE
263 suggestion, reported by benjarobin and dicollecte at SF.net).
265 2011-01-13 Németh László <nemeth at OOo>:
266 * affixmgr.cxx: ONLYINCOMPOUND fix for hun#2999224 (fogemorphene
267 was allowed in end position of compoundings). Reported by Ruud Baars.
268 * tests/onlyincompound2.*: test files
270 2011-01-10 Ingo H. de Boer <idb_winshell at SF.net>:
271 * win_api/{hunspell,libhunspell, testparser}.vcproj: updated project
272 files for the library and the executables. Compiling problem
273 also reported by Don Walker.
275 2011-01-06 Németh László <nemeth at OOo>:
276 * affixmgr.cxx: fix freedesktop#32850 (program halt during Hungarian
277 spell checking of the word "6csillagocska6", reported by András Tímár)
279 * tools/hunspell.cxx: add Mac OS X Hunspell dictionary paths, asked by
280 Vidar Gundersen in hunfeat#3142010
282 2011-01-05 Caolán McNamara <cmc at OOo>:
283 * moz#620626 NS_UNICHARUTIL_CID doesn't support
286 2011-01-03 Németh László <nemeth at OOo>:
287 * NEWS and THANKS: update for release 1.2.13
289 2010-12-20 Németh László <nemeth at OOo>:
290 * affixmgr.cxx: hun#3140784
292 2010-12-16 Németh László <nemeth at OOo>:
294 - improved fix of hun#2970242 (supporting
295 zero affixes), reported by Ruud Baars
296 - tests/opentaal_cpdpat{,2}: test files
298 - switching off default BREAK parameters by BREAK 0,
299 reported by Ruud Baars
301 - hun#2999225: interfering compounding mechanisms, reported by Ruud Baars
303 2010-12-11 Németh László <nemeth at OOo>:
304 * affixmgr.cxx: fix hun#2970242 (CHECKCOMPOUNDPATTERN only with flags),
305 the bug reported by Ruud Baars
306 * tests/2970242.*: test files
308 * tests/2970240.*: test files for CHECKCOMPOUNDPATTERN fix (check all
309 boundaries in compound words, fixed by the previous CHECKCOMPOUNDREP
310 fix), the bug reported by Ruud Baars
312 * win_api/Makefile.cygwin: update
314 2010-12-09 Caolán McNamara <cmc at OOo>:
315 * moz#617953 fix leak
317 2010-11-08 Caolán McNamara <cmc at OOo>:
318 * rhbz#650503 crash in arabic dictionary
320 2010-11-05 Caolán McNamara <cmc at OOo>:
321 * rhbz#648740 don't warn on empty flagvector
323 2010-11-03 Caolán McNamara <cmc at OOo>:
324 * logically we shouldn't need a csconv table in utf-8 mode
326 2010-10-27 Németh László <nemeth at OOo>:
327 * hun#3000055 (requested by Ruud Baars) add REP boundary specification:
329 REP ^wordstarting xxxx
332 * hun#3008434 (requested by Adrián Chaves Fernández) and
333 hun#3018929 (requested by Ruud Baars): REP with more than 2 words:
334 REP morethantwo more_than_two
336 * suggestmgr.cxx: fix incomplete suggestion list for capitalized words,
337 eg. missing Machtstrijd->Machtsstrijd in the Dutch dictionary
338 (reported by Ruud Baars)
340 * tests, man: related updates
342 2010-10-12 Caolán McNamara <cmc at OOo>:
343 * moz#603311 HashMgr::load_tables leaks dict when decode_flags fails
344 * fix mem leak found with new tests
345 * hun#3084340 allow underscores in html entity names
347 2010-10-07 Németh László <nemeth at OOo>:
349 - hun#2970239 fix bad suggestion of forbidden compound words
350 - hun#2999224 fix keepcase feature on compound words (only partial
351 fix for COMPOUNDRULE based compounding)
352 - fix checkcompoundrep feature in compound words (check all boundaries,
353 not only the last one)
354 Problems reported by Ruud Baars.
356 * tests/opentaal_forbiddenword[12]*, tests/opentaal_keepcase*:
357 new test files for the previous fixes
358 * tests/checkcompoundrep: extended test file.
360 2010-09-05 Caolán McNamara <cmc at OOo>:
361 * moz#583582 fix double buffer gcc fortify issue
363 2010-08-13 Caolán McNamara <cmc at OOo>:
364 * moz#586671 AffixMgr::parse_convtable leaks pattern/pattern2 if it
366 * moz#586686 tidy up get_xml_list and friends
368 2010-08-10 Caolán McNamara <cmc at OOo>:
369 * hun#3022860 fix remove duplicate code
371 2010-07-17 Caolán McNamara <cmc at OOo>:
372 * remove unused get_default_enc and avoid potential misrecognition of
373 three letter language ids
374 * normalize encoding names before lookup
376 2010-07-05 Caolán McNamara <cmc at OOo>:
377 * hun#2286060 add Hangul syllables to unicode tables
379 2010-06-26 Caolán McNamara <cmc at OOo>:
380 * moz#571728 keep new[]/delete[] wrappers in sync for embedded in moz
383 2010-06-13 Caolán McNamara <cmc at OOo>:
384 * moz#571728 keep new[]/delete[] wrappers in sync for embedded in moz
387 2010-06-02 Caolán McNamara <cmc at OOo>:
388 * moz#569611 compile cleanly under win64
390 2010-05-22 Caolán McNamara <cmc at OOo>:
391 * moz#525581 apply mozilla's current preferred get_current_cs impl
393 2010-05-17 Németh László <nemeth at OOo>:
394 * affixmgr.cxx: fix bad limitation of parenthesized flags at
395 COMPOUNDRULEs. Windows crash reported by Ruud Baars and Simon Brouwer.
397 2010-05-05 Caolán McNamara <cmc at OOo>:
398 * rhbz#589326 malloc of int that should have been of char**
399 * hun#2997388 fix ironic misspellings
401 2010-04-28 Caolán McNamara <cmc at OOo>:
402 * moz#550942 get_xml_list doesn't handle failure from get_xml_par
404 2010-04-27 Caolán McNamara <cmc at OOo>:
405 * moz#465612 mozilla-specific code leaks
406 * moz#430900 phone is dereferenced before oom check
407 * moz#418348 ckey_utf alloc is used unchecked in SuggestMgr::badcharkey_utf
408 * CID#1487 pointer "rl" dereferenced before NULL check
409 * CID#1464 Returned without freeing storage "ptr"
410 * CID#1459 Avoid duplicate strchr
411 * CID#1443 Avoid any chance of dereferencing *slst
412 * CID#1442 Unsafe to have a null morph
413 * CID#1440 Avoid null filenames
414 * CID#1302 Dereferencing NULL value "apostrophe"
415 * CID#1441 Avoid dereferencing null ppfx
417 2010-04-16 Caolán McNamara <cmc at OOo>:
418 * hun#2344123 fix U)ncap in utf-8 locale
419 * fix up hunspell text UI and lines wider than terminal
421 2010-04-15 Caolán McNamara <cmc at OOo>:
422 * hun#2613701 fix small leak in FileMgr::FileMgr
423 * fix small leak in tools/hunspell
424 * hun#2871300 avoid crash if def and words are NULL
425 * hun#2904479 fix length of hzip file
426 * hun#2986756 mingw build fix
427 * hun#2986756 fix double-free
428 * hun#2059896 fix crash in interactive mode without nls
429 * hun#2917914 add some extra words to the latexparser
430 * make some structs static
431 * C-api has duped symbol names
432 * regenerate gettext/intl with recent version
433 * hun#2796772 build a .dll under MinGW
434 * rhbz#502387 allow cross-compiling for MinGW target
435 * hun#2467643 update .vcproj files to include replist.?xx
436 * unify visibility/dll_export support across platforms
437 * hun#2831289 sizeof(short) typo
438 * hun#2986756 add -u3 gcc style output
440 2010-04-14 Caolán McNamara <cmc at OOo>:
441 * hun#2813804 fix segfault on hu_HU stemming
443 2010-04-13 Caolán McNamara <cmc at OOo>:
444 * hun#2806689 fix ironic misspellings
445 * hun#2836240 add Italian translations
447 2010-04-09 Caolán McNamara <cmc at OOo>:
448 * fix titchy possible leak in command-line spellchecker
450 2010-04-07 Caolán McNamara <cmc at OOo>:
451 * hun#2973827 apply win64 patch
452 * hun#2005643 fix broken mystrdup
454 2010-03-04 Caolán McNamara <cmc at OOo>:
455 * ooo#107768 fix crash in long strings in spellml mode
456 * hun#1999737 add some malloc checks
457 * hun#1999769 drop old buffer on realloc failure
458 * hun#2005643 tidy string functions
459 * hun#2005643 micro-opt
460 * hun#2006077 free strings on failed dict parse
461 * hun#2110783 ispell-alike verbose mode implementation
463 2010-03-03 Németh László <nemeth at OOo>:
464 * hunspell/(affixmgr, suggestmgr).cxx: add character sequence
465 support for MAP suggestion, using parenthesized character groups
466 in the syntax, eg. MAP ß(ss).
467 * man/hunspell.4, tests/map*: documentation and test files
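
The parenthesized MAP syntax above (eg. MAP ß(ss)) treats each group in parentheses as one member next to the single characters. A minimal sketch of how such an argument could be split follows. It assumes UTF-8 input, and parse_map_entry is a hypothetical name used for illustration, not the parser in affixmgr.cxx.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split the argument of a MAP line into its members.
// A parenthesized group becomes one member; outside parentheses every
// UTF-8 code point (lead byte plus continuation bytes) is its own member.
std::vector<std::string> parse_map_entry(const std::string& spec) {
    std::vector<std::string> members;
    std::size_t i = 0;
    while (i < spec.size()) {
        if (spec[i] == '(') {
            std::size_t close = spec.find(')', i + 1);
            if (close == std::string::npos) {
                break;  // unterminated group: stop parsing in this sketch
            }
            members.push_back(spec.substr(i + 1, close - i - 1));
            i = close + 1;
        } else {
            std::size_t j = i + 1;
            while (j < spec.size() &&
                   (static_cast<unsigned char>(spec[j]) & 0xC0) == 0x80) {
                ++j;  // skip continuation bytes of the current character
            }
            members.push_back(spec.substr(i, j - i));
            i = j;
        }
    }
    return members;
}
```

With this sketch, the argument ß(ss) yields the two members "ß" and "ss", so a suggestion routine could treat them as interchangeable.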
469 2010-02-25 Németh László <nemeth at OOo>:
470 * hunspell/hunspell.cxx: add recursion limit for BREAK (fix OOo Issue 106267)
472 * hunspell/hunspell.cxx: fix crash in morphological analysis of
473 capitalized words with ending dashes
475 * affixmgr.cxx: fix morphological analysis of long numbers combined with dash,
476 eg. 45-00000045 (reported by a@freeblog.hu).
478 2010-02-23 Caolán McNamara <cmc at OOo>:
479 * hun#2314461 improve ispell-alike mode
480 * hun#2784983 improve default language detection
481 * hun#2812045 fix some compiler warnings
482 * hun#2910695 survive missing HOME dir
483 * hun#2934195 fix suggestmgr crash
484 * hun#2921129 remove unused variables
485 * hun#2826164 make sure make check uses the in-tree libhunspell
486 * bump toolchain to support --disable-rpath
487 * hun#2843984 fix coverity warning
488 * hun#2843986 fix coverity warning
489 * hun#2077630 add iconv lib
490 * make gcc strict-aliasing warning free
491 * make cppcheck warning free
493 2008-11-01 Németh László <nemeth at OOo>:
494 * replist.*, hunspell.cxx, affixmgr.cxx: new input and output
495 conversion support, see ICONV and OCONV keywords in the Hunspell(4)
496 manual page and the test examples. The input/output conversion
497 problem of syllabic languages reported by Daniel Yacob and
499 - tests/{iconv,oconv}.*: test examples
501 * tools/wordforms: word generation script for dictionary developers
502 (Hunspell version of the unmunch program)
504 * hunspell/hunspell.cxx: extended BREAK feature: ^ and $ mean in break
505 patterns the beginning and end of the word.
506 - tests/BREAK.*: modified examples.
508 * hunspell/hunspell.cxx: set default break at hyphen characters.
509 The associated problem reported by S Page in Hunspell Bug 2174061.
510 See Mozilla Bug ID 355178 and OOo Issue 64400, too.
511 - tests/breakdefault.*: test data
512 The following definition is equivalent to the default word break:
519 * affixmgr.cxx: SIMPLIFIEDTRIPLE is a new affix file keyword to allow
520 simplified forms of the compound words with triple repeating letters.
521 It is useful for Swedish and Norwegian languages.
523 * affixmgr.cxx: extend CHECKCOMPOUNDPATTERN to support
524 alternations of compound words for example by sandhi
525 feature of Indian and other languages. The problem reported
526 by Kiran Chittella associated with Telugu writing system
527 (see Telugu example in tests/checkcompoundpattern4.test).
528 The new optional field of CHECKCOMPOUNDPATTERN definition is the
529 replacement of the compound boundary defined by the previous fields:
530 CHECKCOMPOUNDPATTERN ff f ff
531 means ff|f compound boundary has been replaced by "ff", like in
532 the (prereform) German Schiffahrt (Schiff+fahrt).
533 - CHECKCOMPOUNDPATTERN supports also optional flag conditions now:
534 CHECKCOMPOUNDPATTERN ff/A f/B ff
535 means that the first word of the compound needs flag "A" and
536 the second word of the compound needs flag "B" for the operation.
538 * tools/hunspell.cxx: add empty lines as separators to the output of
539 the stemming and morphological analysis.
541 * affixmgr.cxx: fix condition checking algorithm. Bad suggestion
542 generation reported by Mehmet Akin in SF.net Bug 2124186 with help of
545 * affixmgr.cxx: fix COMPOUNDWORDMAX feature. The problem and its
546 code details reported by Göran Andersson under SF.net Bug ID 2138001.
548 * csutil.cxx: fix bad conditional code for Mozilla compilation.
549 Patch by Serge Gautherie. The problem reported by Ryan VanderMeulen.
551 * hunspell/hunspell.cxx: add missing ngram suggestion for HUHINITCAP
552 (capitalized mixed case) words.
554 * w_char.hxx: use GCC conditions for GCC related code. Patch by
557 * affixmgr.cxx: check morphological description in morphgen()
558 (fix potential program fault by incomplete morphological
559 description of affix rules)
561 * src/win_api: config.h: switch on warning messages on Windows
563 * tools/affixcompress: extended help for -h (use LC_ALL=C sort
566 * man/hunspell.4: updated manual:
567 - new and modified features (SIMPLIFIEDTRIPLE, ICONV, OCONV,
568 BREAK, CHECKCOMPOUNDPATTERN).
569 - note about costs of zero affixes, suggested by Olivier Ronez.
571 * hunspell/hunspell.cxx: remove deprecated word breaking codes.
573 2008-08-15 Németh László <nemeth at OOo>:
574 * affentry.cxx: add FULLSTRIP option. With FULLSTRIP, affix rules can
575 strip full words, not just all but one of their characters. Suggested by
576 Davide Prina and other developers in OOo Issue 80145.
577 * tests/fullstrip.*: Test data based on Davide Prina's example.
578 * tools/unmunch.cxx: modified for FULLSTRIP.
580 * affixmgr.cxx: COMPOUNDRULE now works with long and numerical flag
581 types by parenthesized flags. Syntax: (flag)*, (flag)(flag)?(flag)*.
582 * tests/compoundrule[78].*: tests with parenthesized COMPOUNDRULE
585 * suggestmgr.cxx: modified badchar*(), forgotchar*() and extrachar*()
586 1-character distance suggestion algorithms: search a TRY character
587 in all positions instead of all TRY characters in a character position
588 (it can give more readable suggestion order, also better suggestions
589 in the first positions, when TRY characters are sorted by frequency.)
590 For example, suggestions for "moze":
591 ooze, doze, Roze, maze, more etc. (Hunspell 1.2.6),
592 maze, more, mote, ooze, mole etc. (Hunspell 1.2.7).
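
The reordering described above can be illustrated with a small sketch of single-character replacement. The function names, the TRY string and the tiny word list are assumptions made for this example (ASCII only); the real badchar*() functions in suggestmgr.cxx do more work.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Old order (sketch): for each position, try every TRY character.
std::vector<std::string> badchar_position_major(
    const std::string& word, const std::string& try_chars,
    const std::set<std::string>& dict) {
    std::vector<std::string> out;
    for (std::size_t pos = 0; pos < word.size(); ++pos) {
        for (char t : try_chars) {
            if (word[pos] == t) continue;  // replacement would be a no-op
            std::string cand = word;
            cand[pos] = t;
            if (dict.count(cand)) out.push_back(cand);
        }
    }
    return out;
}

// New order (sketch): for each TRY character, try every position, so
// candidates built from the first (most frequent) TRY characters come first.
std::vector<std::string> badchar_try_major(
    const std::string& word, const std::string& try_chars,
    const std::set<std::string>& dict) {
    std::vector<std::string> out;
    for (char t : try_chars) {
        for (std::size_t pos = 0; pos < word.size(); ++pos) {
            if (word[pos] == t) continue;
            std::string cand = word;
            cand[pos] = t;
            if (dict.count(cand)) out.push_back(cand);
        }
    }
    return out;
}
```

With the made-up TRY string "artold" and the word list {ooze, doze, maze, more, mote, mole}, the input "moze" gives ooze, doze, maze, more, mote, mole in the position-major order and maze, more, mote, ooze, mole, doze in the TRY-major order.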
594 * suggestmgr.cxx: extended compound word checking for better COMPOUNDRULE
595 related suggestions, for example English ordinal numbers: 121323th ->
596 121323rd (it needs also a th->rd REP definition).
598 * phonet.cxx: cast unsigned char parameter of isdigit() and fix
599 isalpha by myisalpha() (potential problems in Windows environment).
600 Reported by Thomas Lange in OOo Issue 92736.
602 * hunspell/csutil.*,hunspell/{affentry,affixmgr,hunspell,suggestmgr}.cxx:
603 fix potential buffer overflow during morphological analysis by the
604 new mystrcat() function. Reported by Molnár Andor (dolhpy at true
605 dot hu) in SF.net Bug 2026203.
607 * affixmgr.cxx: add recursion limit to defcpd(). Fix OOo Issue 76067:
608 crash-like slowdown when checking hexadecimal numbers with long FFF
609 sequence (combinatorial explosion by the en_US words "f" and "ff").
610 Missing fix reported by Mathias Bauer.
612 * affixmgr.cxx: fix the difference in the Unicode and non-Unicode
613 parts of cpdcase_check(). Bug report by Brett Wilson.
615 * filemgr.*, affixmgr.cxx, csutil.*, hashmgr.*: warning messages now
616 contain line numbers (use --with-warnings configure option for
619 * hunspell.cxx: analyze(): fix case conversion of stemming and
620 morphological analysis of UTF-8 encoded input. Reported by Ferenc Godó.
622 * tools/hunspell.cxx: fix LaTeX Unicode support in filter mode.
623 Reported by Jan Seeger in SF.net Bug 2039990.
625 * affixmgr.hxx: save 0.5 MB (1 MB in 64-bit environments) of (virtual)
626 memory by using only the requested size for sFlag and pFlag arrays.
627 Bug report by Brett Wilson.
629 * affixmgr.cxx,tools/hunspell.cxx: get_version() returns the full
630 VERSION affix parameter instead of its first word. Fixes for
631 Hunspell's header. Some problems with Hunspell header reported in
634 2008-07-15 Németh László <nemeth at OOo>:
635 * affentry.cxx: fixes of the affix rule matching algorithm (affected
636 only the sk_SK dictionary from all OpenOffice.org dictionaries):
637 - fix dot pattern + accented letters matching (in non Unicode encoding)
638 - word-length conditions work again
639 * tests/condition.*: extended test for the fix.
641 * hashmgr.cxx: load multiword expressions: spaces may be parts
642 of the dictionary words again (but spaces also work as morphological
643 field separators: word word2 -> "word word2", word po:noun -> "word").
644 * man/hunspell.4: updated manual
646 * tools/hunspell.cxx: add iconv character conversion support to
647 stemming and morphological analysis
649 * tools/hunspell.cxx: add /usr/share/myspell/dicts search path for
652 2008-07-09 Németh László <nemeth at OOo>:
653 * affentry.cxx: fixes of the affix rule matching algorithm:
654 - right ASCII character handling in bracket expression;
655 - fault-tolerant nextchar() for bad rules.
656 Problem with the en_GB dictionary and nextchar() with a detailed
657 code analysis reported by John Winters in SF.net Bug ID 2012753.
658 * tests/condition.*: extended test for the fix.
660 * hunspell/hunspell.*, parsers/*, tools/hunspell.cxx: fix compiler
661 warnings (deprecated const-free char consts)
663 * win_api/hunspelldll.*: add hunspell_free_list(), the problem
664 reported by Laurier Mercer.
666 2008-06-30 Török László <torok_laszlo at users dot SF dot net>:
667 * tests/affixmgr.cxx: fix morphological analysis: strcat() on
668 an uninitialized char array in suffix_check_morph().
670 2008-06-18 Németh László <nemeth at OOo>:
671 * src/hunspell/affixmgr.cxx: fix GCC compiler warnings
672 (comparisons with string literal results in unspecified behaviour).
673 The problem reported by Ladislav Michnovič.
675 2008-06-17 Németh László <nemeth at OOo>:
676 * src/hunspell/{hunspell.cxx,hunspell.h}: add free_list() to the C and
677 C++ interface to deallocate suggestion lists. The problem
678 reported by Laurie Mercer and Christophe Paris.
679 * csutil.cxx: fix freelist() to deallocate non-NULL list, when n = 0.
680 * tools/{analyze,example,chmorph,hunspell}.cxx: use free_list().
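
The n = 0 case mentioned above can be sketched as follows. This is one possible shape of such a deallocation routine, assuming the list and its strings were allocated with malloc(); freelist_sketch is a hypothetical name, not the function in csutil.cxx.

```cpp
#include <cstdlib>

// Hypothetical sketch: free a malloc()-allocated list of n C strings.
// The outer array is released even when n == 0, so an empty but
// allocated list does not leak.
void freelist_sketch(char*** list, int n) {
    if (list && *list) {
        for (int i = 0; i < n; ++i) {
            std::free((*list)[i]);  // free(NULL) is a no-op, so NULL slots are fine
        }
        std::free(*list);
        *list = nullptr;  // make accidental reuse easier to detect
    }
}
```

A caller would pass the address of the list pointer together with the count returned when the list was filled.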
682 * tools/hunspell.cxx: fix only --with-readline compiling problem.
683 Reported by Volkov Peter in SF.net Bug 1995842.
685 * man/hunspell.3,hunspell.hxx: fix analyze and generate examples in
686 the manual and comments (using char*** parameter instead of char**).
688 * tools/example.cxx: fix suggestion example.
690 2008-06-17 Németh László <nemeth at OOo>:
691 * affentry.cxx: fix the new affix rule matching algorithm of
692 Hunspell 1.2. Arabic dictionary problem reported by Khaled Hosny
693 in SF.net Bug ID 1975530. Mohamed Kebdani also sent a
695 * tests/{1975530,condition*}: tests for the fix
697 2008-06-13 Ingo H. de Boer <idb_winshell at SF.net>:
698 * src/hunspell/{affixmgr.cxx,hunspell.cxx}: add missing type
699 cast to strstr() calls for VC8 compatibility.
701 2008-06-13 Németh László <nemeth at OOo>:
702 * suggestmgr.cxx: add also part1-part2 suggestion with dash
703 for bad part1part2 word forms, suggested by Ruud Baars.
704 For example, now suggestion of "parttime": "part time"
706 NOTE: this feature will work only when the TRY definition
707 contains "-" or the letter "a".
709 * hunspell.cxx: new XML API in spell() and suggest() (see hunspell(3)).
711 * src/hunspell/*: fixes for OpenOffice.org build environment.
713 * man/{hunspell.3,hzip.1,hunzip.1}: add new manual pages for
714 Hunspell programming API and dictionary compression and
715 encryption utilities.
717 * src/hunspell/*: handle failed mystrdup() calls and other potential
718 insufficient memory problems. The problem reported by Elio Voci
719 in OpenOffice.org Issue 90604 and others.
721 * src/tools/affixmgr.cxx: restore original behaviour of get_wordchars
722 without conditional code. Problem reported by Ingo H. de Boer
723 in SF.net Bug 1763105.
725 * win_api/hunspelldll.h: put_word() renamed to add() in the (old)
726 Windows DLL API bug reported in SF.net Bug 1943236. Also reported
729 * tools/hunspell.cxx: fix chench() for environments without
730 native language support (ENABLE_NLS 0 in config.h),
731 PHP system_exec() bug reported by Michel Weimerskirch in
734 * hunspell.cxx, affixmgr.cxx: remove "result" from the
735 (result && *result) conditions, when "result" is a static variable.
736 The problem and a possible solution reported by Ladislav Michnovič.
738 * affixmgr.cxx: parse_affix(): print line instead of NULL in
739 the warning message, when affix class header is bad.
740 The problem reported by Ladislav Michnovič.
742 2008-06-01 Christian Lohmaier <cloph at OOo>
743 * configure.ac: patch to fix --with-readline, --with-ui logic.
744 Reported in the SF.net Bug 981395.
746 2008-05-04: Volkov Peter <volkov_peter at users sourceforge net>
747 * configure.ac: fix LibTool 2.22 incompatibility by removing
748 unused LT_* macros. Report and patch in SF.net Bug 1957383.
749 The problem reported and fixed by Ladislav Michnovič, too.
751 2008-04-23: Ladislav Michnovič <lmichnovic at suse cz>
752 * hunspell.pc.in: fix wrongly set directories.
754 2008-04-12 Németh László <nemeth at OOo>:
755 * src/tools/hunspell.cxx:
756 - Multilingual spell checking and special dictionary support with -d.
757 Multilingual spell checking suggested by Khaled Hosny (SF.net
758 Bug 1834280). Example for the new syntax:
760 -d en_US,en_geo,en_med,de_DE,de_med
762 en_US and de_DE are base dictionaries, and en_geo, en_med, de_med
763 are special dictionaries (dictionaries without affix file).
764 Special dictionaries are optional extension of the base dictionaries.
765 There is no explicit naming convention for special dictionaries,
766 only the ".dic" extension: dictionaries without affix file will
767 be an extension of the preceding base dictionary. First dictionary
768 in -d parameter must have an affix file (it must be a base
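
The grouping rule for the -d list described above can be sketched as follows. The has_affix set stands in for a file-system check and is an assumption of this example, as are the names DictGroup and group_dictionaries; the actual option handling in src/tools/hunspell.cxx may differ.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type: one base dictionary and its extensions.
struct DictGroup {
    std::string base;                 // dictionary that has an affix file
    std::vector<std::string> extras;  // special dictionaries extending it
};

// Group a comma-separated -d argument. Names found in `has_affix` start a
// new group; every other name extends the preceding base dictionary.
// Returns an empty vector when the first dictionary is not a base one.
std::vector<DictGroup> group_dictionaries(
    const std::string& arg, const std::set<std::string>& has_affix) {
    std::vector<DictGroup> groups;
    std::istringstream in(arg);
    std::string name;
    while (std::getline(in, name, ',')) {
        if (name.empty()) continue;  // ignore empty items such as "a,,b"
        if (has_affix.count(name)) {
            groups.push_back(DictGroup{name, {}});
        } else if (groups.empty()) {
            return {};  // a special dictionary cannot come first
        } else {
            groups.back().extras.push_back(name);
        }
    }
    return groups;
}
```

For the example argument en_US,en_geo,en_med,de_DE,de_med with affix files only for en_US and de_DE, the sketch produces two groups: en_US with en_geo and en_med, and de_DE with de_med.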
771 - new options for debugging, morphological analysis and stemming:
772 -m: morphological analysis or flag debug mode (without affix
773 rule data it shows the flags of the affix rules)
775 -D: show also available dictionaries and search path
776 (suggested by Aaron Digulla in SF.net Bug 1902133)
778 - add missing refresh() to print bad words before the slower suggestion
779 search in UI (better user experience)
781 - fix tabulator problems (reported by ugli-kid-joe AT sf DOT net)
783 - fix different encoding of dic and input, and suggestions
785 - add per mille sign to LANG hu_HU section.
787 - rewrite program messages. Concatenating multiple printfs for
788 easier translation suggested by András Tímár and Gábor Kelemen.
790 * src/hunspell/csutil.cxx: set static encds variable. Patch by
791 Rene Engelhard. SF.net Bug 1896207 and 1939988.
793 * src/hunspell/w_char.hxx,csutil.hxx: reorganizing
794 w_char typedef and HENTRY_DATA, HENTRY_FIND consts
796 * src/hunspell/hunzip.cxx: fopen(): using rb options instead of r (fix
799 * src/tools/affixmgr.cxx: restore original behaviour of get_wordchars
800 in an #ifdef WINSHELL section. Problem reported by Ingo H. de Boer
801 in SF.net Bug 1763105.
803 * src/tools/chmorph.cxx: remove the experimental modifications
805 * src/tools/hzip.c: fopen(): using wb options instead of w (fix
808 * src/tools/hunzip.cxx: add missing MOZILLA_CLIENT. Reported
809 by Ryan VanderMeulen.
811 * man/*, man/hu/*: updated manual
813 * man/hunspell.4: fix formatting problem (missing header)
815 * tools/makealias: now works with the extra data fields.
817 * phonet.cxx: use HASHSIZE const
819 * tests/rep.aff: fix REP count
821 * src/win_api/Makefile.cygwin, README: native Windows compilation
822 in Cygwin environment without cygwin1.dll dependency (see README
823 for compiling instructions).
825 2008-04-08 Roland Smith <rsmith AT xs4all DOT nl>:
826 * src/parsers/latexparser.cxx: fix PATTERN_LEN for AMD64 and
827 other platforms with different struct padding (SF.net Bug 1937995).
829 2008-04-03 Kelemen Gábor <kelemeng AT gnome DOT hu>:
830 * po/POTFILES.in: fix path of the source file
832 * po/Makevars: add --from-code=UTF-8 gettext option
834 * hunspell.cxx: add comments for shortkey translation
836 2008-02-04 Flemming Frandsen <flfr AT stibo DOT com>
837 * src/hunspell.h: fix Windows DLL support
838 - this patch also reported by Zoltán Bartkó.
840 2008-01-30 Mark McClain <marc_mcclain AT users DOT sf DOT net>
841 * src/hunspell.cxx: stem(): fix function call side effect
842 for PPC platform (SF.net Bug 1882105).
844 2008-01-30 Németh László <nemeth at OOo>:
845 * hunspell.cxx, csutil.cxx, hunspelldll.c: fix
846 SF.net Bug 1851246, patch also by Ingo H. de Boer.
848 * hunspell.h: fix SF.net Bug 1856572 (C prototype problem),
849 patch by Mark de Does.
851 * hunspell.pc.in: fix SF.net Bug 1857450 wrong prefix, reported
854 * hunspell.pc.in: reset numbering scheme: libhunspell-1.2.
855 Fix SF.net Bug 1857512 reported by Mark de Does,
856 also by Rene Engelhard.
858 * csutil.cxx: patches for ARM platform, signed_chars.dpatch
859 by Rene Engelhard and arm_structure_alignment.dpatch by
860 Steinar H. Gunderson <sesse@debian.org>
862 * hunzip.*, hzip.c: new hzip compression format
864 * tools/affixcompressor: affix compressor utility (similar to
865 munch, but it generates affix table automatically), works
866 with million-words dictionaries of agglutinative languages.
868 * README: fix problems reported by Pham Ngoc Khanh.
870 * csutil.cxx, suggestmgr: Warning-free in OOo builds.
872 * hashmgr.*, csutil.*: fix protected memory problems with
873 stored pointers on several non-x86 platforms by
874 store_pointer(), get_stored_pointer().
876 * src/tools/hunspell.cxx: fix iconv support on Solaris platform.
878 * tests/IJ.good: add missing test file
880 * csutil.cxx: fix const char* related errors. Compiling bug
881 with Visual C++ reported by Ryan VanderMeulen and Ingo H. de Boer.
883 2008-01-03 Caolan McNamara <cmc at OO.o>:
884 * csutil.cxx: SF.net Bug 1863239, notrailingcomma patch and
885 optimization of get_current_cs().
887 2007-11-01 Németh László <nemeth at OOo>:
888 * hunspell/*: new feature: morphological generation,
889 also fix experimental morphological analysis and stemming.
890 - new API functions and improved API:
891 - analyze(word): (instead of morph()) morphological analysis
892 - stem(word): stemming
893 - stem(list): stemming based on the result of an analysis
894 - generate(word, word2): morphological generation
895 - generate(word, list): morphological generation
896 - add(word): add word to the run-time dictionary (renamed put_word())
897 - add_with_affix(word, word2): (renamed put_word_pattern()):
898 add word to the run-time dictionary with affix flags of the
899 second parameter: all affixed forms of the user words will be
900 recognised by the spell checker. Especially useful for
901 agglutinative languages.
902 - remove(word): remove word from the run-time dictionary (not
904 - see manual and hunspell/hunspell.hxx header and tests/morph.*
905 * tests/morph.*: test data, example for morphological analysis,
906 stemming and generation
908 * tools/analyze, tools/chmorph: extended and new demo applications:
909 - analyze (originally hunmorph): analyses and stems input words,
910 generates word forms from input word pairs.
911 - chmorph: morphological transformation filter
913 * configure.ac, hunspell/makefile.am: set library version number.
914 Bug reported by Rene Engelhard.
916 * affentry.cxx, affixmgr.cxx: new pattern matching algorithm in
917 condition checking of affix rules instead of the Dömölki-algorithm:
918 - Unlimited condition length (instead of max. 8 characters).
919 - Less memory consumption, especially useful for affix rich languages:
920 5,4 MB memory savings with hu_HU dictionary.
921 - Speed change depends on dictionaries and CPU caches: English spell
922 checking is 4% faster on Linux words with en_US dictionary, Hungarian
923 spell checking is 25% slower on most frequent words of Hungarian
926 * tests/sug.*, sugutf.*: updated test data (use "a" and "lot"
927 dictionary items instead of "a lot".)
929 * src/hunspell/hunspell.cxx: free(csconv) instead of delete csconv.
930 Report and patch by Sylvain Paschein in Mozilla Issue 398268.
932 * suggestmgr.cxx, tools/hunspell.cxx: bad spelling of "misspelled".
933 Ubuntu Bug #134792, patch by Malcolm Parsons.
935 * tests/base_utf.*: use Unicode apostrophe instead of 8-bit one.
937 * hunspell.cxx, hashmgr.cxx: add(): use HashMgr::add()
939 2007-10-25 Pavel Janík <pjanik at OOo>:
940 * hunspell/csutil.cxx: Fix type cast warnings on 64bit Linux in
941 printing of character positions in u8_u16(). OOo issue 82984.
943 2007-09-05 Németh László <nemeth at OOo>:
944 * win_api/Hunspell.vproj, parsers/testparser.cxx,textparser.hxx:
945 warning fixes and removing unnecessary Windows project file.
946 Reported by Ingo H. de Boer.
948 * hashmgr.*, {affixmgr,suggestmgr}.cxx: optimized data structure
949 for variable-count fields (only "ph" transliteration field in
950 this version, see next item). Also less memory consumption:
951 -13% (0.75 MB) with en_US dictionary, -6% (1 MB) with hu_HU.
953 * suggestmgr.cxx: dictionary based phonetic suggestion for special
954 or foreign pronunciation (see also rule-based PHONE in manual).
955 Usage: tab separated field in dictionary lines, started with "ph:".
956 The field contains a phonetic transliteration of the word:
958 Marseille ph:maarsayl
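A small sketch of how such a dictionary line could be split, assuming tab-separated fields as described above. DicEntry and parse_dic_line are hypothetical names, not the parser used in hashmgr.cxx.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

struct DicEntry {
    std::string word;
    std::vector<std::string> phonetic;  // values of the "ph:" fields
};

// Split a dictionary line at tabs: the first field is the word, and every
// later field starting with "ph:" is kept as a phonetic transliteration.
DicEntry parse_dic_line(const std::string& line) {
    DicEntry entry;
    std::istringstream in(line);
    std::string field;
    bool first = true;
    while (std::getline(in, field, '\t')) {
        if (first) {
            entry.word = field;
            first = false;
        } else if (field.compare(0, 3, "ph:") == 0) {
            entry.phonetic.push_back(field.substr(3));
        }
    }
    return entry;
}
```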
959 * tests/phone.*: test data for dictionary and rule based phonetic
962 * hunspell.cxx: fix potential bad memory access in allcap word
963 capitalization in suggest() (bug of previous version).
965 * hunspell.cxx, atypes.hxx: set correct limit for UTF-8 encoded
966 input words (256 byte).
968 * suggestmgr.cxx: improved REP suggestions with spaces: it works
969 without dictionary modification.
970 OOo issue 80147, reported by Davide Prina.
971 * tests/rep.*: new test data: higher priority for "alot" -> "a lot",
972 and Italian suggestion "un'alunno" -> "un alunno".
974 * affixmgr.cxx: fix Unicode ngram suggestions in expand_rootword().
975 (Suggestions with bad affixes.)
976 Bug reported by Vitaly Piryatinksy <piv dot v dot vitaly at gmail>.
977 * tests/ngram_utf_fix.*: test based on Vitaly Piryatinksy's data.
979 * suggestmgr.cxx: fix twowords() for last UTF-8 multibyte character.
980 (conditional jump or move depended on uninitialised value).
982 2007-08-29 Ingo H. de Boer <idb_winshell at SF.net>:
983 * win_api/{hunspell,libhunspell, testparser}.vcproj: new project
984 files for the library and the executables.
986 * Hunspell.rc, Hunspell.sln, config.h: updated versions.
987 Version number problem also reported by András Tímár.
989 2007-08-27 Németh László <nemeth at OOo>:
990 * suggestmgr.hxx: put fixed version. Bug report by Ingo H. de Boer.
992 * suggestmgr.cxx: remove variable-length local character array
993 reported by Ingo H. de Boer.
995 2007-08-27 Németh László <nemeth at OOo>:
996 * suggestmgr.hxx: change bad time_t to clock_t in header, too.
997 Bug reports or patches by Ingo H. de Boer under SF.net
998 Bug ID 1781951, János Mohácsi and Gábor Zahemszky, András Tímár,
999 OMax3 at SF.net under SF.net Bug ID 1781592.
1001 * phonet.*: change variable-length local character array to
1002 portable fixed size character array. Problem reported by
1003 Ingo H. de Boer under SF.net Bug ID 1781951 and
1006 * suggestmgr.cxx: remove debug message (also by
1009 2007-08-26 Ingo H. de Boer <idb_winshell at SF.net>:
1010 * win_api/Hunspell.vcproj: updated version (with phonet.*)
1012 2007-08-23 Németh László <nemeth at OOo>:
1013 * phonet.{c,h}xx, suggestmgr.cxx: PHONE parameter:
1014 pronunciation based suggestion using Björn Jacke's original Aspell
1015 phonetic transcription algorithm (http://aspell.net), relicensed
1016 under GPL/LGPL/MPL tri-license with the permission of the author.
1019 * affixmgr,suggestmgr.cxx: add KEY parameter for keyboard and
1020 input method error related suggestions.
1021 Example: KEY qwertyuiop|asdfghjkl|zxcvbnm
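One way to read the KEY example, as a hedged sketch: each letter of the misspelled word is replaced by its left or right neighbour within the same keyboard row, with "|" separating the rows. key_candidates is an illustrative name. The real suggestmgr.cxx code may differ, and it would also check every candidate against the dictionary.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Generate candidates by replacing each letter with an adjacent key.
// Rows in the KEY string are separated by '|', so neighbours never
// cross a row boundary. Works on single-byte characters only.
std::vector<std::string> key_candidates(const std::string& word,
                                        const std::string& key) {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < word.size(); ++i) {
        for (std::size_t k = 0; k < key.size(); ++k) {
            if (key[k] == '|' || key[k] != word[i]) continue;
            if (k > 0 && key[k - 1] != '|') {
                std::string c = word;
                c[i] = key[k - 1];  // left neighbour
                out.push_back(c);
            }
            if (k + 1 < key.size() && key[k + 1] != '|') {
                std::string c = word;
                c[i] = key[k + 1];  // right neighbour
                out.push_back(c);
            }
        }
    }
    return out;
}
```

For the input "wprd", the candidates include "word", because "o" is the left neighbour of "p" in the first row.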
1023 * man/hunspell.4: description about PHONE and KEY suggestion parameters.
1025 * suggestmgr.cxx: enhancements for better suggestions:
1026 - Set ngram suggestions for badchar-type errors
1027 and only two word and compound word suggestions, too.
1028 - Separate non-compound and compound word
1029 suggestions for MAP suggestion, too.
1030 - Double swap suggestions for short words.
1031 For example: ahev -> have, hwihc -> which.
1032 - Better time limits using clock() instead of time()
1033 (tenths of a second resolution instead of second ones).
1034 - leftcommonsubstring() weight function.
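The double swap item in the list above can be sketched as follows. This is a generalised illustration that tries every pair of non-overlapping adjacent swaps; the actual suggestmgr.cxx code is restricted to short words and may enumerate fewer candidates. double_swap_candidates is a hypothetical name.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Apply two non-overlapping swaps of adjacent characters and collect the
// results. A real checker would keep only candidates found in the dictionary.
std::vector<std::string> double_swap_candidates(const std::string& word) {
    std::vector<std::string> out;
    if (word.size() < 4) return out;
    for (std::size_t i = 0; i + 1 < word.size(); ++i) {
        for (std::size_t j = i + 2; j + 1 < word.size(); ++j) {
            std::string c = word;
            std::swap(c[i], c[i + 1]);
            std::swap(c[j], c[j + 1]);
            out.push_back(c);
        }
    }
    return out;
}
```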
1036 * htype.hxx, hashmgr.cxx: blen (byte length) and clen (character
1037 length) fields instead of wlen
1039 * affixmgr.cxx: fix get_syllable() for bad Unicode inputs.
1041 * tests/suggestiontest/*: test environment for suggestions
1043 2007-08-07 Martijn Wargers:
1044 * csutil.cxx: fix Mingw build error associated with ToUpper() call.
1045 Report and patch in Mozilla Issue 391447.
1047 2007-08-07 Robert Longson:
1048 * atypes.cxx: use empty inline function HUNSPELL_WARNING instead of
1049 variadic macros to switch off Hunspell warnings.
1050 Reported by Gavin Sharp in Mozilla Issue 391147.
1052 2007-08-05 Ginn Chen:
1053 * hashmgr.cxx: Hunspell failed to compile on OpenSolaris (use stdio
1054 instead of cstdio). Report and patch in Mozilla Issue 391040.
1056 2007-07-25 Németh László <nemeth at OOo>:
1057 * parsers/*.cxx: Hunspell executable recognises and accepts URLs,
1058 e-mail addresses, directory paths, reported by Jeppe Bundsgaard.
1059 * src/tools/hunspell.cxx: --check-url: new option of Hunspell program.
1060 Use --check-url, if you want to check URLs, e-mail addresses and paths.
1062 * parsers/textparser.cxx: strip colon at end of words for Finnish
1063 and Swedish (colon may be in words in Finnish and Swedish).
1064 Problem reported by Lars Aronsson.
1065 * tests/colons_in_words.*: test data
1067 * tests/digits_in_words.*: example for using digits in words
1068 (e.g. 1-jährig, 112-jährig etc. in German), reported by Lars Aronsson.
1070 * hashmgr.cxx: Hunspell accepts allcaps forms of mixed case
1071 words of personal dictionaries (+allcaps custom dictionary words with
1073 Sf.net Bug ID 1755272, reported by Ellis Miller.
1075 * hashmgr.cxx: fix small memory leaks with alias compressed
1076 dictionaries (free flag vectors of affixed personal dictionary words
1077 and flag vectors of hidden capitalized forms of mixed case and
1080 * affixmgr.cxx: fix COMPOUNDRULE checking with affixed compounds.
1081 Sf.net Bug ID 1706659, reported by Björn Jacke. Also fixing for
1082 OOo Issue 76067 (crash-like deceleration for hexadecimal numbers
1083 with long FFFFFF sequence using en_US dictionary).
1085 * tools/hunspell.cxx: add missing return to save_privdic().
1087 * man/hunspell.4: add information about affixation of personal words:
1088 "Personal dictionaries are simple word lists, but with optional
1089 word patterns for affixation, separated by a slash:
1094 In this example, "foo" and "Foo" are personal words, plus Foo
1095 will be recognised with affixes of Simpson (Foo's etc.)."
1097 2007-07-18 Németh László <nemeth at OOo>:
1098 * src/win_api/: add missing resource files, reported by Ingo H. de Boer.
1100 2007-07-16 Németh László <nemeth at OOo>:
1101 * hunspell.cxx: fix dot removing from UTF-8 encoded words in cleanword2()
1102 (Capitalised words with dots, as "Something." were not recognised
1103 using Unicode encoded dictionaries.)
1104 * tests/{base.*,base_utf.*}: extended and new test files for
1105 dot removing and Unicode support.
1107 * tools/hunspell.cxx: fix Cygwin, OS X compatibility using the
1108 platform-specific iconv() header via the ICONV_CONST macro of Autoconf.
1109 Sf.net Bug ID 1746030, reported by Mike Tian-Jian Jiang.
1110 Sf.net Bug ID 1753939, reported by Jean-Christophe Helary.
1112 * tools/hunspell.cxx: fix missing global path setting with -d option.
1114 * tests/test.sh: fix broken Valgrind checking (missing warnings
1115 with VALGRIND=memcheck make check).
1117 * csutil.cxx: fix condition in u8_u16() to avoid invalid read
1118 of not null-terminated character arrays (detected by Valgrind
1119 in Hunspell executable: associated with 8-bit character table
1120 conversion in tools/hunspell.cxx).
1122 * csutil.cxx: free_utf_tbl(): use utf_tbl_count-- instead of utf_tbl--.
1123 Memory leak in Hunspell executable detected by Valgrind.
1125 * hashmgr.cxx: add missing free_utf_tbl(), memory leak in Hunspell
1126 executable detected by Valgrind.
1128 * hashmgr.cxx: load_tables(): fix memory error in spec. capitalization.
1129 Use sizeof(unsigned short) instead of bad sizeof(unsigned short*).
1130 Invalid memory read detected by Valgrind.
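A hedged illustration of this class of bug, not the code of load_tables(): when a flag vector is copied, the byte count has to be based on the element type. With sizeof(unsigned short*) the count uses the pointer size (typically 4 or 8 bytes per element), so the copy reads past the end of the source array. copy_flags is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Duplicate a flag vector of len entries. The caller frees the result.
unsigned short* copy_flags(const unsigned short* flags, std::size_t len) {
    // Correct: size of one element. Writing sizeof(unsigned short*) here
    // would request and copy pointer-sized elements and read out of bounds.
    const std::size_t bytes = len * sizeof(unsigned short);
    unsigned short* copy = static_cast<unsigned short*>(std::malloc(bytes));
    if (!copy) return NULL;
    std::memcpy(copy, flags, bytes);
    return copy;
}
```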
1132 * hashmgr.cxx: add_word(): fix memory error in spec. capitalization.
1133 Update also affix array length of capitalized homonyms. Invalid
1134 memory read detected by Valgrind.
1136 * hunspell.cxx: suggest(): fix invalid memory write and leak.
1137 Bad realloc() and missing free() detected by Valgrind associated
1138 with suggestions for "something.The" type spelling errors.
1140 * {dictmgr,csutil,hashmgr,suggestmgr}.cxx: check memory allocation.
1141 Sf.net Bug ID 1747507, based on the patch by Jose da Silva.
1143 2007-07-13 Ingo H. de Boer <idb_winshell at SF.net>:
1144 * atypes.cxx: fix Visual C compatibility: Using
1145 "HUNSPELL_WARNING(a,b,...) {}" macro instead of empty "X(a,b...)".
1147 * hunspell.cxx: changes for Windows API.
1148 * win_api/Hunspell.*: new resource files
1149 * win_api/hunspelldll.*: set optional Hunspell and Borland spec. codes
1150 Sf.net Bug ID 1753802, patch by Ingo H. de Boer.
1151 See also Sf.net Bug ID 1751406, patch by Mike Tian-Jian Jiang.
1153 2007-07-09 Caolan McNamara <cmc at OO.o>:
1154 * {hunspell,hashmgr,affentry}.cxx: fix warnings of Coverity program
1155 analyzer. Sf.net Bug ID 1750219.
1157 2007-07-06 Németh László <nemeth at OOo>:
1158 * atypes.cxx: warning-free swallowing of conditional warning messages
1159 and their parameters using empty HUNSPELL_WARNING(a,b...) macro.
1160 * {affixmgr,atypes,csutil}.cxx: fix unused variable warnings
1161 using WARNVAR macro for conditionally named variables.
1162 * hashmgr.cxx: fix unused variable warning in add_word() by cond. name
1163 * hunspell.cxx: fix shadowed declaration of captype var. in suggest()
1165 2007-06-29 Caolan McNamara <cmc at OO.o>:
1166 * hunspell.cxx: patch to fix possible memory leak in analyze() of
1167 experimental morphological analyzer code. Sf.net Bug ID 1745263.
1169 2007-06-29 Németh László <nemeth at OOo>:
1171 * src/hunspell/hunspell.cxx: check bad capitalisation of Dutch letter IJ.
1172 - Sf.net Feature Request ID 1640985, reported by Frank Fesevur.
1173 - Solution: FORBIDDENWORD for capitalised word forms (need
1174 an improved Dutch dictionary with forbidden words: Ijs/*, etc.).
1175 * tests/IJ.*: test data and example.
1177 * hashmgr.cxx, hunspell.cxx: check capitalization of special word forms
1178 - words with mixed capitalisation: OpenOffice.org - OPENOFFICE.ORG
1179 Sf.net Bug ID 1398550, reported by Dmitri Gabinski.
1180 - allcap words and suffixes: UNICEF's - UNICEF'S
1181 - prefixes with apostrophe and proper names: Sant'Elia - SANT'ELIA
1182 For Catalan, French and Italian languages.
1183 Reported by Davide Prina in OOo Issue 68568.
1184 * tests/allcaps*: tests for OPENOFFICE.ORG, UNICEF'S capitalization.
1185 * tests/i68568*: tests for SANT'ELIA capitalization.
1187 * hunspell/hunspell.cxx: suggestion for missing sentence spacing:
1188 something.The -> something. The
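A minimal sketch of this suggestion type, under the assumption that a dot followed directly by an ASCII uppercase letter marks the missing space. The real suggest() code presumably also spell checks both parts. split_at_sentence_dot is an illustrative name.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Insert a space after the first dot that is directly followed by an
// uppercase ASCII letter. Returns an empty string if there is no such dot.
std::string split_at_sentence_dot(const std::string& word) {
    for (std::size_t i = 1; i + 1 < word.size(); ++i) {
        if (word[i] == '.' &&
            std::isupper(static_cast<unsigned char>(word[i + 1]))) {
            return word.substr(0, i + 1) + " " + word.substr(i + 1);
        }
    }
    return std::string();
}
```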
1190 * tools/hunspell.cxx: multiple character encoding support
1191 - -i option: custom input encoding
1192 Sf.net Bug ID 1610866, reported by Thobias Schlemmer.
1193 Sf.net Bug ID 1633413, reported by Dan Kenigsberg.
1194 See also hunspell-1.1.5-encoding.patch of Fedora from Caolan McNamara.
1195 * tests/*.test: add input encodings
1197 * tools/hunspell.cxx: use locale data for default dictionary names.
1198 Sf.net Bug ID 1731630, report and patch from Bernhard Rosenkraenzer,
1199 See also hunspell-1.1.4-defaultdictfromlang.patch of Fedora Linux
1200 from Caolan McNamara.
1202 * tools/hunspell.cxx: fix 8-bit tokenization (letters without
1203 casing, like ß or Hebrew characters now are handled well)
1205 * tools/hunspell.cxx: dictionary search path
1206 - DICPATH environmental variable
1207 - -D option: show directory path of loaded dictionary
1208 - automatic detection of OpenOffice.org directories
1211 * affixmgr.cxx: fault-tolerant patch for REP and other affix
1212 table data problems. Problem with Hunspell and en_GB dictionary
1213 reported by Thomas Lange in OOo Issue 76098 and
1214 Stephan Bergmann in OOo Issue 76100.
1215 Sf.net Bug ID 1698240, reported by Ingo H. de Boer.
1217 * csutil.cxx: fix mkallcap_utf() for allcaps suggestion in UTF-8.
1219 * suggestmgr.cxx: fix bad movechar_utf() (missing strlen()).
1221 * hunspell.cxx: fix bad degree sign detection in Unicode
1224 * hunspell/hunspell.cxx: free allocated memory of csconv in
1225 ported Mozilla code.
1226 - Mozilla Bugzilla Bug 383564, report and Mozilla MySpell patch
1227 by Andrew Geul. Reported by Ryan VanderMeulen for Hunspell.
1229 * suggestmgr.cxx: fix minor difference in Unicode suggestion
1230 (ngram suggestion of allcaps words in Unicode).
1232 * hashmgr.cxx: close file handle after errors.
1233 Sf.net Bug ID 1736286, reported by John Nisly.
1235 * configure.ac: syntax error (shell variable with spaces).
1236 Sf.net Bug ID 1731625, reported by Bernhard Rosenkraenzer.
1238 * hunspell.cxx: check_word(): fix bad usage of info pointer.
1240 * hashmgr.cxx: fix de_DE related bug (accept words with leading dash).
1241 Sf.net Bug ID 1696134, reported by Björn Jacke.
1243 * suggestmgr.cxx, tests/1695964.*: fix NEEDAFFIX homonym suggestion.
1244 Sf.net Bug ID 1695964, reported by Björn Jacke.
1246 * tests/1463589*: capitalized ngram suggestion test data for
1247 Sf.net Bug ID 1463589, reported by Frederik Fouvry.
1249 * csutil.cxx, affixmgr.cxx: fix possible heap error with
1250 multiple instances of utf_tbl.
1251 Sf.net Bug ID 1693875, reported by Ingo H. de Boer.
1253 * affixmgr.cxx, suggestmgr.cxx, license.hunspell: convert to ASCII.
1254 Locale dependent compiling problems. Sf.net Bug ID 1694379, reported
1255 by Mike Tian-Jian Jiang. OOo Issue 78018 reported by Thomas Lange.
1257 * tests/test.sh: compatibility issues
1258 - fix Valgrind support (check shared library instead of shell wrapper)
1259 - remove deprecated "tail +2" syntax
1260 - set 8-bit locale for testing (LC_ALL=C)
1262 * hunspell.hxx: remove license.* and config.h dependencies.
1263 - hunspell-1.1.5-badheader.patch from Caolan McNamara <cmc at OO.o>
1265 2007-03-21 Németh László <nemeth at OOo>:
1266 * tools/Makefile.am, munch.h, unmunch.h: add missing munch.h and unmunch.h
1267 Reported by Björn Jacke and Khaled Hosny (sf.net Bug ID 1684144)
1268 * hunspell/hunspell.cxx, hunspell.hxx: fix --with-ui compiling error (add get_csconv())
1269 Reported by Khaled Hosny (sf.net Bug ID 1685010)
1271 2007-03-19 Németh László <nemeth at OOo>:
1272 * csutil.cxx, hunspell/hunspell.cxx: Unicode non BMP area (>65K character range) support
1273 (except conditional patterns and strip characters of affix rules)
1274 * tests/utf8_nonbmp*: test data
1276 * src/hunspell/*: add Mozilla patches from David Einstein
1277 - run-time generated 8-bit character tables
1278 - other Mozilla related changes (see Mozilla Bugzilla Bug 319778)
1280 * csutil.cxx, affixmgr.cxx, hashmgr.cxx: optimized version of IGNORE feature
1281 - IGNORE works with affixes (except strip characters and affix conditions)
1282 * tests/ignore*: test data with latin characters
1283 * tests/ignoreutf*: Unicode test data with Arabic diacritics (Harakat)
1285 * src/hunspell/suggestmgr.cxx: new edit distance suggestion methods
1286 - capitalization: nasa -> NASA
1287 - long swap: permenant -> permanent
1288 - long move: Ghandi -> Gandhi
1289 - double two characters: vacacation -> vacation
1290 * tests/sug.*: test data
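The "double two characters" method from the list above can be illustrated with a short sketch. It removes one copy of any doubled two-character sequence; this is a simplified reading and not the exact algorithm of suggestmgr.cxx. double_two_chars_candidates is a hypothetical name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// For every position where a two-character sequence is immediately
// repeated, drop the second copy. Consecutive identical candidates are
// emitted only once.
std::vector<std::string> double_two_chars_candidates(const std::string& word) {
    std::vector<std::string> out;
    for (std::size_t i = 0; i + 4 <= word.size(); ++i) {
        if (word.compare(i, 2, word, i + 2, 2) == 0) {
            std::string c = word;
            c.erase(i + 2, 2);
            if (out.empty() || out.back() != c) out.push_back(c);
        }
    }
    return out;
}
```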
1292 * src/hunspell/affixmgr.cxx: space in REP strings (alot -> a lot)
1293 Note: the underscore character stands for the space in REP strings: REP alot a_lot, and
1294 put the expression with space ("a lot") into the dic file (see tests/sug).
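A sketch of the REP convention in the note above, assuming that only the replacement string carries the underscore, as in the "REP alot a_lot" example. RepRule, parse_rep and rep_candidates are illustrative names and do not mirror the affixmgr.cxx implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

struct RepRule {
    std::string from;
    std::string to;
};

// Parse a "REP from to" line; underscores in the replacement become spaces.
bool parse_rep(const std::string& line, RepRule& rule) {
    std::istringstream in(line);
    std::string keyword;
    if (!(in >> keyword >> rule.from >> rule.to) || keyword != "REP")
        return false;
    std::replace(rule.to.begin(), rule.to.end(), '_', ' ');
    return true;
}

// Apply the rule at every position where the pattern occurs.
std::vector<std::string> rep_candidates(const std::string& word,
                                        const RepRule& rule) {
    std::vector<std::string> out;
    if (rule.from.empty()) return out;
    for (std::size_t pos = word.find(rule.from); pos != std::string::npos;
         pos = word.find(rule.from, pos + 1)) {
        std::string c = word;
        c.replace(pos, rule.from.size(), rule.to);
        out.push_back(c);
    }
    return out;
}
```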
1296 * hashmgr.cxx, affixmgr.cxx: ignore Unicode byte order mark (BOM sequence)
1297 * tests/utf8_bom*: test data
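For illustration, skipping a UTF-8 byte order mark amounts to dropping the three bytes EF BB BF at the start of the first line. strip_utf8_bom is a hypothetical helper, not the function used in hashmgr.cxx or affixmgr.cxx.

```cpp
#include <cassert>
#include <string>

// Remove a leading UTF-8 BOM (bytes EF BB BF) if present.
std::string strip_utf8_bom(const std::string& line) {
    if (line.compare(0, 3, "\xEF\xBB\xBF") == 0) return line.substr(3);
    return line;
}
```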
1299 * hunspell/*.cxx: OOo Issue 68903 - Make lingucomponent warning-free on wntmsci10
1300 - fix Hunspell related warning messages on Windows platform (except some assignment
1301 within conditional expressions). Reported and started by Stephan Bergmann.
1303 * hunspell/affixmgr.cxx: fix OOo Issue 66683 - hunspell dmake debug=x fails
1304 - Reported by Stephan Bergmann.
1306 * src/hunspell/hunspell.[ch]xx: thread safe API for Hunspell executable
1307 (removing prev*() functions, new spell(word, info, root) function)
1309 * configure.ac, src/hunspell/*: HUNSPELL_EXPERIMENTAL code
1310 --with-experimental configure option (conditional compiling of morphological analyser
1313 * configure.ac, src/hunspell/*: conditional Hunspell warning messages
1314 --with-warnings configure option
1316 * affixmgr.cxx: new, optimized parsing functions
1318 * affixmgr.cxx: fix homonym handling for German dictionary project,
1319 reported by Björn Jacke (sf.net Bug ID 1592880).
1320 * tests/1592880.*: test data by Björn Jacke
1322 * src/hunspell/affixmgr.cxx: fix CIRCUMFIX suggestion
1323 Bug reported by Erdal Ronahi.
1325 * hunspell.cxx: reverse root word output (complex prefixes)
1326 Bug reported by Munzir Taha.
1328 * tools/hunspell.cxx: fix Emacs compatibility, patch by marot at sf.net
1329 - no % command in PIPE mode (SourceForge BugTracker 1595607)
1330 - fix HUNSPELL_VERSION string
1332 * suggestmgr.[hc]xx: rename check() functions to checkword() (OOo Issue 68296)
1333 adopt MySpell patch by Bryan Petty (tierra at ooo) for Hunspell source
1335 * csutil.cxx, munch.c, unmunch.c: adopt relevant parts of the MinGW patch
1336 (OOo Issue 42504) by tonal at ooo
1338 * affixmgr.cxx: remove double candidate_check() call, reported by Bram Moolenaar
1340 * tests/test.sh: add LC_ALL="C" environment. Locale dependency of make check
1341 reported by Gentoo project.
1343 * src/tools/hunspell.cxx: UTF-8 highlighting fix for console UI
1344 (not solved: breaking long UTF-8 lines)
1346 * src/tools/unmunch.c: fix bad generation if strip is shorter than condition,
1347 reported by Davide Prina
1348 * src/tools/unmunch.h: increase 5000 -> 500000
1350 * src/tools/hunspell.cxx: fix memory error in suggestion (uninitialized parameter),
1351 Bug also reported by Björn Jacke in SourceForge Bug 1469957
1353 * csutil.cxx, affixmgr.cxx: fix Caolan McNamara's patch for non OOo environment
1355 2006-11-11 Caolan McNamara <cmc at OO.o>:
1356 * csutil.cxx, affixmgr.cxx: UTF-8 table patch (OOo Issue 71449)
1357 Description: memory optimization (OOo doesn't use the large UTF-8 table).
1359 * Makefile.am: shared library patch (Sourceforge ID 1610756)
1361 * hunspell.h, hunspell.cxx: C API patch (Sourceforge ID 1616353)
1363 * hunspell.pc: pkgconfig patch (Sourceforge ID 1639128)
1365 2006-10-17 Ryan Jones <at Mozilla Bugzilla>:
1366 * affixmgr.cxx: missing fclose(affixlst) calls
1367 Reported by <gavins at ooo> in OOo Issue 70408
1369 2006-07-11 Taha Zerrouki <taha at gawab>:
1370 * affixmgr.cxx, hunspell.cxx, hashmgr.cxx, csutil.cxx: IGNORE feature to remove
1371 optional Arabic and other characters from input and dictionary words.
1372 * src/hunspell/langnum.hxx: add Arabic language number, lang_ar=96
1373 * tests/ignore.*: test data
1375 2006-05-28 Miha Vrhovnik <mvrhov at users.sourceforge>:
1376 * src/win_api/*: C API for Windows DLLs
1377 - also Delphi text editor example (see on Hunspell Sourceforge page)
1379 2006-05-18 Kevin F. Quinn <kevquinn at gentoo>:
1380 * utf_info.cxx: struct -> static struct
1381 Shared library patch also developed by Gentoo developers (Hanno Meyer-Thurow,
1382 Diego Pettenò, Kevin F. Quinn)
1384 2006-02-02 Németh László <nemethl@gyorsposta.hu>:
1385 * src/hunspell/hunspell.cxx: suggest(): replace "fooBar" -> "foo bar" suggestions
1386 with "fooBar" -> "foo Bar" (missing spaces are typical OCR bugs).
1387 Bug reported by stowrob at OOo in Issue 58202.
1388 * src/hunspell/suggestmgr.cxx: twowords(): permit 1-character words.
1389 (restore MySpell's original behavior). Here: "aNew" -> "a New".
1390 * tests/i58202.*: test data
1392 * src/parsers/textparser.cxx: fix Unicode tokenization in is_wordchar()
1393 (extra word characters (WORDCHARS) didn't work on big-endian platforms).
1395 * src/hunspell/{csutil,affixmgr}.cxx: inline isSubset(), isRevSubset():
1396 little speed optimization for languages with rich morphology.
1398 * src/tools/hunspell.cxx: fix bad --with-ui and --with-readline compiling
1399 when (N)curses is missing. Reported by Daniel Naber.
1401 2006-01-19 Tor Lillqvist <tml@novell.com>
1402 * src/hunspell/csutil.cxx: mystrsep(): fix locale-dependent isspace() tokenization
1404 2006-01-06 András Tímár <timar@fsf.hu>
1405 * src/hunspell/{hashmgr.hxx,hunspell.cxx}: fix Visual C++ compiling errors
1407 2006-01-05 Németh László <nemethl@gyorsposta.hu>:
1408 * COPYING: set GPL/LGPL/MPL tri-license for Mozilla integration.
1409 Rationale: Mozilla source code contains an old MySpell version
1410 with GPL/LGPL/MPL tri-license. (MPL license is a copyleft license, similar
1411 to the LGPL, but it acts on file level.)
1412 * COPYING.LGPL: GNU Lesser General Public License 2.1 (LGPL)
1413 * COPYING.MPL: Mozilla Public License 1.1 (MPL)
1414 * license.hunspell, src/hunspell/license.hunspell: GPL/LGPL/MPL tri-license
1416 * src/hunspell/{affixmgr,hashmgr}.*: AF, AM alias definitions in affix file:
1417 compression of flag sets and morphological descriptions (see manual,
1418 and tests/alias* test files).
1419 Rationale: Alias compression is also good for loading time and memory
1420 efficiency, not only smaller resources.
1421 * src/tools/makealias: alias compression utility
1422 (usage: ./makealias file.dic file.aff)
1423 * tests/alias{,2,3}: AF, AM tests
1424 * man/hunspell.4: add AF, AM documentation
1425 * src/hunspell/affentry.cxx, atypes.hxx: add new opts bits (aeALIASM, aeALIASF)
1427 * tools/hunspell, src/parser/*, src/hunspell/*: Hunspell program
1428 tokenizes Unicode texts (only with UTF-8 encoded dictionaries).
1429 Missing Unicode tokenization reported by Björn Jacke, Egmont Koblinger,
1430 Jess Body and others.
1431 Note: Curses interactive interface doesn't work perfectly yet.
1432 * tests/*.tests: remove -1 parameters of Hunspell
1433 * tests/*.{good,wrong}: remove tabulators
1435 * src/hunspell/{hunspell,affixmgr}.cxx: BREAK option: break words at
1436 specified break points and checking word parts separately (see manual).
1437 Note: COMPOUNDRULE is better (or will be better) for handling dashes and
1438 other compound joining characters or character strings. Use BREAK, if you
1439 want to check words with dashes or other joining characters and there is no time
1440 or possibility to describe precise compound rules with COMPOUNDRULE.
1441 * tests/break.*: BREAK example.
1443 * src/hunspell/{affixmgr,hunspell}.cxx: add CHECKSHARPS declaration instead
1444 of LANG de_DE definitions to handle German sharp s in both spelling and
1446 * src/hunspell/hunspell.cxx: With CHECKSHARPS, uppercase words are valid
1447 with both lower sharp s (it is optional for names in German legal texts)
1448 and SS (MÜßIG, MÜSSIG). Missing lower sharp s form reported by Björn Jacke.
1449 * src/hunspell/hunspell.cxx: KEEPCASE flag on a sharp s word has a special
1450 meaning with CHECKSHARPS declaration: KEEPCASE permits capitalisation and SS upper
1451 casing of a sharp s word (Müßig and MÜSSIG), but forbids the upper cased form
1452 with lower sharp s character(s): *MÜßIG.
1453 * tests/germancompounding*: add CHECKSHARPS, remove LANG
1454 * tests/checksharps*: add CHECKSHARPS and KEEPCASE, remove LANG
1456 * src/hunspell/hunspell.cxx: improved suggestions:
1457 - suggestions for pressed Caps Lock problems: macARONI -> macaroni
1458 - suggestions for long shift problems: MAcaroni -> Macaroni, macaroni
1459 - suggestions for KEEPCASE words: KG -> kg
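The case-related suggestions listed above can be sketched as candidate generation over ASCII letters. This is an assumption-laden illustration: the real code works with the dictionary encoding and keeps only candidates accepted by the spell checker (so "KG" would yield just "kg"). case_candidates is a hypothetical name.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Build lowercase and, for words starting with a capital, initial-capital
// variants of a word with mixed-up casing. ASCII only.
std::vector<std::string> case_candidates(const std::string& word) {
    std::vector<std::string> out;
    if (word.empty()) return out;
    std::string lower = word;
    for (std::size_t i = 0; i < lower.size(); ++i)
        lower[i] = static_cast<char>(
            std::tolower(static_cast<unsigned char>(lower[i])));
    if (std::isupper(static_cast<unsigned char>(word[0]))) {
        std::string cap = lower;
        cap[0] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(cap[0])));
        if (cap != word) out.push_back(cap);
    }
    if (lower != word) out.push_back(lower);
    return out;
}
```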
1460 * src/hunspell/csutil.cxx: fix mystrrep() function:
1461 - suggestions for lower sharp s in uppercased words: MÜßIG -> MÜSSIG
1462 * tests/checksharps{,utf}.sug: add tests for mystrrep() fix
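The sharp s suggestion depends on a correct replace-all routine. The sketch below shows one for std::string, assuming UTF-8 input where the sharp s is the byte pair C3 9F. It is not the mystrrep() implementation, which works on C strings; replace_all is an illustrative name.

```cpp
#include <cassert>
#include <string>

// Replace every occurrence of from with to, scanning left to right and
// continuing after each inserted replacement.
std::string replace_all(std::string text, const std::string& from,
                        const std::string& to) {
    if (from.empty()) return text;
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();
    }
    return text;
}
```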
1464 * src/hunspell/hashmgr.cxx: Now dictionary words can contain slashes
1465 with the "\/" syntax. Problem reported by Frederik Fouvry.
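A sketch of the escaping rule, assuming the usual "word/flags" layout of dictionary lines: the first unescaped slash separates the word from its flags, and "\/" stands for a literal slash. split_word_flags and the sample entry are illustrative only.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Split "word/flags" at the first unescaped slash. The sequence "\/" is
// unescaped to a literal '/' inside the word.
std::pair<std::string, std::string> split_word_flags(const std::string& line) {
    std::string word;
    for (std::size_t i = 0; i < line.size(); ++i) {
        if (line[i] == '\\' && i + 1 < line.size() && line[i + 1] == '/') {
            word += '/';
            ++i;  // skip the escaped slash
        } else if (line[i] == '/') {
            return std::make_pair(word, line.substr(i + 1));
        } else {
            word += line[i];
        }
    }
    return std::make_pair(word, std::string());
}
```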
1467 * src/hunspell/hunspell.cxx: fix bad duplicate filter in suggest().
1468 (Suggesting some capitalised compound words caused program crash
1469 with Hungarian dictionary, OOo Issue 59055).
1471 * src/hunspell/affixmgr.cxx: fix bad defcpd_check() call in compound_check().
1472 (Overlapping new COMPOUNDRULE and old compounding methods caused program
1473 crash at suggestion.)
1475 * src/hunspell/affixmgr.{cxx,hxx}: check affix flag duplication at affix classes.
1476 Suggested by Daniel Naber.
1478 * src/hunspell/affentry.cxx: remove unused variable declarations (OOo i58338).
1479 Compiler warnings reported by András Tímár and Martin Hollmichel.
1481 * src/hunspell/hunspell.cxx: morph(): do not analyse bad mixed uppercased forms
1482 (fix Arabic morphological analysis with Buckwalter's Arabic transliteration)
1484 * src/hunspell/affentry.{cxx,hxx}, atypes.hxx: little memory optimization
1486 - using unsigned char fields instead of short (stripl, appndl, numconds)
1487 - rename xpflg field to opts
1488 - removing utf8 field, use aeUTF8 bit of opts field
1490 * configure.ac: set tests/maputf.test to XFAILED on ARM platform.
1491 Fail reported by Rene Engelhard.
1493 * configure.ac: link Ncursesw library, if exists.
1495 * BUGS: add BUGS file
1497 * tests/complexprefixes2.*: test for morphological analysis with COMPLEXPREFIXES
1499 * src/hunspell/affixmgr.cxx: use "COMPOUNDRULE" instead of
1500 "COMPOUND". The new name suggested by Bram Moolenaar.
1501 * tests/compoundrule*: modified and renamed compound.* test files
1503 * man/hunspell.4: AF, AM, BREAK, CHECKSHARPS, COMPOUNDRULE, KEEPCASE.
1504 - also new addition to the documentation:
1505 Header of the dictionary file defines approximate dictionary size:
1506 ``A dictionary file (*.dic) contains a list of words, one per line.
1507 The first line of the dictionaries (except personal dictionaries)
1508 contains the _approximate_ word count (for optimal hash memory size).''
1509 Asked by Frederik Fouvry.
1511 One-character replacements in REP definitions: ``It's very useful to
1512 define replacements for the most typical one-character mistakes, too:
1513 with REP you can add higher priority to a subset of the TRY suggestions
1514 (suggestion list begins with the REP suggestions).''
1516 2005-11-11 Németh László <nemethl@gyorsposta.hu>:
1517 * src/hunspell/affixmgr.*: fix Unicode MAP errors (sorted only n-1
1518 characters instead of n ones in UTF-16 MAP character lists).
1519 Bug reported by Rene Engelhard.
1521 * src/hunspell/affixmgr.*: fix infinite COMPOUND matching (default char
1522 type is unsigned on PowerPC, s390 and ARM platforms and it will never
1523 be negative). Bug reported by Rene Engelhard.
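The class of bug described in this entry can be illustrated with a backward scan whose index is expected to go negative. The sketch below is not the affixmgr code; the function name is hypothetical. It shows why an explicitly signed index type keeps the loop finite on every platform.

```cpp
#include <cassert>
#include <limits>

// True where plain char is signed, false where it is unsigned
// (the entry above names PowerPC, s390 and ARM as unsigned platforms).
constexpr bool kPlainCharIsSigned = std::numeric_limits<char>::is_signed;

// Hypothetical illustration: count the iterations of a backward scan from
// `start` down to 0. Written with a plain `char i`, the test `i >= 0` is
// always true where char is unsigned, so the loop never ends. An explicitly
// signed type makes the termination condition portable.
int count_backward_steps(int start) {
    assert(start >= 0 && start <= 127);  // must fit into signed char
    int steps = 0;
    for (signed char i = static_cast<signed char>(start); i >= 0; --i) {
        ++steps;
    }
    return steps;
}
```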
1525 * src/hunspell/{affixmgr,suggestmgr}.cxx: fix bad ONLYINCOMPOUND
1527 * tests/onlyincompound.sug: empty test file to check this fix.
1528 Bug reported by Björn Jacke.
1530 * src/hunspell/affixmgr.cxx: fix backtracking in COMPOUND pattern matching.
1531 * tests/compound6.*: test files to check this fix.
1533 * csutil.cxx: set bigger range types in flag_qsort() and flag_bsearch().
1535 * affixmgr.hxx: set better type for cont_classes[] Boolean data (short -> char)
1537 * configure.ac, tests/automake.am: set platform specific XFAIL test
1538 (flagutf8.test on ARM platform)
1540 2005-11-09 Németh László <nemethl@gyorsposta.hu>:
1542 * src/hunspell/affixmgr.*: new and improved affix file parameters:
1544 - COMPOUND definitions: compound patterns with regexp-like matching.
1545 See manual and test files: tests/compound*.*
1546 Suggested by Bram Moolenaar.
1547 Also useful for simple word-level lexical scanning, for example
1548 analysing numbers or words with numbers (OOo Issue #53643):
1549 http://qa.openoffice.org/issues/show_bug.cgi?id=53643
1550 Examples: tests/compound{4,5}.*.
1552 - NOSUGGEST flag: words signed with NOSUGGEST flag are not suggested.
1553 Proposed flag for vulgar and obscene words (OOo Issue #55498).
1554 Example: tests/nosuggest.*.
1555 Problem reported by bobharvey at OOo:
1556 http://qa.openoffice.org/issues/show_bug.cgi?id=55498
1558 - KEEPCASE flag: Forbid capitalized and uppercased forms of words
1559 signed with KEEPCASE flags. Useful for special orthographies
1560 (measurements and currency often keep their case in uppercased
1561 texts) and other writing systems (eg. keeping lower case of IPA
1564 - CHECKCOMPOUNDCASE: Forbid upper case characters at word bound in compounds.
1565 Examples: tests/checkcompoundcase* and tests/germancompounding.*
1567 - FLAG UTF-8: New flag type: Unicode character encoded with UTF-8.
1568 Example: tests/flagutf8.*.
1569 Rationale: Unicode character type can be more readable
1570 (in a Unicode text editor) than `long' or `num' flag type.
1573 * src/hunspell/hunspell.cxx: accept numbers and numbers with separators (i53643)
1574 Bug reported by skelet at OOo:
1575 http://qa.openoffice.org/issues/show_bug.cgi?id=53643
1577 * src/hunspell/csutil.cxx: fix casing data in ISO 8859-13 character table.
1579 * src/hunspell/csutil.cxx: add ISO-8859-15 character encoding (i54980)
1580 Rationale: ISO-8859-15 is the default encoding of the French OpenOffice.org
1581 dictionary. ISO-8859-15 is a modified version of ISO-8859-1
1582 (latin-1) character encoding with French œ ligatures and euro
1583 symbol. Problem reported by cbrunet at OOo in OOo Issue 54980:
1584 http://qa.openoffice.org/issues/show_bug.cgi?id=54980
1586 * src/hunspell/affixmgr.cxx: fix zero-byte malloc after a bad affix header.
1587 Patch by Harri Pitkänen.
1589 * src/hunspell/suggestmgr.cxx: fix bad NEEDAFFIX word suggestion
1590 in ngram suggestions. Reported by Daniel Naber and Friedel Wolff.
1592 * src/hunspell/hashmgr.cxx: fix bad white space checking in affix files.
1593 src/hunspell/{csutil,affixmgr}.cxx: add other white space separators.
1594 Problems with tabulators reported by Frederik Fouvry.
1596 * src/hunspell/*: replace system-dependent <license.*> #include
1597 parameters with quoted ones. Problem reported by Dafydd Jones.
1599 * src/hunspell/hunspell.cxx: fix missing morphological analysis of dot(s)
1600 Reported by Trón Viktor.
1603 * src/hunspell/affixmgr.cxx: rename PSEUDOROOT to NEEDAFFIX.
1604 Suggested by Bram Moolenaar.
1606 * src/hunspell/suggestmgr.hxx: Increase default maximum of
1607 ngram suggestions (3->5). Suggested by Kevin Hendricks.
1609 * src/hunspell/htypes.hxx: Increase MAXDELEN for long affix flags.
1611 * src/hunspell/suggestmgr.cxx: modify (perhaps fix) Unicode map suggestion.
1612 tests/maputf test fail on ARM platform reported by Rene Engelhard.
1614 * src/hunspell/{affentry.cxx,atypes.hxx}: remove [PREFIX] and
1615 MISSING_DESCRIPTION messages from morphological analysis.
1616 Problems reported by Trón Viktor.
1618 * tests/germancompounding.{aff,good}: Add "Computer-Arbeit" test word.
1619 Suggested by Daniel Naber.
1621 * doc/man/hunspell.4: Proof-reading patch by Goldman Eleonóra.
1623 * doc/man/hunspell.4: Fix bad affix example (replace `move' with `work').
1624 Bug reported by Frederik Fouvry.
1626 * tests/*: new test files:
1627 affixes.*: simple affix compression example from Hunspell 4 manual page
1628 checkcompoundcase.*, checkcompoundcase2.*, checkcompoundcaseutf.*
1629 compound.*, compound2.*, compound3.*, compound4.*, compound5.*
1630 compoundflag.* (former compound.*)
1631 flagutf8.*: test for FLAG UTF-8
1632 germancompounding.*: simplification with CHECKCOMPOUNDCASE.
1633 germancompoundingold.* (former germancompounding.*)
1634 i53643.*: check numbers with separators
1635 i54980.*: ISO8859-15 test
1636 keepcase.*: test for KEEPCASE
1637 needaffix*.* (former pseudoroot*.* tests)
1638 nosuggest.*: test for NOSUGGEST
1640 2005-09-19 Németh László <nemethl@gyorsposta.hu>:
1641 * src/hunspell/suggestmgr.cxx: improved ngram suggestion:
1642 - detect non-neighboring swapped characters (pernament -> permanent)
1643 Rationale: the ngram method has a significant error with non-neighboring
1644 swapped characters, especially when the swap is in the middle of the word.
1645 - suggest uppercase forms (unesco -> UNESCO, siggraph's -> SIGGRAPH's)
1646 - suggest only ngram swap character and uppercase form, if they exist.
1647 Rationale: swap character and casing equivalence give much better
1648 suggestions than any other (weighted) ngram suggestions.
1649 - add uppercase suggestion (PERMENANT -> PERMANENT)
1651 * src/hunspell/*: complete comparison with MySpell 3.2 (in OOo beta 2):
1652 - affixmgr.cxx: add missing numrep initialization
1653 - hashmgr.cxx: add_word(): don't allocate temporary records
1654 - hunspell.cxx: in suggest():
1655 - check capitalized words first (better sug. order for proper names),
1656 - check pSMgr->suggest() return value
1657 - make the pSMgr->suggest() call non-optional in HUHCAP
1658 - csutil.cxx: fix bad KOI8-U -> koi8r_tbl reference in enc_entry encds
1659 - csutil.cxx: fix casing data in ISO 8859-2, Windows 1251 and KOI8-U
1660 encoding tables. Bug reported by Dmitri Gabinski.
1662 * src/hunspell/affixmgr.*: improved compound word and other features
1663 - generalize hu_HU specific compound word features with new affix file
1664 parameters, suggested by Bram Moolenaar:
1665 - CHECKCOMPOUNDDUP: forbid word duplication in compounds (eg. foo|foo)
1666 - CHECKCOMPOUNDTRIPLE: forbid triple letters in compounds (eg. foo|obar)
1667 - CHECKCOMPOUNDPATTERN: forbid patterns at word bounds in compounds
1668 - CHECKCOMPOUNDREP: using REP replacement table, forbid presumably bad
1669 compounds (useful for languages with unlimited number of compounds)
1670 - ONLYINCOMPOUND flag works also with words (see tests/onlyincompound.*)
1671 Suggested by Daniel Naber, Björn Jacke, Trón Viktor & Bram Moolenaar.
1672 - PSEUDOROOT works also with prefixes and prefix + suffix combinations
1673 (see tests/pseudoroot5.*). Suggested by Trón Viktor.
1674 - man/hunspell.4: updated man page
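The CHECKCOMPOUNDDUP and CHECKCOMPOUNDTRIPLE rules listed above can be sketched as two boundary checks on adjacent compound parts. Both function names are hypothetical and the code compares bytes, so it is only a simplified illustration for 8-bit encodings, not the affixmgr implementation.

```cpp
#include <cstddef>
#include <string>

// CHECKCOMPOUNDDUP idea: the same word twice in a row (foo|foo).
bool is_duplicated(const std::string& left, const std::string& right) {
    return left == right;
}

// CHECKCOMPOUNDTRIPLE idea: three identical letters spanning the word
// boundary, either xx|x or x|xx (foo|obar -> "fooobar").
bool has_triple_at_boundary(const std::string& left, const std::string& right) {
    if (left.empty() || right.empty()) return false;
    const char last = left[left.size() - 1];
    const char first = right[0];
    if (last != first) return false;  // no run crosses the boundary
    const bool two_on_left = left.size() >= 2 && left[left.size() - 2] == last;
    const bool two_on_right = right.size() >= 2 && right[1] == first;
    return two_on_left || two_on_right;
}
```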
1676 * src/hunspell/affixmgr.*: fix incomplete prefix handling with twofold
1677 suffixes (delete unnecessary contclasses[] conditions in
1678 prefix_check_twosfx() and prefix_check_twosfx_morph()).
1679 Bug reported by Trón Viktor.
1681 * src/hunspell/affixmgr.*: complete also *_morph() functions with
1682 conditions of new Hunspell features (circumfix, pseudoroot etc.).
1684 * src/hunspell/suggestmgr.cxx:
1685 - fix missing suggestions for words with crossed prefix and suffix
1686 - fix redundant non-compound word checking
1687 - fix losing suggestions problem. Bug reported by Dmitri Gabinski.
1689 * src/hunspell/dictmgr.*:
1690 - add new dictionary manager for Hunspell UNO module
1691 Problems with eo_ANY Esperanto locale reported by Dmitri Gabinski.
1693 * src/hunspell/*: use precise constant sizes for 8-bit and 16-bit character
1694 arrays with MAXWORDUTF8LEN and MAXSWUTF8L macros.
1696 * src/hunspell/affixmgr.cxx: fix bad MAXNGRAMSUGS parameter handling
1698 * src/hunspell/affixmgr.cxx, src/tools/{un}munch.*: fix GCC 4.0 warnings
1699 on fgets(), reported by Dvornik László
1701 * po/hu.po: improved translation by Dvornik László
1703 * tests/test.sh: improved test environment
1704 - add suggestion testing (see tests/*.sug)
1705 - add memory debugging environment, based on the excellent Valgrind debugger.
1706 Usage on Linux and experimental platforms of Valgrind:
1707 VALGRIND=memcheck make check
1708 - rename test_hunmorph to test.sh
1710 * tests/*: new tests:
1711 - base.*: base example based on MySpell's checkme.lst.
1712 - map{,utf}.*, rep{,utf}: MAP and REP suggestion examples
1713 - tests on new CHECKCOMPOUND, ONLYINCOMPOUND and PSEUDOROOT features
1714 - i54633.*: capitalized suggestion test for Issue 54633 from OOo's Issuezilla
1715 - i35725.*: improved ngram suggestion test for Issue 35725
1717 2005-08-26 Németh László <nemethl@gyorsposta.hu>:
1720 * src/hunspell/suggestmgr.cxx:
1721 Unicode support in related character map suggestion
1723 * src/hunspell/suggestmgr.cxx: Unicode support in ngram suggestion
1725 * src/hunspell/{suggestmgr,affixmgr,hunspell}.cxx: improve ngram suggestion.
1726 Fix http://qa.openoffice.org/issues/show_bug.cgi?id=35725. See release
1727 notes for examples. This problem reported by beccablain at OOo.
1728 - ngram suggestions now are case insensitive (see `Permenant' bug in Issuezilla)
1729 - weight ngram suggestions (with the longest common subsequence algorithm,
1730 also considering lengths of bad word and suggestion, identical first
1731 letters and almost completely identical character positions)
1732 - set strict affix congruency in expand_rootword(). Now ngram suggestions
1733 are good for languages with rich morphology and also better for English.
1734 Rationale: affixed forms of the first ngram suggestion
1735 very often suppress the second and subsequent root word suggestions. But
1736 faults in affixes are less common, and can be fixed without suggestions.
1737 We must prefer the more informative second and subsequent root word
1738 suggestions instead of the suggestions for bad affixes.
1739 - a better suggestion may not be substring of a less good suggestion
1740 Rationale: Suggesting affixed forms of a root word is
1741 unnecessary, when root word has got better weighted ngram value.
1742 (Checking substrings is a good approximation for this refinement.)
1743 - fewer ngram suggestions (default maximum of 3 instead of 10)
1744 Rationale: Users need a big extra effort to check a lot of bad ngram
1745 suggestions, nine times out of ten unnecessarily. It is very
1746 distracting, because ngram suggestions can be very different.
1747 Usually MySpell and Hunspell give one or two suggestions with
1748 the old suggestion algorithms (maximum is 15), but the ngram algorithm
1749 often gives the maximum number of suggestions. With strict affix congruency
1750 and other refinements, the good suggestion is usually among the
1751 first three elements.
1752 - new affix parameter: MAXNGRAMSUGS
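The weighting above relies on the length of the longest common subsequence of the misspelled word and a candidate. A minimal sketch of that measure follows, using the textbook dynamic programming method with two rows. The function name is hypothetical and the code compares bytes; how suggestmgr.cxx combines this value with the other weights is not shown here.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Length of the longest common subsequence of a and b (byte-wise).
// Only two rows of the DP table are kept at any time.
std::size_t lcs_length(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1, 0);
    std::vector<std::size_t> cur(b.size() + 1, 0);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            cur[j] = (a[i - 1] == b[j - 1])
                         ? prev[j - 1] + 1                 // extend a match
                         : std::max(prev[j], cur[j - 1]);  // skip one char
        }
        std::swap(prev, cur);  // the finished row becomes the previous row
    }
    return prev[b.size()];
}
```

For the pair "permenant" and "permanent", this gives 7 of 9 characters.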
1754 * src/hunspell/*: support agglutinative languages with rich prefix
1755 morphology or with right-to-left writing system (for example, Turkic
1756 and Austronesian languages with (modified) Arabic scripts).
1757 - new affix parameter: COMPLEXPREFIXES
1758 Set twofold prefix stripping (but single suffix stripping)
1759 * src/hunspell/affixmgr.cxx:
1760 - speed up prefix loading with tree sorting algorithm.
1761 * tests/complexprefixes.*, tests/complexprefixesutf.*:
1762 Coptic example posted by Moheb Mekhaiel
1764 * src/hunspell/hashmgr.cxx: check size attribute in dic file
1765 suggested by Daniel Naber
1766 Rationale: With a missing size attribute, Hunspell allocates a hash table
1767 that is too small and slower, and it can lose the first dictionary word.
1769 * src/hunspell/affixmgr.cxx: check stripping characters and condition
1770 compatibility in affix rules (bugs detected in cs_CZ, es_ES, es_NEW,
1771 es_MX, lt_LT, nn_NO, pt_PT, ro_RO and sk_SK dictionaries). See release
1772 notes of Hunspell 1.0.9 in NEWS.
1774 * src/hunspell/affixmgr.cxx: check unnecessary fields in affix rules
1775 (bugs detected in ro_RO and sv_SE dictionaries). See release notes.
1777 * src/hunspell/affixmgr.cxx: remove redundant condition checking
1778 in affix rules with stripping characters (redundancy in OpenOffice.org
1779 dictionaries reported by Eleonóra Goldman)
1780 Rationale: this is a little optimization, but it was excellent for
1781 detecting bad ngram affixation with bad or weak affix conditions.
1783 * tests/germancompounding.aff: improve compound definition
1784 - use dash prefix instead of language specific tokenizer
1785 Rationale: Using a uniform approach is the right way to check and analyze
1786 compound words. Language specific word breaking is deprecated; it needs
1787 sophisticated grammar checking for word-like word pairs
1788 (for example, in Hungarian there is a substandard but accepted
1789 syntax with a dash for word pairs: cats, dogs -> kutyák-macskák (like
1790 cats/dogs in English)).
1792 * test Hunspell with 54 OpenOffice.org dictionaries: see release notes
1796 * src/hunspell/suggestmgr.*: add time limit to exponential
1797 algorithm of the related character map suggestion
1798 Rationale: a long word in agglutinative languages or a special pattern
1799 (for example a horizontal rule) made of map characters can `crash' the
1802 * src/hunspell/affentry.cxx: add() functions: fix bad word generation
1803 checking stripping characters (see similar bug in unmunch)
1805 * src/hunspell/affixmgr.cxx: parse_file(): fix unconditional getNext()
1806 call for ~AffixMgr() when affix file is corrupt.
1808 * src/hunspell/affixmgr.*: AffixMgr(), parse_cpdsyllable(): fix missing
1809 string duplications for ~AffixMgr() when affix file is corrupt.
1811 * src/hunspell/affixmgr.*: parse_affix(): fix fprintf() call when affix
1812 file is corrupt. Bug reported by Daniel Naber.
1814 * suggestmgr.cxx: replace single usage of 'strdup' with 'mystrdup'
1815 patch by Chris Halls (debian.org)
1817 * src/hunspell/makefile.mk: add makefile.mk for compiling in OpenOffice.org
1818 See README in Hunspell UNO module.
1819 Problems with separated compiling reported by Rene Engelhard
1821 * src/hunspell/hunspell.cxx: fix pseudoroot support
1822 - search a not pseudoroot homonym in check()
1823 * tests/pseudoroot4.*: test this fix
1825 * src/tools/unmunch.c: fix bad word generation when conditions
1826 are shorter or incompatible with stripping characters in affix rules
1828 * src/tools/unmunch.c: fix mychomp() for de_AT.dic and other dic files
1829 without last new line character.
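A sketch of a line-ending trimmer that also copes with a last line lacking a newline, which is the case named in this entry. The function name chomp() is hypothetical; it is not the mychomp() from unmunch.c.

```cpp
#include <string>

// Remove a trailing '\n' if present, then a trailing '\r' if present.
// A line with no line ending at all is returned unchanged.
std::string chomp(std::string line) {
    if (!line.empty() && line.back() == '\n') line.pop_back();
    if (!line.empty() && line.back() == '\r') line.pop_back();
    return line;
}
```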
1832 * src/hunspell/suggestmgr.*: erase ACCENT suggestion
1833 Rationale: ACCENT suggestion was the same as Kevin Hendricks's map
1834 suggestion algorithm, but with a less good interface in affix file.
1836 * src/hunspell/suggestmgr.*: combine cycle number limit
1837 in badchar(), and forgotchar() with a time limit.
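One way to combine a cycle limit with a time limit is to cap the number of iterations and consult the clock only every few cycles, so the timer call stays cheap. The sketch below assumes std::chrono; the helper run_bounded() and its parameters are hypothetical, not the code used in badchar() or forgotchar().

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper: call step(i) for i = 0, 1, ... until either
// max_cycles calls were made or the time budget is used up. The clock is
// read only every check_interval cycles. Returns the number of calls made.
template <typename Step>
std::size_t run_bounded(Step step, std::size_t max_cycles,
                        std::chrono::milliseconds budget,
                        std::size_t check_interval = 100) {
    assert(check_interval > 0);
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < max_cycles; ++i) {
        if (i % check_interval == 0 &&
            std::chrono::steady_clock::now() - start >= budget) {
            return i;  // time limit reached
        }
        step(i);
    }
    return max_cycles;  // cycle limit reached
}
```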
1839 * src/hunspell/affixmgr.*: remove NOMAPSUGS affix parameter
1841 * src/hunspell/{suggestmgr,hunspell}.*: strip periods from
1842 suggestions (restore MySpell's original behaviour)
1843 Rationale: OpenOffice.org has an automatic period handling mechanism
1844 and suggestions look better without periods.
1845 - new affix file parameter: SUGSWITHDOTS
1846 Add period(s) to suggestions, if input word terminates in period(s).
1847 (No need for OpenOffice.org dictionaries.)
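The period handling described above can be sketched as follows: suggestions are returned without periods by default, and the trailing periods of the input are appended again only when the SUGSWITHDOTS behaviour is requested. Both function names and the boolean parameter are hypothetical; this is not the suggestmgr or hunspell.cxx code.

```cpp
#include <cstddef>
#include <string>

// Number of '.' characters at the end of the word.
std::size_t count_trailing_dots(const std::string& word) {
    std::size_t n = 0;
    while (n < word.size() && word[word.size() - 1 - n] == '.') ++n;
    return n;
}

// Return the suggestion as is, or with the input's trailing periods
// appended when sugs_with_dots is set.
std::string format_suggestion(const std::string& suggestion,
                              const std::string& input,
                              bool sugs_with_dots) {
    if (!sugs_with_dots) return suggestion;
    return suggestion + std::string(count_trailing_dots(input), '.');
}
```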
1849 * tests/germancompounding.aff: improve bad German affix in affix example
1850 (computeren->computern). Suggested by Daniel Naber.
1852 * src/tools/example.cxx: add Myspell's example
1854 * src/tools/munch.cxx: add Myspell's munch
1856 * man{,/hu}/hunspell.4: refresh manual pages
1858 2005-08-01 Németh László <nemethl@gyorsposta.hu>:
1859 * add missing MySpell files and features:
1860 - add MySpell license.readme, README and CONTRIBUTORS ({license,README,AUTHORS}.myspell)
1861 - add MySpell unmunch program (src/tools/unmunch.c)
1862 - add licenses to source (src/hunspell/license.{myspell,hunspell})
1863 - port MAP suggestion (with imperfect UTF-8 support)
1864 - add NOSPLITSUGS affix parameter
1865 - add NOMAPSUGS affix parameter
1867 * src/man/man.4: MAP, COMPOUNDPERMITFLAG, NOSPLITSUGS, NOMAPSUGS
1869 * src/hunspell/aff{entry,ixmgr}.cxx:
1870 - improve compound word support
1871 - new affix parameter: COMPOUNDPERMITFLAG (see manual)
1872 * src/tests/compoundaffix{,2}.*: examples for COMPOUNDPERMITFLAG
1873 * src/tests/germancompounding.*: new solution for German compounding
1874 Problems with German compounding reported by Daniel Naber
1876 * src/hunspell/hunspell.cxx: fix German uppercase word spelling
1877 with the spellsharps() recursive algorithm.
1878 Default recursive depth is 5 (MAXSHARPS).
1879 * src/tests/germansharps*: extended German sharp s tests
1881 * src/tools/hunspell.cxx: fix fatal memory bug in non-interactive
1882 subshells without HOME environment variable
1883 Bug detected with PHP by András Izsók.
1885 2005-07-22 Németh László <nemethl@gyorsposta.hu>:
1886 * src/hunspell/csutil.hxx: utf16_u8()
1887 - fix 3-byte UTF-8 character conversion
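For reference, the standard UTF-8 encoding of one UTF-16 code unit from the Basic Multilingual Plane looks like this, including the 3-byte case named in this entry. The function name is hypothetical and surrogate pairs are deliberately not handled; this is not the utf16_u8() implementation.

```cpp
#include <string>

// Encode one BMP code unit as UTF-8 (1, 2 or 3 bytes).
// Assumption: c is not part of a surrogate pair.
std::string bmp_to_utf8(char16_t c) {
    std::string out;
    if (c < 0x80) {            // 0xxxxxxx
        out += static_cast<char>(c);
    } else if (c < 0x800) {    // 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (c >> 6));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else {                   // 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (c >> 12));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    }
    return out;
}
```

For example, U+20AC (euro sign) becomes the three bytes E2 82 AC.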
1889 2005-07-21 Németh László <nemethl@gyorsposta.hu>:
1890 * src/hunspell/csutil.hxx: hunspell_version() for OOo UNO module
1892 2005-07-19 Németh László <nemethl@gyorsposta.hu>:
1894 - src/morphbase -> src/hunspell
1895 - src/hunspell, src/hunmorph -> src/tools
1896 - src/huntokens -> src/parsers
1898 * src/tools/hunstem.cxx: add stemmer example
1900 2005-07-18 Németh László <nemethl@gyorsposta.hu>:
1901 * configure.ac: --with-ui, --with-readline configure options
1902 * src/hunspell/hunspell.cxx: fix conditional compiling
1904 * src/hunspell/hunspell.cxx: set HunSPELL.bak temporary file
1905 in the same directory as the checked file.
1907 * src/morphbase/morphbase.cxx:
1909 - handling German sharp s (ß)
1911 - fix (temporarily) analyize()
1913 * tests: a lot of new tests
1915 * po/, intl/, m4/: add gettext from GNU hello
1917 * po/hu.po: add Hungarian translation
1919 * doc/, man/: rename doc to man
1921 2005-07-04 Németh László <nemethl@gyorsposta.hu>:
1922 * src/morphbase/hashmgr.cxx: set FLAG attribute instead of FLAG_NUM and FLAG_LONG
1924 * doc/hunspell.4: manual in English
1926 2005-06-30 Németh László <nemethl@gyorsposta.hu>:
1927 * src/morphbase/csutil.cxx: add character tables from csutil.cxx of OOo 1.1.4
1929 * src/morphbase/affentry.cxx: fix Unicode condition checking
1931 * tests/{,utf}compound.*: tests compounding
1933 2005-06-27 Németh László <nemethl@gyorsposta.hu>:
1934 * src/morphbase/*: fix Unicode compound handling
1936 2005-06-23 Halácsy Péter:
1937 * src/hunmorph/hunmorph.cxx: delete spelling error message and suggest_auto() call
1939 2005-06-21 Németh László <nemethl@gyorsposta.hu>:
1940 * src/morphbase: Unicode support
1941 * tests/utf8.*: SET UTF-8 test
1943 * src/morphbase: checking and fixing with Valgrind
1944 Memory handling error reported by Ferenc Szidarovszky
1946 2005-05-26 Németh László <nemethl@gyorsposta.hu>:
1947 * suggestmgr.cxx: fix stemming
1948 * AUTHORS, COPYING, ChangeLog: set CC-LGPL free software license
1950 2004-05-25 Varga Dániel <daniel@all.hu>
1951 * src/stemtool: new subproject
1953 2005-05-25 Halácsy Péter <peter@halacsy.com>
1954 * AUTHORS, COPYING: set CC Attribution license
1956 2004-05-23 Varga Dániel <daniel@all.hu>
1957 * src: - modifications for compiling with Visual C++
1959 * src/hunmorph/csutil.cxx: correcting header of flag_qsort(),
1960 * src/hunmorph/*: correct csutil include
1962 2005-05-19 Németh László <nemethl@gyorsposta.hu>
1963 * csutil.cxx: fix loop condition in lineuniq()
1964 bug reported by Viktor Nagy (nagyv nyelvtud hu).
1966 * morphbase.cxx: handle PSEUDOROOT with zero affixes
1967 bug reported by Viktor Nagy (nagyv nyelvtud hu).
1968 * tests/zeroaffix.*: add zeroaffix tests
1970 2005-04-09 Németh László <nemethl@gyorsposta.hu>
1971 * config.h.in: reset with autoheader
1973 * src/hunspell/hunspell.cxx: set version
1975 2005-04-06 Németh László <nemethl@gyorsposta.hu>
1979 New optional parameters in affix file:
1980 - PSEUDOROOT: forbids a root while allowing its suffixed forms.
1981 - COMPOUNDWORDMAX: max. words in compounds (default is no limit)
1982 - COMPOUNDROOT: signs compounds in dictionary for handling special compound rules
1983 - remove COMPOUNDWORD, ONLYROOT
1985 2005-03-21 Németh László <nemethl@gyorsposta.hu>
1987 - 2-byte flags, FLAG_NUM, FLAG_LONG
1988 - CIRCUMFIX: signed suffixes and prefixes can only occur together
1989 - ONLYINCOMPOUND for fogemorphemes (Swedish, Danish) or Fuge-elements (German)
1990 - COMPOUNDBEGIN: allow signed roots, and roots with signed suffix at the beginning of compounds
1991 - COMPOUNDMIDDLE: like before, but in the middle of compounds
1992 - COMPOUNDEND: like before, but at the end of compounds
1993 - remove COMPOUNDFIRST, COMPOUNDLAST