Hunspell Clarion template and classes
Sean Hennessy – Don’t Panic Business Management Updated 2020-10-13
Hunspell is the spell checker of LibreOffice, OpenOffice.org, Mozilla Firefox and Thunderbird, among others. This document forms part of the Hunspell spell-checking classes and templates.
Most of the classes are derived directly from code provided in SpellDemo.clw by Axel Beilhartz of Schul-Office.de. (Note: at some point the Schul-Office.de site disappeared and was replaced.)
I’ve used the example program provided by Axel, split the classes out and created a few new methods. I’ve then created a template to do all the yucky ‘find the right embed’ stuff and do the usual Clarion ‘this is easy’ thing.
Enhancements and fixes are welcome.
As noted below, Hunspell is used by both Mozilla and OpenOffice, so any dictionary from them can be used. I’ve provided some English and German dictionaries; the default used by the templates is Australian English.
(From their website: http://hunspell.sourceforge.net)
Hunspell is the spell checker of LibreOffice, OpenOffice.org, Mozilla Firefox 3 & Thunderbird, Google Chrome, and it is also used by proprietary softwares, like Mac OS X, memoQ, Opera and SDL Trados.
Main features:
- Extended support for language peculiarities; Unicode character encoding, compounding and complex morphology.
- Improved suggestion using n-gram similarity, rule and dictionary based pronunciation data.
- Morphological analysis, stemming and generation.
- Hunspell is based on MySpell and works also with MySpell dictionaries.
- C++ library under GPL/LGPL/MPL tri-license.
- Interfaces and ports: Enchant (generic spelling library from the AbiWord project), XSpell (Mac OS X port; Hunspell has been part of OS X since version 10.6 (Snow Leopard), so it is enough to place Hunspell dictionary files into ~/Library/Spelling or /Library/Spelling for spell checking), Delphi, Java (JNA, JNI), Perl, .NET, Python, Ruby.
Hunspell is based on MySpell and is backward-compatible with MySpell dictionaries.
While MySpell uses a single-byte character encoding, Hunspell can use Unicode UTF-8-encoded dictionaries.
See also http://en.wikipedia.org/wiki/Hunspell
A demo version is available.
Hunspell4Clarion is available for $69 from ClarionShop
The Hunspell library was originally written in C and updated to C++. Unfortunately, the authors made use of the case-sensitive nature of the language in defining exports.
To use the library in our Clarion programs, we needed to change the source code of “libhunspell.dll”. The DLL exported functions such as “hunspell_spell” and “Hunspell_spell”. In C++ this is fine. However, it causes link failures in Clarion, as Clarion is not case sensitive.
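The clash described above can be sketched in a few lines of Python. This is only an illustration of why a case-insensitive linker sees the two exports as one name; the function `find_case_collisions` and the third export name are hypothetical and are not part of Hunspell or of the DLL.

```python
from collections import defaultdict


def find_case_collisions(export_names):
    """Group export names that differ only by letter case."""
    groups = defaultdict(list)
    for name in export_names:
        # A case-insensitive linker effectively compares the folded name.
        groups[name.lower()].append(name)
    # Only groups with more than one spelling would clash at link time.
    return {key: names for key, names in groups.items() if len(names) > 1}


# "some_other_export" is a made-up name used only for contrast.
exports = ["hunspell_spell", "Hunspell_spell", "some_other_export"]
print(find_case_collisions(exports))
# prints {'hunspell_spell': ['hunspell_spell', 'Hunspell_spell']}
```

Any group reported here has to be renamed before a case-insensitive linker can tell the functions apart.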
We therefore created hunspell4clarion.dll, where we have lowercased all the function names and in some cases altered the types. There are a couple of C++ string constructs that are difficult to deal with in Clarion.
For example, the function “libhunspell_suggest” expects its third argument as “char *** SLST”, i.e. a pointer to a list of C strings. To avoid problems we have added the function “slo_getsuggest” to the DLL.
“slo_getsuggest” wraps “libhunspell_suggest” in a Clarion-friendly way and returns the list in a CSTRING, which is passed as the third parameter of “slo_getsuggest”.
This CSTRING must be initialized in Clarion with a size of 2021 bytes and can hold a maximum of 20 correction suggestions. The suggestions are separated by “|”, so that the CSTRING can be passed directly to a pop-up menu.
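The pipe-separated format can be modelled as follows. This is a rough Python sketch of the idea, not the DLL’s actual code: the helper names `pack_suggestions` and `unpack_suggestions` are hypothetical, and measuring the length in UTF-8 bytes is an assumption (the real byte count depends on the dictionary’s encoding). The limits of 20 suggestions and 2021 bytes come from the text above.

```python
MAX_SUGGESTIONS = 20   # limit stated in the text
BUFFER_SIZE = 2021     # CSTRING size stated in the text, including the NUL terminator


def pack_suggestions(suggestions):
    """Join suggestions with '|' so they fit the buffer; return (text, count)."""
    packed = ""
    count = 0
    for word in suggestions[:MAX_SUGGESTIONS]:
        candidate = word if count == 0 else packed + "|" + word
        # Keep one byte free for the terminating NUL (UTF-8 length is an assumption).
        if len(candidate.encode("utf-8")) > BUFFER_SIZE - 1:
            break
        packed = candidate
        count += 1
    return packed, count


def unpack_suggestions(packed):
    """Split the pipe-separated text back into a list of words."""
    return packed.split("|") if packed else []


print(pack_suggestions(["color", "colour", "collar"]))
# prints ('color|colour|collar', 3)
```

The packed string has the same shape as the menu text passed to the Clarion POPUP() function, which is why no further conversion is needed on the Clarion side.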
We were also able to exclude some morphological functions from the exports; apart from that, the Hunspell documentation still applies to the DLL.
The personal dictionary contains the words added via the ‘Add to Dictionary’ menu item in the popup. The dictionary is a simple ASCII text file listing of words.
The format of each line is:
word/ExampleWord
word is the word for checking.
ExampleWord is an optional existing word whose affixes (plurals, prefixes and suffixes) are also applied to word. The two must be separated by a ‘/’.
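As an illustration of this file layout, the following Python sketch reads a personal dictionary in which each line holds a word, optionally followed by a ‘/’ and an example word. The function name `parse_personal_dictionary` and the sample words are hypothetical; the supplied maintenance utility is written in Clarion and may handle details differently.

```python
def parse_personal_dictionary(text):
    """Return a list of (word, example_word_or_None) pairs."""
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        # Everything before the first '/' is the word; the rest is the example word.
        word, separator, example = line.partition("/")
        entries.append((word, example if separator and example else None))
    return entries


# Sample content: "blog" borrows the affixes of the existing word "dog".
sample = "Clarion\n\nblog/dog\n"
print(parse_personal_dictionary(sample))
# prints [('Clarion', None), ('blog', 'dog')]
```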
There is a utility (Clarion code) for the maintenance of the personal dictionary.
Where to get Dictionaries
Many, many places. Available in most Linux distributions. LibreOffice, Mozilla, Chromium, to name a few. German and a range of English dictionaries are supplied. See the Accessory/resource and example directories.
There are 2 templates:
Global extension: gets the dictionary names and defines the global classes.
Procedure extension: selects the controls to check and allocates the appropriate classes and code.
- Add the Hunspell global extension
The dic and aff files are the dictionary and affix files respectively. Affixes are prefix and suffix additions to words, i.e. a derived word is a base word plus an affix.
The dictionaries are Hunspell or MySpell dictionaries. German and a range of English dictionaries are provided in the example folder.
The personal dictionary is a simple ASCII file accessed by using the ASCII driver. If the file does not exist it will be created.
- In the procedure that requires spell checking
Add the procedure extension.
Here you simply add the controls that you want checked. ENTRY and TEXT controls are supported.
The timer is required; it sets the interval between checks.
- In-program use
Only the control with focus will be checked. That is a slight limitation. Any misspelled words will be underlined with a squiggly red line.
You need to click on the appropriate word and then right-click to pop up a menu.
The menu will have up to 20 words as suggestions for replacement. Just click on the appropriate word to replace it.
Add to Dictionary will add the word to the personal dictionary.
Ignore for session will ignore that word while this instance of the object exists. (The word is added to the runtime dictionary only)
To distribute Hunspell to your clients you need to include hunspell4clarion.dll and ensure the MS C++ 2015-2019 redistributable library is installed on the client machine. A version is downloadable from MS, in accessory/bin, or available in Setup Builder.
There are 3 example programs provided
\ExampleApp is a variation of the Clarion RTF example App
\Handcode is a handcoding example
\Personal Dict Maint is a simple program to create and edit personal dictionaries
Upgrading and notes
Any Enhancements and fixes are welcome
Hunspell is licenced under the LGPL 2.1
Hunspell4Clarion is compatible with all versions of Clarion.
The examples are provided as Clarion 8 APP and DCT files. TXD and TXA files of the C8 examples are also provided in case import into older versions is required. Note: that may require changes to the TXA/TXD files. Import into versions prior to 8 has not been attempted.
From Hunspell prior to version 3
The Hunspell DLL name has changed from libhunspell.dll to hunspell4clarion.dll. This will need to be added to the solution and the old libhunspell removed.
The personal dictionary may need to be added
MS C Runtime version 2019 is required. vc_redistx98.exe is supplied in the accessory/bin folder.
The DLL name change was made to reflect the fact that the DLL used is not the standard libhunspell from the Hunspell project. There are (and always have been) a small number of key changes made to the library:
- The function names are made lower case to cope with Clarion’s case-insensitive linking.
- To avoid duplication errors when linking, the function names were changed to libhunspell_*.
- The slo_getsuggest function was added to ease the fact that Clarion copes poorly with the C++ list-of-strings construct. It’s defined as char***, which is a pointer to a list of pointers. It’s just crazy. So this was changed to a pipe-delimited string, which is just right for the Clarion POPUP().
Version 3 of hunspell4clarion is based on the Hunspell version 1.7 source and compiled using MS C++ 2019. This requires the latest runtime libraries from MS.
Special thanks to Jeffrey Kuijt and www.adsystems.nl for providing assistance to help get version 3 done.
Added Templates now also support the Legacy template family
Added Personal dictionary support: SpellUserDict class
Fixed Template: added ASCII file driver to support the user dictionary
Fixed Occasional GPF when using RTF text where Len > 4000
Fixed Correcting a word in RTF would lose formatting
Fixed Template still referred to libhunspell.lib
Changed Template dictionary name prompts increased to S50
Fixed Was not using ‘.’ as a word delimiter
Changed Sean Hennessy: changed and recompiled the C++ DLL to use Hunspell 1.7 source, compiled with VS C++ 2019. Quite a few changes were required.
Changed DLL renamed to hunspell4clarion.dll
Changed libhunspell_suggest parameter change
Added New functions exposed:
Added libhunspell_remove: remove a word for the current session
Added libhunspell_add_dic: add an additional dictionary
Changed slo_getsuggest() now returns the number of suggested words
Special Thanks to Jeffrey Kuijt and www.adsystems.nl for providing assistance to help get this upgrade done
Jeffrey Kuijt: fixed a number of problems with popup actions.
Fixed so that it no longer returns to the top of a text box after spelling.
Increased the size of the text buffers used, now defined by an equate.
Now takes into account multiple lines in a text box.
Update to classes previously provided by Axel for better RTF usage.
Also reordered and changed the names of some files to ensure all spell files are together.
1.0.1 Added #Project to add the lib to the solution
Added an additional test to check that the current window is really active, to avoid the wrong display of squiggly lines, as suggested by Axel Beilhartz
1.0 Initial Release 14 Sep 2011