Dictionary: main library interface

- class Dictionary(aff, dic)
The main and only interface to spylls.hunspell as a library.

Usage:

    from spylls.hunspell import Dictionary

    # from a folder where en_US.aff and en_US.dic are present
    dictionary = Dictionary.from_files('/path/to/dictionary/en_US')
    # or, from a Firefox/LibreOffice dictionary extension
    dictionary = Dictionary.from_zip('/path/to/dictionary/en_US.oxt')
    # or, from system folders (on Linux)
    dictionary = Dictionary.from_system('en_US')

    print(dictionary.lookup('spylls'))
    # False
    for suggestion in dictionary.suggest('spylls'):
        print(suggestion)
    # spells
    # spills

Note that, for ease of experimentation, the en_US dictionary from SCOWL is distributed with Spylls, so this will work without any additional dictionary installation:

    from spylls.hunspell import Dictionary

    en = Dictionary.from_files('en_US')

The internal algorithm implementations, lookuper and suggester, are exposed in order to allow experimenting with the implementation:

    # Produce all ways this word might be analyzed by current dictionary
    for form in dictionary.lookuper.good_forms('building'):
        print(form)
    # AffixForm(building = building)
    # AffixForm(building = build + Suffix(ing: G×, on [[^e]]$))

    # Internal suggest method, showing information about suggestion method
    for suggestion in dictionary.suggester.suggestions('spylls'):
        print(suggestion)
    # Suggestion[badchar](spells)
    # Suggestion[badchar](spills)

Dictionary creation
- classmethod from_files(path)
Read a dictionary from the pair of files /some/path/some_name.aff and /some/path/some_name.dic.

For ease of experimentation, the en_US dictionary from SCOWL is distributed with Spylls, so this will work without any additional dictionary installation:

    from spylls.hunspell import Dictionary

    en = Dictionary.from_files('en_US')

- Parameters
path (str) – Should be just /some/path/some_name, without the file extension.
- Return type
Dictionary

    @classmethod
    def from_files(cls, path: str) -> Dictionary:
        """
        Read dictionary from pair of files ``/some/path/some_name.aff`` and ``/some/path/some_name.dic``.

        For easyness of experimentation, ``en_US`` dictionary from `SCOWL <http://wordlist.aspell.net/>`_ is
        distributed with Spylls, so this will work without any additional dictionary installation::

            from spylls.hunspell import Dictionary
            en = Dictionary.from_files('en_US')

        Args:
            path: Should be just ``/some/path/some_name``.
        """
        if path in cls.DISTRIBUTED and not os.path.exists(path + '.aff'):
            path = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'data', cls.DISTRIBUTED[path], path)

        aff, context = readers.read_aff(FileReader(path + '.aff'))
        dic = readers.read_dic(FileReader(path + '.dic', encoding=context.encoding), aff=aff, context=context)

        return cls(aff, dic)

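The fallback in the source above (prefer a local file, otherwise use the bundled copy) can be sketched in isolation. This is only an illustration: `resolve_dictionary_path`, the `xx_XX` name, the `{'xx_XX': 'xx'}` mapping and the `/pkg/data` root are hypothetical stand-ins, not Spylls names or the real contents of `DISTRIBUTED`.

```python
import os


def resolve_dictionary_path(path, distributed, data_root):
    """Return the base path (without extension) that would be read.

    Hypothetical helper mirroring the condition in from_files: the bundled
    copy is used only when the name is known AND no local <path>.aff exists.
    """
    if path in distributed and not os.path.exists(path + '.aff'):
        return os.path.join(data_root, distributed[path], path)
    return path


# Assumed mapping of dictionary name -> subfolder, for illustration only.
distributed = {'xx_XX': 'xx'}

# A bare known name with no local xx_XX.aff resolves to the bundled location.
bundled = resolve_dictionary_path('xx_XX', distributed, '/pkg/data')

# An explicit path is not a key of the mapping, so it is returned unchanged.
explicit = resolve_dictionary_path('/some/path/xx_XX', distributed, '/pkg/data')
```

One consequence of this ordering is that a local `en_US.aff` in the working directory takes precedence over the distributed dictionary.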
- classmethod from_zip(path)
Read a dictionary from a zip archive containing *.aff and *.dic files. Note that OpenOffice/LibreOffice dictionary extensions (*.oxt) and Firefox/Thunderbird dictionary extensions (*.xpi) are in fact such archives, so a Dictionary can be read from them without unpacking.
- Parameters
path (str) – Path to the zip file or extension.
- Return type
Dictionary

    @classmethod
    def from_zip(cls, path: str) -> Dictionary:
        """
        Read dictionary from zip-archive containing ``*.aff`` and ``*.dic`` path. Note that Open/Libre Office
        dictionary extensions (``*.oxt``) and Firefox/Thunderbird dictionary extensions (``*.xpi``) are in fact
        such archives, so ``Dictionary`` can be read from them without unpacking.

        Args:
            path: Path to zip-file/extension.
        """
        file = zipfile.ZipFile(path)
        # TODO: fail if there are several
        aff_path = [name for name in file.namelist() if name.endswith('.aff')][0]
        dic_path = [name for name in file.namelist() if name.endswith('.dic')][0]
        aff, context = readers.read_aff(ZipReader(file.open(aff_path)))
        dic = readers.read_dic(ZipReader(file.open(dic_path), encoding=context.encoding), aff=aff, context=context)

        return cls(aff, dic)

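The source above takes the first `.aff` and `.dic` member it finds and leaves a TODO for the ambiguous case. A minimal sketch of a stricter selection, using only the standard library, might look like the following. `find_single_member` is a hypothetical helper, not part of Spylls, and the archive is a small in-memory fake built for the demonstration.

```python
import io
import zipfile


def find_single_member(archive, extension):
    """Return the only member name ending with `extension`, else raise."""
    matches = [name for name in archive.namelist() if name.endswith(extension)]
    if len(matches) != 1:
        raise ValueError(f'expected exactly one {extension} file, found {matches!r}')
    return matches[0]


# Build a tiny in-memory archive that imitates a dictionary extension.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('dictionaries/xx_XX.aff', 'SET UTF-8\n')
    archive.writestr('dictionaries/xx_XX.dic', '1\nword\n')
    archive.writestr('README.txt', 'demo')

with zipfile.ZipFile(buffer) as archive:
    aff_name = find_single_member(archive, '.aff')
    dic_name = find_single_member(archive, '.dic')
```

Whether raising is the right behaviour is a design choice: an archive may contain more than one match, in which case a caller might prefer to pass the wanted member name explicitly.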
- classmethod from_system(name)
Tries to find <name>.aff and <name>.dic on system paths known to store Hunspell dictionaries. Probably works only on Linux.
- Parameters
name (str) – Language/dictionary name, like en_US
- Return type
Dictionary

    @classmethod
    def from_system(cls, name: str) -> Dictionary:
        """
        Tries to find ``<name>.aff`` and ``<name>.dic`` on system paths known to store Hunspell dictionaries.
        Probably works only on Linux.

        Args:
            name: Language/dictionary name, like ``en_US``
        """
        for folder in cls.PATHES:
            pathes = glob.glob(f'{folder}/{name}.aff')
            if pathes:
                return cls.from_files(pathes[0].replace('.aff', ''))

        raise LookupError(f'{name}.aff not found (search pathes are {cls.PATHES!r})')

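The folder search can be tried without touching real system paths. The sketch below assumes a hypothetical `find_dictionary` helper and temporary folders in place of `PATHES`. It follows the same first-match-wins loop, but strips only the trailing `.aff` rather than replacing every occurrence of that substring in the path.

```python
import glob
import os
import tempfile


def find_dictionary(name, folders):
    """Return the base path of the first <folder>/<name>.aff that exists."""
    for folder in folders:
        matches = glob.glob(os.path.join(folder, f'{name}.aff'))
        if matches:
            # Drop only the trailing extension, keeping the rest of the path.
            return matches[0][:-len('.aff')]
    raise LookupError(f'{name}.aff not found (searched {folders!r})')


with tempfile.TemporaryDirectory() as root:
    empty = os.path.join(root, 'empty')
    hunspell = os.path.join(root, 'hunspell')
    os.makedirs(empty)
    os.makedirs(hunspell)
    for extension in ('.aff', '.dic'):
        open(os.path.join(hunspell, 'xx_XX' + extension), 'w').close()

    # Folders are tried in order; the first one has no match.
    found = find_dictionary('xx_XX', [empty, hunspell])
    expected = os.path.join(hunspell, 'xx_XX')

    try:
        find_dictionary('yy_YY', [empty, hunspell])
        missing_raised = False
    except LookupError:
        missing_raised = True
```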
Dictionary usage
- lookup(word)
Checks if the word is correct.

    >>> dictionary.lookup('spylls')
    False
    >>> dictionary.lookup('spells')
    True

- Parameters
word (str) – Word to check
- Return type
bool

    def lookup(self, word: str) -> bool:
        """
        Checks if the word is correct.

        ::

            >>> dictionary.lookup('spylls')
            False
            >>> dictionary.lookup('spells')
            True

        Args:
            word: Word to check
        """
        return self.lookuper(word)

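Since `lookup` takes one word and returns a bool, checking a piece of text is a matter of splitting it into words first. The following is a rough sketch of that idea. `StubDictionary` is a stand-in with the same `lookup(word)` signature so the example runs without a dictionary installed, and `misspelled_words` with its simple tokenizing regex is an assumption, not part of Spylls.

```python
import re


class StubDictionary:
    """Stand-in exposing lookup(word) -> bool; NOT the Spylls class."""

    def __init__(self, words):
        self.words = set(words)

    def lookup(self, word):
        return word in self.words


def misspelled_words(dictionary, text):
    """Return the words of `text` that the dictionary rejects, in order."""
    # Naive tokenizer: runs of letters, optionally with one inner apostrophe.
    words = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text)
    return [word for word in words if not dictionary.lookup(word)]


dictionary = StubDictionary(['spells', 'and', 'spills'])
unknown = misspelled_words(dictionary, 'spells and spylls')
```

With a real `Dictionary` instance in place of the stub, the helper should work unchanged, because it relies only on `lookup`.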
- suggest(word)
Suggests corrections for the misspelled word, in order of probability/similarity with the best suggestions first. Returns a lazy generator of suggestions.

    >>> dictionary.suggest('spylls')
    <generator object Dictionary.suggest at 0x7f5c63e4a2d0>
    >>> for suggestion in dictionary.suggest('spylls'):
    ...     print(suggestion)
    spells
    spills

- Parameters
word (str) – Misspelled word
- Return type
Iterator[str]

    def suggest(self, word: str) -> Iterator[str]:
        """
        Suggests corrections for the misspelled word (in order of probability/similarity, best
        suggestions first), returns lazy generator of suggestions.

        ::

            >>> dictionary.suggest('spylls')
            <generator object Dictionary.suggest at 0x7f5c63e4a2d0>
            >>> for suggestion in dictionary.suggest('spylls'):
            ...     print(suggestion)
            spells
            spills

        Args:
            word: Misspelled word
        """
        yield from self.suggester(word)

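Because `suggest` returns a lazy generator with the best suggestions first, a caller that wants only the top few can stop early and skip the remaining work. A minimal sketch with `itertools.islice` follows. `StubDictionary` and its candidate list are invented for the demonstration (the stub counts how many items it was asked to produce), and `first_suggestions` is a hypothetical helper.

```python
from itertools import islice


class StubDictionary:
    """Stand-in exposing suggest(word) as a generator; NOT the Spylls class."""

    def __init__(self):
        self.produced = 0

    def suggest(self, word):
        # Made-up candidates; a real dictionary computes these on demand.
        for candidate in ('spells', 'spills', 'spryly'):
            self.produced += 1
            yield candidate


def first_suggestions(dictionary, word, limit):
    """Take at most `limit` suggestions without exhausting the generator."""
    return list(islice(dictionary.suggest(word), limit))


dictionary = StubDictionary()
top = first_suggestions(dictionary, 'spylls', 2)
```

Calling `list(dictionary.suggest(word))` instead would force every suggestion to be computed, which is what the lazy design lets callers avoid.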
Data objects
- aff: data.aff.Aff
Contents of *.aff
- dic: data.dic.Dic
Contents of *.dic
Algorithms
- lookuper: lookup.Lookup
Instance of Lookup, can be used for experimenting, see algo.lookup.
- suggester: suggest.Suggest
Instance of Suggest, can be used for experimenting, see algo.suggest.
-
classmethod