Dictionary: main library interface

class Dictionary(aff, dic)

The main and only interface to spylls.hunspell as a library.

Usage:

from spylls.hunspell import Dictionary

# from folder where en_US.aff and en_US.dic are present
dictionary = Dictionary.from_files('/path/to/dictionary/en_US')
# or, from Firefox/LibreOffice dictionary extension
dictionary = Dictionary.from_zip('/path/to/dictionary/en_US.oxt')
# or, from system folders (on Linux)
dictionary = Dictionary.from_system('en_US')

print(dictionary.lookup('spylls'))
# False
for suggestion in dictionary.suggest('spylls'):
    print(suggestion)
# spells
# spills

Note that, for ease of experimentation, the en_US dictionary from SCOWL is distributed with Spylls, so the following works without installing any additional dictionary:

from spylls.hunspell import Dictionary
en = Dictionary.from_files('en_US')

The internal algorithm implementations, lookuper and suggester, are exposed to allow experimenting with the implementation:

# Produce all ways this word might be analyzed by current dictionary
for form in dictionary.lookuper.good_forms('building'):
    print(form)

# AffixForm(building = building)
# AffixForm(building = build + Suffix(ing: G×, on [[^e]]$))

# Internal suggest method, showing information about suggestion method
for suggestion in dictionary.suggester.suggestions('spylls'):
    print(suggestion)

# Suggestion[badchar](spells)
# Suggestion[badchar](spills)

Dictionary creation

classmethod from_files(path)

Reads a dictionary from the pair of files /some/path/some_name.aff and /some/path/some_name.dic.

For ease of experimentation, the en_US dictionary from SCOWL is distributed with Spylls, so this works without any additional dictionary installation:

from spylls.hunspell import Dictionary
en = Dictionary.from_files('en_US')

Parameters:

path (str) – Base path without extension, i.e. just /some/path/some_name.

Return type:

spylls.hunspell.dictionary.Dictionary

@classmethod
def from_files(cls, path: str) -> Dictionary:
    """
    Read dictionary from pair of files ``/some/path/some_name.aff`` and ``/some/path/some_name.dic``.

    For ease of experimentation, the ``en_US`` dictionary from `SCOWL <http://wordlist.aspell.net/>`_
    is distributed with Spylls, so this works without any additional dictionary installation::

        from spylls.hunspell import Dictionary
        en = Dictionary.from_files('en_US')

    Args:
        path: Should be just ``/some/path/some_name``.
    """

    if path in cls.DISTRIBUTED and not os.path.exists(path + '.aff'):
        path = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'data', cls.DISTRIBUTED[path], path)

    aff, context = readers.read_aff(FileReader(path + '.aff'))
    dic = readers.read_dic(FileReader(path + '.dic', encoding=context.encoding), aff=aff, context=context)

    return cls(aff, dic)

classmethod from_zip(path)

Reads a dictionary from a zip archive containing *.aff and *.dic files. Note that OpenOffice/LibreOffice dictionary extensions (*.oxt) and Firefox/Thunderbird dictionary extensions (*.xpi) are in fact such archives, so a Dictionary can be read from them without unpacking.

Parameters:

path (str) – Path to zip-file/extension.

Return type:

spylls.hunspell.dictionary.Dictionary

@classmethod
def from_zip(cls, path: str) -> Dictionary:
    """
    Read dictionary from a zip archive containing ``*.aff`` and ``*.dic`` files. Note that Open/Libre
    Office dictionary extensions (``*.oxt``) and Firefox/Thunderbird dictionary extensions (``*.xpi``)
    are in fact such archives, so ``Dictionary`` can be read from them without unpacking.

    Args:
        path: Path to zip-file/extension.
    """

    file = zipfile.ZipFile(path)
    # TODO: fail if there are several
    aff_path = [name for name in file.namelist() if name.endswith('.aff')][0]
    dic_path = [name for name in file.namelist() if name.endswith('.dic')][0]
    aff, context = readers.read_aff(ZipReader(file.open(aff_path)))
    dic = readers.read_dic(ZipReader(file.open(dic_path), encoding=context.encoding), aff=aff, context=context)

    return cls(aff, dic)
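
The listing above carries a TODO about archives that hold more than one matching file. A minimal sketch of a stricter member lookup follows, using only the standard zipfile module. The helper name single_member is hypothetical and not part of Spylls, and the in-memory archive merely mimics the layout of a dictionary extension.

```python
import io
import zipfile


def single_member(archive: zipfile.ZipFile, suffix: str) -> str:
    """Return the only member name ending with `suffix`, or raise ValueError.

    Hypothetical helper for illustration; Spylls itself takes the first match.
    """
    names = [name for name in archive.namelist() if name.endswith(suffix)]
    if len(names) != 1:
        raise ValueError(
            f'expected exactly one *{suffix} member, found {len(names)}: {names!r}'
        )
    return names[0]


# Build a tiny in-memory archive that mimics a dictionary extension.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('dictionaries/en_US.aff', 'SET UTF-8\n')
    archive.writestr('dictionaries/en_US.dic', '1\nhello\n')

# Re-open for reading and locate the two members.
with zipfile.ZipFile(buffer) as archive:
    aff_name = single_member(archive, '.aff')
    dic_name = single_member(archive, '.dic')
```

Raising on ambiguity is one possible policy; an extension that ships several languages might instead call for selecting a member by name.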

classmethod from_system(name)

Tries to find <name>.aff and <name>.dic on system paths known to store Hunspell dictionaries. Probably works only on Linux.

Parameters:

name (str) – Language/dictionary name, like en_US

Return type:

spylls.hunspell.dictionary.Dictionary

@classmethod
def from_system(cls, name: str) -> Dictionary:
    """
    Tries to find ``<name>.aff`` and ``<name>.dic`` on system paths known to store Hunspell dictionaries.
    Probably works only on Linux.

    Args:
        name: Language/dictionary name, like ``en_US``
    """

    for folder in cls.PATHES:
        pathes = glob.glob(f'{folder}/{name}.aff')
        if pathes:
            return cls.from_files(pathes[0].replace('.aff', ''))

    raise LookupError(f'{name}.aff not found (search pathes are {cls.PATHES!r})')
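
The folder search can be sketched in isolation as below. This is an illustrative variant and not the Spylls implementation: the function name find_dictionary_base is hypothetical, it requires both files to be present (the listing above checks only the .aff file), it builds the base path by joining instead of editing the matched file name, and it does not expand glob patterns in the name. The temporary folders stand in for real system paths.

```python
import os
import tempfile
from typing import Sequence


def find_dictionary_base(name: str, folders: Sequence[str]) -> str:
    """Return '<folder>/<name>' for the first folder holding <name>.aff and <name>.dic.

    Hypothetical helper for illustration, not part of Spylls.
    """
    for folder in folders:
        base = os.path.join(folder, name)
        if os.path.isfile(base + '.aff') and os.path.isfile(base + '.dic'):
            return base
    raise LookupError(f'{name}.aff/.dic not found (searched {list(folders)!r})')


# Demo: the first folder is empty, the second one holds the dictionary pair.
with tempfile.TemporaryDirectory() as empty, tempfile.TemporaryDirectory() as stocked:
    for extension in ('.aff', '.dic'):
        with open(os.path.join(stocked, 'xx_XX' + extension), 'w') as handle:
            handle.write('')
    found = find_dictionary_base('xx_XX', [empty, stocked])
    found_in_stocked = (found == os.path.join(stocked, 'xx_XX'))
```

The returned base path has the shape that from_files expects, so the result could be passed to it directly.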

Dictionary usage

lookup(word)

Checks if the word is correct.

>>> dictionary.lookup('spylls')
False
>>> dictionary.lookup('spells')
True

Parameters:

word (str) – Word to check

Return type:

bool

def lookup(self, word: str) -> bool:
    """
    Checks if the word is correct.

    ::

        >>> dictionary.lookup('spylls')
        False
        >>> dictionary.lookup('spells')
        True

    Args:
        word: Word to check
    """

    return self.lookuper(word)
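
Since lookup checks one word at a time, checking a whole text means tokenizing it first. A minimal sketch follows, relying only on the documented lookup(word) -> bool shape. StubDictionary and misspelled_words are hypothetical names introduced here so the example runs without any dictionary files, and the regular-expression tokenizer is a naive assumption of this sketch, not how Hunspell splits text.

```python
import re
from typing import List


class StubDictionary:
    """Stand-in exposing lookup(word) -> bool; not a Spylls class."""

    def __init__(self, words):
        self.words = set(words)

    def lookup(self, word: str) -> bool:
        return word in self.words


def misspelled_words(dictionary, text: str) -> List[str]:
    """Return the words of `text`, in order, that the dictionary rejects."""
    # Naive tokenizer: runs of ASCII letters and apostrophes (an assumption).
    words = re.findall(r"[A-Za-z']+", text)
    return [word for word in words if not dictionary.lookup(word)]


stub = StubDictionary(['the', 'wizard', 'casts', 'spells'])
rejected = misspelled_words(stub, 'the wizard casts spylls')
print(rejected)
# ['spylls']
```

Passing a real Dictionary instance in place of the stub should work the same way, as the helper calls nothing but lookup.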

suggest(word)

Suggests corrections for a misspelled word, best suggestions first (ordered by probability/similarity). Returns a lazy generator of suggestions.

>>> dictionary.suggest('spylls')
<generator object Dictionary.suggest at 0x7f5c63e4a2d0>

>>> for suggestion in dictionary.suggest('spylls'):
...    print(suggestion)
spells
spills

Parameters:

word (str) – Misspelled word

Return type:

Iterator[str]

def suggest(self, word: str) -> Iterator[str]:
    """
    Suggests corrections for a misspelled word, best suggestions first (ordered by
    probability/similarity). Returns a lazy generator of suggestions.

    ::

        >>> dictionary.suggest('spylls')
        <generator object Dictionary.suggest at 0x7f5c63e4a2d0>

        >>> for suggestion in dictionary.suggest('spylls'):
        ...    print(suggestion)
        spells
        spills

    Args:
        word: Misspelled word
    """

    yield from self.suggester(word)
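
Because suggest returns a lazy generator, a caller that needs only the best few suggestions can stop early and skip the work of producing the rest. A minimal sketch with itertools.islice follows. StubDictionary and first_suggestions are hypothetical names, and the stub counts how many items it generates purely to make the laziness visible.

```python
from itertools import islice
from typing import Iterator, List


class StubDictionary:
    """Stand-in exposing suggest(word) -> Iterator[str]; not a Spylls class."""

    def __init__(self, candidates):
        self.candidates = list(candidates)
        self.produced = 0  # how many suggestions were actually generated

    def suggest(self, word: str) -> Iterator[str]:
        for candidate in self.candidates:
            self.produced += 1
            yield candidate


def first_suggestions(dictionary, word: str, limit: int) -> List[str]:
    """Return at most `limit` suggestions without exhausting the generator."""
    return list(islice(dictionary.suggest(word), limit))


stub = StubDictionary(['spells', 'spills', 'spool'])
best = first_suggestions(stub, 'spylls', 2)
print(best)
# ['spells', 'spills']
print(stub.produced)
# 2
```

The third candidate is never generated, which is the point of keeping the result lazy instead of returning a list.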

Data objects

aff: data.aff.Aff

Contents of *.aff

dic: data.dic.Dic

Contents of *.dic

Algorithms

lookuper: lookup.Lookup

Instance of Lookup, can be used for experimenting, see algo.lookup.

suggester: suggest.Suggest

Instance of Suggest, can be used for experimenting, see algo.suggest.