https://github.com/rmraya/mxspell

Spellchecker that uses Hunspell dictionaries

# MXSpell

Java spellchecker library using Hunspell dictionaries. Requires Java 21+. No external dependencies.

## Core Classes

### SpellChecker

Main entry point for spellchecking operations.

**Constructor:**

```java
SpellChecker(String language, String dictionaryFolder) throws IOException
```

- `language`: BCP47 language code (e.g., "en-US", "de-DE", "fr"). Internally, hyphens are converted to underscores to match dictionary file names (for example, "en-US" would match `en_US`).
- `dictionaryFolder`: Path to the directory that contains the dictionaries.

The constructor looks for dictionaries in two formats:

1. Subdirectory named after the language containing `.aff` and `.dic` files
2. ZIP file named `{language}.zip` containing the dictionary files
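The lookup order described above can be sketched roughly as follows. This is not the library's code: `DictionaryLocator`, `toDictionaryName`, and `locate` are hypothetical names, and the sketch assumes that the hyphen-to-underscore form of the language code is used for both the subdirectory and the ZIP file.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical helper, not part of MXSpell.
class DictionaryLocator {

    // Assumption: "en-US" maps to "en_US"; the library's real conversion may do more.
    static String toDictionaryName(String languageTag) {
        return languageTag.replace('-', '_');
    }

    // Prefers a subdirectory, falls back to a ZIP file, otherwise returns empty.
    static Optional<Path> locate(Path dictionaryFolder, String languageTag) {
        String name = toDictionaryName(languageTag);
        Path directory = dictionaryFolder.resolve(name);
        if (Files.isDirectory(directory)) {
            return Optional.of(directory);
        }
        Path zip = dictionaryFolder.resolve(name + ".zip");
        if (Files.isRegularFile(zip)) {
            return Optional.of(zip);
        }
        return Optional.empty();
    }
}
```

Calling `DictionaryLocator.locate(Path.of("path/to/dictionaries"), "en-US")` would return the `en_US` subdirectory if it exists, then `en_US.zip`, and an empty `Optional` when neither is present.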

**Methods:**

```java
String[] suggest(String word)
```

Returns spelling suggestions for a word. Returns an empty array if the word is spelled correctly.

```java
void learn(String word)
```

Adds a word to the learned words list (persists across sessions).

```java
void ignore(String word)
```

Ignores a word for the current session only.

```java
Map<String, String[]> checkString(String text)
```

Checks all words in a text string. Returns a map from each misspelled word to its suggestions.
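To make the shape of that result concrete, here is a minimal, self-contained sketch of checking a whole string. It does not use MXSpell: `TextCheckSketch`, the word pattern, the `known` word set, and the `suggester` function are all assumptions made for illustration, and the library's own tokenization may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not part of MXSpell.
class TextCheckSketch {

    // Runs of Unicode letters, optionally joined by apostrophes ("don't").
    private static final Pattern WORD = Pattern.compile("\\p{L}+(?:'\\p{L}+)*");

    // Maps each unknown word to its suggestions, in order of first appearance.
    static Map<String, String[]> check(String text, Set<String> known,
            Function<String, String[]> suggester) {
        Map<String, String[]> result = new LinkedHashMap<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            String word = matcher.group();
            // A word counts as known if it matches as written or in lower case.
            boolean isKnown = known.contains(word) || known.contains(word.toLowerCase());
            if (!isKnown && !result.containsKey(word)) {
                result.put(word, suggester.apply(word));
            }
        }
        return result;
    }
}
```

With the known words `the` and `is`, checking "The speling is rong" yields entries for `speling` and `rong` only.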

```java
Dictionary getDictionary()
```

Returns the underlying Dictionary object for advanced operations.

### Dictionary

Manages word lookup and validation. Obtained via `SpellChecker.getDictionary()`.

**Key Methods:**

```java
DictionaryEntry lookup(String word)
```

Looks up a word in the dictionary. Returns `null` if not found.

```java
List<String> getWords(DictionaryEntry entry) throws IOException
```

Returns all word forms generated from a dictionary entry by applying affix rules.
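As a rough illustration of what applying affix rules means, the sketch below expands a stem with Hunspell-style suffix rules (strip some characters, add others, subject to a condition on the stem ending). `AffixSketch` and `SuffixRule` are hypothetical names, the condition is treated as a plain regular expression, and prefixes and cross-products are left out, so this is not how the library itself is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of MXSpell.
class AffixSketch {

    // In-memory form of one suffix rule: characters to strip, characters to add,
    // and a condition that the end of the stem must satisfy.
    record SuffixRule(String strip, String add, String condition) {

        boolean appliesTo(String stem) {
            // The condition is matched as a regular expression at the end of the stem.
            return stem.endsWith(strip) && stem.matches(".*" + condition);
        }

        String apply(String stem) {
            return stem.substring(0, stem.length() - strip.length()) + add;
        }
    }

    // Returns the stem followed by every form produced by an applicable rule.
    static List<String> expand(String stem, List<SuffixRule> rules) {
        List<String> forms = new ArrayList<>();
        forms.add(stem);
        for (SuffixRule rule : rules) {
            if (rule.appliesTo(stem)) {
                forms.add(rule.apply(stem));
            }
        }
        return forms;
    }
}
```

Given the rules `("", "ed", "[^ey]")`, `("y", "ied", "[^aeiou]y")`, and `("", "s", "[^sxzhy]")`, the stem `walk` expands to `walk`, `walked`, `walks`, and `try` expands to `try`, `tried`.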

```java
boolean isValidCompound(String word)
```

Checks if a word is valid as a compound word based on compound flags.
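The idea behind compound validation can be sketched in a simplified form: split the word at every position and accept it if both halves are allowed in compounds. The class name, the two-part limit, and the minimum part length of three characters (chosen to mirror Hunspell's default `COMPOUNDMIN`) are assumptions for this sketch; the library's flag handling is more complete.

```java
import java.util.Set;

// Hypothetical illustration, not part of MXSpell.
class CompoundSketch {

    // Assumption: each part needs at least three characters.
    private static final int MIN_PART_LENGTH = 3;

    // Returns true if the word splits into two parts that are both compoundable.
    static boolean isTwoPartCompound(String word, Set<String> compoundable) {
        for (int split = MIN_PART_LENGTH; split <= word.length() - MIN_PART_LENGTH; split++) {
            String head = word.substring(0, split);
            String tail = word.substring(split);
            if (compoundable.contains(head) && compoundable.contains(tail)) {
                return true;
            }
        }
        return false;
    }
}
```

With `foot` and `ball` marked as compoundable, `football` is accepted while `footbal` is rejected.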

```java
void learn(String word)
void ignore(String word)
Set<String> getLearnedWords()
Set<String> getIgnoredWords()
```

Manage learned and ignored words.

### DictionaryEntry

Represents a word entry from the dictionary.

**Constructor:**

```java
DictionaryEntry(String word, String flags, String morphology)
```

**Methods:**

```java
String getWord()
String getFlags()
String getMorphology()
```
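The three fields correspond to the parts of a typical Hunspell `.dic` line, `word/FLAGS` followed by optional morphological data. The sketch below shows one way such a line could be split. `DicLineSketch` and its `Entry` record are hypothetical stand-ins for illustration; escaped slashes and words containing spaces are not handled.

```java
// Hypothetical illustration, not part of MXSpell.
class DicLineSketch {

    // Stand-in for DictionaryEntry, holding the same three fields.
    record Entry(String word, String flags, String morphology) {}

    static Entry parse(String line) {
        String trimmed = line.strip();
        // Morphological fields, when present, follow the first run of whitespace.
        String[] parts = trimmed.split("\\s+", 2);
        String morphology = parts.length > 1 ? parts[1] : "";
        // Flags, when present, follow the first slash.
        int slash = parts[0].indexOf('/');
        String word = slash < 0 ? parts[0] : parts[0].substring(0, slash);
        String flags = slash < 0 ? "" : parts[0].substring(slash + 1);
        return new Entry(word, flags, morphology);
    }
}
```

For example, `parse("walk/DGS po:verb")` gives the word `walk`, the flags `DGS`, and the morphology `po:verb`, while a bare `hello` gives empty flags and morphology.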

## Basic Usage

```java
import com.maxprograms.mxspell.SpellChecker;

// Initialize with US English dictionary (the constructor throws IOException)
SpellChecker checker = new SpellChecker("en-US", "path/to/dictionaries");

// Check a word
String[] suggestions = checker.suggest("speling");
if (suggestions.length > 0) {
    System.out.println("Did you mean: " + String.join(", ", suggestions));
}

// Check multiple words; var keeps the map's element types without spelling them out
var results = checker.checkString("The speling is rong");
for (var entry : results.entrySet()) {
    System.out.println(entry.getKey() + " -> " + String.join(", ", entry.getValue()));
}

// Learn a word
checker.learn("customword");

// Ignore a word for this session
checker.ignore("propertyname");
```

## Supported Features

- Affix rules (prefixes and suffixes)
- Cross-product affix application (prefix+suffix combinations)
- Compound word validation with full flag support
- Replacement tables (REP directive)
- TRY characters for suggestion generation
- FORBIDDENWORD flag (rejects forbidden words)
- NOSUGGEST flag (excludes words from suggestions)
- KEEPCASE flag (enforces exact case matching)
- Encoding detection (UTF-8, ASCII, ISO-8859-*, Windows codepages)
- Thread-safe operations
- Learned and ignored words
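To illustrate how REP tables and TRY characters typically feed suggestion generation in Hunspell-style checkers, here is a small self-contained sketch. It is not MXSpell's algorithm: the class and method names are hypothetical, the REP table is modeled as a plain map, and only single-character insertions and substitutions are tried.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration, not part of MXSpell.
class SuggestionSketch {

    // Collects known words reachable from the input through REP or TRY edits.
    static Set<String> candidates(String word, Map<String, String> rep,
            String tryChars, Set<String> known) {
        Set<String> found = new LinkedHashSet<>();
        // REP: replace each occurrence of a commonly confused sequence.
        for (Map.Entry<String, String> pair : rep.entrySet()) {
            String from = pair.getKey();
            if (from.isEmpty()) {
                continue;
            }
            for (int at = word.indexOf(from); at >= 0; at = word.indexOf(from, at + 1)) {
                String candidate = word.substring(0, at) + pair.getValue()
                        + word.substring(at + from.length());
                if (known.contains(candidate)) {
                    found.add(candidate);
                }
            }
        }
        // TRY: insert or substitute each TRY character at each position.
        for (char c : tryChars.toCharArray()) {
            for (int i = 0; i <= word.length(); i++) {
                String inserted = word.substring(0, i) + c + word.substring(i);
                if (known.contains(inserted)) {
                    found.add(inserted);
                }
                if (i < word.length()) {
                    String substituted = word.substring(0, i) + c + word.substring(i + 1);
                    if (known.contains(substituted)) {
                        found.add(substituted);
                    }
                }
            }
        }
        // The input itself is never offered as a suggestion.
        found.remove(word);
        return found;
    }
}
```

With `phone` in the known words and a REP pair mapping `f` to `ph`, the input `fone` produces `phone`. With `l` among the TRY characters, `speling` produces `spelling`.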

## Dictionary Format

Requires Hunspell-compatible dictionaries:

- `.aff` file: Affix rules and configuration
- `.dic` file: Word list with flags

Dictionary structure:

```
dictionaryFolder/
    language/
        index.aff
        index.dic
```

Or as ZIP files:

```
dictionaryFolder/
    language.zip (containing .aff and .dic)
```

## Building

```bash
gradle clean build
```

Output: `lib/mxspell.jar`

## Thread Safety

All classes are thread-safe. Dictionary and AffixParser use immutable collections after initialization. SpellChecker can be safely shared across threads.
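The general pattern of freezing collections after initialization can be sketched as follows. This is an illustration of the technique only, with hypothetical names, and makes no claim about how `Dictionary` or `AffixParser` are written internally.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;

// Hypothetical illustration, not part of MXSpell.
class SharedLookupSketch {

    // Set.copyOf returns an unmodifiable set, so readers need no locking.
    private final Set<String> words;

    SharedLookupSketch(Collection<String> source) {
        this.words = Set.copyOf(source);
    }

    boolean contains(String word) {
        return words.contains(word);
    }

    // Queries one shared instance from several threads via a parallel stream.
    static long countKnown(SharedLookupSketch lookup, List<String> tokens) {
        return tokens.parallelStream().filter(lookup::contains).count();
    }
}
```

Because the set cannot change after construction, concurrent readers always see the same contents, which is what makes sharing a single instance across threads safe in this sketch.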