org.apache.lucene.search.spell
public class SpellChecker extends Object
Spell Checker class (Main class)
(initially inspired by the David Spencer code).
Example Usage:
SpellChecker spellchecker = new SpellChecker(spellIndexDirectory);
// To index a field of a user index:
spellchecker.indexDictionary(new LuceneDictionary(my_lucene_reader, a_field));
// To index a file containing words:
spellchecker.indexDictionary(new PlainTextDictionary(new File("myfile.txt")));
String[] suggestions = spellchecker.suggestSimilar("misspelt", 5);
Modifier and Type | Field and Description |
---|---|
static String | F_WORD - Field name for each word in the ngram index. |
Constructor and Description |
---|
SpellChecker(Directory spellIndex) - Use the given directory as a spell checker index. |
Modifier and Type | Method and Description |
---|---|
void | clearIndex() - Removes all terms from the spell check index. |
boolean | exist(String word) - Check whether the word exists in the index. |
protected void | finalize() - Closes the internal IndexReader. |
void | indexDictionary(Dictionary dict) - Index a Dictionary. |
void | setAccuracy(float minScore) - Sets the accuracy (0 < minScore < 1); default 0.5. |
void | setSpellIndex(Directory spellIndex) - Use a different index as the spell checker index or re-open the existing index if spellIndex is the same value as given in the constructor. |
String[] | suggestSimilar(String word, int numSug) - Suggest similar words. |
String[] | suggestSimilar(String word, int numSug, IndexReader ir, String field, boolean morePopular) - Suggest similar words (optionally restricted to a field of an index). |
public static final String F_WORD
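F_WORD names the field that stores each whole word in the n-gram index. To illustrate what an n-gram of a word is, here is a minimal sketch that splits a word into its character n-grams. The class `NGramSketch` and its method `ngrams` are hypothetical and not part of Lucene, and the gram sizes SpellChecker actually indexes are not stated on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: lists the character n-grams of a word.
final class NGramSketch {

    // Returns every contiguous substring of length n, from left to right.
    static List<String> ngrams(String word, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(ngrams("lucene", 3)); // prints [luc, uce, cen, ene]
    }
}
```

A word shorter than n yields an empty list in this sketch.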
public SpellChecker(Directory spellIndex) throws IOException
Parameters:
spellIndex
Throws:
IOException
public void setSpellIndex(Directory spellIndex) throws IOException
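Use a different index as the spell checker index or re-open the existing index if spellIndex is the same value as given in the constructor.
Parameters:
spellIndex
Throws:
IOException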
public void setAccuracy(float minScore)
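The summary above gives the range of minScore as 0 < minScore < 1 with a default of 0.5. As a rough sketch of how such a cutoff could work, the code below converts an edit distance into a score between 0 and 1 and keeps a suggestion only if it reaches minScore. Both the normalization (1 minus distance divided by the longer length) and the "at least minScore" rule are assumptions for illustration, not a confirmed description of Lucene's scoring; `AccuracySketch`, `similarity`, and `passes` are hypothetical names.

```java
// Hypothetical sketch, not Lucene code: turns an edit distance into a 0..1 score
// and applies a minimum-score cutoff of the kind setAccuracy configures.
final class AccuracySketch {

    // Assumed normalization: 1.0 means identical, 0.0 means every character differs.
    static float similarity(int editDistance, int lengthA, int lengthB) {
        int longest = Math.max(lengthA, lengthB);
        if (longest == 0) {
            return 1.0f; // two empty strings are identical
        }
        return 1.0f - (float) editDistance / longest;
    }

    // Assumed rule: a suggestion is kept when its score is at least minScore.
    static boolean passes(float score, float minScore) {
        return score >= minScore;
    }
}
```

Under these assumptions, "misspelt" and "misspell" (distance 1, both 8 characters) score 0.875 and pass the default threshold of 0.5, while a candidate at distance 6 scores 0.25 and is dropped.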
public String[] suggestSimilar(String word, int numSug) throws IOException
The Lucene similarity used to fetch the most relevant n-grammed terms is not the same as the edit distance strategy used to pick the best-matching word from the hits that Lucene found, so one usually has to retrieve several suggestions in order to get the true best match.
That is, if numSug == 1, don't count on that suggestion being the best one; set numSug to at least 5 for a good suggestion.
Parameters:
word - the word you want a spell check done on
numSug - the number of suggested words
Throws:
IOException
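The advice to request several suggestions follows from the two-stage process described for this method: candidates are retrieved in one order and then judged by edit distance. The sketch below imitates only the second stage, re-ordering an already retrieved candidate list by Levenshtein distance. `RerankSketch` and its methods are hypothetical, the candidate list and its order are invented for the example, and the distance function is a textbook implementation rather than Lucene's own.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not Lucene code: re-ranks retrieved candidates by edit distance.
final class RerankSketch {

    // Textbook dynamic-programming Levenshtein distance using two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Sorts candidates by distance to word (stable for ties) and keeps the first numSug.
    static List<String> rerank(String word, List<String> candidates, int numSug) {
        List<String> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingInt(c -> editDistance(word, c)));
        return new ArrayList<>(sorted.subList(0, Math.min(numSug, sorted.size())));
    }
}
```

If the retrieval stage returned "mistake", "misspell", "misspelled" in that order for the word "misspelt", keeping only the first hit would give "mistake", whereas re-ranking all three puts "misspell" first.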
public String[] suggestSimilar(String word, int numSug, IndexReader ir, String field, boolean morePopular) throws IOException
The Lucene similarity used to fetch the most relevant n-grammed terms is not the same as the edit distance strategy used to pick the best-matching word from the hits that Lucene found, so one usually has to retrieve several suggestions in order to get the true best match.
That is, if numSug == 1, don't count on that suggestion being the best one; set numSug to at least 5 for a good suggestion.
Parameters:
word - the word you want a spell check done on
numSug - the number of suggested words
ir - the IndexReader of the user index (can be null; see the field parameter)
field - the field of the user index: if field is not null, the suggested words are restricted to the words present in this field
morePopular - return only the suggested words that are more frequent than the searched word (only in restricted mode, i.e. when ir != null and field != null)
Throws:
IOException
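The restricted mode described by the field and morePopular parameters can be pictured as a filter over the suggestions. The following sketch assumes a plain map of term frequencies standing in for the user index field; `PopularitySketch`, `filter`, and `fieldFreq` are hypothetical names, and treating "more frequent" as a strict comparison is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Lucene code: filters suggestions against a field's term frequencies.
final class PopularitySketch {

    // fieldFreq stands in for the frequency of each term in the chosen field of the user index.
    static List<String> filter(String word, List<String> suggestions,
                               Map<String, Integer> fieldFreq, boolean morePopular) {
        int wordFreq = fieldFreq.getOrDefault(word, 0);
        List<String> kept = new ArrayList<>();
        for (String suggestion : suggestions) {
            int freq = fieldFreq.getOrDefault(suggestion, 0);
            if (freq < 1) {
                continue; // restricted mode: the suggestion must be present in the field
            }
            if (morePopular && freq <= wordFreq) {
                continue; // assumed strict rule: must be more frequent than the searched word
            }
            kept.add(suggestion);
        }
        return kept;
    }
}
```

With frequencies lucene=10, lucent=2, lucen=3 and the searched word "lucen", the suggestions "lucene", "lucent", "lucerne" reduce to "lucene" when morePopular is true, and to "lucene", "lucent" when it is false.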
public void clearIndex() throws IOException
Throws:
IOException
public boolean exist(String word) throws IOException
Parameters:
word
Throws:
IOException
public void indexDictionary(Dictionary dict) throws IOException
Parameters:
dict - the dictionary to index
Throws:
IOException
Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.