Package | Description |
---|---|
org.apache.lucene.analysis | API and code to convert text into indexable/searchable tokens. |
org.apache.lucene.analysis.br | Analyzer for Brazilian Portuguese. |
org.apache.lucene.analysis.cn | Analyzer for Chinese. |
org.apache.lucene.analysis.de | Analyzer for German. |
org.apache.lucene.analysis.el | Analyzer for Greek. |
org.apache.lucene.analysis.fr | Analyzer for French. |
org.apache.lucene.analysis.ngram | |
org.apache.lucene.analysis.nl | Analyzer for Dutch. |
org.apache.lucene.analysis.payloads | Provides various convenience classes for creating payloads on Tokens. |
org.apache.lucene.analysis.ru | Analyzer for Russian. |
org.apache.lucene.analysis.snowball | TokenFilter and Analyzer implementations that use Snowball stemmers. |
org.apache.lucene.analysis.standard | A fast grammar-based tokenizer constructed with JFlex. |
org.apache.lucene.analysis.th | |
org.apache.lucene.index.memory | High-performance single-document main memory Apache Lucene fulltext search index. |

Modifier and Type | Class and Description |
---|---|
class | CachingTokenFilter: This class can be used if the Tokens of a TokenStream are intended to be consumed more than once. |
class | ISOLatin1AccentFilter: A filter that replaces accented characters in the ISO Latin 1 character set (ISO-8859-1) by their unaccented equivalent. |
class | LengthFilter: Removes words that are too long or too short from the stream. |
class | LowerCaseFilter: Normalizes token text to lower case. |
class | PorterStemFilter: Transforms the token stream as per the Porter stemming algorithm. |
class | StopFilter: Removes stop words from a token stream. |
class | TeeTokenFilter: Works in conjunction with the SinkTokenizer to provide the ability to set aside tokens that have already been analyzed. |
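
To make the combined effect of LowerCaseFilter, StopFilter, and LengthFilter concrete, the following is a minimal plain-Java sketch of the same three steps applied to a list of strings. It does not use the Lucene API: the class name `FilterChainSketch`, the method `analyze`, and its parameters are hypothetical names chosen for illustration, and treating the length bounds as inclusive is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; this is not Lucene code.
public class FilterChainSketch {

    /**
     * Lowercases each token, drops stop words, then drops tokens whose
     * length falls outside [minLen, maxLen] (inclusive bounds assumed).
     */
    public static List<String> analyze(List<String> tokens, Set<String> stopWords,
                                       int minLen, int maxLen) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            // Step 1: normalize to lower case (the LowerCaseFilter idea).
            String lower = token.toLowerCase(Locale.ROOT);
            // Step 2: skip stop words (the StopFilter idea).
            if (stopWords.contains(lower)) {
                continue;
            }
            // Step 3: skip tokens that are too short or too long (the LengthFilter idea).
            if (lower.length() < minLen || lower.length() > maxLen) {
                continue;
            }
            out.add(lower);
        }
        return out;
    }
}
```

In Lucene itself each step is a separate TokenFilter wrapped around the previous TokenStream, so the order of wrapping matters: here the stop-word check only works because lowercasing happens first.
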

Modifier and Type | Class and Description |
---|---|
class | BrazilianStemFilter: Based on GermanStemFilter. |

Modifier and Type | Class and Description |
---|---|
class | ChineseFilter: A filter with a stop word table. Rule: no digits are allowed. |

Modifier and Type | Class and Description |
---|---|
class | GermanStemFilter: A filter that stems German words. |

Modifier and Type | Class and Description |
---|---|
class | GreekLowerCaseFilter: Normalizes token text to lower case, using the given ("greek") charset. |

Modifier and Type | Class and Description |
---|---|
class | ElisionFilter: Removes elisions from a token stream. |
class | FrenchStemFilter: A filter that stems French words. |

Modifier and Type | Class and Description |
---|---|
class | EdgeNGramTokenFilter: Tokenizes the given token into n-grams of given size(s). |
class | NGramTokenFilter: Tokenizes the input into n-grams of the given size(s). |
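
The difference between the two n-gram filters is easier to see on a single term. The sketch below is plain Java rather than Lucene code: `NGramSketch`, `nGrams`, and `edgeNGrams` are hypothetical names, the edge variant assumes grams are taken from the front of the term, and the order in which grams are emitted is an assumption that may not match the real filters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Lucene code.
public class NGramSketch {

    /** All substrings of term with length n, for each n in [minGram, maxGram]. */
    public static List<String> nGrams(String term, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        for (int n = minGram; n <= maxGram; n++) {
            // Slide a window of width n across the term.
            for (int start = 0; start + n <= term.length(); start++) {
                out.add(term.substring(start, start + n));
            }
        }
        return out;
    }

    /** Prefixes of term with length in [minGram, maxGram] (front edge assumed). */
    public static List<String> edgeNGrams(String term, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        for (int n = minGram; n <= maxGram && n <= term.length(); n++) {
            out.add(term.substring(0, n));
        }
        return out;
    }
}
```

For the term "abc" with sizes 1 to 2, `nGrams` yields a, b, c, ab, bc, while `edgeNGrams` yields only a and ab. Edge n-grams are the usual choice for prefix matching, since they index far fewer grams per term.
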

Modifier and Type | Class and Description |
---|---|
class | DutchStemFilter: A filter that stems Dutch words. |

Modifier and Type | Class and Description |
---|---|
class | NumericPayloadTokenFilter: Assigns a payload to a token based on the Token.type(). |
class | TokenOffsetPayloadTokenFilter: Adds the token's offsets (see Token.setStartOffset(int) and Token.setEndOffset(int)) as a payload. The first 4 bytes are the start offset. |
class | TypeAsPayloadTokenFilter: Makes the Token.type() a payload. |
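
As an illustration of an offset payload like the one TokenOffsetPayloadTokenFilter describes, here is a minimal plain-Java sketch that packs two offsets into 8 bytes. The source only states that the first 4 bytes hold the start offset; placing the end offset in the last 4 bytes and using big-endian byte order are assumptions of this sketch, and `OffsetPayloadSketch`, `encodeOffsets`, and `decodeOffsets` are hypothetical names, not Lucene API.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; this is not Lucene code.
public class OffsetPayloadSketch {

    /**
     * Packs the offsets into 8 bytes: bytes 0-3 hold the start offset,
     * bytes 4-7 hold the end offset (assumed layout, big-endian).
     */
    public static byte[] encodeOffsets(int startOffset, int endOffset) {
        // ByteBuffer defaults to big-endian byte order.
        return ByteBuffer.allocate(8).putInt(startOffset).putInt(endOffset).array();
    }

    /** Reverses encodeOffsets, returning {startOffset, endOffset}. */
    public static int[] decodeOffsets(byte[] payload) {
        ByteBuffer buf = ByteBuffer.wrap(payload);
        int start = buf.getInt();
        int end = buf.getInt();
        return new int[] { start, end };
    }
}
```

A consumer that reads such a payload at search time needs to agree with the writer on exactly this layout, which is why the byte positions are worth spelling out.
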

Modifier and Type | Class and Description |
---|---|
class | RussianLowerCaseFilter: Normalizes token text to lower case, using the given ("russian") charset. |
class | RussianStemFilter: A filter that stems Russian words. |

Modifier and Type | Class and Description |
---|---|
class | SnowballFilter: A filter that stems words using a Snowball-generated stemmer. |

Modifier and Type | Class and Description |
---|---|
class | StandardFilter: Normalizes tokens extracted with StandardTokenizer. |

Modifier and Type | Class and Description |
---|---|
class | ThaiWordFilter: TokenFilter that uses java.text.BreakIterator to break each Token that is Thai into separate Token(s) for each Thai word. |

Modifier and Type | Class and Description |
---|---|
class | SynonymTokenFilter: Injects additional tokens for synonyms of token terms fetched from the underlying child stream; the child stream must deliver lowercase tokens for synonyms to be found. |

Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.