org.apache.lucene.analysis.TokenStream
An abstract class. A TokenStream enumerates the sequence of tokens, either from Fields of a Document or from query text.
TokenStream org.apache.lucene.analysis.Analyzer.tokenStream(String fieldName, Reader reader)
Creates a TokenStream which tokenizes all the text in the provided Reader, using this Analyzer.
void org.apache.lucene.analysis.TokenStream.reset() throws IOException
Resets this stream to the beginning, moving its cursor back to the initial position.
boolean org.apache.lucene.analysis.TokenStream.incrementToken() throws IOException
Consumers (i.e., IndexWriter) use this method to advance the stream to the next token. It returns false once the stream is exhausted.
org.apache.lucene.analysis.tokenattributes.CharTermAttribute
The term text of a Token.
<CharTermAttribute> CharTermAttribute org.apache.lucene.util.AttributeSource.getAttribute(Class<CharTermAttribute> attClass)
Gets the specified Attribute. The caller must pass in a Class<? extends Attribute> value. Returns the instance of the passed-in Attribute contained in this AttributeSource.
Reposted from: http://xvoel.baihongyu.com/
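The consumer pattern described by reset(), incrementToken() and the term attribute can be sketched without Lucene on the classpath. The classes below (TokenStreamSketch, SimpleTokenStream, TermAttribute) are hypothetical stand-ins written only for illustration, and whitespace splitting with lowercasing is an assumed, simplified analysis step. They are not Lucene's implementation. The point is the calling order: fetch the attribute once, reset the stream, then read the attribute after every successful incrementToken().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class TokenStreamSketch {

    /** Hypothetical stand-in for CharTermAttribute: holds the current token's text. */
    static final class TermAttribute {
        private String term = "";

        void set(String value) {
            term = value;
        }

        @Override
        public String toString() {
            return term;
        }
    }

    /** Hypothetical stand-in for a TokenStream (assumption: lowercase + whitespace split). */
    static final class SimpleTokenStream {
        private final String[] words;
        private final TermAttribute termAttribute = new TermAttribute();
        private int cursor;

        SimpleTokenStream(String text) {
            String trimmed = text.trim();
            this.words = trimmed.isEmpty()
                    ? new String[0]
                    : trimmed.toLowerCase(Locale.ROOT).split("\\s+");
        }

        /** Returns the single attribute instance that is updated in place per token. */
        TermAttribute getTermAttribute() {
            return termAttribute;
        }

        /** Moves the cursor back to the first token. */
        void reset() {
            cursor = 0;
            termAttribute.set("");
        }

        /** Advances to the next token; returns false when no tokens remain. */
        boolean incrementToken() {
            if (cursor >= words.length) {
                return false;
            }
            termAttribute.set(words[cursor++]);
            return true;
        }
    }

    /** Consumer loop: attribute fetched once, then read after each advance. */
    public static List<String> collectTerms(String text) {
        SimpleTokenStream stream = new SimpleTokenStream(text);
        TermAttribute term = stream.getTermAttribute();
        stream.reset();
        List<String> terms = new ArrayList<>();
        while (stream.incrementToken()) {
            terms.add(term.toString());
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(collectTerms("Hello Lucene token stream"));
    }
}
```

With Lucene itself, the same loop would obtain the stream from Analyzer.tokenStream(fieldName, reader) and the attribute from getAttribute(CharTermAttribute.class). Exact constructor arguments and any required cleanup calls vary by Lucene version, so check the javadoc of the release in use.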