Returns an order in [minNgramOrder, maxNgramOrder] if valid; otherwise errors out.
Packs a sequence of words of type WordType into a single NGramType. The current word is ngram.last, and the words before it are the context.
Unpacks the word at position pos from the packed ngram of type NGramType. Position 0 indicates the farthest context word (for a unigram, the current word), and position MAX_ORDER-1 represents the current word.
Useful for getting words at special positions (e.g. the first two in the context).
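The pack/unpack round trip described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the object name PackedLongIndexer, the choice of Int word ids packed into a Long, the 20-bit slot width, and the maximum order of 3 are all assumptions. The sketch also assumes the current word sits at position order-1, which coincides with MAX_ORDER-1 only for a full-order ngram.

```scala
// Hypothetical sketch: packs up to MaxOrder word ids into a single Long.
object PackedLongIndexer {
  val MaxOrder = 3      // assumed maximum ngram order
  val BitsPerWord = 20  // assumed width of each word slot
  private val WordMask = (1L << BitsPerWord) - 1
  // The order is stored in the bits above the word slots.
  private val OrderShift = MaxOrder * BitsPerWord

  /** Packs word ids; ngram.last is the current word, earlier ids are the context. */
  def pack(ngram: Seq[Int]): Long = {
    require(ngram.nonEmpty && ngram.length <= MaxOrder, s"invalid order ${ngram.length}")
    require(ngram.forall(w => w >= 0 && w <= WordMask), "word id out of range")
    // Slot i holds the word at position i (0 = farthest context).
    val words = ngram.zipWithIndex.foldLeft(0L) { case (acc, (w, i)) =>
      acc | (w.toLong << (i * BitsPerWord))
    }
    words | (ngram.length.toLong << OrderShift)
  }

  /** Returns the order in [1, MaxOrder], or fails if the packed value is invalid. */
  def ngramOrder(ngram: Long): Int = {
    val order = (ngram >>> OrderShift).toInt
    require(order >= 1 && order <= MaxOrder, s"invalid order $order")
    order
  }

  /** Unpacks the word at position pos (0 = farthest context). */
  def unpack(ngram: Long, pos: Int): Int = {
    require(pos >= 0 && pos < ngramOrder(ngram), s"position $pos out of range")
    ((ngram >>> (pos * BitsPerWord)) & WordMask).toInt
  }
}
```

With this layout, packing Seq(5, 7, 9) and unpacking position 0 yields the farthest context word 5, while position 2 yields the current word 9.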
A family of NGramIndexer implementations that can unpack or strip off specific words, query the order of a packed ngram, etc.
Such indexers are useful for LMs that require backoff contexts (e.g. Stupid Backoff, Kneser-Ney).
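To illustrate why stripping words matters for backoff, here is a rough sketch of a Stupid Backoff score over unpacked ngrams. The helper names removeFarthestWord and removeCurrentWord, the use of plain Seq[String] instead of a packed NGramType, and the count table are all hypothetical; the backoff factor 0.4 is the value commonly used with Stupid Backoff, taken here as an assumption.

```scala
// Hypothetical sketch: ngrams are plain Seq[String], last element = current word.
object BackoffSketch {
  /** Drops the farthest context word, giving the next shorter ngram to back off to. */
  def removeFarthestWord(ngram: Seq[String]): Seq[String] = ngram.tail

  /** Drops the current word, leaving only the context. */
  def removeCurrentWord(ngram: Seq[String]): Seq[String] = ngram.init

  /** All ngrams a backoff model would try, longest first. */
  def backoffChain(ngram: Seq[String]): List[Seq[String]] =
    Iterator.iterate(ngram)(removeFarthestWord).takeWhile(_.nonEmpty).toList

  /** Stupid Backoff score; alpha is the assumed backoff factor. */
  def score(
      ngram: Seq[String],
      counts: Map[Seq[String], Long],
      totalWords: Long,
      alpha: Double = 0.4): Double = {
    val c = counts.getOrElse(ngram, 0L)
    if (ngram.length == 1) {
      // Base case: unigram relative frequency.
      c.toDouble / totalWords
    } else if (c > 0) {
      // Seen ngram: relative frequency given its context.
      c.toDouble / counts(removeCurrentWord(ngram))
    } else {
      // Unseen ngram: back off to the shorter context, discounted by alpha.
      alpha * score(removeFarthestWord(ngram), counts, totalWords, alpha)
    }
  }
}
```

For example, if the trigram ("the", "quick", "fox") is unseen but the bigram ("quick", "fox") has count 2 and its context ("quick") has count 4, the score backs off once and is alpha times 2/4.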