The classes in this module encode and decode posting information for a field. The field format essentially determines what information is stored about each occurrence of a term.
Abstract base class representing a storage format for a field or vector. Format objects are responsible for writing and reading the low-level representation of a field. They control what kind and level of information is stored about the indexed fields.
Returns a whoosh.analysis.Token iterator from the given unicode string.
Takes the text value to be indexed and yields a series of ("tokentext", frequency, valuestring) tuples, where frequency is the number of times "tokentext" appeared in the value, and valuestring is the encoded field-specific posting value for the token. For example, in a Frequency format, the value string would be the same as the frequency; in a Positions format, the value string would encode a list of token positions at which "tokentext" occurred.
Parameters: value – The unicode text to index.
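The tuple shape described above can be illustrated with a minimal, library-independent sketch. It does not use Whoosh's actual classes or its encoding of value strings: the helper names `tokenize`, `frequency_word_values`, and `positions_word_values` are hypothetical, the whitespace tokenizer stands in for a real analyzer, and plain Python values are used where a real format would produce an encoded string.

```python
from collections import Counter, defaultdict


def tokenize(value):
    # Hypothetical stand-in for an analyzer: lowercase, split on whitespace.
    return value.lower().split()


def frequency_word_values(value):
    # Frequency-style format: the posting value is simply the frequency.
    counts = Counter(tokenize(value))
    for text, freq in sorted(counts.items()):
        yield text, freq, freq


def positions_word_values(value):
    # Positions-style format: the posting value lists the token positions.
    # (A real format would encode this list; it is left unencoded here.)
    positions = defaultdict(list)
    for pos, text in enumerate(tokenize(value)):
        positions[text].append(pos)
    for text, poslist in sorted(positions.items()):
        yield text, len(poslist), poslist


if __name__ == "__main__":
    for item in positions_word_values("to be or not to be"):
        print(item)
```

In both helpers the frequency is derived from the same token stream as the posting value, so the two always agree: in the positions sketch the frequency is just the length of the position list.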
Only indexes whether a given term occurred in a given document; it does not store frequencies or positions. This is useful for fields that should be searchable but not scorable, such as a file path.
Supports: frequency, weight (always reports frequency = 1).
Stores frequency information for each posting.
Supports: frequency, weight.
A format that stores frequency and per-document boost information for each posting.
Supports: frequency, weight.
A format that stores position information in each posting, to allow phrase searching and "near" queries.
Supports: frequency, weight, positions, position_boosts (always reports position boost = 1.0).
Stores token position and character start and end information for each posting.
Supports: frequency, weight, positions, position_boosts (always reports position boost = 1.0), characters.
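As a rough illustration of the information a character-aware format keeps per posting, the sketch below collects a token position together with character start and end offsets for each occurrence. It is not Whoosh's implementation: the function name `character_postings` and the `\w+` regular-expression tokenizer are assumptions made for this example only.

```python
import re
from collections import defaultdict


def character_postings(value):
    # Map each lowercased token to a list of
    # (token position, start character, end character) triples.
    postings = defaultdict(list)
    for pos, match in enumerate(re.finditer(r"\w+", value)):
        token = match.group().lower()
        postings[token].append((pos, match.start(), match.end()))
    return dict(postings)


if __name__ == "__main__":
    print(character_postings("Hello world hello"))
```

Keeping the character offsets alongside the positions is what would let a caller locate a matched term in the original text without re-tokenizing it.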
A format that stores positions and per-position boost information in each posting.
Supports: frequency, weight, positions, position_boosts.
A format that stores positions, character start and end, and per-position boost information in each posting.
Supports: frequency, weight, positions, position_boosts, characters, character_boosts.