A hand-written query parser built on modular plug-ins. The default configuration implements a powerful fielded query language similar to Lucene’s.
You can pass the plugins argument when creating the object to override the default list of plug-ins, or call add_plugin() and remove_plugin_class() afterwards to change the plug-ins included in the parser.
>>> from whoosh import qparser
>>> parser = qparser.QueryParser("content", schema)
>>> parser.remove_plugin_class(qparser.WildcardPlugin)
>>> parser.parse(u"hello there")
And([Term("content", u"hello"), Term("content", u"there")])
Adds the given plugin to the list of plugins in this parser.
Returns a prioritized list of filter functions from the included plugins.
Parses the input string and returns a Query object/tree.
This method may return None if the input string does not result in any valid queries.
Removes the given plugin from the list of plugins in this parser.
Removes any plugins of the given class from this parser.
Removes any plugins of the class of the given plugin and then adds it. This is a convenience method to keep from having to call remove_plugin_class followed by add_plugin each time you want to reconfigure a default plugin.
>>> qp = qparser.QueryParser("content", schema)
>>> qp.replace_plugin(qparser.NotPlugin("(^| )-"))
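The remove-then-add behaviour of replace_plugin can be pictured with a small stand-alone sketch. The PluginHost and NotPlugin classes below are hypothetical stand-ins written for illustration; they are not Whoosh's own classes and do not show how the library stores its plug-ins.

```python
class PluginHost:
    """Hypothetical stand-in for a parser that holds a list of plug-ins."""

    def __init__(self, plugins=None):
        self.plugins = list(plugins or [])

    def add_plugin(self, plugin):
        self.plugins.append(plugin)

    def remove_plugin_class(self, cls):
        # Drop every plug-in that is an instance of the given class.
        self.plugins = [p for p in self.plugins if not isinstance(p, cls)]

    def replace_plugin(self, plugin):
        # Convenience: remove plug-ins of the same class, then add the new one.
        self.remove_plugin_class(type(plugin))
        self.add_plugin(plugin)


class NotPlugin:
    """Hypothetical plug-in configured with a pattern string."""

    def __init__(self, pattern="NOT"):
        self.pattern = pattern


host = PluginHost([NotPlugin()])
host.replace_plugin(NotPlugin("(^| )-"))
patterns = [p.pattern for p in host.plugins]
print(patterns)
```

After the call, the host holds exactly one NotPlugin, configured with the new pattern.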
Returns the appropriate query object for a single term in the query string.
Returns a prioritized list of tokens from the included plugins.
The following functions return pre-configured QueryParser objects.
Returns a QueryParser configured to search in multiple fields.
Instead of assigning unfielded clauses to a default field, this parser transforms them into an OR clause that searches a list of fields. For example, if the list of multi-fields is “f1”, “f2” and the query string is “hello there”, the class will parse “(f1:hello OR f2:hello) (f1:there OR f2:there)”. This is very useful when you have two textual fields (e.g. “title” and “content”) you want to search by default.
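The expansion described above can be sketched with plain string handling. This is only an illustration of the rewrite rule, assuming whitespace-separated words and a colon as the field marker; the expand_unfielded function is hypothetical and is not how the parser is implemented.

```python
def expand_unfielded(query, fieldnames):
    """Rewrite each unfielded word as an OR group over the given fields."""
    clauses = []
    for word in query.split():
        if ":" in word:
            # Already fielded: leave the clause untouched.
            clauses.append(word)
        else:
            alternatives = " OR ".join(f"{name}:{word}" for name in fieldnames)
            clauses.append(f"({alternatives})")
    return " ".join(clauses)


expanded = expand_unfielded("hello there", ["f1", "f2"])
print(expanded)  # (f1:hello OR f2:hello) (f1:there OR f2:there)
```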
Returns a QueryParser configured to support only +, -, and phrase syntax.
Returns a QueryParser configured to support only +, -, and phrase syntax, and which converts individual terms into DisjunctionMax queries across a set of fields.
Adds the ability to specify the field of a clause using a colon.
This plugin is included in the default parser configuration.
alias of OperatorsPlugin
This plugin is deprecated; its functionality is now provided by the OperatorsPlugin.
Adds the ability to specify wildcard queries by using asterisk and question mark characters in terms. Note that wildcard queries can be very performance- and memory-intensive, so you may want to leave this plugin out of the parser.
This plugin is included in the default parser configuration.
Adds the ability to specify prefix queries by ending a term with an asterisk. This plugin is useful if you want the user to be able to create prefix but not wildcard queries (for performance reasons). If you are including the wildcard plugin, you should not include this plugin as well.
Adds the ability to specify phrase queries inside double quotes.
This plugin has no configuration.
This plugin is included in the default parser configuration.
Adds the ability to specify term ranges.
This plugin has no configuration.
This plugin is included in the default parser configuration.
Adds the ability to specify single “terms” containing spaces by enclosing them in single quotes.
This plugin has no configuration.
This plugin is included in the default parser configuration.
Adds the ability to group clauses using parentheses.
This plugin is included in the default parser configuration.
Adds the ability to boost clauses of the query using the circumflex.
This plugin is included in the default parser configuration.
This plugin is deprecated; its functionality is now provided by the OperatorsPlugin.
Adds the ability to use + and - in a flat OR query to specify required and prohibited terms.
This is the basis for the parser configuration returned by SimpleParser().
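As a rough picture of what + and - mean in a flat OR query, the sketch below sorts the words of a query string into required, prohibited, and optional lists. The split_plus_minus function is a hypothetical helper for illustration only, assuming whitespace-separated words; it does not build query objects and is not the plugin's code.

```python
def split_plus_minus(query):
    """Sort words into (required, prohibited, optional) by their prefix."""
    required, prohibited, optional = [], [], []
    for word in query.split():
        if word.startswith("+") and len(word) > 1:
            required.append(word[1:])
        elif word.startswith("-") and len(word) > 1:
            prohibited.append(word[1:])
        else:
            # No prefix: the word is one of the OR-ed optional terms.
            optional.append(word)
    return required, prohibited, optional


required, prohibited, optional = split_plus_minus("+apple banana -cherry")
print(required, prohibited, optional)
```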
Converts any unfielded terms into OR clauses that search for the term in a specified list of fields.
Converts any unfielded terms into DisjunctionMax clauses that search for the term in a specified list of fields.
Adds the ability to use “aliases” of fields in the query string.
>>> # Allow users to use 'body' or 'text' to refer to the 'content' field
>>> parser.add_plugin(qparser.FieldAliasPlugin({"content": ["body", "text"]}))
>>> parser.parse("text:hello")
Term("content", "hello")
Looks for basic syntax tokens (terms, prefixes, wildcards, phrases, etc.) occurring in a certain field and replaces each one with a group (by default OR) containing the original token and the token copied to a new field.
For example, the query:
hello name:matt
could be automatically converted by CopyFieldPlugin({"name": "author"}) to:
hello (name:matt OR author:matt)
This is useful where one field was indexed with a differently-analyzed copy of another, and you want the query to search both fields.
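The rewrite in the example above can be imitated on the query string itself. The sketch below assumes the field mapping is a dictionary from source field name to destination field name, and that clauses look like field:text without spaces; copy_fields is a hypothetical helper, not the plugin's implementation, which works on parsed tokens rather than text.

```python
import re


def copy_fields(query, fieldmap):
    """Replace field:text with (field:text OR copy:text) for mapped fields."""

    def repl(match):
        field, text = match.group(1), match.group(2)
        target = fieldmap.get(field)
        if target is None:
            # Field is not in the mapping: keep the clause unchanged.
            return match.group(0)
        return f"({field}:{text} OR {target}:{text})"

    return re.sub(r"(\w+):(\S+)", repl, query)


copied = copy_fields("hello name:matt", {"name": "author"})
print(copied)  # hello (name:matt OR author:matt)
```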
Allows the user to use greater than/less than symbols to create range queries:
a:>100 b:<=z c:>=-1.4 d:<mz
This is the equivalent of:
a:{100 to] b:[to z] c:[-1.4 to] d:[to mz}
The plugin recognizes >, <, >=, <=, =>, and =< after a field specifier. The field specifier is required. You cannot do the following:
>100
This plugin requires the FieldsPlugin and RangePlugin to work.
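The correspondence between the comparison operators and range brackets can be written out as a small table-driven rewrite. This is a sketch of the mapping only, using a regular expression over the query string; gtlt_to_range and its helper names are hypothetical and do not reflect how the plugin operates on the token stream.

```python
import re

# Operator -> (opening bracket, closing bracket, True if the value is the lower bound).
# Curly brackets mark an exclusive end, square brackets an inclusive or open end.
_OPS = {
    ">": ("{", "]", True),
    ">=": ("[", "]", True),
    "=>": ("[", "]", True),
    "<": ("[", "}", False),
    "<=": ("[", "]", False),
    "=<": ("[", "]", False),
}

# Two-character operators are listed first so they win over ">" and "<".
_PATTERN = re.compile(r"(\w+):(>=|<=|=>|=<|>|<)(\S+)")


def gtlt_to_range(query):
    """Rewrite field:>value style clauses into bracketed range syntax."""

    def repl(match):
        field, op, value = match.groups()
        start, end, is_lower = _OPS[op]
        if is_lower:
            return f"{field}:{start}{value} to{end}"
        return f"{field}:{start}to {value}{end}"

    return _PATTERN.sub(repl, query)


converted = gtlt_to_range("a:>100 b:<=z c:>=-1.4 d:<mz")
print(converted)  # a:{100 to] b:[to z] c:[-1.4 to] d:[to mz}
```

Because the pattern requires a field specifier before the operator, an unfielded clause such as >100 is left unchanged, mirroring the restriction stated above.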
An object representing parsed text. These objects generally correspond to a query object type. They are intermediate objects that represent the syntax tree parsed from a query string, from which the query tree is then generated. Some syntax objects, such as the one representing whitespace, have no corresponding query type.
Represents a group of syntax objects. These generally correspond to compound query objects such as query.And and query.Or.
Syntax group corresponding to an And query.
Syntax group corresponding to an Or query.
Syntax group corresponding to an AndNot query.
Syntax group corresponding to an AndMaybe query.
Syntax group corresponding to a DisjunctionMax query.
Syntax group corresponding to a Not query.
A parse-able token object. Each token class has an expr attribute containing a regular expression that matches the token text. When this expression is found, the class/object’s create() method is called and returns a token object to represent the match in the token stream.
Many token classes do the parsing using class methods and put instances of themselves in the token stream. However, parseable objects requiring configuration (such as the Operator subclasses) may use separate objects for doing the parsing and embodying the token.
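The expr/create() arrangement can be illustrated with a toy tokenizer. All class names and the tokenize function below are hypothetical and chosen for the example; only the idea of a per-class regular expression plus a create() method comes from the description above.

```python
import re


class Token:
    """Hypothetical base class: subclasses set `expr` and inherit create()."""

    expr = None

    def __init__(self, text):
        self.text = text

    @classmethod
    def create(cls, match):
        # Build a token instance from the regular expression match.
        return cls(match.group(0))

    def __repr__(self):
        return f"{type(self).__name__}({self.text!r})"


class OpenParen(Token):
    expr = re.compile(r"\(")


class CloseParen(Token):
    expr = re.compile(r"\)")


class Word(Token):
    expr = re.compile(r"[^\s()]+")


TOKEN_CLASSES = [OpenParen, CloseParen, Word]  # tried in priority order


def tokenize(text):
    """Scan the text, asking each token class in turn to match at the cursor."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        for cls in TOKEN_CLASSES:
            match = cls.expr.match(text, pos)
            if match:
                tokens.append(cls.create(match))
                pos = match.end()
                break
        else:
            raise ValueError(f"no token matches at position {pos}")
    return tokens


stream = tokenize("(hello there)")
print(stream)
```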
Base class for tokens that don’t carry any information specific to each instance (e.g. “open parenthesis” token), so they can all share the same instance.
Base class for “basic” (atomic) syntax – term, prefix, wildcard, phrase, range.
Syntax object representing a term.