A query can consist of simple terms, i.e. single words, and of phrases, i.e. several words enclosed in quotation marks, e.g. "Nicolaus Copernicus University". When quotation marks are used, only documents that contain the entire phrase are returned.
Search terms can be combined using logical operators. You can also use so-called wildcard characters, which stand for any letter or digit or any sequence of them, search for terms with similar spelling, search for terms that lie within a given distance of each other, and set the priority of individual search terms.
Fuzzy search finds simple terms with similar spelling, such as Copernicus, Copernikus and Kopernikus. To search for documents containing such terms, add a tilde to the end of the term: copernicus~.
The desired degree of similarity can be set with a factor that ranges from 0 (no similarity) to 1 (identical terms). By default, the similarity factor is 0.5. To set it explicitly, put the value right after the tilde, e.g. kopernik~0.4.
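To make the similarity factor more tangible, the sketch below computes one plausible measure: one minus the edit distance between two terms divided by the length of the shorter term. This formula is an assumption made for illustration, and the search engine's real calculation may differ. The function names edit_distance, similarity and fuzzy_match are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character edits."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # substitute (or keep) a character
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed measure: 1.0 for identical terms, falling towards 0.0."""
    a, b = a.lower(), b.lower()
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 1.0 if a == b else 0.0
    return max(0.0, 1.0 - edit_distance(a, b) / shorter)


def fuzzy_match(query, candidates, minimum=0.5):
    """Keep the candidates whose similarity to the query reaches the factor."""
    return [c for c in candidates if similarity(query, c) >= minimum]


if __name__ == "__main__":
    spellings = ["Copernicus", "Copernikus", "Kopernikus", "Newton"]
    print(fuzzy_match("copernicus", spellings))
```

Under this assumed measure, Copernikus differs from Copernicus by one letter and Kopernikus by two, so both stay well above the default factor of 0.5, while an unrelated name such as Newton is rejected.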
It is also possible to specify how far apart the search terms may be from each other (proximity search). For example, if we remember that the words Choral-buch and Westpreussen appeared a short distance from each other in the document, we can use the following query: "Choral-buch Westpreussen"~6.
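The idea behind a proximity query can be sketched as follows. The sketch assumes that distance is counted as the difference between word positions and that words are split on anything other than letters, digits and hyphens; the search engine may count and tokenize differently. The function names and the sample sentence are invented for illustration.

```python
import re


def tokenize(text):
    # Assumed tokenization: lowercase, keep runs of word characters and hyphens.
    return re.findall(r"[\w-]+", text.lower())


def within_distance(text, first, second, max_distance):
    """True if the two words occur at most max_distance positions apart."""
    words = tokenize(text)
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= max_distance
               for i in first_positions
               for j in second_positions)


if __name__ == "__main__":
    sample = "Choral-buch for the parishes of Westpreussen"  # invented sample text
    print(within_distance(sample, "Choral-buch", "Westpreussen", 6))
    print(within_distance(sample, "Choral-buch", "Westpreussen", 3))
```

In the sample sentence the two words are five positions apart, so the check succeeds with a limit of 6 and fails with a limit of 3.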
You can set the priority of a search term by appending the ^ sign and a number (greater than 1). For example, the query stempowski^4 grydzewski returns documents containing the two names, but those in which the higher-priority name appears come first in the list. The default priority is 1.
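The effect of priorities on the order of results can be illustrated with a deliberately simplified scoring model: each occurrence of a term adds its priority to the document's score. This model is an assumption and is much cruder than the ranking a real search engine performs. The names parse_boosts, toy_score and rank, and the sample documents, are invented.

```python
def parse_boosts(query):
    """Split a query such as 'stempowski^4 grydzewski' into {term: priority}."""
    boosts = {}
    for token in query.split():
        term, _, boost = token.partition("^")
        boosts[term.lower()] = float(boost) if boost else 1.0  # default priority is 1
    return boosts


def toy_score(text, boosts):
    # Toy model: every occurrence of a term contributes its priority.
    words = text.lower().split()
    return sum(boost * words.count(term) for term, boost in boosts.items())


def rank(documents, query):
    """Order documents from the highest toy score to the lowest."""
    boosts = parse_boosts(query)
    return sorted(documents, key=lambda d: toy_score(d, boosts), reverse=True)


if __name__ == "__main__":
    docs = ["letter from grydzewski", "letter from stempowski"]  # invented samples
    print(rank(docs, "stempowski^4 grydzewski"))
```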
Expressions in compound queries can be grouped using parentheses. This makes it possible to build elaborate queries with an unambiguous meaning, just as in arithmetic.
The partial expressions inside the parentheses are evaluated first, then the larger whole. The query "De revolutionibus orbium coelestium" AND (Copernicus OR Kopernik) finds documents that contain the title of Copernicus's work and his name in at least one of the two forms.
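The role of the parentheses can be shown by writing the query above as an ordinary boolean expression. The sketch assumes plain case-insensitive substring matching, which is a simplification of what the search engine does. The function names and sample texts are invented.

```python
TITLE = "De revolutionibus orbium coelestium"


def contains(text, term):
    # Simplified matching: case-insensitive substring test.
    return term.lower() in text.lower()


def matches(text):
    """"De revolutionibus orbium coelestium" AND (Copernicus OR Kopernik)"""
    return contains(text, TITLE) and (
        contains(text, "Copernicus") or contains(text, "Kopernik")
    )


def matches_regrouped(text):
    """("De revolutionibus orbium coelestium" AND Copernicus) OR Kopernik"""
    return (contains(text, TITLE) and contains(text, "Copernicus")) or contains(
        text, "Kopernik"
    )


if __name__ == "__main__":
    biography = "Kopernik: a biography"  # invented sample text
    print(matches(biography))            # the title is required, so no match
    print(matches_regrouped(biography))  # the name alone is enough here
```

Moving the parentheses changes which documents qualify: with the original grouping the title is always required, whereas in the regrouped variant the name Kopernik alone is sufficient.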
For obvious reasons, the characters used to build compound queries (+ - && || ! ( ) { } [ ] ^ " ~ * ? : \) are treated differently from others: they belong to the query syntax rather than to the search expression. To include them in the searched text, you have to put the so-called escape character, a backslash (\), in front of each of them. For example, to search for (2+2)*2, type \(2\+2\)\*2.
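When queries are assembled by a program, the escaping can be automated. The sketch below puts a backslash in front of every character from the list above; treating each & and | separately, rather than only the pairs && and ||, is a conservative assumption. The function name escape_query is hypothetical.

```python
import re

# Characters from the list above; & and | are escaped one by one (assumption).
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\&|])')


def escape_query(text):
    """Prefix every special query character with a backslash."""
    return _SPECIAL.sub(r"\\\1", text)


if __name__ == "__main__":
    print(escape_query("(2+2)*2"))
```

Only user-supplied text that should be matched literally needs this treatment; operators that are meant to act as query syntax must be left unescaped.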
For a full description of how to formulate queries, see Jakarta Lucene Query Parser Syntax.
Text originally published on the pages of the Kujawsko-Pomorska Digital Library.
This work is available under the Creative Commons Attribution-Share Alike 2.5 United States license.