This document is part of the Royal Danish Library's APIs, and in particular of the documentation on how to use our texts. See also Licences & Legalese and Caveats.
All the texts that can be searched in using the API are in Text Encoding Initiative, TEI for short, markup.
The software used for indexing is described in the documentation of the project SOLR and Snippets.
Read chapter 5, it is so good! They are indexed and searchable in principle. However, the user interface only supports them in the table of contents and quotation services.
paragraph, level, which implies
Note that this document does not define or describe all fields in the index. The index is far too rich for that, but I believe this document contains what it takes to use it. What I have left out is basically more of the same.
Finally, not all fields are available for all editions, because of the heterogeneity of the data or the wishes of the projects contributing data.
ID and Relations fields |
||
label | description | values |
---|---|---|
id | The ID of the record. It identifies the collection and the TEI file, and is constructed by concatenating the file's basename with the xml:id of the indexed content and some additional components. | string |
volume_id_ssi | The ID of the volume that contains the node | |
part_of_ssim | Array of IDs of trunk nodes that are containers of the node at hand. Typically containing | |
Filter fields |
||
label | description | values |
---|---|---|
cat_ssi | Category of a text. Use it to limit searches to works, or to find volumes or author portraits (biographies); omit it otherwise. | work, author, period |
is_editorial_ssi | The content's originator is someone other than the author. In this service that typically means forewords, prefaces, comments etc. in a scholarly edition. | yes, no |
type_ssi | Node type in the document. A trunk node can be a whole work, a chapter etc., whereas a leaf could be a paragraph of prose, a stanza (or strophe) of poetry, or a speech in a dialogue in a dramatic work. For historical reasons, whole texts have type_ssi:work. A query for type_ssi:trunk yields a result set comprising chapters or sections of some kind. | work, trunk, leaf, volume |
is_monograph_ssi | A monograph in the text service is perhaps not what you expect (on the other hand, what you expect is a monograph in the text service). A monograph is a volume with only one work. | yes, no |
genre_ssi | Genre of a leaf node. Note that this is not the genre of a work, but the structure of the paragraph-level markup. If there is a song in a dramatic work, the speech in question might be classified as containing mostly poetry. Available for all editions except GV. | prose, poetry, play |
subcollection_ssi | Filter with respect to collection. public-index.kb.dk contains all these editions. | adl, gv, jura, letters, lh, sks, tfs |
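The filter fields lend themselves to being checked before a query is sent. The following is a minimal sketch, assuming that the entries in the values column are the only legal values for each field; the helper `build_filters` and the `ALLOWED` mapping are hypothetical names introduced for illustration and are not part of the service.

```python
# Values copied from the filter table. The mapping and the helper are
# hypothetical illustrations, not an official client.
ALLOWED = {
    "cat_ssi": {"work", "author", "period"},
    "is_editorial_ssi": {"yes", "no"},
    "type_ssi": {"work", "trunk", "leaf", "volume"},
    "is_monograph_ssi": {"yes", "no"},
    "genre_ssi": {"prose", "poetry", "play"},
    "subcollection_ssi": {"adl", "gv", "jura", "letters", "lh", "sks", "tfs"},
}


def build_filters(**filters):
    """Return a sorted list of 'field:value' clauses, rejecting unknown values."""
    clauses = []
    for field, value in sorted(filters.items()):
        if field not in ALLOWED:
            raise ValueError(f"unknown filter field: {field}")
        if value not in ALLOWED[field]:
            raise ValueError(f"{value!r} is not a listed value for {field}")
        clauses.append(f"{field}:{value}")
    return clauses


clauses = build_filters(type_ssi="work", is_editorial_ssi="no")
print(" AND ".join(clauses))
```

Rejecting unlisted values early makes a typo such as `type_ssi:works` fail loudly rather than silently return an empty result set.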
Sort fields |
||
position_isi | The position of the current node along the sibling XPath axis in the document. Sorting on this field guarantees that the result is presented in document order. (We cannot use the page number, which may be either a Roman or an Arabic numeral. Also, we need to take into account leaf nodes within pages.) | integer |
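As an illustration of sorting in document order, here is a minimal sketch that assembles a request URL using the standard Solr parameters `q`, `sort` and `wt`. The endpoint `SOLR_SELECT` is a hypothetical placeholder, since the actual host and path of the index are not given here; substitute your own.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
SOLR_SELECT = "https://solr.example.org/solr/texts/select"

params = {
    "q": "volume_id_ssi:adl-texts-munp1-root AND text_tesim:regn",
    "sort": "position_isi asc",  # ascending position gives document order
    "wt": "json",
}

# urlencode percent-encodes reserved characters and turns spaces into "+".
url = f"{SOLR_SELECT}?{urlencode(params)}"
print(url)
```

Replacing `asc` with `desc` would return the same hits in reverse document order.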
Search fields |
||
label | description | values |
---|---|---|
work_title_tesim volume_title_tesim | Misc. metadata fields. There are more of them, but they should be self-explanatory. | just plain text |
author_name_tesim | The author(s) of a document. For messages it is assumed that author is a synonym of sender. | |
text_tesim | The text | just plain text |
prose_extract_tesim verse_extract_tesim performance_extract_tesim | The text, as in text_tesim, split up into fields according to its form. The fields get their content from <p> ... </p>, <lg> ... </lg> and <sp> ... </sp> respectively. | just plain text |
contains_ssi | We measure the length of the texts in prose_extract_tesim, verse_extract_tesim and performance_extract_tesim; whichever is the longest determines the value of this field. | prose, poetry, play |
speaker_tesim | The name of a character uttering something in a dialogue | just plain text |
page_ssi | The page number where a leaf node (paragraph, speech or strophe) starts. | string (either integer or Roman numerals) |
person_name_ssim person_name_tesim | Names of persons mentioned in works, or, in the case of letters, the name of the recipient. The field can be accessed both as text (tesim) and string (ssim). The names in these fields are normalized to last name first (LNF) format. A search on the normalized form usually hits variants as well: Shakespeare, William hits William Shakespeare, and Jesus hits Kristus (Danish for Christ). This applies only to these fields; there is no query expansion for the full text. | |
other_location_ssim other_location_tesim sender_location_tesim | Names of places mentioned in works, or, in the case of letters, the residence of the sender. The field can be accessed both as text (tesim) and string (ssim). The place names are usually normalized. For instance, a search in these fields for Danmark hits Dannemark as well. The reverse is not true: a search for Dannemark hits only the word Dannemark in the full text (see text_tesim above). sender_location_tesim applies to letters only. | |
bible_ref_ssim bible_ref_tesim | References to the Bible mentioned in works. The field can be accessed both as text (tesim) and string (ssim). The references use standard Danish abbreviations, like 1 Mos; 1 Kor 13,12; 1 Mos 2,7; Matt 16,18; Sl; Åb; ApG; Joh 1,14; Jak; Job. In many cases, use bible_ref_ssim and search for the exact string "1 Kor 13,12". The references are standardized annotations, whereas the full texts (of Grundtvig and Kierkegaard) may merely allude to a place in the Bible. | |
year_itsi | Year of release, publication or, in the case of a message, the year it was sent. | long int |
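The rule behind contains_ssi (the longest of the three extracts decides the value) can be sketched as follows. This is an illustration and not the indexer's actual code: the function name is hypothetical, length is assumed to be measured in characters, the mapping of verse to `poetry` and performance to `play` is inferred from the listed values, and the tie-break order is an assumption.

```python
def classify_contains(prose: str, verse: str, performance: str) -> str:
    """Return the label of the longest extract.

    Assumptions: length is counted in characters, and ties are resolved
    in the order prose, poetry, play.
    """
    lengths = [
        ("prose", len(prose)),
        ("poetry", len(verse)),
        ("play", len(performance)),
    ]
    # max() returns the first maximal item, which yields the tie-break above.
    return max(lengths, key=lambda pair: pair[1])[0]


label = classify_contains(
    prose="",
    verse="a stanza of some length",
    performance="a short line",
)
print(label)
```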
type_ssi:work AND is_editorial_ssi:no
author_name_tesim:munch AND type_ssi:work
genre_ssi:play AND subcollection_ssi:adl AND author_name_tesim:jeppe
genre_ssi:play AND subcollection_ssi:adl AND speaker_tesim:jeppe
type_ssi:leaf AND genre_ssi:poetry AND subcollection_ssi:adl AND author_name_tesim:grundtvig AND text_tesim:hjerte AND text_tesim:smerte
genre_ssi:play AND subcollection_ssi:adl AND text_tesim:mester erich AND author_name_tesim:holberg
subcollection_ssi:letters AND author_name_tesim:georg brandes AND sender_location_tesim:berlin
subcollection_ssi:letters AND sender_location_tesim:paris AND year_itsi:[1000 TO 1850]
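The year restriction in the last query uses Solr's inclusive range syntax, `[a TO b]`, where `*` leaves one end open. A minimal sketch of building such a clause follows; the helper name `year_range` is hypothetical.

```python
def year_range(first=None, last=None):
    """Return an inclusive range clause on year_itsi; None means open-ended."""
    lower = "*" if first is None else str(int(first))
    upper = "*" if last is None else str(int(last))
    return f"year_itsi:[{lower} TO {upper}]"


query = " AND ".join([
    "subcollection_ssi:letters",
    "sender_location_tesim:paris",
    year_range(1000, 1850),
])
print(query)
```

Calling `year_range(last=1850)` gives `year_itsi:[* TO 1850]`, which matches every record dated 1850 or earlier.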
author_name_tesim:holberg
{!join to=id from=part_of_ssim}genre_ssi:poetry
year_itsi desc
subcollection_ssi:gv AND verse_extract_tesim:helvede AND type_ssi:work
field list
id year_itsi
sort by ascending
year_itsi asc
subcollection_ssi:gv AND text_tesim:helvede AND type_ssi:work AND genre_ssi:poetry
subcollection_ssi:gv AND text_tesim:helvede AND type_ssi:leaf AND genre_ssi:poetry
volume_id_ssi:adl-texts-munp1-root AND text_tesim:regn AND genre_ssi:poetry
position_isi desc
volume_id_ssi:adl-texts-munp1-root AND text_tesim:regn
{!join to=id from=part_of_ssim}genre_ssi:poetry
position_isi asc
Find references to 1 Kor 13,12 (For now we see only a reflection as in a mirror; then we shall see face to face.) in the works of N.F.S. Grundtvig.
bible_ref_ssim:"1 Kor 13,12" AND subcollection_ssi:gv AND is_editorial_ssi:no
year_itsi asc
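Because bible_ref_ssim is a string field and a reference contains spaces and a comma, the value has to be sent as a quoted phrase. The following sketch shows one way to build such a clause; the helper name `exact` is hypothetical, and the escaping covers only backslashes and double quotes.

```python
def exact(field: str, value: str) -> str:
    """Quote value as a phrase, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


query = " AND ".join([
    exact("bible_ref_ssim", "1 Kor 13,12"),
    "subcollection_ssi:gv",
    "is_editorial_ssi:no",
])
print(query)
```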
{!join to=volume_id_ssi from=part_of_ssim}genre_ssi:prose
{!join to=volume_id_ssi from=part_of_ssim}genre_ssi:poetry
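The `{!join ...}` expressions use Solr's join query parser: the inner query is evaluated first, the values of the `from` field are collected from its hits, and the documents whose `to` field matches one of those values are returned. The sketch below builds such a clause; the helper name `join_clause` is hypothetical, and sending the join as a filter query (`fq`) is an assumption about how the examples are meant to be used.

```python
def join_clause(inner_query: str, from_field: str, to_field: str) -> str:
    """Build a Solr join: match inner_query, then follow from_field to to_field."""
    # Doubled braces produce literal { and } in an f-string.
    return f"{{!join to={to_field} from={from_field}}}{inner_query}"


params = {
    "q": "author_name_tesim:holberg",
    # Assumption: the join is passed as a filter query.
    "fq": join_clause("genre_ssi:poetry", from_field="part_of_ssim", to_field="id"),
    "sort": "year_itsi desc",
}
print(params["fq"])
```

Read this way, the request asks for records by Holberg whose id is listed in part_of_ssim of at least one node classified as poetry, that is, containers holding some poetry.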
You cannot use the index-test instance outside our network. Ignore this if you are not a developer at kb.dk.
This document was authored by
Sigfrid Lundberg
The Royal Danish Library
Denmark
who also wrote the indexer. However, a large number of people have contributed to this by coding services on top of the index. That process has required clarifications of this document and modifications of the index. This is the fruit of teamwork.