dtsrdbfiles
special file
dtsrdbfiles
Describes the complete set of DtSearch database files
DESCRIPTION
Each DtSearch database consists of a set of core files
that are created and maintained by the DtSearch offline build tools.
Each database may also include a set of one or more language files
that vary depending on the DtSearch language of the database.
Some language files are part of the DtSearch package but
may also be enhanced by the database developer.
All database files for a single database must be located in the same
directory. The directory is specified in the offline build tools by the
optional path prefix in the −ddbname argument. The directory is specified for
the online API by a PATH entry in the
configuration file (ocf file).
Core Files
The base name of the core files is formed by appending a period and
3-character name extension to the 1- to 8-character database name
specified at creation time. Core files are binary and accessible only
via DtSearch programs.
The DtSearch core files are as follows:
dbname.dbe
Database dictionary file. Binary schema created by
dtsrcreate from dtsearch.dbe.
Never modified thereafter.
dbname.k00
Main key file for database documents. Created and initialized by
dtsrcreate, updated by dtsrload.
Contains the b-tree of unique keys for each document.
dbname.k01
Optional key file for database documents. Created and initialized by
dtsrcreate. Contains the b-tree of optional keys for
each document. Not currently used.
dbname.d00
Documents header file. Created by dtsrcreate, updated
by dtsrload. Contains the database's configuration
status record and, for each document in the database, a header record
and one or more abstract records.
dbname.d01
Compressed text file. Created by dtsrcreate, but
updated by dtsrload only for AusText type databases.
Repository of compressed text for each document.
dbname.k21,
dbname.k22,
dbname.k23
Key files for words and stems. Created and initialized by
dtsrcreate, updated by dtsrindex.
Contains the b-tree of each word and stem indexed for the database. The
k21 file finds "short" words, 1 to 15 bytes, in the d21 file. The k22
file finds "long" words, 16 to 39 bytes, in the d22 file. The k23 file
finds "huge" words, 40 to 133 bytes, in the d23 file. The long and huge
word files may go unused, depending on the maximum word size specified
for the database at creation time.
dbname.d21,
dbname.d22,
dbname.d23
Data files for words and stems. Created and initialized by
dtsrcreate, updated by dtsrindex.
For each word, contains document counts, the offset to the inverted index
(d99 file), and storage recovery data. The d21 file contains short words,
the d22 file contains long words, and the d23 file contains huge words.
The long and huge word files may go unused, depending on the maximum
word size specified for the database at creation time.
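The naming rule and the word size classes above can be sketched as follows. This is only an illustration, not part of DtSearch: the helper names core_file_paths and word_file_extensions are hypothetical, the extension list simply mirrors the entries on this page, and word lengths are counted in bytes.

```python
import os

# Extensions of the core files listed on this page.
CORE_EXTENSIONS = ("dbe", "k00", "k01", "d00", "d01",
                   "k21", "k22", "k23", "d21", "d22", "d23")


def core_file_paths(dbname, directory=""):
    """Return the expected core file paths for a database (hypothetical helper)."""
    if not 1 <= len(dbname) <= 8:
        raise ValueError("database name must be 1 to 8 characters")
    # Base name = database name + period + 3-character extension.
    return [os.path.join(directory, f"{dbname}.{ext}") for ext in CORE_EXTENSIONS]


def word_file_extensions(word):
    """Return the (key file, data file) extensions for a word given as bytes."""
    size = len(word)
    if 1 <= size <= 15:
        return ("k21", "d21")   # "short" words
    if 16 <= size <= 39:
        return ("k22", "d22")   # "long" words
    if 40 <= size <= 133:
        return ("k23", "d23")   # "huge" words
    raise ValueError("word length outside the 1 to 133 byte range")
```

For example, core_file_paths("mydb") starts with "mydb.dbe", and word_file_extensions(b"search") selects the k21 and d21 files. The sketch does not model the maximum word size chosen at creation time.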
Language Files
Databases also need a set of files associated with the DtSearch language
of the database. When looking for these files, DtSearch first looks
for a customized version applicable only to that database, and then looks
for the generic language version. Like core files, the base file name of
a customized language file is formed from the database name and a
3-character extension. The alternative generic language files are named
with a language name and the same 3-character extension.
Language files are mandatory or optional depending on the language.
See &cdeman.dtsrlangfiles; for formats of language files.
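The lookup order described above (customized file first, then the generic language file) can be sketched as follows. This is a simplified illustration, not the DtSearch implementation: find_language_file is a hypothetical name, and the set of available file names stands in for whatever directories DtSearch actually searches.

```python
def find_language_file(dbname, lang_name, ext, available):
    """Pick the language file to use, preferring the customized version.

    dbname    -- database name, e.g. "mydb"
    lang_name -- generic language name prefix, e.g. "eng"
    ext       -- 3-character extension, e.g. "stp"
    available -- set of file names that exist (assumption of this sketch)
    """
    # The customized file (dbname.ext) takes precedence over the
    # generic language file (lang_name.ext).
    for name in (f"{dbname}.{ext}", f"{lang_name}.{ext}"):
        if name in available:
            return name
    return None  # neither version exists
```

With available = {"eng.stp", "eng.sfx", "mydb.stp"}, the call find_language_file("mydb", "eng", "stp", available) selects mydb.stp, while the same call for "sfx" falls back to eng.sfx.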
The DtSearch language-related files are as follows:
dbname.stp
Stop file. The supported stop files are:
eng.stp − for DtSrLaENG and DtSrLaENG2
esp.stp − for DtSrLaESP
fra.stp − for DtSrLaFRA
deu.stp − for DtSrLaDEU
ita.stp − for DtSrLaITA
Stop lists are mandatory for European languages, and
optional for other supported languages.
dbname.inc
An include list is always optional for all supported languages.
There are no generic versions of include lists.
eng.sfx
For DtSrLaENG and DtSrLaENG2.
A suffix file is mandatory for the English languages
and is not currently required for other supported languages.
dbname.knj
jpn.knj for DtSrLaJPN2.
A kanji compounds file is mandatory only for language number 7,
DtSrLaJPN2, a supported Japanese language.
Examples
Files associated with a minimum DtSrLaENG database (English, ASCII)
that uses no customized or optional files:
All core files plus eng.stp, eng.sfx.
Files for a DtSrLaITA database (Italian, ISO Latin-1)
with enhanced stop list and an include list:
All core files plus dbname.stp, dbname.inc.
Files associated with a minimum DtSrLaJPN database
(Japanese with full, automatic kanji compounding)
that uses no customized or optional files:
Only core files.
Files for a DtSrLaJPN2 database (Japanese with kanji compounds
from a word list), with optional stop list for ASCII substrings:
All core files plus dbname.stp, jpn.knj.
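The minimum configurations in these examples can be summarized in a small lookup table. The sketch below is illustrative only: minimum_language_files is a hypothetical helper, the table covers only the languages named on this page, and the DtSrLaENG2 entry is assumed to match DtSrLaENG because both share eng.stp and eng.sfx.

```python
# Generic language files needed by a database that uses no customized
# or optional files, as described on this page.
MINIMUM_LANGUAGE_FILES = {
    "DtSrLaENG":  ("eng.stp", "eng.sfx"),
    "DtSrLaENG2": ("eng.stp", "eng.sfx"),  # assumed to match DtSrLaENG
    "DtSrLaESP":  ("esp.stp",),
    "DtSrLaFRA":  ("fra.stp",),
    "DtSrLaDEU":  ("deu.stp",),
    "DtSrLaITA":  ("ita.stp",),
    "DtSrLaJPN":  (),                      # only core files
    "DtSrLaJPN2": ("jpn.knj",),
}


def minimum_language_files(language):
    """Return the generic language files for a minimum database."""
    try:
        return list(MINIMUM_LANGUAGE_FILES[language])
    except KeyError:
        raise ValueError(f"language not listed on this page: {language}") from None
```

A customized file, such as dbname.stp in the Italian example, would replace the generic file of the same extension.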
SEE ALSO
&cdeman.dtsrcreate;,
&cdeman.dtsrload;,
&cdeman.dtsrindex;,
&cdeman.DtSrAPI;,
&cdeman.dtsrlangfiles;,
&cdeman.dtsrocffile;,
&cdeman.DtSearch;