DATA STRUCTURE FOR BIOLOGICAL RECORDING

Specifications for databases to record occurrences of living organisms

THE BIBLIOGRAPHIC LINK DATABASE CBIX

Introductory notes

In addition to specimens in dried reference collections, isolates in living collections, and field observations not backed by material, the other great source of data for the Collections Database is citations in the literature. This table provides a means of relating Collections Database records to that literature. Many different publications may refer to the same collection or observation: many have, for example, referred to Fleming's observation which led to the discovery of penicillin. The relationship between this table and each individual Collections Database observation is therefore many-to-one.

Because it was originally designed solely to capture observations generated at forays and other field meetings, the BMS Foray Records Database makes no provision for any bibliographic link. The BMS/JNCC Database can record a single bibliographic link for each record. For these two databases to be compatible with each other, the BMS Foray Records Database will need restructuring, and that may be an opportunity to upgrade both to the much more flexible structure proposed here and in use at IMI.

[CbixLink_N]

A link to the unique observation identifier, 8 characters, indexed

This field stores the numeric link between the Collections Bibliography Cross-reference Table and other tables in the Collections Database.

[CbixBiblkA]

Text link to the Bibliography Database, 100 characters

[CbixBiblkN]

Numeric link to the Bibliography Database, 8 characters

[CbixBiblkA] is the first of two fields, both in current use, permitting a link to be made via this table between the Collections Database and the Bibliography Database. This link is in text form, and is used to connect Collections Database records to the Bibliography Database. Although much longer than a numeric link, it nevertheless has many convenient aspects, including a greater chance of recovering a connexion in the event of a corrupt file than with a numeric link.

The link information in [CbixBiblkA] is built up from the surnames of each author of the bibliographic item, the year of publication, and a single character identifier. Adjacent authors are separated by a comma and space, except the last two authors, who are separated by a space, ampersand and a second space. The year of publication and the single character identifier are enclosed within parentheses, and follow a space after the last author. For a given author / year combination, the first identifier is `a', the second `b' and so on. Like the text link used for the currently accepted organism name [Cco0AccnaA], information in this field uses no accented characters: thus `Léveillé' becomes `Leveille'. Authors with names in non-Latin alphabets are transliterated using the CABI Database Production Manual standards. Some examples of typical bibliographic link data are: `Sutton (1980a)', `Gayova (1992a)', `Sutton & Hodges (1978b)', `Muller & Arx (1950a)', `DiCosmo, Nag Raj & Kendrick (1979a)'.
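The construction rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `strip_accents` and `build_text_link` are hypothetical, accents are removed by Unicode decomposition (which does not cover letters such as `ø' or transliteration from non-Latin alphabets), and the 100-character limit is taken from the field definition.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks so accented letters become their plain forms."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_text_link(surnames: list[str], year: int, identifier: str = "a") -> str:
    """Build a [CbixBiblkA]-style key from surnames, year and identifier."""
    if not surnames:
        raise ValueError("at least one author surname is required")
    names = [strip_accents(name) for name in surnames]
    if len(names) == 1:
        authors = names[0]
    else:
        # Comma and space between authors, " & " before the last one.
        authors = ", ".join(names[:-1]) + " & " + names[-1]
    link = f"{authors} ({year}{identifier})"
    if len(link) > 100:  # field length stated for [CbixBiblkA]
        raise ValueError("link exceeds the 100-character field length")
    return link


print(build_text_link(["Sutton", "Hodges"], 1978, "b"))  # Sutton & Hodges (1978b)
print(build_text_link(["DiCosmo", "Nag Raj", "Kendrick"], 1979))
```

Keeping the accent stripping in a separate helper means the same rule could be reused for other text links, such as [Cco0AccnaA], if that proves appropriate.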

Because the link information closely corresponds to the way humans tend to remember bibliographic references, there is no need to remember numbers. If the authors of the paper and the year of publication are known, the link can be made by constructing the link data, adding a single character identifier, and then replacing it with successive identifiers until the correct data is located or it becomes clear that the desired publication is not represented in the Bibliography Database. It is generally accepted, however, that making a numeric link fully available as an alternative through [CbixBiblkN] is a long-term goal. That link is already being employed in the BMS/JNCC Database through the field [BSM Link], 8 characters.
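The search procedure just described might look like the following sketch. The function name `locate_publication`, the use of a mapping from link text to a bibliographic entry, and the `is_wanted` callback are all hypothetical. The sketch also assumes that identifiers are issued without gaps, so that the first missing key shows the publication is not represented; the source does not state this.

```python
from string import ascii_lowercase
from typing import Callable, Mapping, Optional


def locate_publication(
    authors: str,
    year: int,
    bibliography: Mapping[str, str],
    is_wanted: Callable[[str], bool],
) -> Optional[str]:
    """Try identifiers 'a', 'b', ... until the wanted entry is found."""
    for identifier in ascii_lowercase:
        link = f"{authors} ({year}{identifier})"
        entry = bibliography.get(link)
        if entry is None:
            # Assumption: no gaps in the identifier sequence, so a missing
            # key means the publication is not in the Bibliography Database.
            return None
        if is_wanted(entry):
            return link
    return None


# Placeholder entries, not real bibliographic data.
sample = {
    "Sutton (1980a)": "first placeholder title",
    "Sutton (1980b)": "second placeholder title",
}
print(locate_publication("Sutton", 1980, sample, lambda t: t.startswith("second")))
```

In practice the `is_wanted` step is the keyboarder inspecting each candidate; the callback simply stands in for that judgement.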

[CbixPage_A]

Identification of the exact page within the linked publication, 40 characters

This field identifies the exact page or page spread within the linked publication to which the current Collections Database record relates. This field is twice the length of [CpexPage_A], which performs a closely related function and stores the same category of data: minor inconsistencies of this sort can be very hard to avoid, and have to be rectified as and when noticed! The BMS/JNCC Database, even though it makes a bibliographic link, notes no exact pages.

[CbixRef__A]

A place to store unverified bibliographic data, 240 characters

Sometimes a link between the Collections Database and the Bibliography Database cannot be successfully made. Perhaps the desired data simply does not exist in the Bibliography Database, or perhaps the data available to the keyboarder isn't adequate to identify the bibliographic record required: there can be lots of reasons. Under those conditions, it could be undesirable to enter any data, since the link to the bibliographic source cannot be established. To get round that problem, and the keyboarding `log-jams' that can result, there is a need for a field which can store, in free-text form, the bibliographic data actually available to the keyboarder at the time. This field is thus used only when there is no information in either [CbixBiblkA] or [CbixBiblkN].

The existence and use of this field imply a further stage of work where a specialist bibliographic editor scans the table for records containing data in [CbixRef__A], and deals with the problems. As might be expected, fields with similar functions have to be located strategically at many points in the different databases used for recording, and the work of reviewing such fields forms a significant part of the routine maintenance of these relational databases.
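The editor's scan, and the rule that [CbixRef__A] is used only when both link fields are empty, could be checked along these lines. This is a minimal sketch that assumes each table row is available as a dictionary keyed by field name; the helper names `awaiting_editor` and `breaks_ref_rule` are hypothetical.

```python
from typing import Iterable, Mapping, Optional


def _filled(value: Optional[object]) -> bool:
    """True when a field holds something other than blank text or None."""
    return value is not None and str(value).strip() != ""


def awaiting_editor(rows: Iterable[Mapping[str, object]]) -> list:
    """Rows with free-text data in [CbixRef__A], for the editor to resolve."""
    return [row for row in rows if _filled(row.get("CbixRef__A"))]


def breaks_ref_rule(row: Mapping[str, object]) -> bool:
    """True when [CbixRef__A] is filled although a link already exists."""
    has_link = _filled(row.get("CbixBiblkA")) or _filled(row.get("CbixBiblkN"))
    return has_link and _filled(row.get("CbixRef__A"))


# Placeholder rows for illustration only.
rows = [
    {"CbixLink_N": 1, "CbixBiblkA": "Sutton (1980a)", "CbixRef__A": ""},
    {"CbixLink_N": 2, "CbixBiblkA": "", "CbixRef__A": "unverified citation text"},
]
print([row["CbixLink_N"] for row in awaiting_editor(rows)])  # [2]
```

A row that fails `breaks_ref_rule` is not necessarily wrong, but it does suggest that the free-text data was never cleared after the link was made.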

[CbixTypifA]

Replica of typification comment, 1000 characters

Where the current record in the Collections Database relates to a specimen in a dried reference collection or more recently a culture in a living reference collection, there is always a possibility that the specimen in question may be a nomenclatural type. Since type specimens are generally designated through publications, a place to store published information about the type status of the current specimen is clearly desirable. This field is devoted to storing, as nearly as possible, a replica of what was actually published, so that the information it contains can function as a primary source of data.

Given that most nomenclaturalists are said to be failed lawyers, it is not uncommon for the type status of a particular specimen to be argued over in many different publications and, while no-one likes to encourage such litigation, that may be another reason why the Collections Bibliography Cross-reference Table has a many-to-one relationship with the Collections Core Table. Unpublished observations about the type status of a particular specimen should be placed in the general fields [Cco0ExnotA] (notes to be published) or [Cco0InnotA] (notes not to be published), depending on the commentator's assessment of their potential for controversy!

[CbixThemeA]

General theme of current bibliographic observation, 500 characters

This final field in the Collections Bibliography Cross-reference Table allows space to store some indication of the general theme of the bibliographic data in relationship to the current Collections observation. For example, the publication may have cited the observation in the context of a study in ecology, biochemistry, genetics, systematics, or any one or several of a wide range of other scientific aspects. Having this field makes it possible, at least in theory, to pull out all the bibliographic records dealing with, say, the ecology of a particular organism. This field is rather new, and rules to give it structure have not yet been devised. It seems likely to represent an area where, in the future, several different fields will be needed.
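The field lengths given for this table can be gathered into a simple lookup and used to check a record before it is stored. The sketch below assumes that a record is a dictionary keyed by field name and that lengths are counted in characters of the value's text form; the name `oversize_fields` is hypothetical.

```python
# Lengths in characters, as stated in the field definitions of this table.
FIELD_LENGTHS = {
    "CbixLink_N": 8,
    "CbixBiblkA": 100,
    "CbixBiblkN": 8,
    "CbixPage_A": 40,
    "CbixRef__A": 240,
    "CbixTypifA": 1000,
    "CbixThemeA": 500,
}


def oversize_fields(record: dict) -> list[str]:
    """Return the names of fields whose values exceed the stated length."""
    return [
        name
        for name, limit in FIELD_LENGTHS.items()
        if record.get(name) is not None and len(str(record[name])) > limit
    ]


# Placeholder record for illustration only.
record = {"CbixLink_N": 12345, "CbixBiblkA": "Sutton (1980a)", "CbixPage_A": "p" * 41}
print(oversize_fields(record))  # ['CbixPage_A']
```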

