DATA STRUCTURES FOR BIOLOGICAL RECORDING

Specifications for databases to record occurrences of living organisms

THE ADMINISTRATION DATABASE CADM

Introductory notes

Last, and regrettably necessary, is the Collections Administration Table. It separates all the administrative aspects of a record from the scientific aspects, and has a one-to-one relationship with the Collections Core Table and the Collections Core Supplementary Table. Apart from the fields relevant to biological recording, this table also contains fields providing, among other information: an assessment of any hazards presented by the current item; a note on ownership of the rights to the current item; a note on the confidentiality of the current data. Information relating to the administration of records as they progress through an identification service may, for example, also be kept in this table.

Another important, but large and difficult, topic which cannot be discussed here is the problem of keeping track of exchanged records. If the BMS were to donate a copy of 20000 records to a sister society, how would the BMS be able to tell that it was not getting its own records back as duplicates if, at some later point, the sister society made a return donation? This raises the worrying prospect of duplicated mycological data mushrooming as records circle between different databases on an electronic merry-go-round! It is certainly already happening, many times over, on the internet.

[CadmLink_N]

A link to the unique observation identifier, 8 characters, indexed

This field stores the numeric link between the Collections Administration Table and other tables in the Collections Database.

[CadmAcdatA]

Date of accession of record, 8 characters

[CadmUpdatA]

Date of most recent update of record, 8 characters

It helps to know when a record arrived in the system, and it is also useful to know when it was last altered. Without the first date it may be impossible to know at what rate records are being accessioned to the database; the second date may be needed when, for example, a recurrent error has been attributed to a keyboarder who started work on a particular day. These two fields store that information. Both are `date' fields, and thus rely on the database software's services for handling date information. [CadmAcdatA] stores the date of accession of the record. [CadmUpdatA] stores the date of most recent update. Neither BMS database provides space to record this information.
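The two uses described above can be sketched in a few lines of Python. This is only an illustration, not part of the specification: it assumes that each date is held as an 8-character string in the form YYYYMMDD, that records are available as simple dictionaries keyed by the field names in the headings above, and the function names are hypothetical.

```python
from collections import Counter
from datetime import date, datetime


def parse_ymd(text):
    """Convert an 8-character YYYYMMDD string (an assumed storage form) to a date."""
    return datetime.strptime(text, "%Y%m%d").date()


def accessions_per_month(records):
    """Count records by month of accession, to show the rate of accession."""
    return Counter(parse_ymd(r["CadmAcdatA"]).strftime("%Y-%m") for r in records)


def updated_on_or_after(records, start):
    """Select records last updated on or after a given day, for example
    the day on which a particular keyboarder started work."""
    return [r for r in records if parse_ymd(r["CadmUpdatA"]) >= start]


# Invented sample records, for illustration only.
records = [
    {"CadmLink_N": 1, "CadmAcdatA": "19950307", "CadmUpdatA": "19950307"},
    {"CadmLink_N": 2, "CadmAcdatA": "19950321", "CadmUpdatA": "19960102"},
    {"CadmLink_N": 3, "CadmAcdatA": "19950402", "CadmUpdatA": "19950402"},
]

monthly = accessions_per_month(records)
suspect = updated_on_or_after(records, date(1996, 1, 1))
```

In a real system the database software's own date handling would normally do this work; the sketch only shows why both dates are worth storing.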

[CadmRecstA]

Proof-reading flags, 300 characters

The final example is a field category which has to be located at strategic points throughout databases. All databases need to be proof-read. A problem is that, for many records, it is easy to validate the contents of some fields, while those of other fields may present problems which require further investigation. [CadmRecstA] and similar fields note the data-quality of each field in the current record. One character is allocated to each field, and the value of that character depends on how far that field has progressed through the process of proof-reading. The BMS databases have no fields to record proof-reading status.

The system in use at IMI (which has not yet been introduced for the Collections Database, but is in regular use for the Bibliography Database) is to recognize three levels for each field, marked by `0', `1' and `2' respectively. The first indicates that the field has not passed the first stage of proof-reading. The second, that the field has passed the first stage, but not the second. The third, that both stages have been passed. A surprisingly large amount of data quality can be checked mechanically, and at IMI the first stage is a mechanical scan of each record.

Only fields for which the proof-reading flag is set to `0' are examined. If the information in the field passes the mechanical test, the flag is automatically reset to `1'. If the information fails the test, the proof-reading software assesses the reason for the failure. The software is designed to correct some problems automatically, and then reset the flag to `1'. When this happens, a warning that mechanical correction of data has occurred is printed in the report at the end of the proof-reading operation. For problems which the software cannot correct, the flag remains set at `0', and attention is drawn to the problem, often with an indication of a proposed correction. The second stage is human proof-reading, after which the flag for each field which reaches an acceptable standard is reset to `2'.
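The two-stage procedure just described might be sketched as follows. This is a minimal illustration under stated assumptions, not the software in use at IMI: the flag string here covers only two fields rather than the full 300 characters, the order of fields within the string is invented, the date test assumes 8-character YYYYMMDD values, and all function names are hypothetical.

```python
from datetime import datetime

# Assumed layout: character i of the flag string belongs to FIELDS[i].
FIELDS = ["CadmAcdatA", "CadmUpdatA"]
UNCHECKED, MECHANICAL_OK, HUMAN_OK = "0", "1", "2"


def is_valid_date(text):
    """True if text is exactly 8 digits forming a real YYYYMMDD date."""
    if len(text) != 8 or not text.isdigit():
        return False
    try:
        datetime.strptime(text, "%Y%m%d")
    except ValueError:
        return False
    return True


def check_date(value):
    """Hypothetical mechanical test: returns (status, value)."""
    if is_valid_date(value):
        return "ok", value
    # Example of a problem the software can correct: stray separators.
    cleaned = "".join(ch for ch in value if ch.isdigit())
    if is_valid_date(cleaned):
        return "fixed", cleaned
    return "failed", value


CHECKERS = {"CadmAcdatA": check_date, "CadmUpdatA": check_date}


def first_stage(record, flags):
    """Mechanical scan: only fields flagged '0' are examined."""
    flags, report = list(flags), []
    for i, name in enumerate(FIELDS):
        if flags[i] != UNCHECKED:
            continue
        status, value = CHECKERS[name](record[name])
        if status == "fixed":
            report.append(f"{name}: corrected {record[name]!r} to {value!r}")
            record[name] = value
        if status in ("ok", "fixed"):
            flags[i] = MECHANICAL_OK
        else:
            report.append(f"{name}: {record[name]!r} failed the mechanical test")
    return "".join(flags), report


def second_stage(flags, accepted):
    """Human proof-reading: accepted fields already at '1' move to '2'."""
    return "".join(
        HUMAN_OK if f == MECHANICAL_OK and name in accepted else f
        for f, name in zip(flags, FIELDS)
    )


record = {"CadmAcdatA": "1995-03-07", "CadmUpdatA": "19951332"}
flags, report = first_stage(record, "00")
final_flags = second_stage(flags, {"CadmAcdatA", "CadmUpdatA"})
```

In this example the first date is corrected mechanically and reported, while the second (month 13) stays at `0' and is reported for attention, so the human stage cannot promote it.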

