Metadata & Information Retrieval: June 2008

Monday, June 23, 2008

Cross-system Searches

Another approach to reducing the barriers to interoperability is the cross-system search. Unlike a union catalog, where a union database is maintained and a central search retrieves the data, a cross-system search leaves metadata records in multiple databases and retrieves them using the search facilities of each individual database system. ANSI/NISO Z39.50 is an example of an international standard protocol that allows one client system to request that a search be performed within another target system (Caplan, 2003). The client receives the results back in a format that it can display. A cross-system search requires that the search be expressed in a common syntax, so that every system needs to understand only its own search language and that of the international standard protocol.
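To make the idea concrete, here is a minimal sketch in Python. It is not an implementation of Z39.50; the CommonQuery class, the two target classes, and the sample records are all hypothetical stand-ins. Each target translates the common query into its own field names, searches its own records, and returns results in a shared display format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonQuery:
    """A search in the shared syntax (a stand-in for the standard protocol)."""
    field: str  # "title" or "author"
    term: str


class LibraryCatalog:
    """Hypothetical target that stores MARC-like tagged records."""
    TAGS = {"title": "245", "author": "100"}

    def __init__(self, records):
        self.records = records

    def search(self, query):
        tag = self.TAGS[query.field]  # translate common field -> native tag
        hits = [r for r in self.records
                if query.term.lower() in r.get(tag, "").lower()]
        # Hand results back in the common display format.
        return [{"title": r.get("245", ""), "author": r.get("100", "")}
                for r in hits]


class MuseumDatabase:
    """Hypothetical target that uses its own field names."""
    FIELDS = {"title": "object_name", "author": "maker"}

    def __init__(self, records):
        self.records = records

    def search(self, query):
        name = self.FIELDS[query.field]  # translate common field -> native name
        hits = [r for r in self.records
                if query.term.lower() in r.get(name, "").lower()]
        return [{"title": r.get("object_name", ""), "author": r.get("maker", "")}
                for r in hits]


def cross_system_search(query, targets):
    """Send one common query to every target and pool the displayable results."""
    results = []
    for target in targets:
        results.extend(target.search(query))
    return results


# Made-up sample data for illustration only.
library = LibraryCatalog([
    {"245": "Metadata fundamentals for all libraries", "100": "Caplan, P."},
    {"245": "The organization of information", "100": "Taylor, A.G."},
])
museum = MuseumDatabase([
    {"object_name": "Catalog of metadata standards", "maker": "Unknown"},
])

hits = cross_system_search(CommonQuery("title", "metadata"), [library, museum])
for hit in hits:
    print(hit["title"], "/", hit["author"])
```

Note that neither target needs to know anything about the other's storage format. Each one only understands the common query and its own records.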

References

Caplan, P. (2003). Metadata fundamentals for all libraries. Chicago: American Library Association.

Thursday, June 19, 2008

Union Catalogs

Although interoperability among diverse sets of metadata records can be problematic, there are several current approaches to addressing these issues. One approach is the union catalog, a centralized database of metadata from multiple sources. A union catalog shared among libraries, for example, would typically hold MARC-based library catalog records. Union catalogs can exist at any level, from a local institutional level to an international level. In libraries, OCLC’s WorldCat is one example of an international union catalog (Caplan, 2003).

There are several methods of implementing union catalogs. In one method, participating institutions submit copies of their own cataloging records to an organization that maintains the centralized search catalog. In another, records are created directly in the union catalog database and then copied into the institution’s local system. With either of these two methods, records for the same resource contributed by different institutions can be maintained as duplicate records or consolidated into a single master record that lists multiple holding locations. A third method creates a false union catalog by building a union index over multiple catalog files instead of maintaining a compiled database. This approach displays records from the source catalogs when entries from the index are selected.
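The consolidation option can be sketched in a few lines of Python. This is only an illustration under a big assumption: that records for the same resource share an identical ISBN, which serves as the match key. Real systems use more elaborate matching. The consolidate function, the institution names, and the sample records are all made up.

```python
def consolidate(contributions):
    """Merge contributed records that share a match key into master records.

    Each master record keeps the first title seen and lists every
    institution that holds the resource.
    """
    masters = {}
    for institution, record in contributions:
        key = record["isbn"]  # assumed match key, for illustration only
        master = masters.setdefault(
            key, {"isbn": key, "title": record["title"], "holdings": []}
        )
        if institution not in master["holdings"]:
            master["holdings"].append(institution)
    return list(masters.values())


# Made-up contributions: two libraries submit a record for the same resource.
contributions = [
    ("Library A", {"isbn": "0000000001", "title": "Example Title One"}),
    ("Library B", {"isbn": "0000000001", "title": "Example title one"}),
    ("Library B", {"isbn": "0000000002", "title": "Example Title Two"}),
]

catalog = consolidate(contributions)
for master in catalog:
    print(master["title"], "held by", ", ".join(master["holdings"]))
```

Keeping the duplicates instead would simply mean storing every contribution as its own record, at the cost of showing searchers the same resource several times.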

In general, union catalogs work best when the participating institutions share a common data format and a common set of cataloging rules. For example, libraries tend to use similar data formats and cataloging rules, which contributes to the effectiveness of OCLC’s WorldCat. When the records in the central database and the local contributing catalogs are relatively homogeneous, the familiarity of the search facilitates retrieval. Although it is more complicated, it is possible to create union catalogs from non-homogeneous metadata sources. Non-homogeneous contributions usually result when a variety of institutions, as opposed to just one type of institution, participate in the union catalog. These institutions can include archives, libraries, historical societies, museums, and so on. Typically, creating a union catalog from non-homogeneous sources requires converting the various submitted metadata schemas into a common format for storage and indexing before the records are loaded into the union catalog (Caplan, 2003).
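As a rough sketch of that conversion step, the Python below maps two source schemas into one common format before loading. The Dublin Core element names (title, creator) and MARC tags (245, 100) are real, but the CROSSWALKS table, the common field names, and the sample records are assumptions made for illustration, and the mapping is far simpler than a real crosswalk.

```python
# Hypothetical crosswalk table: source element -> common field name.
CROSSWALKS = {
    "dublin_core": {"title": "title", "creator": "author"},
    "marc": {"245": "title", "100": "author"},
}


def to_common(record, schema):
    """Convert one submitted record into the common storage format."""
    mapping = CROSSWALKS[schema]
    common = {
        target: record[source]
        for source, target in mapping.items()
        if source in record
    }
    common["source_schema"] = schema  # remember where the record came from
    return common


def load_union_catalog(submissions):
    """Convert every (schema, record) submission before loading it."""
    return [to_common(record, schema) for schema, record in submissions]


# Made-up submissions from two different kinds of institution.
submissions = [
    ("dublin_core", {"title": "Town meeting minutes", "creator": "Historical Society"}),
    ("marc", {"245": "Metadata fundamentals for all libraries", "100": "Caplan, P."}),
]

union_catalog = load_union_catalog(submissions)
for record in union_catalog:
    print(record["title"], "/", record["author"], "/", record["source_schema"])
```

Once everything is in the common format, a single index can be built over the title and author fields regardless of where each record originated.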

References

Caplan, P. (2003). Metadata fundamentals for all libraries. Chicago: American Library Association.

Friday, June 6, 2008

Metadata Interoperability Part 2

Extensibility also affects interoperability semantics. Extensibility refers to the ability to include additional metadata elements specific to the needs of a community. The individual metadata creators subjectively determine these inclusions and exclusions. Consequently, extensibility usually exhibits an inverse relationship to interoperability in that the additional metadata elements often cause the metadata to become less understandable to other systems (Taylor, 2004, p. 144).

Incompatible vocabularies are another common factor affecting interoperability. The problem is most apparent when users try to search across metadata schemas or among different institutions such as libraries, archives, and museums. Different organizations often use different or highly specialized vocabularies. For example, one institution, such as a public library, may index a resource using common names, whereas another institution, such as a medical lab, may index using scientific names. As a result, the use of more specialized vocabularies must be taken into consideration when working with metadata. In addition to vocabulary, multiple languages also affect interoperability, especially when searching the World Wide Web. Controlled vocabularies and translations via multilingual thesauri are effective yet limited in their ability to remedy discrepancies (Caplan, 2003, p. 42).
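A small Python sketch shows how a thesaurus can bridge the common-name and scientific-name example. The EQUIVALENTS list, the two indexes, and the resource titles are invented for illustration, and the approach is only as good as the thesaurus: a term missing from it is searched as-is.

```python
# Hypothetical thesaurus: each set groups terms treated as equivalent.
EQUIVALENTS = [
    {"dog", "canis familiaris"},
    {"house cat", "felis catus"},
]


def expand(term):
    """Return the term plus any equivalents listed in the thesaurus."""
    term = term.lower()
    for group in EQUIVALENTS:
        if term in group:
            return group
    return {term}


def search(term, index):
    """Return titles whose subject terms match the expanded query."""
    wanted = expand(term)
    return sorted(
        title
        for title, subjects in index.items()
        if wanted & {s.lower() for s in subjects}
    )


# Made-up indexes: one uses common names, the other scientific names.
public_library = {"Caring for your dog": ["Dog"]}
medical_lab = {"Canine parasitology notes": ["Canis familiaris"]}
combined = {**public_library, **medical_lab}

results = search("dog", combined)
print(results)
```

Without the expansion step, the same search for "dog" would find only the public library's record.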

The representation of the metadata elements can also differ, even when the element definitions are identical, since data can be recorded various ways. For example, one set of metadata records may depict an author’s name as “Smith, Jane A.” whereas another set of metadata records may use “Smith, J.A.” for the same author. Consequently, a keyword search on “Jane Smith” would only retrieve records from the first set of metadata records, not the second (Caplan, 2003, p. 42).
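The name example is easy to reproduce. The Python below is a deliberately naive keyword search, assuming a record matches only when every word of the query appears in it. The keywords and keyword_search functions are hypothetical helpers, not how any particular catalog works.

```python
import re


def keywords(text):
    """Split text into a set of lowercase alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_search(query, records):
    """Return the records that contain every keyword in the query."""
    wanted = keywords(query)
    return [record for record in records if wanted <= keywords(record)]


set_one = ["Smith, Jane A."]  # full forename recorded
set_two = ["Smith, J.A."]     # initials only, same author

print(keyword_search("Jane Smith", set_one))
print(keyword_search("Jane Smith", set_two))
```

The first search finds the record and the second returns nothing, because the word "jane" never appears in the initials-only form.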

References

Caplan, P. (2003). Metadata fundamentals for all libraries. Chicago: American Library Association.
Taylor, A.G. (2004). The organization of information (2nd ed.). Westport, CT: Libraries Unlimited.

Tuesday, June 3, 2008

Metadata Interoperability Part 1

Interoperability refers to the ability of various systems to interact with one another. There are two fundamental forms of interoperability: semantic and syntactic. Semantic interoperability refers to the compatibility of the meanings assigned to the metadata elements of a schema, such as whether or not the term "author" in one schema corresponds in meaning with the term "creator" in another schema. Different applications, databases, and institutions may assign disparate meanings to the same terms or use distinct terms to express the same meaning (Gruninger & Kopena, 2005). Syntactic interoperability refers to the ability to extract and use metadata from other systems, which requires a common language or encoding format. In general, metadata interoperability commonly refers to search interoperability, the ability to process various metadata records and retrieve desired results.

Differences in the semantics and syntax of metadata schemas usually cause difficulties in retrieving desired materials. The greater the dissimilarities, the more problematic the retrieval process can become. In terms of semantic differences, there is a wide range of possible variation and misinterpretation in meanings. For example, when comparing two schemas, one schema may require a more precise or well-defined set of rules for determining the meaning of a particular element than the other. For instance, the Dublin Core schema considers the Title element to be any name given to the resource, whereas AACR2/MARC follows a strict set of guidelines for deciding what should be considered the Title Proper (Caplan, 2003, p. 41). As a result, there can be various degrees of misinterpretation between the two records. An even more obvious discrepancy would be if one record did not provide a corresponding element at all.

References

Caplan, P. (2003). Metadata fundamentals for all libraries. Chicago: American Library Association.

Gruninger, M., & Kopena, J.B. (2005). Semantic integration through invariants. AI Magazine, 26(1), 11-21. Retrieved May 20, 2008, from InfoTrac OneFile database.
