Indexing Language
Collection | GlossariumBITri |
---|---|
Author | José Antonio Moreiro |
Editor | José Antonio Moreiro |
Year | 2010 |
Volume | 1 |
Number | 1 |
ID | 55 |
Object type | Concept |
Domain | Documentation, Information Management, Library and Information Science |
es | lenguaje documental |
fr | langages documentaires |
de | kontrolliertes Vokabular |
Indexing languages are a subset of natural languages used to describe documents. They belong to the information science techniques for describing resources, and their goal is to represent information so that relevant documents can be retrieved more effectively.
There are several types of indexing languages. The oldest are library classifications and subject headings. More recently, developments in computer science and changes in information needs have brought about new indexing languages.
Indexing languages are governed by two factors:
- Considerations regarding linguistic aspects
- Functional considerations. In specific contexts these tools are used to improve performance.
Types of indexing languages
Free Language: (i) Uniterm lists, (ii) Keyword lists, (iii) Glossaries, (iv) →Folksonomies
Language codes: library classification schemes
Controlled vocabularies: (i) Based on hierarchies: Taxonomies, (ii) Based on hierarchies, associations and equivalent terms: Thesauri and subject headings, (iii) Based on terminology ontologies in a specific context and with associations to current resources: Topic Maps.
Thesaurus as a reference model
The thesaurus is a prototypical indexing language. A thesaurus is structured as a semantic network limited to a domain. This network is composed of nodes, and each node represents a concept. It is an agreed language, with definitions shared within the domain. It is controlled in the sense that only the thesaurus's terms can be used to describe a resource. This principle guarantees a unique relationship between concept and term. As a tool for controlling terminology, it has the following term types:
- Descriptors (terms used to represent the concepts within the domain).
- Non-descriptors (terms from the domain that have an equivalent in the list of descriptors. These terms are not used to represent documents; the equivalent descriptor is used instead).
Descriptors are related by means of:
- Hierarchical and associative relationships
- Equivalence relationships, which link descriptors to non-descriptors.
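The term types and relationships described above can be sketched as a small data structure. This is only an illustrative model, not a standard implementation: the names `Descriptor`, `Thesaurus`, `add_hierarchy`, `add_related`, `add_non_descriptor` and `index_term`, as well as the sample terms, are assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Descriptor:
    """A term accepted to represent one concept of the domain."""
    term: str
    broader: set[str] = field(default_factory=set)   # hierarchical, upwards
    narrower: set[str] = field(default_factory=set)  # hierarchical, downwards
    related: set[str] = field(default_factory=set)   # associative


class Thesaurus:
    """Hypothetical minimal thesaurus: descriptors plus equivalence links."""

    def __init__(self) -> None:
        self.descriptors: dict[str, Descriptor] = {}
        # Equivalence relationship: non-descriptor -> its descriptor.
        self.use: dict[str, str] = {}

    def add_descriptor(self, term: str) -> Descriptor:
        # Create the descriptor on first use, otherwise return the existing one.
        if term not in self.descriptors:
            self.descriptors[term] = Descriptor(term)
        return self.descriptors[term]

    def add_hierarchy(self, broader: str, narrower: str) -> None:
        self.add_descriptor(broader).narrower.add(narrower)
        self.add_descriptor(narrower).broader.add(broader)

    def add_related(self, first: str, second: str) -> None:
        self.add_descriptor(first).related.add(second)
        self.add_descriptor(second).related.add(first)

    def add_non_descriptor(self, term: str, descriptor: str) -> None:
        if descriptor not in self.descriptors:
            raise KeyError(f"{descriptor!r} is not a descriptor")
        self.use[term] = descriptor

    def index_term(self, term: str) -> str:
        """Return the descriptor to index with; reject uncontrolled terms."""
        if term in self.descriptors:
            return term
        if term in self.use:
            return self.use[term]
        raise ValueError(f"{term!r} is not in the thesaurus")


# Sample terms are invented for illustration.
thesaurus = Thesaurus()
thesaurus.add_hierarchy("indexing languages", "thesauri")
thesaurus.add_related("thesauri", "subject headings")
thesaurus.add_non_descriptor("documentary languages", "indexing languages")

print(thesaurus.index_term("documentary languages"))  # indexing languages
```

In this sketch a non-descriptor is never stored against a document: `index_term` always resolves it to its equivalent descriptor, and any term outside the vocabulary is rejected, which is one way to enforce the control principle described above.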
References
- LANCASTER, F. W. (2003). El control del vocabulario en la recuperación de información. 2ª ed. Valencia: Universitat de Valencia.
- MOREIRO GONZALEZ, J. A. (2004). El Contenido de los documentos textuales: su análisis y representación mediante el lenguaje natural. Gijón: TREA.
- ROE, S.; THOMAS, A. (eds.) (2004). The thesaurus: review, renaissance and revision. New York: The Haworth Information Press.
- SLYPE, G. Van (1991). Los lenguajes de indización: concepción, construcción y utilización en los sistemas documentales. Madrid: Fundación Sánchez Ruipérez.