Metadata

Introduction

This factsheet describes metadata in a nutshell. Metadata are highlighted from different angles in a structured way. This factsheet deals with Defining Metadata, i.e., metadata that defines the meaning of other data. The definitions of other types of metadata are included for clarity.

Definitions

Metadata

Data that defines and describes the characteristics of other data. (DAMA Dictionary)

Defining metadata

Metadata that enables interpretation of the meaning of data to provide information.

Data lineage

Metadata that identifies the sources of data and the transformations through which it passes up to the point of consumption.

Operational metadata

Metadata arising from processing data that identifies and describes the outputs from the process.

Data quality rule

A constraint used to validate data values or records.

Critical data element

A data element that is determined to be vital to the successful operation of the organization. Note: such data elements are a subset of defining metadata.

Business glossary

A register of the terms and their definitions as used by business stakeholders to describe the business.

Data model

A description of the organisation of data in a manner that reflects an information structure. (ISO 11179:1999). Note: data models may be conceptual, logical, or technical.

Data element specification

Specification of a data element composed of identification, definition of the meaning, representation, and permissible values.

Data dictionary

An information resource that records data element specifications and may include specifications of data structures composed of data elements.

Purpose

The purpose of defining metadata is to enable consistent interpretation of data and thereby provide information.

Which comes first, the data or the metadata? In principle, the defining metadata is created in the data design phase as a specification for the implementation, and is then maintained as a reference source for users of the data.

Procedure for management of defining metadata

Phase   Activity
Plan    • Compose defining metadata
        • Establish defining metadata
Do      • Use defining metadata
Check   • Evaluate defining metadata
Act     • Revise defining metadata

Characteristics and Requirements

Characteristic                       Requirement
Accuracy of defining metadata        Defining metadata should be accurate enough.
Completeness of defining metadata    Defining metadata should be complete.
Unambiguity of defining metadata     Defining metadata must not be open to misinterpretation.
Clarity of defining metadata         Defining metadata should be legible and understandable.

Relationships

Several types of metadata are distinguished. These are shown in Figure 1.

Figure 1 Types of Metadata

Inputs to and outputs from Defining Metadata are shown in Figure 2.

  • Data Quality Requirements, Data Quality Policy and Design Processes are input to Defining Metadata.
    • Data Quality Policy and Requirements provide principles and policies for the creation and management of Defining Metadata.
    • Design Processes provide methodologies for the creation and management of Defining Metadata.
  • Defining Metadata is input to Data Quality Objectives, Data Quality Rules, Monitoring Data Quality, Data Issues, Awareness of Data Quality, Critical Data Elements and Data Cleaning.
    • Data Quality Objectives can include requirements for availability and application of Defining Metadata.
    • Defining Metadata are a basis for determining Data Quality Rules.
    • Defining Metadata provide standards for Monitoring Data Quality.
    • Defining Metadata provide criteria when resolving Data Issues.
    • Defining Metadata assist in making Awareness of Data Quality explicit.
    • Defining Metadata provide a basis for the selection of Critical Data Elements.
    • Defining Metadata provide criteria to be applied in Data Cleaning.

Figure 2 Inputs to and Outputs from Defining Metadata
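The statement that Defining Metadata are a basis for determining Data Quality Rules can be illustrated with a minimal sketch. It assumes that the format constraints of the example data elements DE001 and DE003, specified later in this factsheet, have been transcribed by hand into regular expressions. The names FORMAT_PATTERNS and make_format_rule are hypothetical, and the rules check format only, not membership of the value domain.

```python
import re
from typing import Callable, Dict

# Format constraints transcribed (by assumption) from the example
# data element specifications DE001 and DE003.
FORMAT_PATTERNS: Dict[str, str] = {
    "DE001": r"[0-9]{9}",          # Numeric, fixed length 9
    "DE003": r"[A-Za-z0-9]{1,3}",  # Alphanumeric, maximum three characters
}


def make_format_rule(identifier: str) -> Callable[[str], bool]:
    """Build a data quality rule (a constraint on values) from a format pattern."""
    pattern = re.compile(FORMAT_PATTERNS[identifier])

    def rule(value: str) -> bool:
        # fullmatch requires the whole value to satisfy the format.
        return pattern.fullmatch(value) is not None

    return rule


duns_rule = make_format_rule("DE001")
unit_rule = make_format_rule("DE003")

print(duns_rule("123456789"))  # True: nine digits
print(duns_rule("12345"))      # False: too short
print(unit_rule("KGM"))        # True: three characters
print(unit_rule("KILO"))       # False: four characters
```

A value that passes such a rule is well formed, but it is not necessarily a code that was actually issued. Checking that would require the value domain itself.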

An architecture of defining metadata

Figure 3 shows the three levels of defining metadata (conceptual, logical, and technical), with the main forms of metadata at each level and their relationships. The diagram also indicates the roles that are primarily engaged with the metadata at each level.

Figure 3 An architecture of defining metadata

Some forms of defining metadata

Data models

The goal of semantic modelling is to create a common understanding of the meaning of things, so that people understand each other. The meaning should be explicit and accurate, understandable by humans, and interpretable by computer systems.

Although the terminology varies from one methodology to another, the elements to be found in most semantic modelling languages are: entities, relations, classes, attributes, terms, and axioms.

Examples of data element specifications

Identifier: DE001
Name: Code of a customer as DUNS Number
Definition: Code identifying a customer according to the Data Universal Numbering System (DUNS) of Dun & Bradstreet.
Data type: Numeric
Format: Fixed length 9
Value domain: DUNS code numbers issued by Dun & Bradstreet https://www.dnb.com https://www.altares.nl

Identifier: DE002
Name: Loaded weight of a shipping container
Definition: The weight of a shipping container including its contents according to a weight unit of measure.
Data type: Real number
Format: Variable length maximum eight digits with two decimal positions
Note: This data element type requires an associated code of a weight unit of measure.

Identifier: DE003
Name: Code of a unit of measure UN/ECE Rec. 20
Definition: Code identifying a unit of measure according to UN/ECE Recommendation 20.
Data type: Alphanumeric
Format: Variable length maximum three characters
Value domain: Code of a unit of measure according to Recommendation No. 20 Codes For Units Of Measure Used In International Trade published by United Nations Economic Commission For Europe (UN/ECE) https://unece.org/trade/uncefact/cl-recommendations

Such data element types are registered and maintained in a data dictionary. They are related to data models as specifications of the implementation of attributes of entities.
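As an illustration of how such specifications might be registered, the following is a minimal sketch of an in-memory data dictionary. The class names DataElementSpec and DataDictionary and their methods are hypothetical and are not taken from any standard or tool. A real data dictionary would normally be kept in a metadata repository.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DataElementSpec:
    """Identification, definition, representation and permissible values."""
    identifier: str
    name: str
    definition: str
    data_type: str
    format: str
    value_domain: Optional[str] = None
    note: Optional[str] = None


class DataDictionary:
    """Hypothetical in-memory register of data element specifications."""

    def __init__(self) -> None:
        self._specs: Dict[str, DataElementSpec] = {}

    def register(self, spec: DataElementSpec) -> None:
        # Identifiers must be unique so that each one resolves to one meaning.
        if spec.identifier in self._specs:
            raise ValueError(f"{spec.identifier} is already registered")
        self._specs[spec.identifier] = spec

    def lookup(self, identifier: str) -> DataElementSpec:
        return self._specs[identifier]


dictionary = DataDictionary()
dictionary.register(DataElementSpec(
    identifier="DE001",
    name="Code of a customer as DUNS Number",
    definition="Code identifying a customer according to the Data "
               "Universal Numbering System (DUNS) of Dun & Bradstreet.",
    data_type="Numeric",
    format="Fixed length 9",
    value_domain="DUNS code numbers issued by Dun & Bradstreet",
))
dictionary.register(DataElementSpec(
    identifier="DE002",
    name="Loaded weight of a shipping container",
    definition="The weight of a shipping container including its contents "
               "according to a weight unit of measure.",
    data_type="Real number",
    format="Variable length maximum eight digits with two decimal positions",
    note="Requires an associated code of a weight unit of measure.",
))

spec = dictionary.lookup("DE002")
print(spec.name)
print(spec.format)
```

Making the specification immutable (frozen=True) is a design choice in this sketch. A revision would then be registered as a new version rather than edited in place.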

The names and definitions of data element types are used in user interfaces. Example values may also be registered in the data dictionary for use as prompts in electronic forms that are filled in manually.

Example DE001 would be part of customer master data. It might be the primary identifier of a customer record or may be a secondary identifier used in credit check processes or as a means of building the hierarchy of related corporate organisations.

Example DE002 would be used throughout the course of a shipment, for example when placing an order for shipment of a container, in the loading plan of containers on a vessel, and in declarations to authorities.
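The note on DE002, that a weight requires an associated unit of measure code, is a constraint on a record rather than on a single value. A minimal sketch of such a record-level check follows. It assumes that "maximum eight digits with two decimal positions" means up to six integer digits plus up to two decimals, which is one possible reading of the specification. The function name check_container_weight is hypothetical.

```python
import re
from typing import List, Optional

# Assumption: eight digits in total, of which at most two are decimals.
WEIGHT_PATTERN = re.compile(r"[0-9]{1,6}(\.[0-9]{1,2})?")
# DE003 format: alphanumeric, maximum three characters.
UNIT_PATTERN = re.compile(r"[A-Za-z0-9]{1,3}")


def check_container_weight(weight: str, unit_code: Optional[str]) -> List[str]:
    """Return a list of issues; an empty list means the record passes."""
    issues: List[str] = []
    if WEIGHT_PATTERN.fullmatch(weight) is None:
        issues.append("DE002: weight does not match the specified format")
    if not unit_code:
        # The weight is meaningless without its unit of measure.
        issues.append("DE002: an associated unit of measure code is required")
    elif UNIT_PATTERN.fullmatch(unit_code) is None:
        issues.append("DE003: unit code does not match the specified format")
    return issues


print(check_container_weight("24000.50", "KGM"))   # []
print(check_container_weight("24000.50", None))    # one issue: unit missing
print(check_container_weight("1234567.5", "KGM"))  # one issue: weight format
```

The check returns all issues found instead of stopping at the first, so that a data steward sees the complete picture for a record.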

Story

For some time issues and arguments had been rumbling on about who was responsible for recurring errors and delays in shipments to customers. And Finance and Accounting were increasingly concerned about accounts receivable problems that were attributable to errors in invoicing.

Eventually it became clear to some executives that poor master data quality was causing operational problems. But how should this be fixed? Who was responsible for master data?

A series of workshops involving key managers was organised, facilitated by a Data Management guru. At first the discussions about data were confused. Manufacturing and supply chain management had differing views and terminology about materials and products. Marketing, sales and administration had differences about channels, contracts, prospects and customers.

However, after a few cycles the guru had a series of posters telling a story about the company's data that everyone agreed on. They were surprised when he explained that the posters showed conceptual data models of the primary entities of importance to the business and the beginnings of a business glossary, and that these formed the foundation for a Data Governance framework that would lead to effective rollout of Master Data Management within six months.

References

Alexopoulos, P. (2020). Semantic modeling for data: Avoiding pitfalls and breaking dilemmas. O'Reilly Media.

DAMA (2017). DAMA-DMBOK. Data Management Body of Knowledge. 2nd Edition. Technics Publications LLC. August 2017.

DAMA Dictionary of Data Management. 2nd Edition 2011. Technics Publications, LLC, New Jersey.

ISO 11179:1999 Information technology — Specification and standardization of data elements

ISO 9000:2015. Quality Management Systems – Fundamentals and vocabulary.

ISO 9001:2015. Quality Management Systems – Requirements.

Harris, J. & Hoberman, S. (2020). Data modeling made simple with Erwin DM. Technics Publications LLC, New Jersey.

Authors

Members of the Workgroup Data Quality of DAMA-NL

data_management/data_quality/metadata.txt · Last modified: 2022/01/18 16:39 by andrew