Active and Passive Identifiers

Bringing authentic and deterministic data to a digital network

Yin-yang is a concept from ancient Chinese philosophy in which opposite forces are seen as interconnected and counterbalancing. The concept fits the Human Colossus strapline, “a home for synergy”, and offers a useful analogy for the duality of data entry and data capture in a balanced network.

A secure network must offer objectual integrity through deterministic data capture and factual authenticity through authentic data entry. These fundamentals enable a self-regulating system where (1) semantic data models represent objects and their relationships deterministically and (2) append-only logs accompany signed data inputs to identify the origin and creation of authentic events at recorded moments so that the data can be considered factual.

The following network model highlights the synergistic relationship between the elements and characteristics of data entry, depicted in the northern hemisphere as the Inputs domain, and those of data capture, which fall south of the equator in the Semantic domain.

Figure 1. A component diagram showing the inputs and semantic counterparts of a balanced network model.


Inputs domain [active] / what is put in, taken in, or operated on by any process or system.


Data entry is defined as the process of transcribing information into an electronic medium such as a computer or other electronic device. The medium stores state changes as recorded events, which makes it possible to determine the authenticity of the data's origin (the “source”), its status, and where it moves over time. In a balanced digital network, data entry requires append-only logs to accompany signed data inputs to identify the origin and creation of authentic events at recorded moments so that the inputted data can be considered factual. Using a lock-and-key analogy, the requirement of signing keys for factual events indicates that data entry elements reside on the male (or yang) side of the model.
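As a rough illustration of the pairing described above, the following Python sketch chains each log entry to the previous one and attaches a signature to it. It is a minimal sketch under stated assumptions, not the Foundation's implementation: the class name SignedLog and its fields are hypothetical, and an HMAC over a shared secret stands in for a real digital signature, because the Python standard library has no asymmetric signing.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass, field


@dataclass
class SignedLog:
    """Hypothetical append-only log; every entry is chained to the one before it."""
    signing_key: bytes  # ASSUMPTION: shared secret standing in for a private signing key
    entries: list = field(default_factory=list)

    def append(self, payload: dict, timestamp: str) -> dict:
        """Record an event, linking it to the digest of the previous entry."""
        prev = self.entries[-1]["digest"] if self.entries else "0" * 64
        body = json.dumps(
            {"payload": payload, "timestamp": timestamp, "prev": prev},
            sort_keys=True,
        ).encode("utf-8")
        entry = {
            "body": body,
            "digest": hashlib.sha256(body).hexdigest(),
            # HMAC is a stand-in for a signature made with a signing key.
            "signature": hmac.new(self.signing_key, body, hashlib.sha256).hexdigest(),
        }
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Check every signature, every digest, and the chain between entries."""
        prev = "0" * 64
        for entry in self.entries:
            body = entry["body"]
            expected = hmac.new(self.signing_key, body, hashlib.sha256).hexdigest()
            if not hmac.compare_digest(expected, entry["signature"]):
                return False
            if hashlib.sha256(body).hexdigest() != entry["digest"]:
                return False
            if json.loads(body)["prev"] != prev:
                return False
            prev = entry["digest"]
        return True
```

Appending two events and then calling verify() returns True. Altering the body of an earlier entry makes verify() return False, which is the property that lets a reader treat the recorded events as factual.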


Semantic domain [passive] / the meaning and use of what is put in, taken in, or operated on by any process or system.


Data capture is defined as the process of collecting structured and unstructured information electronically and converting it into data readable by a computer or other electronic device. This entails implementing structural, definitional, and contextual definitions (“metadata”) so that data adhering to those definitions can be interpreted. In a balanced digital network, data capture requires semantic data models to represent objects and their relationships deterministically so that the context and meaning of inputted data are the same for all interacting actors. Using the same lock-and-key analogy, the requirement of objectual fields indicates that all data capture elements reside on the female (or yin) side of the model.
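One way to picture a deterministic semantic model is a schema that is serialized in a canonical form before it is hashed, so that every actor derives the same digest from the same definitions. The Python sketch below assumes a toy schema format. The function names, field names, and type labels are hypothetical and are not taken from any Human Colossus specification.

```python
import hashlib
import json


def schema_digest(schema: dict) -> str:
    """Hash a canonical serialization so that equal schemas give equal digests."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def conforms(record: dict, schema: dict) -> bool:
    """Check that a record has exactly the schema's fields with the declared types."""
    types = {"text": str, "number": (int, float)}  # ASSUMPTION: illustrative type labels
    fields = schema["fields"]
    if set(record) != set(fields):
        return False
    return all(isinstance(record[name], types[kind]) for name, kind in fields.items())


# Hypothetical schema, written twice with a different key order.
patient_schema = {"name": "patient", "fields": {"id": "text", "weight_kg": "number"}}
same_schema_reordered = {"fields": {"weight_kg": "number", "id": "text"}, "name": "patient"}
```

Because the keys are sorted before hashing, schema_digest gives the same value for both spellings of the schema, and conforms lets any actor check a record against the shared definitions.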


The characteristics of the identifier types required for data entry and data capture differ. In the case of data entry, factual data inputs are identified by active identifiers, a type of identifier that requires a signing key to authenticate and bind an active governing entity to an event. An active identifier cannot be controlled by a passive identifier.

In the case of data capture, on the other hand, objectual metadata and data are identified by passive identifiers, a type of identifier that is associated with a cryptographic hash of digital content. The hash acts as a deterministic fingerprint to identify passive objects and their relationships. A passive identifier can either be (1) controlled by an active identifier or (2) not controlled.
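The control rules in the two paragraphs above can be sketched as a pair of Python types. This is an illustrative model only: the class names, the public_key placeholder, and the helper passive_from_content are assumptions made for the example. The active type has no controller field at all, so it cannot be controlled by a passive identifier, while the passive type accepts either an active controller or none.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActiveIdentifier:
    """Identifies a governing entity that authenticates with a signing key."""
    name: str
    public_key: bytes  # ASSUMPTION: placeholder bytes, no real key handling here


@dataclass(frozen=True)
class PassiveIdentifier:
    """Identifies content by the hash of its bytes."""
    digest: str
    controller: Optional[ActiveIdentifier] = None  # None means "not controlled"

    def __post_init__(self):
        # Only an active identifier may act as controller.
        if self.controller is not None and not isinstance(self.controller, ActiveIdentifier):
            raise TypeError("a passive identifier can only be controlled by an active identifier")


def passive_from_content(content: bytes,
                         controller: Optional[ActiveIdentifier] = None) -> PassiveIdentifier:
    """Derive a passive identifier from content, optionally under an active controller."""
    return PassiveIdentifier(hashlib.sha256(content).hexdigest(), controller)
```

Calling passive_from_content with the same bytes gives the same digest whether or not a controller is supplied, and passing a passive identifier as the controller raises a TypeError.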

The following hash grid table describes the different identifier states.

Figure 2. A hash grid table describing the different states of active and passive identifiers.


Governing vs Non-governing

Governing

The identifier identifies an entity that can govern.

Non-governing

The identifier identifies a non-governing entity, an inanimate object, or a static data file.


Authentic vs Deterministic 

Authentic

The identifier requires a signing key for authentication.

Deterministic

The identifier is associated with a cryptographic hash of digital content. Changing even a single bit of the content will, with overwhelming probability, produce a different hash value, so the altered content no longer matches its identifier. A hash value is therefore a deterministic fingerprint for digital content.
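The fingerprint property can be checked in a few lines of Python. The sketch assumes SHA-256 as the hash function, which the text does not specify, and the sample content is made up for the example.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(content).hexdigest()


def matches(content: bytes, expected: str) -> bool:
    """True only if the content still hashes to the expected fingerprint."""
    return fingerprint(content) == expected


original = b"weight_kg=72"   # made-up content
tampered = b"weight_kg=73"   # the same content with one byte changed
identifier = fingerprint(original)
```

Here matches(original, identifier) is True and matches(tampered, identifier) is False, even though the two inputs differ by a single byte.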


In terms of the hemispherical domains shown in the digital network component diagram, this means that all data inputs within the Inputs domain should be signed with a signing key to authenticate and bind an active governing entity to an event. All semantic structures within the Semantic domain, on the other hand, should be associated with a cryptographic hash of digital content, which acts as a deterministic fingerprint to identify a passive non-governing entity, an inanimate object, or a static data file.

In summary, active identifiers ensure that inputted data comes from an authentic source, and passive identifiers ensure that the context and meaning of inputted data remain deterministic for all interacting actors in a digital network.

Note: The Human Colossus Foundation will be introducing key components of a Dynamic Data Economy (DDE) through a series of blog posts over the course of 2020.

Paul Knowles, Head of the Advisory Council

Paul has a bachelor's degree in Information Business Systems Technology from the University of Essex in Colchester, UK. His reputation as a data semantics expert and innovator has been established through a 25-year career in pharmaceutical biometrics where he has worked with companies including Roche, Novartis, GlaxoSmithKline, Amgen and Pfizer. 

In terms of decentralised data initiatives, Paul is the innovation lead behind the Mouse Head Model (MHM), a conceptual model for a safe and secure data sharing economy. He is also the inventor of the Overlays Capture Architecture (OCA) and the main spearhead behind the Blinding Identity Taxonomy (BIT), both of which facilitate a unified data language so that harmonised data can be pooled into multi-source data lakes for improved data science, statistics, analytics and other meaningful services.

Paul is a co-founder and chair of the Advisory Council at The Human Colossus Foundation, chair of the Hyperledger Indy Semantics workgroup, co-convener of the Trust over IP Foundation (ToIP) Decentralized Semantics Working Group, co-lead of Sovrin Foundation's "Comms" work stream, and an active contributing member at MyData Global and Kantara Initiative.
