One of the myths perpetuated by data protection regulations is that there is something called “Personal Data” and something called “Sensitive Personal Data” which companies collect and which needs to be protected.
The regulators, however, forget that in a corporate environment several kinds of data keep flowing in and out. A “Data Set” such as name, address, e-mail, mobile number, health data and financial data does not always arrive at a single point in time so that it can be immediately tagged and protected as required. That happens only where a company puts out a web form and collects designated information from a source. In such cases “Consent” can be obtained and data protection compliance can be achieved.
However, in most cases data flows in through different contexts and channels, often in unstructured format. A company could have received a person’s name and e-mail address a year ago, and today further data about the same person may land within the data environment of the organization. When the new information is fused with the earlier information, the simple data grows into a bigger and more sensitive form.
Similarly, a set of available data can be disintegrated, so that sensitive data is converted into non-sensitive data or even into anonymized data.
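The fusion and disintegration described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class `DataSet`, the function `classify`, and the two category lists are hypothetical and are not drawn from any regulation. Dropping fields is also a much weaker operation than real anonymization. The only point is that the status of a data set is recomputed every time an element is added or removed.

```python
from dataclasses import dataclass, field

# Hypothetical element categories, chosen only for this illustration.
IDENTIFIERS = {"name", "email", "mobile"}
SENSITIVE = {"health", "financial"}


def classify(elements: set[str]) -> str:
    """Derive the status of a data set from the elements it currently holds."""
    has_identifier = bool(elements & IDENTIFIERS)
    has_sensitive = bool(elements & SENSITIVE)
    if not has_identifier:
        # Without an identifier the set is not linked to a living person.
        return "not personal"
    return "sensitive personal" if has_sensitive else "personal"


@dataclass
class DataSet:
    elements: dict[str, str] = field(default_factory=dict)

    def status(self) -> str:
        return classify(set(self.elements))

    def fuse(self, new_elements: dict[str, str]) -> str:
        """Merge newly arrived elements and return the new status."""
        self.elements.update(new_elements)
        return self.status()

    def strip(self, keys: set[str]) -> str:
        """Remove elements (disintegration) and return the new status."""
        for key in keys:
            self.elements.pop(key, None)
        return self.status()


record = DataSet({"name": "A. Person", "email": "a@example.com"})
print(record.status())                           # personal
print(record.fuse({"health": "blood report"}))   # sensitive personal
print(record.strip({"name", "email"}))           # not personal
```

The same stored record passes through three different statuses without any single moment at which it could have been tagged once and for all.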
The fact that personal data is always a “Set” of elements, one of which is the core identifier of a living natural person, and that this set grows organically into different forms, is not adequately captured in the data protection regulations. Some regulations define individual identifiers themselves as “Personal Data”, missing the point that an identifier which cannot be linked to another “identity” of a living individual cannot be called “Personal Information”.
As an example, we often hear that an IP address is “Personal Information” or that a physical address is “Personal Information”. Though data protection practitioners try to make their processes detect the conversion of data from one state to another, through manual intervention or with the use of AI, this remains a lacuna in the regulatory definition of data.
The New Theory of Data therefore has to capture in its data definition that “Data is Dynamic”, that “It evolves over time”, and that “Consent” obtained when the data is in its zero-day status fails when a new data element comes within the radar of the company.
An example: a company may have a group photo of people, many of whom are not known to it. Suddenly, one of the persons becomes identifiable because he sends in a job application with a photo. Now the group photo, which has been in the data system since a past date, becomes “Identifiable” data. This dynamic nature also affects Data Portability and Data Erasure requests.
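The consent problem in the group photo example can be sketched in the same spirit. This is only an illustration under stated assumptions: `Consent`, `SubjectRecord` and the element names are hypothetical, and treating consent as a fixed list of covered elements is a simplification. The sketch shows consent captured at day zero no longer covering the record once a previously unlinked element becomes attributable to the person.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Consent:
    given_on: date
    covered: frozenset[str]  # elements the person agreed to at day zero


@dataclass
class SubjectRecord:
    consent: Consent
    elements: set[str] = field(default_factory=set)

    def uncovered(self) -> set[str]:
        """Elements now linked to the person that consent never covered."""
        return self.elements - self.consent.covered

    def consent_still_valid(self) -> bool:
        return not self.uncovered()

    def add(self, element: str) -> set[str]:
        """Link a newly attributable element and report the consent gap."""
        self.elements.add(element)
        return self.uncovered()


# Day zero: the applicant consents to processing of name and e-mail only.
consent = Consent(date(2020, 1, 1), frozenset({"name", "email"}))
record = SubjectRecord(consent, {"name", "email"})
print(record.consent_still_valid())   # True

# Later: the job application photo makes an old group photo identifiable.
print(record.add("face_in_group_photo"))   # {'face_in_group_photo'}
print(record.consent_still_valid())        # False
```

A portability or erasure request raised after the second step would have to reach the group photo as well, although the photo entered the system long before the person was known.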
The New Theory of Data needs to recognize these anomalies and ensure that there is a valid explanation of these special instances of data within the theory of data.
Similarly, neither “Data as a Property” of the Data Subject nor data as a productive asset of an organization is properly captured by the present technical or legal approach to data.
Thus the current understanding of data from the perspectives of technology and law appears to pose contradictions, because each group of stakeholders has, at different points in time, tried to describe the term “Data” for its own convenience. If these differences are not amicably resolved, corporate managements will find it difficult to balance the differing demands of technologists, lawyers and business managers.
The need for a new approach to understanding data is therefore critical, and this new theory should be capable of creating a proper definition of the term data so that all seemingly contradictory views converge under it.
Watch out for more…
Naavi