With the increasing emphasis on digital progress in India, we often hear the term “Big Data” and the “Privacy” issues associated with it. Just as “Privacy” and “Security” have come to be pitted against each other in debate, Big Data and Privacy are becoming another such contested pair.
In addressing the Privacy vs Security issue, we have always held that Security is to be preferred over Privacy. In the context of growing terrorism in India and the world, it is impossible for “Privacy” to be preferred over Security at any time. This controversy can, however, be settled with a win-win solution of “Regulated Anonymity”, which has been debated earlier.
Now let us look at Big Data vs Privacy as a problem which we need to address. This requires clarity on what Big Data is before we can comment on the issues arising out of the Collection, Mining, Processing, Publishing, Transmission, Disclosure and Harnessing of Big Data.
The concept of “Big Data”, as against the normal term “Data”, arose when the volume of data to be handled in a single process grew too large for normal data processing systems. Along with the size came the complexity arising from the diverse nature of the data, and the need to process data that is both huge and complex.
Big data requires a set of techniques and technologies with new forms of integration to reveal insights from datasets that are diverse, complex, and of a massive scale. The technical issues raised by this size and complexity give rise to legal issues that are difficult to handle under the existing Cyber Laws.
Hence there is a need to take a re-look at Cyber Laws as they relate to Big Data. In this context, Privacy is presently at the top of the discussion table. “Big Data Crimes” will also be relevant for discussion, and the discussion will subsequently flow on to “Big Data Security” as we go along the path of understanding the issues raised by Big Data.
The rise of IoT, Smart Cities, Smart Grids etc. feeds the generation of Big Data, and with it the need to discuss “Big Data Laws” as a subject in its own right.
The growth of Cyber Terrorism and Cyber Warfare, the prevention of which requires “Cyber Intelligence”, also overlaps with the policies and laws that are needed for the Collection, Mining, Processing, Storage, Publishing, Transmission, Disclosure, Harnessing etc. of Big Data.
Nature of Big Data
In order to discuss the “Big Data Laws”, we need to first understand the nature of Big Data.
The sources of Big Data are the information transmission nodes and the public data storage points. Beyond these, data is stored in private custody behind firewalls, and unless it is transferred from the place of creation across an open network, it may never become accessible to Big Data Sniffers.
When Big Data Sniffers “Mine” for data, they may not target any specific type of data or an individual. The data collected from an omnibus data collection drive may later get filtered and classified into different types of data and tagged accordingly for further harnessing.
The components of Big Data are:
a) Personal Data collected from individuals, including individualized data such as that emanating from devices attached to or embedded in the human body, such as wearables and medical implants.
b) Corporate Data, which includes business information as well as personal data of individuals in the hands of a corporate entity, either as custodian of employee data or as an intermediary processing data of customers and the public.
c) Environmental Data, including data collected from weather satellites, mapping devices, CCTVs in public places etc., where the primary aim is not to collect personal data but personal data becomes part of the overall data collected.
d) Meta Data, which is “Data about Data”. It covers transactions of Netizens and the tracking of data movement over a public network, and includes “Log Records” of all kinds. Though this data is impersonal at the time of collection, it is amenable to further analysis and conversion from a de-identified state to an identified state.
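To make the "collect first, classify later" idea concrete, here is a minimal sketch of how records from an omnibus collection might be filtered into the four components listed above. The field names, the rule order and the sample records are all assumptions made for illustration; real collected data would be far messier and would need much richer rules.

```python
from collections import defaultdict

# Hypothetical field names, invented for this illustration only.
PERSONAL_FIELDS = {"name", "heart_rate", "implant_id"}
CORPORATE_FIELDS = {"employee_id", "invoice_no", "customer_id"}
ENVIRONMENT_FIELDS = {"camera_id", "satellite_id", "temperature"}
METADATA_FIELDS = {"src_ip", "dst_ip", "log_time"}

# Rules are checked in this order, so a record carrying both personal
# and environmental fields is tagged "personal".
RULES = [
    ("personal", PERSONAL_FIELDS),
    ("corporate", CORPORATE_FIELDS),
    ("environmental", ENVIRONMENT_FIELDS),
    ("metadata", METADATA_FIELDS),
]


def tag_record(record):
    """Return the first tag whose field set overlaps the record's keys."""
    keys = set(record)
    for tag, fields in RULES:
        if keys & fields:
            return tag
    return "unclassified"


def classify(records):
    """Group records by tag after an untargeted collection run."""
    buckets = defaultdict(list)
    for record in records:
        buckets[tag_record(record)].append(record)
    return dict(buckets)


# Made-up sample of an omnibus collection.
collected = [
    {"heart_rate": 72, "device": "wearable"},
    {"employee_id": "E17", "salary_band": "B"},
    {"camera_id": "CAM-4", "frame": 1001},
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9", "log_time": "09:15"},
    {"checksum": "ab12"},
]
buckets = classify(collected)
for tag, items in buckets.items():
    print(tag, len(items))
```

Note that the tag is assigned only after collection, which is the point made above: the collector did not have to target any particular type of data or individual.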
Privacy Issues are concerns that arise when an individual’s personal data becomes accessible to another without the knowledge and consent of the data subject.
When an individual provides specific personal data, the principles of Privacy protection revolve around informing the data subject of what data is being collected, the purpose for which it is collected, and how it is used, secured, disposed of etc., following which the consent of the data subject is obtained by the agency collecting the information.
This is a contractual obligation, and any violation of privacy in breach of the contract is punishable under various laws.
Even in India, where there is no specific Privacy Protection law, the Information Technology Act 2000 as amended in 2008 (ITA2000/8) provides protection for the contractual arrangement between the data subject and the data collection agent through Sections 43, 43A, 72A etc. Additionally, certain powers are vested with certain authorities which provide for exceptions to Privacy, and these are used for surveillance and intelligence gathering by security agencies, investigation and prosecution of crimes etc.
The Privacy problem that arises in the Big Data context is that, at the time data comes into the hands of a Big Data Sniffer, neither does he know that he is collecting personal data nor does the data subject know that his personal data is being collected.
Take for example a street view CCTV which captures the movement of a car whose license plate is visible, or the face of a person walking across the street. Initially this is only data about an activity: a car is moving in a particular street, or a man is walking along. But if this data is parsed along with the vehicle registration data, it can be presumed that the car’s owner is moving in the street.
Similarly, if face recognition is run on the person walking along by checking against tagged photographs in social media, the CCTV data becomes highly personalized data.
If the camera captures the person entering and exiting an ATM or a hospital, we are moving into sensitive personal information about the individual.
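The change of status in the CCTV example can be illustrated with a small sketch. The sightings, plate numbers, registry and owner name below are all invented, and the registry lookup is only a stand-in for whatever registration data a processor might actually hold. The only point shown is that impersonal sightings turn into a personal movement trail the moment they are joined with a second dataset.

```python
# All values below are invented for illustration.
cctv_sightings = [
    {"camera": "MG Road junction", "time": "09:15", "plate": "KA01AB1234"},
    {"camera": "Hospital gate", "time": "09:40", "plate": "KA01AB1234"},
    {"camera": "MG Road junction", "time": "10:05", "plate": "KA05XY9999"},
]

# Hypothetical registration data held by some other party.
vehicle_registry = {"KA01AB1234": "Owner A"}


def link_sightings(sightings, registry):
    """Join plate sightings with registration data.

    The result presumes the owner was in the car, which is only a
    presumption: someone else may have been driving.
    """
    linked = []
    for sighting in sightings:
        owner = registry.get(sighting["plate"])
        if owner is not None:
            linked.append(
                {"owner": owner, "camera": sighting["camera"], "time": sighting["time"]}
            )
    return linked


trail = link_sightings(cctv_sightings, vehicle_registry)
for row in trail:
    print(row)
```

Before the join, each row is just "a car was seen here". After the join, two of the rows describe a named person's presumed movements, one of them at a hospital gate, without any new data having been collected.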
These examples indicate that “Data can Change its status from the time it is collected to when it goes into processing”. Herein lies the biggest challenge to Big Data law making.
We cannot prevent the CCTV footage from being collected in the first place, because there may be myriad security reasons for it. Beyond the security reasons, there could also be purely functional requirements such as managing the traffic lights in an automated traffic light system.
Once information collected in a public place has an element of “Privacy”, there will always be disagreements on how the data can be handled.
We therefore perhaps need to re-think whether our definition of privacy itself needs to be reviewed in the context of a developing digitized environment.
If a person is using a public place, can the fact that he used the public place be information which he can claim to be private? This is a point of discussion. Similarly, we can ask whether watching a person move along the road through CCTV cameras amounts to “Cyber Stalking”.
Obviously, some would hold that such watching may amount to a privacy violation which the law needs to protect against. But law makers need to think twice before recognizing the “Public Activity” of a person as “Private Data” subject to privacy protection.
It is a common practice today to see notices such as “This area is under CCTV surveillance” just to ensure that there is no complaint on privacy violation. In the Big Data law making scenario, we need to debate if such a notice is required in a public place (including malls and public offices).
The key point we therefore need to settle is:
Do we make new laws that fit the Big Data scenario by changing some of the existing concepts, or do we try to stretch existing laws to situations where they cannot regulate or be enforced?
When Cyber Laws were made by people who had no understanding of the Cyber Space, we observed many anomalies creeping into the system. Most of these still remain in the statute and are often the cause of imperfect legal implementation. It will take generations before Jurisprudence develops and matures to address the doubts that arise because the laws as made are imperfectly suited to the needs of the society.
A similar situation now prevails where laws made for the normal Cyber Society for privacy protection may not be effective in a Big Data scenario.
We therefore need to re-define what Privacy is in the context of a Digital world and Big Data processing. “Personal Data” subject to “Privacy Rights” may have to be re-defined to exclude data that is still in the form of “raw data not associated with the personal information”, even though it may be capable of being tagged to an individual by a further sequential process.
Once this re-definition of privacy is accepted, the Big Data collector can be free from the obligations of Privacy. It is, however, the responsibility of Big Data processors to ensure that the linking of “Big Data” with an “Identifiable Individual” does not happen except through a regulated process. The new Privacy laws therefore have to address this technical stage of processing Big Data. In a way, this means that data collected as anonymous data is retained in an anonymous state even as it goes down the further processing stream.
For Big Data to be useful, at some stage downstream in the processing chain it has to be identified with an individual, and it is at this stage that the Privacy Protection laws can be applied.
The several “Intermediaries” involved in Big Data Analytics therefore have to be classified into different categories such as “Anonymous Data Processors”, “Identified Data Processors” and “Data Identification Gate keepers”. The “Big Data Privacy Law” can then apply different norms to these different entities.
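One possible way to picture this separation of roles is sketched below. It is only an illustration under stated assumptions: the class name, the list of authorised purposes, the audit log and the use of a keyed hash to produce tokens are all hypothetical choices and are not drawn from any existing law or system. The “Anonymous Data Processor” sees only tokens, and the “Gate keeper” alone can map a token back to an individual, and does so only for an authorised purpose.

```python
import hashlib
import hmac


class IdentificationGatekeeper:
    """Hypothetical 'Data Identification Gate keeper'.

    It is the only party able to map tokens back to identities.
    """

    def __init__(self, secret_key, authorised_purposes):
        self._key = secret_key
        self._authorised = set(authorised_purposes)
        self._token_to_identity = {}
        self.audit_log = []

    def pseudonymise(self, identity):
        # Keyed hash: without the key, a processor cannot recompute tokens.
        digest = hmac.new(self._key, identity.encode("utf-8"), hashlib.sha256)
        token = digest.hexdigest()[:16]
        self._token_to_identity[token] = identity
        return token

    def identify(self, token, requester, purpose):
        # Every request is logged, whether or not it is allowed.
        allowed = purpose in self._authorised
        self.audit_log.append((requester, purpose, token, allowed))
        if not allowed:
            raise PermissionError(f"purpose {purpose!r} is not authorised")
        return self._token_to_identity[token]


def count_visits(events):
    """'Anonymous Data Processor': works on tokens only, never on names."""
    counts = {}
    for token, _place in events:
        counts[token] = counts.get(token, 0) + 1
    return counts


# Assumed setup: one authorised purpose and a demo key.
gatekeeper = IdentificationGatekeeper(b"demo-key", {"court_order"})
token = gatekeeper.pseudonymise("Owner A")
events = [(token, "ATM"), (token, "Hospital")]
visits = count_visits(events)
identity = gatekeeper.identify(token, requester="investigating officer", purpose="court_order")
print(visits[token], identity)
```

In such an arrangement, different norms could attach to each role: the anonymous processor handles tokens freely, while the gate keeper carries the privacy obligations and the audit trail. Strictly speaking this is pseudonymity rather than full anonymity, since the tokens remain linkable to each other.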
I invite comments and suggestions …..
(…..Discussions will continue)
Naavi