In the past, I have blogged about de-identification. Simply put, de-identification is the process of rendering personally identifiable data unidentifiable, thus enabling the use of the information for various purposes free from regulation. A recent court case prompted me to write again, not just about the value of anonymizing data prior to sale or release, but also about the need to revisit older privacy laws in view of the Internet and expansive databases filled with personally identifiable information.
In Ellis v. Cartoon Network (1:14-cv-00484, U.S. District Court for the Northern District of Georgia), the plaintiffs brought a class action suit against the Turner cable TV channel, Cartoon Network. Lead plaintiff Mark Ellis alleged that when he downloaded the Cartoon Network app ("App") on his Android device, he never agreed to the disclosure of his personally identifiable information, and that each time he viewed a Cartoon Network clip using the App, that information was sent directly to a third-party company called Bango, which allegedly used it for direct marketing purposes. The information sent to Bango comprised, amongst other things, the identification numbers of App users' devices, specifically the Android ID number. Bango would allegedly take the data from the App, build a larger consumer profile using outside data in its possession linked via the Android ID, and then sell the enhanced, identifiable consumer information for targeted marketing purposes.
In dismissing the plaintiffs' suit, the judge held that Android ID numbers were not "personally identifiable information" ("PII") under the Video Privacy Protection Act ("VPPA") because Bango needed to use additional information to reverse-engineer the users’ identities. "Without more, an Android ID does not identify a specific person," Judge Thrash wrote. "From the information disclosed by the defendant alone, Bango could not identify the plaintiff or any other member of the putative class." In other words, the judge read the law literally and technically: if a third party that receives the information in question cannot identify the individuals without further action, such as matching it to other data sets, the information disclosed cannot be PII. Another, similar putative class action is being litigated against CNN in Georgia as I write this.
A few observations, if I may.
1. Indirect identifiers require consideration and protection too. In actuality, what the judge is saying here is that the Android ID is not a direct identifier, in that a third party cannot use it, solely on its face, to identify the plaintiff consumer. However, I would argue that this does not mean it is not PII. Rather, it is an indirect identifier, which can be used to identify an individual when combined with other indirect or direct identifiers. Indeed, under other privacy laws, such as HIPAA, indirect identifiers such as zip codes and dates of birth are also protected.
The judge also said that if a third party needs to take additional steps to match an ID to a specific person, there is no violation, because the information is not "personally identifiable information" under the VPPA. On the text of the statute he is right: the law, passed in 1988, defines personally identifiable information as including "information which identifies a person as having requested or obtained specific video materials or services from a video tape service provider." An Android ID could indeed be read as falling outside this definition, and a comparable device number most definitely would have fallen outside it in 1988. Today, however, in an age of interconnected databases and the ability to retain information almost indefinitely, what constitutes a potential identifier is very different.
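To make the indirect-identifier point concrete, here is a minimal sketch in Python of how a recipient could link a bare device ID to a named person once it holds outside data keyed by the same ID. Everything in it is hypothetical: the field names, the IDs, the people and the `link_records` helper are my own illustrations, not anything taken from the case record or from Bango's actual systems.

```python
# Hypothetical records of the kind an app might disclose:
# a device ID plus the clip that was viewed. No names appear here.
app_disclosures = [
    {"device_id": "a1b2c3", "clip": "Clip A"},
    {"device_id": "d4e5f6", "clip": "Clip B"},
]

# Hypothetical outside data the recipient already holds,
# keyed by the same device ID.
outside_profiles = {
    "a1b2c3": {"name": "Person One", "email": "one@example.com"},
}


def link_records(disclosures, profiles):
    """Join disclosed records to outside profiles on the shared device ID."""
    linked = []
    for record in disclosures:
        profile = profiles.get(record["device_id"])
        if profile is not None:
            # The viewing record is now attached to a named individual.
            linked.append({**profile, **record})
    return linked


linked = link_records(app_disclosures, outside_profiles)
for row in linked:
    print(row["name"], "watched", row["clip"])
```

On its face, the first data set identifies no one. The join is what turns an indirect identifier into an identifying one, and that is exactly the step the VPPA's 1988 definition does not contemplate.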
2. Reduce the risk and anonymize the data. Rather than take chances in court on razor-thin interpretations of statutory definitions, why not de-identify all such data, using redaction, an expert statistical model or software, so that the information is truly incapable of re-identifying individuals, or at least carries an acceptably low risk of doing so? HIPAA provides a world-class standard for doing just this. Businesses embracing such technologies can capitalize on their data and potentially identify new revenue streams without compromising privacy and customer loyalty. De-identification may or may not have worked in this case, but it could have provided an opportunity to review the data set and ensure that any unneeded risks were removed or reduced.
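As a rough illustration of what such a review might look like, the Python sketch below drops direct identifiers, coarsens two indirect ones (zip code and date of birth), and then counts how many records share each remaining combination. It is a toy example under my own assumptions: the field names, the sample records and the helper functions are hypothetical, and it is not an implementation of HIPAA's de-identification methods, which involve considerably more than this.

```python
from collections import Counter

# Hypothetical field names treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "device_id"}


def generalize(record):
    """Drop direct identifiers and coarsen the indirect ones."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    out["zip"] = record["zip"][:3] + "**"   # keep only the first three digits
    out["birth_year"] = record["dob"][:4]   # keep only the year of birth
    del out["dob"]
    return out


def smallest_group(records, quasi_identifiers):
    """Size of the smallest set of records sharing the same indirect identifiers."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(counts.values())


# Hypothetical sample data, invented for illustration only.
records = [
    {"name": "Person One", "device_id": "a1b2c3", "zip": "30301",
     "dob": "1980-02-14", "clip": "Clip A"},
    {"name": "Person Two", "device_id": "d4e5f6", "zip": "30309",
     "dob": "1980-11-02", "clip": "Clip B"},
    {"name": "Person Three", "device_id": "g7h8i9", "zip": "30318",
     "dob": "1975-06-30", "clip": "Clip A"},
]

deidentified = [generalize(r) for r in records]
k = smallest_group(deidentified, ["zip", "birth_year"])
print("smallest group size:", k)
```

In this made-up sample the third record is still the only one with its zip prefix and birth year, so the smallest group has a size of one. That is the kind of residual risk a review is meant to surface before any data is released, so that it can be removed or modified.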
3. Govern the data. Regardless of whether de-identification is a solution, you still need a solid data governance plan in place to ensure released data meets your company's compliance standards, be they based in regulation or not. Such a program includes policies, procedures, training, audits and other oversight controls to ensure that the company's business practices are in line with those administrative guidelines and the law. Such a data governance plan also needs to take into account the public perception of your business practices, and should include educating more sophisticated customers on what your company does to comply with the law and meet their individual privacy expectations.