The National Institute of Standards and Technology (NIST) has updated its 2016 draft of “De-Identifying Government Data Sets,” guidance on do’s and don’ts for Federal agencies to ensure that de-identified data cannot be reverse-engineered to reveal Americans’ sensitive information.
“Every Federal agency creates and maintains internal datasets that are vital for fulfilling its mission,” NIST’s new third public draft reads. “Agencies can use de-identification to make government datasets available while protecting the privacy of the individuals whose data are contained within those datasets.”
The agency defined de-identification as the removal of “identifying information from a dataset so that the remaining data cannot be linked to specific individuals.” With the passage of the Foundations for Evidence-Based Policymaking Act of 2018, which requires agencies to collect and publish their data, it is imperative that agencies “use de-identification to reduce the privacy risks associated with collecting, processing, archiving, distributing, or publishing government data.”
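To make that definition concrete, here is a minimal Python sketch of the simplest form of de-identification: dropping fields that directly identify a person. The field names and the sample record are hypothetical, and the choice of which fields count as identifying is an assumption, not something taken from the NIST draft. Removing direct identifiers alone may not be enough to prevent the remaining data from being linked back to individuals.

```python
# Hypothetical field names; which columns count as "identifying" is an
# assumption made for illustration, not a list from the NIST draft.
DIRECT_IDENTIFIERS = {"name", "ssn", "email"}


def drop_direct_identifiers(records, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the records with the identifying fields removed."""
    return [
        {key: value for key, value in record.items() if key not in identifiers}
        for record in records
    ]


# Made-up sample record, for illustration only.
records = [
    {
        "name": "Alice Example",
        "ssn": "000-00-0000",
        "email": "alice@example.com",
        "state": "MD",
        "benefit_year": 2021,
    },
]

deidentified = drop_direct_identifiers(records)
print(deidentified)
```

The function builds new dictionaries rather than editing the originals, so the source dataset stays intact for internal use while only the reduced copy is released.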
Attacks against datasets to re-identify undisclosed information have become more sophisticated in recent years, NIST said, with actors like journalists and activists leveraging commercially available, high-resolution geo-location data and re-identification techniques to learn confidential information.
In the draft guidelines, NIST offers agencies several ways to manage this risk:
- Create a formal Disclosure Review Board that consists of legal and technical privacy experts, stakeholders within the organization, and representatives of the organization’s leadership;
- Create or adopt standards to guide those performing de-identification, as well as standards for the accuracy of de-identified data; and
- Consider performing de-identification with trained individuals using software specifically designed for the purpose.
NIST is seeking comments on this latest draft of the de-identification guidelines through Jan. 15, 2023.