1st and 2nd place in Directed Hack Hackathon

Over four days in February (7th, 14th, 21st and 28th), a group from the Centre for Big Data Research in Health (CBDRH), UNSW, participated in the Australian Computer Society (ACS) Directed Ideation/Hackathon on Data Sharing and Privacy. Organised by Dr Ian Oppermann, ACS Vice President and CEO & Chief Data Scientist of the NSW Data Analytics Centre, the hackathon challenged participants to investigate whether we can “develop an algorithm to protect individual privacy in large numbers of linked, deidentified data sets”. The hackathon brought together 24 participants and data privacy experts from CSIRO/Data61, other NSW and federal government divisions, and the ICT private sector, as well as PhD students and data scientists Georgina Kennedy, Elliot Zhu and Oisin Fitzgerald from the CBDRH. The hackathon was held across four rounds, and two teams with CBDRH members survived all rounds, finishing the competition in overall first place (Oisin Fitzgerald and Elliot Zhu of team Led Zeppelin) and second place (Georgina Kennedy of team Data Destroyers).

The overarching goal of the hackathon was to develop a framework for quantifying the amount of information in a dataset, along with methods that allow this data to be shared whilst protecting individual privacy. In particular, the teams worked on the combined development of a Personal Information Factor (PIF), a score that quantifies the level of personal information in a particular dataset, and a Data Safety Factor (DSF), which links the PIF to other factors such as data coverage and accuracy and could facilitate decisions about whether data is safe for public release.

With their healthcare background, the teams with CBDRH members focused strongly on privacy-preserving methods for sharing longitudinal data. In electronic medical records, one individual is likely to have many records – e.g. a record of all visits to a doctor with associated drug prescriptions. This temporal pattern, possibly linked to other information (e.g. social media), is one potential route an attacker could use to reidentify an individual in a deidentified dataset. However, such patterns of usage may also be of utmost importance to researchers and policy makers. As a result, both winning CBDRH teams spent time on how to measure the amount of personal information in (possibly millions of) such patterns, and on methods to obfuscate or aggregate this information so as to improve data safety whilst still maintaining its usefulness to researchers and policy makers.
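
As a rough illustration only (this is not the PIF, the DSF or any method the teams developed), the Python sketch below counts what fraction of people in a toy dataset have a visit pattern shared with nobody else, and shows how aggregating dates to the month can reduce that fraction. The records, the function names and the choice of "fraction unique" as a risk proxy are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical toy records: (person_id, visit_date as "YYYY-MM-DD").
visits = [
    ("A", "2019-01-03"), ("A", "2019-02-11"),
    ("B", "2019-01-07"), ("B", "2019-02-20"),
    ("C", "2019-01-03"), ("C", "2019-03-15"),
]

def patterns(records, coarsen=lambda d: d):
    """Map each person to the sorted tuple of their (optionally coarsened) visit dates."""
    by_person = {}
    for person, date in records:
        by_person.setdefault(person, []).append(coarsen(date))
    return {p: tuple(sorted(ds)) for p, ds in by_person.items()}

def unique_fraction(records, coarsen=lambda d: d):
    """Fraction of people whose visit pattern is shared with nobody else."""
    pats = patterns(records, coarsen)
    counts = Counter(pats.values())
    return sum(1 for pat in pats.values() if counts[pat] == 1) / len(pats)

exact = unique_fraction(visits)                      # patterns built from full dates
by_month = unique_fraction(visits, lambda d: d[:7])  # dates aggregated to year-month
print(exact, by_month)
```

With exact dates every person in this toy data has a unique pattern; after aggregating to the month, persons A and B become indistinguishable and only C remains unique. The trade-off is that the coarsened data no longer records the exact day of each visit, which is the kind of loss in usefulness the teams had to weigh.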

Date Published
Tuesday, 5 March 2019