Using data for social good
Who gets to decide on what’s socially beneficial? And for whom?
Data can be a powerful tool to help improve lives, from making informed policy choices and meeting the climate challenge to delivering better healthcare and more efficient services to citizens. According to Chelsey Colbert, who moderated a panel during this month’s CBA Access to Information and Privacy Law Online Symposium, it can also help build trust among the population in initiatives that use sensitive personal information.
Of course, anyone can lay claim to using data for a positive social impact. But what does that even mean? Governments, businesses and non-profits have all been known to talk up its merits. “So it kind of makes you wonder, is ‘data as a social good’ just a marketing or a PR term, or is there some substance here?” Colbert asked the panel.
A starting point for the discussion in Canada is Bill C-11, introduced last year to update Canada’s privacy regime for the private sector. The bill ultimately died on the order paper before the election. Still, it contained a provision that would have allowed disclosure of deidentified personal information, without a person’s knowledge or consent, to a prescribed entity, provided it was for socially beneficial purposes. The entities contemplated by the bill included governments, healthcare and post-secondary educational institutions, public libraries and any organization mandated under federal or provincial law to carry out a socially beneficial purpose.
According to Dean Eurich, of the School of Public Health at the University of Alberta, people can’t agree on an interpretation of what the provision means. “We’ve been trying to liberate data for 20 years and I hope C-11 can help bridge that divide [if reintroduced],” he says. “But I’m also a little skeptical of whether or not it’s going to achieve the endpoint.” Our current privacy regime already allows for some sharing of data, he notes. “It’s just the application, and the interpretation of the law seems to be different with every group that looks at that particular law.”
As a researcher, Eurich says he can access deidentified data through Alberta Health or its counterparts in Ontario and British Columbia. “But it’s very difficult for me to share this data with other individuals who may not be part of my research team because of the various agreements that have been put in place at the university level.” And it’s practically impossible to share data with commercial organizations – never mind that they’re taking up a greater and greater role in our health systems.
One promising solution is to use machine-learning techniques to enhance the privacy around deidentified data – a practice known as fabricating synthetic data. This is done using individual patient records and administrative health data. “We evaluate the patterns of disease that healthcare encounters in the real world and then we use this to develop a complete population of patients that mimic the real world but are completely fictitious,” Eurich explains. He and his collaborators have generated a dataset of 80,000 fictitious opioid users in Alberta. The synthetic data is useful in that it “mimics real-world patterns of care and some of the adverse outcomes that have been happening around opioid use.” But meeting the technical challenge is the easy part. Far more uncertain is convincing the provincial privacy commissioners and the health authorities that the data is fictitious and should be shared for a socially beneficial purpose.
There is a legitimate question as to whether an allowance for sharing deidentified data for social good should exclude anyone with commercial interests. “From my perspective, I would hope that it doesn’t,” Eurich says. “There’s so much that influences health and utilization of health services that are outside of the health system and embedded in commercial entities.” If Loblaws were to share its users’ data, we could learn about people’s nutritional habits and break them down by geographical area. “That could be a major determinant of the type of health events that are happening within that community,” he says.
The national statistical office is likely to offer the same opinion. “We produce information on the Canadian economy. We produce information on the health of Canadians, the socioeconomic conditions. We produce information on the environment,” says André Loranger of Statistics Canada. The value of data lies in our ability to make linkages and draw insight from it, he adds. So the goal is to make as much data as possible open to the public, much of which is in the hands of the private sector.
A case in point is the Data for Good program launched by TELUS Communications, which it is promoting “to solve pressing societal issues in ways that preserve privacy and build trust.” According to TELUS’ Chief Data & Trust Officer, Pamela Snively, the company had been working for several years with experts on different methodologies to deidentify its wireless data. By the time the pandemic hit, TELUS was in a position to grant access to researchers looking for patterns to help authorities coordinate their health responses. But first, it would take a sustained effort to get regulators on board, as well as the company’s customer base. It reached out to media outlets to make a case for sharing the data. “Everyone was hungry for solutions to the pandemic crisis,” says Snively. But even in an emergency, government agencies were nervous about the potential public backlash. Hence the importance of taking a proactive and transparent approach to rolling out the program and showing that the business is operating with “absolute rigour,” she says.
It’s one thing to mobilize people during an emergency. But as Snively noted during the panel session, organizations are skittish about sharing data in the absence of bright lines. “The potential reputational impact is significant,” she says. “We don’t want to lose customers because they’re not happy with what we’re doing with their data. But is it reasonable to expect that, as a society, we can agree on what those bright lines are, particularly during normal times when a pandemic isn’t raging?”
Part of the problem is that there’s no “single public” that we can appeal to for public trust, says Lisa Austin, chair in law and technology at the University of Toronto. “We have multiple publics, and there are multiple social groups. And that’s really, really important in some contexts.” To illustrate her point, she cites the work of the Indigenous data sovereignty movement, which advocates for Indigenous communities to decide what’s socially beneficial and how their data is used. “Now that’s a bright line,” she acknowledges. “But it’s a bright line about community governance.”
It’s not unusual for communities that have experienced historical disadvantages to be distrustful of how data is used, Austin says. “The question is, who gets to decide on what’s socially beneficial? Who decides for whom? And I think there’s going to be a context where that’s absolutely critical to figure that out, and we might need special mechanisms to manage that.”
Part of it is agreeing on standards that are recognized and certified across jurisdictions. “I don’t think C-11 tackles this well enough, and I think that’s part of the puzzle,” says Austin.