Designing data trusts

And what not to do when setting up a mechanism for data stewardship.

If failure is the mother of success, something might still be salvaged from the collapse of Sidewalk Labs' plans for a tech-forward urban utopia on the Toronto waterfront.

In 2017, the company, a sister firm of Google, won the bid on a Waterfront Toronto project to redevelop the 12-acre Quayside district. Sidewalk Labs' pitch — a "smart city" built around ubiquitous data collection — was nothing if not ambitious. It would have featured a network of cameras and sensors collecting data on everything from traffic patterns to bicycle use to energy consumption in private homes — data that Sidewalk said could be used to make public services like transit and waste collection more efficient.

Ambitious — and to a lot of Torontonians, alarming. Three years later, Sidewalk Labs pulled out of the project, citing "unprecedented" economic uncertainty and volatility in the city's real estate market. But even the project's supporters acknowledge the real reason had more to do with the odour lingering over Sidewalk's plans for handling personal data.

Sidewalk's plans for Quayside were controversial for two reasons: the vast amount of data it wanted to siphon from public and private spaces, and the way in which it wanted to handle that data. Sidewalk's data management plan rested on two notions: "de-identifying" personal data to make it free to use and share, and a data trust to manage and protect the information.

The concept of a data trust is popular these days with policymakers looking for ways to unlock the value in data without undermining privacy. But the term itself is a bit of an empty bucket: data trusts can refer to anything from a legal model (where trustees manage property in a fiduciary arrangement with the trust's beneficiaries) to organizational structures to actual data stores. The Open Data Institute defines a data trust as a legal structure that provides independent stewardship of data.

But stewardship for whom? Sidewalk Labs' proposal for a data trust differed in one essential respect from every other data trust concept: it was to be built around geography. People within the physical borders of the Quayside district would be monitored. "Consent" was to be obtained through signage — new symbols to alert people to the presence of sensors and cameras.

The "trust," meanwhile, was a trust in name only. Sidewalk's proposed data trust was essentially a non-profit; the plan was to hire a chief data officer, draft a data use "charter," and require all parties looking to collect data within the zone to enter into contracts with the trust. Eventually, had the project continued, the trust might have been transformed into a public sector entity.

"The whole concept of data trusts is really hard to pin down," said Brad Limpert, a tech law practitioner in Toronto and director of Osgoode Hall Law School's LLM Program in Privacy and Cybersecurity Law. "It's not really a thing in law. What Sidewalk Labs was proposing wasn't a legal trust. It didn't have clearly defined beneficiaries or a clear purpose."

The result, said Lisa Austin, chair in law and technology at the University of Toronto, was an "incoherent" model that never reconciled the goal of open access to data (embraced by Sidewalk to obviate claims that it was building a monopoly) with the need to review data collection and use. Austin will be a panelist at the upcoming CBA Access to Information and Privacy Law Symposium discussing how data trust concepts can advance the use of data for social good.

"Indeed," she wrote in a paper co-written with U of T computer engineering prof David Lie, "many of the reasons for wanting the Urban Data Trust to review data collection and use — that there are important norms regarding data harms and data justice that go beyond either individual privacy or ensuring public access — are also reasons for wanting the Urban Data Trust to review disclosure."

It didn't help that Sidewalk was pitching a new classification for the data it would be harvesting — "urban data" — that has no meaning in Canadian law. According to Sidewalk's definition, urban data encompasses non-personal, aggregate and de-identified data, and personal information, "collected and used in physical or community spaces where meaningful consent prior to collection and use is hard, if not impossible, to obtain."

Sidewalk's signage was supposed to fulfil the function of advance consent. But is that kind of "consent" meaningful in a public space, where novel symbols are supposed to alert people to data harvesting and where the only way to withhold consent is to physically leave the area?

And given what we know now about the technical limits of de-identification — how it can fail, how de-identified data can be linked to other data and re-identified — did the project not lean far too heavily on the assumption that de-identified data could be safely shared as broadly as possible?
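To make that risk concrete (this is an illustration only, not anything Sidewalk proposed, and the records are invented), here is a minimal Python sketch of a linkage attack: a "de-identified" dataset with names stripped out is matched back to a named dataset by joining on quasi-identifiers such as postal code, birth year and sex.

```python
# Illustrative sketch with hypothetical data: re-identifying "de-identified"
# records by linking them to a named dataset on shared quasi-identifiers.

# A "de-identified" dataset: direct identifiers removed, quasi-identifiers kept.
deidentified_trips = [
    {"postal": "M5A 1B2", "birth_year": 1984, "sex": "F",
     "trip": "Quayside to Union Station, 08:12"},
    {"postal": "M5A 3C4", "birth_year": 1991, "sex": "M",
     "trip": "Quayside to Distillery District, 17:45"},
]

# A separate, publicly available dataset that still carries names.
public_records = [
    {"name": "A. Resident", "postal": "M5A 1B2", "birth_year": 1984, "sex": "F"},
    {"name": "B. Resident", "postal": "M5A 3C4", "birth_year": 1991, "sex": "M"},
]

QUASI_IDENTIFIERS = ("postal", "birth_year", "sex")

def linkage_key(record):
    """Build a join key from the quasi-identifiers shared by both datasets."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)

# Index the named records by quasi-identifier, then link the trip data back.
named_by_key = {linkage_key(r): r["name"] for r in public_records}

for trip in deidentified_trips:
    name = named_by_key.get(linkage_key(trip))
    if name:
        print(f"{name} likely made this trip: {trip['trip']}")
```

The point of the sketch is that nothing in the "de-identified" dataset names anyone; re-identification comes entirely from combining it with other data, which is exactly what broad sharing makes easier.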

The anomalous legal nature of Sidewalk's data trust model didn't help the project either. Making the trust a private sector corporation, wrote Austin and Lie, removed it from the ambit of federal private sector and provincial public sector data protection law.

"In other words," they wrote, "the Sidewalk Labs' proposal would actually take the Urban Data Trust entirely outside of the public regulatory framework for privacy protection."

Sidewalk itself proposed the trust's structure, but the trust was supposed to regulate Sidewalk's data activities. Waterfront Toronto essentially delegated the design of data and privacy governance to a private sector vendor charged with coming up with a policy for a highly controversial data harvesting scheme. Questions about conflicts and legitimacy were inevitable.

"For a start, was Sidewalk not usurping the policy-making rights of the Ontario or Toronto governments?" said Chantal Bernier, national practice leader for privacy and cybersecurity at Dentons. "What right did it have to even purport to develop a data trust structure in relation to this data?"

And that's the big irony of the Quayside experience: a data trust project failed in part because people didn't trust it. Data trusts have great potential for harnessing data in service of scientific research and better government — but not if they sidestep questions of consent with hand-waving promises of de-identification and open access.

"Having it be led by a municipal or provincial government could have helped, but I think the main thing is that there wasn't enough involvement of the data subjects up front," said Alison Paprica of the Institute of Health Policy, Management and Evaluation at the University of Toronto.

She and colleagues published a paper last year outlining essential traits of successful data trusts. One of them — public and stakeholder engagement — goes to the heart of what went wrong with Sidewalk and Quayside: a data trust can obey the law and still fail if it neglects the need to obtain social licence.

"I think it would be rare that a large and sophisticated organization like the one behind Sidewalk Labs, or a government, would propose something that's not legal," she said. "But what can happen is that large organizations focus on potential benefits to the point that they propose or plan to do something that is legal but outside the boundaries of 'social licence,' i.e., isn't supported by or acceptable to the people who would be contributing the data.

"There are ways of using the data, with conditions and risk mitigation that the data subjects would support, but large organizations can't figure those out on their own. The organizations need to have deep involvement of the intended beneficiaries in the development of the grand plan … vs. consulting or trying to persuade people once the broad strokes of a plan have already been set."