Big Network is Watching You
With Enhanced GPS, cell phones will soon be able to pinpoint a user’s location down to a specific street address. For users, this new capability—which combines satellite GPS with timing data from the cellular network—will improve directions, mapping and other location-based phone services. Meanwhile, marketers plan to use the data to track consumer preferences, to better target ads and to personalize recommendations shown onscreen. Such accurate GPS data means marketers don’t need sales records from retailers to know where people like to shop.
“Some kids might go to Macy’s just to ride the escalators,” engineering professor John Canny explains. “But if you’re at that location for five minutes, the odds are that you’re making a purchase or at least considering one.” Canny, the Paul and Stacy Jacobs Distinguished Professor of Electrical Engineering and Computer Sciences, joined the faculty in 1987.
Improved recommendations are nice, but so is personal privacy, and having a company track your every move poses risks no matter what the information is used for. “The user should become a gatekeeper of his or her own information,” Canny says. “It’s normal for you to maintain some location history, which can be used to recommend services you’d like; but you don’t want the phone to just send data to the service provider unprotected.”
With funding from the National Science Foundation and the Center for Information Technology Research in the Interest of Society (CITRIS), Canny is developing a privacy-protection scheme called Ant Club Trails that will let companies personalize your recommendations while preventing them from determining your identity. Like the chemical traces left by ants, the system stores a limited and temporary record of where each mobile phone user has visited. But this information is decoupled from individual identities.
“The service providers can query users about their information, but only in a way that gives aggregated info,” Canny explains. “It’s something like anonymous ballots. You can count them, but you can’t tell how any individual person voted.”
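The anonymous-ballot idea can be illustrated with additive secret sharing, one well-known way to count answers without seeing any individual answer. This is a minimal sketch, not a description of how Ant Club Trails itself works: the modulus, the function names, the three hypothetical aggregators and the sample visit data are all assumptions made for illustration.

```python
import secrets

# Hypothetical modulus; it only needs to be larger than any possible total.
MODULUS = 2**61 - 1


def split_into_shares(value, n_shares):
    """Split an integer into random-looking shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_shares - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def aggregate(per_user_shares):
    """Each aggregator sums the single share it received from every user.

    Only the partial sums are combined, so no aggregator ever sees
    one user's complete answer.
    """
    n_aggregators = len(per_user_shares[0])
    partial_sums = [
        sum(user[i] for user in per_user_shares) % MODULUS
        for i in range(n_aggregators)
    ]
    return sum(partial_sums) % MODULUS


# Illustrative data: 1 means "visited the store this week", 0 means "did not".
visits = [1, 0, 1, 1, 0]
shared = [split_into_shares(v, 3) for v in visits]
total = aggregate(shared)
print(total)
```

Each share on its own is a uniformly random number, so an aggregator holding one share per user learns nothing about any individual visit; only the combined total is meaningful. The guarantee holds as long as the aggregators do not pool their individual shares.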
From this aggregated info, the user can get personal recommendations, just as they would from Amazon or iTunes. It’s the same experience for the user, but the service provider learns less. “They don’t know anything about individual mobile phone users,” Canny continues, “but they learn just enough to provide the recommendation services that drive sales for them.”
To conceal people’s identities, the system uses linear algebra to distill user behavior into a limited set of “archetypes,” each of which represents a set of preferences that tend to correlate. The archetypes are derived purely mathematically, but they tend to make intuitive sense. As Canny explains, “When you generate archetypes for movie preferences, which Netflix does for example, they tend to correlate the way you might expect. One archetype will like all the action movies, for example.”
Once the archetypes have been calculated, each user can be represented by a short string of numbers that records how closely he or she matches each archetype. The system can generate effective recommendations using these numbers but can’t use them to re-create a specific history or determine an identity.
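A toy version of the archetype representation might look like the sketch below. It assumes a four-item catalog and two hand-written archetypes; a real system would derive the archetypes from many users’ data with linear algebra, and nothing here is taken from Canny’s actual implementation. All names and values are hypothetical.

```python
import math

# Hypothetical four-item catalog.
ITEMS = ["action_a", "action_b", "romance_a", "romance_b"]

# Two hand-written, orthonormal archetypes (an assumption for illustration).
R = 1 / math.sqrt(2)
ARCHETYPES = [
    [R, R, 0.0, 0.0],  # "likes action"
    [0.0, 0.0, R, R],  # "likes romance"
]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def to_weights(history):
    """Compress a full history into one number per archetype."""
    return [dot(history, archetype) for archetype in ARCHETYPES]


def scores_from_weights(weights):
    """Rebuild an approximate preference score for every item."""
    return [
        sum(w * archetype[i] for w, archetype in zip(weights, ARCHETYPES))
        for i in range(len(ITEMS))
    ]


def recommend(history):
    """Suggest the unseen item with the highest rebuilt score."""
    scores = scores_from_weights(to_weights(history))
    unseen = [i for i, seen in enumerate(history) if not seen]
    return ITEMS[max(unseen, key=lambda i: scores[i])]


history_1 = [1, 0, 0, 0]  # this user chose only action_a
history_2 = [0, 1, 0, 0]  # this user chose only action_b

print(to_weights(history_1))
print(to_weights(history_2))
print(recommend(history_1))
```

The two different histories compress to the same pair of numbers, which is why the weights alone cannot be used to rebuild exactly what a given user did, yet they still point the first user toward the other action title.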
Translating user histories into archetypes compresses the data enormously, which makes it faster and easier for the network to handle. It also shields service providers from the complications of holding detailed records that could later be sought as evidence. “Most companies want to decrease their archiving load and avoid subpoena of their data, which is expensive and messy,” says Canny.
In addition to location data, cell phones and other mobile devices are loaded with personal information like call logs, contacts, datebooks and family photos, which you might not want a company to have, let alone any stalker or criminal who might gain access to it. In an era of ubiquitous personal devices and wireless communication, privacy-protection systems like Ant Club Trails must be built into the architecture of systems from the very beginning. Otherwise the door will be open for privacy intrusion by companies, governments or hackers.
At the multidisciplinary Berkeley Institute of Design (BiD), Canny and other researchers, including his Ph.D. student Yitao Duan, work and run experiments in their “smart room testbed,” a privacy-challenging environment filled with microphones, cameras and wireless hubs that rivals even the most wired airports, malls and casinos. See those microphones? Canny notes that software recently developed at Georgia Tech can diagnose depression with 95 percent accuracy from a five-minute sample of someone’s speech. “And I think we can do better,” he adds. See those cameras? Canny explains that face recognition technology isn’t powerful enough yet to identify people if there are thousands of possible matches; but, with a pool of possibilities the size of a typical cell phone contact list, it can recognize faces in stills or video.
This summer, the BiD will move into its new home on the fourth floor of the new CITRIS headquarters building, which officially opens February 27.
“Privacy is a basic right, but companies hold all the cards,” Canny summarizes. “They design the software and hardware, they manage all the data, and users don’t have enough information to understand the risks. All you can do is click ‘Yes.’”