Andreas Weigend | Social Data Revolution | Fall 2012
School of Information | University of California at Berkeley | INFO 290A-3

Class 4: September 24, 2012

On Privacy, see the recording of an earlier class with Cynthia Dwork
http://stanford2009.wikispaces.com/9_Privacy+Wed%2C6.10

Responsible for initial page (up by 10pm on Thursday after class):
  • Rachel Tsao
  • Omar Rehmane
  • Carl Shan
Assisting Andreas with preparation of class (discuss after previous class, due Thursday before this class):

Class materials

Timeline

1600 Setup
  • 1615 Andreas: The Space of Location Data
  • 1630 Demo: Google Latitude
  • 1635 Andrew Wansley, GOOG (via video): Latitude’s Data Sources
  • 1645 Hugh Fletcher, VZ: Verizon’s Geo API
1655 Break
1745 End
1830 Office hour

Guests: Bio and Summary of Presentation

(A) Andrew Wansley, Google Latitude PM (via Skype)

Andrew joined the class over a choppy Skype connection but was still able to share some key messages. He gave internal technical details about the product and spoke at length about each layer of the stack. Andrew described how Google Maps uses activity detection on Android devices to figure out whether an individual is biking, running, walking or driving. This is done by analyzing accelerometer data. The anonymous data is collected for purposes such as giving biking directions and improving the 'fuzzy' edges of maps/data.


Google Latitude works by calling the app on the phone about once a minute, which then records the location of the user. This data is batched and uploaded to Google Latitude's servers about once every 10 minutes. This is an opt-in process. Push notifications are also used to ping a user's device.
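The sample-then-batch pattern described above can be sketched roughly as follows. This is only an illustration built from the two intervals mentioned in the talk; the class name `LocationBatcher`, the `simulate` helper, and the in-memory `uploaded` list standing in for the server are all assumptions, not Google's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

SAMPLE_INTERVAL_S = 60    # "about once a minute" (from the talk)
UPLOAD_INTERVAL_S = 600   # "about once every 10 minutes" (from the talk)

# (unix timestamp in seconds, latitude, longitude)
Fix = Tuple[int, float, float]


@dataclass
class LocationBatcher:
    """Hypothetical client-side buffer; not Google's actual code."""
    opted_in: bool
    last_upload: int = 0
    pending: List[Fix] = field(default_factory=list)
    # Stand-in for the server: each entry is one uploaded batch.
    uploaded: List[List[Fix]] = field(default_factory=list)

    def record(self, fix: Fix) -> None:
        # Nothing is collected unless the user has opted in.
        if not self.opted_in:
            return
        self.pending.append(fix)
        # Upload once enough time has passed since the last upload.
        if fix[0] - self.last_upload >= UPLOAD_INTERVAL_S:
            self.flush(fix[0])

    def flush(self, now: int) -> None:
        if self.pending:
            self.uploaded.append(self.pending)
            self.pending = []
        self.last_upload = now


def simulate(minutes: int, opted_in: bool = True) -> LocationBatcher:
    """Feed one made-up fix per minute into a fresh batcher."""
    batcher = LocationBatcher(opted_in=opted_in)
    for i in range(1, minutes + 1):
        batcher.record((i * SAMPLE_INTERVAL_S, 37.8716, -122.2727))
    return batcher
```

Calling `simulate(30)` groups thirty one-minute samples into batches of ten, while `simulate(30, opted_in=False)` records nothing at all, which mirrors the opt-in point Andrew made.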

Andreas asked for expansion upon this idea, but at this point (32:00) the connection between Andrew and the class became incredibly choppy and every other word of Andrew's was cut out. The class called into Andrew's cell phone to get a clearer connection while leaving Andrew's Skype on so we could still see him.

Other Sources of Data Collection:

Andrew mentioned that there were other ways that Android phones can request location data. For example, Google's navigation on phones can send a stream of data to Google.

Rachel then asked Andrew: What does he do as a product manager, and where will the role be in 2 years?

Andrew answered by describing his specific work on Google Maps, such as the implementation of location history. He then talked about how the work at Google was starting to transition over to Google Now. Andreas had used, and been impressed by, Google Now while on a flight, when it showed more accurate data about the arrival time of his flight. Andreas noted that this really shows how impressive a data company Google is.

The final question that was asked was: Does Google use our geolocation data to do things or in ways that users are not aware of?

Sketchy Uses?:

Andrew's answer began by pointing out that many of the ways in which the data is used are very obvious to users and happen with their consent. He then described how Google uses geolocation data for language translation, biking directions, and traffic guides and updates.

In Summary:
Google Latitude: Lets you store your position history and share it with your friends.
The Google Latitude API allows you to build applications around a user’s latitude location/location history.
Some of the challenges of Google Latitude include precision and pinpointing the exact geolocation of someone.

(B) Hugh Fletcher, VZ's Geo API

Hugh is an amazing advocate for the API Ecosystem. Hugh is the Associate Director of Technology at Verizon Wireless. His background includes working at Mobio Networks, Sprint, PalmSource and Aicent.

Link to audio of his talk here (1:00 - 13:30).

APIs and Platforms
"Everyone is a platform now, everyone has got an API!" - Hugh Fletcher
Hugh, who has a long history of working with telecommunications companies (Sprint) and startups, is an advocate and supporter of geolocation APIs. He gave the example of how Verizon can access the geolocation data of Verizon customers who have opted into the feature. Using the hundreds of thousands of cell towers scattered throughout the United States, Verizon is able to triangulate a fairly precise location for any individual on the Verizon network. A demonstration took place with the consent of a student in the class who had a Verizon phone. The student opted in, via text, for 24 hours to a service that allows Verizon to pinpoint their location with a fair degree of accuracy. After 24 hours, Verizon would no longer have this ability. The demonstration showed how specific the geolocation data is that companies around the world, chief among them major telecoms like Verizon, collect and use.
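The 24-hour opt-in from the demonstration can be pictured as a simple time-boxed consent check. The sketch below is only an assumption about how such a rule might be expressed; `ConsentRegistry`, `opt_in` and `may_locate` are hypothetical names and do not come from Verizon's Geo API.

```python
from datetime import datetime, timedelta
from typing import Dict

# The 24-hour window mentioned in the demonstration.
CONSENT_WINDOW = timedelta(hours=24)


class ConsentRegistry:
    """Hypothetical registry of opt-ins; names are illustrative only."""

    def __init__(self) -> None:
        self._granted_at: Dict[str, datetime] = {}

    def opt_in(self, subscriber: str, when: datetime) -> None:
        # In the demo the student opted in by text message.
        self._granted_at[subscriber] = when

    def may_locate(self, subscriber: str, now: datetime) -> bool:
        # A location lookup is allowed only inside the consent window.
        granted = self._granted_at.get(subscriber)
        if granted is None:
            return False
        return granted <= now < granted + CONSENT_WINDOW
```

With an opt-in recorded at class time, `may_locate` returns `True` an hour later and `False` once 24 hours have passed, and it is always `False` for a subscriber who never opted in.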

Privacy & Consent
Hugh brought up the key point that this is only possible with the consent of the user! This is an example of how the Social Data Revolution is closely intertwined with evolving social norms and ideas about privacy. Hugh also brought up Sarnoff's Law as an illustration of how the data revolution in the digital world is changing the relative value and importance of networks. According to Wikipedia, "Sarnoff's law states that the value of a broadcast network is directly proportional to the number of viewers."

In the video below, Hugh elaborates on the utility of Oracle's OCSG with Verizon's network API Project:



Key Quote:

"As part of this new open platform, Verizon-as-a-Service type thinking, it enables new revenue opportunities both for the developer and for the operator. So its certainly incumbent upon the operator to operate a great network...But as we look for new services and new margins, network apis are the next building blocks. Networkers around the world are exposing to developers so developers can enable rich services for their Web 2.0 services to increase their ability to make money..."

All in all, Hugh's talk was a clear example of how geolocation data is quickly asserting itself as a powerful force driving the Social Data Revolution.

(C) Chris Conley (ACLU)

Chris is currently a Technology and Civil Liberties Fellow at the ACLU of Northern California (http://bit.ly/NmBEbv). Previously, Chris was a Research Fellow at the Berkman Center for Internet & Society (http://bit.ly/dD8F), where his focus was on censorship and surveillance of internet communication. Chris received his JD from Harvard Law School as well as a master's degree in Computer Science from MIT.

Mr. Conley brought up many interesting (and surprising!) points regarding social data. One of the most important points he raised was that "anonymization" is actually not that safe in most cases. As long as any persistent ID is attached to the data, patterns can be established. Thus, though the data is harder to piece together, it is still there to be found. The only way to be 100% sure that the data is gone and cannot be attached to a person is simply to delete it. Fully aggregated data (Google Flu Trends, for example) is fairly harmless to the user while still exposing useful information.
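A small sketch can illustrate why a persistent ID undermines "anonymization" while full aggregation does not. Everything here is made up for illustration: the record layout, the pseudonyms, the grid-cell names and the night-time heuristic in `likely_home` are assumptions, not something Mr. Conley presented.

```python
from collections import Counter
from typing import List, Optional, Tuple

# (persistent pseudonym, hour of day 0-23, coarse grid cell)
Record = Tuple[str, int, str]


def likely_home(records: List[Record], pseudonym: str) -> Optional[str]:
    """Guess a 'home' cell as the most frequent night-time location.

    No name is needed: the persistent pseudonym is enough to link
    the records together and expose a routine.
    """
    night = Counter(
        cell
        for pid, hour, cell in records
        if pid == pseudonym and (hour >= 22 or hour < 6)
    )
    return night.most_common(1)[0][0] if night else None


def aggregate(records: List[Record]) -> Counter:
    """Drop the ID entirely and keep only per-cell counts."""
    return Counter(cell for _, _, cell in records)


# Made-up sample data for two pseudonymous users.
SAMPLE: List[Record] = [
    ("u17", 23, "cell_A"), ("u17", 2, "cell_A"), ("u17", 5, "cell_A"),
    ("u17", 10, "cell_B"), ("u17", 14, "cell_B"),
    ("u42", 1, "cell_C"), ("u42", 11, "cell_B"),
]
```

On the sample data, `likely_home(SAMPLE, "u17")` singles out `cell_A` from the pseudonymous records alone, whereas `aggregate(SAMPLE)` only says how many observations fell in each cell and cannot be traced back to either user.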

He pointed out that to know where someone is at all times is to know their life wholly: their routine, where they like to go, and so on. Because this sensitive data is often stored indefinitely and is often in the hands of multiple people, it is impossible to know who has access to it. In addition to malicious civilians, the government can also obtain this information, often without a warrant. Everything from the records of every cell phone near a robbed bank to the location of a sheriff's daughter has been obtained by the government or government employees without a warrant. Location information can also be gathered not only through voluntary services, but through more surreptitious methods, such as license plate cameras and fake cell towers.

Statutes in place to protect user privacy were drafted long before location-based services existed in the form they do now. This has resulted in an unclear mess of laws that no one understands - not consumers, not providers, not even law enforcement. The problem is that the development of these technologies moves much faster than the law, and as such the judiciary lacks precedents for dealing with social data privacy.

The one bright spot is that more and more users are becoming aware of the problem and are beginning to show concern about what happens to their location data. As a result, there is more demand for what Mr. Conley wants to do: help people understand not only the value of their information, but also what is being done with it, such as how long it is stored and who has access to it. That is why presenting more meaningful information to the user is important, and it is one of the things Mr. Conley would like to see.

It was certainly enlightening, especially the part about anonymization - I thought it was a useful process, but it seems only to produce a false sense of security.