Researchers at Stanford recently published a paper exploring the harvesting of demographic information from an unlikely source: Google Street View. Using a deep learning method called “Convolutional Neural Networks”, the researchers determined which areas preferred certain car models, and from those preferences estimated data points like political affiliation, ethnicity, and level of wealth.
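To make the idea of linking vehicle preferences to demographics more concrete, here is a minimal sketch of a correlation check across neighborhoods. It is not the researchers' code: the `pearson` helper, the variable names, and every number below are made up for illustration, and a real analysis would involve far more data and more careful statistics.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical values, NOT data from the study.
# Each position represents one imaginary neighborhood.
pickup_share = [0.10, 0.20, 0.30, 0.40, 0.50]  # fraction of detected vehicles that are pickups
survey_value = [12.0, 19.0, 33.0, 41.0, 48.0]  # some survey-derived demographic figure

r = pearson(pickup_share, survey_value)
print(f"r = {r:.3f}")
```

A value of `r` near 1 or -1 would suggest that the vehicle mix tracks the demographic figure closely, which is the kind of relationship that makes seemingly harmless street imagery informative about the people living there.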
The researchers used a machine vision model known as a convolutional neural network (CNN) to examine a dataset of 50 million Google Street View images. With these, they trained their algorithm to “pull out” all vehicles and identify them using distinguishing characteristics (grilles, tail lights, etc.). The vehicles detected this way account for about 8% of all vehicles driven in the USA, and each was sorted into one of 2,657 categories: “… a nearly exhaustive list of all visually distinct automobiles sold in the United States since 1990. For instance, our models accurately identified cars (identifying 95% of such vehicles in the test data), vans (83%), minivans (91%), SUVs (86%), and pickup trucks (82%)”. By combining this data with datasets from the American Community Survey, they showed high correlations between the types of vehicles found in a neighborhood and certain demographic information about that neighborhood. Correlations include:
While the researchers describe this study as more of a “proof-of-concept” than anything, it raises several interesting privacy questions, starting with “How do current privacy laws play into this?” As our company president recently wrote, with the arrival of GDPR and other regulations, the kinds of data you must protect, how you need to protect them, and the penalties for non-compliance are more explicit and restrictive than ever.
The EU regulations note that protected information includes “[a]ny information related to a natural person or ‘Data Subject’, that can be used to directly or indirectly identify the person. It can be anything from a name, a photo, an email address, bank details, posts on social networking websites, medical information, or a computer IP address.”
Since this study indicates that even more data is “personally identifiable” than expected, how will this “muddy the waters”? With the “right to be forgotten”, if a citizen wants to remove their car from Google Street View (GSV), how will Google have to react? By removing just that specific car where it appears in front of a residence or work location? By removing all instances of that car (as identified by license plate) from the area, or from all GSV images? Will Google be allowed to keep any images of any vehicles on GSV in the long term?
Conversely, how will someone prove that this information is indeed “protected” and not just general information? Are street-specific identifiers already too specific, or are they generic enough that they don’t identify someone at the per-person level? Will policies and procedures have to be drafted to protect against new potential “data leaks” like the one this research exposes, even if we’re not yet aware of exactly what has been leaked?
How will we have to design machine learning algorithms going forward? Will safeguards have to be hardcoded? How can we train these algorithms to know what constitutes personally identifiable information and obfuscate or anonymize it?
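One way to picture a “hardcoded safeguard” is a rule that only releases aggregate counts when enough vehicles fall into each group, so that a rare car on a small street is not singled out. The sketch below is an illustrative assumption on our part, not something the researchers or Google are known to do, and it is not a guarantee of regulatory compliance. The function name, the region labels, and the threshold of 5 are all hypothetical.

```python
from collections import Counter

MIN_GROUP_SIZE = 5  # hypothetical threshold, chosen only for illustration


def aggregate_with_suppression(records, min_group_size=MIN_GROUP_SIZE):
    """Count vehicle types per region and drop groups below the threshold.

    records: iterable of (region, vehicle_type) pairs that carry no license
    plate or street address.
    Returns a dict mapping (region, vehicle_type) to its count, keeping only
    groups with at least min_group_size members.
    """
    counts = Counter(records)
    return {cell: n for cell, n in counts.items() if n >= min_group_size}


# Made-up observations for two imaginary regions.
observations = (
    [("zip-A", "sedan")] * 7
    + [("zip-A", "pickup")] * 2  # small group: could point at specific households
    + [("zip-B", "pickup")] * 6
)

published = aggregate_with_suppression(observations)
print(published)
```

Here the two pickups in `zip-A` are withheld because the group is too small, while the larger groups are released. Choosing the region size and the threshold is exactly the “too specific versus generic enough” judgment raised above, and it is a policy decision as much as a technical one.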
If you need custom guidance on these kinds of questions, we at SCS can help. We’re experts on privacy policies and regulations and can help you protect your business from the fines and legal actions caused by non-compliance. Contact us today for a free, no-obligation discussion of your specific needs.
Secure Compliance Solutions is the trusted security advisor for Chicagoland’s small-to-medium businesses. We offer a variety of services that promote a strengthened security posture and a culture of compliance. Our solutions include risk advisory services, strategic cybersecurity planning, security and privacy awareness, regulatory guidance, penetration testing, and managed security services. We tailor our engagements and solutions to align with your cultural needs and business objectives, not the other way around. We keep your appetite for risk, budget constraints, and timeline in mind to define strategy and operational tactics that maximize your return on investment. At SCS, we help you navigate the course of your cybersecurity journey.