If you walked into a store, would you appreciate it if the store owner
phoned a random stranger to tell them that you are in the store?
Probably not. Because it's weird. Because it serves no purpose to you.
Because you feel it could, in fact, be harmful to you. Or simply because
you feel it is none of their frickin' business. To put it more
eloquently, it intuitively constitutes a violation of
contextual integrity.
Yet, that is exactly what happens when you visit many websites. To
me, Facebook is equivalent to a random stranger. And every time I visit
a website that has a Facebook `Like'-button, that website makes my
browser disclose that visit to Facebook, despite the fact that I do not
have a Facebook profile. When I visit the Dutch online bookstore
Bol.com, its website makes my browser send the following HTTP request to www.facebook.com:
GET /plugins/likebox.php?href=http%3A%2F%2Fwww.facebook.com%2Fbolpuntcom&width=292&height=260&colorscheme=light&show_faces=true&border_color=%23EEEEEE&stream=false&header=true HTTP/1.1
Host: www.facebook.com
User-Agent: Mozilla/5.0 (X11; OpenBSD i386; rv:5.0) Gecko/20100101 Firefox/5.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Proxy-Connection: keep-alive
Referer: http://www.bol.com/nl/index.html
The Referer-header discloses to Facebook that I'm visiting Bol.com. Chances are that,
if Facebook wanted to, they could easily identify me by matching my IP
address and HTTP headers to data collected by themselves or by (other)
private intelligence agencies during my prior (non-anonymous) online purchases and my (non-anonymous) social media activity.
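To make concrete how little effort this takes on the receiving end, here is a minimal Python sketch of what any third party could do with a request like the one above. The function name `referring_site` is my own invention for illustration, and the sample request is a shortened version of the Bol.com request; this is not a claim about what Facebook actually runs.

```python
from urllib.parse import urlsplit


def referring_site(raw_request):
    """Return the host named in the Referer header of a raw HTTP request.

    Hypothetical helper for illustration; returns None if no Referer is sent.
    """
    for line in raw_request.splitlines()[1:]:  # skip the request line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "referer":
            return urlsplit(value.strip()).hostname
    return None


# Shortened version of the request shown above (query string omitted).
sample = (
    "GET /plugins/likebox.php HTTP/1.1\n"
    "Host: www.facebook.com\n"
    "Referer: http://www.bol.com/nl/index.html\n"
)

print(referring_site(sample))  # www.bol.com
```

Combined with the source IP address and the User-Agent header, that single value is enough to log "this browser was on Bol.com at this moment".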
When I visit the Dutch take-away food ordering webshop
ThuisBezorgd.nl, my browser fetches a page from Facebook, Twitter, Google and Hyves (Hyves is a Dutch/Belgian social network).
So, effectively, ThuisBezorgd.nl makes my browser tell four random
strangers my identity and that I'm interested in take-away dinners.
In the case of ThuisBezorgd.nl there is another subtlety. Whenever I visit the website, I have to fill in my postal code.
When I click the `Search'-button, my browser opens
http://www.thuisbezorgd.nl/en/order-food-amsterdam-1098
and that URL ends with the four digits of my postal code. Indeed,
that page too makes my browser fetch content from Google's systems. So,
thanks to the Referer-header, the postal code I provided is disclosed to Google as well.
Specifically, it is disclosed to www.googleadservices.com,
www.google-analytics.com and googleads.g.doubleclick.net:
GET /pagead/conversion/1071768439/?random=1337601791571&cv=7&fst=1337601791571&num=1&fmt=3&label=HMtdCNrcuAEQ98aH_wM&bg=666666&hl=en&guid=ON&u_h=1080&u_w=1920&u_ah=1080&u_aw=1920&u_cd=24&u_his=6&u_tz=120&u_java=true&u_nplug=8&u_nmime=81&ref=http%3A//www.thuisbezorgd.nl/en/&url=http%3A//www.thuisbezorgd.nl/en/order-food-amsterdam-1098&frm=0 HTTP/1.1
Host: www.googleadservices.com
User-Agent: Mozilla/5.0 (X11; OpenBSD i386; rv:5.0) Gecko/20100101 Firefox/5.0
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Proxy-Connection: keep-alive
Referer: http://www.thuisbezorgd.nl/en/order-food-amsterdam-1098
GET /__utm.gif?utmwv=5.3.1&utms=4&utmn=1587224412&utmhn=www.thuisbezorgd.nl&utmcs=UTF-8&utmsr=1920x1080&utmvp=1024x605&utmsc=24-bit&utmul=en-us&utmje=1&utmfl=11.2%20r202&utmdt=Order%20food%20online%20in%20Amsterdam%201098%20-%20Thuisbezorgd.nl&utmhid=1647671063&utmr=0&utmp=%2Fen%2Forder-food-amsterdam-1098&utmac=UA-2290863-1&utmcc=__utma%3D251997388.1444340185.1337593125.1337599450.1337601573.4%3B%2B__utmz%3D251997388.1337593125.1.1.utmcsr%3D(direct)%7Cutmccn%3D(direct)%7Cutmcmd%3D(none)%3B&utmu=q~ HTTP/1.1
Host: www.google-analytics.com
User-Agent: Mozilla/5.0 (X11; OpenBSD i386; rv:5.0) Gecko/20100101 Firefox/5.0
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Proxy-Connection: keep-alive
Referer: http://www.thuisbezorgd.nl/en/order-food-amsterdam-1098
GET /pagead/viewthroughconversion/1071768439/?random=1337601791571&cv=7&fst=1337601791571&num=1&fmt=3&label=HMtdCNrcuAEQ98aH_wM&bg=666666&hl=en&guid=ON&u_h=1080&u_w=1920&u_ah=1080&u_aw=1920&u_cd=24&u_his=6&u_tz=120&u_java=true&u_nplug=8&u_nmime=81&ref=http%3A//www.thuisbezorgd.nl/en/&url=http%3A//www.thuisbezorgd.nl/en/order-food-amsterdam-1098&frm=0&ctc_id=CAIVAgAAAB0CAAAA&ct_cookie_present=false HTTP/1.1
Host: googleads.g.doubleclick.net
User-Agent: Mozilla/5.0 (X11; OpenBSD i386; rv:5.0) Gecko/20100101 Firefox/5.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Proxy-Connection: keep-alive
Referer: http://www.thuisbezorgd.nl/en/order-food-amsterdam-1098
Cookie: id=ccfc97b450000c1||t=1337591014|et=730|cs=002213fd4815288209299939c3
(Yes, GeoIP services may already reveal the geographical location of an IP
address with more precision and accuracy, but that is beside the
point.)
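For illustration, here is a rough Python sketch of how the receiving side could pull that postal code out of the googleadservices.com request shown above. The page URL arrives twice: once in the Referer-header and once in the `url` query parameter. The helper names `leaked_page_urls` and `trailing_digits` are hypothetical, the request target is shortened, and the "four digits after the last hyphen" pattern is simply my reading of this one URL, not a documented format.

```python
import re
from urllib.parse import urlsplit, parse_qs


def leaked_page_urls(request_target, referer):
    """Collect the page URLs visible to the third party.

    These are the Referer value plus any `url` query parameter
    in the request target.
    """
    query = parse_qs(urlsplit(request_target).query)
    return [referer] + query.get("url", [])


def trailing_digits(url):
    """Return the four digits at the end of the URL path, or None."""
    match = re.search(r"-(\d{4})$", urlsplit(url).path)
    return match.group(1) if match else None


# Shortened request target and Referer from the request shown above.
target = (
    "/pagead/conversion/1071768439/?fmt=3"
    "&ref=http%3A//www.thuisbezorgd.nl/en/"
    "&url=http%3A//www.thuisbezorgd.nl/en/order-food-amsterdam-1098&frm=0"
)
referer = "http://www.thuisbezorgd.nl/en/order-food-amsterdam-1098"

codes = {trailing_digits(u) for u in leaked_page_urls(target, referer)}
print(codes)  # {'1098'}
```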
Information disclosure via these types of web bugs is old and well-known.
In fact, EFF's The Web Bug FAQ dates back to 1999. But the problem is
becoming more relevant now that those third parties are used by 100M+
people and more and more personal data is collected and sold on the market.
Besides violating your visitors' privacy, loading external content may also pose a
security risk to your visitors: every system that your website makes your visitors'
browsers load content from can get compromised and serve malware. That also holds for Google, Facebook and Twitter. The
more systems you make your visitors' browsers load content
from, the more risk you expose your visitors to.
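If you run a website and want a rough idea of how many external systems your pages pull in, something like the following Python sketch could serve as a starting point. It only looks at `src` attributes of a few embedding tags in static HTML, so it will miss stylesheets, fonts and anything injected by JavaScript, and it treats every host other than the exact one you pass in (including your own subdomains) as third party. The class name, the helper name and the sample page are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class ThirdPartyHosts(HTMLParser):
    """Collect hosts other than own_host referenced via src attributes."""

    EMBED_TAGS = {"script", "img", "iframe", "embed"}

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag not in self.EMBED_TAGS:
            return
        src = dict(attrs).get("src")
        if not src:
            return
        host = urlsplit(src).hostname  # None for relative URLs
        if host and host != self.own_host:
            self.hosts.add(host)


def find_third_party_hosts(html_text, own_host):
    parser = ThirdPartyHosts(own_host)
    parser.feed(html_text)
    return parser.hosts


# Made-up sample page for illustration.
page = """
<html><body>
<img src="/images/logo.png">
<iframe src="http://www.facebook.com/plugins/likebox.php"></iframe>
<script src="http://www.google-analytics.com/ga.js"></script>
</body></html>
"""

print(sorted(find_third_party_hosts(page, "www.example.com")))
# ['www.facebook.com', 'www.google-analytics.com']
```

Every host in that list learns about each visit to the page, and is also a system that has to stay uncompromised for your visitors to stay safe.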
`Browser-Reflected Information Disclosure' might be an
appropriate label for these types of privacy violations. (If you have a
better suggestion, please comment.)
The solution is very simple: instead
of including a `Like'-button, e.g. via an IFRAME that loads likebox.php
from Facebook's systems, put up a hyperlink to the Facebook page you
want your visitors to `Like'. Instead of including a `+1'-button, put up
a hyperlink to your Google Plus page. Instead of including a PayPal
`Donate'-button from PayPal's systems, make a local copy of that button
image and refer to that local copy in your <img>-tags.
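As a sketch of the first replacement, the snippet below takes the `src` of a likebox IFRAME and turns it into a plain hyperlink to the same Facebook page. It assumes the page address sits in the `href` query parameter, as it does in the Bol.com request earlier in this post; the function name `likebox_to_link` and the link text are placeholders of mine.

```python
from html import escape
from urllib.parse import urlsplit, parse_qs


def likebox_to_link(iframe_src, text="Find us on Facebook"):
    """Build a plain <a> tag pointing at the page named in a likebox URL."""
    params = parse_qs(urlsplit(iframe_src).query)
    targets = params.get("href")
    if not targets:
        raise ValueError("no href parameter in iframe src")
    return '<a href="{}">{}</a>'.format(
        escape(targets[0], quote=True), escape(text)
    )


# Shortened likebox URL, based on the Bol.com request above.
src = (
    "http://www.facebook.com/plugins/likebox.php"
    "?href=http%3A%2F%2Fwww.facebook.com%2Fbolpuntcom&width=292"
)

print(likebox_to_link(src))
# <a href="http://www.facebook.com/bolpuntcom">Find us on Facebook</a>
```

With the hyperlink, the visitor's browser contacts Facebook only if the visitor chooses to click.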