Politicians are public figures and therefore have a reduced reasonable expectation of privacy. The Dutch House of Representatives publishes information about all 150 representatives in a single XML file: http://www.tweedekamer.nl/xml/kamerleden.xml (mirror of today’s copy; also in Google-cache, but not archive.org). Some of the personal information it contains (not all values are present for every representative):
- full name
- gender
- date of birth
- place of birth
- home town
- education
- work experience
- work e-mail (@tweedekamer.nl)
- travels
- personal website
- personal statement
- (past) affiliations with foundations and associations
- political affiliation
- photo
When I stumbled upon that file, the following thoughts came to mind:
- I hope these public figures don’t use that information as a password or as the answer to a security question in their private lives.
- With this personal data readily available, surely these high-profile targets have already been victims of password-guessing and social engineering attacks (though perhaps without being aware of it)?
- If they haven’t, is that…
- …because nobody cared to target them?
- …because this particular knowledge does not pose a threat?
- …because their personal subscriptions/service-usage is unknown?
- E.g. you don’t know whether they use Gmail, or which bank, insurer or webshops they use.
- …because their personal logins/names are unknown?
- E.g. you know they are a customer/employee/student at X but you don’t know their username for logging in to X
- …because this personal info was not used as a password or as the answer to a security question?
- E.g. you know <username>@gmail.com but can’t guess the password
- …because this personal info is, by itself, insufficient to compromise accounts?
- E.g. more information is needed (SSN, bank account number), or multifactor authentication requires possession of a token
- …because of something else?
In a sense, our representatives function as guinea pigs for testing assumptions about the risk associated with disclosing personal data, or at least with disclosing this particular personal data. Disclosing an SSN, bank account numbers, credit card numbers or DigiD credentials probably remains a bad idea.
UPDATE 2011-04-23: I suddenly realize that A Study on the Re-Identifiability of Dutch Citizens (.pdf) presented at HotPETS 2010 is relevant here. Guido van ‘t Noordende, Cees de Laat and I studied registry office (GBA) data of 2.7 million Dutch citizens (~16% of the total population) to explore their identifiability by various quasi-identifiers consisting of partial or full postal code, partial or full date of birth and gender. We also included this one (tables 2 and 3 in the paper):
QID = { town + date-of-birth + gender }
The median anonymity set size was 2, meaning that half of the combinations of town + date of birth + gender in our data set identified either a single individual (a Dutch citizen) unambiguously or a group of only 2 individuals. The numbers vary with town size, but for ~37% of the Dutch citizens in our set, that QID narrows identification down to a group of 5 or fewer individuals. As the list above shows, the disclosed personal information possibly includes both the quasi-identifier value and the real identity of each representative. I thought this was worth mentioning.
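To make the notion of an anonymity set concrete, here is a minimal Python sketch of how such numbers could be computed for QID = { town + date-of-birth + gender }. It is an illustration only, not the code used for the paper: the function names, the record field names and the toy records are all made up for this example and contain no real data. Note the two different denominators: the median is taken over QID combinations, while the "group of k or fewer" share is taken over individuals.

```python
from collections import Counter
from statistics import median


def anonymity_set_sizes(records):
    """Count how many records share each (town, date of birth, gender) value."""
    return Counter(
        (r["town"], r["date_of_birth"], r["gender"]) for r in records
    )


def share_within_k(records, k):
    """Fraction of individuals whose anonymity set has at most k members."""
    sizes = anonymity_set_sizes(records)
    covered = sum(n for n in sizes.values() if n <= k)
    return covered / len(records)


# Made-up toy records (hypothetical field names, no real persons).
records = [
    {"town": "A", "date_of_birth": "1970-01-01", "gender": "F"},
    {"town": "A", "date_of_birth": "1970-01-01", "gender": "F"},
    {"town": "A", "date_of_birth": "1970-01-01", "gender": "F"},
    {"town": "B", "date_of_birth": "1980-05-17", "gender": "M"},
    {"town": "B", "date_of_birth": "1980-05-17", "gender": "M"},
    {"town": "C", "date_of_birth": "1965-11-30", "gender": "F"},
]

sizes = anonymity_set_sizes(records)

# Median over QID combinations: the sets here have sizes 3, 2 and 1.
median_size = median(sizes.values())

# Share of individuals who sit in a set of at most 2 people.
share_k2 = share_within_k(records, 2)
```

In this toy set the three QID combinations have anonymity sets of 3, 2 and 1 people, so the median set size is 2, and half of the six individuals fall in a set of at most 2 people.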
Since the data is publicly available anyway: here is the list of all representatives and their quasi-identifier values.