Reading Notes: “The Flaw of Averages” (Savage, 2009)

I just finished reading The Flaw of Averages – Why We Underestimate Risk in the Face of Uncertainty (2009) by Sam Savage, a professor at Stanford. The author states on his website that “[s]imply stated, the Flaw of Averages implies that”:

Plans based on average conditions are wrong on average.

The book deals with uncertain numbers (e.g. how many sales will product X have per month in the next year?), and more specifically, the erroneous forecasting of uncertain numbers due to incorrect use of averages. Savage distinguishes two forms of the Flaw of Averages:

  • Weak Form of the Flaw of Averages: using a single number (or regression line) in forecasting future values of an uncertain number, instead of taking into account the distribution of possible outcomes;
  • Strong Form of the Flaw of Averages: also screwing up the average itself. From page 83: “Consider a drunk staggering down the middle of a busy highway and assume that his average position is the centerline. Then the state of the drunk at his average position is alive, but on average he’s dead.”

Pages 130–132 list the Seven Deadly Sins of Averaging, which were first published in the article Probability Management in ORMS Today in 2006. In fact, the list has grown beyond seven since then. But Savage states on page 130:

I plan to go on calling them the Seven Deadly Sins regardless of how long the list becomes. Be sure to check in at FlawOfAverages.com to see where it stands today.

Both the 2009 edition of his book and today’s version of the website list twelve sins. Both lists reference scenarios explained elsewhere in the book. Therefore, I will quote sins 1 to 7 from the self-contained ORMS Today article; I will quote sins 8 to 12 from the book, and/or refer within []’s to online resources of my choice.

  • The Family with 1 1/2 Children: Often the “average” scenario, like the “average” family with 1 1/2 children, is non-existent. For example, a bank may have two main groups of young customers — students with an average income of $10,000 and young professionals with an average income of $70,000. Would it make sense for the bank to design products or services for customers with the average income of $40,000?
  • Why Everything is Behind Schedule: Imagine a software project that requires 10 separate subroutines to be developed in parallel. The time to complete each subroutine is uncertain and independent, but known to average three months, with a 50 percent chance of being over or under. It is tempting to estimate the average completion time of the entire project as three months. But for the project to come at three months or less, each of the 10 subroutines must be completed at or below its average duration. The chance of this is the same as flipping 10 sequential heads with a fair coin, or less than one in a thousand!
  • The Egg Basket: Consider putting 10 eggs all in the same basket, versus one by one in separate baskets. If there is a 10-percent chance of dropping any particular basket, then either strategy results in an average of nine unbroken eggs. However, the first strategy has a 10-percent chance of losing all the eggs, while with the second, there is only one chance in 10 billion of losing all the eggs.
  • The Risk of Ranking: It is common when choosing a portfolio of capital investment projects to rank them from best to worst, then start at the top of the list and go down until the budget has been exhausted. This flies in the face of modern portfolio theory, which is based on the interdependence of investments. According to the ranking rule, fire insurance is a ridiculous investment because on average it loses money. But insurance doesn’t look so bad if you have a house in your portfolio to go along with it.
  • Ignoring Restrictions: Consider a capital investment in infrastructure sufficient to provide capacity equal to the “average” of uncertain future demand. It is common to assume that the profit associated with average demand is the average profit. This is generally false. If actual demand is less than average, clearly profit will drop. But if demand is greater than average, the sales are restricted by capacity. Thus, there is a downside without an associated upside, and the average profit is less than the profit associated with the average demand.
  • Ignoring Optionality: Consider a petroleum property with known marginal production costs and an uncertain future oil price. It is common to value such a property based on the “average” oil price. If oil price is above average, the property is worth a good deal more. But if the price drops below the marginal cost of production, the owners have the option to halt production. Thus, there is an upside without an associated downside, and the average value is greater than the value associated with the average oil price. (…)
  • The Double Whammy: Consider a perishable inventory of goods with uncertain demand, in which the quantity stocked is the “average” demand. If demand exactly equals its average, then there are no costs associated with managing the inventory. However, if demand is less than average then there will be spoilage costs, and if demand is greater than average there will be lost sales costs. So the cost associated with average demand is zero, but average cost is positive.
  • The Flaw of Extremes: In bottom-up budgeting, reporting the 90th percentile of cash needs leads to ever thicker layers of unnecessary cash as the figures are rolled up to higher levels. Even more harmful things result from focusing on above- or below-average results, such as test scores or health-related statistics. (…) [From p138: T]he flaw of extremes results from focusing on abnormal outcomes such as 90th percentiles, worse than average cancer rates, or above average test scores. Combining or comparing such extreme outcomes can yield misleading results. (…) The smaller the sample size, the greater the variability of the average of that sample.
  • Simpson’s Paradox: [see Simpson’s Paradox (Wikipedia) and Chapter 18 online supplement]
  • The Scholtes Revenue Fallacy: [From p146: T]he Scholtes Revenue Fallacy occurs when revenue is the result of multiplying two uncertain numbers, such as (…) price and quantity. If the two uncertain numbers are inversely (negatively) interrelated, the average revenue is less than the revenue associated with the average uncertainties. If the two uncertain numbers are directly (positively) interrelated, the average revenue is greater than the revenue associated with the average uncertainties.
  • Taking credit for chance occurrences: We all like to take credit for our hard work, but some successes may be due to dumb luck. (…) [This is about null hypothesis (statistical) testing. See  Statistical hypothesis testing (Wikipedia) and Chapter 20 online supplement]
  • Believing there are only eleven deadly sins: The twelfth of the Seven Deadly Sins is being lulled into a sense of complacency, thinking you now know all of the insidious effects of averages.
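
The arithmetic behind three of the sins above can be checked in a few lines. The following is a minimal sketch, not taken from the book or the article: the schedule and egg-basket figures use the probabilities quoted in the list, while the three demand scenarios and the unit profit in the "Ignoring Restrictions" part are hypothetical numbers chosen only for illustration.

```python
from fractions import Fraction

# Why Everything is Behind Schedule: 10 independent subroutines,
# each finishing at or below its average duration with probability 1/2.
p_all_on_time = Fraction(1, 2) ** 10
print(p_all_on_time)  # 1/1024, i.e. less than one in a thousand

# The Egg Basket: 10 eggs in 10 separate baskets,
# each basket dropped independently with probability 1/10.
p_lose_all_separate = Fraction(1, 10) ** 10
print(p_lose_all_separate)  # 1/10000000000, one chance in 10 billion

# Ignoring Restrictions: capacity is built for the average demand.
# ASSUMPTION: three equally likely demand scenarios and a unit profit of 1.
demands = [50, 100, 150]
unit_profit = 1
avg_demand = sum(demands) / len(demands)
capacity = avg_demand

# Profit if demand happens to equal its average.
profit_at_avg_demand = unit_profit * min(avg_demand, capacity)

# Average of the profits over all scenarios; sales are capped by capacity.
avg_profit = sum(unit_profit * min(d, capacity) for d in demands) / len(demands)

print(profit_at_avg_demand, round(avg_profit, 2))  # 100.0 83.33
```

In this toy setup the high-demand scenario cannot compensate for the low-demand one, because sales are capped at capacity, so the average profit falls below the profit at average demand.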

Sam Savage did a great job: The Flaw of Averages is written in an amusing and down-to-earth style, and is a worthy read. If you don’t like mathematics, rest assured: no mathematical background or skill is required to enjoy it.

Further reading on statistics:

EOF

 

The Affirmation of Humanism: A Statement of Principles (Paul Kurtz)

UPDATE 2014-12-09: some readers may also want to check out the Reality of Morality project by The Brights Network (@TheBrightsNet), which provides an overview of scientific perspectives on human morality, along with the accompanying list of 97 scientific studies that substantiate claims of morality’s natural origins.

In salute to skepticism and secular humanism, while respecting others’ worldviews, I care to share The Affirmation of Humanism: A Statement of Principles, as published in 1997 by Paul Kurtz (1925-2012). Kurtz was a philosophy professor, a prominent American skeptic and secular humanist, and one of the greatest voices of reason in the last four decades. We owe the existence of the Council for Secular Humanism (known for Free Inquiry magazine) and the Committee for Skeptical Inquiry (known for Skeptical Inquirer magazine) to many great minds, and Kurtz was one of the most prominent among them. If you enjoy science & reason, consider subscribing to Free Inquiry and/or Skeptical Inquirer.

  • We are committed to the application of reason and science to the understanding of the universe and to the solving of human problems.
  • We deplore efforts to denigrate human intelligence, to seek to explain the world in supernatural terms, and to look outside nature for salvation.
  • We believe that scientific discovery and technology can contribute to the betterment of human life.
  • We believe in an open and pluralistic society and that democracy is the best guarantee of protecting human rights from authoritarian elites and repressive majorities.
  • We are committed to the principle of the separation of church and state.
  • We cultivate the arts of negotiation and compromise as a means of resolving differences and achieving mutual understanding.
  • We are concerned with securing justice and fairness in society and with eliminating discrimination and intolerance.
  • We believe in supporting the disadvantaged and the handicapped so that they will be able to help themselves. We attempt to transcend divisive parochial loyalties based on race, religion, gender, nationality, creed, class, sexual orientation, or ethnicity, and strive to work together for the common good of humanity.
  • We want to protect and enhance the earth to preserve it for future generations, and to avoid inflicting needless suffering on other species.
  • We believe in enjoying life here and now and in developing our creative talents to their fullest.
  • We believe in the cultivation of moral excellence.
  • We respect the right to privacy. Mature adults should be allowed to fulfill their aspirations, to express their sexual preferences, to exercise reproductive freedom, to have access to comprehensive and informed health-care, and to die with dignity.
  • We believe in the common moral decencies: altruism, integrity, honesty, truthfulness, responsibility. Humanist ethics is amenable to critical, rational guidance. There are normative standards that we discover together. Moral principles are tested by their consequences.
  • We are deeply concerned with the moral education of our children. We want to nourish reason and compassion. We are engaged by the arts no less than by sciences.
  • We are citizens of the universe and are excited by discoveries still to be made in the cosmos.
  • We are skeptical of untested claims to knowledge, and are open to novel ideas and seek new departures in our thinking.
  • We affirm humanism as a realistic alternative to theologies of despair and ideologies of violence and a source of rich personal significance and genuine satisfaction in the service to others.
  • We believe in optimism rather than pessimism, hope rather than despair, learning in the place of dogma, truth instead of ignorance, joy rather than guilt or sin, tolerance in the place of fear, love instead of hatred, compassion over selfishness, beauty instead of ugliness, and reason rather than blind faith or irrationality.
  • We believe in the fullest realization of the best and noblest that we are capable of as human beings.

EOF

The Curious Case of 42.0.20.80

UPDATE 2013-09-xx: slides (.pdf, Sep 2013) made and presented by @Yafsec at BruCON 2013.

UPDATE 2013-03-10: everything is caused by this bug in dproxy, a caching DNS proxy that runs on the Conceptronic C54APRB2+ router. Tip of the hat to the anonymous commenter who suggested this!

UPDATE 2012-12-27: here is a small Python script written by @Yafsec that, given a hostname, shows how gethostbyname() would misinterpret resolved IPv6 addresses as IPv4 addresses.

A friend told me that his computer was periodically unable to connect to Google while still being able to connect to other websites. I recently was at his place when the problem occurred again and I decided to take a look. I started tcpdump, directed Firefox to www.google.com and observed his system sending SYN-packets to tcp/80 and tcp/443 at IPv4 address 42.0.20.80. Indeed, on my friend’s system, host resolved www.google.com to that address:

$ host -t a www.google.com
www.google.com has address 42.0.20.80

A whois query revealed that 42.0.20.80 is not part of Google’s address space, but of address space allocated to China Telecom:

inetnum:        42.0.16.0 - 42.0.23.255
netname:        CHINANET-GD
descr:          CHINANET Guangdong province network
descr:          Data Communication Division
descr:          China Telecom
country:        CN
admin-c:        CH93-AP
tech-c:         IC83-AP
status:         ALLOCATED PORTABLE
notify:         […redacted…]
remarks:        service provider
changed:        […redacted…] 20110412
mnt-by:         APNIC-HM
mnt-lower:      MAINT-CHINANET-GD
mnt-irt:        IRT-CHINANET-CN
source:         APNIC

[…]

A search engine query for 42.0.20.80 revealed a few online messages posted in 2009, 2010, 2011 and 2012 that mention 42.0.20.80 in varying contexts involving connectivity problems. It is associated with (at least) the following hostnames, and observed on various OS’s:

  • talk.google.com
  • www.google.com
  • dl-ssl.google.com
  • v8.lscache4.c.youtube.com
  • www.picasaweb.google.com

Next, I asked on Twitter:

What’s up with @Google domains incidentally resolving to 42.0.20.80, owned by China Telecom (Guangdong)? Is that bonafide?

@Yafsec (Edwin van Andel) replied:

@mrkoot If the resolver uses gethostbyname, it expects ipv4. When on ipv6 it apparently uses the first 4 bytes of the ipv6 address as ipv4.

Indeed, an AAAA record exists for www.google.com, as it does for many other Google domains:

$ host -t aaaa www.google.com
www.google.com has IPv6 address 2a00:1450:4013:c00::63

...but I only now noticed that the first four bytes of that address, 2a00:1450 (in hexadecimal: 0x2a, 0x00, 0x14, 0x50), read as four decimal octets are 42.0.20.80!
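
The conversion is easy to verify with Python's standard ipaddress module. This is a minimal sketch of the arithmetic only; the helper name first_four_bytes_as_ipv4 is made up for illustration and is not taken from @Yafsec's script.

```python
import ipaddress

def first_four_bytes_as_ipv4(ipv6_text):
    """Interpret the first 4 of the 16 bytes of an IPv6 address as an IPv4 address."""
    packed = ipaddress.IPv6Address(ipv6_text).packed  # 16 raw bytes
    return str(ipaddress.IPv4Address(packed[:4]))

# The AAAA record returned for www.google.com above.
print(first_four_bytes_as_ipv4("2a00:1450:4013:c00::63"))  # 42.0.20.80
```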

After further testing, my suspicion shifted to the router: a Conceptronic C54APRB2+ that runs firmware dating back to 2008 (no firmware update is available). Eventually I learned that the problem can be reproduced with the help of ipv6.l.google.com.  The first step is to run a lookup for A records for that label (which do not exist):

$ host -t a ipv6.l.google.com
ipv6.l.google.com has no A record

15:41:11.460855 IP (tos 0x0, ttl 64, id 50854, offset 0, flags [none], proto UDP (17), length 63)
192.168.1.2.51965 > mygateway1.ar7.domain: [udp sum ok] 50896+ A? ipv6.l.google.com. (35)
0x0000:  4500 003f c6a6 0000 4011 30b4 c0a8 0102  E..?….@.0…..
0x0010:  c0a8 0101 cafd 0035 002b 1a4c c6d0 0100  …….5.+.L….
0x0020:  0001 0000 0000 0000 0469 7076 3601 6c06  ………ipv6.l.
0x0030:  676f 6f67 6c65 0363 6f6d 0000 0100 01    google.com…..
15:41:11.484512 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 113)
mygateway1.ar7.domain > 192.168.1.2.51965: [udp sum ok] 50896 q: A? ipv6.l.google.com. 0/1/0 ns: l.google.com. [1m] SOA ns1.google.com. dns-admin.google.com. 1507162 900 900 1800 60 (85)
0x0000:  4500 0071 0000 4000 4011 b728 c0a8 0101  E..q..@.@..(….
0x0010:  c0a8 0102 0035 cafd 005d 862e c6d0 8180  …..5…]……
0x0020:  0001 0000 0001 0000 0469 7076 3601 6c06  ………ipv6.l.
0x0030:  676f 6f67 6c65 0363 6f6d 0000 0100 01c0  google.com……
0x0040:  1100 0600 0100 0000 3c00 2603 6e73 31c0  ……..<.&.ns1.
0x0050:  1309 646e 732d 6164 6d69 6ec0 1300 16ff  ..dns-admin…..
0x0060:  5a00 0003 8400 0003 8400 0007 0800 0000  Z……………
0x0070:  3c                                       <

The second step is to run a lookup for AAAA records for that label, which yields an IPv6 address (the highlighted bytes are the last 16 bytes of the DNS ‘answer section’ that correspond to the IPv6 address 2a00:1450:400c:c05::68):

$ host -t aaaa ipv6.l.google.com
ipv6.l.google.com has IPv6 address 2a00:1450:400c:c05::68

15:41:11.504818 IP (tos 0x0, ttl 64, id 24467, offset 0, flags [none], proto UDP (17), length 63)
192.168.1.2.62353 > mygateway1.ar7.domain: [udp sum ok] 9696+ AAAA? ipv6.l.google.com. (35)
0x0000:  4500 003f 5f93 0000 4011 97c7 c0a8 0102  E..?_…@…….
0x0010:  c0a8 0101 f391 0035 002b 77a8 25e0 0100  …….5.+w.%…
0x0020:  0001 0000 0000 0000 0469 7076 3601 6c06  ………ipv6.l.
0x0030:  676f 6f67 6c65 0363 6f6d 0000 1c00 01    google.com…..
15:41:11.528881 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 91)
mygateway1.ar7.domain > 192.168.1.2.62353: [udp sum ok] 9696 q: AAAA? ipv6.l.google.com. 1/0/0 ipv6.l.google.com. [5m] AAAA 2a00:1450:400c:c00::68 (63)
0x0000:  4500 005b 0000 4000 4011 b73e c0a8 0101  E..[..@.@..>….
0x0010:  c0a8 0102 0035 f391 0047 cca2 25e0 8180  …..5…G..%…
0x0020:  0001 0001 0000 0000 0469 7076 3601 6c06  ………ipv6.l.
0x0030:  676f 6f67 6c65 0363 6f6d 0000 1c00 01c0  google.com……
0x0040:  0c00 1c00 0100 0001 2c00 102a 0014 5040  ……..,..*..P@
0x0050:  0c0c 0000 0000 0000 0000 68              ……….h

The third and last step is to repeat the first lookup, and establish that it now “succeeds”, in that an A-record (that does not actually exist) is returned with IPv4 address 42.0.20.80:

$ host -t a ipv6.l.google.com
ipv6.l.google.com has address 42.0.20.80

 15:41:11.550621 IP (tos 0x0, ttl 64, id 15024, offset 0, flags [none], proto UDP (17), length 63)
192.168.1.2.50255 > mygateway1.ar7.domain: [udp sum ok] 45480+ A? ipv6.l.google.com. (35)
0x0000:  4500 003f 3ab0 0000 4011 bcaa c0a8 0102  E..?:…@…….
0x0010:  c0a8 0101 c44f 0035 002b 3622 b1a8 0100  …..O.5.+6″….
0x0020:  0001 0000 0000 0000 0469 7076 3601 6c06  ………ipv6.l.
0x0030:  676f 6f67 6c65 0363 6f6d 0000 0100 01    google.com…..
15:41:11.555103 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 79)
mygateway1.ar7.domain > 192.168.1.2.50255: [udp sum ok] 45480- q: A? ipv6.l.google.com. 1/0/0 ipv6.l.google.com. [2h46m40s] A 42.0.20.80 (51)
0x0000:  4500 004f 0000 4000 4011 b74a c0a8 0101  E..O..@.@..J….
0x0010:  c0a8 0102 0035 c44f 003b 42db b1a8 8100  …..5.O.;B…..
0x0020:  0001 0001 0000 0000 0469 7076 3601 6c06  ………ipv6.l.
0x0030:  676f 6f67 6c65 0363 6f6d 0000 0100 01c0  google.com……
0x0040:  0c00 0100 0100 0027 1000 042a 0014 50    …….’…*..P

So, it appears that the router stored the first four bytes of the just-received IPv6 address and now answers the A lookup from its cache.
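
The three-step reproduction can be mimicked with a toy model. To be clear, this is a hypothetical sketch of a cache that would produce the observed behavior, not the router's or dproxy's actual code: the class ToyBuggyCache and its methods are invented, and the assumption is that the cache is keyed by hostname only (ignoring the record type) and keeps just four address bytes per entry.

```python
import ipaddress

class ToyBuggyCache:
    """HYPOTHETICAL model: entries keyed by name only, 4 address bytes each."""

    def __init__(self):
        self._entries = {}

    def store(self, name, rdata):
        # Modeled bug: the record type is not remembered and only the
        # first 4 bytes of the address are kept, even for a 16-byte AAAA.
        self._entries[name] = rdata[:4]

    def lookup_a(self, name):
        raw = self._entries.get(name)
        if raw is None:
            return None  # cache miss: no A record known
        return str(ipaddress.IPv4Address(raw))

cache = ToyBuggyCache()
name = "ipv6.l.google.com"

# Step 1: A lookup before anything is cached.
print(cache.lookup_a(name))  # None

# Step 2: the AAAA answer seen in the capture gets cached.
aaaa = ipaddress.IPv6Address("2a00:1450:400c:c00::68")
cache.store(name, aaaa.packed)

# Step 3: the same A lookup now "succeeds" with a bogus address.
print(cache.lookup_a(name))  # 42.0.20.80
```

The model covers only the bogus A answer. It does not attempt to explain the malformed repeat AAAA responses described next.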

A second observation is that after the initial AAAA lookup, repeat AAAA lookups break, in that the response does not contain a DNS ‘answer section’ at all (=malformed):

$ host -t aaaa ipv6.l.google.com
;; Warning: Message parser reports malformed message packet.
ipv6.l.google.com has no AAAA record
 

15:41:54.578422 IP (tos 0x0, ttl 64, id 60328, offset 0, flags [none], proto UDP (17), length 63)
192.168.1.2.53570 > mygateway1.ar7.domain: [udp sum ok] 64254+ AAAA? ipv6.l.google.com. (35)
0x0000:  4500 003f eba8 0000 4011 0bb2 c0a8 0102  E..?….@…….
0x0010:  c0a8 0101 d142 0035 002b c4d8 fafe 0100  …..B.5.+……
0x0020:  0001 0000 0000 0000 0469 7076 3601 6c06  ………ipv6.l.
0x0030:  676f 6f67 6c65 0363 6f6d 0000 1c00 01    google.com…..
15:41:54.583505 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 63)
mygateway1.ar7.domain > 192.168.1.2.53570: [udp sum ok] 64254- q: AAAA? ipv6.l.google.com. 1/0/0 [|domain]
0x0000:  4500 003f 0000 4000 4011 b75a c0a8 0101  E..?..@.@..Z….
0x0010:  c0a8 0102 0035 d142 002b 44d7 fafe 8100  …..5.B.+D…..
0x0020:  0001 0001 0000 0000 0469 7076 3601 6c06  ………ipv6.l.
0x0030:  676f 6f67 6c65 0363 6f6d 0000 1c00 01    google.com…..

Looking further into this issue, I found the following posts:

As @Yafsec told me, it is suggested that gethostbyname() should not be used anymore, and that getaddrinfo() or getipnodebyname() should be used instead. My best guess from all this is that my friend’s router uses gethostbyname(), and that he should buy a new router.
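
For comparison, here is a minimal sketch of a lookup based on getaddrinfo(), using Python's standard socket module. Each result carries its own address family, so an IPv6 address is never squeezed into an IPv4-sized field. The helper name resolve_all is made up for illustration, and a numeric literal is used so that the example runs without a DNS server; a hostname can be passed in the same way.

```python
import socket

def resolve_all(host, port=80):
    """Return (family_name, address) pairs for every result getaddrinfo() reports."""
    results = []
    for family, _type, _proto, _canonname, sockaddr in socket.getaddrinfo(
        host, port, family=socket.AF_UNSPEC, type=socket.SOCK_STREAM
    ):
        # sockaddr[0] is the address text for both AF_INET and AF_INET6.
        results.append((family.name, sockaddr[0]))
    return results

print(resolve_all("127.0.0.1"))  # [('AF_INET', '127.0.0.1')]
```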

The curious case of 42.0.20.80 is now solved, but questions come to mind:

  • How many routers currently in operation have this bug?
  • How often does misinterpretation of IPv6[0-3] as IPv4 take place?
  • Could this bug be abused?
    • Improbably, to collect credentials/cookies for Google services, 42.0.20.80 could host spoofed versions of Google/Youtube/etc. To avoid detection, it would answer on port 80/443 only during a short time window and/or only to specific IP ranges. (Password re-use, misuse value of private e-mail communications, yada yada yada.)
  • What other services (besides Google) run IPv6 and could have users experiencing this?

EOF

AOC Professional Reading List 2012

The Association of Old Crows (AOC) surveyed Crows on their favorite books to form an “AOC Professional Reading List”. The results are in. Here are the Top 5 EW Books, Top 3 IO Books and Top 2 Great Reads:

  1. EW 101, EW 102, and EW 103 – Dave Adamy
  2. Introduction to Radar Systems – Merrill Skolnik
  3. Introduction to Airborne Radar – George W. Stimson
  4. Electronic Warfare in the Information Age – D. Curtis Schleher
  5. Electronic Intelligence: The Analysis of Radar Signals – Richard G. Wiley
  6. Information Warfare: Principles and Operations – Edward Waltz
  7. Information Operations – Doctrine and Practice: A Reference Handbook (Contemporary Military, Strategic, & Security Issues) – Christopher Paul
  8. Information Operations: The Hard Reality of Soft Power – Edwin L. Armistead
  9. Most Secret War – Reginald V. Jones
  10. Deep Black – William Burrows

I shamelessly ripped the above list from eCrow and added hyperlinks myself. I’m aware not everyone is a fan of Amazon but it has one of the lowest probabilities of removing or changing URLs. After all: cool URIs don’t change!

Dutch Govt Expresses Intent To Draft New Cybercrime Legislation

UPDATE 2015-12-22: and here they are: the new cybercrime bill and MoU (in Dutch) as submitted by the cabinet to the House. Notably, the cabinet cancelled compelled decryption because of the right not to self-incriminate (nemo tenetur principle). Thus, the final bill, which will be discussed in the House, does not contain a power for LE to compel suspects of certain “very serious criminal offenses” to decrypt their data under penalty of three years’ imprisonment or a fine of up to ~20k euro.

UPDATE 2015-06-11: it is reported that the cabinet will submit the proposal after the parliamentary summer break of 2015, which ends on August 31st 2015.

UPDATE 2012-10-18: legal expert Jan-Jaap Oerlemans blogged about the cross-border remote search that is proposed in this letter. Recommended read!

On October 15th 2012, the Dutch Minister of Security & Justice (Ivo Opstelten) sent this letter (.pdf in Dutch) to the Dutch parliament expressing intentions to draft new cybercrime legislation in the Netherlands.

Below is my Dutch-to-English translation of the entire letter. Hyperlinks and parts between [] are mine (note: the parts between () are from the original letter). I translated as neutrally and objectively as I could. I welcome your corrections/improvements at koot at cyberwar dot nl.

WARNING: this is an unofficial translation.

Date: October 15th 2012
Subject: Cybercrime legislation

By submitting this letter I fulfill my promise to send a message to you, the Parliament, concerning the inventory I made of necessary, new criminal investigative powers on the internet.

Summary

This letter proposes, within the framework of the rule of law, proportionality, subsidiarity and respect for the privacy of citizens, legislative elaborations of a number of issues to strengthen the powers in the investigation and prosecution of cybercrime. The aim of this new legislation is to tune the legal framework to the needs brought forward by the services that are responsible for the investigation and prosecution of cybercrime. Based on practical experiences and wishes, such as appeared in the recent Cyber Security Assessments Netherlands [aka Cyber Security Report] of 2011 [.pdf, English] and 2012 [.pdf, English] and my letter of December 23rd 2011 to you, the Parliament, about the legal framework for cybersecurity, this concerns the following topics:

  • Remote entry of automated works (=computers) and the placement of technical means (such as software) for the purpose of investigation of severe forms of cybercrime;
  • Remote search of data that is accessible from an automated work (=computer), regardless of the location of the automated work on which the data is stored and taking into consideration agreements and rules of international legal assistance;
  • Remotely making data inaccessible that is accessible from an automated work (=computer), regardless of the [geographical] location of the automated work on which the data is stored and taking into consideration agreements and rules of international legal assistance;
  • Criminalization of the trade in stolen (digital) data.

The structure of this letter is as follows: first, several introductory remarks are made (paragraph 1), then, the above topics are elaborated (paragraph 2). Next, conditions for the exercise of these investigative powers are discussed (paragraph 3) and international developments are outlined (paragraph 4). Finally, I indicate what next steps are being discussed (paragraph 5).

1. Introductory remarks

IT applications play an increasingly important role in daily life. The current situation is that the number of cybercrimes is increasing and the capacity, knowledge and experience within the criminal justice system do not keep pace. Our national and international possibilities to act against it are further decreasing as a result of its cross-border nature and the emergence of so-called cloud computing. It also appears that industry self-regulation is malfunctioning and that offenses that could be prevented through better and earlier technical measures often still occur. A burning issue is that it has become very complicated to trace criminal activity on the Internet because it is relatively easy for criminals to prevent their digital tracks from being monitored, for example by the use of software to encrypt data and delete the communication paths. The investigations of the High Tech Crime Team [THTC] of the National Police Services Agency [KLPD] confirm this. In the investigation of child pornography on the Tor network, the team found that through the use of this network it is possible to view, download or upload child pornography images onto servers, without the identity of the suspect being visible. Furthermore, in several places, including the servers that were found, encryption was used. In another investigation of a large botnet that was being used to commit many crimes, the THTC found that the owner of the botnet could move his data around the world easily and very fast using a few keystrokes on his computer, which severely hindered or rendered impossible figuring out where the data was located on servers. I believe that countermeasures of this kind by suspects against investigation ought not to be successful. Crimes that are committed must be detected and perpetrators must be prosecuted. Society expects this from the government.

Data whose geographical location cannot be established
The police and Public Prosecution Service expressed their practical need for a broadening of the legal possibilities to act, so that the desired and agreed-upon investigative and prosecution performance can be delivered. The police currently attempts to compensate for the narrow legal possibilities to investigate on the internet. For example, the police has copied the content of the servers on the aforementioned Tor network containing images of severe sexual abuse of children and then destroyed it or rendered it inaccessible. At that time, the exact location of the servers could not be determined with certainty because the communication path had been obscured. The result of this approach in this case is that the copied data can now be used for (international) investigation and that access to those images is no longer possible via these servers. In this specific example, the Public Prosecution Service and the police made a decision in favor of acting against child pornography on the internet. A similar decision will need to be made in the future in acting against, for example, botnets. I believe it is necessary to update legislation so that it provides the police and Public Prosecution Service with a solid base to perform their necessary work in investigation and prosecution on the internet.

Mobile internet use

The current investigative powers for acting against cybercrime largely assume that computers have a fixed location and that digital data is stored on a single, individual computer. Meanwhile, the digital world has significantly changed. Because of that, these powers are no longer sufficient. In this context, the possibilities of modern mobile computers, such as smartphones and increasingly tablets, and the ways in which they are used, can be pointed out. These new forms of mobile computers can be continuously connected to the internet and be used for many forms of cybercrime. In addition, they are frequently used by criminals for their collaborative communication. Obfuscation of this communication is increasingly seen. In part due to the use of cloud computing, it will be increasingly difficult for investigators to figure out where the data of a certain smartphone or tablet is located at a certain time, while it remains uncertain how long the data will remain stored there and thereby traceable. I believe that the investigative powers for acting against cybercrime ought to be designed so that these are practical and effective in the current digital world of mobile equipment and cloud computing. For the possibilities of digital investigation, it should not matter where an automated work is located at the time of carrying out the desired investigative actions. According to international law, (digital) investigative actions on foreign terrain can only take place via international legal assistance. But as shown by the above examples, it will not always be possible to determine where data is located. If that is the case, the police and Public Prosecution Service must be able to continue their investigation under the conditions outlined below.

2. Elaboration of the aforementioned proposed legislation

Below, the proposed legislation that I announced above will be explained further.

2.1.  Remote entry of automated works (=computers) and the placement of technical means (such as software) for the purpose of investigation of severe forms of cybercrime.
Paragraph 1 described the development toward more mobile Internet usage. It also raised the increasing use of encryption on computers. Police and the Public Prosecution Service indicate that various forms of crime exist that are hidden from their sight because they do not have the power to invade a computer. Article 125i of the Dutch Code of Criminal Procedure offers a framework for the power to search a place to record data that are stored or recorded at that place on a data carrier. From parliamentary history it can be inferred that it is not permitted that an automated work is penetrated remotely for the purpose of investigation of serious forms of cybercrime. This concerns both remote entering for the purpose of wiretapping confidential communication and remote entering for the purpose of searching an automated work. In order to get access to this data for the purpose of investigation of serious forms of cybercrime, it is necessary that software can be secretly installed that allows the encryption of the data to be undone or circumvented.
Partly in the light of technological developments, a statutory power should be established for remotely penetrating an automated work, concerning the above purposes. The changed circumstances warrant the inclusion in the Dutch Code of Criminal Procedure of a specific power of remote intrusion into an automated work for the investigation of serious forms of cybercrime.

2.2. Remote search of data that is accessible from an automated work (=computer), regardless of the location of the automated work on which the data is stored and taking into consideration agreements and rules of international legal assistance.
In paragraph 1 I provided the example of a botnet where the criminal was able to move his data around the world very fast. This is increasingly common. Criminals know that the police are attempting to access their networks and data, and they take measures against that. Usually, the data is moved around the internet (globally) very fast or the paths to the data are changed. Criminal groups also often take measures to detect whether third parties, including the police, are attempting to access their files. When they detect such signals or suspect such attempts, they move their files as fast as possible and don’t hesitate to act against intruders using digital means. These technological developments make it difficult to determine the location of the stored data, all the more so because that location changes often. Where data used to be stored on one’s own computer or on a separate data carrier, data is now stored via the internet on a foreign server or in the cloud. The starting point is that powers of criminal investigation can only be exercised on one’s own territory. To carry out investigative actions on the territory of another state, international legal assistance is required. The reverse also applies: if a foreign state wants to carry out investigative actions on Dutch territory, it also requires official legal assistance (article 552h in the Dutch Code of Criminal Procedure). However, the time delay incurred by this often works against the investigation and limits the effectiveness of official legal assistance.
The Cybercrime Convention of the Council of Europe has a provision on remote access to computer data regardless of the location of that data (article 32). This access is limited to publicly accessible data and to other data on the condition of consent of the rightful claimant. The Cybercrime Convention does not have provisions on gathering data that is not publicly accessible without the consent of the rightful claimant, meaning that official legal assistance is required. But, as argued above, in the remote search of computers it is in practice not always possible to determine the location of the data. A request for official legal assistance is impossible in that case. From the perspective of effective investigations it is of vital importance that data can be retrieved regardless of the location where it is stored. Therefore, the police and the Public Prosecution Service have insisted on relevant legislation. In the legislation that I have in mind, I use the following principles. If the location of the data is known, and the data is located on a foreign server, a request for legal assistance is the appropriate route. If the location of the stored data is not known, it should be possible to search the data and take it over for the purpose of obtaining evidence.

The Belgian Code of Criminal Procedure also stipulates that during the search of an automated work, data can be taken over. When it turns out that the data is not located on Belgian territory, the data is only copied and the foreign state is notified.

2.3. Remotely making data inaccessible that is accessible from an automated work (=computer), regardless of the [geographical] location of the automated work on which the data is stored and taking into consideration agreements and rules of international legal assistance.
A special aspect is the possibility of rendering inaccessible any data that is found during the remote search of an automated work. In the Netherlands, the possibility currently exists that, when a place is entered to record data that is stored on a data carrier at that place, and when the data is the subject of a crime or is used for committing a crime (such as child porn), the data is rendered inaccessible to end the crime (article 125o of the Dutch Code of Criminal Procedure). Following that, it is desirable that, when the power to remotely intrude into an automated work is introduced, a power is also created to render such data inaccessible. After all, it is possible that child porn is found during a remote search. This was the case during the aforementioned investigation that the THTC carried out on child pornographic images on servers in the Tor network, where the police found very harmful pornographic material that was stored in encrypted form on a server. In the absence of knowledge about where the data is stored, it is impossible to request legal assistance. Nobody can be addressed in that case, while the crime continues. The severity of the crimes can require that the data is immediately rendered inaccessible. This can entail that the data is erased. I therefore believe it is desirable to establish a legal power to render inaccessible or erase data that is found during remote searches of an automated work, modelled on the provisions of article 125o of the Dutch Code of Criminal Procedure. Here, again, it applies that if the location of the data is known, a request for legal assistance must be addressed to the authorities of the foreign state.

2.4. Criminalization of the trade in stolen (digital) data.

Offenses are committed on the internet in which data is gathered, via hacking or other means, that is of interest to third parties for use in crime. Examples of this are personal data in databases that have been compromised, which can then be used to, for example, buy goods on the internet. Also, credit card data that is gathered via phishing is offered and sold on the internet. Although, in the latter example, the use of this data to make credit cards is already punishable by law, the holding, transferring and buying of this data is not punishable. This complicates investigations. The requirement to wait until the data is actually used to commit crimes means that it is not possible to act to prevent crimes. That is certainly not reassuring to citizens and in fact sends a bad signal, because it suggests that this form of trade in stolen items is permissible when it takes digital form. The trade in or selling of such data has developed into a separate form of crime in its own right.
That trade in stolen data is currently not punishable is related to the fact that computer data, based on jurisprudence, can only be considered in specific circumstances to be goods in the meaning of articles 310 and 416 of the Dutch Criminal Code. This applies when the data is no longer at the disposal of the holder and represents economic trade value. From this it follows that copying the holder’s data is not punishable, because the holder retains disposal of the data. I believe that it is unacceptable for the involved victims that the current legislation results in unwanted gaps in cyberspace, and I think it is desirable to make these offenses punishable.

3. Conditions for exercise of investigative powers

The investigative powers described in paragraphs 2.1 to 2.3 must be surrounded with strict safeguards. The power to search a place to record data that is stored or recorded on storage media, based on article 125i of the Dutch Code of Criminal Procedure, is assigned to the examining judge, the public prosecutor, the assistant public prosecutor and the investigating officer. For various other powers it is specifically provided that the examining judge and the public prosecutor are authorized (for example articles 125la, 125n third paragraph and 125o first paragraph of the Dutch Code of Criminal Procedure). However, given the degree of intrusiveness of the legal powers of remote intrusion into an automated work and the installation of technical devices for the detection of serious forms of cybercrime, especially considering the infringement on the right to respect for the privacy of persons, authorization of the examining judge must at all times be obtained prior to the use of the power. Also, the power can only be exercised in cases of suspected offenses of a certain gravity, for example offenses for which custody is provided or that carry a maximum imprisonment of four years or more.
Furthermore, of course, the general requirement applies that a report must be made when this power is exercised. In addition, all transactions occurring during the exercise of these powers are automatically logged and stored and are thus always accessible and verifiable afterwards.

4. International developments

I have already informed you that the Netherlands firmly contributes at the international level to the further development of the international framework, especially within the context of the Council of Europe. The Netherlands is a member of both the Convention Committee (of which all Treaty Parties are members) and the Agency (an elected body within the Convention Committee) associated with the Cybercrime Convention of the Council of Europe. In that context, we contribute to the active recruitment of new members of this Convention. Meanwhile, 33 countries have acceded to the Convention (of which 17 have ratified it), including 2 non-European countries (the United States and Japan). Partly at the instigation of the Netherlands, a debate started in 2010 on the scope of article 32 of the Convention that was mentioned above. I think it is of great importance that any cross-border investigative powers are secured internationally. This is a process that will take many years. The Netherlands will continue to monitor this. In the meantime, I choose to already set in motion the improvements to combat cybercrime in the Netherlands.

5. Next steps

The coming months will be used, together with the police, prosecutors and other relevant stakeholders, to elaborate these proposals further and prepare a draft bill. I am convinced that this catching-up is necessary to strengthen the investigation and prosecution of cybercrime.

The Minister of Security and Justice,

I.W. Opstelten

EOF