UPDATE 2013-09-xx: slides (.pdf, Sep 2013) made and presented by @Yafsec at BruCON 2013.
UPDATE 2013-03-10: everything is caused by this bug in dproxy, a caching DNS proxy that runs on the Conceptronic C54APRB2+ router. Tip of the hat to the anonymous commenter who suggested this!
UPDATE 2012-12-27: here is a small Python script written by @Yafsec that, given a hostname, shows how gethostbyname() would misinterpret resolved IPv6 addresses as IPv4 addresses.
A friend told me that his computer was periodically unable to connect to Google while still being able to connect to other websites. I was recently at his place when the problem occurred again, and I decided to take a look. I started tcpdump, pointed Firefox at www.google.com and observed his system sending SYN packets to tcp/80 and tcp/443 at IPv4 address 42.0.20.80. Indeed, on my friend’s system, the host command resolved www.google.com to that address:
$ host -t a www.google.com
www.google.com has address 42.0.20.80
A whois query revealed that 42.0.20.80 is not part of Google’s address space, but of address space allocated to China Telecom:
inetnum: 42.0.16.0 - 42.0.23.255
netname: CHINANET-GD
descr: CHINANET Guangdong province network
descr: Data Communication Division
descr: China Telecom
country: CN
admin-c: CH93-AP
tech-c: IC83-AP
status: ALLOCATED PORTABLE
notify: […redacted…]
remarks: service provider
changed: […redacted…] 20110412
mnt-by: APNIC-HM
mnt-lower: MAINT-CHINANET-GD
mnt-irt: IRT-CHINANET-CN
source: APNIC
[…]
A search engine query for 42.0.20.80 revealed a few online messages, posted in 2009, 2010, 2011 and 2012, that mention 42.0.20.80 in varying contexts involving connectivity problems. The address is associated with (at least) the following hostnames, and has been observed on various operating systems:
- talk.google.com – here
- www.google.com – here
- dl-ssl.google.com – here
- v8.lscache4.c.youtube.com – here
- www.picasaweb.google.com – here
Next, I asked on Twitter:
What’s up with @Google domains incidentally resolving to 42.0.20.80, owned by China Telecom (Guangdong)? Is that bonafide?
@Yafsec (Edwin van Andel) replied:
@mrkoot If the resolver uses gethostbyname, it expects ipv4. When on ipv6 it apparently uses the first 4 bytes of the ipv6 address as ipv4.
Indeed, an AAAA record exists for www.google.com, as it does for many other Google domains:
$ host -t aaaa www.google.com
www.google.com has IPv6 address 2a00:1450:4013:c00::63
...but only now did I notice that the first four bytes of that address, 2a00:1450 (0x2a 0x00 0x14 0x50), read as 42.0.20.80 when written in dotted-decimal notation!
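The arithmetic can be checked with a few lines of Python. This is only a minimal sketch using the standard ipaddress module; the function name first_four_bytes_as_ipv4 is made up for illustration and is not taken from @Yafsec's script.

```python
import ipaddress


def first_four_bytes_as_ipv4(ipv6_text: str) -> str:
    """Reinterpret the first four bytes of an IPv6 address as an IPv4 address."""
    packed = ipaddress.IPv6Address(ipv6_text).packed  # 16 raw bytes
    return str(ipaddress.IPv4Address(packed[:4]))     # keep only bytes 0-3


# The AAAA record that host returned for www.google.com:
print(first_four_bytes_as_ipv4("2a00:1450:4013:c00::63"))  # 42.0.20.80
```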
After further testing, my suspicion shifted to the router: a Conceptronic C54APRB2+ that runs firmware dating back to 2008 (no firmware update is available). Eventually I learned that the problem can be reproduced with the help of ipv6.l.google.com. The first step is to run a lookup for A records for that label (which do not exist):
$ host -t a ipv6.l.google.com
ipv6.l.google.com has no A record
15:41:11.460855 IP (tos 0x0, ttl 64, id 50854, offset 0, flags [none], proto UDP (17), length 63)
192.168.1.2.51965 > mygateway1.ar7.domain: [udp sum ok] 50896+ A? ipv6.l.google.com. (35)
0x0000: 4500 003f c6a6 0000 4011 30b4 c0a8 0102 E..?....@.0.....
0x0010: c0a8 0101 cafd 0035 002b 1a4c c6d0 0100 .......5.+.L....
0x0020: 0001 0000 0000 0000 0469 7076 3601 6c06 .........ipv6.l.
0x0030: 676f 6f67 6c65 0363 6f6d 0000 0100 01 google.com.....
15:41:11.484512 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 113)
mygateway1.ar7.domain > 192.168.1.2.51965: [udp sum ok] 50896 q: A? ipv6.l.google.com. 0/1/0 ns: l.google.com. [1m] SOA ns1.google.com. dns-admin.google.com. 1507162 900 900 1800 60 (85)
0x0000: 4500 0071 0000 4000 4011 b728 c0a8 0101 E..q..@.@..(....
0x0010: c0a8 0102 0035 cafd 005d 862e c6d0 8180 .....5...]......
0x0020: 0001 0000 0001 0000 0469 7076 3601 6c06 .........ipv6.l.
0x0030: 676f 6f67 6c65 0363 6f6d 0000 0100 01c0 google.com......
0x0040: 1100 0600 0100 0000 3c00 2603 6e73 31c0 ........<.&.ns1.
0x0050: 1309 646e 732d 6164 6d69 6ec0 1300 16ff ..dns-admin.....
0x0060: 5a00 0003 8400 0003 8400 0007 0800 0000 Z...............
0x0070: 3c <
The second step is to run a lookup for AAAA records for that label, which yields an IPv6 address (the last 16 bytes of the response shown below are the record data of the DNS ‘answer section’ and hold the IPv6 address 2a00:1450:400c:c00::68):
$ host -t aaaa ipv6.l.google.com
ipv6.l.google.com has IPv6 address 2a00:1450:400c:c00::68
15:41:11.504818 IP (tos 0x0, ttl 64, id 24467, offset 0, flags [none], proto UDP (17), length 63)
192.168.1.2.62353 > mygateway1.ar7.domain: [udp sum ok] 9696+ AAAA? ipv6.l.google.com. (35)
0x0000: 4500 003f 5f93 0000 4011 97c7 c0a8 0102 E..?_...@.......
0x0010: c0a8 0101 f391 0035 002b 77a8 25e0 0100 .......5.+w.%...
0x0020: 0001 0000 0000 0000 0469 7076 3601 6c06 .........ipv6.l.
0x0030: 676f 6f67 6c65 0363 6f6d 0000 1c00 01 google.com.....
15:41:11.528881 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 91)
mygateway1.ar7.domain > 192.168.1.2.62353: [udp sum ok] 9696 q: AAAA? ipv6.l.google.com. 1/0/0 ipv6.l.google.com. [5m] AAAA 2a00:1450:400c:c00::68 (63)
0x0000: 4500 005b 0000 4000 4011 b73e c0a8 0101 E..[..@.@..>....
0x0010: c0a8 0102 0035 f391 0047 cca2 25e0 8180 .....5...G..%...
0x0020: 0001 0001 0000 0000 0469 7076 3601 6c06 .........ipv6.l.
0x0030: 676f 6f67 6c65 0363 6f6d 0000 1c00 01c0 google.com......
0x0040: 0c00 1c00 0100 0001 2c00 102a 0014 5040 ........,..*..P@
0x0050: 0c0c 0000 0000 0000 0000 68 ..........h
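To double-check which bytes of this response carry the address, the hex dump can be decoded by hand. The sketch below assumes that the answer record is the last thing in the packet, which holds for this capture because the response contains a single answer and no authority or additional records. The variable names are illustrative.

```python
import ipaddress

# Hex dump of the AAAA response above (IP header + UDP header + DNS message).
PACKET_HEX = """
4500 005b 0000 4000 4011 b73e c0a8 0101
c0a8 0102 0035 f391 0047 cca2 25e0 8180
0001 0001 0000 0000 0469 7076 3601 6c06
676f 6f67 6c65 0363 6f6d 0000 1c00 01c0
0c00 1c00 0100 0001 2c00 102a 0014 5040
0c0c 0000 0000 0000 0000 68
"""

packet = bytes.fromhex("".join(PACKET_HEX.split()))

# Assumption: the single AAAA answer ends the packet, so its 16-byte
# record data is simply the last 16 bytes.
rdata = packet[-16:]
aaaa = ipaddress.IPv6Address(rdata)
truncated = ipaddress.IPv4Address(rdata[:4])

print(aaaa)       # 2a00:1450:400c:c00::68
print(truncated)  # 42.0.20.80
```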
The third and last step is to repeat the first lookup and establish that it now “succeeds”: an A record (which does not actually exist) is returned with IPv4 address 42.0.20.80:
$ host -t a ipv6.l.google.com
ipv6.l.google.com has address 42.0.20.80
15:41:11.550621 IP (tos 0x0, ttl 64, id 15024, offset 0, flags [none], proto UDP (17), length 63)
192.168.1.2.50255 > mygateway1.ar7.domain: [udp sum ok] 45480+ A? ipv6.l.google.com. (35)
0x0000: 4500 003f 3ab0 0000 4011 bcaa c0a8 0102 E..?:...@.......
0x0010: c0a8 0101 c44f 0035 002b 3622 b1a8 0100 .....O.5.+6"....
0x0020: 0001 0000 0000 0000 0469 7076 3601 6c06 .........ipv6.l.
0x0030: 676f 6f67 6c65 0363 6f6d 0000 0100 01 google.com.....
15:41:11.555103 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 79)
mygateway1.ar7.domain > 192.168.1.2.50255: [udp sum ok] 45480- q: A? ipv6.l.google.com. 1/0/0 ipv6.l.google.com. [2h46m40s] A 42.0.20.80 (51)
0x0000: 4500 004f 0000 4000 4011 b74a c0a8 0101 E..O..@.@..J....
0x0010: c0a8 0102 0035 c44f 003b 42db b1a8 8100 .....5.O.;B.....
0x0020: 0001 0001 0000 0000 0469 7076 3601 6c06 .........ipv6.l.
0x0030: 676f 6f67 6c65 0363 6f6d 0000 0100 01c0 google.com......
0x0040: 0c00 0100 0100 0027 1000 042a 0014 50 .......'...*..P
So, it appears that the router stored the first four bytes of the just-received IPv6 address and now answers the A lookup from its cache.
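The bogus answer record can be unpacked field by field to see what the router actually sent. This is a rough sketch: the 16 bytes below are the tail of the response above (the answer record), and the field layout is the standard DNS resource record format with a compressed name pointer. The variable names are mine.

```python
import ipaddress
import struct

# Last 16 bytes of the A response above: the single answer record.
ANSWER_HEX = "c00c 0001 0001 0000 2710 0004 2a00 1450"
answer = bytes.fromhex("".join(ANSWER_HEX.split()))

# name pointer (2), type (2), class (2), TTL (4), RDLENGTH (2), RDATA (4)
name_ptr, rtype, rclass, ttl, rdlength, rdata = struct.unpack("!HHHIH4s", answer)

address = str(ipaddress.IPv4Address(rdata))
hours, rest = divmod(ttl, 3600)
minutes, seconds = divmod(rest, 60)

print(f"type={rtype} class={rclass} rdlength={rdlength}")  # type=1 class=1 rdlength=4
print(f"ttl={ttl}s ({hours}h{minutes}m{seconds}s)")        # ttl=10000s (2h46m40s)
print(address)                                             # 42.0.20.80
```

Note that the record is well-formed as an A record (type 1, four bytes of data); only its content is wrong. Its TTL of 10000 seconds also differs from the 5 minutes tcpdump showed for the genuine AAAA answer.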
A second observation is that after the initial AAAA lookup, repeat AAAA lookups break: the response header announces one answer, but the message contains no DNS ‘answer section’ at all, so it is malformed:
$ host -t aaaa ipv6.l.google.com
;; Warning: Message parser reports malformed message packet.
ipv6.l.google.com has no AAAA record
15:41:54.578422 IP (tos 0x0, ttl 64, id 60328, offset 0, flags [none], proto UDP (17), length 63)
192.168.1.2.53570 > mygateway1.ar7.domain: [udp sum ok] 64254+ AAAA? ipv6.l.google.com. (35)
0x0000: 4500 003f eba8 0000 4011 0bb2 c0a8 0102 E..?....@.......
0x0010: c0a8 0101 d142 0035 002b c4d8 fafe 0100 .....B.5.+......
0x0020: 0001 0000 0000 0000 0469 7076 3601 6c06 .........ipv6.l.
0x0030: 676f 6f67 6c65 0363 6f6d 0000 1c00 01 google.com.....
15:41:54.583505 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 63)
mygateway1.ar7.domain > 192.168.1.2.53570: [udp sum ok] 64254- q: AAAA? ipv6.l.google.com. 1/0/0 [|domain]
0x0000: 4500 003f 0000 4000 4011 b75a c0a8 0101 E..?..@.@..Z....
0x0010: c0a8 0102 0035 d142 002b 44d7 fafe 8100 .....5.B.+D.....
0x0020: 0001 0001 0000 0000 0469 7076 3601 6c06 .........ipv6.l.
0x0030: 676f 6f67 6c65 0363 6f6d 0000 1c00 01 google.com.....
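Why the parser complains can be shown by walking the DNS message by hand: the header claims one answer, but no bytes are left after the question. The sketch below works on the DNS payload of the response above (the bytes after the 20-byte IP header and 8-byte UDP header). The helper names skip_name and answers_vs_remaining are made up for this example.

```python
import struct

# DNS payload of the malformed AAAA response above (35 bytes).
DNS_HEX = """
fafe 8100 0001 0001 0000 0000 0469 7076 3601 6c06
676f 6f67 6c65 0363 6f6d 0000 1c00 01
"""
message = bytes.fromhex("".join(DNS_HEX.split()))


def skip_name(msg: bytes, offset: int) -> int:
    """Return the offset just past a (possibly compressed) DNS name."""
    while True:
        length = msg[offset]
        if length == 0:             # root label ends the name
            return offset + 1
        if length & 0xC0 == 0xC0:   # compression pointer is two bytes long
            return offset + 2
        offset += 1 + length        # ordinary label: length byte + label


def answers_vs_remaining(msg: bytes):
    """Return (ANCOUNT from the header, bytes left after the question section)."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack("!6H", msg[:12])
    offset = 12
    for _ in range(qdcount):
        offset = skip_name(msg, offset) + 4  # QTYPE and QCLASS follow the name
    return ancount, len(msg) - offset


print(answers_vs_remaining(message))  # (1, 0): one answer promised, none present
```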
Looking further into this issue, I found the following posts:
- [libc6] gethostbyname fails on IPv6 addresses (December 2007)
- Small Dropbear – Does your program use gethostbyname() ? (June 2010)
As @Yafsec told me, gethostbyname() should no longer be used; getaddrinfo() or getipnodebyname() should be used instead. My best guess from all this is that my friend’s router uses gethostbyname(), and that he should buy a new router.
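For comparison, this is roughly what an address-family-aware lookup looks like from Python, whose socket.getaddrinfo() wraps the C function of the same name. It is a sketch of the client-side API only, and the helper name resolve_by_family is hypothetical. It does not protect a client against a resolver that hands out a bogus A record, as the router did here.

```python
import socket


def resolve_by_family(host: str, port: int = 80) -> dict:
    """Resolve host with getaddrinfo() and group the addresses by family."""
    labels = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    results = {"IPv4": [], "IPv6": []}
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        label = labels.get(family)
        # sockaddr[0] is the address string for both IPv4 and IPv6 results.
        if label and sockaddr[0] not in results[label]:
            results[label].append(sockaddr[0])
    return results


# Example call (needs working DNS): resolve_by_family("www.google.com")
print(resolve_by_family("127.0.0.1"))
```

Each result carries its own address family, so an IPv6 address is never squeezed into a four-byte IPv4 field.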
The curious case of 42.0.20.80 is now solved, but questions come to mind:
- How many routers currently in operation have this bug?
- How often are the first four bytes of an IPv6 address (IPv6[0-3]) misinterpreted as an IPv4 address?
- Could this bug be abused?
  - Improbably, to collect credentials/cookies for Google services, 42.0.20.80 could host spoofed versions of Google/Youtube/etc. To avoid detection, it would answer on port 80/443 only during a short time window and/or only to specific IP ranges. (Password re-use, misuse value of private e-mail communications, yada yada yada.)
- What other services (besides Google) run IPv6 and could have users experiencing this?
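One way to start answering the first two questions is to compare the A and AAAA answers a resolver gives for the same name, and to flag any IPv4 address that equals the first four bytes of one of the IPv6 addresses. The following is only a heuristic sketch. The function name suspicious_a_records is hypothetical, and the address lists are assumed to have been collected separately (for instance from host output).

```python
import ipaddress


def suspicious_a_records(a_records, aaaa_records):
    """Return the IPv4 addresses that match the first 4 bytes of any IPv6 address."""
    prefixes = {ipaddress.IPv6Address(a).packed[:4] for a in aaaa_records}
    return [a for a in a_records if ipaddress.IPv4Address(a).packed in prefixes]


# Values observed in this post for www.google.com:
print(suspicious_a_records(["42.0.20.80"], ["2a00:1450:4013:c00::63"]))
# ['42.0.20.80']
```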
EOF