Shortly after Leopard was released we began receiving a few reports of issues, typically a spinning color wheel on startup, that affected a small number of users. We were unable to replicate the problem in-house and eventually determined it was something specific to the user’s network connection (since they could often get it to work on the same machine from work but not home, or vice versa). With some additional help and debugging from affected users, we were able to track the problem down to changes in DNS resolution in Leopard. We’re documenting the problem here to help Leopard users who may be having problems with other applications, as well as application developers or ISPs who may start getting complaints.
Note: the rest of this post is technical info. For users on Leopard having trouble with Jungle Disk, you can download the latest Beta version for a fix. This problem only affects some users with old or misconfigured DNS servers – most Leopard users should be fine already. We’ll be releasing a non-beta version shortly incorporating the change.
Summary
The DNS resolver in Leopard has been changed to first attempt SRV requests for lookups initiated by the getaddrinfo() function. If the user’s DNS server drops these requests the DNS lookup may take an extended period of time to complete (30 seconds to several minutes) as Leopard tries different domain requests and eventually falls back to making an A record request. This can result in application freezes or timeouts, as was occurring with Jungle Disk.
Details
There are two primary BSD APIs for domain name resolution – gethostbyname() and getaddrinfo(). getaddrinfo() is the more robust API, with better support for things like IPv6. Jungle Disk uses libcurl for HTTP access (as do many other apps), which by default uses getaddrinfo() when compiled with IPv6 support.
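As a rough illustration of the two APIs, here is a minimal Python sketch (Python is used only for illustration; Jungle Disk itself goes through libcurl). It times the same lookup through socket.gethostbyname_ex() and socket.getaddrinfo(). The assumption here is that these reach the platform's gethostbyname() and getaddrinfo() families respectively, which is an implementation detail of the Python runtime. The function name resolve_both is hypothetical.

```python
import socket
import time


def resolve_both(host, port):
    """Resolve `host` through both resolver entry points and time each call.

    Returns a dict mapping the API name to (addresses, elapsed_seconds).
    """
    results = {}

    # Older-style lookup: IPv4 only, takes no port or service argument.
    start = time.monotonic()
    _name, _aliases, addresses = socket.gethostbyname_ex(host)
    results["gethostbyname_ex"] = (addresses, time.monotonic() - start)

    # Newer-style lookup: takes a port, so the resolver knows which
    # service is being asked for.
    start = time.monotonic()
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = [info[4][0] for info in infos]
    results["getaddrinfo"] = (addresses, time.monotonic() - start)

    return results
```

Calling resolve_both("s3.amazonaws.com", 80) on an affected network would be expected to show a much larger elapsed time for the getaddrinfo entry, though the exact numbers depend on the DNS server.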
On OS X 10.4 (Tiger) both APIs perform similarly – they do an A record lookup of the provided domain. On 10.5 (Leopard) the getaddrinfo() API attempts an SRV lookup first when provided with a well-known port. For example, when performing a lookup for “s3.amazonaws.com” on port 80, you will first see requests for:
- SRV _http._tcp.s3.amazonaws.com
- SRV _http._tcp.s3.amazonaws.com.yourlocaldomain.com
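The naming pattern in the two requests above can be sketched as a small Python helper. This only reproduces the names we observed in packet captures; it is not Leopard's actual resolver logic. The port table, the function name srv_query_names, and the search_domain parameter are all hypothetical and exist only for illustration.

```python
# Hypothetical table of well-known ports; a real resolver would consult
# the system services database instead.
WELL_KNOWN_SERVICES = {22: "ssh", 80: "http", 443: "https"}


def srv_query_names(host, port, search_domain=None, proto="tcp"):
    """Return the SRV names that would be queried for host:port.

    Mirrors the observed pattern: _service._proto.host, followed by the
    same name with the local search domain appended.
    """
    service = WELL_KNOWN_SERVICES.get(port)
    if service is None:
        # Not a well-known port in our table: no SRV lookup is sketched.
        return []

    base = f"_{service}._{proto}.{host}"
    names = [base]

    # A trailing dot marks the name as fully qualified, so no search
    # domain is appended (see the trailing-dot tip in the comments).
    if search_domain and not host.endswith("."):
        names.append(f"{base}.{search_domain}")
    return names
```

For example, srv_query_names("s3.amazonaws.com", 80, "yourlocaldomain.com") yields the two names listed above.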
Depending on your DNS server, you will get one of three possible responses:
- The correct SRV response (in this case a CNAME record for s3-directional-w.amazonaws.com)
- NXDOMAIN (if your server does not support or understand SRV requests)
- No response at all
In the second case, Leopard will fall back to an A record query after the SRV requests fail, and although there is a slight delay it is not generally visible to the user. In the last case, Leopard will retry the query several times over a period of several minutes before finally falling back to an A record query. During this retry period the application will appear frozen or unresponsive (depending on which thread the lookup occurs on).
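For application developers, one general way to keep a stalled lookup from freezing the interface is to run it on a worker thread and stop waiting after a deadline. The Python sketch below shows the idea; it is not what Jungle Disk does, and the names resolve_with_deadline and timeout_seconds are hypothetical.

```python
import concurrent.futures
import socket

# Shared pool so a stalled lookup occupies a worker, not the caller's thread.
_resolver_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def resolve_with_deadline(host, port, timeout_seconds=5.0):
    """Resolve host:port on a worker thread, giving up after a deadline.

    Returns a list of address strings, or None if the resolver did not
    answer in time. Resolver errors (socket.gaierror) propagate to the caller.
    """
    future = _resolver_pool.submit(
        socket.getaddrinfo, host, port, type=socket.SOCK_STREAM
    )
    try:
        infos = future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        # The lookup keeps running in the background; we only stop waiting.
        return None
    return [info[4][0] for info in infos]
```

Note that the abandoned lookup still occupies a pool thread until the system resolver gives up, so this hides the delay from the user rather than removing it.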
It’s important to note that this problem does not appear to be a bug in Leopard – it’s caused by old, buggy, or misconfigured DNS servers. The change in Leopard to use the latest IETF recommendations for DNS lookups is simply bringing the DNS server problem to the surface. It’s unclear how many users or applications will be affected by this change, since it only appears with some DNS servers and only for applications using getaddrinfo (most applications still use gethostbyname). For many users these days, their DNS server is actually their home router, which then proxies the request (possibly with a local cache) – so updating router firmware may address the issue.
To work around the issue in Jungle Disk we’ve switched off IPv6 support in libcurl, which changes it to use the gethostbyname() function. It’s not clear whether there is a way to disable SRV lookups system-wide on Leopard to fix other applications using getaddrinfo(). Anyone with further information on this issue is encouraged to post in the comments.
If you’d like to see the behavior for yourself, try the following test on Tiger and Leopard:
- Open up two terminal windows
- In one window, run “sudo tcpdump port 53”
- In the second window, run “curl http://s3.amazonaws.com” (or another domain)
On Tiger you will see an initial request for “A? s3.amazonaws.com” and a reply. On Leopard you will see a request for “SRV? _http._tcp.s3.amazonaws.com”. Depending on your DNS server you will then see either the correct response, an NXDOMAIN response, or a series of timeouts and retries.
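If you would rather probe a DNS server directly than watch tcpdump, the check can be approximated with a short Python sketch that sends a single SRV (or A) query over UDP and reports which of the three outcomes came back. This is a minimal illustration and not a full DNS client: it does not retry, does not parse the answer records, and does not verify that the reply matches the query. The function names and the query ID are hypothetical; the record type numbers (A = 1, SRV = 33) and the NXDOMAIN response code (3) are the standard DNS values.

```python
import socket
import struct

QTYPE_A = 1          # A record
QTYPE_SRV = 33       # SRV record
RCODE_NXDOMAIN = 3   # "no such name" response code


def build_query(name, qtype, query_id=0x1234):
    """Build a minimal DNS query packet with one question, class IN."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Each label is encoded as a length byte followed by its characters.
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)
    return header + question


def classify_response(packet):
    """Summarize a DNS reply using only its 12-byte header."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    rcode = flags & 0x000F
    if rcode == RCODE_NXDOMAIN:
        return "nxdomain"
    if rcode == 0:
        return "answer" if ancount else "empty"
    return f"rcode {rcode}"


def probe(server, name, qtype, timeout_seconds=3.0):
    """Send one query to `server` on port 53 and classify the outcome."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_seconds)
        sock.sendto(build_query(name, qtype), (server, 53))
        try:
            packet, _addr = sock.recvfrom(512)
        except socket.timeout:
            return "no response"
    return classify_response(packet)
```

Calling probe(your_dns_server_ip, "_http._tcp.s3.amazonaws.com", QTYPE_SRV) and getting "no response" back, while the same call with "s3.amazonaws.com" and QTYPE_A returns "answer", would suggest the server is dropping SRV requests as described above.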
21 Comments
Rob Lewis said,
November 1, 2007 @ 10:02 am
I tried your test in Tiger Server 10.4.10.
tcpdump port 53
tcpdump: (no devices found) /dev/bpf0: Permission denied
curl http://s3.amazonaws.com
(this returned no response at all except another prompt line)
This same box is running OS X Server’s built-in DNS, which serves my LAN.
Todd Vierling said,
November 1, 2007 @ 10:30 am
The _http._tcp.s3.amazonaws.com.yourlocaldomain.com request can be skipped. DNS resolvers with BIND lineage, as with OS X, will attempt to append the local domain name if the request to gethostbyname*()/getaddrinfo() is not fully qualified.
Try changing the request from “s3.amazonaws.com” to “s3.amazonaws.com.” (note the trailing period!) and you should see the local-domain-appended request vanish from the set.
(When doing lookup for any service that relies on a domain name that is known to be fully qualified, it’s always a good idea to include the trailing dot to assert the fact that it is already fully qualified. This is not platform dependent — it works on Linux and Windows too and is a documented standard feature of DNS resolvers.)
Jungle Dave said,
November 1, 2007 @ 10:31 am
Sorry, forgot to mention that you need to run tcpdump as root. I’ll update the instructions.
Jungle Dave said,
November 1, 2007 @ 10:41 am
The local domain request isn’t really the issue, although it may slightly extend the time before a timeout occurs for DNS servers that ignore SRV lookups. Nice tip though!
Todd Vierling said,
November 1, 2007 @ 11:13 am
Removing the local domain qualified entry might also skip some buggy DNS problems and allow applications such as JD to progress to the new APIs just fine. Generally recursive resolvers don’t have a problem passing SRV records from an external domain through to the user (because they’re now already used by many applications from e-mail to Google Talk).
However, if the DNS server for the user’s yourlocaldomain.com is buggy and goes belly up on a SRV request, that would likely cause delays. I’d suspect this is the primary cause of problems with the SRV lookup, and that’s what the trailing dot fixes.
The OS Quest » The OS Quest Trail Log #13: More Leopard said,
November 4, 2007 @ 3:25 am
[...] JungleDisk.com: Leopard DNS Issues (and work-around) – Solution for issue some Jungle Disk users had with Leopard. I like Jungle Disk even if I haven’t found a reason for me to use it. [...]
JP said,
November 6, 2007 @ 5:01 pm
Check out:
http://discussions.apple.com/thread.jspa?messageID=5771082
I wonder if this is related to any of the DNS issues folks are experiencing. It appears that Leopard is generating queries with an RR type of 0, which appears invalid. Some name servers ignore this query, which causes delay on opening a connection.
If others are seeing this behavior, or if you know of a valid reason to generate a query with RR type 0, please post to the discussions thread above.
Thank you.
Leopard 10.5.1 released | Rick Tech said,
November 15, 2007 @ 4:50 pm
[...] Leopard 10.5.1 released Apple has released Leopard 10.5.1 today. The list of fixes is pretty exhaustive but by no means complete. [...]
Solution for Leopard Networking Problems | Mac Amour said,
November 24, 2007 @ 9:24 pm
[...] upgrade, I’m having a very unstable internet connection, it seems it’s a known and common issue which will probably fixed in the next Leopard’s update (a similar problem happened on [...]
End User said,
November 27, 2007 @ 3:33 pm
I am experiencing the issue with 10.4.10. Maybe it’s not just a DNS/10.5 issue.
Iwaruna.com » Upgrade to Leopard said,
November 28, 2007 @ 1:29 am
[...] at least with web browsing, grabbing syndicated content (RSS), and signing onto to chat services. DNS resolution in Leopard might be the culprit. Or, rather, Leopard implements DNS resolution in the more compliant IPv6 [...]
David Smiley said,
December 30, 2007 @ 8:38 pm
Thanks for this post… I eventually figured out that by configuring my Macs at home to use opendns.com’s DNS servers, my problems have gone away. Actually, my router was configured to use it (netgear router MR814v2 with latest firmware 5.3_05) and at first it seemed to not be sufficient, but it seems fine now. I’m on Mac OS X 10.5.1
roger pack said,
January 4, 2008 @ 8:04 pm
Yeah it appears that if you have flaky internet, leopard will cache your ‘failed’ dns entries, which means that unless you are constantly flushing your dns, your internet will be only half functional. Switching to opendns seems to help.
links for 2008-01-18 - Daily Eclecticism: Assorted Sumptuous Links Daily by Debajit Adhikary said,
January 18, 2008 @ 10:18 am
[...] Leopard DNS Issues (and work-around) [...]
Nathanael Jones said,
January 30, 2008 @ 9:43 am
I’m having a similar problem – Leopard is generating invalid SRV queries, and grinding to a halt. Your article helped me discover that it was a DNS issue.
http://discussions.apple.com/thread.jspa?threadID=1369046&tstart=15
Leopard DNS Aiport Issue - Why + Fix — The Brain of Wade said,
February 8, 2008 @ 10:02 am
[...] little bit later and it turns out Leopard’s changed “The DNS resolver to first attempt SRV requests for lookups initiated by the getaddrinfo() function.”, which is what happen to be [...]
iPhoto2Gmail 0.10 said,
February 17, 2008 @ 12:49 pm
[...] as well as more descriptive and useful error messages. This new authentication also addresses the DNS issues some users had on [...]
Why SSH is Slow to Connect on Mac OS X Leopard « Mike Brittain said,
March 30, 2008 @ 11:20 pm
[...] 30, 2008 · No Comments According to this post at Jungle Disk, my SSH connections to some servers are stalling (for nearly a minute) because of changes in the [...]
Craig Hunter said,
June 3, 2008 @ 9:12 am
Anybody notice that 10.5.3 no longer has this problem? I have been unable to find any info in the release notes or developer notes, but it appears that they reverted back to making DNS-A requests by default.
Slow Internet with Leopard | Mac OS X Leopard & Tiger Dual Boot said,
June 5, 2008 @ 4:50 am
[...] your router’s firmware (references: jungledisk.com, ubuntu.com) for better SRV and IPv6 [...]
Erik H. said,
December 28, 2008 @ 2:30 pm
I had this problem and unfortunately my router has no support for IPv6 and no newer firmware. Thus I had to bypass my router for DNS queries. Once I did this all my slow internet with OS X Leopard problems were solved. To state it simply you bypass the router’s DNS by specifying a DNS server other than your DHCP router in network preferences > (interface) > advanced > DNS.
Detailed instructions are available here: http://installingcats.com/2008/06/05/slow-internet-with-leopard/#direct_dns_better_dns
Actually I would recommend reading that entire webpage. It helped me fully understand the problem and the multiple possible solutions.