Email and DNS
As most of you know, I am a tech-weenie. I don’t often write about technical issues because I am burnt out on the industry. I’ve written hundreds of white-papers on many different subjects over the years and I basically cannot stand to deal with tech-issues in my spare time. Therefore, I stay away from the subject when I am blogging or developing the Warzone. Well, there’s one particular subject that came to light today that I thought I would share my opinion on, and that is, of course, email and DNS.
I won’t be going into the technical specifics of either. I’ll just be touching on the issue, because most of my readers aren’t system admins and I would be wasting my time and yours by getting too technical. And there’s really no reason to do so. I am just trying to explain and help you out so that you can understand what happens and why I think it happens. *I think* is a very relative term because I am not a member of an I.T. staff for one of the major ISPs. However, my deductive reasoning gives me ample insight into the industry, and I can test, from my home office, many aspects of the internal workings of commercial email servers and give you a break-down of what exactly happens when email fails to send or doesn’t get received.
Most system admins can easily start up an email server and get it working to a certain point. Most system admins end up doing this rather quickly, as it’s a primary-use tool at most companies. Some are apprenticed into the industry. Some go to college, get a B.S. in computer tech or graduate from some trade school with something like an MCSE (look it up) and then go out into the world, obtain a job and start doing their work. As most of you realize, most people lie on their resumes. They bite off more than they can chew.
As some of you may or may not know, the Internet was developed by the Military and high-end academia, starting in the late 1960’s. The folks doing the work were college professors and graduate students (very few) along with theoretical computer scientists. Eventually, the “operation” of computers was deemed suitable for the lay-person to handle and non-technical types were given the jobs of data entry. Even during the 70’s, most, if not all, of the high-end stuff was still left to the scientists that specialized in this particular field. Not until the early 80’s were folks that were not formally trained in computers delving into the programming realm. This is where I joined the legion of teenagers infatuated with computers.
Later on down the road, folks like me obtained jobs in the tech field and we started learning more and more about computers. Some of us went on to different paths. The backbone DNS servers were run on Unix-based operating systems. (DNS is the Domain Name System: it takes a name, such as whatever.com, and translates it into a number (address) that is locatable through a series of DNS servers and eventually tells everything in between where that name is found on the Internet.) It wasn’t until Windows NT 4.0 came out that anything “windows” based had much of anything to do with Internet traffic. At this time, it became evident to just about everyone involved that setting up a production web-server was easily accomplished by the lay-person, someone not formally trained in the tech field. When this happened, droves of people not formally trained in computer tech became their companies’ I.T. person. Eventually, their skills became needed in higher paying jobs and they moved on. A lot of these folks were very good at what they did. They’re still good at what they do, but their place is very remedial. They can run internal stuff “o.k.” but when they’re introduced to the outside world, they tend to be overwhelmed and start doing little things that screw up the way things work.
The very first thing that gets hosed by these people is DNS. How do I know this? Even though I’ve been hacking with computers since 1982, I did not fully understand the complexities of the TCP/IP stack (how the current Internet is run) until 1998 or so. At that time, a very good systems administrator (Steve Gielda of Cotse.com, later Cotse.net) taught me how DNS and Email are tied together to such an extreme that, if DNS is handled incorrectly, all hell breaks loose. That’s not to say that Email breaks DNS. It doesn’t. But if DNS is broken, nothing, not Email, not the web, not FTP, gopher, etc. will work properly. But especially Email because of how closely tied Email is to DNS.
Depending on how the email server is set up (and there are many ways), if DNS breaks at any level, it can directly affect email. I’ve seen this time and time again. I have gotten to the point that you can give me a basic idea of what your network is doing and I can troubleshoot it immediately, on-the-fly, without taking any true troubleshooting steps. Email can be very complicated. With the introduction of spam-filters and blacklists, it becomes even more complicated. And of course, with the introduction of huge email providers like Hotmail, Yahoo, Gmail (probably the exception to the rule because Google actually seems to know what’s going on) and others, comes the unavailability of enough trained professionals to man the posts needed to properly administer the servers. For smaller ISPs, it’s extremely tough to employ the quality and number of admins needed to properly admin email servers. (When I say smaller, you must take the actual size of Hotmail, etc. into consideration. Anything less than something that huge is smaller; even large ISPs like Time-Warner, etc. are small in comparison.) Believe this or not, there’s a rule of thumb that the bean-counters absolutely hate: There’s a 4 or 5 to 1 ratio of Microsoft servers to admins and a 7 to 10 to 1 ratio of Unix servers to admins. Meaning, a typical admin is capable of admin’ing up to around 5 Microsoft servers during their normal daily routine, or upwards of 10 or so Unix-based servers. Some of these email providers are utilizing an extreme number of servers with relatively few admins. Email and DNS are two things you cannot overlook and you cannot underman. When I say “underman”, I don’t mean quantity only. I mean quality first, quantity second. You MUST have good admins and there must be enough of them to handle the jobs assigned, and that’s just not happening.
In my area, I primarily deal with three major ISPs: Charter (now Suddenlink), Verizon DSL and Fibernet (a local company). What I’ve found is, the larger they are in my area, the worse they are at admin’ing and maintaining a stable set of DNS servers, not to mention email servers that are stable. Since 2002, I believe I have found Charter’s (aka Suddenlink) DNS and/or email servers to be down, not responding or a black hole approximately 10 times or so. The DNS problems don’t just affect mail. They also affect web traffic and the ability of customers to find websites. Without proper DNS, just about everything breaks and there’s nothing you can do about it. I’ve seen everything from a misconfigured DNS server with entries such as unknown mail servers, servers blocked by their own internal firewalls, etc. to completely unresponsive DNS servers. And when you call tech support and tell them their DNS servers are down, the first-tier tech support folks want you to “reset your cable modem”. Duh.
Most of my former clients (I am now a part-time employee of many companies) now rely on my DNS servers to take care of their DNS needs. Some know this, some do not (some aren’t tech-savvy enough to understand there’s no “any” key). However, most are using my servers because my servers are stable and I know IMMEDIATELY when they go down, rather than the intermittent problems I see with ISP DNS servers.
At any rate, this is one of the primary reasons your Internet service goes down. You’re still connected to the Internet, whether you know it or whether you acknowledge it or not. Most of the time, when your browser cannot find “www.cnn.com”, if you try to connect to 64.236.24.20 instead of www.cnn.com, you will get their web page. Most people don’t know this and immediately assume their Internet service is down, when in reality, their DNS server is down and the Internet is completely reachable, if you know the right address.
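For the tinkerers among you, here is a rough sketch in Python of the "is it DNS or is it really my connection?" check described above. It is only an illustration, not how any ISP does it. The function names (`can_resolve`, `can_reach`, `diagnose`) are made up for this example, and it assumes you already know a working numeric address to test against. Keep in mind that your machine may answer a lookup from its own cache or hosts file, so a successful lookup is a hint, not proof.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if the system resolver hands back an address for hostname."""
    try:
        socket.getaddrinfo(hostname, 80)
        return True
    except socket.gaierror:
        # The lookup failed: no answer from DNS (or no such name).
        return False


def can_reach(ip_address: str, port: int = 80, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to a numeric address succeeds.

    A numeric address needs no DNS lookup, so this tests the connection itself.
    """
    try:
        with socket.create_connection((ip_address, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or no route: the address was not reachable.
        return False


def diagnose(dns_ok: bool, ip_ok: bool) -> str:
    """Turn the two check results into a plain-English verdict."""
    if dns_ok and ip_ok:
        return "DNS and connectivity both look fine"
    if ip_ok:
        return "connected, but DNS is failing"
    if dns_ok:
        return "DNS answers, but the test address is unreachable"
    return "no connectivity at all (or both checks are blocked)"
```

You would call it with something like `diagnose(can_resolve("www.cnn.com"), can_reach("64.236.24.20"))`. The numeric address is the one quoted above and may well have changed since, so substitute one you know is current.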
Anyway, this is just a little insight into how things work and what breaks. And trust me, DNS breaks and breaks often. As recently as a month ago, I was approached by an employee at a place I do I.T. work for. This guy was having email trouble. “..no one from here can email me at home but my brother can email me just fine.” I asked what ISP he was on and what ISP his brother was on. He told me both were on Suddenlink (Charter). I immediately had an idea of what was wrong. I sent 5 test messages from totally different email accounts/providers, and only one made it through. I logged into one of my production servers, which uses a good, working set of DNS servers, and did some tests from there. My server found the Suddenlink email server to be located at “X”. I tested “X” and there was no email server running on that IP address. Another invalid DNS entry. I called Suddenlink tech support and explained the problem. They did their own test-email to the account and it went through. They tested from inside their own system and assured me everything was working. I tried to explain to them that of course their internal stuff was working, because their internal DNS servers were reporting the correct information. The problem was that, externally, we were getting an invalid answer to DNS queries for their email server, and as long as that persisted, it would never work for us. They said “…but I just checked from Yahoo and Yahoo email is coming through.” That’s because Yahoo caches DNS entries for a longer period of time (a different configuration as mentioned above) and your change was recently made. They did not understand this. It took around 30 minutes for me to figure out they were not going to get this. So…. I gathered all of my data and emailed it to them and showed them verbatim, step-by-step how to properly check to see if DNS and email were working properly.
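The "I tested X and there was no email server running" step can be roughed out in a few lines of Python. This is only a sketch under some assumptions, not the exact procedure I sent them. Python's standard library has no MX lookup, so the host name or address the MX record points to has to come from a tool such as `dig` or `nslookup` first. The function names here are invented for the example. Also note that many home ISPs block outbound port 25, so a failure from a home connection proves nothing by itself.

```python
import socket


def banner_looks_ready(banner: bytes) -> bool:
    """An SMTP server that is ready for mail greets with reply code 220."""
    return banner.startswith(b"220")


def smtp_answers(host: str, port: int = 25, timeout: float = 5.0) -> bool:
    """Connect to host and report whether something SMTP-like greets us.

    `host` is whatever the MX record pointed at (a name or a numeric address).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            # Read the first chunk of the greeting the server sends on connect.
            banner = conn.recv(512)
    except OSError:
        # Refused, timed out, or unreachable: nothing usable on that address.
        return False
    return banner_looks_ready(banner)
```

If `smtp_answers` comes back False for the address an outside DNS server gives you, but the ISP's own staff can send mail internally, you are most likely looking at a bad or stale MX entry like the one in this story.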
Within 2 hours, their incorrect MX (mail exchange) record was removed from their DNS server and email finally started to work properly again. I explained at the bottom of the email that since I did their troubleshooting for them and isolated the problem, during working hours at one of my employers, I believe my employer needed to be compensated for outsourcing my time. I didn’t get a response. Maybe their email was down? ~g~