A never-ending cascade of errors, oversold managed servers and 72 hours of waiting around for email to work again. A tragic tale starring GoDaddy.

I’m no stranger to servers, websites and the Internet. I’ve been online in one form or another since 1995. I’ve run websites since 1996. I hosted my original main domain back in 1998 with a small Florida company called Hostway. They were gobbled up by another company whose name escapes me. That company was in turn gobbled up by a bigger company called Verio (who went on to be the biggest hosting company in the world for a while.) I had to move to a VPS (Virtual Private Server) back in 2001, when traffic to my sites was interfering with my neighbors in a “shared server” environment. A couple of years later I moved to an MPS (Managed Private Server) box at Verio’s insistence and for a higher monthly fee. I stayed there for almost 15 years – happy for many of them but increasingly frustrated as their service, tech support and gear slowly descended into madness territory.

Being an early adopter of their vaunted (and expensive) MPS service, I was stuck on an old server that became more and more obsolete. Their tech support was off-shored to Singapore (where the people were lovely, just handcuffed as to how much help they could actually offer) and while new sign-ups got the new gear, paying customers were stuck with the old. It’s the same issue we all have with cellular phone and cable providers. When Verio’s tech support became almost unreachable, and a lot of the features we needed to run on our sites couldn’t be patched due to old versions of server software (our boxes were so old they couldn’t be updated), I finally reached out at the management level. Their attitude was somewhere along the lines of “why should we give a shit” and “ha ha, serves you right for sticking with us.” Turns out Verio had been purchased by yet another company, and while the pleasant emails reminded me daily that I should move to this new outfit because they were awesome just like Verio, the cascading set of problems with keeping our websites alive grew worse daily.

It was time to move.

GoDaddy seemed like the only real choice, so in September of last year (in the middle of two major site rebuilds) we moved the entire kit and caboodle over to GoDaddy’s MPS platform. Prices were decent, I was promised the moon in terms of tech support and their servers were modern and up to date. The sales guy seemed really nice too, a breath of fresh air. I had accounts scattered all over GoDaddy’s network – I buy a lot of “snap idea” domains so it was probably time to amalgamate those accounts into one too. It would be nice to have all my Internet services and accounts under one roof and the helpful sales people made it all happen.

Yay.

Spammers and the internet.

Everyone with an email account knows that spammers have pretty much ruined email for the world. The excessive lengths we go to in order to stop receiving these idiots’ emails, and the excessive lengths these morons go to in order to get their ridiculous emails to us, have become a fierce battle that consumes a helluva lot of resources in your email account. That’s increased tenfold in the back end of your provider and, to a smaller degree (but with the same intensity), for people like me who run a small network of websites and email servers. I’ve got several email accounts that I use and they’re flooded with spam on a daily basis. My personal account gets hit on average about 3,000 times a month and that’s after all the filtering, deleting and trashing done by filters, blocklists and stop words. I get it, I understand it.

Spam is really bad.

Once burned, twice shy.

In the twenty years I’ve been doing my Internet thing, I’ve never spammed anyone – nary an unsolicited email – with one possible exception, which I’ll still argue wasn’t spamming at all. After the terrorist attacks of 9/11, activity at the shop had dried up. Business had just stopped (though our expenses hadn’t.) At the time, we didn’t publish our price lists (designers had decreed that publishing “flat rate” pricing was kinda whorish) so you had to request our pricing through a mail autoresponder. If you sent an email to our pricing email address, or used a little pop-up form on our website, you’d get an auto-acked email that told you how much we charged to design you a logo. Nothing terribly elaborate, but it worked. Or at least, it had. As business slowed down in the fall of 2001 and beginning of winter 2002, I needed to drum up some sales and went the route most suppliers and vendors do.

We lowered our prices.

Figuring people who had requested our rates around the time might be interested in this price reduction, I had my admin gather together emails of everyone who had sent for our rates the previous month. We sent them individual emails that started off with “You requested prices recently and thought you might like to know..” It wasn’t a great deal of people – a couple of hundred or so – but the hate mail that came back was staggering. I received death threats. Some guy was going to find me and personally remove parts of my anatomy. A lot of them sent the email over-and-over again, figuring that spamming us with their threats underlined how serious they were about hating spam. The experience was an eye-opening one. Keep in mind that was way back in 2002 and if anything, people’s attitudes towards junk mail and unsolicited proposals have hardened, not lessened. I’ve toyed around with the idea of a studio newsletter, even had a popup form to grab double opt-in email addresses (people get one mail that they have to respond to, in the affirmative, that more or less asks “do you really want to be on this list?”) but have never gone through with a premier edition shipment. That original experience – over 14 years ago – is still very much in the back of my mind and I’m terrified of the repercussions.

My little online empire.

I run a few websites in various degrees of completion and with varying amounts of traffic. Our main studio site is the granddaddy of them all, the most complete, fully functional and with a decent visitor rate for a niche site. There’s this one you’re now reading, a Canadian site for the shop (that tries, with varying success, to be, well, Canadian) and a few others – always to do with design in some form or another – that are either hobby or “forever work in progress” sites. All of these websites now sit on our GoDaddy MPS server, except for the Canadian enterprise which is hosted with a small server outfit in Toronto (this helps Google realize the site is Canadian and to offer it up to people when they search for stuff using Google.ca.) In terms of email, everything dovetails towards our main site – everything forwards to that email server and from there, it’s sorted and forwarded on to various Gmail accounts. Gmail spam filters are the boss, it allows easy access for anyone at the shop when they’re not actually at the shop and it’s easily accessible through smartphones and tablets. It’s how we’ve run things since 2007 (I know that because one of the emails actually ends in 07, the year it was set up.) It’s worked and we’ve never had any issues. Until a few days ago.

The trickle.

We get a lot of email. With all the various sites, inquiries, client messages, feedback comments and what not, we get a steady stream of email. There’s a lot of spam that tries to make it through too, but it’s filtered at various points in its journey to our email boxes. Whenever I first open my phone in the morning, I always get a blue progress bar as the email downloads into my mail app, a daily ritual for as long as I remember. Tuesday morning – nothing. Not a solitary email overnight. I checked our other accounts and they had received mail, though a smaller volume than usual. Didn’t think too much about it until I realized I wasn’t getting any emails from our main account. I was able to send through my phone but not my desktop client. It grew progressively worse as the morning turned into afternoon. By one, nothing was moving at all.

Something was definitely up.

The backend.

If you run your own server, the backend has an awful lot of information that you can sift through. There’s delivery reports, message queues and other data that tell you a lot of things if you know where to look and how to interpret what they’re trying to tell you. I’m semi-clued on this tech jargon, so it didn’t take me long to realize that most of our email was getting binned at the server. It either wasn’t being delivered, or was being “frozen” in the delivery queue. Time to call GoDaddy tech support. That was at 3:00 in the afternoon. The somewhat standoffish guy in their tech department told me that since this was an MPS he couldn’t help me at all. Apparently I was the one doing the “managing” of this Managed Private Server. No biggie I guess, but at odds with the 24/7 tech support I was promised a year previously when I was first being sold the services, paying for a year in advance. Don’t believe in screaming at tech support people, so I began a seven hour Google journey through various forums, tech articles and FAQ sections to figure out what was going on. What was going on is that our entire server was on an email blocklist. For spamming.

And that blocklist belonged to the GoDaddy mail server.

Straightening out the spaghetti strings.

I rummaged through the various email headers to figure out just what was up. The email activity had seen a slight uptick starting just after 12:01 Tuesday morning but nothing obvious. As I scanned through the various warnings, bounced emails and notifications, one IP number kept coming up. Our Canadian site. On that little hosting platform in Toronto. There were quite a few of those. The return address was “fail2ban@mail.com.” An email address I had never heard of.

How did these even get to this server?

Here’s what had happened. Sometime on Monday, some spammers from China had started probing our Canadian site for open relays – these are a favorite target for spammers as they allow sending spam runs with the originating server holding the bag, their IP listed as the source of the junk emails. Ah-ha. So the Canadian site was spamming maybe? Nope. The Canadian server was locked down tight. It gave the Chinese spammers 5 kicks at the cat – an SSH login attempt – before banning their IP range from communicating with our Canadian site at all. It was a default server setup that I guess hadn’t been changed since we started hosting the site back in 2005 or so. Trouble is, whenever these morons pinged the site for that fifth time, an application on our server told them to fuck off. Then it figured out where their IP was and sent those guys one of these “fail2ban” notifications, informing the ISP (Internet Service Provider) that one of their users was being a bad Internet citizen (as if anyone in China cares.) Our Canadian server also sent our “root” account – the admin and technical contact – a copy of this alert too (as if to say “look what a good server I’m being.”) That email account is on our main server on GoDaddy. Which then forwards it on to Gmail. All of a sudden, a lot of duplicate emails from “fail2ban” were being relayed through the GoDaddy server from our Canadian site. At peak, it was one failed probe every two minutes – that’s not a lot of extra email traffic all things considered – but at some point early Tuesday morning the relay server noticed the uptick and decided something was going on. So it banned the Canadian IP for sending these messages, right? Nope. It added our entire main GoDaddy IP – where the messages were actually going – to the GoDaddy block list.

Our email started to die after that.

Yep, we could literally not send any email, from a GoDaddy hosted website, through the GoDaddy mail server for receiving notifications that someone was unsuccessfully trying to spam through our Canadian site. Messages from us. To us. Literally the opposite of spam. And the company that we pay a decent monthly fee to keep us on the internet, had effectively stopped us communicating with people on the internet. Amongst ourselves.

Or with our clients.
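
For the curious, the “five kicks at the cat” behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how fail2ban is actually implemented: the function name `process_failures`, the `MAX_RETRY` constant and the sample IP addresses (taken from the ranges reserved for documentation) are all made up for the example, and the real notices were emails, not strings in a list.

```python
from collections import defaultdict

# Hypothetical threshold, mirroring the "5 kicks at the cat" in the story.
MAX_RETRY = 5


def process_failures(failed_ips, max_retry=MAX_RETRY):
    """Count failed logins per IP; ban an IP and record one notice
    once it reaches max_retry failures."""
    failures = defaultdict(int)
    banned = set()
    notices = []
    for ip in failed_ips:
        if ip in banned:
            continue  # already banned, further attempts are ignored
        failures[ip] += 1
        if failures[ip] >= max_retry:
            banned.add(ip)
            # In the story, this is the point where an alert email goes
            # out, with a copy to the "root" account on the main server.
            notices.append(f"Banned {ip} after {failures[ip]} failed attempts")
    return banned, notices


# Illustrative input: one persistent prober and one occasional one.
attempts = ["203.0.113.7"] * 6 + ["198.51.100.9"] * 2
banned, notices = process_failures(attempts)
print(banned)
print(notices)
```

Every ban produces one notice, so a steady stream of probes from different addresses turns into a steady stream of near-identical alert emails, which is the traffic that ended up flowing through the relay.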

Hurry up and wait.

At 11:30 (we’ve now been without email for almost 24 hours, a lifetime in Internet terms) and armed with this freshly sleuthed info, I figured it would just be a quick call to GoDaddy tech support to sort it all out. At best it was an overaggressive spam filter. At worst it was an automatic misunderstanding of what was going on. This was far beyond the “manage it yourself” instructions the curt tech support guy had given me 8 hours earlier – it was a problem at the GoDaddy level – and it should be obvious to anyone who understood what the email headers and logs contained. Anyhoo, when I told the nice (this time) tech fella about my ongoing issues, he asked me what I had done to “stop the spamming from your server?” I explained once again that we weren’t spamming thank you very much – it was stuff being sent TO US, BY US. By this time I had also deleted the mail forwarder so it was no longer happening anyways – until I figured out how to stop them entirely at our Canadian source, the “fail2ban” notices were being dropped into an actual email folder on our GoDaddy server. Sure, that was probably good enough he thought, telling me they’d remove the block – probably take about 5 minutes – and Bob would be my Uncle once more. I was put on hold for 12 minutes.

That’s never a good sign.

12 hours, 24 hours, 72 hours, whatever.

When the tech guy came back, it wasn’t with good news. Apparently this procedure was more complicated than he first thought. He’d talked to someone hunched over a keyboard, and this could take a few hours. At the “outside” 24 hours.

But “probably overnight.”

Ah well, at least we were headed in the right direction. I thanked him for his help, hung up, and went back to our server logs to try and piece together what email clients had been sending us throughout the day. I hadn’t worked on client projects at all, and people were starting to bitch because I wasn’t getting back to them. Finally hit the sack at 4:00 am, figuring all would be squared away by the time I woke up, and I could get back to you know, actually working for a living.

It wasn’t. And I still couldn’t.

My morning iPhone ritual showed no improvement over the previous day. Mail was still being frozen at our server, the blocklist still holding firm. Another call to tech support and I was told this time (by a very nice, but less than helpful woman) that this incident had been given an “escalation ticket” – a six number ID – that could take up to 72 hours to be resolved. That was a lot different than the 24 hours “on the outside” I’d been told before. I’d later find out that this magical 72 hours isn’t a “fix” time at all, but a “don’t expect to hear anything from us for 72 hours, and even then it may just be a notice that we’re still looking at this” time.

Jesus.

That would amount to almost 4 days without any outgoing email (or, due to the way they’re set up, any incoming email from our contact forms, project submission pages or client feedback section) if they did fix it, longer if they didn’t. This is a serious problem for a company that makes its bread and butter on the Internet. The helpful young lady suggested I set up a dedicated email server to address my immediate concerns (for an additional fee, natch.) This would bypass our IP completely, allow us to operate email like normal and around the GoDaddy blocklist (our website forms still wouldn’t work because they required the server Send Mail application which was still attached to our IP and needed to go through the relay server that had us blocked.) It was a band-aid solution sure, but we could communicate with each other and clients until our 72 hours had expired and hopefully everything was hunky dory with our email again. This dedicated email server sounded like an easy interim fix.

It wasn’t.

Dedicated email servers, the DNS records that weren’t.

To be continued..