Help:About/archive/2005-07-11 Server Outage


As of 2005-07-27, VBZ.NET has been moved over to a new server, and everything seems to be working. Here's what happened.

Sometime during the day on July 11, CI Host moved vbz.net's hosting account to a new server. They did this without warning, and did not copy the files over ahead of time.

This caused several problems:

  • Visitors received time-out dialogs for about 24 hours (until I re-pointed the domain)
  • I had no inkling that order data would soon be unavailable, so 2 orders were lost (except for the emails -- which, for security reasons, do not include credit card numbers)
  • I had no reason to expect that the outage was permanent, so I did nothing about it for 18 hours or so, thinking that the site would probably come back online soon
  • CI Host copied some files over to the new server, but not all; they did not copy order data, and the home page was a "this site coming soon" notice
  • I had to scramble to re-generate all the pages and images for the new server, a process which took several days; had there been advance warning, I could have had these ready beforehand.

What CI Host should have done is make an announcement beforehand, stating:

  • the purpose of the outage (server upgrade, preferably with specifics)
  • the ETA for availability of the new server (should be before the old server goes offline)
  • the IP address of the account on the new server
  • the URL of the shared SSL for the account
  • the ETA for when the old server would be taken offline

The first I knew of the changeover was when I was attempting to download the data for those orders and kept getting error messages. I initially suspected a bug in my software and spent about half an hour tinkering on that end to try to locate the problem.

Due to the way the vbz.net domain is configured, it continued pointing at the old address, at which there was no longer a server to respond; visitors to vbz.net received "time-out" or "server unavailable" messages. The wiki (this site) continued to function, as it is hosted elsewhere.
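A stale pointer of this kind can be detected by comparing what a name resolves to against the address the host says the account now lives at. The following is only a minimal sketch, not anything that was in use at the time: the function name `check_domain`, the injectable `resolve` parameter, and the example addresses (taken from the 192.0.2.0/24 documentation range) are all assumptions made for illustration.

```python
import socket


def check_domain(domain, expected_ip, resolve=socket.gethostbyname):
    """Classify a domain as 'ok', 'stale', or 'unresolved'.

    'stale' means the name resolves, but not to expected_ip
    (for example, it still points at the old server's address).
    The resolve argument is a hypothetical hook so the check can
    be exercised without touching real DNS.
    """
    try:
        actual_ip = resolve(domain)
    except OSError:
        # socket.gaierror is a subclass of OSError
        return "unresolved"
    return "ok" if actual_ip == expected_ip else "stale"


if __name__ == "__main__":
    # Hypothetical addresses: the name still answers with the old one.
    old_ip, new_ip = "192.0.2.10", "192.0.2.20"
    print(check_domain("vbz.net", new_ip, resolve=lambda name: old_ip))
```

Note that this only compares addresses; it does not test whether anything is actually answering at the address, which was the visible symptom here.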

As soon as I noticed the problem, I began rebuilding the files necessary to move the site to a new server -- something I've been planning for a long time, but which was until that point a relatively low-priority task.

I was finally able to contact CI Host's tech support the morning of July 12, when I found out what had happened; prior to that time, I had assumed there was some kind of temporary server issue and that they would restore the site soon.

Progress Log

  • 2005-07-15: Most of the static pages and images are back online, but the shopping cart script still isn't quite right and the checkout is totally non-functional due to lack of SSL (https) on the new server. We now have an SSL certificate for vbz.net, so once it is working the checkout will be at https://ssl.vbz.net instead of https://some-other-domain.net/vbz, which I'm hoping will be a good thing -- but we're having difficulty installing the certificate. Tech support at TheRealms, our new web host, is working on it.
As mentioned elsewhere, I'm hoping to get something working Saturday morning, but we have to go out of town (to pick up Anna) Saturday afternoon / Sunday morning, so Sunday afternoon would then be the very earliest I could get back to it. Wish me luck. --Woozle 22:00, 15 Jul 2005 (EDT)
  • 2005-07-19: Still working on the SSL issue. Yes, we are still in business too ;-)
  • 2005-07-21: Douglas at TheRealms was able to track down the SSL issue, and apparently it's at the server farm; expecting to have it fixed today. Meanwhile, I'm working on a revised cart system, which I will be dropping in piece by piece to replace the old cart system.
  • 2005-07-27: I've reinstated the old cart system, as cranky as it sometimes is, because (a) it has become obvious that the new one is going to take at least another week of work, and (b) with the last obstacle -- SSL -- fixed, it was time to get something up. It took me about half a day to get the old system working again.
  • I will be continuing to work on the new system as time permits, having done much of the toughest uphill work over the past few weeks.