Dec
I am Kabaka, one of the server administrators who keep the PonyChat IRC Network running. One of my jobs is to help maintain our web chat interfaces, including the Iris interface. Recently, you may have noticed a long Iris outage. This was the result of a small oversight in one of my server management tools.
The host for the Iris server, Linode, informed me via email that the server that hosts our account needed to be taken off-line for maintenance, so our account would be moved to another server. In preparation for that outage, I ran a tool which automatically removes servers from the global and regional IRC pools. The Iris server happens to be on Rainbow Dash (rainbowdash.ponychat.net), which is the server that Linode plans to take off-line.
Until I discovered the error in my tool, the de-pooling process simply removed all references to the target server from our DNS records. In other words, anything that pointed at the server (except the exact server name) was deleted. As part of that process, iris.ponychat.net was automatically removed. This seemed like a great idea when I wrote it.
I have repaired my tool: it will now remove servers from the irc.* host names only, and leave every other record alone.
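For the curious, the difference between the old and new de-pooling rules can be sketched roughly as below. This is only an illustration, not the actual tool: the record list, the `other.example.net` server, and the function names `depool_old` and `depool_fixed` are all made up for the example, and real DNS records are simplified to (host name, server) pairs.

```python
# Hypothetical, simplified DNS records as (host name, server it points at).
# Only rainbowdash.ponychat.net and iris.ponychat.net come from this post;
# other.example.net is an invented stand-in for a second server.
SERVER = "rainbowdash.ponychat.net"
RECORDS = [
    ("irc.ponychat.net", SERVER),
    ("irc.ponychat.net", "other.example.net"),
    ("iris.ponychat.net", SERVER),
    (SERVER, SERVER),  # the server's own exact name
]


def depool_old(records, server):
    """Old rule: drop everything pointing at the server except its own name."""
    return [
        (name, target)
        for name, target in records
        if target != server or name == server
    ]


def depool_fixed(records, server):
    """Fixed rule: drop only irc.* pool names pointing at the server."""
    return [
        (name, target)
        for name, target in records
        if not (target == server and name.startswith("irc."))
    ]


if __name__ == "__main__":
    print("old:  ", depool_old(RECORDS, SERVER))
    print("fixed:", depool_fixed(RECORDS, SERVER))
```

Under the old rule the iris.ponychat.net entry disappears along with the pool entries; under the fixed rule only the irc.* entry for the de-pooled server is removed.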
In any event, I am very sorry about this downtime. Because the problem has been repaired, it should not be repeated. I have also created additional server monitoring tools which should almost immediately alert staff to any similar problems.
Because of the upcoming maintenance on Rainbow Dash, we had already (prior to this slip-up) begun working on bringing up a second Iris server so that, barring something incredibly unlikely, we always have at least one Iris node available. With a second node in place, my de-pooling tool’s action would have been correct.
Thank you for bearing with us through this downtime. If you have any questions or comments, please feel free to contact me or any of the other staff.



