It appears the outage was caused by a DDoS attack against one of our clients on the node. Unfortunately, our network monitoring stopped working at the time, so we are unsure of the size of the attack, but the outage coincides with the suspension of a VPS by our automated system for exceeding our Packets Per Second limit. It appears that our primary NIC failed for an unknown reason, which made the backup NIC the active one. The backup switch (100Mbps) was unable to handle the size of the attack, which resulted in the outage. A reboot of the server forced it back onto the primary NIC, which is on our primary router (1Gbps). As soon as the server came back online, the client received another DDoS attack, but the IP was nullrouted before it caused any downtime.
Outage started: Fri Dec 06 2013 18:08:48 GMT-5.0
Outage resolved: Fri Dec 06 2013 18:39:15 GMT-5.0
Total downtime for the node: ~30 minutes
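For reference, the downtime figure can be checked directly from the two timestamps above. This is only a quick sketch: the format string is our own choice for parsing these dates, and the GMT-5.0 offset is left out because both timestamps share it.

```python
from datetime import datetime

# Format matching the timestamps in this report, without the shared GMT-5.0 offset.
FMT = "%a %b %d %Y %H:%M:%S"

started = datetime.strptime("Fri Dec 06 2013 18:08:48", FMT)
resolved = datetime.strptime("Fri Dec 06 2013 18:39:15", FMT)

# Both times are in the same zone, so a plain subtraction gives the duration.
downtime = resolved - started
print(downtime)  # 0:30:27
```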
Dec 6 18:07:25 fl1ovz01 kernel: [6166168.098461] igb 0000:05:00.0: eth0: Reset adapter
Dec 6 18:07:25 fl1ovz01 kernel: [6166168.166158] bonding: bond0: link status definitely down for interface eth0, disabling it
Dec 6 18:07:25 fl1ovz01 kernel: [6166168.166167] bonding: bond0: making interface eth1 the new active one.
Dec 6 18:07:26 fl1ovz01 kernel: [6166169.774455] CT: 1340: stopped
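The failover described above can be picked out of log lines like these with a short script. The following is a minimal sketch, not a tool we run in production: the function name find_failovers and the two regular expressions are illustrative and are written only against the message wording shown in this excerpt, so other kernel versions may word these messages differently.

```python
import re

# Kernel log excerpt copied from this report.
LOG = """\
Dec 6 18:07:25 fl1ovz01 kernel: [6166168.098461] igb 0000:05:00.0: eth0: Reset adapter
Dec 6 18:07:25 fl1ovz01 kernel: [6166168.166158] bonding: bond0: link status definitely down for interface eth0, disabling it
Dec 6 18:07:25 fl1ovz01 kernel: [6166168.166167] bonding: bond0: making interface eth1 the new active one.
Dec 6 18:07:26 fl1ovz01 kernel: [6166169.774455] CT: 1340: stopped
"""

# Patterns based only on the wording seen in the excerpt above (an assumption).
LINK_DOWN = re.compile(r"bonding: (\S+): link status definitely down for interface (\S+),")
NEW_ACTIVE = re.compile(r"bonding: (\S+): making interface (\S+) the new active one")


def find_failovers(log_text):
    """Return (timestamp, bond, failed_iface, new_active_iface) tuples."""
    events = []
    pending = {}  # bond name -> interface most recently reported down
    for line in log_text.splitlines():
        # The first three fields are the syslog timestamp, e.g. "Dec 6 18:07:25".
        timestamp = " ".join(line.split()[:3])
        match = LINK_DOWN.search(line)
        if match:
            pending[match.group(1)] = match.group(2)
            continue
        match = NEW_ACTIVE.search(line)
        if match:
            bond, new_iface = match.groups()
            events.append((timestamp, bond, pending.pop(bond, None), new_iface))
    return events


for event in find_failovers(LOG):
    print(event)
```

Run against the excerpt, this reports a single failover on bond0 from eth0 to eth1 at 18:07:25, matching the sequence described in this report.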
As of 20:47 EST, a handful of VPSs are still offline and require manual intervention, and we are seeing additional DDoS attacks against other clients. We will continue to monitor the network as we work on these VPSs, and we hope to send an e-mail to clients once everything is resolved and stable. We apologize for this outage, which was compounded by a malfunctioning remote management network. That network has since been fixed, and we will be putting proper tests and monitoring in place to ensure it is accessible when we need it. Thank you.
-The Secure Dragon Staff
Friday, December 6, 2013