Author Topic: Latest outage  (Read 12270 times)

Hrakkar

  • Administrator
  • Sr. Member
  • *****
  • Posts: 445
  • Karma: +24/-0
  • Dothraki Fan
Latest outage
« on: August 26, 2012, 05:01:45 pm »
There is a saying that goes 'To err is human, to really mess things up, it takes a computer...' And so it goes.

It seems that the super-duper new Learnnavi server was fitted with such a fast SATA II RAID controller that, when loaded to capacity, it was faster than the computer's motherboard. Marki (the Learnnavi system administrator, and a first-class nice guy) didn't know about this until he tried to add some additional disk storage. Well, cloning the RAID turned out to be one of those things that 'maxed out' the disk controller and overloaded the motherboard.

These kinds of problems usually result in the hair and blood of the sysadmin being found on the floor of the server room (trust me, it's happened to me, too!). In any case, it took a few days to puzzle this out. The ultimate solution was to use the motherboard's SATA controller rather than the accessory card. The system is now slightly slower overall, but it is stable. And not one byte of data was lost, thanks to the much improved backup strategy.

It's hard to fault anyone for this problem, because the system was a gift from a member of Learnnavi, and is a much better server than the one that was destroyed back in May. Who would have ever known about this problem? Modern computer hardware is pretty much 'plug and play'. But demanding maximum performance out of hardware can bring out unknown 'gotchas' like this.

Right now, I am in Chicago, getting ready for Worldcon. But first, there will be a couple of days spent with the Klingon folks. Although Klingon is high on my list to learn, and I am slowly progressing, Na'vi and Dothraki have the priority. And two languages at once are enough!
Don't tell Khal Drogo I am here ;)