After years of trying to build IT infrastructure capable of delivering highly available applications, adding redundancy and pushing uptime toward the impossible 100%, I came to a very sad realization: downtime is impossible to avoid, despite the best efforts of a lot of professionals like me. The reason is “too many variables”, and often the downtime is caused by the very tools that are supposed to prevent it.
Linux-HA, Virtual IPs, Oracle RAC, DB2 HADR, OCFS2, etc.; you name it. These technologies are very vulnerable: the simplest hiccup in the environment causes adverse effects. Now if you have been in the …