
IT failure: when nothing works at all

Strategising for a total IT system failure is a complex undertaking and requires a lot of thought - preferably before the emergency occurs.

Such a strategy has to address several questions:

  • Depending on how critical individual processes and systems are, building in redundancy can be an important component of an IT security strategy. Business-critical processes and applications are kept in duplicate: if the company's own data centre fails, operations can switch to an external, parallel data centre.
  • If you work with external data centres, it is advisable to discuss geo-redundancy with the operator. How quickly could you switch from a failed data centre to one at a different location? What steps are necessary within the company?
  • The scenarios you play through should ideally also include a strategy for disaster recovery. For example, how quickly could you migrate to completely new hardware and software (regardless of the current location) on the basis of a current backup?
  • Based on a worst-case scenario, it is important to consider which processes can also function (at least temporarily) without IT support.
  • A prioritised list should be drawn up that includes the systems and components that are necessary for the survival of the company.
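
A prioritised list of this kind can be kept as simple structured data. The following is only a minimal sketch in Python: the field names, the example systems, and the tolerable outage times are hypothetical illustrations, not values from a real inventory, and the sorting rule is one possible choice among many.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    name: str
    business_critical: bool        # needed for the survival of the company?
    max_tolerable_outage_h: float  # assumed: hours the company can cope without it
    manual_fallback: bool          # can the process run temporarily without IT?


def recovery_order(systems):
    """Sort systems into a recovery order.

    Business-critical systems come first, then those with the shortest
    tolerable outage, then those without a manual fallback.
    """
    return sorted(
        systems,
        key=lambda s: (
            not s.business_critical,
            s.max_tolerable_outage_h,
            s.manual_fallback,
        ),
    )


# Hypothetical example inventory
inventory = [
    System("intranet wiki", False, 72, True),
    System("online shop order server", True, 2, False),
    System("telephone system", True, 4, True),
    System("logistics control", True, 8, False),
]

for rank, system in enumerate(recovery_order(inventory), start=1):
    print(rank, system.name)
```

In practice the tolerable outage per system would come from the business units rather than from IT alone, and the list should be reviewed whenever processes change.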


For non-experts: IT is a collective term that simply describes the totality of all information technology systems in a company. In today's working world these systems are highly interconnected and partly interdependent, yet they remain individual components, which makes it easier to analyse them in an emergency. Two aspects play a role in deriving countermeasures:

  1. Extent of the damage: How many systems are affected, and which parts of the company are restricted in their operation?
  2. Duration: How long will it take until the fault is rectified (if this can be determined at all)? For the duration in particular, a worst-case scenario should be assumed.
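
The two aspects can be combined into a rough first assessment. The sketch below is an illustration under stated assumptions: the function name, the 50 % threshold, and the three labels are hypothetical and would have to be replaced by the company's own criteria.

```python
def assess_outage(affected, total, worst_case_hours, tolerable_hours):
    """Classify an outage by extent (share of affected systems) and duration.

    worst_case_hours is the pessimistic estimate of how long the fault lasts;
    tolerable_hours is how long the company can cope (both are assumptions).
    """
    if total <= 0 or not 0 <= affected <= total:
        raise ValueError("total must be positive and 0 <= affected <= total")

    extent = affected / total
    lasts_too_long = worst_case_hours > tolerable_hours

    if extent == 1.0:
        return "total failure"
    if extent >= 0.5 or lasts_too_long:  # 0.5 is an arbitrary example threshold
        return "major"
    return "limited"


print(assess_outage(affected=2, total=10, worst_case_hours=12, tolerable_hours=4))
```

Because the duration is often unknown at first, the pessimistic estimate is passed in deliberately: a small outage that may last longer than the company can tolerate is treated as a major one.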


Since the disaster at the Chernobyl nuclear power plant, the term "worst-case accident" has become established in common parlance. In a digitalised economy, the failure of the (entire) IT system is such a worst-case scenario, as the following examples show:

  • The failure of the central server on which customer orders from the online shop are received hits a company hard.
  • If the telephone system fails, employees can no longer be reached.
  • If the Internet connection is cut and telephony is IP-based, the company cannot even inform customers of the telephone failure via its homepage.
  • If the company's own data centre is no longer operational (for whatever reason, for example due to a hacker attack or a trivial event such as a power failure), the wheels of the company come to a standstill.
  • If the IT system that controls logistics malfunctions, production can come to a standstill because raw materials and supplies can no longer be stored.
  • A data centre operated by a provider can fail as well, which also affects the cloud services that depend on it.

 
