2017 has seen some terrible events and has highlighted the many business continuity threats that organisations can face. In June, the Grenfell Tower fire resulted in the tragic death of at least 80 people. Terrorist atrocities have continued, notably in London and Manchester. These have had a devastating effect on the people and organisations directly involved and many others connected with them. Cybercrime also continues to evolve, with the latest threat to organisations coming from ransomware. The NHS was not alone in having its operations significantly disrupted for an extended period by the WannaCry ransomware attack.
This has reminded us all that business continuity planning must not be just a paper exercise. Unless a plan has been properly thought through and tested, the response to such an event may be flawed, with serious consequences.
What should organisations have in place?
Much has been written about business continuity, with a wealth of guidance available. The most commonly used sources are ISO 22301 and ISO 22313. Both recommend that Business Continuity Management Systems (BCMS) should be founded upon the following:
- Strong support from leadership - business continuity must be taken seriously.
- Sufficient resources - people and funding must be committed to maintain the system in a state of readiness, to provide an effective response to an incident in a short time frame and then to initiate the recovery.
Understanding of threats
- Every organisation is different in terms of location, management structure, premises, people, supplier/third-party dependencies, and legal and regulatory commitments. Plans must reflect this.
- Risk and threat analysis - this needs to be thorough. Incidents with a direct impact upon the organisation itself are probably the first types of risks and threats to come to mind. However, more often than not it is the dependency on third parties that proves to be the Achilles' heel.
- Impact assessment - the key question is how long the business can continue without key assets such as people, premises, technology and third-party support. Weighting each potential crisis according to its likely impact and the probability that it will arise helps to highlight the areas of greatest concern and ensures that contingency planning is directed accordingly.
- Controls - strong measures need to be established to prevent incidents from happening. Although the most serious situations cannot be prevented entirely, there is a lot that can be done to reduce the likelihood, or limit the impact, of a crisis.
- Incident response - the initial response to any crisis is often the most important. The work a business does to assess and minimise the impact in the first hour after it occurs can have a greater effect than anything done in the weeks and months afterwards.
- Communication - this is the key to an effective response. Firstly, incident escalation needs to be swift. Communication then needs to remain effective throughout, with all those affected kept informed.
- Recovery plan - planning in advance what the organisation needs to do to get back on its feet is likely to shorten the recovery time and reduce the impact upon the organisation, its people and any other connected parties.
- Training - there will not be time to think during a crisis situation. Proper and regular training will ensure people are ready to respond.
- Testing - this is the only way to determine whether the plan will actually work and to identify missing details that may make all the difference.
- Maintenance - keeping the BCMS up to date is essential. Threats change and organisations evolve.
- Improvement - the BCMS needs to be refined to reflect lessons learnt from testing and the experiences of other organisations.
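The weighting described under impact assessment above can be illustrated with a short sketch. This is only one possible approach, assuming a simple 1 to 5 scale for both impact and likelihood and a score equal to their product. The scales, the `Threat` class, the `rank_threats` function and the example figures are all hypothetical and are not taken from ISO 22301 or ISO 22313.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    """A potential crisis, scored on hypothetical 1-5 scales."""
    name: str
    impact: int      # 1 = minor disruption, 5 = business-threatening (assumed scale)
    likelihood: int  # 1 = rare, 5 = almost certain (assumed scale)

    @property
    def score(self) -> int:
        # Simple weighting: impact multiplied by likelihood.
        return self.impact * self.likelihood


def rank_threats(threats):
    """Return threats ordered from highest to lowest score."""
    return sorted(threats, key=lambda t: t.score, reverse=True)


# Illustrative figures only; a real assessment would use the organisation's own data.
threats = [
    Threat("Loss of premises", impact=5, likelihood=1),
    Threat("Ransomware outbreak", impact=4, likelihood=3),
    Threat("Key supplier failure", impact=3, likelihood=5),
]

for threat in rank_threats(threats):
    print(f"{threat.name}: {threat.score}")
```

With these made-up figures, the supplier dependency ranks above the more dramatic but less likely loss of premises. A simple product can understate rare but catastrophic events, so an organisation may prefer a different weighting.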
Where are the likely gaps?
Most organisations will already have some form of business continuity planning in place that follows this approach. Trends noted from our review of recent high-profile events suggest that some organisations may have overlooked important details, which will only be exposed when an incident arises. Some examples are set out below.
The importance of testing the plan is not always recognised. In late May, British Airways suffered a major IT failure that grounded flights for almost two days, affecting 75,000 passengers. Although the causes of the problems with the disaster recovery solution remain unconfirmed publicly, the inability of the business to restore its critical IT systems quickly pointed to a business continuity plan that had not been tested sufficiently.
The Borough Market incident highlights the importance of checking the details of recovery arrangements very carefully, particularly insurance. Almost immediately, a security cordon was established around London Bridge and the surrounding area, limiting access to all premises within the cordon for several days. It appears that many of the businesses did not have business interruption policies and that, where they did, denial of access was not covered.
Communication is another key area where plans can fall down. Assumptions may be made about the availability of the organisation’s incident response team. The Borough Market events took place late on a Saturday evening. Some key individuals may not have been contactable as quickly as anticipated.
Organisations also evolve. Business acquisitions and changes to third parties can fundamentally undermine the BCMS, and organisations may be relying on controls that are no longer effective. TalkTalk had measures to monitor attempts to breach its IT security, but these were undermined by the existence of unmonitored and unpatched webpages belonging to Tiscali, a company acquired by the business.
How should Heads of Internal Audit respond?
Most Heads of Internal Audit will have already looked at business continuity arrangements within their organisations at least once and have probably concluded that these are appropriate and follow the well-established principles of the available guidance. They will also have checked and confirmed that the organisation is taking steps to test these arrangements periodically.
In the light of the events of 2017, perhaps we should look again and consider whether the organisation is really testing and checking its plans as well as it should be and whether these are being properly updated to reflect changes to the organisation or any third parties it depends upon. Since the controls involved are so critical in the event of an incident, perhaps they should also be reviewed more frequently.