Security incidents, human error and hardware failure all pose considerable risks to organisations of any size. They can cripple systems and leave them without essential data for extended periods, which can do huge financial and reputational damage.
Understanding the threats and how to manage them effectively should be at the core of your organisation’s business continuity plans, along with what you do with your critical data to ensure you can restore it in an emergency.
Here, we take a closer look at the differences between data backup and business continuity and what an organisation needs to mitigate common risks.
Business continuity vs backup
While they overlap to a certain degree, data backup and business continuity are separate disciplines, each requiring its own specific solutions. However, you can tailor them to ensure they dovetail neatly where needed.
What is business continuity?
Business continuity is about how you ensure your organisation can get back up and running after an event that disrupts normal service provision. That could be the failure of an ICT function, a building problem, a flood or fire, or key staff members being unable to perform their duties due to illness or sudden absence.
What is business continuity for ICT?
Business continuity for ICT is how your organisation’s ICT department plans for – and mitigates – risks to its service continuity. It’s about ensuring its ability to deliver the service it’s responsible for – usually user devices, connectivity and applications, and the support for all of these.
What is data backup?
Data backup concerns keeping your organisation’s business-critical data safe and being able to restore it in the event that it is lost or damaged. Backing up your data will ensure you can restore the original files from copies, should the original files become damaged, corrupted or lost due to hardware or software faults or failure, malware and viruses, hacking and cyberattack, power failure or simple human error.
Keeping your organisation’s data backed up regularly will also help to protect against the following risks:
- Accidental deletion of files or emails
- Misconfiguration of retention policies
- Corruption of files within your production environment that isn’t noticed until the corruption has been replicated
- Compromised administrator accounts used to remove retention policies and delete data
- Issues with sync functionality between devices causing files to become corrupted
- Crypto-malware attacks that go unnoticed before the affected files have been encrypted and replicated
Why is business continuity important?
There are many reasons why your organisation’s business continuity is essential.
It can ensure your assets, data and people are protected and that your organisation continues operating as normally as possible in the event of a disaster.
It can help ensure you remain compliant with all the relevant legislation in your industry, as well as regulations that affect everyone, such as GDPR.
This can, in turn, boost confidence among your customers, stakeholders and employees and give your organisation a competitive advantage.
Crucially, though, business continuity is mainly concerned with protecting your organisation when an unforeseen incident or disaster occurs, such as a system outage, data breach, fire or flood.
If your organisation has diligently gone about its business continuity planning, it should have plans in place to minimise the damage and risk caused by loss of data, people, infrastructure and workspace.
What business continuity solutions for ICT are available?
Business continuity is about allowing the business to continue when a significant event disrupts regular service provision. While business continuity is not solely concerned with ICT, in today’s increasingly digitised economy, ICT will play a significant role.
So, making your ICT more resilient to the failures it encounters so it can continue to deliver its service is a vital part of business continuity.
That’s what we are going to focus on here.
The main aim of a business continuity plan is to outline the steps your organisation will take to prevent – and recover from – potential threats. Its objective is to ensure your people, assets and operations are protected and able to function as normally as possible in the event of a disaster.
Your business continuity plan should consider what risks your organisation could face, from physical threats like fire or flooding to incidents like cyberattacks or a pandemic. It should then determine how they might impact your operations and what safeguards, procedures and policies you will put in place to mitigate those risks.
Business continuity solutions are the resources an organisation has available to ensure it can survive when disaster strikes. This can include people, places, processes and technology.
As well as having a relevant suite of business continuity solutions, it’s also important to consider how your organisation will test them to ensure they work and how you review your business continuity plan to keep it relevant and up to date.
When it comes to selecting business continuity solutions, there are a few ICT terms you need to be familiar with:
High availability
Highly available solutions are typically delivered out of more than one site and can survive the failure of at least one component at any level.
Single site redundancy
These solutions are delivered from a single site, but redundancy is achieved through a duality in most other aspects of how the service is delivered – for example two switches, two servers, two firewalls etc. Both halves of the solution deliver content to users in parallel, or with near real-time failover.
This solution probably has redundancy at application, server, storage and connectivity level, but not at site level. It is typically the cheapest type of redundancy but is not considered to be highly available due to it being only on a single site.
Dual site delivery
This solution delivers out of two sites fully redundantly, meaning it has redundancy at application level, server level, storage level, connectivity level and site level. Again, both halves of the solution deliver content to users in parallel or with near real-time failover. Dual site delivery is one method of delivering a highly available solution.
Disaster recovery (DR)
Disaster recovery refers to a (usually off-premises) recovery site for your organisation’s data. It’s essential for recovering data, resuming operations, maintaining business continuity and preventing data loss. DR facilities are usually held in ‘hot’ standby, where they aren’t actively serving content to users but can do so rapidly under failure conditions, or in ‘cold’ standby, where they may take longer to stand up under failure conditions. Bringing a DR site into service usually requires some level of restore from backups.
Recovery time objective (RTO)
RTO refers to the target length of time within which a service must be back up and running after a failure event. The recovery time you can actually achieve is typically governed by the technology solution used to restore service. For example, if the solution is to restore from backup, then the recovery time will be constrained by how long it takes to reconstitute the data to the application, service, server or storage.
The decision on the value of RTO ought to be driven by how long the business can survive without the affected system.
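As a rough illustration of how a restore-from-backup approach constrains the achievable recovery time, the sketch below estimates restore duration from the volume of data and the restore throughput, then compares it with an RTO. The function names, the fixed overhead allowance and all of the figures are assumptions made up for illustration, not measurements from a real system.

```python
def estimated_restore_hours(data_gb: float,
                            throughput_mb_per_s: float,
                            overhead_hours: float = 0.0) -> float:
    """Estimate the hours needed to restore data_gb at a sustained throughput.

    overhead_hours is an assumed allowance for non-transfer work such as
    rebuilding a server or validating the restored data.
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    transfer_seconds = (data_gb * 1000) / throughput_mb_per_s  # 1 GB = 1000 MB
    return transfer_seconds / 3600 + overhead_hours


def meets_rto(restore_hours: float, rto_hours: float) -> bool:
    """True if the estimated restore time fits within the RTO."""
    return restore_hours <= rto_hours


# Hypothetical figures: 2,000 GB restored at 100 MB/s, plus 1 hour of overhead.
hours = estimated_restore_hours(2000, 100, overhead_hours=1.0)
print(f"Estimated restore time: {hours:.1f} hours")
print("Meets a 4-hour RTO:", meets_rto(hours, 4))
print("Meets an 8-hour RTO:", meets_rto(hours, 8))
```

If the estimate exceeds the RTO the business needs, the options are to speed up the restore, choose a different solution, or agree a longer RTO.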
Recovery point objective (RPO)
RPO refers to the maximum amount of data, measured in hours, that can be lost between the last backup or valid replication and the failure time. For example, if the restore is from a backup taken at 1am, and the service fails at 1pm, 12 hours of data will be lost. Typically, a solution based on backups will have a variable data loss, from 0 hours up to the period between backups. For daily backups, this is between 0 and 24 hours.
The decision on the value of RPO ought to be driven by how many hours of lost data from the affected system the business can survive.
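The arithmetic behind the 1am backup and 1pm failure example can be sketched in a few lines. The function names and the dates are hypothetical; the only assumption is that backups run at a fixed interval, so the worst-case loss equals that interval.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, failure: datetime) -> timedelta:
    """Time between the last good backup and the failure (the data lost)."""
    if failure < last_backup:
        raise ValueError("failure cannot precede the last backup")
    return failure - last_backup


def backups_meet_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst case, a failure hits just before the next backup is due,
    so the full interval between backups is lost."""
    return backup_interval <= rpo


# Backup taken at 1am, service fails at 1pm on the same (hypothetical) day.
loss = data_loss_window(datetime(2024, 1, 10, 1, 0), datetime(2024, 1, 10, 13, 0))
print(f"Data lost: {loss.total_seconds() / 3600:.0f} hours")

# Daily backups can lose up to 24 hours, so they cannot meet a 2-hour RPO.
print("Daily backups meet a 2-hour RPO:",
      backups_meet_rpo(timedelta(hours=24), timedelta(hours=2)))
```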
How to choose an ICT-based business continuity solution
There are a considerable number of options for delivering an ICT-based business continuity solution for any particular service. For example, an SQL database could be:
- Restored from backup
- Restored from backup with the transaction log
- Built redundantly using SQL Server replication, database mirroring, log shipping, failover clustering or an Always On availability group
- Replicated using Veeam B&R, VMware Fault Tolerance or VMware vSphere High Availability
- Created from any valid combination of the above.
So, the next time someone within your organisation says, ‘I just need ICT to give me a business continuity solution for my application’, the questions you should be asking include:
- What happens if the application goes down?
- If the application went down, what amount of data loss is acceptable (1 hour’s worth? 24 hours’ worth?) This will give you the RPO.
- How long can you survive without it? This will determine the RTO.
- What are the potential financial, legal, compliance, regulatory and customer service impacts of the system being unavailable?
- How much are you prepared to spend on delivering a solution?
- Do you have that budget?
You will probably find that they aren’t prepared to spend anything but want an RPO/RTO of zero, which is practically impossible to achieve. So, the impact needs to be tied to the cost.
Leveraging RPO and RTO to drive decisions
RTO and RPO are two key things to consider when looking at business continuity from an ICT perspective.
Your desired RTO sets out how long your organisation can be without its systems and data before it is at risk. Your RPO, meanwhile, will influence how often you need to back up your data.
So, you may have an RTO of 12 hours and an RPO of two hours. You can use both these factors to guide your business continuity plan because they will set out how much downtime your organisation can endure and still survive and how much data you can afford to lose.
They can also be used to determine the cost and the benefit of avoiding each scenario you have identified in your business continuity plan.
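To show how an RTO of 12 hours and an RPO of two hours can narrow down the choice, the sketch below filters a list of candidate solutions against both targets and orders the survivors by cost. The option names, recovery figures and costs are invented purely for illustration; real values would come from your own testing and supplier quotes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    rto_hours: float    # assumed time to restore service with this option
    rpo_hours: float    # assumed worst-case data loss with this option
    annual_cost: float  # assumed yearly cost, in any one currency


def options_meeting_targets(options, rto_target, rpo_target):
    """Return the options that satisfy both targets, cheapest first."""
    suitable = [o for o in options
                if o.rto_hours <= rto_target and o.rpo_hours <= rpo_target]
    return sorted(suitable, key=lambda o: o.annual_cost)


# Hypothetical candidates for a single application.
candidates = [
    Option("Nightly backup, restore on demand", 24, 24, 2_000),
    Option("Backup plus hourly transaction logs", 8, 1, 5_000),
    Option("Dual site replication", 0.5, 0.1, 30_000),
]

for option in options_meeting_targets(candidates, rto_target=12, rpo_target=2):
    print(f"{option.name}: {option.annual_cost:,} per year")
```

With these made-up figures, the nightly backup alone is ruled out, and the cheaper of the two remaining options is listed first.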
Not one size fits all
Business continuity plans and solutions need to be designed at an application or system level – not once for all systems. You also cannot determine your most important systems purely based on how many users a system has. You may have an Excel spreadsheet (please, no!) that is used once a month by just two staff members to meet a compliance need, and the consequence of it being lost might be prison time or bankruptcy.
Assessing each system by the consequence of losing it will enable you to work out your priorities and where best to allocate time and resources to mitigate the risk.
An outage can be a big financial hit to your organisation through lost or delayed sales and income, the expense of putting things right, and any regulatory fines, contractual penalties or lost custom.
The cost of avoiding this in the first place through business continuity planning could be a small fraction of the impact costs. With a robust business continuity plan, you can keep downtime to a minimum and minimise the impact and disruption that an outage can cause.
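One simple way to compare the cost of avoidance with the cost of impact is to estimate the expected yearly loss from outages with and without a mitigation in place. The sketch below does that with a deliberately crude model (outages per year × hours per outage × cost per hour). The model, the function names and every figure are assumptions for illustration only.

```python
def expected_annual_outage_cost(outages_per_year: float,
                                hours_per_outage: float,
                                cost_per_hour: float) -> float:
    """Crude expected yearly loss: frequency x duration x hourly impact."""
    return outages_per_year * hours_per_outage * cost_per_hour


def net_annual_benefit(loss_without: float,
                       loss_with: float,
                       mitigation_cost: float) -> float:
    """Loss avoided by the mitigation, minus what the mitigation costs."""
    return (loss_without - loss_with) - mitigation_cost


# Hypothetical: one outage every two years, costing 5,000 per hour of downtime.
loss_without = expected_annual_outage_cost(0.5, 24, 5_000)  # 24-hour recovery
loss_with = expected_annual_outage_cost(0.5, 2, 5_000)      # 2-hour recovery
benefit = net_annual_benefit(loss_without, loss_with, mitigation_cost=15_000)

print(f"Expected annual loss without mitigation: {loss_without:,.0f}")
print(f"Expected annual loss with mitigation:    {loss_with:,.0f}")
print(f"Net annual benefit of mitigation:        {benefit:,.0f}")
```

A positive net benefit suggests the mitigation pays for itself under these assumptions; fines, contractual penalties and reputational damage would need adding to the hourly cost for a fuller picture.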
AMDH Services Ltd works in partnership with its clients to help them define their organisational objectives and understand the link between their technology and their goals. As your ICT partner, we can help you understand and mitigate the risks associated with system downtime to make your ICT infrastructure more robust and improve your operating efficiency. To find out how we can help, give us a call on 01332 322588.