Friday, June 7, 2013

Test Thy BCP

Most organizations rely heavily on their information systems, yet many have no contingency planning in place for a disaster.  Imagine an organization that never tests its business continuity plan (BCP). Is that organization ready to respond effectively and resume mission-critical services with minimal disruption?

The objective of the BCP is to provide the information and procedures necessary to respond to a disaster, notify necessary personnel, assemble business recovery teams, recover data, and resume operations to ensure minimal disruption to the company’s operations.

The BCP identifies the information, material, facilities, personnel, and procedures required to facilitate a rapid recovery from a disaster.  The successful recovery of operations depends on periodic, comprehensive testing of the BCP.  Testing your BCP is therefore integral to a successful recovery of operations if disaster strikes.

The BCP should include documented and tested procedures that help ensure the availability of critical resources and maintain the continuity of operations during an emergency.  The BCP should help ensure organizational stability through an orderly recovery process in the event of significant problems and disruptions.  The plan is not meant to be a procedures manual for every departmental function; it should include only those high-priority tasks required to ensure successful recovery from a business disruption.


Testing The BCP
Every component of the BCP should be tested annually.  Critical and/or highly volatile components should be tested at least quarterly and after any major technology change.  Call trees should be tested at least semi-annually, and any component that fails a test should be re-tested as soon as possible.  The tests should address important business processes and related systems classified as highly critical. Once the highly critical processes have been tested successfully, Management should consider adding less critical processes and systems to future tests.  Under no circumstances should the testing of highly critical processes be limited or excluded.  The following considerations should be evaluated during the planning, coordination, and execution of business recovery tests:
  • Formal Management approval of the costs associated with the tests, normal business deadlines, resource requirements (human, material, equipment), and impact on daily operations due to key personnel participating in tests.
  • Definition of recovery scenarios (e.g. partial or full destruction due to natural and man-made disruptions, date and time of simulated event, affected business processes, etc.).
  • Definition of test objectives, scope, and expected results, as well as the criteria used to judge whether the test succeeded. The objectives must have measurable goals, such as a maximum time to recover, the number of items completed, or the number of failed procedures, so that the effectiveness of the tests can be determined.
  • Documentation of the test objectives, scope, expected results, and test results.
The main reasons for testing the BCP include:
  • Determining the feasibility of the business recovery process.
  • Verifying the compatibility of alternate processing sites, hardware, software, and telecommunications.
  • Identifying deficiencies in existing procedures.
  • Identifying areas in the BCP that need modification or enhancement.
  • Providing training to the Team Managers and Team Members.
  • Ensuring the adequacy of procedures relating to the various teams involved in the recovery process.
  • Demonstrating the ability of the organization to recover within a reasonable time.
  • Providing a mechanism for maintaining and updating the BCP.
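The testing frequencies described in this section (annually for every component, at least quarterly for critical or highly volatile components, at least semi-annually for call trees, and an immediate re-test after a failure) can be tracked with a simple schedule check. The following Python sketch is only an illustration: the category names, the day counts used to approximate a quarter and a half year, and the sample components are assumptions, and re-testing after a major technology change is not modeled.

```python
from datetime import date, timedelta

# Maximum interval between tests, in days, per component category.
# The categories mirror the text; the day counts are approximations
# chosen for this sketch (a quarter is taken as 92 days, half a year as 183).
MAX_INTERVAL_DAYS = {
    "standard": 365,   # every component: annually
    "critical": 92,    # critical or highly volatile: at least quarterly
    "call_tree": 183,  # call trees: at least semi-annually
}


def next_test_due(category, last_tested, last_test_passed=True):
    """Return the latest date by which the component should be tested again."""
    if not last_test_passed:
        # A component that failed should be re-tested as soon as possible,
        # so it is treated as due from the day of the failed test.
        return last_tested
    return last_tested + timedelta(days=MAX_INTERVAL_DAYS[category])


def overdue_components(components, today):
    """Return the names of components whose due date is before `today`."""
    overdue = []
    for name, category, last_tested, passed in components:
        if next_test_due(category, last_tested, passed) < today:
            overdue.append(name)
    return overdue


# Hypothetical components: (name, category, last test date, last test passed)
components = [
    ("Payroll system", "critical", date(2013, 1, 15), True),
    ("Call tree", "call_tree", date(2013, 3, 1), True),
    ("Archive storage", "standard", date(2012, 9, 1), False),
]

print(overdue_components(components, date(2013, 6, 7)))
# prints ['Payroll system', 'Archive storage']
```

A list like this could feed the plan and record of testing, so that nothing slips past its required interval unnoticed.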

Standards For Testing The BCP
An annual test of the BCP is required. Segments of this test process can be staged throughout the year to minimize disruption and yet facilitate testing of the BCP. Depending on your testing methodology and organizational strategic plans, your organization can leverage the following types of tests to ensure the effectiveness of the BCP:
  • Process Review Testing – A third party evaluates whether all critical processes for services are addressed.
  • Checklist – Copies of the plan are sent to department and business unit managers to verify and review BCP procedures corresponding to their functional area. This is a simple test and should be used in conjunction with other tests.
  • Structured Walk-through – Team members and other individuals responsible for recovery meet and walk through the plan step-by-step to identify errors or assumptions.
  • Simulation – This is a simulation of an actual emergency. Members of the response team act as they would in a real emergency.
  • Parallel – This is similar to simulation testing, but the primary site is uninterrupted and critical systems are run in parallel at the alternative and primary sites.
  • Full interruption – This test involves all areas of the company in a response to an emergency. It mimics a real disaster where all steps are performed to test the plan. Systems are shut down at the primary site and all individuals who would be involved in a real emergency, including internal and external organizations, participate in the test. This test is the most detailed, time-consuming, and expensive test.

Testing Report
For item reviews and for equipment and procedure testing, a checklist works well to show what was tested and what the results were. The checklist should be prepared in advance. Sampling techniques can be used to review telephone numbers on the critical call list, addresses of individuals, vendors, equipment, employee information, and forms.
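As one way to apply sampling, a reviewer can draw a random subset of entries and verify each one by hand instead of checking the whole list. The Python sketch below assumes the call list is available as a simple list of records; the function name, the sample size, and the entries themselves are hypothetical.

```python
import random


def sample_for_review(entries, sample_size, seed=None):
    """Pick a random subset of entries to verify by hand during a test.

    Passing a seed makes the selection repeatable, which is useful when
    the sample has to be documented in the test report.
    """
    rng = random.Random(seed)
    # Never ask for more entries than the list contains.
    size = min(sample_size, len(entries))
    return rng.sample(entries, size)


# Hypothetical critical call list: (name, telephone number)
call_list = [
    ("BCP Coordinator", "555-0100"),
    ("Recovery Team Manager", "555-0101"),
    ("Alternate Site Vendor", "555-0102"),
    ("Telecom Provider", "555-0103"),
    ("Facilities Manager", "555-0104"),
]

for name, number in sample_for_review(call_list, sample_size=2, seed=2013):
    print(f"Verify {name}: {number}")
```

Each sampled entry can then be added to the prepared checklist with a pass or fail result.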

Test Results
Test results should be reviewed and approved by Management. Tests will be analyzed on the basis of the following criteria:
  • Actual time to complete BCP recovery procedures and steps versus projected time.
  • Elapsed time to perform each activity in a recovery mode.
  • Analysis of the accuracy of each activity and event in the recovery effort.
  • Amount of work completed.
The test should be rated:
  • Satisfactory. Minimal disruption or problems noted; any exceptions would be easily overcome during a real disaster recovery situation.
  • Partially Satisfactory. Certain aspects of the test may have been performed satisfactorily, but any of the following would result in this rating: too many minor errors; confusion during the recovery process; slow recovery time; breakdowns in communications; a need for focused improvement.
  • Unacceptable. Significant problems occurred and the institution is at risk.  This rating reflects some aspect of resumption that did not test well, which in turn may produce problem situations in accomplishing orderly business resumption.
Senior Management should review the test results and note areas for enhancement to the BCP and Recovery Procedures. The BCP Coordinator should maintain a plan and record of testing to ensure that each relevant area of the BCP and Recovery Procedures is tested at least annually.
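The timing criteria above (actual versus projected time, elapsed time per activity, amount of work completed) lend themselves to a simple tabulation. The Python sketch below is a minimal illustration under stated assumptions: the step names and durations are invented, and the thresholds that map an overrun to a rating (10% and 50%), along with the rule that any incomplete step makes the test Unacceptable, are hypothetical. Each organization would set its own criteria, and the rating still requires Management judgment on accuracy and communications.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    name: str
    projected_minutes: float
    actual_minutes: float
    completed: bool


def summarize(results):
    """Compare actual against projected time for a non-empty list of steps."""
    total_projected = sum(r.projected_minutes for r in results)
    total_actual = sum(r.actual_minutes for r in results)
    completed = sum(1 for r in results if r.completed)
    return {
        "total_projected": total_projected,
        "total_actual": total_actual,
        "overrun_ratio": total_actual / total_projected,
        "completed_fraction": completed / len(results),
        "slow_steps": [r.name for r in results
                       if r.actual_minutes > r.projected_minutes],
    }


def rate(summary, minor_overrun=1.10, major_overrun=1.50):
    """Map a summary to a rating. The thresholds are assumptions of this sketch."""
    if summary["completed_fraction"] < 1.0 or summary["overrun_ratio"] > major_overrun:
        return "Unacceptable"
    if summary["overrun_ratio"] > minor_overrun:
        return "Partially Satisfactory"
    return "Satisfactory"


# Hypothetical results from a single recovery test
results = [
    StepResult("Notify personnel", 30, 25, True),
    StepResult("Restore data", 120, 150, True),
    StepResult("Resume operations", 60, 70, True),
]

summary = summarize(results)
print(summary["slow_steps"])  # prints ['Restore data', 'Resume operations']
print(rate(summary))          # prints Partially Satisfactory
```

Here the test ran 245 minutes against a projected 210, an overrun of roughly 17%, which falls in the middle band under the assumed thresholds.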

Trying Times
In trying times such as the ones we are experiencing today, an organization’s ability to get back on its feet quickly and efficiently when disaster strikes is critical to both customer retention and business reputation.  It could mean the difference between success and failure.  Test Thy BCP!
