When so many business processes that determine your
success rely on technology, you'll want to make sure
your IT team remains resilient and failure-free.
During the last decade, the increasing pace of
business—and the growing dependence on IT for its
operations—has forced organizations to invest in
technology to manage and protect their critical
information assets. But keeping today's businesses
operational and resilient requires more than
leading-edge technologies—it requires a significant
and continual investment in the people and processes
that operate and support these technologies.
As dependence increases, the potential for an IT
failure to disrupt business operations becomes a
serious management concern. Organizations must find a
way to reduce exposure to IT risks, decrease costs and
build greater capacity for IT to drive business
innovation.
The Man Behind IT
Despite all the leading-edge technology, it's
ultimately people who are responsible for maintaining
a resilient IT infrastructure and business continuity.
Managing multiple job sites, upgrading or patching
systems, and storing and securing data all take
manpower, which means the root cause of IT failure
frequently lies in process and skills issues.
According to a recent study conducted by Symantec and
researchers at the University of Maryland and MIT, 53
percent of IT failures were linked to process issues
involving asset management, testing, change control
and patching. In addition, more than 40 percent of IT
failures analyzed were tied to gaps in end-user
expertise and product knowledge.
Master the Basics
The solution lies in establishing processes for these
regular and routine tasks. Processes enable
workers to treat all components the same, reducing
the effort and potential risk that would be entailed if
each component were managed differently.
And, with today's often tumultuous workforce turnover,
process is needed to fill in the gaps left in the
knowledge base. In the event of an employee's
permanent absence (or even a temporary one such as
sickness or vacation), a lack of processes can prove
devastating if information is not passed along
accordingly.
A telecommunications carrier recently learned the
value of processes the hard way. Without protocols in
place for rotating and reusing backup tapes, the wrong
backup tapes were erased and prepared for reuse. IT
staff realized they had selected the wrong tapes only
after the tapes had already been cleared. Establishing and
following set processes for rotating backup tapes
would have saved the data, which was never recovered.
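A rotation protocol of this kind can be partly enforced in
software. The following is a minimal sketch, not the carrier's
actual procedure: the Tape record, the retention_days policy and
the erase_tapes function are all hypothetical names introduced
for illustration. It assumes each tape carries the date it was
last written and a retention period, and it refuses an erase
request if any tape on the list is still inside that period.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Tape:
    label: str
    written_on: date      # date of the last backup written to the tape
    retention_days: int   # hypothetical retention policy for this tape


def tapes_safe_to_erase(tapes, today):
    """Return only the tapes whose retention period has fully expired."""
    return [
        t for t in tapes
        if today >= t.written_on + timedelta(days=t.retention_days)
    ]


def erase_tapes(requested, tapes, today):
    """Approve an erase request only if every requested tape is expired.

    Unknown labels are treated as unsafe, so a typo in the request
    blocks the whole batch instead of clearing the wrong tape.
    """
    safe = {t.label for t in tapes_safe_to_erase(tapes, today)}
    blocked = [label for label in requested if label not in safe]
    if blocked:
        raise ValueError(f"Refusing to erase tapes still in retention: {blocked}")
    return list(requested)
```

Rejecting the whole batch, rather than silently skipping the
protected tapes, is a deliberate choice in this sketch: it forces
someone to look at the list again before anything is cleared.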
Processes also must be established for disseminating
information across teams of all types. The size and
geographic location of IT departments often affect
the flow of communication; however, sharing best
practices and lessons learned with cross-functional
groups is vital for increasing productivity and
eliminating further IT headaches.
For instance, take a healthcare provider with three
major sites. After two sites were infected with a
virus, alerting the third site of the pending danger
would have prevented an infection from the same virus
six months later. Why was information not shared? The
answer: no communication processes were in place to
share experiences and learning across locations.
Reacting to the Unexpected
Processes provide two key benefits to IT personnel
responding to incidents. First, established processes
leave behind an audit trail of changes and activities
that can be referred to when determining the source of
a crisis. Second, depending on the needs of each
individual situation, personnel can customize
pre-determined protocols instead of creating new ones
on the fly, saving significant time, effort and
potential for error.
The processes define a checklist of critical tasks to
be performed and questions to be asked, allowing
people to focus their attention on identifying
additional tasks rather than trying to remember all of
the basics. When unexpected events occur, it's nice to
know that certain standards will be kept and staff can
spend time effectively addressing the most critical
and unique elements of the problem.
Recently, a financial institution rolled out a weekend
upgrade to its cluster environment. As the roll-out
progressed, a configuration issue cropped up. Although
the institution had processes in place to roll out an
upgrade to its environment, there was no protocol to
follow for an unsuccessful roll-out. The problem was
compounded because the key systems architect was on
vacation at the time. Recognizing the potential for
problems with the upgrade would have enabled the
institution to better prepare for and respond to the
issue by having the resources available to support
problem resolution in a timely manner.
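The missing piece in this example, a defined path for an
unsuccessful roll-out, can be sketched in a few lines. This is an
illustration under stated assumptions, not the institution's
procedure: run_upgrade, its arguments and the shape of the
returned report are all hypothetical. The sketch assumes each
upgrade step is a callable that raises an exception on failure,
that a rollback callable exists, and that a list of escalation
contacts is checked before work begins.

```python
def run_upgrade(steps, rollback, escalation_contacts):
    """Apply upgrade steps in order; on any failure, roll back and report.

    steps: list of (name, callable) pairs, run in order
    rollback: callable that restores the previous configuration
    escalation_contacts: people available to help if the upgrade fails
    """
    # Do not start at all if nobody is available to resolve a failure.
    if not escalation_contacts:
        raise ValueError("No escalation contact available; do not start the upgrade")

    completed = []
    for name, action in steps:
        try:
            action()
            completed.append(name)
        except Exception as exc:
            # The failure path is part of the process, not an afterthought.
            rollback()
            return {
                "status": "rolled_back",
                "failed_step": name,
                "completed": completed,
                "notify": list(escalation_contacts),
                "error": str(exc),
            }
    return {"status": "ok", "completed": completed, "notify": []}
```

The returned report doubles as a small audit trail: it records
which steps completed, which one failed and who should be told.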
More than Words on a Page
Even when processes are in place, organizations
struggle with getting IT staff to follow established
procedures. Unfortunately, pages of notes or thick
binders with step-by-step processes for handling
routine or crisis situations will not guarantee
success.
For many IT departments, processes for handling change
are either not comprehensive enough or organizations
do not have the right pieces in place to keep them
resilient. In fact, of the 53 percent of cases caused
by process issues, 11 percent were due to poor
execution rather than poor or missing processes.
Although there are no processes that can adequately
address all incidents, ITIL and Six Sigma practices
provide solid starting frameworks and disciplines to
implement and reliably utilize processes in a variety
of circumstances. Adopting such practices will also
help mitigate many incidents.
While processes can play important roles in handling
unexpected events and ensuring mistakes don’t happen,
it's people who help ensure the right steps occur.
For example, a recent study conducted by IDC showed
that well-trained teams were twice as likely to
properly protect their PCs from security threats and
were 60 percent more likely to successfully complete
backup jobs. With more than 40 percent of IT failures
stemming from a lack of IT staff skill and
training, the need for proper instruction is evident.
It All Comes Down to Culture
Part of creating a resilient infrastructure is
building a high-performance culture that can manage
change effectively. In addition to training, holding
IT staff to the highest operational standards, such as
those held by other critical business operations
within a company, will help streamline the
implementation of proper procedures. Much like the
manufacturing industry, which tolerates little or no
downtime, IT organizations should strive to minimize
their tolerance for downtime by adhering to
stricter policies and procedures.
To make this paradigm shift successfully,
organizations should do the following:
• Recognize the value and need for investing in
training, certification and expertise amongst staff.
• Provide a Six Sigma-like level of attention to IT
operations around process definition, documentation,
performance measurement, and continuous improvement.
• Focus on understanding the true root cause of issues
rather than settling for convenient explanations,
separating near-term incident management from longer-term
problem management.
• Recognize warning signs and learn from near misses.
Become preoccupied with small failures as a signal of
deeper process or skills issues that should be
addressed before larger failures occur.
• Build a culture of resilience so that everyone in
the organization can react appropriately when
inevitable problems occur.
Although there will never be a process for every
situation, IT teams can eliminate the root cause of
failures—and identify the cause of failures more
easily—by establishing and following a standard set of
protocols and equipping people with the knowledge to
manage and adapt them properly. Only then can
organizations build a culture and skill set that
addresses the issues standard protocols cannot.
Bob Yang, senior director, Symantec Services. Catherine
Anderson, Smith School of Business, University of
Maryland. George Westerman, Center for Information
Systems Research, MIT Sloan School of Management.