The hallmarks of high availability

Mark V. Converti, Honeywell Process Solutions, USA

In the competitive power markets of today, power generation facilities are getting closer and closer to 100 per cent uptime. Companies that lead share a common trait: investment in the right technologies to ensure that the goals of high availability and reliability can be met.

The nature of global business in the 21st century is forcing many companies to operate around the clock if they want to remain competitive. Crucial to this operation are the people, information technology and Internet-based applications that support sales, service and product delivery. All of this activity requires power: sufficient, reliable power supplied by utility companies. Therefore, power generation facilities and the distribution grids that deliver power must be even more reliable, with the only acceptable goal being 100 per cent uptime.

Driving uptime

Power generation plants that maximize uptime have certain characteristics: safe and reliable operation, minimal forced outages, maximum availability and responsiveness to the needs of the power grid, while achieving the lowest operating costs possible. The technologies of the 21st century can assist greatly in improving and maintaining these characteristics.

The factors that work against uptime are also well defined: operator errors, an inability to access critical information when needed, and equipment-related problems including leaks, hidden blockages, failures, defects, and gradual or undetected wear and tear. Any of these factors can result in decreased operating levels or trips, which can then lead to unplanned plant outages. Even after problems are identified and fixed, the time required to get back online and up to maximum load can significantly extend the length of an outage.

Figure 1. On August 14, 2003, when the lights went out from Ohio to New York City and into Canada, a host of weaknesses in the North American grid were revealed. The blackout brought to light some clear lessons for continuous process industry plants everywhere about preventing and responding to catastrophic events.

While significant improvements have been achieved through the conversion of electro-mechanical, analogue and hydraulic systems to centralized digital control systems, these advances alone are no longer enough in some of today’s competitive power markets. A modern control and information system needs to anticipate problems, and make information easy to access, embed and distribute.

Here are two examples:

  • A power plant in the northeastern United States implemented modern systems and applications that include automatic runback and rundown procedures embedded in the unit control schemes. During the blackout that occurred in August 2003, these systems and applications enabled the plant to stay running, albeit at reduced levels, while others were forced to shut down.
  • Another power plant in the southern United States installed an asset management solution that included a site-wide performance monitor. This enabled it to find a condenser leak that was unknown to the control system, and had kept the plant from achieving maximum load. Left undetected, this problem would have eventually led to an unplanned outage.

Effective assets

Safe and reliable operation involves keeping the plant equipment running smoothly, within specifications and minimizing the stress or neglect that would cause gradual degradation or sudden failure. Solving this problem starts with managing assets more actively and analysing the performance of critical equipment over time to understand the effects of operational changes, whether planned or unplanned.

New asset management applications have now made it possible to monitor the performance of not only the sensor inputs and final control elements (valves, pumps, motors, etc.) but also condensers, feedwater heaters, coal pulverizers, turbine generators and even the power boiler itself (see sidebar, “A tale of two companies”). Once the fault modes are better understood, improvements can be made to the control loops and startup/shutdown procedures that operate plant equipment. Multivariable, model-based control algorithms can be used on critical loops; in other cases, simply ‘tightening’ the control schemes to reduce variability may be enough to minimize stress and extend equipment life.

Figure 2. These two graphs show the desired effects of using technology that helps avoid abnormal situations. Over time, a more efficient operation will result, and unplanned events and other losses will be avoided

Protection from the effects of power grid upsets can also be ‘built in’ to unit control schemes, providing runbacks and rundowns, and other means of falling back to safe operating levels. Automatic routines can be added to generating unit controls that speed plant restart, and the return to maximum load and profitable online operation.
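As a rough illustration of what a ‘built-in’ runback might look like in logic form, the Python sketch below steps a unit's load target down towards a safe fallback level while a grid upset is flagged. The function name, the frequency band, the fallback load and the ramp rate are all hypothetical values chosen for illustration; real unit control schemes are implemented in the control system itself and tuned for each plant.

```python
def runback_target(current_mw, grid_hz, nominal_hz=60.0, band_hz=0.5,
                   safe_mw=150.0, ramp_mw_per_step=25.0):
    """Return the next load target (MW) for one control step.

    All thresholds are illustrative assumptions, not vendor or plant settings.
    """
    # Treat a frequency excursion outside the band as a grid upset.
    upset = abs(grid_hz - nominal_hz) > band_hz
    if not upset or current_mw <= safe_mw:
        return current_mw  # hold load: no upset, or already at/below the safe level
    # Ramp down one step, but never below the safe fallback level.
    return max(safe_mw, current_mw - ramp_mw_per_step)


# Example: one normal reading followed by three upset readings.
load = 400.0
for hz in [60.0, 59.2, 59.1, 59.3]:
    load = runback_target(load, hz)
# load is now 325.0 (three 25 MW steps down from 400 MW)
```

The point of the sketch is that the unit falls back in controlled steps and stays online at reduced output, rather than tripping.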

A crucial component of achieving maximum uptime is ensuring that the experience and knowledge of the best plant operators is captured, easy to find, and available to all the operations staff whenever it is needed. Modern control systems make it easy to embed this information right where it is needed: in the system alarm and event summaries, point displays, trends, control schemes, custom graphics displays and the engineering tools used to build them. Context-sensitive help and powerful online search tools speed access to information, and eliminate the need to search books, manuals and paper drawings when troubleshooting problems.

Getting there

If a power generation company sets the goal of 100 per cent uptime, how should it evaluate its current operation? What approach should it take?

First, baseline measurements should be taken, using data scouts and the plant-wide historian, to determine the current status. Original equipment specifications may also be used. The baseline can be automatically compared to actual operation over time with asset management software, providing an early alert to operators. It is important to do this site-wide, encompassing as much of the process equipment as possible, rather than just focusing on a few control elements such as valves or pumps. Otherwise, key performance trends will be missed. With site-wide coverage, operators will be notified if performance degrades and will know which equipment to inspect and schedule for a retrofit, long before a breakdown occurs.
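The baseline comparison described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name, the equipment names, the indicator values (unitless placeholders) and the 5 per cent tolerance are assumptions made for the example, and commercial asset management software will use far richer models.

```python
from statistics import mean


def degraded_equipment(baseline, readings, tolerance=0.05):
    """Return {equipment: fractional deviation} for items outside tolerance.

    baseline:  {equipment: expected value of a performance indicator}
    readings:  {equipment: list of recent measured values}
    tolerance: allowed fractional deviation (5% is an arbitrary example)
    """
    alerts = {}
    for name, expected in baseline.items():
        recent = readings.get(name)
        if not recent:
            continue  # no data for this item; skip rather than guess
        deviation = (mean(recent) - expected) / expected
        if abs(deviation) > tolerance:
            alerts[name] = round(deviation, 3)
    return alerts


# Hypothetical site-wide baseline and recent historian readings.
baseline = {"condenser": 100.0, "feedwater_heater": 50.0}
readings = {"condenser": [91.0, 90.0, 89.0], "feedwater_heater": [50.5, 49.5]}
alerts = degraded_equipment(baseline, readings)
# alerts == {"condenser": -0.1}: the condenser is running 10% below baseline
```

Because the check loops over every item in the baseline rather than a hand-picked few, it reflects the site-wide approach recommended here.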

Adoption of wireless and automated approaches to calibration and regular maintenance rounds will help feed more useful data to the asset manager. Plant studies can help determine the changes to personnel and work flow required to improve uptime. A modern control and information system that embeds and accesses critical knowledge can speed troubleshooting efforts. For such a system to work properly, time must be taken to collect useful expert knowledge and make it available online. The initial data collection task can be assisted with tools provided by the system itself, in concert with the vendor. As time goes on, the operations staff can add more data incrementally.

The importance of regularly training operations staff should not be underestimated. And, here again, technology can lend a hand: a high-fidelity process simulator can assist in keeping operator skills consistent and relevant (see sidebar, “A tale of two companies”).

Benefits outweigh the costs

The value of uptime is so high these days that any investment toward the achievement of this goal pays a high return. The costs involved may include process equipment retrofits, purchase of additional control system hardware, measurement and control devices, application software and services. But by making these investments, you’ll extend the life of existing equipment and systems and make operator and maintenance personnel more effective, with the net result of getting closer to 100 per cent uptime.

A tale of two companies

Here’s how two USA-based power companies are making investments in technology with an eye to increasing plant uptime.

Coal Creek Station, Great River Energy, North Dakota, USA

Coal Creek Station is using a custom high-fidelity simulator to train employees on all major pieces of equipment, such as the generator and electric power systems, condensate and feedwater heater (LP and HP) systems, turbine-driven variable speed boiler feed pumps, and the vacuum pump system.

“Coal Creek Station units are reliable and usually base loaded, which minimizes operator ability to operate under abnormal or routine shutdown/startup conditions,” notes Glen Mueller, Coal Creek Station training specialist, Great River Energy.

Mueller says the need for a high-fidelity operator-training simulator was determined by evaluating operator performance over a five-year period. Four operator factors were identified and used to justify the simulator:

  • Shorten post-trip startup times
  • Shorten post-outage startup times
  • Reduce operator-related trips by one per year
  • Improve heat rate by 10 Btu/kWh

Texas Genco’s W.A. Parish Electric Generating Station, Thompson, Texas, USA

The W.A. Parish plant comprises four coal fired steam units with total net generating capacity of 2462 MW and five gas fired units totalling 1191 MW. Texas Genco’s central maintenance division is responsible for overhauling steam turbines. Specialists monitor a variety of process variables during turbine startup, and the work requires tools for monitoring alarms, trends and other indicators.

Like other power generation companies, Texas Genco is seeking a proactive maintenance strategy to improve management of valuable plant assets. Instead of reacting to equipment damage and following up with root cause analysis, predictive maintenance allows for planned service of field equipment and minimizes the need for unit shutdowns. Due to staff reassignments, personnel working in different plants need tools that assist in process and equipment monitoring.

The facility participated in a two-part pilot project involving Honeywell’s Workcenter PKS and Asset Max Alert Manager. According to Charles C. Longcoy, senior power generation specialist, the systems are enabling personnel to monitor key performance indicators and other production-related data from different data sources. In addition, employees have immediate access to relevant information to facilitate collaborative decision-making and teamwork.

“Improved asset management strategies may reduce downtime and associated costs,” Longcoy says. “Workcenter PKS tools look at process variables and organize data so that personnel can make better decisions regarding operations; the Alert Manager solution increases knowledge of equipment health and performance, allowing repair work to be performed during planned outages.”

Lessons from the North America blackout

Lane Desborough, Honeywell Process Solutions, USA

On August 14, 2003, when the lights went out from Ohio to New York City and into Canada, a host of weaknesses in the massive grid that powers North America were revealed. The blackout also brought to light some clear lessons for continuous process industry plants everywhere about preventing, preparing for and responding to catastrophic events.

Get control of your alarm system

Authorities are still reviewing records of the thousands of alarms and events that occurred up and down the power grid to determine exactly what caused the outage. But whatever the cause, it’s clear that non-functioning alarm systems, alarm floods and ineffective operator responses exacerbated the situation. The lesson: don’t wait until disaster strikes to test your alarm system.
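One simple way to test for alarm floods ahead of time is to replay historical alarm logs through a rate check. The Python sketch below flags the moments at which the number of alarms in a trailing window exceeds a threshold. The function name and the sample data are hypothetical, and the default of ten alarms in ten minutes is used here only as an often-cited rule of thumb in alarm management practice; it should be treated as an adjustable assumption, not a requirement drawn from this article.

```python
from collections import deque


def flood_periods(alarm_times, window_s=600, threshold=10):
    """Return the timestamps (in seconds) at which the number of alarms in
    the trailing window exceeds the threshold.

    alarm_times must be sorted in ascending order.
    """
    window = deque()
    flagged = []
    for t in alarm_times:
        window.append(t)
        # Drop alarms that have aged out of the trailing window.
        while window[0] <= t - window_s:
            window.popleft()
        if len(window) > threshold:
            flagged.append(t)
    return flagged


quiet = [i * 120 for i in range(12)]  # one alarm every two minutes
burst = [i * 5 for i in range(12)]    # twelve alarms within one minute
# flood_periods(quiet) == []
# flood_periods(burst) == [50, 55]
```

Running a check like this against archived alarm and event data gives an early indication of whether operators would be overwhelmed during an upset.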

Create a framework for agile decision-making

The ability to make fast, informed decisions in the face of market-impacting events is critical to profits. Even if they weren’t directly affected by the blackout, many companies still had to quickly assess the impact on their raw materials supplies, maintenance schedules, customers and market opportunities, and act accordingly. Systems and software that support business agility by connecting the control room to the maintenance shop to the front office would have enhanced decision-making speed and accuracy.

Assume it could happen to you, and prepare

It’s human nature to assume that the catastrophe will happen ‘somewhere else’ or to ‘the other guy’. But the lesson here is to heed industry best practices: put the measures in place to prevent abnormal operating situations from spiralling out of control, implement safety instrumented systems to mitigate the damage of catastrophic events, and use decision support systems to make the most of market transitions.