Software Pipelines: A New Approach to High-Performance Business Applications
From Software Pipelines Alliance
Abstract
As organizations continue to accelerate growth, enterprise applications are required to support dramatically higher business volumes. Business applications are consistently required to run faster, support more users, or process more transactions. In the past, technology professionals have relied on faster processors to meet performance goals. However, recent advances in hardware technology use multi-core architectures that cannot be leveraged by most existing enterprise software applications as they are not multi-threaded.
The high performance challenge is particularly daunting for organizations that also require their systems to be flexible in order to support changing business requirements. Many development teams have adopted SOA development models to deliver flexible and agile application components, but frequently have not been able to deliver the performance and scalability that the enterprise requires. Flexibility has come at the expense of lower application performance.
Performance of service-oriented business applications can be dramatically increased using Software Pipelines, a methodology that provides a highly-scalable, flexible paradigm for implementing concurrent processing. This white paper explains how Software Pipelines can deliver a sensible and practical approach to concurrent processing that enables demanding business applications to be scaled to meet growing needs without giving up the flexibility and agility that is so important in today's business environment.
Introduction
In today’s information-based economy, application software performance can literally mean the difference between success and failure of the business. Because employees and customers are so dependent on business applications to conduct their business transactions, poor application performance can directly impact the bottom line. Performance issues in applications such as banking systems, trading systems, call center operations, or reporting services can lead to severe consequences and even threaten the viability of the company. Sustained performance problems can lead to damages such as lost revenue, poor customer satisfaction, a negative image in the marketplace, and eventually, an inability to compete.
Performance of critical business applications has always been a key concern for the professional developer, and throughout IT history there have been numerous solutions, architectures, and approaches offered to address performance and scalability issues. While the promise of SOA surpasses that of previous IT architecture trends in terms of flexibility and adaptability, it has not gained a reputation for high performance. In fact, with service-oriented application development now at the forefront, performance is gaining even more attention due to several factors:
- Service-oriented architecture concepts are generally perceived as putting significant additional performance demand on systems as compared to earlier monolithic or tightly-coupled systems
- The very notion of loosely-coupled “services” implies a messaging-centric approach to application development, meaning that applications must now handle not only traditional processing logic, but also message transmission, validation, interpretation, and generation
- As the use of service-oriented concepts proliferates, messaging volume is expected to explode, adding tremendous load with potentially adverse performance implications to existing IT systems
- Today’s multi-core chip architectures do not automatically deliver higher performance to existing business applications, and concurrent processing techniques may be required to gain the full benefit of new hardware
The performance issues related to service-oriented development that organizations will experience over the next 12 – 24 months are expected to be similar in nature to the growing pains of previous software architectures when they were “new.” Looking back through the past 20 years, the shake-out period associated with each new major paradigm shift in software development caused significant performance-related issues, generally continuing through the first 1 – 3 years of wide adoption. The learning curve developers faced often resulted in deployed applications that failed to meet performance expectations and, in many cases, resulted in the outright failure of entire application development projects.
Today's IT Challenges
As organizations continue to accelerate growth, business applications are consistently required to run faster, support more users, or process more transactions. With limited budgets and no easy way to expand data center capacity, IT executives are faced with the increasingly difficult challenge of squeezing more performance and greater utilization out of their existing IT systems and applications — all while delivering greater flexibility so that business systems can be quickly adapted to rapidly changing business needs.
These key challenges facing today’s IT executives are driving the need to:
- Find faster and more efficient ways to process rapidly growing volumes of data without increasing capital expenditures
- Build flexibility into business systems so that new services can be provisioned quickly in order to capitalize on new market opportunities or comply with emerging regulatory requirements and industry mandates
- Quickly expand or shrink the allocated resources for IT services in order to respond to spikes in user demand
- Drive down IT costs by improving system utilization and IT efficiency
In the past, application developers have been able to rely on rapid advances in CPU performance to compensate for the lack of software efficiency in their business applications. With processor performance roughly doubling every 18 months or so, an upgrade of the hardware environment often provided enough performance boost to keep up with the growing need for application throughput.
More recently, gains in CPU clock speed have hit a plateau due to physical factors such as power consumption, heat, and quantum mechanics. Current trends in hardware platforms have thus shifted the focus to multi-core or multi-threaded architectures. Distributed service-oriented applications, by their nature, will take advantage of multi-CPU and multi-server architectures. However, for software applications to truly take advantage of multi-core platforms, they must be designed and implemented with a new approach that emphasizes concurrent processing. This new approach is a methodology called Software Pipelines and it can enable businesses to achieve the benefits of concurrent processing without a major redevelopment effort.
The remainder of this white paper provides some historical perspective of concurrent processing and explains how the Software Pipelines methodology can deliver dramatic performance gains for business applications while protecting investments in existing business logic and enabling the flexibility of a service-oriented architecture.
Concurrent Computing and Business Applications
It is apparent that concurrent computing is required to accomplish the next level of performance needed by many of today’s critical business applications. In order to accomplish dramatic multiples of performance, applications must perform more than one task at a time. In other words, they must utilize a concurrent computing approach. While this fact is obvious, it is not easy to accomplish with today’s business applications. Even service-oriented applications, which are already distributed in nature, generally use a serial approach to processing business logic so that the proper order of execution is maintained. It can be difficult to decompose the business logic of the application into a series of steps, some of which can then be run concurrently to obtain improved performance.
Historically, the computer science field has performed extensive research and developed many techniques to accomplish concurrent architectures. Yet, the focus of past research and development concentrates on specific areas that do not easily lend themselves to the transactional applications of today’s business systems. Therefore, while the need for substantial performance improvements of business applications clearly exists and is becoming more pronounced due to the move to a service-oriented approach, existing concurrent processing techniques are either limited in applicability or are too complex for adaptation to the typical business transaction system.
There are three main approaches that have been developed for concurrent computing:
- Mechanical solutions at the operating system level such as Symmetric Multi-Processing (SMP) and Clustering techniques
- Automated network routing solutions, such as “round-robin” distribution of requests
- Software-controlled Grid Computing
Mechanical solutions at the operating-system level have no doubt benefited many organizations to date by providing a generic one-size-fits-all approach to concurrent computing. SMP solutions automatically split running application tasks among multiple processors on a single physical computer, sharing memory and other hardware resources. This approach is highly efficient and easy to implement, as the application developer needs no specific, detailed knowledge of how the SMP divides the workload. For SMP to be effective, however, a software application must be written using multi-threaded logic. This is a tricky task that is not generally practiced by corporate IT developers. Furthermore, the tight sharing of resources between processors is both limiting in terms of performance and problematic when applied to business application needs. Shared resources become a bottleneck at some level of scalability, as the necessary locking of resources in this type of system is not optimized for any particular application. Therefore, a given application may scale well to 8 processors, but benefit very little from applying 16 processors to the problem. In addition, resource contention (such as shared software components) can be very difficult to debug in this type of black-box approach.
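The diminishing return from 8 to 16 processors can be illustrated with Amdahl's law, a standard rough model that the text above does not itself name. The sketch below assumes, purely for illustration, that 80% of the work can run in parallel and the remaining 20% is serialized (for example, behind locks on shared resources); real contention behaves less tidily than this.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Ideal speedup under Amdahl's law for a given parallel fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


# Illustrative assumption: 80% of the workload is parallelizable.
speedup_8 = amdahl_speedup(0.8, 8)
speedup_16 = amdahl_speedup(0.8, 16)

print(f"8 processors:  {speedup_8:.2f}x")
print(f"16 processors: {speedup_16:.2f}x")
```

Under this assumption, doubling the processor count from 8 to 16 raises the ideal speedup only from about 3.3x to 4x, because the serialized portion dominates.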
Automated network routing solutions divide application requests using some type of predetermined logic. A common approach is “round-robin” routing, where requests are evenly distributed, one after the next, among a set of physical computers that provide exactly the same application functionality. A good example and use case for this type of concurrent architecture is a Web server application, wherein each Web page request is delivered to one of several available processors. While the approach can be useful, it is highly limited as the router has no concept or logic for determining the best route for a given request, and all downstream processors perform the identical processing tasks.
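A minimal sketch of round-robin routing follows. The request and server names are hypothetical; the point is that the router hands out requests in strict rotation and never inspects their content.

```python
from itertools import cycle


def round_robin(requests, servers):
    """Assign each request to the next server in turn, ignoring request content."""
    next_server = cycle(servers)
    return [(request, next(next_server)) for request in requests]


# Hypothetical page requests and identical Web servers.
assignments = round_robin(
    ["req-1", "req-2", "req-3", "req-4", "req-5"],
    ["web-a", "web-b", "web-c"],
)
for request, server in assignments:
    print(request, "->", server)
```

Because the assignment depends only on arrival position, two requests for the same account can land on different servers, which is why this technique cannot by itself preserve per-account ordering.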
Clustering techniques are also widely used, allowing separate physical computers to share the workload of an application over a network. Clustering provides some capabilities for automatic concurrent processing and is also used to support fail-over and redundancy. In this scenario, redundant resources are replicated across the network, a highly inefficient approach. Because clustering techniques are automated, they must either copy everything from one node in a cluster to another whenever a change in state occurs, or rely on a centralized resource (such as a relational database), which can become an even more serious bottleneck.
All of these techniques have their use, yet are limited when it comes to massive scalability, particularly when considering the needs of transaction-based, message-oriented applications. In essence, they can only scale mechanically and automatically to a certain level, at which point the overhead of maintaining shared or redundant resources becomes more of a burden than the resulting performance improvement.
Grid computing, on the other hand, is used to achieve far greater scalability by distributing discrete tasks across many machines in a network. In a grid computing environment, it is left to the application developer to decide how best to divide a single large task into many smaller sub-tasks, utilize the grid environment to distribute the processing, and then re-assemble the results once processing is complete. The typical grid architecture includes a centralized task scheduler for distributing and coordinating the tasks with other computing facilities across the network. It has been shown that a grid approach can deliver far higher throughput than the automated approaches described earlier, yet such an approach puts a significant burden on the developer.
However, and most importantly, grid computing has been modeled primarily to solve the “embarrassingly parallel” problem: long-running, computation-intensive processes typically found in scientific or engineering applications. Typical and productive examples of grid computing applications are modeling fluid dynamics, tracing the human genome, and complex financial analytics simulations. Each of these application areas shares the characteristic of spreading a massive, long-running computation across multiple nodes by dividing the problem into smaller, similar tasks whose computational resource needs tend to be predictable.
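The divide, distribute, and re-assemble pattern described above can be sketched as follows. This is only an illustration: a local thread pool stands in for the grid's task scheduler and nodes, and the function names and the sum-of-squares workload are assumptions chosen because the sub-tasks are uniform and independent.

```python
from concurrent.futures import ThreadPoolExecutor


def split(numbers, chunk_size):
    """Divide one large task into equally sized sub-tasks."""
    return [numbers[i:i + chunk_size] for i in range(0, len(numbers), chunk_size)]


def sum_of_squares(chunk):
    """The computation each (simulated) grid node performs on its sub-task."""
    return sum(n * n for n in chunk)


def run_on_grid(numbers, chunk_size=250, workers=4):
    chunks = split(numbers, chunk_size)
    # The pool plays the role of the centralized scheduler handing out sub-tasks.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    # Re-assemble the partial results once all sub-tasks are complete.
    return sum(partials)


total = run_on_grid(list(range(1000)))
print(total)
```

Note that the sub-tasks here need no ordering and share no state, which is exactly the property that most business transactions lack.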
Unique Requirements of Business Applications
There are three primary reasons that business applications do not lend themselves to these traditional approaches to concurrent processing.
- Order of processing is critical
Business logic must be performed in a specific sequence to ensure the integrity of the business process. In many cases, applications implement a “first in/first out” (FIFO) queue by waiting for each transaction to be completed before the next one in the queue is processed. For example, a billing application cannot compute the total cost of a bill before it has looked up the rates that apply to the customer and has computed sub-totals for each different category of services. For a mobile phone bill, the business logic would need to know the total daytime, evening, and weekend minutes before it could compute the total bill. This order of processing is difficult to maintain in a grid computing environment. And while SMP systems are designed to ensure order of execution, unless the application is written with multi-threaded logic, there can be significant performance problems when the volume of transactions reaches a critical point.
- Centrally shared resources create bottlenecks
Business applications almost always involve a database or other centralized resource that is used throughout the application. In a typical concurrent processing environment such as an SMP server or a grid infrastructure, the centralized resource presents a bottleneck that limits application throughput. Resource contention eventually creates a performance problem if transaction volumes continue to increase.
- Unpredictable behavior and resource needs
As compared to a massively parallel scientific application, business applications are much less predictable in their behavior and their system resource needs. The size and processing requirements of business workloads can vary greatly throughout the day or even within a given hour. This not only makes it more difficult to divide a business application into equally sized components in terms of processing time required, but it also means that allocation of resources must be flexible enough to dynamically respond to the resource requirements of each component.
The primary work to date in concurrent computing has been concentrated on either mechanical solutions that offer limited scalability, or “embarrassingly parallel” grid-based scientific and engineering applications that lie outside the business domain.
What is needed is a new, simpler way to implement concurrent computing for business applications that can be easily implemented by the professional developers within business IT organizations and can enable business applications to take greater advantage of the concurrent processing capabilities in today's computing platforms.
The Solution is Software Pipelines
Software Pipelines offer a new concurrent processing methodology that provides a simple way for business applications to implement concurrency while maintaining order of execution priorities and simplicity of application development.
Software Pipelines represent an architectural approach that supports virtually unlimited peer-to-peer scalability, supporting the decomposition of business processes into specific tasks which are then executed concurrently. Software Pipelines also provide an easy method for the developer to control the exact distribution and concurrent execution of various tasks, or components of a business process. Workloads can be balanced across the resources within a single server or across a multitude of servers. This approach is specifically designed for business applications, particularly those that use, or can use, a service-oriented architecture.
The Software Pipelines architecture is designed to handle a high-volume stream of transactions, both large and small, and is thus ideal for mixed-workload business application processing. It is based on the use of multiple “pipelines” that each execute a portion of a business application process.
The fundamental component of the Software Pipelines architecture is the pipeline, which is defined as follows:
Pipeline — A logical execution facility for invoking the discrete tasks of a business process in an order-controlled manner. Ordering is controlled based on priority, order of message input (e.g., FIFO), or both.
Within a business application, Software Pipelines can be used to group transactions or business logic for which order of execution or priority must be preserved. For example, each customer of a bank might be associated with a specific pipeline. The pipeline would then execute all computations and transactions that relate to the specific customer and preserve the order of execution for transactions that relate to that customer. Other customer transactions could then be executed on different pipelines, which could process those transactions without regard to the order of the first customer's transactions. For customer billing transactions there is no concern about whether one customer's transactions are completed before another's. Different customer bills can be easily processed concurrently on separate pipelines while transactions that relate to a single customer bill are processed sequentially within the same pipeline.
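The pipeline definition above can be sketched as a small class. This is a hypothetical illustration, not an API from the Software Pipelines methodology: the class name, the `submit` and `run` methods, and the convention that a lower number means higher priority are all assumptions. Tasks run by priority, and in FIFO order among tasks of equal priority.

```python
import heapq
import itertools


class Pipeline:
    """Hypothetical pipeline: runs tasks by priority, FIFO within equal priority."""

    def __init__(self, name):
        self.name = name
        self._heap = []
        self._arrival = itertools.count()  # monotonically increasing arrival number

    def submit(self, task, priority=0):
        # Assumption: lower number means higher priority.
        # The arrival number breaks ties so equal-priority tasks stay FIFO.
        heapq.heappush(self._heap, (priority, next(self._arrival), task))

    def run(self):
        results = []
        while self._heap:
            _, _, task = heapq.heappop(self._heap)
            results.append(task())
        return results


# One pipeline per customer keeps that customer's transactions in order.
pipeline = Pipeline("customer-42")
pipeline.submit(lambda: "deposit 100")
pipeline.submit(lambda: "withdraw 30")
pipeline.submit(lambda: "fraud hold", priority=-1)
order = pipeline.run()
print(order)
```

Several such pipelines, one per customer or group of customers, could then run side by side without coordinating with each other.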
Controlling Pipeline Flow with Pipeline Distributors
To implement the Software Pipelines approach, a pipeline distributor is needed for sorting service requests from the business application into their appropriate pipelines and for balancing the load across multiple Software Pipelines. A pipeline distributor is co-located with a pool of pipelines, and effectively front-ends incoming service requests as shown in Figure 1. A pipeline distributor routes service requests by evaluating message content. Requests are routed based on configuration rules which can be easily modified without changing individual business services. Configuration rules can be established and modified to distribute workloads and to optimize throughput via concurrent processing according to priority, order of input (FIFO), or both.
This design approach enables scalability in two dimensions. Additional pipelines can be added under a given pipeline distributor, and when a pipeline distributor has as many pipelines as it can effectively manage, more pipeline distributors can be added as well. Each new pipeline distributor can then serve additional pipelines as necessary. When more pipelines are added to a system, the capacity for managing additional transaction volume grows proportionally. Workloads can also be easily moved between pipelines to avoid bottlenecks.
Figure 1. Software Pipelines are front-ended by a pipeline distributor that routes service requests.
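A minimal sketch of a content-based pipeline distributor is shown here. All names (`PipelineDistributor`, `route`, `dispatch`, the message fields, and the pipeline names) are hypothetical. Routing rules are held as data, separate from the business services, so they can be changed without touching service code.

```python
from collections import deque


class PipelineDistributor:
    """Hypothetical distributor: routes messages to named pipelines by content."""

    def __init__(self, rules, default):
        self.rules = rules        # ordered list of (predicate, pipeline_name)
        self.default = default    # pipeline used when no rule matches
        self.pipelines = {}       # pipeline_name -> FIFO queue of messages

    def route(self, message):
        for matches, name in self.rules:
            if matches(message):
                return name
        return self.default

    def dispatch(self, message):
        name = self.route(message)
        self.pipelines.setdefault(name, deque()).append(message)
        return name


# Illustrative rules: wire transfers get their own pipeline; everything else
# is split by customer ID so each customer always lands in the same pipeline.
rules = [
    (lambda m: m["type"] == "wire", "priority"),
    (lambda m: m["customer_id"] % 2 == 0, "pool-even"),
]
distributor = PipelineDistributor(rules, default="pool-odd")
first = distributor.dispatch({"customer_id": 7, "type": "deposit"})
second = distributor.dispatch({"customer_id": 8, "type": "deposit"})
third = distributor.dispatch({"customer_id": 7, "type": "wire"})

# Re-tuning the workload is a configuration change, not a code change.
distributor.rules.insert(0, (lambda m: m["customer_id"] == 8, "vip"))
fourth = distributor.dispatch({"customer_id": 8, "type": "deposit"})
print(first, second, third, fourth)
```

A production distributor would also need to drain a pipeline before moving its traffic elsewhere, so that ordering is not disturbed during a rule change; that step is omitted here.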
Combining Simplicity and Flexibility for Developers
One of the major advantages of the Software Pipelines approach is that it is simple to implement. Multi-threading an application can require that developers dissect the application into discrete components that can be run concurrently without disturbing the business logic. With Software Pipelines, however, the business logic can remain intact and the sorting into pipelines can be based on unique identifiers such as customer ID which are often already maintained in the business logic.
In addition to their simplicity, Software Pipelines offer a lot of flexibility to developers. They provide the freedom for developers to:
- Implement concurrency only in the performance-critical portions of the business logic
- Enforce FIFO only where required
- Control distribution of system resources to help maximize utilization
Software Pipelines allow developers to maximize utilization of computing resources by controlling the throughput and distribution of concurrent tasks. The business rules that are used by pipeline distributors to sort service requests can be easily modified to tune performance and redistribute workloads. Pools of pipelines can also be allocated to a specific hardware resource and can be easily moved to take advantage of new hardware resources as they are added to the computing infrastructure.
A Pipelines Example
Consider the following simple example. A distributed network of banking automated teller machines (ATMs) must access a centralized resource to process account-related transactions. The back-end centralized computing facility for this application is an ideal candidate for concurrent Software Pipelines, as transaction volume is highly variable, response times are critical, and enforcement of key business rules is essential. The business requirements are obvious enough:
- Ensure that each account transaction is performed by an authorized individual
- Ensure that the transaction is valid (e.g., there are enough funds in an account to support a requested withdrawal)
- Ensure that multiple transactions on a given single account are guaranteed to be performed sequentially (i.e., FIFO is mandatory, preventing a customer from overdrawing an account via near-simultaneous transactions)
First, consider the traditional design of a monolithic, tightly coupled, centralized software component.
[Figure 2]
The simplicity of this design has several benefits:
- It is very easy to implement
- All business rules are in a single set of code
- Sequence of transactions is guaranteed
However, it is obvious that such a design results in every user transaction having to wait for any previous transaction(s) to complete. If volume scales dramatically, as in peak periods, and the input flow outstrips the capacity of this single software component to handle the load, a lot of customers will be waiting for transactions to process. All too often, waiting customers result in lost customers, an intolerable condition for a successful bank.
Using Software Pipelines, the processing task can be divided into logical units of concurrent work. The first step in any pipeline analysis is to decompose the steps required for processing. For this simple application, the steps of the business process are shown in the following diagram.
[Figure 3]
The steps are:
- Authenticate the user
- Ensure the transaction is valid (e.g., if the transaction is a withdrawal, ensure sufficient funds exist to handle the transaction)
- Process the transaction, updating the ATM daily account record
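The three steps can be sketched as plain functions run one after another, which is also how the monolithic design handles every transaction. The account data, the PIN-based check, and the field names are hypothetical stand-ins for the bank's real systems.

```python
# Hypothetical in-memory account store standing in for the central resource.
ACCOUNTS = {"1001": {"pin": "4321", "balance": 200}}


def authenticate(txn):
    """Step 1: confirm the request comes from an authorized individual."""
    account = ACCOUNTS.get(txn["account"])
    return account is not None and account["pin"] == txn["pin"]


def validate(txn):
    """Step 2: confirm the transaction is valid, e.g. sufficient funds."""
    if txn["type"] == "withdrawal":
        return ACCOUNTS[txn["account"]]["balance"] >= txn["amount"]
    return txn["type"] == "deposit"


def process(txn):
    """Step 3: apply the transaction and return the new balance."""
    delta = -txn["amount"] if txn["type"] == "withdrawal" else txn["amount"]
    ACCOUNTS[txn["account"]]["balance"] += delta
    return ACCOUNTS[txn["account"]]["balance"]


def handle(txn):
    if not authenticate(txn):
        return "rejected: authentication failed"
    if not validate(txn):
        return "rejected: invalid transaction"
    return f"ok: balance {process(txn)}"


withdrawal = {"account": "1001", "pin": "4321", "type": "withdrawal", "amount": 150}
results = [handle(withdrawal), handle(withdrawal)]
print(results)
```

Because the two withdrawals are handled strictly in sequence, the second one sees the updated balance and is rejected, which is the overdraft protection the FIFO requirement is meant to guarantee.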
Once we understand the steps of the business process, we must now identify the pipelines for concurrent processing. To do this, the first question the architect must ask is: What portions of this business process can be executed concurrently?
For our simple example, it seems safe initially to authenticate users in a separate pipeline. This task uses a separate system to perform its work, and once the authentication returns, the remainder of the task can be handled. In fact, because we are not concerned with ordering at this stage, it is safe to have multiple pipelines for this single task. Our interest is simply to process as many authentications as possible per unit of time, regardless of order.
[Figure 4]
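Running authentication on several pipelines at once, with no ordering guarantee, can be sketched with a thread pool. The PIN table and function names are hypothetical, and the pool's worker threads stand in for the authentication pipelines.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical credential store standing in for the separate authentication system.
VALID_PINS = {"1001": "4321", "1002": "9999"}


def authenticate(request):
    account, pin = request
    return account, VALID_PINS.get(account) == pin


def authenticate_all(requests, pipelines=3):
    """Authenticate requests concurrently; completion order is not preserved."""
    with ThreadPoolExecutor(max_workers=pipelines) as pool:
        futures = [pool.submit(authenticate, request) for request in requests]
        # Results are collected as they finish, in whatever order that happens.
        return dict(future.result() for future in as_completed(futures))


outcome = authenticate_all([("1001", "4321"), ("1002", "0000"), ("1003", "1234")])
print(outcome)
```

Collecting results with `as_completed` makes the lack of ordering explicit: this stage only cares about throughput, and ordering is enforced later, in the account-level pipelines.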
While this option will potentially speed performance, the bulk of the work (i.e., ATM account updates) is still a serial process, and because these steps are downstream from the authentication step, bottlenecks are going to occur. Therefore, to achieve an order of magnitude performance improvement, we must analyze the process further to determine other potential points of optimization, while enforcing key business rules.
Once authentication is complete, the next step is to ensure the requested transaction is valid. This is done by evaluating the transaction’s current account information. Upon analyzing the business requirements, it is possible to perform multiple transaction validations at the same time, provided that no two transactions for the same account are performed out of sequence (or simultaneously). This presents a FIFO requirement, a key bottleneck in concurrent business applications. Earlier we saw how a single pipeline can be used to guarantee this requirement, but to distribute the process we must support this requirement while implementing a concurrent solution.
The key to the problem is to establish multiple software pipelines, each responsible for processing a certain segment of the incoming transactions. Each pipeline maintains FIFO order, yet can be limited by data content to a small subset of the entire number of transactions.
In this case our solution is to configure a pipeline for each branch of the bank, each of which controls a subset of individual accounts. This requires content-based distribution of transactions, each to the appropriate pipeline. The pipeline distributor is used to perform this function, with configured pipelines for each branch (Branch_1, Branch_2, …).
[Figure 5]
For our ATM example, the distributor evaluates the branch ID of each account number for each transaction, and routes the message to the appropriate branch pipeline for processing. Because each branch pipeline is configured for FIFO order, it sequentially handles the transactions delegated to it. From this example, it can be seen that by processing many branches concurrently, a far greater number of transactions can be completed per unit of time.
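The branch-level design can be sketched with one FIFO queue and one worker thread per branch. Everything here is an illustrative assumption: the account number format ("B1-1001", with the branch ID before the dash), the function names, and the use of threads as pipelines.

```python
import queue
import threading

STOP = object()  # sentinel telling a pipeline to shut down


def branch_of(account_number):
    # Assumption: account numbers look like "B1-1001"; the branch ID precedes the dash.
    return account_number.split("-")[0]


def run_pipeline(inbox, balances, log):
    """One branch pipeline: handles its transactions strictly in arrival order."""
    while True:
        txn = inbox.get()
        if txn is STOP:
            break
        account, amount = txn
        if balances[account] + amount >= 0:
            balances[account] += amount
            log.append((account, amount, "ok"))
        else:
            log.append((account, amount, "rejected"))


def distribute(transactions, balances):
    inboxes, logs, workers = {}, {}, []
    for txn in transactions:
        branch = branch_of(txn[0])
        if branch not in inboxes:
            inboxes[branch] = queue.Queue()
            logs[branch] = []
            worker = threading.Thread(
                target=run_pipeline, args=(inboxes[branch], balances, logs[branch])
            )
            worker.start()
            workers.append(worker)
        inboxes[branch].put(txn)  # content-based routing by branch ID
    for inbox in inboxes.values():
        inbox.put(STOP)
    for worker in workers:
        worker.join()
    return logs


# Each account belongs to exactly one branch, so only one pipeline ever touches it.
balances = {"B1-1001": 100, "B2-2001": 50}
transactions = [("B1-1001", -80), ("B2-2001", -20), ("B1-1001", -80), ("B2-2001", 10)]
logs = distribute(transactions, balances)
print(logs)
print(balances)
```

The two branches run concurrently, yet the second withdrawal on account B1-1001 is still rejected, because both withdrawals pass through the same FIFO pipeline.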
The pipelines approach allows for even greater scalability if needed, say for a very large branch of the bank. Assume that one branch has in excess of 100,000 accounts, and it is found that the peak volume of transactions for this branch cannot be sustained by this design. The answer is to create additional downstream pipelines, now dividing transactions by a range of account numbers (Account_1000_1999, Account_2000_2999, etc.).
[Figure 6]
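The second level of distribution can be sketched as a routing function that maps an account number to a range-based pipeline name. The naming follows the Account_1000_1999 pattern used above; the range size of 1,000, the numeric account numbers, and the function names are assumptions for illustration.

```python
def range_pipeline(account_number, range_size=1000):
    """Name of the downstream pipeline that owns this account's number range."""
    low = (account_number // range_size) * range_size
    return f"Account_{low}_{low + range_size - 1}"


def route(branch, account_number, large_branches):
    """Two-level routing: by branch first, then by account range for large branches."""
    if branch in large_branches:
        return f"{branch}/{range_pipeline(account_number)}"
    return branch


# Hypothetical configuration: only Branch_1 is large enough to need sub-pipelines.
large = {"Branch_1"}
print(route("Branch_1", 1000, large))
print(route("Branch_1", 2570, large))
print(route("Branch_2", 2570, large))
```

Because a given account number always maps to the same range, every transaction for that account still flows through a single FIFO pipeline.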
From this simple example, we have shown how Software Pipelines can be used to achieve significant increases in the speed of processing business transactions.
Conclusion
SOA has delivered big gains in terms of flexibility and adaptability, but has left some organizations wondering whether a SOA solution can meet their business application performance needs. Today's IT organizations are looking for more efficient ways to process rapidly growing volumes of data and to take better advantage of new multi-core hardware platforms.
The Software Pipelines methodology offers a simple and easy method for business applications to exploit concurrent processing without the large development investment required to add multi-threading to a traditional monolithic business application. It can enable service-oriented applications to support higher transaction volumes within existing IT budgets and help ease the strain on data center capacity.
The primary business benefits that can be achieved with Software Pipelines include:
- Reduced capital expenditures — Less hardware is required to achieve the same throughput results because application performance can be vastly improved.
- New business opportunities — Massive-scale data processing problems that were previously impractical or impossible can now be tackled using the near linear scalability that is possible with the Software Pipelines approach. This allows organizations to address new types of business opportunities.
- Low development costs — The simplicity of the Software Pipelines methodology allows developers to remain focused on business logic instead of getting bogged down in details related to implementing concurrent processing.

