Wednesday, June 20, 2012

Using BPEL Performance Statistics to Diagnose Performance Bottlenecks



Tuning performance of Oracle SOA 11G applications could be challenging. Because SOA is a platform for you to build composite applications that connect many applications and "services", when the overall performance is slow, the bottlenecks could be anywhere in the system: the applications/services that SOA connects to, the infrastructure database, or the SOA server itself. How to quickly identify the bottleneck becomes crucial in tuning the overall performance.

Fortunately, the BPEL engine in Oracle SOA 11G (and 10G, for that matter) collects BPEL Engine Performance Statistics, which show the latencies of low level BPEL engine activities. The BPEL engine performance statistics can make it a bit easier for you to identify the performance bottleneck.

Although the BPEL engine performance statistics are always available, the access to and interpretation of them are somewhat obscure in the early and current (PS5) 11G versions.

This blog attempts to offer instructions that help you to enable, retrieve and interpret the performance statistics, before the future versions provides a more pleasant user experience.

Overview of BPEL Engine Performance Statistics 

SOA BPEL has a feature of collecting some performance statistics and store them in memory.

One MBean attribute, StatLastN, configures the size of the memory buffer to store the statistics. This memory buffer is a "moving window", in a way that old statistics will be flushed out by the new if the amount of data exceeds the buffer size. Since the buffer size is limited by StatLastN, impacts of statistics collection on performance is minimal. By default StatLastN=-1, which means no collection of performance data.

Once the statistics are collected in the memory buffer, they can be retrieved via another MBean oracle.as.soainfra.bpel:Location=[Server Name],name=BPELEngine,type=BPELEngine.>

My friend in Oracle SOA development wrote this simple 'bpelstat' web app that looks up and retrieves the performance data from the MBean and displays it in a human readable form. It does not have beautiful UI but it is fairly useful.

Although in Oracle SOA 11.1.1.5 onwards the same statistics can be viewed via a more elegant UI under "request break down" at EM -> SOA Infrastructure -> Service Engines -> BPEL -> Statistics, some unsophisticated minds like mine may still prefer the simplicity of the 'bpelstat' JSP. One thing that simple JSP does do well is that you can save the page and send it to someone to further analyze

Follows are the instructions of how to install and invoke the BPEL statistic JSP. My friend in SOA Development will soon blog about interpreting the statistics. Stay tuned.

Step1: Enable BPEL Engine Statistics for Each SOA Servers via Enterprise Manager

First st you need to set the StatLastN to some number as a way to enable the collection of BPEL Engine Performance Statistics

  • EM Console -> soa-infra(Server Name) -> SOA Infrastructure -> SOA Administration -> BPEL Properties
  • Click on "More BPEL Configuration Properties"
  • Click on attribute "StatLastN", set its value to some integer number. Typically you want to set it 1000 or more.

Step 2: Download and Deploy bpelstat.war File to Admin Server,

Note: the WAR file contains a JSP that does NOT have any security restriction. You do NOT want to keep in your production server for a long time as it is a security hazard. Deactivate the war once you are done.
  • Download the bpelstat.war to your local PC
  • At WebLogic Console, Go to Deployments -> Install
  • Click on the "upload your file(s)"
  • Click the "Browse" button to upload the deployment to Admin Server
  • Accept the uploaded file as the path, click next
  • Check the default option "Install this deployment as an application"
  • Check "AdminServer" as the target server
  • Finish the rest of the deployment with default settings


  • Console -> Deployments
  • Check the box next to "bpelstat" application
  • Click on the "Start" button. It will change the state of the app from "prepared" to "active"

Step 3: Invoke the BPEL Statistic Tool


  • The BPELStat tool merely call the MBean of BPEL server and collects and display the in-memory performance statics. You usually want to do that after some peak loads.
  • Go to http://<admin-server-host>:<admin-server-port>/bpelstat
  • Enter the correct admin hostname, port, username and password
  • Enter the SOA Server Name from which you want to collect the performance statistics. For example, SOA_MS1, etc.
  • Click Submit
  • Keep doing the same for all SOA servers.

Step 3: Interpret the BPEL Engine Statistics

You will see a few categories of BPEL Statistics from the JSP Page.

First it starts with the overall latency of BPEL processes, grouped by synchronous and asynchronous processes. Then it provides the further break down of the measurements through the life time of a BPEL request, which is called the "request break down".

1. Overall latency of BPEL processes

The top of the page shows that the elapse time of executing the synchronous process TestSyncBPELProcess from the composite TestComposite averages at about 1543.21ms, while the elapse time of executing the asynchronous process TestAsyncBPELProcess from the composite TestComposite2 averages at about 1765.43ms. The maximum and minimum latency were also shown.

Synchronous process statistics
<statistics>
    <stats key="default/TestComposite!2.0.2-ScopedJMSOSB*soa_bfba2527-a9ba-41a7-95c5-87e49c32f4ff/TestSyncBPELProcess" min="1234" max="4567" average="1543.21" count="1000">
    </stats>
</statistics>



Asynchronous process statistics
<statistics>
    <stats key="default/TestComposite2!2.0.2-ScopedJMSOSB*soa_bfba2527-a9ba-41a7-95c5-87e49c32f4ff/TestAsyncBPELProcess" min="2234" max="3234" average="1765.43" count="1000">
    </stats>
</statistics>


2. Request break down

Under the overall latency categorized by synchronous and asynchronous processes is the "Request breakdown". Organized by statistic keys, the Request breakdown gives finer grain performance statistics through the life time of the BPEL requests.It uses indention to show the hierarchy of the statistics.

Request breakdown
<statistics>
    <stats key="eng-composite-request" min="0" max="0" average="0.0" count="0">
        <stats key="eng-single-request" min="22" max="606" average="258.43" count="277">
            <stats key="populate-context" min="0" max="0" average="0.0" count="248">



Please note that in SOA 11.1.1.6, the statistics under Request breakdown is aggregated together cross all the BPEL processes based on statistic keys. It does not differentiate between BPEL processes. If two BPEL processes happen to have the statistic that share same statistic key, the statistics from two BPEL processes will be aggregated together. Keep this in mind when we go through more details below.


2.1 BPEL process activity latencies

A very useful measurement in the Request Breakdown is the performance statistics of the BPEL activities you put in your BPEL processes: Assign, Invoke, Receive, etc. The names of the measurement in the JSP page directly come from the names to assign to each BPEL activity. These measurements are under the statistic key "actual-perform"

Example 1: 
Follows is the measurement for BPEL activity "AssignInvokeCreditProvider_Input", which looks like the Assign activity in a BPEL process that assign an input variable before passing it to the invocation:


                               <stats key="AssignInvokeCreditProvider_Input" min="1" max="8" average="1.9" count="153">
                                    <stats key="sensor-send-activity-data" min="0" max="1" average="0.0" count="306">
                                    </stats>
                                    <stats key="sensor-send-variable-data" min="0" max="0" average="0.0" count="153">
                                    </stats>
                                    <stats key="monitor-send-activity-data" min="0" max="0" average="0.0" count="306">
                                    </stats>
                                </stats>

Note: because as previously mentioned that the statistics cross all BPEL processes are aggregated together based on statistic keys, if two BPEL processes happen to name their Invoke activity the same name, they will show up at one measurement (i.e. statistic key).

Example 2:
Follows is the measurement of BPEL activity called "InvokeCreditProvider". You can not only see that by average it takes 3.31ms to finish this call (pretty fast) but also you can see from the further break down that most of this 3.31 ms was spent on the "invoke-service". 

                                <stats key="InvokeCreditProvider" min="1" max="13" average="3.31" count="153">
                                    <stats key="initiate-correlation-set-again" min="0" max="0" average="0.0" count="153">
                                    </stats>
                                    <stats key="invoke-service" min="1" max="13" average="3.08" count="153">
                                        <stats key="prep-call" min="0" max="1" average="0.04" count="153">
                                        </stats>
                                    </stats>
                                    <stats key="initiate-correlation-set" min="0" max="0" average="0.0" count="153">
                                    </stats>
                                    <stats key="sensor-send-activity-data" min="0" max="0" average="0.0" count="306">
                                    </stats>
                                    <stats key="sensor-send-variable-data" min="0" max="0" average="0.0" count="153">
                                    </stats>
                                    <stats key="monitor-send-activity-data" min="0" max="0" average="0.0" count="306">
                                    </stats>
                                    <stats key="update-audit-trail" min="0" max="2" average="0.03" count="153">
                                    </stats>
                                </stats>



2.2 BPEL engine activity latency

Another type of measurements under Request breakdown are the latencies of underlying system level engine activities. These activities are not directly tied to a particular BPEL process or process activity, but they are critical factors in the overall engine performance. These activities include the latency of saving asynchronous requests to database, and latency of process dehydration.

My friend Malkit Bhasin is working on providing more information on interpreting the statistics on engine activities on his blog (https://blogs.oracle.com/malkit/). I will update this blog once the information becomes available.

Update on 2012-10-02: My friend Malkit Bhasin has published the detail interpretation of the BPEL service engine statistics at his blog http://malkit.blogspot.com/2012/09/oracle-bpel-engine-soa-suite.html.




7 comments:

  1. Link to bpelstat.war webapp looks broken. Can you please let me know the updated location to download the same.

    Thank you! Thank you!

    ReplyDelete
  2. Hum, it works for me. Could you please copy and paste the actual link with which you are having problem?

    ReplyDelete
  3. Hi Francis,

    Thank you for the post.

    I have set StatsLasN=1000 but I do not see any statistics in the EM.

    I am running wls1036, SOA PS5 in production mode, logging/Audit disabled and with inMemoryOptimization=On, no instances are being dehydrated.
    Also, my domain was created without the BPM, do I need BPM to collect these statistics?

    Thank you,
    Alex

    ReplyDelete
    Replies
    1. Good question. Disabling the audit will also disable the collection of in memory performance statistics.

      Delete
  4. No I can see the stats, I had my BPEL-Sensors disabled. Enabling the sensors fixed that and I can see the Request Breakdown stats.

    I still do not see anything in the Thread Statistics and Pending/Active Requests though.

    Thank you,
    Alex

    ReplyDelete
    Replies
    1. The statistics for threads are for invoke threads and engine threads. Those threads are involved only for asynchronous activities where dispatcher threads are needed. Those activities include initial receive of asynchronous messages, mid-process break point activities (such as Receive, Wait, etc), and fault policy retries, etc.

      In the case of synchronous BPEL process without any mid-process break point activities, no invoke or engine threads are involved. Then you will not see any threads statistics.

      Delete
  5. This comment has been removed by the author.

    ReplyDelete