
Performance Improvement in Peoplesoft Using Oracle Analytic Functions

White Paper By S.R.Raajesh [email protected]

Abstract

This paper gives an overview of analytic functions in Oracle and their use in PeopleSoft for better performance. A case-based analysis is presented, and the improvements obtained by using analytic functions are clearly demonstrated.


Introduction

For any PeopleSoft implementation, performance has always been a crucial factor, and it is often undermined by poorly designed SQL queries. This cannot always be attributed to the developer, as some requirements lie beyond the analytical capabilities of standard SQL, which leads to performance issues when the code runs in production.

Computing across rows using standard SQL has always been a difficult task. It required extensive programming outside the scope of standard SQL, resulting in performance problems. This changed with the introduction of analytic functions in Oracle: such computations can now be expressed with ease and without significant performance overhead. Analytic functions thus enable both better performance and higher developer productivity.

This paper discusses how analytic functions, when used in PeopleCode, provide enhanced performance. The paper is divided into five sections. Section 1 gives an overview of analytic functions in Oracle. Section 2 discusses the performance improvement that can be obtained by using analytic functions, through a case-based analysis. Section 3 discusses the advantages and drawbacks of this approach. Section 4 discusses the applications of analytic functions. Finally, conclusions are presented in Section 5. An appendix explaining the performance measurement statistics is given at the end.

1. Analytic Functions - An Introduction

To achieve the desired query output, analytic functions add several new elements to SQL processing. These elements build on existing SQL to allow flexible and powerful calculations. The key concepts involved in analytic functions are:

Processing order: Query processing using analytic functions takes place in three stages. First, all joins, WHERE, GROUP BY and HAVING clauses are performed. Second, the result set is made available to the analytic functions, and all their calculations take place. Third, if the query has an ORDER BY clause at its end, the ORDER BY is processed to allow for precise output ordering. The processing order is shown in Figure 1.

Fig 1: Processing order of analytic functions (joins, WHERE, GROUP BY and HAVING clauses -> partitions created and analytic functions applied to each row in each partition -> final ORDER BY -> result set)

Partitions: Analytic functions allow users to divide the result of a query into ordered groups of rows called partitions. Partitions are created after the groups defined with GROUP BY clauses, so any aggregate results such as SUMs and AVGs are available to them. Partitions may be based on any desired columns or expressions. A query result set may have just one partition holding all the rows, a few large partitions, or many small partitions holding just a few rows each.

Window: For each partition, a sliding window of data is defined. This window determines the range of rows used to perform the calculations for the current row. (Each calculation performed by an analytic function is based on a current row within a window; the current row serves as the reference point determining the start and end of the window.) Window sizes can be based on either a physical number of rows or a logical interval such as time. The window has a starting row and an ending row, and depending on its definition it may move at one or both ends. For instance, a window defined for a cumulative sum function has its starting row fixed at the first row of its partition, while its ending row slides from the starting point all the way to the last row of the partition. In contrast, a window defined for a moving average has both its start and end points slide so that they maintain a constant physical or logical range. A window can be as large as all the rows in a partition or as small as a sliding set of two rows within it. Users can specify a window containing a constant number of rows, a window containing all rows where a column value falls in a specified range, or, for example, a window holding all rows where a date value falls within the prior month. More information about the concepts of analytic functions can be obtained from [2].
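The two window shapes described above (a fixed start for a cumulative sum, a sliding pair of endpoints for a moving average) can be sketched with a short, self-contained example. This is an illustration only: it uses SQLite's window functions (available in SQLite 3.25+, bundled with recent Python), whose ROWS BETWEEN syntax matches Oracle's for these cases, and the table and data are hypothetical.

```python
import sqlite3

# Hypothetical sales table used only to demonstrate window behaviour.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("EAST", 10), ("EAST", 20), ("EAST", 30),
                  ("WEST", 5), ("WEST", 15)])

rows = conn.execute("""
    SELECT region, amount,
           -- cumulative sum: window start fixed at the first row of the partition
           SUM(amount) OVER (PARTITION BY region ORDER BY amount
                             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
               AS running_total,
           -- moving average: both ends slide, keeping two physical rows
           AVG(amount) OVER (PARTITION BY region ORDER BY amount
                             ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
               AS moving_avg
    FROM sales
    ORDER BY region, amount
""").fetchall()
for r in rows:
    print(r)
```

Within each region partition, `running_total` grows from the partition's first row, while `moving_avg` always covers at most the current row and the one before it.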


2. Performance Improvement - Case Based Examples

Example 1

The example provided here is based on CRM for communications. The requirement is to get the count of MSISDNs under an account; if there is only one MSISDN available, fetch that MSISDN and send an SMS to it. Assuming that no data is available in the component buffer, typical PeopleCode will have to depend on SQLExec (or equivalent methods) to fetch data from the database. The logic has to be framed like this:

1. First, identify the number of MSISDNs under the account.
2. If there is only one MSISDN under that account, fetch the MSISDN and do further processing.

Though SQLExec returns only the first row, the count is mandatory here because there can be more than one MSISDN under an account, in which case no SMS should be sent. Typical PeopleCode ends up like this:

1. SQLExec("select count(*) from ps_rf_inst_prod a, ps_rftinst_prod b where a.setid = b.setid and a.inst_prod_id = b.inst_prod_id and a.assettag <> ' ' and b.rbtacctid = :1 and a.inst_prod_status <> 'DIS'", &St_acctid_l, &Nm_count_l);

For an account '12345' in production, the following statistics are generated:

Recursive calls: 484
Db block gets: 0
Consistent gets: 195
Physical reads: 9
Redo size: 0
Bytes sent via SQL*Net to client: 197
Bytes received via SQL*Net from client: 277
SQL*Net roundtrips to/from client: 2
Sorts (memory): 0
Sorts (disk): 0
Rows processed: 1

2. If the count is 1, fetch the MSISDN from the database with a second query. The query is the same except that the field fetched is ASSETTAG. The statistics for this second query are:

Recursive calls: 484
Db block gets: 0
Consistent gets: 197
Physical reads: 8
Redo size: 0
Bytes sent via SQL*Net to client: 209
Bytes received via SQL*Net from client: 277
SQL*Net roundtrips to/from client: 2
Sorts (memory): 0
Sorts (disk): 0
Rows processed: 1

Eventually we end up with two SQLExec calls and one comparison to achieve the functionality. With the much higher data volumes of a production system, these statistics grow considerably, creating performance issues.

The same requirement can be met with ease using analytic functions. The query to achieve the same is given below:

SQLExec("select assettag, count(assettag) over (partition by b.rbtacctid) from ps_rf_inst_prod a, ps_rftinst_prod b where a.setid = b.setid and a.inst_prod_id = b.inst_prod_id and a.assettag <> ' ' and b.rbtacctid = :1 and a.inst_prod_status <> 'DIS'", &St_Accountid_l, &St_Msisdn_l, &St_cnt_l);

With this query, if the count is 1, the assettag value is already held in the output variable. The same requirement is satisfied by a single analytic query, and the statistics gathered are also appealing:


Recursive calls: 484
Db block gets: 0
Consistent gets: 195
Physical reads: 7
Redo size: 0
Bytes sent via SQL*Net to client: 274
Bytes received via SQL*Net from client: 276
SQL*Net roundtrips to/from client: 2
Sorts (memory): 1
Sorts (disk): 0
Rows processed: 1

This is the power of analytic functions: when used correctly, they eliminate the need for multiple queries.
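The Example 1 pattern can be sketched end to end in a few lines. This is a minimal illustration, not the paper's PeopleCode: the table and column names are simplified from ps_rf_inst_prod, the data is hypothetical, and SQLite stands in for Oracle (both accept the same COUNT(...) OVER (PARTITION BY ...) syntax). One query returns both the MSISDN and the per-account count, replacing the count-then-refetch pair.

```python
import sqlite3

# Hypothetical, simplified stand-in for the installed-products table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inst_prod (rbtacctid TEXT, assettag TEXT)")
conn.executemany("INSERT INTO inst_prod VALUES (?, ?)",
                 [("12345", "919800000001"),   # one MSISDN: SMS allowed
                  ("67890", "919800000002"),
                  ("67890", "919800000003")])  # two MSISDNs: no SMS

def msisdn_for_sms(acct):
    # Like SQLExec, keep only the first row; the analytic COUNT carries
    # the total number of MSISDNs under the account on every row.
    row = conn.execute("""
        SELECT assettag, COUNT(assettag) OVER (PARTITION BY rbtacctid) AS cnt
        FROM inst_prod WHERE rbtacctid = ?
    """, (acct,)).fetchone()
    # Send the SMS only when exactly one MSISDN exists under the account.
    return row[0] if row and row[1] == 1 else None

print(msisdn_for_sms("12345"))  # the single MSISDN
print(msisdn_for_sms("67890"))  # None: more than one MSISDN
```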

Example 2:

The requirement is to select and display the first ordered MSISDN for the customer. With standard SQL, a self-join on the installed-services table seems inevitable. First we will see the normal way to frame a query for this, and then see how analytic functions solve the problem with ease. In the former case the query would be:

select assettag from ps_rf_inst_prod where bo_id_cust = '11891' and assettag <> ' ' and installed_date = (select min(installed_date) from ps_rf_inst_prod where bo_id_cust = '11891' and assettag <> ' ')

The statistics gathered are given below:

Recursive calls: 812
Db block gets: 0
Consistent gets: 260
Physical reads: 0
Redo size: 0
Bytes sent via SQL*Net to client: 208
Bytes received via SQL*Net from client: 277
SQL*Net roundtrips to/from client: 2
Sorts (memory): 0
Sorts (disk): 0
Rows processed: 1

The analytic version of the SQL for the same requirement would be:

select bo_id_cust, installed_date, assettag from (select T1.*, min(installed_date) over (partition by bo_id_cust) min_date from ps_rf_inst_prod T1 where T1.assettag <> ' ' and T1.bo_id_cust = '11891') where installed_date = min_date

The statistics are tabulated below

Recursive calls: 406
Db block gets: 0
Consistent gets: 129
Physical reads: 0
Redo size: 0
Bytes sent via SQL*Net to client: 285
Bytes received via SQL*Net from client: 277
SQL*Net roundtrips to/from client: 2
Sorts (memory): 1
Sorts (disk): 0
Rows processed: 1

Clearly, analytic functions have the upper hand here.
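The shape of the Example 2 rewrite is worth seeing in runnable form. The sketch below is illustrative only (simplified table, hypothetical dates; SQLite syntax, which matches Oracle's here): the inline view computes MIN(installed_date) OVER (PARTITION BY bo_id_cust) in the same pass that reads the rows, so the table is scanned once instead of twice as in the correlated-subquery version.

```python
import sqlite3

# Hypothetical, simplified stand-in for ps_rf_inst_prod.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE inst_prod
                (bo_id_cust TEXT, installed_date TEXT, assettag TEXT)""")
conn.executemany("INSERT INTO inst_prod VALUES (?, ?, ?)",
                 [("11891", "2004-03-01", "TAG_B"),
                  ("11891", "2004-01-15", "TAG_A"),   # earliest install
                  ("11891", "2004-06-20", "TAG_C")])

# Analytic rewrite: one scan, filter on the per-partition minimum.
row = conn.execute("""
    SELECT bo_id_cust, installed_date, assettag
    FROM (SELECT t1.*,
                 MIN(installed_date) OVER (PARTITION BY bo_id_cust) AS min_date
          FROM inst_prod t1
          WHERE t1.assettag <> ' ' AND t1.bo_id_cust = '11891')
    WHERE installed_date = min_date
""").fetchone()
print(row)  # the row for the first-installed MSISDN
```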

Example 3

The requirement is to create a view of specific disconnected products, given the product category. The result should contain only the latest disconnected product under each product category. A join on the PS_PROD_ITEM table is inevitable. In the worst case there can be more than one disconnected product in the same product category; the last disconnected product is selected on the assumption that the products were installed on different dates. A normal SQL can be written like this:

select a.<required fields> from ps_rf_inst_prod a, ps_prod_item b where a.setid = b.setid and a.product_id = b.product_id and b.prod_category = 'USIM' and a.parent_inst_prodid = '12345' and a.inst_prod_status = 'DIS' and a.setid = 'COM03' and a.installed_date = (select max(c.installed_date) from ps_rf_inst_prod c, ps_prod_item d where c.setid = d.setid and c.product_id = d.product_id and d.prod_category = 'USIM' and c.parent_inst_prodid = '12345' and c.inst_prod_status = 'DIS' and c.setid = 'COM03')

This query gets complicated when multiple product categories need to be fetched: a UNION ALL over the per-category results becomes necessary. There may be better ways to write the above query, but a self-join on these tables cannot be avoided. A single analytic query, however, achieves the same objective while joining the tables only once:

select <required fields>, installed_date from (select A.*, max(installed_date) over (partition by B.prod_category) inst_date from ps_rf_inst_prod A, ps_prod_item B where A.setid = B.setid and A.product_id = B.product_id and B.prod_category in (<comma separated list>) and A.parent_inst_prodid = '12345' and A.setid = 'COM03' and A.inst_prod_status = 'DIS') where installed_date = inst_date

No statistics-based proof is needed here: without UNION ALL and multiple joins, the desired objective is met with fewer physical reads.
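The per-category behaviour of the Example 3 query can be sketched as follows. This is an illustration under simplifying assumptions: the two PeopleSoft tables are collapsed into one hypothetical table, and the category and date values are invented. MAX(installed_date) OVER (PARTITION BY prod_category) yields the latest disconnected product in each category in one pass, with no UNION ALL per category.

```python
import sqlite3

# Hypothetical table standing in for the ps_rf_inst_prod/ps_prod_item join.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE disc_prod
                (prod_category TEXT, product_id TEXT, installed_date TEXT)""")
conn.executemany("INSERT INTO disc_prod VALUES (?, ?, ?)",
                 [("USIM", "P1", "2004-02-01"),
                  ("USIM", "P2", "2004-05-01"),   # latest USIM
                  ("GSM",  "P3", "2004-04-01")])  # latest (only) GSM

# One analytic pass: keep only the row matching each category's maximum date.
rows = conn.execute("""
    SELECT prod_category, product_id, installed_date
    FROM (SELECT a.*,
                 MAX(installed_date) OVER (PARTITION BY prod_category) AS inst_date
          FROM disc_prod a
          WHERE prod_category IN ('USIM', 'GSM'))
    WHERE installed_date = inst_date
    ORDER BY prod_category
""").fetchall()
for r in rows:
    print(r)  # one latest row per category
```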

3. Advantages and Drawbacks of Using Analytic Functions

Analytic functions enhance both database performance and developer productivity. They are valuable for all types of processing, from interactive decision support to batch report jobs.

Increased query speed: The processing optimizations supported by these functions enable significantly better query performance. Operations that previously required self-joins or complex procedural processing can now be performed in native SQL.

Enhanced developer productivity: The functions enable developers to perform complex analyses with much clearer and more concise SQL. Tasks which previously required multiple SQL statements or procedural languages can now be expressed in a single SQL statement. The new SQL is quicker to formulate and maintain than the older approaches, resulting in greater productivity.

Minimized learning effort: Through careful syntax design, the analytic functions minimize the need to learn new keywords. The syntax builds on existing aggregate functions, such as SUM and AVG, so that these well-understood keywords can be used in extended ways.

However, using analytic functions in PeopleCode makes the code dependent on the database platform. As long as the database platform is not going to change, these functions prove very handy for improving performance.

4. Applications of Analytic Functions

Analytic functions can be used in the following areas, where standard SQL falls short:

· Ranking data within subsets of a data set.
· Aggregating data within subsets of a data set.
· Performing aggregations over moving windows.
· Displaying and comparing aggregates to individual entries within a single query.
· Comparing two or more rows within a given data set.
· Fetching values from the Translate table based on EFFDT.
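The first listed application, ranking within subsets of a data set, can be sketched briefly. This is an illustrative example with hypothetical data (SQLite syntax, identical to Oracle's for this function): RANK() OVER (PARTITION BY ... ORDER BY ...) numbers the rows inside each partition independently.

```python
import sqlite3

# Hypothetical sales data, ranked per region.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, rep TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                 [("EAST", "Ann", 300), ("EAST", "Bob", 500),
                  ("WEST", "Cid", 200), ("WEST", "Dee", 400)])

rows = conn.execute("""
    SELECT region, rep, amount,
           -- rank restarts at 1 within each region partition
           RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk
    FROM sales
    ORDER BY region, rnk
""").fetchall()
for r in rows:
    print(r)
```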


5. Conclusion

The examples discussed in this paper are just an introduction to the capabilities of analytic functions; many more features remain to be explored and applied. The native approaches of PeopleSoft are not considered here, as they eventually execute SQL against the database anyway. The performance improvements obtained in production are even greater, though realizing them requires careful pruning of slow-running SQLs across the database. Once this is done, and the technique is applied in subsequent coding, analytic functions prove to be very useful and efficient.

References:
1. Thomas Kyte, Effective Oracle by Design
2. www.oracle.com/technology/products/oracle8i/pdf/8ifunctions.pdf
3. PeopleSoft PeopleBooks


Appendix 1

Here is an explanation of the database statistics gathered during the simulations for this paper.

Recursive calls: Number of recursive calls generated at both the user and system level. Oracle maintains tables used for internal processing; when Oracle needs to change these tables, it generates an internal SQL statement, which in turn generates a recursive call.

db block gets: Number of times a CURRENT block was requested.

consistent gets: Number of times a consistent read was requested for a block.

physical reads: Total number of data blocks read from disk. This number equals the value of "physical reads direct" plus all reads into the buffer cache.

redo size: Total amount of redo generated, in bytes.

bytes sent via SQL*Net to client: Total number of bytes sent to the client from the foreground processes.

bytes received via SQL*Net from client: Total number of bytes received from the client over Oracle Net.

SQL*Net roundtrips to/from client: Total number of Oracle Net messages sent to and received from the client.

sorts (memory): Number of sort operations performed completely in memory, requiring no disk writes.

sorts (disk): Number of sort operations that required at least one disk write.

rows processed: Number of rows processed during the operation.

For more information please refer to Oracle Performance Tuning Guide and Reference.

