ACM 7.1 Performance Guide
Issue Date: 20 September 2012
Information in this document is subject to change without notice and does not represent a commitment on the part of the vendor or its representatives.
Copyright © Alterian. All rights reserved.
Documentation feedback and comments:
ACM is a trademark of Alterian. Windows XP, SQL Server, ASP.NET, Visual Studio .NET and C# are trademarks of Microsoft Corporation. All other products named herein may be trademarks of their respective manufacturers and are hereby recognised. Trademarked names are used editorially, to the benefit of the trademark owner, with no intent to infringe on the trademark.
Revision history: 20 May 2010 – Updated for version 7.1; minor additions (Squid ignore-cc).
This guide provides an overview of all aspects of ACM configuration affecting performance, including configuration issues that fall outside the scope of Alterian support but nonetheless affect the performance of a web site.
The guide does not claim to provide a solution to every performance problem that may arise, but it aims to give a good description of the techniques required to ensure that optimum performance is achievable for most sites under most circumstances.
This guide is intended for site managers, architects, system administrators, developers and project managers involved in the implementation and support of ACM environments.
The guide is intended to be as generic as possible; the principles described relate to all versions of the ACM product and its predecessor Morello. However where specific examples of configuration are described the examples will be from an ACM version 7.1 environment unless otherwise stated.
The guide does not intend to describe best practice for other technologies such as Oracle, SQL Server, Tomcat, IIS or Apache, other than to describe aspects of their configuration relating to their use with ACM. It is assumed that these technologies will have been implemented according to appropriate documented best practices.
The guide does not always prescribe exactly how to perform certain actions; it is assumed that site administrators will have sufficient practical knowledge to implement the suggested activities. Further guidance on specific tasks can be obtained from Alterian Support or Professional Services if required.
Monospaced bold text is used to show code listings, file names, commands, etc.
./licencekey sample/encrypt[@db1] key0123456789
Italic monospaced text enclosed in angled brackets is used to specify variables for which you must supply a value.
./licencekey <username>/<password>[@<db1>] <licencekey>
Most ACM implementations are delivered on appropriate hardware because the sizing of this hardware to meet expected demand is an important part of the procurement process. Clearly you should size your hardware to deliver the performance that you require when the site is under its heaviest anticipated load. However, if this process is not performed correctly, or if the demands on a site change over time, it is possible that more hardware may be needed in order to maintain the expected levels of performance.
Before considering the purchase of new hardware (with its associated hosting, support and license costs) you should always review each of the areas described in this document to determine whether the best use is being made of existing resources.
Alterian Professional Services provide a full Health Check service which will determine any performance improvements that can be made as well as a review of the performance of any particular aspects of a site that appear to be poor. A recommendation will also be given on what extra hardware resources would be required in order to meet anticipated demands.
The following sections describe recommendations for providing appropriate hardware resources and a method of diagnosing a deficiency in any particular area.
The ACM, Database and web application software is generally placed on a separate disk partition from the operating system (/opt on Solaris and D:\ on Windows). There is very little disk I/O related to application software but you should ensure that the disk partition has plenty of free space. Log files can grow quickly when ACM runs in debug mode and the application will stop if it runs out of space.
The database should ideally be spread across a number of disk partitions in order to ensure that disk reads and writes do not become a bottleneck on performance. Oracle performs I/O of several forms simultaneously, including data segment writes, undo segment writes and redo log writes; SQL Server behaves in a similar way. If all of these activities take place on a single disk then disk performance can cause applications to suffer. It is also standard practice to ensure that any element of your database can be recovered using objects stored on a separate hard disk. This leads to a general recommendation that at least four partitions on separate disks (or a RAID 5 partition) be used to hold the OS, the application software, the database and the database recovery data (backups and archive/transaction logs). If this recommendation is followed then disk performance is unlikely ever to be an issue on an ACM database server.
You can tell if disk I/O is causing a performance problem by using the sar utility on Solaris or the Performance Monitor on Windows. Disk I/O should never require more than an average of 2% of CPU time or up to 10% in short bursts. If the average disk wait queue is high then this could indicate a problem.
A variable amount of system memory is required by ACM; this grows according to the maximum size of the Java heap, which is tuneable in the ACM properties file, and the configuration of the MAE caches described in section 3 of this guide. Oracle also requires separate segments for the different caches described in section 2 of this document. Similarly, your chosen J2EE container product will run a Java process with a maximum Java heap size.
All of these memory settings are configurable, and so you should never find yourself in a situation where you run out of memory. However if you are using a 32-bit operating system then you may have to limit the amount of memory available to one or all of the above applications.
You can tell if more memory is required by ACM in one of two ways: either you will get OutOfMemory errors in your CME or J2EE container log file, in which case a larger maximum heap value is required for the CME JVM in the properties files, or you will see warnings in the CAE log file saying that objects have been pushed out of their data cache and that the cache may need to be expanded.
You can tell if more memory is required for Oracle as one of the caches described in section 2 of this guide will be performing poorly.
You can tell if more memory is required by your J2EE container product as it will give OutOfMemory errors if it needs memory and there is none free in the Java heap. You can increase the maximum size of the Java heap using the JAVA_OPTS settings described in a later section of this guide.
After tuning each of these settings you can determine if more memory is required by looking at the amount of free memory on your server. On Solaris this is not simple as lots of ‘free’ or unallocated memory is used for a file cache. Files are dropped from this cache if applications need memory, but it does mean that the output from utilities such as ‘vmstat’ or ‘top’ can be misleading. The simplest way to see if there is a shortage of memory on Solaris is to view the amount of paging and swapping activity. You can do this using either the ’sar’ or ‘vmstat’ utilities. On Windows the Task Manager Performance tab is a reasonably reliable measure of free memory. If this indicates that you have little or no free memory you should see a corresponding increase in disk paging activity.
If you are still short of memory after tuning your applications for performance as described above, then buy some more; it is relatively cheap. However, you should bear in mind the maximum memory limitations for the server and for any individual program; these are determined by the architecture of the operating system version in place.
CPU is used by the operating system kernel and by applications. If you have poor disk performance or a lack of memory then CPU time is wasted in waiting for I/O rather than processing application code. Some CPU time is also wasted in context switching (switching between outstanding tasks).
On a well tuned ACM server the J2EE container (or IIS) and the CAE are each likely to use about 40% of the total CPU being used, the database about 10% and the operating system about 10%. These figures are a guide only and can vary according to the complexity of the web application, the choice of J2EE container product and the caching policy for content on your site. Note that with the advent of the ACM Performance Layer the typical ratio of CPU usage between J2EE container, CME and database tends to be more like 65:25:5.
If you still have a problem with CPU availability after you have reviewed the performance issues in this guide then you may need to consider purchasing more CPU resource.
The Oracle database should be created with reference to the appropriate version of the ACM prerequisites file and should deliver good performance and require little in the way of routine maintenance. However a few areas do need attention, particularly the correct sizing of memory structures and the regular maintenance of table statistics and Oracle Text indexes.
The most significant tuneable factor affecting Oracle performance is the size of the memory segment allocated to the SGA when the database instance starts up. The SGA is used to hold two key database structures: the Database Buffer Cache and the Shared Pool. The buffer cache is used to store data so that it may be retrieved from memory rather than from disk. The Shared Pool (consisting of the library cache and the dictionary cache) is used to store SQL statements that have been validated and parsed; new SQL statements are checked against those in the caches to see if there is a version already available to use, which cuts down on time-consuming validating and parsing of repeated SQL.
If you are using Oracle 9iR2 then the SGA size is determined by the db_cache_size and shared_pool_size parameters in the instance init.ora file. If you are using Oracle 10gR2 or Oracle 11g then by default the cache size will be managed automatically within the scope of the memory made available to it by the sga_max_size parameter.
You can test the efficiency of the buffer cache with the following SQL:
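A commonly used form of this query, based on the v$sysstat dynamic performance view, is shown below; the exact listing supplied with your installation may differ slightly.

```sql
-- Buffer cache hit ratio: the proportion of block requests
-- satisfied from memory rather than by physical reads
SELECT ROUND(1 - (phy.value / (cur.value + con.value)), 4)
         AS buffer_cache_hit_ratio
  FROM v$sysstat cur, v$sysstat con, v$sysstat phy
 WHERE cur.name = 'db block gets'
   AND con.name = 'consistent gets'
   AND phy.name = 'physical reads';
```

A result close to 1 (100%) indicates that most reads are being satisfied from the cache.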
And you can test the library cache with this SQL:
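A typical form of the library cache query, using the v$librarycache view, is:

```sql
-- Library cache hit ratio: the proportion of executions that reused
-- an already-parsed statement without a reload
SELECT ROUND(SUM(pins) / (SUM(pins) + SUM(reloads)), 4)
         AS library_cache_hit_ratio
  FROM v$librarycache;
```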
This SQL provides values for the efficiency of the caches for the period since the instance was last restarted. Leave the instance running for some time before testing, as initial performance will always be poor while the caches are filling. A value of 99% or lower generally represents sub-optimal cache efficiency.
Alterian recommend setting the parameter SGA_MAX_SIZE to at least 2000M and the parameter SGA_TARGET to 1500M; this tells the database instance to use about 1500M of RAM for the SGA and allows the DBA to increase this target up to a maximum of 2000M without having to restart the database. The performance of the buffer cache and library cache should then be checked at regular intervals and these parameters altered accordingly.
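Assuming the instance uses a server-side spfile, these parameters can be set as follows (the values shown are the recommended starting points; note that changing the maximum requires an instance restart, whereas the target can be raised online):

```sql
ALTER SYSTEM SET sga_max_size = 2000M SCOPE = SPFILE;
ALTER SYSTEM SET sga_target   = 1500M SCOPE = BOTH;
```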
The ACM code is written to be as efficient as possible. However it is possible that missing or broken objects in the ACM schema may cause the code to perform sub-optimally.
In particular it is important to ensure that table statistics are kept up to date using the Perl maintenance script updatedbstats.pl on a regular basis. Alternatively the Oracle package DBMS_STATS can be utilized to perform the same activity. The responsibility for choosing the appropriate method for gathering table statistics should lie with the DBA.
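If DBMS_STATS is used directly, a scheduled job might gather schema statistics along the following lines (the schema name here is illustrative and should be replaced with the actual ACM schema owner):

```sql
-- Gather optimiser statistics for every table in the ACM schema,
-- cascading to the associated indexes
BEGIN
  DBMS_STATS.GATHER_SCHEMA_STATS(
    ownname => 'ACM',    -- illustrative schema name
    cascade => TRUE);    -- include index statistics
END;
/
```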
Also, on staging/editorial servers with a high level of editorial activity, the table statistics for the Distribution Server task and mirror tables ought to be maintained on an hourly basis as the content of these tables can change rapidly and the Oracle optimiser may make the wrong choices about how to access table data. Please refer to the FAQ named ‘Why does DistD run slowly sometimes’ - article ID (FAQCME0033) on the Alterian support extranet http://supportal.alterian.com for a description of this problem and its resolution.
Text indexes are used for searching ACM content. They are maintained in a timely state by a DBMS job that runs every minute on each index; however they are prone to fragmentation problems which can seriously affect search performance and database performance in general. These indexes are written in a serial manner rather than using a binary tree format like most Oracle indexes. As a result they begin to fragment as soon as they are updated. It is good practice to ensure that they are regularly optimized to reduce fragmentation. You can run a full optimization with the following SQL:
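The optimization command takes the following general form; the index name here is illustrative, and the rebuildtextindexes script described below issues the equivalent command for each Text index in turn.

```sql
-- Full optimization of an Oracle Text index to remove fragmentation
ALTER INDEX item_text_idx REBUILD ONLINE
  PARAMETERS ('optimize full maxtime unlimited');
```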
The rebuildtextindexes script will run this command for each of the Text indexes in the ACM user; it can be called from a Windows scheduled job or a Unix crontab to perform this work. It should be altered appropriately to suit the license available.
It is recommended that this SQL be run on at least a weekly basis on an instance with a high level of editorial activity. However, the ‘ONLINE’ option is only available if you have the Enterprise license for Oracle database. If you have the Standard edition, then you need to remove the word ‘ONLINE’ from the above script; it is recommended that you take a copy of the script, place it in /usr/local/ACM/scripts (or D:\Alterian\scripts), alter the copy accordingly and call that copy from the scheduled job or crontab. This way the script will not be reverted to the ‘ONLINE’ version every time you perform an ACM software update.
The SQL Server database should be created with reference to the appropriate ACM installation guide and should deliver good performance and require little in the way of routine maintenance. However a few areas do need attention, particularly the regular maintenance of table statistics and indexes.
SQL Server Management Studio allows you to create administration jobs and execute them on a schedule; these are termed Maintenance Plans. The following approach is recommended for setting up appropriate maintenance plans for ACM databases:
1. Setup a SQL Server Agent job to clear the queryresults table – Add a new job to the SQL Server Agent, name it appropriately and add a single step. This step should perform one action, to clear the queryresults table in the database. A suitable command would be ‘truncate table queryresults’
2. Daily Maintenance Plan – Use the SQL Server Maintenance Plan Wizard to create a new Maintenance Plan, called ‘Daily Maintenance’
a. schedule it to run at midnight
b. choose ‘Execute a SQL Server Agent Job’ and ‘Update Database Statistics’
c. Choose the job you created in step 1.
d. Choose the ACM database to update the stats and run the job on
3. Weekly Maintenance Plan – Use the SQL Server Maintenance Plan Wizard to create a new Maintenance Plan, called ‘Weekly Maintenance’
a. Schedule to run on Sunday morning at about 2 a.m.
b. Choose to ‘Re-organise Indexes’ on the ACM database
The database should also be spread across a number of disk partitions in order to ensure that disk writes and reads do not become a bottleneck on performance. It is general SQL Server best practice to have two partitions on separate disks to store the database and log files respectively, allowing SQL Server to access data and log files independently. On large-scale installations with many content editors, it may be advisable to also split the data files across multiple disks. As with Oracle installations, RAID should be used to provide disk redundancy.
Disk space should be monitored closely, and disk activity should be monitored using Performance Monitor.
The CAE maintains caches for most types of ACM object so as to avoid time-consuming calls to the database for data that is often required. The default cache sizes are suitable only for a small implementation, and some tuning is often required.
You may start to get warnings in the CAE log when the CAE forces items out of their caches and this is a prompt to consider resizing the appropriate cache. Please see the Application Engine Reference guide for a description of how these caches are managed.
Note that some form of cache tuning is generally required on a CAE delivery server. Prior to version 5.8 there was no messaging to tell the delivery server CAE when content items changed. The default max_age property for the various caches is ‘0’, or infinite; this is suitable for a staging server, where the SyncD process expires content from the CAE caches when it changes, but on a delivery server a max_age value of 60 (1 minute) may be more appropriate for the item cache and any other caches whose content is subject to frequent updates.
From version 5.4 onwards there is a CAE property named mae.domain.itemfactory.enabled that allows us to switch off the item cache altogether by using a value of ‘FALSE’. This is useful if there is a web application in front of the MAE that caches metadata about ACM items, as such an application may only ask for an item once in response to its updatedate changing, so it needs to be given the changed version, not an older cached one.
From version 5.8 onwards the live server SyncD functionality allows a local SyncD process to run for each live (or delivery) database. This receives change messages from the authoring environment by reviewing the contents of the SYNCDQUEUE table; as DistD (or SQL Server Replication) replicates changes to the delivery database the local SyncD is thus able to alert any local CAE processes to the change.
The ‘live SyncD’ functionality is enabled on the primary authoring server by configuring SyncD with the wrapper.app.parameter of ‘-t’ in $MS_HOME/etc/cmauth/syncd.conf. This tells SyncD to record any item changes into the new SYNCDQUEUE table. DistD pushes the contents of this table to the delivery database. On the primary delivery server for each delivery database a ‘live’ SyncD process is running. This is configured with the wrapper.app.parameter of ‘-vc’ in $MS_HOME/etc/cmdel/syncd.conf.
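In a wrapper configuration file of this kind the application parameter entries are numbered, so the relevant lines take roughly the following form (the parameter index shown is illustrative and depends on any other parameters already present in the file):

```properties
# $MS_HOME/etc/cmauth/syncd.conf (authoring server):
# record item changes into the SYNCDQUEUE table
wrapper.app.parameter.1=-t

# $MS_HOME/etc/cmdel/syncd.conf (primary delivery server):
# run as a 'live' SyncD, alerting local CAE processes to changes
wrapper.app.parameter.1=-vc
```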
With ‘live SyncD’ configured the CAE data caches can be set to have an infinite expiry time in the same way as those on an authoring server.
From 5.8 onwards, then, tuning the CAE data caches only requires setting appropriate values for the max_size property of each cache and ensuring that the ‘live SyncD’ process is functioning correctly.
The CAE should have its java heap size (assigned memory) tuned like any other Java process.
On a Unix server prior to ACM version 5.8 the amount of memory assigned to the CAE can be altered using the msctl.MAE.java_flags property in the ACM properties file. The string JAVA_FLAGS=’-Xms256M;-Xmx256M’ needs adding to the start of this property.
On a pre-5.8 Windows server this is changed by altering the registry key /HKEY_LOCAL_MACHINE/SOFTWARE/Mediasurface/MAE/jvmproperties to have the following options at the start of the key. Do not add any spaces to the key value.
Since version 5.8 the ACM services on all servers are called from a java wrapper process and the service and its tuneable options are defined in a set of wrapper configuration files found under $MS_HOME/etc/<instance name>/, each service has its own configuration file.
Each service can have the minimum and maximum memory limits configured using the following properties:
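Assuming the standard Java Service Wrapper property names, the relevant entries take the following form (values are in megabytes and the figures shown are illustrative):

```properties
# Initial (minimum) Java heap size for the service, in MB
wrapper.java.initmemory=256

# Maximum Java heap size for the service, in MB
wrapper.java.maxmemory=1024
```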
On 32-bit operating systems there is a limit to the amount of memory that can be assigned to a java process. This amount varies depending upon the operating system, the application and the amount of contiguous memory available to it at the time. In practice, there is an upper limit of about 1500MB that can be assigned to the CAE. If you need to be able to assign more memory then you should really be using a 64-bit operating system.
The CAE maintains a pool of connections to the Oracle database in order to support a variable amount of work and to reduce the amount of time spent creating and dropping sessions. The size of this pool has a direct bearing on the amount of memory used and the amount of work that Oracle can do at once. By default the mae.db.pool.num_of_connections property is set to 5; this is sufficient for most instances. However, if your application makes heavy use of calls to Oracle (which may be the case in an authoring environment or a delivery environment with a high proportion of dynamic content) then it may be advisable to increase this value.
It is possible to configure a CAE job that logs the status of each of the CAE’s database connections to the CAE log file every 5 minutes. This is useful for monitoring database performance problems should they arise. To achieve this, the following text can be added to the file mae-jobs-ms.xml.
The CAE binary buffer supports the retrieval and update of binary objects and a number of properties are used to define the memory allocated to this task and the way that the CAE manages concurrent requests.
The two CAE instance properties mae.persistence.item.binary_buffer.block_size and mae.persistence.item.binary_buffer.num_blocks determine the size of the memory segment allocated to handling binary data. If you are going to be transferring lots of binary objects between the ACM editorial clients and the server then it is good practice to increase the number of blocks. If you have a large number of editors and you anticipate large numbers of dynamic requests for binary objects such as images from your web sites, then you may wish to consider increasing the value of the property mae.persistence.item.binary_buffer.num_write_threads. However, the default values for the binary buffer properties (5120, 8000, 400, 20, 3000 and 5000) tend to be adequate for most clients.
Please refer to the Application Engine Reference Guide for details of all of the properties affecting the handling of binary objects.
In a well configured public facing web site environment the only requests that are made to ACM will be for dynamic content, and for content which has expired at the web server cache.
For extranets or web sites requiring authentication or HTTP encryption (HTTPS) it is generally not possible to allow caching of HTTP pages. For such sites it is doubly important to ensure best use of the caching features available within the web application described below such as OSCache and the ACM Performance Layer. Traditionally such sites might be expected to perform rather slower as all requests must be dynamically served but experience has shown that the appropriate use of the caching technologies described below has enabled dynamic web sites to be delivered almost as fast as cacheable sites.
In version 5.9.5 Alterian introduced the Performance Layer. This layer uses JBoss cache replication technology to replicate the CAE API response data from the CAE server to the CAE client. This has the effect of making much of the data required to render web pages available and up to date local to (in the same JVM as) the web application.
The key effect of this feature is to remove the biggest bottleneck from the dynamic performance of ACM content delivery, that being the performance of the CORBA messaging between CAE client and server. CORBA provides a communication architecture that allows ACM client and server to exist on separate server architectures and operating systems. However, since the advent of fast multi-core CPUs Alterian have found that it was not possible to utilise all of the CPU available on a modern server because the CORBA communication required to support increased workloads would lead to a request queue forming at the application server well before the CPU was fully utilised. It was possible to install multiple application server and CAE pairs on a single server but this is unduly complicated. The performance layer has removed this bottleneck and now most requests can be satisfied locally within the application server JVM.
The result is that we are now able to utilise all of the power of a multi-core CPU. It also produces a much more efficient delivery method, as data in the replicated caches is only replaced if the cache becomes full or the content changes.
Note that the performance layer requires no change to the use of the ACM API; it is entirely hidden from users and site developers. The only difference to note is that the performance layer increases the RAM requirement on the application server.
When creating JSP page templates we have the ability to cache data, to cache page fragments or to create page headers that will determine how the whole page may be cached by caching/proxy servers or by individual browsers.
ACM currently only has direct control over the forward HTTP cacheability of binary objects. This is achieved separately for each content item type using the public and private cache settings that are defined for each type using the ACM Java Client.
All other content should have appropriate cache control headers created in the templates used to deliver that type of content in your web applications. Each type of content has a template which can include cache header statements, based on the ACM getCacheTime and getPublicCacheTime API functions, for defining suitable cache controls for that type of content.
Any HTTP compliant cache server will be capable of reading these cache headers and caching the content appropriately. The first time an item of content is requested it will be fetched from ACM and the cache server will store a copy in its cache if directed to do so. Any further requests for this item within the expiry time for the item will be served from the cache, without contacting ACM.
It is therefore possible for web site developers to manage all aspects of the cacheability of the content that will be created on their site with no need for system administrators to be involved in this process other than ensuring that web servers are configured to implement caching according to HTTP cache control headers.
HTTP cache control headers tend to be implemented as part of an include file that is called at the beginning of the JSP template. This include makes use of the getCacheTime and getPublicCacheTime methods to retrieve the cache times for the appropriate type of content from the ACM repository. For example, the standard view of items of the Branch content type in an ACM site might be delivered via the following template:
This file starts with the following line:
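The include directive takes this general form (the file name shown is illustrative):

```jsp
<%@ include file="includes/cachecontrol.jsp" %>
```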
And this include file contains cache control response headers that are derived like this:
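A sketch of such an include file is shown below. It assumes an item object exposing the two API methods named above; the exact object name and the way it is placed in scope will depend on your site pack.

```jsp
<%
    // Hypothetical sketch: derive cache lifetimes from the item type settings
    long maxAge  = item.getCacheTime();        // private (browser) cache time, seconds
    long sMaxAge = item.getPublicCacheTime();  // public (shared cache) time, seconds

    response.setHeader("Cache-Control",
        "public, max-age=" + maxAge + ", s-maxage=" + sMaxAge);

    // Expires header for HTTP/1.0 caches that do not understand Cache-Control
    response.setDateHeader("Expires",
        System.currentTimeMillis() + (maxAge * 1000L));
%>
```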
The HTTP cache-control headers will set both the s-maxage, or public, and max-age, private (browser), cache times for the page to the times defined in ACM for this item type.
Note that cache control headers are only implemented as part of the HTTP/1.1 specification. Many caching solutions currently only support HTTP 1.0. Both specifications support the use of the Expires header and so this should be included as well. The above example also sets this header appropriately.
In order to implement an appropriate strategy for caching different types of content, the site owners should be consulted as to the appropriate cache times for each type of content on their site. For example, a Latest News page might require a cache time of say 5 minutes, whereas a Local Amenities page could possibly be cached for several hours. Binary items that will never change should be given long cache times. Clearly it is important to ensure that appropriate cache times are associated with each type of object when defining object types with the ACM Java Client.
All types of HTTP cache will now be able to serve an object if it is in the cache and fresh (still within its max age or Expiry time). If the object is no longer fresh the cache will contact ACM and validate whether the current version has been changed, this is done by sending an If-Modified-Since header to ACM. Only if it has changed will ACM regenerate the object.
Together, freshness and validation are the most important ways that a cache works with content. A fresh object will be available instantly from the cache, while a validated object will avoid sending the entire object over again if it hasn't changed.
In order to reduce the number of calls to ACM it is possible to create a structure within the web application that stores commonly requested ACM data. A good example of this is the Navigator cache that was delivered with the standard ACM (Java) Site Packs and available as part of an engagement with Alterian Professional Services. The data required to determine site navigation is stored in the application memory and therefore shared by all templates and available locally, without making relatively expensive API calls.
An alternative and more common method of caching within the web application is to use the third-party product OScache to cache page fragments. This product has its own tag library that allows you to mark sections of a page as being locally cacheable for variable lengths of time. This is especially useful for web sites that are not suitable for caching on public cache servers, such as extranets. You can find out more about this product at www.opensymphony.com/oscache/. There are also some useful notes on configuring OSCache available on the Alterian support extranet – run a search for OSCache at http://supportal.alterian.com/sitesearch.
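A typical use of the OSCache tag library to cache a page fragment for five minutes looks like the following; the taglib URI must match the one declared for OSCache in your web.xml.

```jsp
<%@ taglib uri="oscache" prefix="cache" %>

<cache:cache time="300">
    <%-- Expensive fragment cached locally for 300 seconds,
         e.g. a navigation menu built from ACM data --%>
</cache:cache>
```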
ASP.NET provides the following three caching techniques:
· Cache API
· Output caching
· Partial page or fragment caching
These caching techniques are summarized in the following sections.
The Cache API can be used to cache application-wide data that is shared and accessed by multiple requests. The cache API is also a good place for data that you need to manipulate in some way before you present the data to the user.
You should avoid using the cache API in the following circumstances:
· The data you are caching is user-specific. Consider using session state instead.
· The data is updated in real time.
· Your application is already relatively static, and you do not want to update the cache very often. In this case, consider using output caching.
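The basic read-through pattern with the Cache API can be sketched as follows; the cache key and the loader method are hypothetical and stand in for whatever shared data your application builds from ACM.

```csharp
// Hypothetical sketch: cache application-wide data with a 5 minute absolute expiry
DataTable menu = (DataTable)Cache["menuData"];
if (menu == null)
{
    menu = LoadMenuDataFromAcm();   // hypothetical helper that calls the ACM API
    Cache.Insert("menuData", menu,
        null,                                   // no cache dependency
        DateTime.UtcNow.AddMinutes(5),          // absolute expiration
        System.Web.Caching.Cache.NoSlidingExpiration);
}
```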
The output cache enables you to cache the contents of entire pages for a specific duration of time within IIS. It enables you to cache multiple variations of the page based on query strings, headers, and userAgent strings.
The page will be cached in IIS memory and will live there either until the cache time expires or IIS runs short of memory and expires the page in order to make room for something else.
Within an ACM-p site we can integrate the output caching settings early on within the web-app and template development and control the parameters and durations through some settings in the web.config.
For example, we create the following settings (in the web.config or .config file) which activate and specify the overall Output Caching options:
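In outline, the enabling section of the web.config takes this form:

```xml
<system.web>
  <caching>
    <outputCache enableOutputCache="true" />
  </caching>
</system.web>
```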
Specifics on the settings can be found here: http://msdn.microsoft.com/en-us/library/ms178606.aspx
Then we can create some profiles (in the web.config or .config file) to be used within specific page templates:
<outputCacheSettings>
    <outputCacheProfiles>
        <add name="StaticPage" duration="14400" varyByParam="*" location="ServerAndClient" varyByHeader="*"/>
        <add name="DynamicPage" duration="300" varyByParam="*" location="ServerAndClient" varyByHeader="*"/>
    </outputCacheProfiles>
</outputCacheSettings>
This means we can specify a particular template as being in the Static or Dynamic profile, and a change to the duration is controlled in a single setting. For example, we could have a standard content page, which changes rarely, on a 4 hour cache time, and a news listing branch template on a 5 minute (300 second) cache duration. An example declaration within the template would be:
<%@ Page Language="C#" MasterPageFile="~/Master_Pages/Page.master" AutoEventWireup="True" Inherits="templates_Page_Standard" Codebehind="Standard.aspx.cs" %>
<%@ OutputCache CacheProfile="DynamicPage" %>
The ASP.NET output cache is very efficient, given enough memory; however an alternative to output caching is to use an Apache or Squid HTTP proxy cache in front of IIS. This can be a good idea, particularly if security considerations dictate that a separate web server is required in your DMZ to protect the application. This server can double as a cache and save resources on IIS that might be better used serving your application. This form of caching is achieved by adding cache control headers in a similar manner to that described above for caching in a Java web application, but using the SetCacheability() and SetExpires() methods of the ASP.NET System.Web.HttpCachePolicy class.
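As a sketch, these headers might be set in a template's code-behind as follows; the 10 minute cache time is an illustrative value:

```csharp
// Illustrative sketch: emit cache control headers so that an HTTP proxy
// cache (Squid, Apache mod_cache) in front of IIS can cache this page.
protected void Page_Load(object sender, EventArgs e)
{
    // Allow caching by shared (proxy) caches as well as browsers
    Response.Cache.SetCacheability(HttpCacheability.Public);
    // Expire the cached copy 10 minutes from now (example value)
    Response.Cache.SetExpires(DateTime.Now.AddMinutes(10));
}
```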
Partial page or fragment caching is a subset of output caching. It includes an additional attribute that allows you to cache a variation based on the properties of the user control (.ascx file).
Fragment caching is implemented by using user controls in conjunction with the @OutputCache directive.
Use fragment caching when caching the entire content of a page is not practical. If you have a mixture of static, dynamic, and user-specific content in your page, partition your page into separate logical regions by creating user controls. These user controls can then be cached, independent of the main page, to reduce processing time and to increase performance.
Just as with JSP template design, one obvious candidate for the use of fragment caching is the caching of navigation menus, as they contain data that is relatively static.
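As a sketch, a navigation menu user control might be cached for 10 minutes with a directive at the top of its .ascx file; the file and class names here are hypothetical:

```aspx
<%@ Control Language="C#" CodeBehind="NavMenu.ascx.cs" Inherits="controls_NavMenu" %>
<%-- Cache the rendered menu for 600 seconds, one shared copy --%>
<%@ OutputCache Duration="600" VaryByParam="none" %>
```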
For more information on caching in ASP .Net applications please refer to the following Microsoft Support note: http://support.microsoft.com/kb/323290
Post-cache substitution provides an alternative to fragment caching: instead of caching a small fragment of a dynamic page, post-cache substitution marks a fragment of a page which has to be generated dynamically, leaving the rest of the page to be cacheable.
The choice of whether to use post-cache substitution or fragment caching should depend on the proportion of the page to be generated dynamically. If it is small then use post cache substitution, if it is large use fragment caching.
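As a sketch, the dynamic fragment is marked with the asp:Substitution control, whose callback must be a static method taking an HttpContext; the control and method names here are hypothetical:

```aspx
<%-- The rest of the page is output-cached; only this fragment
     is regenerated on every request --%>
<asp:Substitution ID="UserGreeting" runat="server" MethodName="GetGreeting" />
```

```csharp
// Static callback invoked on every request, even when the page is cached
public static string GetGreeting(HttpContext context)
{
    return "Generated at " + DateTime.Now.ToString("HH:mm:ss");
}
```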
For a comprehensive introduction to ASP.Net caching you should read the ASP.Net caching guide, available from Alterian support.
The performance of any HTTP caching product will be governed by a set of caching rules that determine what will and what will not be cached. Some products may have standard rules that conflict with the use of ACM. One of the most common of these is a rule preventing the caching of URLs with question marks in them. These are generally used to denote query strings and hence would not normally be appropriate candidates for page caching; however in ACM we use a question mark to denote a view, and hence we would wish to be able to cache such URLs. The Squid caching server has such a rule and it must be removed as part of the configuration of the web server.
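In Squid this default rule typically appears in squid.conf as the following pair of lines, and the deny line should be removed or commented out (in older Squid versions the directive is 'no_cache deny QUERY' rather than 'cache deny QUERY'):

```
acl QUERY urlpath_regex cgi-bin \?
cache deny QUERY
```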
It is also possible that external services such as Akamai’s Dynamic Site Accelerator might be used. This product utilises Akamai’s global delivery and caching network to enable the caching of site objects on a server close to the end user – cutting out much of the network communication that slows down web sites when viewed over long distances. Such solutions often have their own rules for how to cache web content and these may override the values set in cache control headers. This can often be a good thing, but it is important to know exactly how such services are configured so as to understand the complete flow of content change from editor to browser.
Another service that can potentially disrupt the normal behaviour of caching rules and content change is the use of Web Proxy Accelerators such as Squid or Bluecoat. Large organisations often place such devices at their internet access point so as to cache objects locally and avoid the need for multiple local users to request a page from source within a given timeframe. Ideally this time frame would be dictated by a cache control header for the page but it is possible to over-ride this default and set a global cache time for all content (this might be done to reduce internet bandwidth requirements). This sometimes explains why some browsers may not see a change to a piece of content until long after it is visible on a PC elsewhere.
An important part of the caching strategy should be a statement of which pages won’t be cached. If the site is to deliver pages that will be different for each user, or require authentication, these should not be cacheable on public cache servers. As well as personalised and authenticated pages, this also includes any forms and/or search results pages.
For example, if you show the logged-in user name on the home page, the page cannot be cached and every hit on the home page will require the web server to re-build it. Re-generating the home page for every hit is not desirable, and so some form of web application component caching should be used, as described in the section above.
So the strategy documentation should make it clear to developers what can and what cannot be cached at an HTTP server or browser, and it should make it clear how to control this behaviour when developing page templates.
It is important to establish a documented caching strategy at the outset of any implementation. Your system documentation should detail how and where the various elements of the site will be cached. This document should be reviewed regularly to ensure that ongoing developments conform to the agreed strategy.
The goal of your caching strategy should be to ensure that as much content as possible is cached at the browser, the web server or the application server without compromising the required dynamic element of content delivery. Images and binary objects such as pdfs can usually be cached for long periods of time. Page navigation and other elements of a page can also be cached; even if it is for just 10 minutes this may have a dramatic effect.
A J2EE servlet container provides the JVM in which a web application runs. This is commonly referred to as an application server; however an application server generally provides a number of other complementary services such as database connection pooling, Java Messaging Services and clustering. Only the J2EE container is required to deliver web sites using ACM.
All J2EE containers will have a default memory heap size that the Java runtime process runs within. This may well need tuning to cater for the memory requirements of your web application. This is easily done by adding or altering the Java runtime options in whatever script is used to start the product. The -Xms option sets the starting size for the Java heap and the -Xmx option sets its maximum size.
In Tomcat this can be done by adding these options within the Catalina.sh script on Unix or by using the Windows program d:\Tomcat\bin\tomcat5w.exe, which provides a GUI for the configuration of the Tomcat service.
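For example, a line such as the following near the top of catalina.sh sets both values; the 512MB figures are illustrative and should be sized to your web application:

```shell
# Illustrative heap sizes: start and cap the Java heap at 512MB
JAVA_OPTS="$JAVA_OPTS -Xms512m -Xmx512m"
```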
In Weblogic this is done by adding the string '-Xms512m -Xmx512m' to the Startup arguments property on the Weblogic Server Startup Options page.
In JBoss this can be done by editing the run.conf file, or on Windows, if JBoss has been configured to run as a service, the JVM options can be configured in the Windows registry. The JVM options can be found under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\JBoss\Parameters; then take the following steps:
1) Increase the value of the "JVM Option Count" by 2, if adding the Xms and Xmx values.
2) Add a REG_SZ parameter "JVM Option Number 1" and assign it the value -Xms512m.
3) Add a REG_SZ parameter "JVM Option Number 2" and assign it the value -Xmx512m.
The default configuration of the Sun Java Virtual Machine is sufficient for most situations; however experience has shown that minor performance improvements can be made by opting to use the parallel garbage collector. This should be added as a pair of wrapper.java.additional parameters in the cae.conf file for the ACM instance.
Remember that these parameters need to be presented in a sequential list with no numbers missing from the sequence.
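For example, if cae.conf already contains parameters numbered 1 and 2, the garbage collector options (an illustrative pair) would be appended as numbers 3 and 4:

```
wrapper.java.additional.3=-XX:+UseParallelGC
wrapper.java.additional.4=-XX:+UseParallelOldGC
```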
The default configuration of JacORB is sufficient for most ACM clients, however if your web application often makes calls that return large amounts of data, and particularly if your web application is running on a separate server to the ACM, then the following changes should be made to tune the JacORB configuration.
We should set the fragment size for CORBA communication by adding the following parameter to the Tomcat java options in catalina.sh. It should also be added as a wrapper.java.additional parameter in the cae.conf file for the ACM instance.
In order to prevent CORBA timeouts while Tomcat is waiting for responses to calls that require lots of data we should add the following to the Tomcat java options in catalina.sh
The first (x) and last (y) of these timeouts cause a thread that has not finished reading from a socket to wait for x milliseconds; if the read is still not complete after this sleep, it will sleep again for x+y milliseconds and try again, with each subsequent wait adding a further y milliseconds to the previous wait.
On 32-bit operating systems there is a limit to the amount of memory that can be assigned to a java process. This amount varies depending upon the application and the amount of contiguous memory available to it at the time. In practice, there is an upper limit of about 1500MB that can be assigned to Tomcat. If you need to be able to assign more memory, then it is possible to do so on an Alterian application server by using Windows Server Enterprise Edition. Alternatively a small increase in the total amount of addressable memory can be made on Windows servers by using the /3GB switch in the boot.ini file; this switch allows up to 3GB of RAM to be used by applications (by default only 2GB is available to applications and 2GB is reserved for the operating system) with the effect that up to 2000MB can sometimes be allocated to a single JVM.
Note that by using the Windows Process Address Extension feature more than 4GB of RAM can be used on a 32-bit server but the limit of 1.5-2GB for any single JVM remains. Note also that the Weblogic ‘JRockit’ JVM is capable of addressing rather more memory as it is able to utilise multiple non-contiguous memory segments, whereas the Sun JVM is only able to utilise a single contiguous block of memory – this explains why the total memory available varies between servers depending upon the amount of contiguous memory available once the operating system and services have been started.
64-bit servers are able to address 16 terabytes of memory – enough for even the most demanding of web applications. For the reasons described here Alterian recommend the use of 64-bit operating systems for all new ACM implementations.

CPU is used by the operating system kernel and by applications. If you have poor disk performance or a lack of memory then CPU time is wasted in waiting for I/O rather than processing application code. Some CPU time is also wasted in context switching (switching between outstanding tasks).
The ACM Smart Client is a .NET application used by site administrators and content authors to manage ACM web site content. The client requires the Microsoft .NET Framework to be installed locally and requires free CPU and RAM to run efficiently.
The client maintains a number of separate interfaces with the Authoring server in order to work; it connects to the CAE server in order to retrieve content, to the SyncD in order to change content and see other people’s changes in real-time, and it connects to the J2EE application servlet container in order to both preview content changes on the web site and to allow authors to edit content in the context of the HTML page itself.
All of these interfaces, and the richness of the functionality that the client provides, mean that the client is highly dependent upon the latency and bandwidth of the network connection between the client PC and the authoring server. Experience has shown that the performance of the Smart Client tends to deteriorate if the latency of the client’s connection to the server is greater than about 40 milliseconds. Also, as network bandwidth becomes saturated the performance of the Smart Client will suffer rather more than other applications.
For these reasons it is generally recommended that the Smart Client should only be used by authors and administrators connecting over networks with spare capacity and within relatively close proximity to the authoring server. This is often not a problem for smaller companies with data centres local to their business operations however for larger enterprises with distributed offices and hence distributed content managers this can be a serious problem. Luckily Alterian provides the Web Client for just this reason.
Network performance is the key factor affecting Smart Client performance. It is therefore good practice to establish what is normally understood to be acceptably good performance. A benchmark should be recorded that describes typical authoring activities and how long each of them should take; ideally from at least two physical locations: by the server itself (to rule out network factors) and from an authoring PC in a remote office. This will enable appropriate diagnostic work to be performed in the event of future performance problems. If the performance of a remote PC is poorer than expected while local performance by the server is fine, then the network is the likely cause of the difference. However if performance is worse at both locations then the effect of the network can be discounted and support efforts can be targeted on the server.
The bulk of the HTTP requests between the browser client and the server consist of requests for static items, and as these items do not change it is appropriate to ensure that they can be cached at the browser and also, ideally, at the server. This is enabled by configuring cache directives at the web server for these types of file. The following Apache virtual host describes a web site named mywebsite that proxies requests to the Web Client application on a server named mytcserver; the ExpiresByType directives force Apache to add HTTP Expires headers to various types of static file; the expiry time given will be 86400 seconds, or 1 day, from the request time.
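A sketch of such a virtual host follows; the port number, paths and MIME type list are assumptions, and the mod_expires and mod_proxy modules must be loaded:

```apache
<VirtualHost *:80>
    ServerName mywebsite
    # Proxy all requests through to the Web Client application server
    ProxyPass / http://mytcserver:8080/
    ProxyPassReverse / http://mytcserver:8080/
    # Add Expires headers 86400 seconds (1 day) from the request time
    ExpiresActive On
    ExpiresByType image/gif "access plus 86400 seconds"
    ExpiresByType image/jpeg "access plus 86400 seconds"
    ExpiresByType text/css "access plus 86400 seconds"
    ExpiresByType application/x-javascript "access plus 86400 seconds"
</VirtualHost>
```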
From version 5.9.5 onwards the Web Client application can be configured to add appropriate headers itself in a very similar way as that described above. In the WEB-INF/web.xml file the following configuration defines a cache Filter with a suitable Cache-Control header and a set of URL mappings for the different types of cacheable static files delivered by the application.
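The shape of that web.xml configuration is sketched below; note that the filter class name, header value and URL patterns here are hypothetical placeholders, not the actual class shipped with the product:

```xml
<filter>
  <filter-name>CacheFilter</filter-name>
  <!-- Hypothetical class name: substitute the filter class shipped with the product -->
  <filter-class>com.example.CacheControlFilter</filter-class>
  <init-param>
    <param-name>Cache-Control</param-name>
    <param-value>max-age=86400</param-value>
  </init-param>
</filter>
<!-- One mapping per type of cacheable static file -->
<filter-mapping>
  <filter-name>CacheFilter</filter-name>
  <url-pattern>*.css</url-pattern>
</filter-mapping>
<filter-mapping>
  <filter-name>CacheFilter</filter-name>
  <url-pattern>*.js</url-pattern>
</filter-mapping>
```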
If a Squid caching server is running on port 80 in front of Apache then it will cache these objects with the headers added by Apache. If Apache is configured to cache HTTP objects and the web application is configured to add headers then Apache will cache them. This will ensure that each object is served from cache for all subsequent requests for the same object during the next day. Browsers will cache the objects in the same way whichever method of adding headers is used.
The result of this change is that the network traffic required to use the Web Client is significantly decreased, to the point where perfectly acceptable performance can be gained over a network connection from the opposite side of the world, so long as reasonable network latency is available from the particular location.
Once network factors have been removed using the above technique it is generally the case that Web Client performance is reasonably similar on a remote browser to that observed close to or on the authoring server itself.
There are a number of functions in the Web Client that can affect performance if attention is not paid to the design of the content structure for a particular site. Each time we perform an action in the Site Planner or Content Explorer the whole screen needs to be refreshed by the web application in case any of the items on the screen have been modified. This is normally a fast process, but if we were to create a folder with large numbers of child items in it then all of these items would need to be refreshed each time we perform any action; this can start to take a reasonably long time, and so Alterian recommend that no more than 200 items be stored in any one folder. Further to this, if a high proportion of editors are likely to have a relatively slow network connection, then this number should be lowered. So if a client wishes to store a large number of items, such as Contacts, PDFs, or Reports, in a Content Store or web site branch, then consideration should be given to creating a suitable storage structure under the main store. An example here would be to create a two-level alphabetical tree structure beneath the Content Store branch where items can be located according to the first two letters in their name. Assuming that editors know the names of the items that they are working on, this method would make it much easier to find content, as well as ensuring that Web Client performance is maintained.
Most web site performance issues will probably relate in some way to the issues discussed previously in this document. This is because the process that takes most time in rendering a web page is the dynamic element of redrawing that page using the database, ACM and web application processes described above.
However there are also aspects of the configuration of the web server that can also impact upon the performance of a site, particularly when observed from a distance.
By default there is no compression of content within the Alterian and J2EE container configuration. Most modern browsers are capable of accepting compressed versions of HTTP objects and we can often reduce the network traffic by up to 70% by using this method, with the result that web pages are delivered more quickly to the browser, particularly if the browser is connecting over a slow or distant network connection.
It is possible to compress content within Tomcat but this can be a rather costly process in terms of CPU. For this reason the best place to add HTTP compression is on the web server. Many clients will have a separate Linux server or servers that sit in a DMZ that is close to the internet but separated from the rest of the delivery architecture by a firewall. These web servers are generally under-utilized in terms of CPU power, because Apache is capable of delivering very high volumes of traffic from a relatively low powered server. And so it makes sense to utilize some of this spare CPU for compressing web content.
In order to add HTTP compression into an existing Apache configuration, firstly add the following Output Filter to the bottom of the Apache httpd.conf file:
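A typical output filter line is sketched below; the MIME type list is an assumption and the mod_deflate module must be loaded:

```apache
# Compress text-based content types on the way out
AddOutputFilterByType DEFLATE text/html text/plain text/css text/xml application/x-javascript
```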
Note that if you are adding compression on a Staging server then you should remove text/html and text/plain from this directive as they can conflict with the use of the In-Context servlet in the Web Client.
Secondly you should add the following directives, either below the line you have just added to httpd.conf, in which case all content matching this output filter will be compressed, or alternatively if you only wish to add compression for a particular web site then add the following lines to the virtual host configuration for that one web site.
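The standard mod_deflate browser workarounds are a reasonable sketch of what would be added here:

```apache
# Netscape 4.x can only handle compressed text/html
BrowserMatch ^Mozilla/4 gzip-only-text/html
# Netscape 4.06-4.08 cannot handle compression at all
BrowserMatch ^Mozilla/4\.0[678] no-gzip
# MSIE masquerades as Netscape but handles compression correctly
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
```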
Once you have restarted Apache then all content of the types matching the output filter will be compressed.
If your ASP .Net web application is being delivered by IIS then you can configure HTTP compression as follows:
· In IIS Manager, double-click the local computer, right-click the Web Sites folder, and then click Properties.
· Click the Service tab, and in the HTTP compression region, select the Compress application files check box to enable compression for dynamic files. Note that if we are using a Squid cache in front of IIS then only static compression should be used.
· Select the Compress static files check box to enable compression for static files.
You can test the compression of objects being delivered to your browser using the FireBug plug-in for Firefox. With the plug-in loaded, activate the Net Panel and examine the request and response headers for a particular page item. You will see a request header like this:
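An illustrative example (the request path and host are hypothetical):

```
GET /styles/main.css HTTP/1.1
Host: www.example.com
Accept-Encoding: gzip, deflate
```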
And a Response header like this:
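An illustrative example:

```
HTTP/1.1 200 OK
Content-Type: text/css
Content-Encoding: gzip
Vary: Accept-Encoding
```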
The request header shows the browser telling the web server that it is capable of accepting compressed content in the two forms ‘gzip’ and ‘deflate’. The response header is provided by the server as it passes the object to the browser and confirms that it is of a compressed ‘gzip’ format.
Website Analyzer is a useful tool for measuring the effectiveness of HTTP compression on web pages and can be downloaded from here:
HTTP compression is particularly effective in decreasing the size of HTML content and CSS files.
Using the above process Alterian clients have been able to significantly improve the performance of their web sites, particularly those sites with a global user base where network performance has a serious impact on the speed of delivery of web pages.
Alterian recommends that all clients use some form of HTTP caching server in front of the web application delivering their web sites.
Apache Web Server can be configured to use the mod_cache module to cache all items that are presented with either Expires or Cache-Control response headers. Alternatively the popular Squid caching proxy server is often recommended as it is generally regarded as being a better performing and slightly more robust product than mod_cache.
As described above, all binary content delivered by the ACM Stream Servlet will contain such headers with cache times derived from the values associated with the content item type. Also HTTP pages, if they are cacheable in nature, should be given appropriate caching headers within the template used to generate each page (as described above for both JSP and ASP.Net templates).
Apache can be configured to run an HTTP cache with the following directives placed in its httpd.conf file:
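A sketch of these directives follows (Apache 2.0/2.2 disk cache; the cache path and directory-level values are assumptions, and the mod_cache and mod_disk_cache modules must be loaded):

```apache
# Enable the disk cache for the whole site
CacheEnable disk /
CacheRoot /var/cache/apache
CacheDirLevels 2
CacheDirLength 1
# Never store the Set-Cookie header with a cached object
CacheIgnoreHeaders Set-Cookie
```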
These directives will tell Apache to keep a cache store under the directory /var/cache/apache. Note that the user running the Apache processes must have write access to the cache store. The CacheIgnoreHeaders directive ensures that Apache does not cache the Set-Cookie header; this can mess up client sessions by passing them someone else’s cookie. Apache will now cache all items according to the settings in either an ‘Expires’ header or a ‘Cache-Control: s-maxage’ header.
This is the basic configuration required for most sites, however if you wish to add further control over outgoing objects you can also get Apache to add Expires headers to any content that does not already have them (using the CacheDefaultExpire, CacheMaxExpire and CacheLastModifiedFactor directives).
Note that the above directives can also be used inside a virtual host definition instead, if you wish to limit caching to certain hosts only – for example it would be inappropriate to add caching to the delivery of an authoring web site as editors always expect to see the latest version of any content when using the preview and In-Context edit functions in the ACM editorial clients.
If you choose to use the Squid caching server then it is generally recommended to run it in front of an Apache web server. Squid will cache anything that passes through it, and so it sits in front of the web sites acting as a caching appliance. Squid has no control over web site names; it just listens on an IP address and proxies all requests that cannot be served from its cache on to a host name. In this configuration Squid will run on port 80 and so Apache will need to be modified to run on port 81.
The following directives in the squid.conf file will configure Squid to act as a proxy in front of Apache:
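A sketch for Squid 3 follows; the site name and the assumption that Apache runs on the same host are illustrative:

```
# Listen on port 80 as an accelerator, ignoring browser Cache-Control headers
http_port 80 accel defaultsite=www.example.com ignore-cc
# Forward cache misses to Apache listening on port 81 on the same host
cache_peer 127.0.0.1 parent 81 0 no-query originserver name=apache
acl our_sites dstdomain www.example.com
http_access allow our_sites
cache_peer_access apache allow our_sites
```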
Note that the ignore-cc option to http_port tells Squid to ignore cache control headers such as max-age=0 that are sent by browsers to force a refresh of the page from the origin server. This option was only introduced in Squid 3.
The Squid cache store and logging are defined as follows in squid.conf:
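For example (the path and sizes are assumptions):

```
# 1000MB disk cache with 16 first-level and 256 second-level directories
cache_dir ufs /var/spool/squid 1000 16 256
access_log /var/log/squid/access.log squid
cache_log /var/log/squid/cache.log
```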
There are many further refinements possible to both Apache and Squid caching. Please consult the appropriate on-line documentation for further details or alternatively advice can be provided by Alterian Professional Services.
It is worth noting that modern load-balancing devices such as those available from Cisco and the BigIP devices from F5 often have the capability to perform HTTP compression and HTTP caching. If this facility is available then the load-balancer is likely to be the most efficient place to perform this work, particularly that of HTTP caching as there would only be one cache, and hence there would be no duplication of application effort caused by caching the same object at the same time on multiple cache servers.
In ACM version 7.0 a new workflow engine was introduced; this replaces the old Publishing Flows defined for each item type. The new engine uses the Enhydra Shark product that has been integrated into the core CAE Java service. A new set of tables has been added to the database repository to store the workflow definitions and statuses for each ACM object.
When upgrading to ACM 7.0 it is possible to automatically generate new workflows for each existing type of content, access to these workflows will then be granted to the appropriate groups. However most pre-ACM 7.0 instances use the same Publishing Flow for many item types and so the result is that the upgraded instance may include multiple identical workflows.
The performance of the new Workflow engine is affected by the number of workflows defined and the number of editorial groups and users that have been granted permission to use them. It is therefore good practice to keep these numbers to a minimum. We therefore recommend that rather than using the automated workflow generation tool (described in the Installation guide in the section on upgrades) we replace this step with a manual procedure of creating the workflows that will actually be required, applying them to the appropriate item types and granting access to them. These procedures are described in the ACM Workflow User Guide.
It is good practice to record the average amount of CPU used by each of the separate application processes on your servers at a time when performance is good. Straight after initial implementation is a good time to do this. One way to do this is to create a simple stress test that performs a repeatable series of requests on the web site. Ideally the test should be directed at the application server; this will remove the element of HTTP caching that might be being performed by the web server. This ensures that you are measuring the performance of the applications, not the effectiveness of the cache.
A simple tool that allows you to execute such a stress test is Microsoft’s Web Application Stress Tool. This tool has been removed from Microsoft’s download site but can be supplied by Alterian if requested. A rather more involved, but superior, tool is Apache JMeter, which can introduce forms logins, wait states and think times; this allows you to model user behaviour and content contribution as well as site delivery.
Try to execute the tests from a point as close to the delivery servers as possible so as to remove the possible impact of network performance from the benchmark; remember we wish to record the performance of the servers, which is different from the performance of the web site as perceived by a remote browser.
Using the chosen tool create a table of results like the following one. Remember to try to run the tests at a time when the web site is likely to be relatively idle, so that the impact of production requests is as small as possible; clearly if you have a separate UAT environment then the test could be run there.
· Tomcat average CPU
· Oracle average CPU
· Total CPU used
· Average page speed
· Pages delivered per second
This data provides you with a benchmark against which you can easily determine which application’s CPU usage may have changed at a time when performance appears to be poor and load has not increased. In order to view how this data may have changed over time it is important to schedule a regular benchmarking exercise on your production servers; this provides valuable insight into how performance may have changed over time and allows site managers to anticipate requirements for performance reviews or capacity increases.
This section provides a small number of performance analysis scenarios and examples of the most likely path to resolution.
Before analysing any particular area you should ensure that your current performance problem is not being created by an increase in load on your site. If the load on the external site has doubled since you last performed a benchmark (as described in the previous section of this guide) then the load on the ACM applications will increase by an amount determined by the cacheability of content on your site. If load has not increased from normal, then refer to the following sections according to the nature of your performance problem.
If site searches are slower than expected, and particularly if they appear to be getting slower over time, then it is possible that your Text indexes have become fragmented as described in section 2.4 of this guide. It is also possible that the cleanlive maintenance script is not being run.
If you are using SQL Server then check that the full text index is being updated correctly.
If the database processes appear to be using up too much of the available CPU on your server (compared to the benchmark figures that you recorded after reading the previous section of this guide), then it is possible that one of the caches referred to in the Oracle Performance section of this document is not performing efficiently.
It might also be that the updatedbstats maintenance script or SQL Server Daily Maintenance Plan is not being regularly run to optimize data access as described previously in this guide.
Further to this, you may need to determine what SQL is using the most CPU time or returning the most data in the database. This is likely to lead to a view on what process is at the heart of the problem. The Oracle STATSPACK package is simple to install and run in order to produce reports describing internal database performance. There are many resources available on the internet describing the installation and use of STATSPACK or the more recent Automatic Workload Repository (AWR).
If you are using SQL Server then SQL Profiler is the tool to use to look for poorly performing SQL.
If the Java process associated with your servlet container application appears to be busy (compared to the benchmark figures that you recorded after reading section 6.4 of this guide) but the MAE and Oracle processes appear to be behaving normally, then it may be that some change has been made to your web application to cause this. Discuss recent changes with your application developers and ensure that they are referring to Alterian best practices for their development work.
If the CAE process appears to be using more CPU than normal, but the servlet container and Oracle database processes are behaving normally, then it is possible that some configuration property has been changed in the ACM instance properties files or that the web application has some new template functionality that is using the ACM API in a different way.
If one area of your site performs poorly while the rest performs as expected, then it is very likely that there is poor coding in the templates delivering that aspect of the site. Any SQL that is called directly from the template may need to be reviewed. Alternatively, if there is some Alterian functionality that is only used in that area of the site, then you should raise the issue with Alterian support. They will either provide guidance on what may be causing the problem, or if the matter is more complex you may need to arrange for a Professional Services consultant to assist you in resolving the problem.
If the Distribution process appears to be taking longer than usual to pass content from a staging server to a live server, as compared to the rate recorded in your benchmark, then there are a number of possible causes.
Firstly, it could be that the task queue is much larger than normal as a result of a large integration run. DistD should not run more slowly when there are more tasks to be processed, but if the DistD property distd.optimizerMode is not set to ‘ALL_ROWS’ then performance will worsen as the task queue grows.
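As a sketch, the property is set in the DistD instance properties file (the exact file name and location vary by installation); the value shown is the one recommended above:

```properties
# Force the ALL_ROWS optimizer mode so that DistD throughput does not
# degrade as the task queue grows.
distd.optimizerMode=ALL_ROWS
```

The DistD process must be restarted for a change to this property to take effect.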
Alternatively, it could be that the Oracle table statistics for the tables related to DistD are not up to date. Please see section 2.3 of this document for a description of this problem and its normal resolution.
Finally, there is a known but very occasional issue with the collection of Oracle table statistics whereby the optimizer fails to recognise newly re-created table statistics. Under this condition the performance of DistD suddenly deteriorates to processing only about 500 tasks every 5 minutes once the TASK table holds more than a certain number of rows (usually about 150,000), and performance does not improve even after the statistics on the TASK table are recreated. The way to resolve this issue is to manually drop and then recreate the statistics on the table, rather than using the normal function, which replaces the statistics in place; this can be performed in SQL*Plus as follows:
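A minimal sketch of the drop-and-recreate sequence, using the standard DBMS_STATS package and assuming the TASK table is owned by a schema named ACM (substitute your actual schema owner):

```sql
-- Drop the existing statistics on the TASK table outright, so that the
-- optimizer cannot continue to use the stale copy.
EXEC DBMS_STATS.DELETE_TABLE_STATS(ownname => 'ACM', tabname => 'TASK');

-- Recreate the statistics from scratch; cascade => TRUE also gathers
-- statistics on the indexes belonging to the table.
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => 'ACM', tabname => 'TASK', cascade => TRUE);
```

This differs from the normal statistics refresh only in that the old statistics are deleted first instead of being overwritten in place.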
Performance normally recovers to its usual rate (between 10,000 and 30,000 tasks per 5 minutes) once the currently executing batch of DistD tasks has completed and the optimizer re-reads the table statistics.
Use SQL Server Replication Monitor to monitor the status of the replication to the delivery database(s). In general, SQL Server Replication is trouble-free, but network latency can sometimes cause the replication process to run slowly. Use the ‘Tracer Token’ functionality of SQL Server Replication Monitor to view the current network latency between the Authoring and Delivery databases.
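Tracer tokens can also be posted and inspected with T-SQL rather than through Replication Monitor. A sketch, run against the publication database at the Publisher and assuming a publication named 'ACM_Delivery' (substitute your actual publication name):

```sql
-- Post a tracer token into the replication stream at the Publisher.
EXEC sys.sp_posttracertoken @publication = 'ACM_Delivery';

-- List the tracer tokens posted to the publication, with their IDs.
EXEC sys.sp_helptracertokens @publication = 'ACM_Delivery';

-- Show Publisher-to-Distributor and Distributor-to-Subscriber latency
-- for one token, using an ID returned by sp_helptracertokens.
EXEC sys.sp_helptracertokenhistory @publication = 'ACM_Delivery', @tracer_id = 1;
```

The latency columns in the history output correspond directly to the figures displayed by the Tracer Token tab in Replication Monitor.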
Both the updatedbstats script and the rebuildtextindexes script can be found in the $MS_HOME/bin directory.
More details on this subject can be found on the Microsoft TechNet site at http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/d52ff289-94d3-4085-bc4e-24eb4f312e0e.mspx?mfr=true