EDIT: Work in progress week of 6/20-6/24. Expect frequent saves/updates during this time, as well as formatting changes.
Staging to Production is the process by which data from one WebSphere Portal environment is copied to a second WebSphere Portal environment. It is also known as a deployment process. This document will discuss many different questions that arise when devising a deployment process for your Portal environments.
This document is intended to be a living document and will be updated regularly. The format will be a Frequently Asked Questions with multiple sections and questions/answers per section. Questions should not be considered ordered, and information in different sections will repeat/overlap. Some questions will be posted without answers - these will be answered in time. The writing style will be a bit more informal / free-form vs. adhering to strict technical writing guidelines, i.e. accuracy and readability of the technical content will be prioritized.
Community feedback is welcome - suggestions for improvement, questions/answers to add to the document (will attribute work!), corrections to mistakes - all welcome. See end of this document for author contact information.
Index of Topics
Section 1: Staging to Production Overview
Section 2: Portal Architecture Considerations
Section 3: XMLAccess and ReleaseBuilder
Section 4: Syndication
Section 5: Portal Application Archive
Appendix A: Terminology
Appendix B: References
Appendix Y: Acknowledgements
Appendix Z: About the Author of this Document
Section 1: Staging to Production Overview
Forward note - the discussion in this section is high-level and not specific to the WebSphere Portal product, i.e. you could substitute the term "WebSphere Portal" with "My cool website" and the same principles would apply. Skip ahead to additional sections if you want to see Portal specific considerations.
Q1) What is staging to production?
A1) Staging to production, from a high level, is taking data from one environment and copying it to a second environment. In non-Portal terminology, this is also referred to as a deployment process. The terminology used with Portal implies two such environments that are typically involved in a deployment process, a staging environment and a production environment. However, any two environments may be used to copy data.
Q2) What environments are involved in staging to production?
A2) Minimum of two environments, no maximum. Environments are typically referred to by their roles:
|DEV
||A development environment. May be a standalone workstation used by a single developer or a shared server used by multiple developers.
|STAGE / INTEGRATION
||Work from multiple DEV environments is brought in here and tested. If it does not work, it is rejected and sent back to the developer for additional work.
|QA / UAT
||Quality Assurance / User Acceptance Testing environment. Changes are much more tightly controlled, and verification testing is performed in this environment after passing through STAGE and before promoting to PROD REND.
|PROD AUTH
||Authoring environment where new web content originates.
|PROD REND
||Production rendering environment which serves data to end users.
|DR
||Disaster Recovery environment - should be 100% identical to PROD REND. If PROD REND suddenly fails, you fail over to this environment.
In practice we have seen some environments combined to reduce costs - e.g. STAGE and QA. It is not recommended to do so as it may introduce additional risk when performing deployments. Determine what risk level vs. cost is acceptable when architecting your deployment process.
Q3) How does data flow between environments?
A3) Typically we see two flows established - one for web content, and a second for all other data.
Web Content Flow:
PROD AUTH originates WCM content. Typically a single piece of content is not considered high risk and can be syndicated between environments freely.
PROD AUTH to DEV
PROD AUTH to STAGE
PROD AUTH to QA
PROD AUTH to PROD REND
PROD AUTH to DR
In summary, PROD AUTH contains the master copy and all copies of web content should originate from it. No other environment should originate web content.
All Other Data Flow:
DEV originates changes to Pages, Portlets, WCM design elements (authoring and presentation templates), themes and skins. In summary - anything which can functionally touch / update multiple locations on the Portal site and present risk should originate in DEV. Why not PROD AUTH for authoring and presentation templates? One mistake and you end up impacting all content authors' work / delaying new deliverables. Mitigate that risk by putting higher-risk elements through a more rigorous set of tests in QA before pushing to either production environment.
DEV to STAGE
STAGE to QA
QA to PROD AUTH
QA to PROD REND
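The two flows listed above can be written down as a simple allow-list, which a deployment script could consult before pushing data. This is a minimal sketch, not part of any Portal tooling: the function name flow_allowed, the content/other labels, and the underscore spelling of the environment names (PROD_AUTH, PROD_REND) are assumptions made for illustration. It encodes only the flows listed in A3.

```shell
#!/bin/sh
# Sketch only: flow_allowed and its labels are hypothetical helpers.
# Returns 0 if the requested flow is one of those listed in A3, 1 otherwise.
flow_allowed() {    # $1 = content|other, $2 = source env, $3 = target env
    case "$1:$2:$3" in
        # Web content always originates in PROD AUTH
        content:PROD_AUTH:DEV|content:PROD_AUTH:STAGE|content:PROD_AUTH:QA) return 0 ;;
        content:PROD_AUTH:PROD_REND|content:PROD_AUTH:DR) return 0 ;;
        # All other data moves DEV -> STAGE -> QA -> production
        other:DEV:STAGE|other:STAGE:QA) return 0 ;;
        other:QA:PROD_AUTH|other:QA:PROD_REND) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `flow_allowed other DEV PROD_REND` fails because page and portlet changes are expected to pass through STAGE and QA first.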
Q4) Is there a way to push updates without affecting a live production environment?
A4) Yes. You may have two PROD REND environments available, one which is live to end users (PROD-A), the second which is not live and has changes made to it (PROD-B). This is known as an active/passive configuration. During a deployment window the changes are made to the passive environment. Once the changes are made and validation testing passes, update your DNS records to have the external hostname point to the passive environment (PROD-B), and your passive environment now becomes the active environment. Should something unexpected go wrong on the recently updated environment (PROD-B), you may revert back to the environment which had no changes made to it (PROD-A) via a quick DNS change.
Active/passive offers a low-risk means of pushing changes between environments. The primary disadvantage is cost - approximately double the hardware and software costs for a second PROD environment - in addition to overhead to maintain a second environment. Further, the Portal servers are not the only servers which would need to be duplicated in an active/passive configuration. Consider duplication of deployment managers, web servers, load balancers, LDAP servers, database servers, etc. in addition to the Portal servers.
Q5) How should I update my Disaster Recovery environment (DR)?
A5) Ideally your PROD REND and DR environments should be identical to each other such that if the PROD REND environment fails, you can fail over to a disaster recovery environment within seconds. This is more commonly known as an active/active configuration.
When pushing updated data - do you update DR simultaneously to ensure the environments remain identical? OR, do you wait to update the DR environment? The answer to this question is also dependent on your acceptable business risk level. From experience we have seen changes pushed to PROD and DR simultaneously multiple times with proper validation testing performed in QA and everything works as expected. We have also seen cases where proper validation was performed in QA and the same changes pushed to PROD and DR simultaneously ended up breaking both PROD and DR, resulting in costly outages.
We would recommend updating PROD first during a maintenance window, allow normal end users to access the site following the maintenance window, then schedule the same updates for DR thereafter. While this will create a slight timing gap in PROD/DR being 100% identical, it also mitigates any risk associated with the updates completely breaking both environments. Arguably, better to have a working environment with some new features/updates missing rather than no working environment available. Less costly to failover to a backup than troubleshoot an extended outage.
Q6) When should I update these various PROD environments?
A6) We'll break this out based on some hypothetical scenarios.
Scenario #1 - active/active
- Peak hours are during normal business hours, 0800-1700, Monday-Friday.
- PROD - Friday night maintenance window
- DR - Friday night maintenance window the following week
*Rationale: ~45 total hours of busy activity (9 peak hours x 5 business days) on the system following the change, before DR is updated. Ensures change does not cause unwanted side effects over a long period of time.
Scenario #2 - active/active
- Global site that is accessed with similar usage patterns 24x7. Schedule as follows:
- PROD - Friday night maintenance window
- DR - Sunday night maintenance window
*Rationale: ~48 hours of busy activity on the system following change. Ensures change does not cause unwanted side effects over a long period of time.
Scenario #3 - active/passive
- PROD-B - Friday night maintenance window
- DR - Friday night maintenance window simultaneously with PROD-B
- PROD-A - no changes made
*Rationale: PROD-A is your fallback if the update fails on both PROD-B and DR.
Note: These are not hard and fast schedules for updating systems but ones we have observed implemented in practice. In most cases, the discussion is based on cost vs. risk assessment with the business. For example - will your business permit changes to a PROD system during business hours? Probably not in scenarios #1 and #2, possibly in scenario #3.
Q7) How frequently should I be updating data?
A7) As often as your business risk will allow it.
- For WCM content updates, these are typically considered smaller / less risky changes, and hourly updates are normal to see.
- For all other updates, schedule the updates weekly, monthly and quarterly, etc. as business risks will allow. Hourly is technically possible but often not acceptable from a risk perspective.
Section 2: Portal Architecture Considerations
Q1) Does Portal support multiple installations in a single environment?
A1) Yes. The installer will detect existing running WAS/Portal instances and prompt you to install to a different directory. It will also recommend a different series of port numbers for the additional Portal installation to run on. The author of this article has successfully installed and simultaneously run all of the following versions in a single environment - Portal 6.0, 6.1, 7.0, 8.0 and 8.5.
Q2) Does Portal support multiple profiles in a single environment?
A2) Yes - you may have a single set of WAS binaries (/AppServer) and Portal binaries (/PortalServer) with multiple Portal profiles. e.g.
Or possibly separated by line of business within a single environment:
This may help reduce software costs. However, one tradeoff is that application binaries are shared between environments. Meaning, if you update either WAS or Portal code levels, you must do so across all profiles simultaneously. We have found from experience this can work in a non-production environment, but not in a production environment. Why? Different lines of business have different timelines for their deliverables. Having to halt ALL deliverables for one line of business so a second line of business can perform a deployment is often unacceptable.
Example #1 - line of business #1 needs to add a new feature to allow single sign-on functionality via SAML login. This change needs to be configured at the DMGR cell level, meaning all profiles will need their Portal servers restarted once the change is made. While the other lines of business do not use SAML login - more importantly - they need to ensure the cell-level / global SAML login configuration change does NOT affect their line of business.
Example #2 - one line of business uses purely portlets. A second line of business uses purely web content. One of those lines of business receives an APAR iFix from IBM for a defect in the software. The APAR must be installed across all profiles simultaneously. The other line of business is impacted by a change that only the first line of business needs and that will in NO way benefit it.
Q3) What is a virtual portal?
A3) Let's rephrase that to "What is a WebSphere Portal server?". It's a series of .ear files, .war files, and .jar files that run on top of WebSphere Application Server. Several configuration elements of WebSphere Application Server - such as JDBC datasources for databases, global security for LDAP, resource environment providers for global variables independent of the code, etc. - are utilized such that it becomes a bit more complex than a single traditional WAS application.
Portal utilizes six separate database domains to store its data - separated by function. When you access the Portal server primary URL - /wps/portal - you are accessing the "base" portal of the system.
Q4) Should I have multiple installations, multiple profiles, multiple clusters, or multiple virtual portals? Or some combination of all of them?
A4) Loaded question - but a common one asked. There is no right/wrong answer on this one - so the most common response starts with "it depends". We'll list the primary pro/con of each.
Multiple installations:
- Pro: Least risky. Separation of lines of business ensures changes to one LOB do not affect a second LOB.
- Con: Most costly - hardware, software and overhead all go up significantly.
Multiple profiles:
- Pro: Single common environment / configuration for multiple lines of business. e.g. Only need to configure LDAP once.
- Con: Changes to WAS or Portal code must be applied to all profiles simultaneously, unlike multiple installations.
Multiple clusters in single profile:
- Pro: Simpler to maintain than multiple profiles configuration - only a single profile to update.
- Con: Cell level changes - such as security changes - must be applied to all LOB's simultaneously. Changes to WAS or Portal code must be applied to all clusters simultaneously.
Multiple virtual portals:
- Pro: Unique to WebSphere Portal product. Simplest of configurations to maintain+update. Can separate LOB per virtual portal.
- Con: Cell+Cluster level changes - such as security+database changes - must be applied to all LOB's simultaneously. Changes to WAS or Portal code must be applied to all virtual portals simultaneously.
What does IBM typically use internally for its environments? Multiple virtual portals. Each LOB has their own virtual portal and can work on their own deliverables independent of the other LOB. Coordination between LOB is needed for some elements that are not separated out by virtual portal - for example WAS/Portal code levels.
Q5) What about Portal farms?
A5) In this author's experience, 99% of Portal environments will want to choose clusters for their architecture. Portal farms are not bad by any means - they are intended to fulfill a specific need as documented in the Portal Infocenter. However, they also have limitations associated with them that clusters do not have. One notable limitation - managed pages are not supported in a farm, and generally speaking, customers utilizing IBM Web Content Management (WCM) would not want to choose a farm architecture. A one-hour presentation given by two Portal architects goes into extreme detail on pros/cons of Portal Farms - the 11-15 minute marks discuss when to choose a farm vs. not choose a farm to answer this question.
Section 3: XMLAccess and ReleaseBuilder
Q1) What is ReleaseBuilder?
A1) ReleaseBuilder is an executable .bat|.sh file located in wp_profile/PortalServer/bin. It is a critical tool used as a part of the overall staging to production process. In some documentation, you may see references to a ReleaseBuilder process. This is synonymous with the staging to production / deployment process with WebSphere Portal.
Q2) What does ReleaseBuilder do?
A2) From a high level - the ReleaseBuilder tool allows you to compare the data on a system from two different points in time, capture the changes made on the system, and create an output file noting the changes made. The output file, commonly called a DIFF file, can thereafter be imported to another environment and have the EXACT same changes made on that second environment. This ensures the exact changes made on one environment can be replicated to a second environment, i.e. we need only copy over the delta of changes made, not a full copy of data. ReleaseBuilder is a standalone Java process that does not require a running WAS or Portal server to execute.
Q3) How does ReleaseBuilder actually work?
A3) Example Scenario:
January 1st, 2016 - Take an ExportRelease.xml XMLAccess export from STAGE. Name it 2016.01.01_stage_exportrelease.xml
xmlaccess.sh -user wpsadmin -password wpsadmin -url http://localhost:10039/wps/config -in /opt/IBM/WebSphere/PortalServer/doc/xml-samples/ExportRelease.xml -out /tmp/2016.01.01_stage_exportrelease.xml
February 1st, 2016 - Take an ExportRelease.xml XMLAccess export from STAGE. Name it 2016.02.01_stage_exportrelease.xml
xmlaccess.sh -user wpsadmin -password wpsadmin -url http://localhost:10039/wps/config -in /opt/IBM/WebSphere/PortalServer/doc/xml-samples/ExportRelease.xml -out /tmp/2016.02.01_stage_exportrelease.xml
Now run ReleaseBuilder against the two files
releasebuilder.sh -inOld /tmp/2016.01.01_stage_exportrelease.xml -inNew /tmp/2016.02.01_stage_exportrelease.xml -out /tmp/2016.01.01_2016.02.01_STAGE_DIFF.xml
The 2016.01.01_2016.02.01_STAGE_DIFF.xml file contains a capture of all changes made to the STAGE environment release database over a one month period of time. It may now be imported to another environment, such as a QA environment, and the QA environment will automatically have the 1 month of changes made in the STAGE environment applied to it. STAGE and QA release databases will be in sync / identical thereafter.
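The scenario above can be sketched as a few shell helpers that keep the dated file names consistent. This is a minimal sketch, not an official IBM procedure: the function names (export_file, diff_file, export_cmd, release_diff_cmd) and the PORTAL_URL variable are hypothetical, and the credentials, URL and paths are simply the ones from the example commands. The xmlaccess.sh and releasebuilder.sh flags are only those already shown in the scenario. The helpers print the commands for review rather than running them.

```shell
#!/bin/sh
# Sketch only: helper names and PORTAL_URL are hypothetical, not Portal tooling.
PORTAL_URL=${PORTAL_URL:-http://localhost:10039/wps/config}
EXPORT_REQUEST=/opt/IBM/WebSphere/PortalServer/doc/xml-samples/ExportRelease.xml

# File name for a dated export, e.g. /tmp/2016.01.01_stage_exportrelease.xml
export_file() {    # $1 = environment (lowercase), $2 = date as YYYY.MM.DD
    echo "/tmp/${2}_${1}_exportrelease.xml"
}

# File name for the DIFF between two dates of the same environment
diff_file() {      # $1 = environment, $2 = old date, $3 = new date
    env_upper=$(echo "$1" | tr '[:lower:]' '[:upper:]')
    echo "/tmp/${2}_${3}_${env_upper}_DIFF.xml"
}

# Print the XMLAccess export command for one environment and date
export_cmd() {     # $1 = environment, $2 = date
    echo "xmlaccess.sh -user wpsadmin -password wpsadmin -url $PORTAL_URL -in $EXPORT_REQUEST -out $(export_file "$1" "$2")"
}

# Print the ReleaseBuilder command for two exports of the SAME environment
release_diff_cmd() {   # $1 = environment, $2 = old date, $3 = new date
    echo "releasebuilder.sh -inOld $(export_file "$1" "$2") -inNew $(export_file "$1" "$3") -out $(diff_file "$1" "$2" "$3")"
}
```

Calling `release_diff_cmd stage 2016.01.01 2016.02.01` prints the releasebuilder.sh command from the scenario. Because the environment name is passed once and reused for both exports, the old and new files cannot accidentally come from different environments.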
Q4) Can ReleaseBuilder be used with two different environments?
A4) No, this is not supported. Let me repeat - THIS IS NOT SUPPORTED!!! (Author's note - I would use blink tags here if I could to emphasize this point). If you create a DIFF file with the intention of importing it into another environment, the DIFF file _MUST_ be created from XMLAccess exports from the SAME environment. If you create a DIFF file using XMLAccess exports from two different environments and import that DIFF file, this is NOT supported, and could create unpredictable results. We have seen from field experience the loss of 25% or more of Portal pages when ReleaseBuilder is used in this manner. ALWAYS use ReleaseBuilder with two exports from the SAME environment.
Q5) Can ReleaseBuilder be used with two different environments if I don't perform an import?
A5) Yes - this is supported. You may generate a ReleaseBuilder DIFF between two different environments so long as the DIFF is NOT used in an import. The tool itself does not prevent this action, and it can actually be helpful for comparing two different environments. IBM Support performs this action regularly when analyzing customer data to check for differences between environments.
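Since the tool does not enforce the rule from Q4 and Q5, a small guard in your own deployment scripts can. The following is a minimal sketch, assuming export files follow the DATE_ENVIRONMENT_exportrelease.xml naming used in the Q3 example; the function names env_of and diff_usage are hypothetical. It only inspects file names, so it cannot detect a file that was named incorrectly.

```shell
#!/bin/sh
# Sketch only: env_of and diff_usage are hypothetical helpers.

# Extract the environment name from an export file name, e.g.
# /tmp/2016.01.01_stage_exportrelease.xml -> stage
env_of() {
    name=$(basename "$1")               # 2016.01.01_stage_exportrelease.xml
    name=${name#*_}                     # stage_exportrelease.xml
    echo "${name%_exportrelease.xml}"   # stage
}

# Classify the DIFF that two exports would produce:
#   importable   - both exports come from the same environment (Q4)
#   compare-only - different environments; analyze it, never import it (Q5)
diff_usage() {     # $1 = old export file, $2 = new export file
    if [ "$(env_of "$1")" = "$(env_of "$2")" ]; then
        echo "importable"
    else
        echo "compare-only"
    fi
}
```

A wrapper script could refuse to run an XMLAccess import whenever diff_usage reports compare-only for the pair of exports the DIFF was built from.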
Appendix A: Terminology
||staging to production
|SOURCE
||system where data originates
|TARGET
||system where data is imported from SOURCE
|XMLAccess
||command line tool to export/import Portal release database data
|ReleaseBuilder
||command line tool to compare two different XMLAccess exports of the same Portal server and produce a differential file showing changes made
|DIFF / Delta
||the differential file created by ReleaseBuilder
|PAA
||Portal Application Archive
|Syndication
||a tool in the Portal administration console which can copy JCR data (WCM Libraries, Managed Pages, Managed Rules, etc.) between two different locations
|One-way syndication
||when syndication sends data from SOURCE to TARGET
|Two-way syndication
||when syndication sends data from SOURCE to TARGET, or, TARGET to SOURCE
|LOB
||line of business
|Base portal
||The main portal site - typically /wps/portal
|Virtual portal
||Separate area of Portal site with different look and feel, typically assigned per LOB. Uses either a URL context or a hostname, e.g. wps/portal/hr OR hr.ibm.com/wps/portal
Appendix B: References
- Step-by-Step Cluster Guide for IBM WebSphere Portal v8.5
- Managing the ReleaseBuilder deployment process for IBM WebSphere Portal - written for Portal v7, content still applicable for v8.0 and v8.5
- Staging to Production Portal 8 without PAA - applicable for Portal v6.0 - v8.5
- Step-By-Step Guide to performing staging to production using Portal Application Archive in WebSphere Portal 8.5 - Portal v8.5 only
- How to generate a complete XMLAccess export of a Portal configuration
- XMLAccess Frequently Asked Questions
- Practical Advice On Deploying Portal as A Farm
Appendix Y: Acknowledgements
- The WebSphere Portal Information Development team for providing the Product Documentation.
- David Batres, Staff Software Engineer at IBM. For countless whiteboard discussions to hash out various technical topics.
Appendix Z: About the Author of this Document
Travis Cornwell is an Advisory Software Engineer at IBM working out of Research Triangle Park, North Carolina, USA. He began supporting the WebSphere Portal product in 2009 and is a subject matter expert in the areas of installation, configuration, administration, security, and performance. Travis has written numerous technical documents for WebSphere Portal and has been the primary technical reviewer/editor for many additional items.
If you have any feedback about the content of this document, Travis can be reached at: email@example.com.
If you encounter any failures following the steps in this guide or would like to discuss your deployment strategy in detail, you may open a PMR with WebSphere Portal Level 2 Support.