Community article: Increasing the Search Engine Optimization ranking for IBM Web Content Manager Web sites
Added by IBM contributor Leslie Gallo on March 5, 2012 | Version 1
Learn how you can remove both the traditional 301 (or 302) redirect from a Web site root to an IBM Web Content Manager URL and the common path part from the URL, such as /wps/wcm/connect/<libraryName>.
Table of Contents
  • 1 Introduction
  • 2 Solution
    • 2.1 Set HomePage answer
    • 2.2 Remove the common path
    • 2.3 Install the Apache module
    • 2.4 Design the VirtualHost
  • 3 Resources
  • 4 About the author

Introduction


When you develop a Web site with IBM® Web Content Manager™ (WCM), you must first define a Library, then a Site Area for your Library, and so on. The result is a long URL with a common path, for example:
http://www.mysite.com/wps/wcm/connect/<library>/<siteArea>/home
For better Search Engine Optimization (SEO), your home page should answer directly at the following URL:
http://www.mySite.com
without returning an HTTP 301 or 302 redirect response.

You should then replace the common path, such as /wps/wcm/connect/<LibraryName>, with a short path, such as /it/ for an Italian Web site. The typical solution is to apply a rewrite rule; however, this solution is incomplete.

Every href in the HTML page still contains the long path, so you end up with two URLs for every object: one short and one long. The search engine sees two URLs pointing to the same resource and divides the ranking between them, thwarting the effort to improve SEO.

For example, if you call http://www.iulm.it/ and display the source page, you can find the long URL, as shown in figure 1.

Figure 1. Long URL example


If you request http://www.iulm.it/wps/wcm/connect/iulmit/iulm-it/Home, you obtain the same page as with http://www.iulm.it/. If a search engine follows a link to the long URL and finds that it returns the same content as the short URL, it halves the ranking.
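The duplication can be made concrete with a small Python sketch. It is only an illustration: the LONG_PREFIX and SHORT_PREFIX values, the sample URLs, and the canonical_path and find_duplicates helpers are hypothetical names chosen for this example and are not part of WCM or IHS. The sketch groups crawled URLs by the resource they resolve to, which is roughly how a long URL and its short equivalent end up counted as duplicates.

```python
from urllib.parse import urlsplit

# Hypothetical values for illustration only; substitute your own library path.
LONG_PREFIX = "/wps/wcm/connect/mylibrary/"
SHORT_PREFIX = "/it/"


def canonical_path(url: str) -> str:
    """Return the short form of a URL path, whichever form was requested."""
    path = urlsplit(url).path
    if path.startswith(LONG_PREFIX):
        # Replace the long common prefix with the short one.
        path = SHORT_PREFIX + path[len(LONG_PREFIX):]
    return path


def find_duplicates(urls):
    """Group URLs that resolve to the same canonical path."""
    groups = {}
    for url in urls:
        groups.setdefault(canonical_path(url), []).append(url)
    # Keep only the paths reachable through more than one URL.
    return {path: hits for path, hits in groups.items() if len(hits) > 1}


crawled = [
    "http://www.mysite.com/wps/wcm/connect/mylibrary/home",
    "http://www.mysite.com/it/home",
    "http://www.mysite.com/it/contacts",
]
duplicates = find_duplicates(crawled)
print(duplicates)
```

Only /it/home is reported, because it is the only resource reachable through both a long and a short URL in the sample list.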

Solution


As an example of correcting this problem, we will look at http://www.sonus.com. We approach the solution in two phases: first we make the server answer a request for the bare FQDN with the home page, then we remove the common part of the URL.

Set HomePage answer


It’s quite simple to configure the IBM HTTP Server (IHS) with rewrite rules to hide the context path of a WCM home page. For example, on http://www.sonus.com/:

1. Activate mod_rewrite in the httpd.conf file:
a) Open the httpd.conf file, found in <installHTTPServer>/conf.
b) Find the LoadModule line that contains rewrite_module, remove the # at the beginning of the line, and save the file.
2. Define a specific VirtualHost for your FQDN as follows:

a) Find the NameVirtualHost directive and activate it, for example, NameVirtualHost <yourIP>:80
b) Find the VirtualHost section and activate it as shown in listing 1.

Listing 1. Activate VirtualHost tag

<VirtualHost <yourIP>:80>
    DocumentRoot www/<mySite>
    ServerName <myFQDNServerName>
    ErrorLog logs/www/<mySite>/error.log
    CustomLog logs/www/<mySite>/access.log common
</VirtualHost>
3. Insert the Rewrite rule by inserting the lines in listing 2 before the </VirtualHost> tag.

Listing 2. Rewrite rule code
	# Comment out the next line to deactivate rewriting
	RewriteEngine On
	# Comment out the next two lines to deactivate the rewrite log
	RewriteLog <path>\<FileLog>.log
	RewriteLogLevel 4
	#--------------------- short URL  WCM
	RewriteCond %{HTTP_HOST} ^<fqdn>
	RewriteCond %{REQUEST_URI} ^(/)?$
	RewriteRule ^(/)?$ /<path to the HomePage>/ [PT,NC]
	#-------------------------------------- End short Url WCM
4. Save it and restart the HTTP server.

Now your Web site answers at http://<yourSiteFQDN>, as http://www.sonus.com does (see figure 2).

Figure 2. Sonus Web Site
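The decision taken by the two RewriteCond tests and the RewriteRule can be mimicked in a few lines of Python. This is a sketch under stated assumptions, not a model of mod_rewrite itself: the host name www.mysite.com, the HOME_PATH value, and the rewrite_root helper are hypothetical and stand in for the <fqdn> and <path to the HomePage> placeholders of listing 2.

```python
import re

# Hypothetical stand-ins for the <fqdn> and <path to the HomePage> placeholders.
HOST_PATTERN = re.compile(r"^www\.mysite\.com")
HOME_PATH = "/wps/wcm/connect/mylibrary/mysitearea/home/"

# Same pattern as in listing 2: an empty path or a single slash.
ROOT_PATTERN = re.compile(r"^(/)?$")


def rewrite_root(host: str, uri: str) -> str:
    """Return the path the server would serve for this host and request URI."""
    if HOST_PATTERN.search(host) and ROOT_PATTERN.search(uri):
        # Internal rewrite: the browser keeps the short URL, no 301 or 302 is sent.
        return HOME_PATH
    # Any other request passes through unchanged.
    return uri


print(rewrite_root("www.mysite.com", "/"))
print(rewrite_root("www.mysite.com", "/it/contacts"))
```

Only a request for the site root on the expected host is mapped to the home page; every other request keeps its original path.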

Remove the common path


This step is more complex. The idea is to dynamically rewrite every link in the page, under these conditions:
  • Link must be relative, such as /wps/wcm/connect/<libraryName>/<SiteArea>/...
  • Link to map your homepage must be /
  • Resource is in your WCM application
To apply this solution you must install an Apache module that provides HTML proxying, that is, the dynamic rewriting of the URLs inside the proxied pages. To install it, you can recompile the module from source, if necessary.

In the .zip file attached to this article, the modules are divided by folder. Copy the content of each folder into the folder of the same name under your HTTP Server installation path. This applies when you work with IHS 7 in a Microsoft® Windows® environment.

The rewrite process relies on reverse proxy functionality, an example of which is shown in figure 3.

Figure 3. Example rewrite process


As the figure shows:
  1. The user requests a page such as http://www.mysite.com/it/home.
  2. The public VirtualHost receives the request and proxies it to an internal VirtualHost such as www1.mysite.com.
  3. The internal VirtualHost sends the request to the application server.
  4. The application server answers with a standard page that contains long URL links.
  5. The internal VirtualHost returns the answer to the public VirtualHost, which dynamically rewrites all the links in the page.
  6. The rewritten page is returned to the user.
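The request mapping and the link rewriting in these steps can be sketched in Python. This is a simplified illustration, not how mod_proxy or mod_proxy_html are implemented: the SHORT_ROOT, LONG_ROOT, and BACKEND values and the to_backend_url and rewrite_links helpers are hypothetical, and the sketch handles only double-quoted href and src attributes.

```python
import re

# Hypothetical values for illustration; adapt them to your own library path.
SHORT_ROOT = "/it/"
LONG_ROOT = "/wps/wcm/connect/mylibrary/"
BACKEND = "http://www1.mysite.com"

# Matches href="..." and src="..." attributes with double-quoted values.
LINK_ATTR = re.compile(r'\b(href|src)="([^"]*)"')


def to_backend_url(request_path: str) -> str:
    """Steps 2 and 3: map the short public path to the long internal URL."""
    if not request_path.startswith(SHORT_ROOT):
        raise ValueError("path is outside the proxied short root")
    return BACKEND + LONG_ROOT + request_path[len(SHORT_ROOT):]


def rewrite_links(html: str) -> str:
    """Step 5: replace the long prefix in every link with the short root."""
    def shorten(match):
        attr, url = match.group(1), match.group(2)
        if url.startswith(BACKEND + LONG_ROOT):
            # Absolute link to the internal host: drop host and long prefix.
            url = SHORT_ROOT + url[len(BACKEND + LONG_ROOT):]
        elif url.startswith(LONG_ROOT):
            # Relative link with the long prefix.
            url = SHORT_ROOT + url[len(LONG_ROOT):]
        return f'{attr}="{url}"'
    return LINK_ATTR.sub(shorten, html)


backend_url = to_backend_url("/it/home")
page = (
    '<a href="/wps/wcm/connect/mylibrary/news">News</a> '
    '<img src="http://www1.mysite.com/wps/wcm/connect/mylibrary/logo.png">'
)
rewritten = rewrite_links(page)
print(backend_url)
print(rewritten)
```

The user only ever sees short paths: the incoming /it/home is expanded on the way in, and every long link in the answer is shortened on the way out.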

Install the Apache module


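To do this:
  1. Expand the attached .zip file into a temp directory.
  2. Copy the content of lib into the lib folder of your HTTP Server installation.
  3. Copy the content of modules into the modules folder of your HTTP Server installation.
  4. Copy the content of bin into the bin folder of your HTTP Server installation.
  5. Copy the content of conf into the conf folder of your HTTP Server installation.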

Now add the following line in your httpd.conf file after your declaration of :
include conf\proxy-html.conf
Activate the proxy modules: in the LoadModule section, find the proxy_module keyword and uncomment the following LoadModule lines:
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_connect_module modules/mod_proxy_connect.so
LoadModule proxy_ftp_module modules/mod_proxy_ftp.so
LoadModule proxy_http_module modules/mod_proxy_http.so
Edit your Proxy_Html.conf file to map to your library environment (see listing 3).

Listing 3. Edit Proxy_Html.conf file

#First, to load the module with its prerequisites.  Note: mod_xml2enc 
# is not always necessary, but without it mod_proxy_html is likely to 
# mangle pages in encodings other than ASCII or Unicode (utf-8). 

# For Unix-family systems: 
# LoadFile	/usr/lib/libxml2.so 
# LoadModule	proxy_html_module	modules/mod_proxy_html.so 
# LoadModule	xml2enc_module	modules/mod_xml2enc.so 

# For Windows 
# LoadFile	<HttpInstallPath>/bin/zlib.dll 
# LoadFile	<HttpInstallPath>/bin/iconv.dll 
# LoadFile	<HttpInstallPath>/bin/libxml2.dll 
# LoadModule	proxy_html_module	modules/mod_proxy_html.so 
# LoadModule	xml2enc_module modules/mod_xml2enc.so


The file assumes the UTF-8 charset; if you use another one, modify the “Charset Section” in accordance with http://apache.webthing.com/mod_proxy_html/config.html.

Design the VirtualHost


In this example:
  • The public FQDN is www.mysite.com.
  • The private FQDN is www1.mysite.com, which is unreachable for Internet users.
  • The short root is /it, which is substituted for the long /wps/wcm/connect/ path.
Listing 4 shows the code for the public VirtualHost.

Listing 4. Public VirtualHost

<VirtualHost <yourIP>:80>
    DocumentRoot www/mysite
    ServerName www.mysite.com
    ErrorLog logs/www/mysite/error.log
    CustomLog logs/www/mysite/access.log common
	#--------------------- short URL  WCM
	#  manage the hidden redirect to /

	RewriteEngine On
	# Rewrite log; comment out the next two lines to disable it
	RewriteLog  logs/www/mysite/rewrite.log
	RewriteLogLevel 4
	RewriteCond %{HTTP_HOST} ^www\.mysite\.com
	RewriteCond %{REQUEST_URI} ^(/)?$
	RewriteRule ^(/)?$ /it/<pathToHomePage> [PT,NC]

	#  <pathToHomePage> is the part of the complete URL that follows
	#  the path mapped by ProxyPass below
	#
	#  for example
	#
	#  my standard complete URL is:
	#  http://www.mysite.com/wps/wcm/connect/mysite/site/home/homepage
	#
	#  my pathToHomePage = "home/homepage"

	# Write debug messages to the error.log file (switch off in production)
	ProxyHTMLLogVerbose On
	LogLevel Debug

	ProxyPass /it/ http://www1.mysite.com/wps/wcm/connect/mysite/site/
	ProxyHTMLURLMap http://www1.mysite.com/wps/wcm/connect/mysite/site/ /it [c]

	<Location /it/>
	  ProxyHTMLEnable On
	  ProxyPassReverse http://www1.mysite.com/wps/wcm/connect/mysite/site/
	  SetOutputFilter proxy-html
	  ProxyHTMLURLMap /wps/wcm/connect/mysite/site/ /it/
	  ProxyHTMLURLMap /it /it
	</Location>
</VirtualHost>

Listing 5 shows the code for the private VirtualHost.

Listing 5. Private VirtualHost

<VirtualHost <yourIP>:80>
    DocumentRoot www/mysite
    ServerName www1.mysite.com
    ErrorLog logs/www/mysite/error.log
    CustomLog logs/www/mysite/access.log common
</VirtualHost>

Restart the HTTP server, and enjoy your new shortened URL page. If your configuration is correct, when you restart the HTTP Server you can see the following lines in your error.log file, indicating that the workers are initialized:

[Tue Feb 14 14:35:53 2012] [debug] proxy_util.c(1808): proxy: grabbed scoreboard slot 0 in child 6384 for worker http://www1.site.com/wps/wcm/connect/it/
[Tue Feb 14 14:35:53 2012] [debug] proxy_util.c(1904): proxy: initialized worker 0 in child 6384 for (www1.site.com) min=0 max=600 smax=600

And when the Proxy-html is working correctly, you can see the following lines:

[Tue Feb 14 14:37:55 2012] [debug] mod_proxy_http.c(56): proxy: HTTP: canonicalising URL //www1.site.com/wps/wcm/connect/it/azienda/
[Tue Feb 14 14:37:55 2012] [debug] proxy_util.c(1494): [client 87.5.140.181] proxy: http: found worker http://www1.site.com/wps/wcm/connect/it/ for http://www1.site.com/wps/wcm/connect/it/azienda/
[Tue Feb 14 14:37:55 2012] [debug] mod_proxy.c(1000): Running scheme http handler (attempt 0)
[Tue Feb 14 14:37:55 2012] [debug] mod_proxy_http.c(1942): proxy: HTTP: serving URL http://www1.site.com/wps/wcm/connect/it/azienda/
[Tue Feb 14 14:37:55 2012] [debug] proxy_util.c(2001): proxy: HTTP: has acquired connection for (www1.site.com)
[Tue Feb 14 14:37:55 2012] [debug] proxy_util.c(2057): proxy: connecting http://www1.site.com/wps/wcm/connect/it/azienda/ to www1.site.com:80
[Tue Feb 14 14:37:55 2012] [debug] proxy_util.c(2155): proxy: connected /wps/wcm/connect/it/azienda/ to www1.site.com:80
[Tue Feb 14 14:37:55 2012] [debug] proxy_util.c(2310): proxy: HTTP: fam 2 socket created to connect to www1.site.com
[Tue Feb 14 14:37:55 2012] [debug] proxy_util.c(2416): proxy: HTTP: connection complete to 172.24.254.17:80 (www1.site.com)
[Tue Feb 14 14:37:55 2012] [debug] mod_proxy_http.c(1725): proxy: start body send
[Tue Feb 14 14:37:55 2012] [debug] mod_xml2enc.c(203): [client 87.5.140.181] Content-Type is text/html; charset=UTF-8
[Tue Feb 14 14:37:55 2012] [info] [client 87.5.140.181] Got charset UTF-8 from HTTP headers
[Tue Feb 14 14:37:55 2012] [info] [client 87.5.140.181] H: matched /wps/wcm/connect/it/, substituting /it/
[Tue Feb 14 14:37:55 2012] [info] [client 87.5.140.181] H: matched /wps/wcm/connect/it/, substituting /it/
[Tue Feb 14 14:37:55 2012] [info] [client 87.5.140.181] H: matched /wps/wcm/connect/it/, substituting /it/
[Tue Feb 14 14:37:55 2012] [info] [client 87.5.140.181] H: matched /wps/wcm/connect/it/, substituting /it/
[Tue Feb 14 14:37:55 2012] [info] [client 87.5.140.181] H: matched /it, substituting /it
[Tue Feb 14 14:37:55 2012] [info] [client 87.5.140.181] H: matched /it, substituting /it
[Tue Feb 14 14:37:55 2012] [info] [client 87.5.140.181] H: matched /it, substituting /it
[Tue Feb 14 14:37:55 2012] [info] [client 87.5.140.181] H: matched /it, substituting /it
[Tue Feb 14 14:37:55 2012] [debug] mod_proxy_http.c(1818): proxy: end body send
[Tue Feb 14 14:37:55 2012] [debug] proxy_util.c(2019): proxy: HTTP: has released connection for (www1.site.com)
[Tue Feb 14 14:38:00 2012] [debug] mod_proxy_http.c(56): proxy: HTTP: canonicalising URL //www1.site.com/wps/wcm/connect/it/azienda/
[Tue Feb 14 14:38:00 2012] [debug] proxy_util.c(1494): [client 87.5.140.181] proxy: http: found worker http://www1.site.com/wps/wcm/connect/it/ for http://www1.site.com/wps/wcm/connect/it/azienda/
[Tue Feb 14 14:38:00 2012] [debug] mod_proxy.c(1000): Running scheme http handler (attempt 0)
[Tue Feb 14 14:38:00 2012] [debug] mod_proxy_http.c(1942): proxy: HTTP: serving URL http://www1.site.com/wps/wcm/connect/it/azienda/
[Tue Feb 14 14:38:00 2012] [debug] proxy_util.c(2001): proxy: HTTP: has acquired connection for (www1.site.com)
[Tue Feb 14 14:38:00 2012] [debug] proxy_util.c(2057): proxy: connecting http://www1.site.com/wps/wcm/connect/it/azienda/ to www1.site.com:80
[Tue Feb 14 14:38:00 2012] [debug] proxy_util.c(2155): proxy: connected /wps/wcm/connect/it/azienda/ to www1.site.com:80
[Tue Feb 14 14:38:00 2012] [debug] mod_proxy_http.c(1822): proxy: header only
[Tue Feb 14 14:38:00 2012] [info] [client 87.5.140.181] No content-type; bailing out of proxy-html filter
[Tue Feb 14 14:38:00 2012] [info] [client 87.5.140.181] No content-type; bailing out of proxy-html filter
[Tue Feb 14 14:38:00 2012] [debug] proxy_util.c(2019): proxy: HTTP: has released connection for (www1.site.com)
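
Rather than reading the log by eye, you can count the substitution messages with a short Python script. This is a minimal sketch that assumes the message format shown in the log excerpt above ("H: matched ..., substituting ..."); the count_substitutions helper and the sample list are hypothetical names used only for this example.

```python
import re
from collections import Counter

# Pattern for the substitution messages shown in the log excerpt.
MATCH_LINE = re.compile(r"H: matched (\S+), substituting (\S+)")


def count_substitutions(lines):
    """Count how many times each (from, to) URL mapping was applied."""
    counts = Counter()
    for line in lines:
        found = MATCH_LINE.search(line)
        if found:
            counts[(found.group(1), found.group(2))] += 1
    return counts


# A few lines in the same format as the excerpt, used as sample input.
sample = [
    "[Tue Feb 14 14:37:55 2012] [info] [client 87.5.140.181] H: matched /wps/wcm/connect/it/, substituting /it/",
    "[Tue Feb 14 14:37:55 2012] [info] [client 87.5.140.181] H: matched /wps/wcm/connect/it/, substituting /it/",
    "[Tue Feb 14 14:37:55 2012] [info] [client 87.5.140.181] H: matched /it, substituting /it",
    "[Tue Feb 14 14:37:55 2012] [debug] mod_proxy_http.c(1818): proxy: end body send",
]
counts = count_substitutions(sample)
for (source, target), hits in counts.items():
    print(f"{source} -> {target}: {hits}")
```

To check a real log, pass an open file object for your error.log file to count_substitutions instead of the sample list. If no long-path mapping appears in the result, the URL map is probably not being applied.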

NOTE: Before implementing this in a production environment, be sure to disable verbose logging.

To see a working example, visit http://www.sonus.com, where the short root is /content.

Resources

  • IBM Http Server Forum
  • Apache Http Server Project
  • Apache at WebÞing
  • Apache Tutor

About the author



Andrea Fontana currently works as a System Architect, defining, organizing, and configuring complex IBM product-based solutions. In particular, he works with WebSphere Portal and its collaborative environment including Domino 8.0.x, 8.5, IBM Connections 3.0.1, IBM Lotus Quickr 8.0.x, and IBM Sametime, setting up SSO Kerberos integration solutions and configuring systems with a r-proxy solution with SSL integration. His past experience includes roles as an Application Developer, Database Administrator, and Project Manager in a wide variety of business applications. He graduated from the ITIS Zuccante C., Mestre (Venice), specializing in Industrial Electronics. You can reach Andrea at a.fontana@net2action.com.