Part 1 introduces the Lotus Sametime deployment at IBM, describes the former DB2-based Sametime deployment and how it was migrated, and covers pre-production testing of the new configuration.
Sametime is the official instant messaging application used by the IBM Corporation. With approximately 400,000 employees, IBM is the largest user of Sametime in the world. Every day, IBMers exchange millions of messages with their colleagues and with external customers. The deployment serves over 480,000 users worldwide, with a daily peak of about 225,000 concurrent users. All of this is handled by three Domino clusters of three Sametime servers each, which together process approximately six million messages per day. The three clusters are located at one site in the United States.
Since its inception, the IBM deployment had used a customized DB2-based version of the Sametime application, which had been marketed as an offering for Sametime. Because IBM was the only customer to purchase and use this version of the product, Lotus Development decided to stop active development on DB2 Sametime 7.0 and focus on the single Domino-based Sametime code stream that external customers were using. IBM's CIO, Lotus, and IT Delivery decided to upgrade to a Domino-based server structure to resolve the internal instability issues, to make use of new Sametime features such as policies, and to provide a reference for external sales opportunities. In April 2007, the CIO, Lotus, and IT Delivery attempted to migrate the DB2-based instant messaging environment to a Domino environment. The 2007 migration was unsuccessful for a combination of reasons, including:
- underpowered hardware and operating system
- very large Contacts List Database, which at 10 GB was too large to operate efficiently on Domino
- code issues found during testing
All these issues were addressed in the new 2008 infrastructure, ensuring a successful upgrade.
Beginning with the lessons learned from the project, the team conducted a thorough evaluation of the proposed migration, including architecture reviews and testing schedules. The remainder of 2007 and early 2008 was spent calibrating and revising both of these areas, as well as formulating a new plan for conducting another migration later in 2008.
The official project kicked off in 2Q 2008 and involved the CIO, Lotus (the Development, Test, and Support teams), and IT Delivery in the task plan. IT Delivery required that the configuration be available 24x7 for end-user access, and helped create an architecture that provides redundancy should a hardware error occur on an individual machine. Additional considerations were raised regarding network connectivity and storage components, as well as an upgrade of the servers themselves. These factors were incorporated into the plan, and the teams worked to accomplish the tasks associated with those requirements.
The planning and testing for the 2008 Sametime migration began in April 2008 with a meeting of the different Lotus development, support, and test groups involved in the architecture. The cross-functional team analyzed data about what went wrong in 2007 in order to propose an architecture for IBM. The Lotus Engineering Test (LET) and WPLC Systems Verification Test (SVT) teams worked closely with the Sametime development and test teams to come up with an architecture that would work efficiently in the IBM environment and would scale to meet the expected Sametime growth. Although the 2007 migration did not go as planned, it did give us valuable statistics about user and server load in the IBM Sametime environment that we did not have previously, since those statistics were not available in the IBM custom version. This information allowed us to better analyze and architect a new Sametime solution for IBM.
As the Sametime product (both client and server) has changed, so has the impact of the Sametime client on the server. After the 2007 migration attempt, Sametime Development and Sametime SVT reviewed the Sametime stress-tester tool that had been in use for several years. This led to the development of a new Java-based internal Sametime stress-tester tool. With this new tool, we were able to create load scripts to mimic the load seen during the 2007 migration. Fortunately, we still had a few of the original 2007 servers that had been in the Sametime failover environment to use for testing. By running the new scripts against the original servers, we were able to reproduce behavior similar to what we saw in production during the 2007 deployment. We spent a few months working with both Sametime and Domino development to find, test, analyze, and retest problems seen under load. This testing led to various fixes that were incorporated into the Sametime 8.0.2 code, as well as hotfixes created for the 8.0.1 code.
Once the new production servers were available, we switched over to testing those servers. With the new servers, we tested for 40,000 to 60,000 simultaneous stress-tester users per server, using a copy of IBM's LDAP server and a copy of the Contacts List Database, vpuserinfo.nsf. Even though the projected load for the servers was around 250,000 simultaneous users, we tried to test between 360,000 and 450,000 in order to account for the projected growth from using Sametime awareness within various products, and to ensure that in the event of an outage, the remaining servers could handle the increased load. The testing allowed us to assess the impact of various hotfixes and configuration changes on the environment.
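As a rough check on these figures, the per-server arithmetic can be sketched as follows. The server count of nine comes from the three clusters of three servers described earlier, and the 60,000 ceiling is the upper end of the per-server range tested. The even spread of users across servers, the outage scenarios, and the function names are assumptions made for illustration only; they do not describe how Sametime actually distributes users.

```python
# Illustrative capacity arithmetic. Figures marked "from the text" are from
# this article; the even-distribution model is an assumption.

SERVERS_TOTAL = 9                # three clusters of three servers (from the text)
SERVERS_PER_CLUSTER = 3          # from the text
PROJECTED_USERS = 250_000        # projected simultaneous users (from the text)
TESTED_CEILING = 60_000          # highest per-server load tested (from the text)


def per_server_load(total_users: int, servers_online: int) -> float:
    """Average users per server, assuming users spread evenly (assumption)."""
    if servers_online <= 0:
        raise ValueError("at least one server must be online")
    return total_users / servers_online


def fits_tested_ceiling(total_users: int, servers_online: int) -> bool:
    """True if the average per-server load stays at or below the tested ceiling."""
    return per_server_load(total_users, servers_online) <= TESTED_CEILING


if __name__ == "__main__":
    # Hypothetical outage scenarios: zero, one, or two whole clusters offline.
    for clusters_down in range(3):
        online = SERVERS_TOTAL - clusters_down * SERVERS_PER_CLUSTER
        load = per_server_load(PROJECTED_USERS, online)
        ok = fits_tested_ceiling(PROJECTED_USERS, online)
        print(f"{online} servers online: {load:,.0f} users/server, "
              f"within tested ceiling: {ok}")
```

Under these assumptions, 360,000 to 450,000 users across nine servers works out to 40,000 to 50,000 users per server, and the projected 250,000 users on six servers (one whole cluster offline, a hypothetical scenario) comes to roughly 41,667 per server, which is inside the tested per-server range.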