Introduction to the Duwamish Online Sample Application
Pedro Silva and Michael D. Edwards
Microsoft Developer Network

July 2000

Summary: This article provides an overview of the history of the Duwamish sample application and discusses the process of turning it into a real, live e-commerce startup. (14 printed pages)

View the Duwamish5.exe sample code in the MSDN Online Code Center.

Download the Duwamish5.exe sample file (7,800 KB).

Contents

Introduction
Overview
www.DuwamishOnline.com
Duwamish Online Goals
Duwamish Online Application Upgrades
Duwamish Online Deployment
Application Architecture
Layered Architecture
Network Architecture
The Internet Zone
The Server Farm Zone
The Server Hardware and Software
Conclusion

Introduction

Ask anybody who has experienced the launch of a Web application what went wrong, and you'll get an earful. The problems are widespread, and by no means purely technical in nature. We should know: through our work on the MSDN® Duwamish Online project (documented at http://msdn.microsoft.com/voices/sampleapp.asp), we've been immersed in the task of designing, implementing, deploying, and operating a worldwide Internet e-commerce startup for the past two years.

Now hold on. Before you laugh yourself to tears at the thought of a real Internet startup taking two years to deploy, stop to consider how the schedule would be impacted by a primary objective of teaching the rest of the world how to reproduce our success! For the Duwamish team, this educational objective made identifying and solving all the problems associated with launching a Web application "only" half the work. The rest of our time was heavily invested in creating and updating extremely detailed lab notebooks and procedures, hundreds of pages of application software and network specifications, and a host of reports and analysis documents. In other words, successfully launching http://DuwamishOnline.com, but failing to teach you how to do so yourselves, was not an option.
So, what is Duwamish Online, you ask? Read on, and we'll explain our project objectives, go over technical details of the Duwamish Online software and network architecture, and give you a taste of the deployment preparations required to launch your own Microsoft® Windows® DNA 2000 application, a topic we will be writing about all summer on MSDN Online.

Overview

Last August MSDN released Phase 4 of the Duwamish Books sample application, an ambitious project demonstrating a business and architectural migration to the Web. Phase 1 included a set of monolithic desktop applications for operating a single retail bookstore. Phase 2 migrated all the data access code into a shared COM component to support a growing business, now with multiple stores, and a client-server architecture. As the fictional Duwamish bookstore business continued to expand into other cities and states, Phase 3 demonstrated migrating to a logical three-tier architecture, with a business logic layer that supports different business rules per store. Phase 3.5 integrated Microsoft Transaction Server (MTS) to manage the components in a physical three-tier architecture and to control transactions. And, finally, Phase 4 migrated the application workflow code into a shared COM component, and the presentation logic into Active Server Pages (ASP). With that, Duwamish Books was a fully Web-based Windows DNA application.

www.DuwamishOnline.com

Well before we released Duwamish Books, Phase 4, we knew there would be a Phase 5. That's because through Phase 4 we had focused on demonstrating the software architecture of a Windows DNA application. That's only a third of the problem; another third is actually deploying a new Web application, and the final third is operating it. So, immediately upon releasing the Phase 4 milestone, we set out on a mission to deploy Duwamish live on the Internet.

Duwamish Online Goals

Our primary objective for Duwamish Online is to teach you how to successfully launch your own Web application.
From A to Z, we will describe every step in detail. (The key to Web application success is in the details.) Given that the Duwamish team had very little experience deploying and operating Web applications when we finished Phase 4, we decided to actually launch Duwamish ourselves, and operate it for a substantial period of time. Because we anticipated a summer 2000 launch, following the Windows 2000 release, we decided on a secondary objective of demonstrating how (and why) to upgrade the Duwamish application architecture to COM+ Services; essentially, how to migrate Windows DNA to Windows DNA 2000.

Duwamish Online Application Upgrades

A new Duwamish phase would be incomplete unless our architectural migration was stimulated by fundamental business change, and Duwamish Online is no exception. So, flush with IPO cash and the certainty that we must grow or die, Duwamish Online expanded its product catalog by acquiring a vendor of official-logo casual and sports wear. Additional application upgrades were driven by our belief that more complete application presentation and workflow were required to provide generally applicable scalability and performance metrics. High availability and reliability requirements led to further architectural (both software and hardware) enhancements that were not so much driven by Duwamish Online business changes as by the reality of successfully operating a Web application (as opposed to "just" building it, as we did in Phase 4).

Database layer

Changes in the database layer were driven by two factors:
• Our new business requirement to sell apparel and gear, in addition to books (and the assumption that future acquisitions would further expand the Duwamish Online product catalog).
• The substantive expansion of the Duwamish application workflow.

These changes led to modifications in the Duwamish database schema and a substantial increase in the number of tables, fields, and stored procedures.
We took this opportunity to remove legacy database objects that were no longer relevant to the application. We also migrated the database to Microsoft SQL Server™ 2000 in order to take advantage of new features such as full-text search in a clustered environment.

Middle tier and presentation

In the middle tier we have redesigned and implemented the order pipeline and added workflow and business logic to support our improved presentation features. In order to provide higher scalability, availability, and reliability, we introduced a COM+ Queued Component to handle interoperation with third-party partners. This allowed us to execute orders much more quickly (providing significant scalability gains) and without depending on our partners' real-time server performance and uptime (thus making the availability and reliability of our order workflow something we could completely control on our own domain).

If you downloaded and installed Phase 4, you'll recognize the huge strides we made in the presentation layer for Duwamish Online. Not only did we dramatically increase the complexity of each page, we added important features such as account history. We increased application complexity to provide more credible performance and scalability numbers. (For example, the Duwamish Online home page is a dynamic page that requires 10 times as much processing to deliver as the static Phase 4 home page.) But because Duwamish Online is live on the Internet, we also needed a more complete application to keep you engaged and interested.

Third-party interoperation

We implemented full interoperation with a credit card authorization vendor and a fulfillment vendor. This was important to do from both a complexity and a completeness perspective. There were a number of interesting problems we had to solve here, from pulling messages off the Queued Component server to auto-generating e-mail confirmations.
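The benefit of queuing partner interoperation can be illustrated with a small sketch. It is not the Duwamish code: an in-process Python `queue.Queue` stands in for the MSMQ-backed COM+ Queued Component (and, unlike MSMQ, is not durable), and `authorize_payment`, `place_order`, and `run_demo` are hypothetical names introduced only for illustration. The point it shows is that the request path only enqueues the order, so it returns at once, while orders pile up harmlessly until a worker is available.

```python
import queue
import threading

order_queue = queue.Queue()  # stand-in for the MSMQ-backed queue (assumption)
processed = []               # results recorded by the background worker


def authorize_payment(order):
    """Hypothetical stand-in for the slow call to the payment vendor."""
    return {"order_id": order["order_id"], "status": "authorized"}


def place_order(order):
    """Runs on the request path: enqueue the order and return immediately."""
    order_queue.put(order)
    return "accepted"


def worker():
    """Runs off the request path and drains the queue in arrival order."""
    while True:
        order = order_queue.get()
        if order is None:  # sentinel value: stop the worker
            order_queue.task_done()
            break
        processed.append(authorize_payment(order))
        order_queue.task_done()


def run_demo():
    # No worker is running yet (think: partner offline), so orders accumulate.
    statuses = [place_order({"order_id": i}) for i in range(3)]
    backlog = order_queue.qsize()

    # Once the worker starts, it catches up on the backlog.
    thread = threading.Thread(target=worker)
    thread.start()
    order_queue.put(None)
    order_queue.join()
    thread.join()
    return statuses, backlog, list(processed)
```

Calling `run_demo()` once returns the statuses handed back to the callers, the backlog that built up before the worker started, and the orders the worker eventually processed. A production system would need a durable, transactional queue, which is what MSMQ provides for Queued Components.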
Application setup and build

Because we planned on installing Duwamish Online a multitude of times over its lifetime, we needed a very robust and maintainable setup application. So, we decided to throw out the Duwamish Books, Phase 4 setup (an unwieldy piece of code with origins going back to the Microsoft Visual Basic® Setup Kit, which we started using in Phase 3.5 to automate MTS component management) and start over. We ended up with the world's first completely automated, Windows 2000 logo-certified, Web application setup utility. (At least it's the first one that anybody is giving away for free!)

The previous phases of Duwamish utilized an ad hoc build procedure (a fancy way of saying there was no formal build procedure). Because we wanted to enable our customers to reproduce our testing results, we needed a very reliable and easy-to-modify build utility. It was very important that our customers be able to build and test the same bits that we built and tested. So, we created a new application build facility, which we include with the Duwamish Online download. One of our best engineers spent several weeks on these two enhancements, and we are very proud of this work.

Duwamish Online Deployment

Half of the resources expended on Duwamish Online were devoted to determining and applying the 1,001 procedures necessary to deploy a Web application:
• From building and testing various network configurations, to making final staging and production server purchase decisions.
• From writing database backup utilities and practicing procedures to restore the database from backups, to testing database failover.
• From purchasing a domain name, to "hardening" the production server farm against hacker attacks.
• From researching use scenarios and building load-test scripts, to isolating lock contention in scale-up testing.

These were just a few of the A-to-Z details that took us almost a year to accomplish and will take us all summer to tell you about on MSDN Online.
Application Architecture

Layered Architecture

Duwamish Online extends the n-tier design of earlier phases of the sample application, including the presentation layer, workflow layer, business logic layer, data access layer, and the data source. Although earlier phases implemented multiple presentation types, for the release of Duwamish Online we concentrated on HTML 3.2 and CSS 1.0 clients so we could support the largest browser audience possible. Because these clients cannot be relied on to process XML themselves, all of the XML and XSL transformations must be done on the Web server. However, the ability to support multiple presentation types is still part of the design and can easily be extended to take advantage of new browser functionality.

Figure 1. The application layers of the Duwamish Online sample

Data is stored in a relational database, accessed and manipulated with components running under COM+ Services. It is converted into XML format between the middle-tier COM+ components and the presentation layer. The presentation layer formats the pages and transforms XML data into HTML 3.2, which is then returned by Internet Information Services (IIS).

Table 1. Duwamish Online Layered Application Architecture

Logical n-tier
• Presentation tier: HTML 3.2
• Workflow tier: work spanning or incorporating multiple autonomous business transactions
• Business logic tier: boundary for autonomous business transactions
• Data access tier: handles disconnected data access

Database tier
• SQL Server 2000 database

Factoring an application into these logical layers allows you to write modular, reusable, and maintainable code more easily than you could in a monolithic Web application. Thinking about the application in these terms instills the discipline to design and implement features and entire applications with modularity, reuse, and maintainability in mind. Also, this logical factoring of layers doesn't necessarily need to correspond to layers of COM+ components, although this is how the Duwamish application is divided.
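The data flow described above (database rows, converted to XML by the middle tier, then transformed to HTML on the server) can be sketched in a few lines. This is only an illustration under stated assumptions, not the Duwamish implementation: the in-memory `CATALOG`, the function names, and the XML element names are all hypothetical, the workflow and business logic tiers are collapsed into one function, and a hand-written Python transform stands in for the XSL stylesheet the real presentation layer uses.

```python
import xml.etree.ElementTree as ET
from html import escape

# Data access tier: hypothetical in-memory rows standing in for SQL Server.
CATALOG = [
    {"id": 1, "title": "Learning COM+", "price": "29.99"},
    {"id": 2, "title": "T-Shirt <Logo>", "price": "12.50"},
]


def get_items():
    """Data access: return disconnected copies of the catalog rows."""
    return [dict(row) for row in CATALOG]


def items_as_xml():
    """Middle tier: hand the presentation layer XML, not raw rows."""
    root = ET.Element("items")
    for row in get_items():
        item = ET.SubElement(root, "item", id=str(row["id"]))
        ET.SubElement(item, "title").text = row["title"]
        ET.SubElement(item, "price").text = row["price"]
    return ET.tostring(root, encoding="unicode")


def render_html(xml_text):
    """Presentation tier: server-side XML-to-HTML transform (XSL stand-in)."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.findall("item"):
        title = escape(item.findtext("title"))
        price = escape(item.findtext("price"))
        rows.append(f"<tr><td>{title}</td><td>{price}</td></tr>")
    return "<table>" + "".join(rows) + "</table>"
```

`render_html(items_as_xml())` produces a plain HTML table that a down-level browser can display. Because each layer only sees the output of the layer beneath it, supporting another presentation type would mean adding another transform over the same XML, leaving the lower layers untouched.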
With new scripting functionality like Visual Basic Scripting Edition (VBScript) classes, code can be easily encapsulated in classes and all of the layers can be written in script. Although this might help simplify your application development, you would not be able to take advantage of other features like COM+ security, Queued Components, and more.

Although much in the middle-tier components has changed to accommodate new functionality, including a diverse item catalog and order history, the principles behind the workflow, business logic, and data access layers have remained similar to those in Phase 4. Therefore, let's focus on some of the new components (the queued workflow and fulfillment components) and see how they fit into the overall system.

Queued workflow component

For a Web site handling millions of transactions per day, with usage peaks of thousands per second, delaying costly operations can vastly improve response time. Queued operations free up IIS threads, so they can respond to more requests instead of waiting for costly synchronous operations to complete. In addition, queued operations can make the site more reliable. If the queued portions of the site go offline, or if there are a huge number of transactions that need to be processed, messages accumulate in the queue until the system is back online or there is a lull in traffic that allows the system to catch up.

The COM+ Queued Components feature makes it easy to implement and configure objects to run on Microsoft's queuing technologies (in our case, MSMQ). The difficult part is deciding which parts of your site don't require immediate user feedback and can be queued instead. Database operations are somewhat expensive, but credit card payment authorization is very expensive. As you know from swiping your credit card at the gas pump or department store, it can take on the order of a few seconds.
When all of a Web server's worker threads (IIS uses a pool of 25 worker threads) are busy servicing payment authorization requests, your site's response time goes through the roof.

Duwamish Online makes extensive use of COM+ Queued Component functionality for the order pipeline. When a customer clicks Buy, the presentation layer passes the XML-encoded order information to a local workflow component. The local component invokes a remote queued workflow component and calls its ProcessOrder method. All order processing is performed by the remotely hosted queued workflow component and is entirely out-of-band with IIS. Order processing includes inserting the sale and payment information into the database, authorizing the credit card purchase, preparing order data for fulfillment, sending e-mail notification, and billing the credit card for the purchase after the order has been fulfilled.

Fulfillment subsystem

We decided to go with a third-party fulfillment company, Interact Inc. This company will be responsible for keeping the inventory on hand in their warehouse, boxing it, and shipping it to our customers. We immediately found that our database and message formats were incompatible with Interact's formats (ah, the joys of business-to-business integration). We could communicate with our fulfillment provider only by using File Transfer Protocol (FTP) to send messages to an address on their site. With the widespread adoption of XML messaging and server applications, such as Microsoft BizTalk™ Server, integration of these types of external services should soon become easier.

The fulfillment system runs as several scheduled operations using the Microsoft Windows NT® Task Scheduler service. These scheduled events make calls into the fulfillment workflow component to send the purchase order to Interact and update order status and inventory from our provider.
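The hand-off to the fulfillment provider described above (convert the order into the provider's format, then push it over FTP) might look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the article does not describe Interact's actual file layout, so the comma-delimited lines, the `po_<id>.csv` file name, and the function names are hypothetical. Only the standard-library `ftplib.FTP` calls (`login`, `storbinary`) are real.

```python
import csv
import io
from ftplib import FTP


def to_provider_format(order):
    """Flatten an order into delimited lines (hypothetical layout)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for line in order["lines"]:
        # One output row per order line: order id, SKU, quantity.
        writer.writerow([order["order_id"], line["sku"], line["quantity"]])
    return buf.getvalue()


def send_purchase_order(order, host, user, password):
    """Upload the converted order to the provider's FTP site."""
    payload = to_provider_format(order).encode("utf-8")
    with FTP(host) as ftp:  # connects to the FTP control port (21 by default)
        ftp.login(user, password)
        # The remote file name is an assumption, not the provider's convention.
        ftp.storbinary(f"STOR po_{order['order_id']}.csv", io.BytesIO(payload))
```

Keeping the format conversion in its own pure function means it can be checked without a network connection, and a scheduled job would only need to call `send_purchase_order` for each order that is ready to ship.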
The fulfillment workflow is integrated with the other Duwamish components and uses the business logic and data access layers to perform database operations in the order tables, as well as in its own database used specifically for synchronization with Interact.

Send Purchase Order

Send Purchase Order uses the Duwamish Books business logic and workflow layers to identify order records that are ready to be fulfilled. Once Send Purchase Order determines that an order is ready to be fulfilled, it transforms the order data from the Duwamish Online format into a format that is compatible with the external system. The data is then transmitted via FTP to the external system.

Update Inventory and Update Order Status

Update Inventory and Update Order Status transform inventory status and order status in files downloaded from the fulfillment server. Then, they update the order status and inventory information in the Duwamish Online system through the workflow and business logic layers. Using these components to make these changes preserves the business rules we have set up to govern changes to the inventory and its status. The update process also calls into the queued workflow component to do the final credit card billing.

Network Architecture

One of the greatest distinctions between Duwamish Online and the previous phases of Duwamish is the extensive work we did to design the network architecture that our application runs on. Oftentimes the focus of product development is on the software architecture and features. However, equally important in a Web application is the network architecture. Many well-designed applications can fail miserably on the Internet if they are not deployed and operated correctly.

Although Duwamish Online is designed as a logical n-tier application, it is deployed on two physical tiers.
After running a broad set of configuration tests on Duwamish, we discovered that the physical two-tier approach performed best because it minimized cross-machine communication, which is a huge performance killer. In this configuration, the Web servers run all of the ASP pages for the Web site and all of our COM+ components, and the second tier runs the database server. Load is balanced between the Web servers using Network Load Balancing (NLB). The tests we ran with components running on their own middle tier of machines all had lower throughput and higher response time.

Table 2. Duwamish Online Two-Tier Architecture

Physical two-tier

Web tier (NLB cluster)
• VBScript in ASP
• All HTML generated using XML/XSL transforms
• Visual C++® ATL Cache component in ASP application space caches infrequently changing HTML/XML
• Visual Basic COM+ workflow, business logic, data access components (COM+ library)

Database tier (Windows Cluster Service)
• SQL Server 2000 using stored procedures

Our production network configuration can be divided into two major network zones: the Internet and the server farm.

The Internet Zone

The Internet Zone represents the network traffic external to our router and firewall. We are connected to the Internet through a 1.5 Mbps connection provided by our Internet Service Provider (ISP), the Information Technology Group at Microsoft. We expect network traffic from the Internet from three types of sources: customers, service providers, and remote monitoring clients.

Customers

Customers are typical Internet users, and will access our site through a variety of Web browsers. They will be allowed access to the site only through HTTP at port 80. They will browse our catalog, select items to buy, and make purchases.

Service providers

We will also communicate, through the Internet, with the servers of our vendors that provide payment and fulfillment processing services.
All communication with the service providers will be confined to the Queued Component (QC) server. Table 3 shows a list of the required communication methods on that server.

Table 3. Required Communication Methods on QC Server

Service provider    Type of service       Required network protocol
CyberSource         Payment processing    Custom protocol, port 80
Interact            Order fulfillment     FTP, port 21

Remote monitoring clients

The most important consideration is that a Web site be accessible to its customers. It is essential for us to have a way to check the site's availability from outside our actual server network. Some network problems, such as losing an ISP connection, cannot be monitored from within the private network. We will be setting up at least one client machine outside our firewall to remotely monitor the health of our service. This client will be pinging key services of our site at regular intervals, and will alert Operations personnel of failing services. Ideally, we would deploy multiple client machines through different ISP connections. However, for our initial deployment, we will simply set up one client machine through an external connection.

The Server Farm Zone

Our server farm's only external dedicated network connections are the direct Internet tap from our ISP and the secured dial-up connectivity to the Administration Server for operation and network management.

Figure 2. Network diagram of Duwamish Online production farm

Our production server farm consists of three network segments. These network segments are divided by a separate network interface card (NIC) in the machines for each segment.

Front-end network

This is the public network segment that is accessible from the Internet. All servers in this segment are connected to a 100-Mbps LAN switch. The front-end network consists of connections of the following servers and services:
• Four Web servers, configured as one NLB cluster.
• One Queued Component (QC) server, with SMTP server enabled.
• One Primary Domain Controller (PDC)/Domain Name System (DNS) server. (We also have the QC server configured as a backup domain controller in case the primary server goes down.)
• One 1.5-Mbps Internet connection from our ISP, with IP Filtering Firewall enabled at the router level.

Back-end network

The back-end network is the internal private network segment that allows secured communications between the front-end servers and the back-end database servers. This network is not directly accessible from the Internet, so nothing can connect to our database servers except through the front-end servers. All servers in this segment are connected to a 100-Mbps LAN switch. The back-end network consists of connections of the following servers:
• Four Web servers, configured as one NLB cluster.
• One Queued Component (QC) server, with SMTP server enabled.
• One Primary Domain Controller (PDC)/Domain Name System (DNS) server. (We also have the QC server configured as a backup domain controller in case the primary server goes down.)
• Two database servers, configured with active/passive server clustering. (The two database servers are both connected to an external RAID 5 storage system.)

Management network

The management network is another internal private network segment dedicated to the operation and management of the individual servers in the production farm. It consists of all the servers in the back-end network as well as an administration server. All servers are connected to a 100-Mbps LAN switch. The administration server will:
• Provide Terminal Client access to all servers.
• Monitor the health of all servers.
• Work as a remote access server (RAS) for remote access to the farm.
• Serve as a backup server.

Server Hardware and Software

Along with the network setup, it is vital to document the server machine hardware and software specifications so that everyone on the team knows what machines are in the farm and what software is installed on each server.
At this time, the software list for our server farm is relatively short, because we're using the new Windows 2000 release. However, as service packs and software updates are released, it becomes even more important to keep track of what is on each server. Table 4 describes the hardware and software specifications for the Duwamish Online production server farm.

Table 4. Duwamish Online Server Hardware and Software Specifications

Server/device type: Web server
# of devices: 4
Hardware spec: Dell PowerEdge 2300; dual processors, 2 x 500 MHz; 512 MB RAM, 9 GB HD; 3 x 100 Mbps NIC
Software spec: Windows 2000 Advanced Server; Network Load Balancing; Microsoft Message Queuing

Server/device type: Database server
# of devices: 2
Hardware spec: Dell PowerEdge 2300; dual processors, 2 x 500 MHz; 512 MB RAM, 9 GB HD; 3 x 100 Mbps NIC
Software spec: Windows 2000 Advanced Server; SQL Server 2000

Server/device type: Queue Component server
# of devices: 1
Hardware spec: Dell PowerEdge 2300; dual processors, 2 x 500 MHz; 512 MB RAM, 9 GB HD; 3 x 100 Mbps NIC
Software spec: Windows 2000 Advanced Server; Microsoft Message Queuing; Microsoft Cluster Services; SMTP; Active Directory™ Services

Server/device type: PDC/DNS server
# of devices: 1
Hardware spec: Dell Precision 610; single processor, 550 MHz; 256 MB RAM, 9 GB HD; 3 x 100 Mbps NIC
Software spec: Windows 2000 Advanced Server; Active Directory™ Services

Server/device type: Administration server
# of devices: 1
Hardware spec: Dell Precision 610; single processor, 550 MHz; 256 MB RAM, 9 GB HD; 2 x 100 Mbps NIC; 56K modem; 20G backup tape drive
Software spec: Windows 2000 Advanced Server; SiteScope/Microsoft Cluster Sentinel

Server/device type: Remote Monitoring Client
# of devices: 1
Hardware spec: Dell Precision 610; single processor, 550 MHz; 256 MB RAM, 9 GB HD; 1 x 100 Mbps NIC
Software spec: Windows 2000 Professional

Server/device type: 100-Mbps LAN switch
# of devices: 3
Hardware spec: Allied Telesyn CentreCOM FS708 100 Mbps Ethernet Switch
Software spec: N/A

Conclusion

Duwamish Online has been an exciting adventure into some of the challenges and issues you would face as a developer or operations manager deploying a new site to the Internet. Although this latest release goes a long way toward completing our Web store, with payment and fulfillment processing, there are still areas that were left undone, because it's not a real business.
Along with operating the site, the Duwamish team will spend the summer releasing the sample code for the site, running further performance tests, and publishing articles that describe and explain how to leverage the work we've done on Duwamish in your own applications. Visit our column on MSDN Online (http://msdn.microsoft.com/voices/sampleapp.asp) throughout the summer to download our source code and for more in-depth articles about Duwamish Online.