LEARN HOW TO REDUCE YOUR VERIFICATION RISK AT CDNLive!
INCISIVE VERIFICATION ARTICLE AUGUST 2005
LEARN HOW TO REDUCE YOUR VERIFICATION RISK AT CDNLive!
JOHN WILLOUGHBY, CADENCE DESIGN SYSTEMS

Electronic design is a process that is full of risks. Does my design work? Did I include all of the specified functionality? Am I ready to go to tapeout? Does my software work? Did I meet my performance goals? Did I miss any bugs that might result in product recalls? This is why we perform verification; verification is the “insurance policy” of the design world. Verification is all about eliminating the risks in your design process, ensuring that your product will meet specifications, ship without bugs, and ship on time.

However, the complexities of verification are driving project teams to consider new technologies and methodologies, changing their current (working) verification environment and adding more risk. We are trying to reduce one kind of risk by adding another! This is a fundamental paradox facing every project team today. How can you avoid adding risk while trying to reduce it? New languages, new methodologies, and new technologies can each introduce significant risk to your projects. However, if they are combined properly, and the right solution is chosen for the right application and for the right engineering specialist or team of specialists, then project risk can be reduced significantly.

Starting on September 12, a new event is coming to Silicon Valley that will help you learn how to reduce the risk of enhancing your verification environment, and ultimately reduce the total risk to your project plans. CDNLive! Silicon Valley 2005 kicks off a new, global series of technical conferences for Cadence users. This premier technical conference is managed by a steering committee of Cadence users.

At CDNLive!, you will have the opportunity to network with other design and verification engineers who are addressing and overcoming verification challenges such as verification process management, functional coverage, assertion-based verification, mixed-language design environments, and hardware/software verification. You’ll also have the opportunity for face-to-face time with Cadence R&D engineers and other technologists.

At CDNLive!, you’ll hear four luminary keynotes, including Dr. Bernard Meyerson (IBM) and Mark Edelstone (Morgan Stanley). You will be able to visit with 50 exhibitors at the Designer Expo. And you will see live demos of the latest Cadence technologies and meet their developers at Technology Demo Night.

You will be able to choose from over 94 technical paper/panel presentations from your peers and Cadence technologists, including:

• Methodology for integrated use of Cadence tools for simulation, coverage, and analysis (QLogic)
• Verification for a GHz DSP — Experience designing an ERM-compliant environment (TI)
• Formal analysis using Incisive Formal Verifier for PCI Express validation (ATI)
• In-circuit emulation techniques for pre-silicon verification (Freescale)
• Dynamic assertion-based verification using PSL and OVL (Philips)
• Effective modeling techniques for formal verification of interrupt controllers (TI)
• Achieving vertical reuse in SoC verification (Cisco)
• Bringing up the system-level e verification environment for a multimillion-gate FPGA in seven days! (Cisco)
• Application of dynamic random sequencing in augmenting constrained randomization (QLogic)
• End-to-end data-flow monitoring with linked hierarchical transaction modeling for system verification (QLogic)
• Benefits and methodology impact of adopting formal property checking in an industrial project (ST)
• Verification methodology for OCP-based systems (YOGITECH SpA)
• The use of a general control and communications module in a SystemC testbench (LSI)
• Architectural analysis of an eDRAM arbitration scheme using transaction-level modeling in SystemC (Silicon and Software Systems)

And finally, there will be a special series of hands-on tutorials to help you learn about the latest innovations and capabilities. These will be taught by Cadence technologists and will include three sessions focused specifically on verification:

• Complete assertion-based verification (ABV) with the Incisive® platform
• Deploying SystemVerilog with the Incisive platform
• Advanced verification from plan to closure

Register now for this exciting event, to be held at the Westin in Santa Clara on September 12–15, at http://www.cadence.com/cdnlive/na

INCISIVE VERIFICATION ARTICLE AUGUST 2005
HOW ARE YOU PLANNING TO VERIFY ALL THAT DFT?
STYLIANOS DIAMANTIDIS, GLOBETECH SOLUTIONS

ABSTRACT
As gate counts continue to swell at a rapid pace, modern systems-on-chip (SoCs) are integrating ever more design-for-testability (DFT) capability. Test and diagnosis of complex integrated circuits (ICs) will soon become the next bottleneck, if they have not already. With up to 30% of a project’s cycle already spent debugging silicon, and typically 30–50% of total project costs spent on test, DFT is quickly becoming the next wild card. As daunting as reining in all the variables related to DFT infrastructure can seem, an enormous opportunity awaits those ready to take up the challenge.

AND A CHALLENGE IT IS...
Today, DFT usually is nothing more than a collection of ad-hoc hardware put together by different people, using different tools, with neither a common strategy nor a vision of an end quality of result. The inability to deliver a reliable test infrastructure inevitably leads to missed market opportunity, increased manufacturing costs, or even a product that is not manufacturable. In contrast, a carefully designed and verified DFT scheme, reflecting coherent test intent across the board, can be an excellent value differentiator throughout the lifetime of a product. This brief article discusses how to plan DFT verification against test intent, ensure compliance with standards and functional correctness, and create a complete, methodical, and fully automated path from specification to closure.

PLANNING FOR SYSTEM-WIDE DFT VERIFICATION
The foundation for systematic DFT verification is a well-defined set of goals, supported by a methodology developed to provide integration-oriented test methods for chip-level DFT, to enable compatibility across different embedded cores, and to incorporate high levels of reuse. A DFT-verification plan must satisfy three separate objectives:

INTENT/SPECIFICATION
Does the test infrastructure adhere to the test strategy and specification set forth by the design and test engineers? You must verify that the global test intent is designed and implemented properly.
COMPLIANCE
Does the test infrastructure comply with industry standards for interoperability and universal facilitation of access? This is crucial to ensure reuse of hardware and software.

FUNCTIONALITY
Are there functional-design issues with the DFT resources? Although such resources may appear to operate within the parameters of the first two points, there could be logic bugs in the implementation.

Once the objectives are defined, the development of a complete system-level DFT-verification plan should follow these general steps:

1. Capture system-level test intent and translate it into an executable test plan based on heterogeneous core-level DFT schemes from different vendors.
2. Integrate heterogeneous core-level DFT plans into the system-level DFT plan.
3. Provide a completely automated path from specification to closure.
4. Provide quality-of-result metrics.
5. Provide total-progress metrics.
6. Integrate the DFT plan into the IC-level verification plan.

CAPTURING SYSTEM-LEVEL TEST INTENT
In order to build a successful DFT-verification plan, you first must capture system-level test intent. During this step, test engineers need to work closely with verification engineers to ensure that the plan includes all the aspects of test that must be available in the final product. The global test-access mechanism (TAM) is the primary element at this point; however, other elements can come into play, such as top-level DFT features, integration with board-level testability, or hardware/software interfacing and information sharing. Management must also be involved so that they gain an understanding of the implications and trade-offs of building reliable DFT. This will ensure total visibility and resolve contention for resources further down the road.

The preferable way to deliver the system-level test-intent description is in executable form. The global verification plan must leave no room for doubt or misinterpretation. Furthermore, it needs to provide a solid basis for automating the subsequent steps down to design closure.

PROVIDING A COMPLETELY AUTOMATED PATH FROM PLAN TO CLOSURE
Having a) captured the high-level test intent in an executable plan and b) integrated the separate core-level DFT schemes, verification and test engineers are now empowered to drive their processes more effectively. The result is a fully automated path from plan to closure for DFT verification, ensuring:

• Completeness — The verification plan includes a section on all DFT features and their specifics
• Intent — The verification scope has been defined early in the process by experts and with complete visibility
• Uniformity — Disparate test strategies can now be driven by a single process

During this stage, engineers should seek and incorporate various elements that will be used as building blocks to implement the verification strategy according to the plan.
Such elements can include:

• Verification IP, used to run verification tests on individual DFT features
• Test-information models, used to exchange information with other tools and processes for enhanced automation

INTEGRATING HETEROGENEOUS CORE-LEVEL DFT PLANS
Bridging the gap between the ad-hoc world of scattered DFT resources and planned, system-level DFT is not a trivial task. Individual intellectual property (IP) vendors’ strategies for testability can vary significantly in terms of quality, coverage, and/or support deliverables. In this phase of DFT-verification planning, it is important to work closely with vendors to align DFT strategies as closely as possible, and to enforce quality metrics. Optimally, vendors should work with their customers’ test engineers to design pluggable DFT schemes and plans. By capturing core-level test intent in an executable plan, and including it in the deliverables, IP vendors can provide new added value to their customers. Such executable plans can then flow into the IC system-level test plan (see Figure 1). This is key to unlocking the paradox of driving a uniform SoC-level test strategy from heterogeneous core-level DFT plans.

Finally, this methodology also needs to apply to internal engineering teams delivering design IP for integration. Such teams have different skills and management styles, and can operate in different geographies or business units. They, too, must understand the need to plan DFT verification and provide the necessary components to enable this methodology.

Figure 1: Hierarchy of executable DFT verification plans. The test specification feeds the system DfT verification plan, which integrates standards verification plans (e.g., JTAG, IEEE 1500), embedded-core verification plans, and DfT-features verification plans.

PROVIDING QUALITY-OF-RESULT METRICS
But how can you conclude that the DFT-verification plan guarantees the necessary quality for reliable DFT? In order to address this inherent unpredictability, you must set expectations for quantifying results as part of your plan. Quality-of-result metrics are measurable targets that can be entered as verification-plan attributes early in the planning phase. Such targets must result from collaboration among verification, design, and test engineers in order to ensure that all aspects of the task at hand are addressed. They can include functional-coverage metrics, such as the number of different instructions loaded into any given JTAG TAP (a brief sketch of such a metric appears below), seed patterns used in automatic test-pattern generators (ATPGs), or the isolation behavior of embedded-core scan cells. All these metrics should be associated neatly with a respective section of the executable test plan. It is also a good idea to assign priorities or weights to different test-plan sections based on these metrics. For example, what is the purpose of exhaustively testing a built-in self-test (BIST) controller connected to a JTAG TAP if the TAP is not thoroughly verified first?

PROVIDING TOTAL-PROGRESS METRICS
Quality-of-result metrics can be a guide to understanding and reporting progress, and can identify critical paths. Once a project is underway, it is difficult to track the progress of specific tasks and the implications of prioritization. Tracking quality-of-result progress provides a way of measuring real DFT-verification progress while simultaneously enabling total visibility across the different teams. This way, test engineers can know at all times how far the verification of DFT has progressed across the board, and can use this information to drive other processes, such as test-vector generation or early fault analysis. They can also use this information to raise management awareness of issues that may arise during the design process.
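As a purely illustrative example of one such quality-of-result metric, the number of different instructions loaded into a JTAG TAP, the following sketch expresses the metric as PSL cover directives embedded in Verilog comments, in the same style used in the PSL case study later in this newsletter. The signal names (update_ir, ir_reg, tck) and the instruction parameters are hypothetical and would need to match the actual TAP implementation; the directives themselves are an assumption, not something prescribed by this article.

// Hypothetical sketch: each cover directive records that a particular
// instruction was actually loaded into the TAP instruction register.
// BYPASS, IDCODE, and EXTEST are assumed to be parameters holding the
// TAP's instruction opcodes.
// psl cover_bypass_loaded : cover {update_ir && (ir_reg == BYPASS)} @(posedge tck);
// psl cover_idcode_loaded : cover {update_ir && (ir_reg == IDCODE)} @(posedge tck);
// psl cover_extest_loaded : cover {update_ir && (ir_reg == EXTEST)} @(posedge tck);

Counting how many of these directives have been hit during regression yields the “number of different instructions loaded” metric, which can then be rolled up into the corresponding section of the executable plan and weighted as described above.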
INTEGRATING THE DFT PLAN INTO THE IC-VERIFICATION PLAN
Finally, a methodical DFT-verification plan must be integrated into the system-level IC-verification plan. This way, DFT quality-of-result metrics can be factored into chip-level metrics for total-quality management toward closure. System-level planning should incorporate the processes and methodologies of effective DFT verification. This enhances the allocation of necessary resources and ensures that expert knowledge is available. Furthermore, a DFT-verification plan can help bridge the cultural gap that today divides test engineers from the rest of the design cycle, and can advocate cooperation between the two main bottlenecks of today’s SoC design: verification and test.

BENEFITS OF DFT-VERIFICATION PLANNING
There are a variety of motivating factors for planning and executing proper DFT verification. The investment made during the design cycle can be leveraged to reap a series of long-term benefits, including:

REAL DESIGN-FOR-TEST
Increased visibility into test intent across development teams results in better integration of the design and test engineering processes, skills, and cultures. Methodical plans to verify test infrastructures create a well-defined process for incorporating input from test engineers into the development cycle. Test engineers participate in creating the global test specification, helping to qualify vendors based on DFT-quality metrics, and/or prioritizing verification tasks against target results. This enhanced visibility also yields the reverse benefit of better communication and information feedback from manufacturing/test back to design, closing the design-for-manufacturability (DFM) loop.

BETTER, FASTER, CHEAPER TEST
As semiconductor processes move deeper into nanometer scales, the cost of fabrication and test is exploding. Fabrication facility costs at 65nm are expected to hit $4 billion. If the current test-capital-per-transistor ratio persists (it has been flat for 20 years), in several years the general cost of test will exceed the cost of fabrication. The associated low yield also increases the number of test cycles required to determine the quality of silicon. DFT-verification planning aims to provide a reliable path from test intent to quality of results. Adding quality and efficiency to test planning leads to better testing strategies aimed at locating real silicon faults while minimizing costly over-testing and/or excessive vector sets. Testing on advanced automated test equipment (ATE) at 90nm can exceed $0.10/second per unit; for a batch of 1 million units at 100% yield, that is $100,000 for every second of test time per unit. Improving the DFT-planning process can help companies make efficient use of this expensive tester time, or even help them switch to cheaper ATE resources by partitioning more test resources on-chip.

DIRECT EFFECTS ON TIME TO MARKET
In a demanding, consumer-driven electronics market, execution of a product strategy leaves no room for error. Re-spins simply are not an option when targeting 90nm (or below) process technologies. Lengthy silicon debugging, manufacturing-test time, low yield, and lack of diagnosability substantially impact the time-to-market window. Proper planning for DFT verification results in increased design- and test-schedule predictability and repeatability, better process automation, and enhanced efficiency, with a direct, positive effect on time to market.
Designing verification IP that can be invoked directly and automatically from the plan results in additional, significant time savings. Such IP can include complete environments capable of generating test vectors, checking DFT state, and measuring the extent to which the test infrastructure has been exercised. Such IP should be designed only once for standard components (e.g., JTAG), and then enriched with feature-specific libraries for customization. Time invested up front results in overall project-time savings by ensuring that DFT is designed and verified only once. Such savings compound from project to project through complete and calculated reuse.

BETTER VENDOR-QUALIFICATION METRICS
With third-party IP playing such an integral role in today’s SoCs, DFT-verification planning can be used to increase levels of process integration and automation with strategic vendors. Furthermore, DFT-quality metrics can be incorporated into new-vendor assessment by grading the vendor’s test strategy, DFT implementation, DFT reuse, and applicability targets. Qualification metrics offer advanced vendors incentives to provide complete, executable verification plans, IP, and test-information models for enhanced integration into their customers’ test infrastructures.

CONCLUSIONS
Large and complex test infrastructures are a reality in today’s dense SoCs, which comprise a multitude of diverse DFT resources. If companies are to meet their manufacturing-cost and time-to-market demands, they will need to ensure that such test infrastructures are well verified for specification, compliance, and functionality. At the foundation of the solution lies a detailed executable plan that can be used to provide an automated path from specification to closure with predictable quality of result. How are you planning to verify all that DFT?

INCISIVE VERIFICATION ARTICLE AUGUST 2005
GETTING HELP ON WARNING AND ERROR MESSAGES WITH THE INCISIVE UNIFIED SIMULATOR
MIKE BELANGER, CADENCE DESIGN SYSTEMS

The Cadence® Incisive® Unified Simulator, as well as the NC-Verilog® simulator and the NC-VHDL simulator, includes a utility called nchelp that you can use to get information on warning and error messages. This utility displays extended help for a specified warning or error. The syntax is as follows:

% nchelp tool_name message_code

where tool_name is the tool that generated the error, and message_code is the mnemonic for the warning or error. You can enter the message_code argument in lowercase or in uppercase.

For example, suppose that you are compiling your Verilog® source files with ncvlog, and the following error message is generated:

ncvlog: *F,NOWORK (test.v,2|5): no working library defined.

To get extended help on this message, use the following command:

% nchelp ncvlog NOWORK

or:

% nchelp ncvlog nowork

This command displays information about the message, the cause of the error, and ways to fix the error:

ncvlog/nowork =
A working library must be associated with every design unit parsed by ncvlog. No working library was defined for the specified file and position. Specify a working library using one of the following constructs and rerun ncvlog:
`worklib <lib>                        // compiler directive
-WORK <lib>                          // command-line option
WORK <lib>                           // hdl.var variable
LIB_MAP ( <path> => <lib> [, ...] )  // hdl.var mapping

Here is another example of using nchelp. In this case, nchelp is used to display extended help for the elaborator DLWTLK error.
% nchelp ncelab DLWTLK

ncelab/DLWTLK =
The library specified in the message requires that the calling process wait for a lock before the data can be accessed. Another process may be accessing the library at the same time. The process will wait for up to an hour attempting to get the lock, and then will terminate with another error if the lock could never be achieved. Some possible causes include:
– Another process may have locked the library. See the documentation of the -lockinfo option of the ncls utility for how to examine the lock.
– A suspended process may have left a lock on the library. Use the unlock option of the ncpack utility.
– Locking may currently be broken either on the network or between the two machines involved. Use the lockcheck option of the nchelp utility to check for these states.
– If you are using the +ncuid option on ncverilog to enable parallel simulation, ensure that each invocation of ncverilog uses a distinct <ncuid_name>.
– The library database may have been corrupted. In this case, delete and rebuild the library database.

To get help on the nchelp utility, use the -help option. This option displays a list of the nchelp command-line options.

% nchelp -help

The following command-line options are especially useful:

• -cdslib — Displays information about the contents of cds.lib files. This can help you identify errors and incorrect settings contained within your cds.lib files.
• -hdlvar — Displays information about the contents of hdl.var files. This can help you identify incorrect settings that may be contained within your hdl.var files.

In addition to providing extended help for messages generated by the compiler, elaborator, and simulator, nchelp can provide help for messages generated by other tools, such as NCLaunch or HAL, and for messages generated by various NC utilities such as ncupdate, ncls, ncprotect, and so on. To display a list of all tools for which extended help is available, use the -tools option:

% nchelp -tools

For complete information on the nchelp utility, see the section “nchelp” in the chapter called “Utilities” in the NC-Verilog Simulator Help or the NC-VHDL Simulator Help documents.

INCISIVE VERIFICATION ARTICLE AUGUST 2005
EVALUATING PSL: A USER CASE STUDY
BRAD SONKSEN, QLOGIC CORPORATION

Having had some experience with assertion libraries in the past, I was interested to see what Cadence had to offer when it started touting its integration of Property Specification Language (PSL) hooks and other proprietary library components in NC-Verilog® and the Incisive® verification library. In order to maintain tool independence and to minimize cost, I was especially interested in seeing how NC-Verilog handled standard PSL and how easy it would be to invoke these features with my current Verilog® design and testbench. When I was told that I simply had to invoke the “+assert” plusarg on the command line, it was not hard to convince me to just try it out — this sounded like a low-risk proposition for both the design and my time.

During early tests of the software, I hit some bumps as I either discovered or duplicated a couple of tool bugs. But these were resolved with bug fixes and did not require me to change my RTL or PSL code. I was then able to move easily from using a single simulation test case to running a full regression on my design.
The full regression produced a few error messages triggered by the simulator to indicate that one of the assertion specifications had been violated while exercising a certain corner case in the design. This error pointed to a problem that definitely would have resulted in a silicon revision. By looking at the assertion error message, I was able to identify the time the error occurred in simulation and to observe it with a waveform viewer. After working out a few tool bugs with the Cadence team at the factory and a couple of applications engineers, I did, in fact, see that using PSL inside my Verilog code with NC-Verilog was basically as advertised, and no more complicated than turning on the “+assert” plusarg. And there were other unexpected benefits as well — I found a bug in a design that was thought to have been verified fully.

For my specific test case, I implemented 40 PSL assertions to cover two main types of verification: 1) interactions at the major interfaces to my 400K-gate block, and 2) complicated and hard-to-verify internal areas of the code. Many of these assertions were new, but many were also translated from another vendor’s proprietary library into the PSL language.

The PSL code that found the error was very simple to write, but provided a very significant check:

// psl max_req_size: assert never
//   (Sop && !Write && (ByteCount > MaxRequestSize))
//   @(posedge clk);

The bug causing the error was related to a complicated transfer-size calculation that involved over 200 lines of code and was not correct in some corner cases. With many different requirements on the design, it was difficult initially to write the RTL code to cover all possible corner cases. Writing the assertion specifications in PSL and then running them through a dynamic regression provided an automated way for the simulator to flag the error. I did not have to write additional long tests to cover high-level functions; I just had to specify the properties of the small area of the logic of greatest concern to me and enable the assertions during these tests. This was much quicker than spending hours randomly poring over Verilog code looking for an issue that was not flagged as an error during simulation and was not known to exist.

Furthermore, using PSL as a means to check the design through dynamic simulation enabled me to attack verification from a different angle. The bug that was found should already have been detected through checks that were required in our bus functional model (BFM). But the BFM checks also were not being done properly. So the assertions found a bug not only in our design, but also in our dynamic-simulation environment.

The failing test case was well suited to assertion-based verification (ABV) techniques because it selectively targeted a concise, yet complicated, section of the code. The PSL code was not used simply as a tool to re-write the RTL code; in fact, it would have been very difficult to re-write this code in PSL. Instead, the PSL assertions targeted only a very specific, modular part of the whole calculation. The assertions were also much less complicated and time consuming to write than the original RTL code.

Another thing that we looked at during this evaluation was the capability of Incisive to infer assertions related to synthesis pragmas (parallel_case/full_case) automatically during dynamic simulation. Initially, there were some errors reported from these automatically inferred assertions.
After looking into them, we found that these errors were not real, but were only related to event ordering within the asynchronous logic, and that by the time the sequential logic observed these values they all had met the criteria assumed by the synthesis pragmas. Based on this observation, I requested that Cadence provide a way to associate each of these automatic assertions with a clock. Cadence agreed, and in a subsequent release provided a way to declare a clock for synchronous evaluation of these pragmas. This cleared up the false errors reported by the pragma assertions.

The end result of this evaluation is that we became more familiar with the capabilities of the Incisive verification tools with very little time investment, we improved our design, and we gained a better understanding of how to verify designs in the future. The payback for generating assertions and running them in dynamic simulation was much greater than the time it took to generate them. In the future, we will look more into using assertions for static verification as the tools mature and capacity improves. Also, during this evaluation we experimented some with the Cadence Incisive Assertion Library (IAL) components, and we may expand our use of those in the future.

INCISIVE VERIFICATION ARTICLE AUGUST 2005
KEYS TO SIMULATION ACCELERATION AND EMULATION SUCCESS
JASON ANDREWS, CADENCE DESIGN SYSTEMS

INTRODUCTION
For better or for worse, the engineering community, the press, and the EDA vendors themselves have incorrectly classified the world of simulation acceleration and emulation into two camps: field-programmable gate arrays (FPGAs) and application-specific integrated circuits (ASICs). Advocates in each camp declare the same tired facts: FPGAs take forever to compile and have internal-timing problems; ASICs are power hungry and require longer development time. When it comes to choosing an emulation system, the underlying technology does contribute to the characteristics of the system, but far too much time is spent on low-level technology details and not enough on how emulation gets the verification job done by providing high performance and high productivity. What engineers really mean when they discuss “FPGA vs. ASIC” is “prototyping vs. simulation acceleration and emulation.” To add to the confusion, some semiconductor companies even call their internally developed FPGA prototype an “emulator.”

This paper discusses the factors that are important when evaluating simulation acceleration and emulation, and the different use modes and applications for acceleration and emulation. With the acquisition of Verisity, Cadence now offers two of the most successful acceleration/emulation product lines in the market. From this unique position, Cadence can best serve customers by evaluating real verification needs and recommending products and technologies that enable improved verification through the use of simulation acceleration and emulation.

DEFINITIONS AND CHARACTERISTICS OF SIMULATION ACCELERATION AND EMULATION
Before the important aspects and use modes of emulation are presented, some definitions are needed. Four distinct methods are commonly used for the execution of hardware design and verification:

• Logic simulation
• Simulation acceleration
• Emulation
• Prototyping

Each hardware execution method has associated with it specific debugging techniques; each has its own set of benefits and limitations.
The execution time for these methods ranges from the slowest (with the most thorough debugging) to the fastest (with less debugging). For the purpose of this paper, the following definitions are used.

Logic simulation refers to an event-driven software simulator that operates by propagating input changes through a design until a steady-state condition is reached. Software simulators run on workstations and use languages such as Verilog®, VHDL, SystemC, SystemVerilog, and e to describe the design and verification environment. All hardware and verification engineers use logic simulation to verify designs.

Simulation acceleration refers to the process of mapping the synthesizable portion of the design into a hardware platform specifically designed to increase performance by evaluating the HDL constructs in parallel. The remaining portions of the simulation are not mapped into hardware, but run in a software simulator. The software simulator works in conjunction with the hardware platform to exchange simulation data. Removing most of the simulation events from the software simulator and evaluating them in parallel in the hardware increases performance. The final performance is determined by the percentage of the simulation that is left running in software, the number of I/O signals communicating between the workstation and the hardware engine, and the latency and bandwidth of the communication channel. A simple representation of simulation acceleration is shown in Figure 1.

Emulation refers to the process of mapping an entire design into a hardware platform designed to further increase performance. There is no constant connection to the workstation during execution, and the hardware platform receives no input from the workstation. By eliminating the connection to the workstation, the hardware platform runs at its full speed and does not need to wait for any communication. A basic emulation example is shown in Figure 2. By definition, all aspects of the verification environment required to verify the design are placed into the hardware. Historically, this mode has restricted coding styles to the synthesizable subset of Verilog® and VHDL code, so this mode is also called embedded testbench or synthesizable testbench (STB). Even without a constant connection to the workstation, some hardware platforms allow on-demand workstation access for activities such as loading new memory data from a file, printing messages, or writing assertion-failure data to the workstation screen to indicate progress or problems.

Figure 1: Simulation acceleration. The workstation runs the behavioral verification environment while the hardware engine holds the synthesizable logic and design memory, with infrequent, on-demand workstation access (e.g., $display(), $readmemh()).

During simulation acceleration, the workstation executes most of the behavioral code and the hardware engine executes the synthesizable code. Within simulation acceleration there are two modes of operation: signal-based acceleration and transaction-based acceleration. Signal-based acceleration (SBA) exchanges individual signal values back and forth between the workstation and the hardware platform. Signal synchronization is required on every clock cycle. SBA is required for verification environments that utilize behavioral-verification models to drive and sample the design interfaces. Transaction-based acceleration (TBA) exchanges only high-level transaction data between the workstation and the hardware platform, at less-frequent intervals.
TBA splits the verification environment into two parts: the low-level state machines that control the design interfaces on every clock, and the high-level generation and checking that occurs less frequently. TBA implements the low-level functionality in hardware and the high-level functionality on the workstation. TBA increases performance by requiring less-frequent synchronization, and offers the option to buffer transactions to increase performance further (a minimal sketch of the synthesizable half of such a transactor appears at the end of this section).

Figure 2: Emulation. The design and its verification environment run entirely in the hardware engine, with only infrequent, on-demand workstation access.

In-circuit emulation (ICE) refers to the use of external hardware coupled to a hardware platform for the purpose of providing a more realistic environment for the design being verified. This hardware commonly takes the form of circuit boards, sometimes called target boards or a target system, and test equipment cabled into the hardware platform. Emulation without the use of any target system is defined as targetless emulation. A representation of ICE is shown in Figure 3.

ICE can be performed in two modes of operation. The mode in which the emulator provides the clocks to the target system is referred to as a static target. When running ICE with static targets, it is possible to stop and start the emulation system, usually for the purpose of debugging. The mode in which the target system provides the clocks to the emulator is referred to as a dynamic target. When using dynamic targets, there is no way to stop the emulator, because it must keep up with the clocks supplied by the target system. Dynamic targets require special considerations for proper operation and debugging.

Figure 3: In-circuit emulation. Target boards are cabled into the hardware engine, which retains infrequent, on-demand workstation access.

Hardware prototype refers to the construction of custom hardware, or the use of reusable hardware (a breadboard), to build a hardware representation of the system. A prototype is a representation of the final system that can be constructed faster and made available sooner than the actual product. This speed is achieved by making tradeoffs in product requirements such as performance and packaging. A common path to a prototype is to save time by substituting programmable logic for ASICs. Since prototypes are usually built using FPGAs, they are often confused with and compared to emulation systems that also use FPGA technology. As we will see, there are very few similarities between how FPGAs are used in prototypes and in emulation. Prototypes use a conventional FPGA-synthesis flow to map a design onto the target FPGA technology. Partitioning and timing issues are left to the prototype engineer to solve, and require careful planning from ASIC designers at the beginning of the design process.

Now that the basic definitions of acceleration and emulation are clear, it is important to note that while some products in the market perform only simulation acceleration and some perform only emulation, the trend is increasingly for products to do both. The systems that perform both simulation acceleration and emulation usually are better at one or the other mode of operation, regardless of what the marketing brochure says. In the rest of the paper, the general term “emulator” is sometimes used for systems that do both simulation acceleration and emulation, but that “specialize” in emulation.
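To make the TBA split described above more concrete, here is a minimal, hypothetical sketch of the low-level half of a transactor: a synthesizable state machine that expands one high-level write transaction (delivered infrequently from the workstation side) into cycle-by-cycle pin activity on a simple bus. The module, signal names, and bus protocol are invented purely for illustration and do not correspond to any particular Cadence product interface.

// Hypothetical example only: the "low-level" half of a TBA transactor.
// The high-level generator/checker on the workstation fills the xact_* fields;
// this synthesizable FSM drives the design's pins on every clock.
module write_transactor (
  input  wire        clk,
  input  wire        rst_n,
  // Transaction-level side (filled infrequently from the workstation)
  input  wire        xact_valid,   // a new write transaction is available
  input  wire [31:0] xact_addr,
  input  wire [31:0] xact_data,
  output reg         xact_done,    // pulses when the transaction has been driven
  // Pin-level side (connected to the design under verification)
  output reg         bus_sel,
  output reg         bus_wr,
  output reg  [31:0] bus_addr,
  output reg  [31:0] bus_data,
  input  wire        bus_ready
);
  localparam IDLE = 2'd0, ADDR = 2'd1, DATA = 2'd2;
  reg [1:0] state;

  always @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      state     <= IDLE;
      bus_sel   <= 1'b0;
      bus_wr    <= 1'b0;
      bus_addr  <= 32'b0;
      bus_data  <= 32'b0;
      xact_done <= 1'b0;
    end else begin
      xact_done <= 1'b0;                  // default: no completion this cycle
      case (state)
        IDLE: if (xact_valid) begin       // latch the high-level transaction
                bus_sel  <= 1'b1;
                bus_wr   <= 1'b1;
                bus_addr <= xact_addr;
                bus_data <= xact_data;
                state    <= ADDR;
              end
        ADDR: state <= DATA;              // one-cycle address phase
        DATA: if (bus_ready) begin        // wait for the design to accept the data
                bus_sel   <= 1'b0;
                bus_wr    <= 1'b0;
                xact_done <= 1'b1;        // report completion to the workstation side
                state     <= IDLE;
              end
        default: state <= IDLE;
      endcase
    end
  end
endmodule

Because only xact_valid/xact_done and the transaction fields cross the workstation boundary, synchronization is needed only once per transaction rather than on every clock, which is exactly where TBA gets its performance advantage over SBA.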
IMPORTANT CHARACTERISTICS OF SIMULATION ACCELERATION/EMULATION
For engineers evaluating simulation acceleration and emulation, there are many important factors to consider. The main motivation for using simulation acceleration and/or emulation is always the significant performance increase that is possible over a logic simulator alone, but performance is not the only criterion to consider. This section discusses a baseline set of features that are must-haves for all users. Without these features, a product is probably not useful.

Automated compile process: Emulation offers a completely automated compile flow. The process is similar to compiling a design for logic simulation, but involves more steps to map the design onto the hardware architecture and prepare the database to be downloaded into the hardware. The user does not need to learn about the details of the hardware, nor how to partition and route the design across boards and chips. There is no need to learn about timing issues inside the system.

Simulation-like debug: Emulation provides many advanced debugging techniques to find and fix problems as quickly as possible. When needed, 100% visibility for all time is available, as well as many other modes of debugging, both interactive during execution and post-processing when execution is complete.

Multi-user system: Project teams surely will want to verify entire chips or multi-chip systems using emulation. These tasks require large capacity. At other times, emulation is useful for smaller subsystems of the design, and capacity can be shared with other projects. To get the most value from emulation, high capacity should be available when needed, but it should also be possible to split that capacity into smaller pieces shared among engineers.

Support for complex clocking: Another benefit of emulation is its ability to handle complex clocking in a design. Designs today use every trick possible to manage performance and power requirements. Most of these techniques do not map well onto FPGA architectures, which primarily support a single-clock-edge, synchronous design style.

Easy memory mapping: Emulation provides the ability to map memories described in Verilog and VHDL automatically onto the hardware with little or no user intervention.

Intellectual property (IP) integration: Many designs contain hard- and/or soft-IP cores. Emulation provides the ability to integrate processor models so that emulation can be used to execute embedded software.

KEYS TO EMULATION SUCCESS
Emulation products are no different from any other product: no one product can be the best at everything. In every design process there are constraints such as cost, power, performance, and system size. Engineers make tradeoffs to determine which factors are important, what to optimize, and what to compromise. Understanding the keys to emulation success in each mode of operation will lead to the best solution for the situation.

SIGNAL-BASED ACCELERATION (SBA)
The key to success for signal-based acceleration is easy migration from logic simulation. SBA users want to turn on performance with minimal effort. System characteristics that make an easy migration possible include:
• An easy migration path from simulation, including partitioning of the code that goes to the hardware vs. the code that goes to the workstation, and an automatic connection between these two partitions
• Evaluation algorithms that are as close as possible to the logic simulator’s algorithms
• The ability to dynamically swap the execution image from the logic simulator into hardware and back again at any time
• Initialization flexibility, such as the ability to use Verilog initial statements and procedural language interface (PLI) code, and to execute initialization sequences in logic simulation before starting the hardware
• Dynamic compare for debugging when results don’t match logic-simulation results

TRANSACTION-BASED ACCELERATION (TBA)
The key to success for transaction-based acceleration is the verification environment and the verification IP (VIP). TBA offers higher performance compared to SBA, but requires some additional planning up front. System characteristics that make TBA successful include:

• An easy-to-use TBA infrastructure for VIP creation
• A high-bandwidth, low-latency channel between the emulator and workstation that enables the emulator to run as the master and to interrupt the workstation dynamically when needed
• A library of TBA-ready VIP for popular protocols that runs congruently in the logic simulator and on the emulator

EMBEDDED/SYNTHESIZABLE TESTBENCH (STB)
Synthesizable testbench is all about performance. In addition to runtime speed, performance also comprises the overall turnaround time, including compile time, waveform generation, and debugging. For designs that don’t have multiple complex interfaces, STB provides the highest level of performance possible. In addition, STB is used commonly for software development, where performance is critical. The ability to interrupt the workstation infrequently in order to execute some behavioral functions, such as loading memory or printing messages, makes STB easier to use.

IN-CIRCUIT EMULATION (ICE)
The key to ICE is the availability of hardware interfaces and target boards. ICE is always tricky because the emulator runs much more slowly than the real system will. System characteristics that make ICE successful include:

• A collection of ICE-ready vertical solutions for popular protocols to minimize custom hardware development
• Support for all types of ICE targets:
  – Static targets that allow the provided clock to be slow or even stopped
  – Dynamic targets, where the target board needs a constant or minimum clock frequency to operate correctly and stopping the clock would be fatal
  – Targets that drive a clock into the emulator
• ICE-friendly debugging:
  – Connections to software debugging environments
  – Large trace depths for debugging dynamic targets

CHARACTERISTICS OF PALLADIUM AND XTREME
The keys to emulation discussed above help determine the most important modes of operation and the solutions that best fit a set of verification projects. The next section describes the characteristics of the Cadence® Palladium® and Xtreme products, along with the strengths of each.

PALLADIUM FAMILY
Cadence Palladium systems are built using ASICs designed specifically for emulation. Each ASIC in a Palladium system implements hundreds of programmable Boolean processors for logic evaluation. Each ASIC is combined with memory on a ceramic multi-chip module (MCM) for modeling design memory and for debugging. The Palladium II custom processor is fabricated using IBM CU08, eight-layer copper-process technology. Each board in the system contains a number of interconnected MCMs to increase logic and memory capacity.
Palladium uses the custom-processor array to implement a statically scheduled, cycle-based evaluation algorithm. Each processor is capable of implementing any four-input logic function using any results of previous calculations. During compilation, the design is flattened and broken into four-input logic functions. Each calculation is assigned to a specific processor during a specific time slot of the emulation cycle. The emulation cycle comprises the number of time slots, or steps, required for all the processors to evaluate the logic of the design. Typically, the number of steps required for a large design is somewhere between 125 and 320. Performance can be improved through compiler enhancements that result in a shorter emulation cycle (see the rough illustration below). The other benefit of the evaluation algorithm used by Palladium is its ability to achieve higher performance by adding more processors. The availability of processors enables greater parallelization of the computations and has a direct impact on performance. Because the scheduling of evaluations is fixed, performance doesn’t depend on design activity or design gate count.

The technology used in Palladium gives it many desirable characteristics for emulation:

Highest performance: Since the custom processors are designed for high-performance emulation right from the start, using the most advanced semiconductor processes, the chips are clocked at hundreds of MHz, resulting in emulation speeds exceeding 1 MHz without any special advance design planning.

Fastest compile: The processor-based architecture leads to compile speeds of 10–30 million gates per hour on a single workstation.

Best debugging: The custom hardware of Palladium is also designed for fast debugging. It supports multiple modes of debugging depending on how much information the user would like to see. It also has dedicated memory and high-bandwidth channels to access debugging data.

Large memory: The custom hardware and the advanced technology enable very large on-chip memory with fast access times. This capability enables users to incorporate large on-chip and on-board memories with full visibility into the memory while maintaining high performance. Designers can use this memory for test vectors or for storing embedded software code.

Highest capacity: The ability to connect multiple units without changing the run-time frequency and bring-up time of the overall environment makes the Palladium family scalable, and provides an environment that supports 2–128 million gates with Palladium and 5–256 million gates with Palladium II. This is the highest emulation capacity in the market.

Palladium offers a typical speed in the range of 600 kHz–1 MHz (with many designs running over 1 MHz). It is an ideal system for ICE and STB. A deterministic algorithm and precise control over I/O timing enable it to excel in ICE applications where constant speed is critical. Boundary scan (i.e., JTAG) is an example of an application that requires constant clock rates, the ability for the emulator to receive a clock signal from an external target board, and hardware debugging when the target system cannot be stopped. Decades of ICE experience translate into the most available and reliable off-the-shelf ICE solutions for major markets including networking, multimedia, and wireless.
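To make the relationship between processor clock, step count, and emulation speed concrete, here is a rough, purely illustrative calculation; the specific numbers are hypothetical and chosen only to fall within the ranges quoted above:

  emulated clock rate ≈ processor clock rate / steps per emulation cycle
                      ≈ 200 MHz / 200 steps
                      ≈ 1 MHz

This is why a shorter schedule (fewer steps per emulation cycle) or a faster processor clock translates directly into a higher emulated clock rate, independent of design activity.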
Palladium can do simulation acceleration (including SBA and TBA), but lacks some of the advanced features such as bi-directional simulation swapping and dynamic compare with the logic simulator. Palladium also supports assertions in all modes of operation, including ICE with dynamic targets.

Palladium supports up to 32 users and 61,440 in-circuit I/O signals, and is a true enterprise-class verification system with the highest capacity and highest performance. It can be used by multiple engineers and multiple projects in a company from anywhere in the world.

XTREME FAMILY
Cadence Xtreme systems are built using reconfigurable computing co-processors (RCCs) implemented in programmable logic using high-density commercial FPGAs. Each FPGA implements hundreds of reconfigurable processors of different types based on the logic in the design. In addition to logic evaluation, each FPGA also contains internal memory that can be used to model design memory. Each board contains a number of FPGAs combined with additional static and dynamic memory to increase logic and memory capacity.

Although Xtreme uses commercial FPGA devices, it does not operate like other FPGA-based systems. When considering how to use FPGAs, most engineers immediately think of synthesizing their design into a netlist, mapping it gate-for-gate, wire-for-wire onto the array of FPGAs, and trying to manage all the timing issues associated with internal FPGA timing and routing between FPGAs. This description in no way resembles how RCC works.

The best way to think of RCC technology is to think about how a logic simulator works. A simulator uses an event-based algorithm that keeps track of signal changes as events. It schedules new events by putting them into an event queue. At any particular simulation time, the algorithm updates new values (based on previous events) and schedules new events for some future time. As simulation time advances, only the values that are changing are updated, and no extra work is done to compute values that don’t change. This event-based algorithm is the concept behind RCC. It uses patented dynamic-event signaling to implement a simulation-like algorithm. The difference is that the events all execute in hardware. Messages are sent between the co-processors as events occur and updates are required in other co-processors. As with a simulator, new values are computed only when necessary, based on design activity. Dynamic-event signaling leads to a system that has none of the timing issues associated with FPGAs, and a product that runs and feels like a logic simulator.

The technology used in Xtreme gives it many desirable characteristics for simulation acceleration and emulation:

Low power: FPGAs consume very little power, and the dynamic-event signaling algorithm used in Xtreme translates into good emulation performance with lower clock rates inside the system.

Small form factor: Low power combined with high-density FPGAs translates into a 50-million-gate system (with 60–65% utilization) that fits in a desktop form factor or a 6U rack-mounted chassis. Xtreme is a portable system that is light, easy to transport, and often placed in an engineer’s cubicle instead of a lab or computer room.

Frequent capacity upgrades and cost reductions: Since emulation capacity and cost follow the mainstream FPGA technology curve, it is possible to introduce new systems that increase capacity and/or lower cost very rapidly and with minimal effort.
Xtreme offers a typical speed in the range of 150–300 kHz and is an ideal system for simulation acceleration and TBA. It excels at targetless emulation and in any environment with a mix of SBA, TBA, and STB. Xtreme offers better performance than classic simulation acceleration systems that enforce the workstation as master. Xtreme is also an ideal system for accelerating e verification environments using SpeXtreme, a product that utilizes the behavioral-processing features of Xtreme to increase overall verification performance. Xtreme can handle ICE, but lacks many of the advanced ICE features that are available in Palladium, such as dynamic-target support and robust ICE-debugging methods.

The Xtreme family supports up to 12 users and 4,656 in-circuit I/O signals, and is a design-team-class verification system that can be used by multiple engineers working on a project. It offers excellent price vs. performance. Although there is some overlap in capabilities, the Xtreme and Palladium families of products excel at different modes of operation, as shown in Figure 4.

Figure 4: Emulation strengths. Plotted against verification performance and verification scope (block-level, chip-level, and system-level), the Xtreme family’s sweet spot covers signal-based and transaction-based acceleration, while the Palladium family’s sweet spot covers embedded (synthesizable) testbench and in-circuit emulation.

CONCLUSION
In this paper, we discussed how all products must make trade-offs in deciding which parameters should be optimized and which should be compromised. With the Xtreme and Palladium families, Cadence is in the ideal position to serve customers of all types. Some users choose one of the product lines to use throughout the phases of a project. Others utilize both technologies, each one for different phases of verification or for different projects.

Palladium offers the highest performance and highest capacity for emulation power-users with very large designs. Palladium is the most comprehensive solution for ICE and was designed with advanced ICE users in mind. Because of its simulation-like algorithm, Xtreme excels at simulation acceleration and TBA and provides higher performance than other acceleration solutions. It was designed with simulation acceleration in mind, with concepts like simulation swap and dynamic compare that make it run and feel like a logic simulator.

Although the technologies behind Palladium and Xtreme are different, both are processor-based architectures that provide automated compile, short bring-up times, and the scalable multi-user capacity not found in prototyping systems. Both systems provide more design turns per day than competing systems, and each has a proven track record of success in the marketplace. Together, these product lines are used by more than two-thirds of all simulation acceleration and emulation users. Engineers evaluating emulation need to understand the keys to success presented here for the different modes of operation and determine which solution is the best fit for their verification challenges.