I. Overview: Reset Domain Crossing Analysis as Part of Sign-Off
More reset domains and asynchronous resets in modern chips, along with faster designs, require a proper reset domain crossing sign-off methodology to avoid chip-killing metastability, glitches, and other functional problems.
The growth in reset domains is driven by complex SoCs that use different controllers to manage increasingly complex power management schemes, security management schemes, I/O components, and functional configurations. These come from different sources that can run at different clock frequencies, and they sit alongside power-up resets and software-based resets.
This paper discusses the four fundamentals of RDC needed to eliminate RDC bug escapes while delivering best PPA (performance, power, and area) designs in tight project windows, including key metrics to assess an effective methodology.
II. What is Reset Domain Crossing?
A reset domain crossing (RDC) occurs when a path’s transmitting flop has an asynchronous reset, and the receiving flop has an uncorrelated reset or no reset. The design errors that can occur from improperly implemented reset domain crossings and asynchronous resets are metastability, glitches, and loss of functional correlation.
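The definition above can be written as a small predicate. This is a minimal sketch for illustration only, not how any sign-off tool is implemented; the `Flop` record, the `is_rdc` function, and the `correlated` set of reset pairs are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import AbstractSet, FrozenSet, Optional


@dataclass(frozen=True)
class Flop:
    """Hypothetical flop record: a name plus its asynchronous reset net (or None)."""
    name: str
    async_reset: Optional[str]


def is_rdc(src: Flop, dst: Flop,
           correlated: AbstractSet[FrozenSet[str]] = frozenset()) -> bool:
    """Return True if the path src -> dst matches the RDC definition.

    `correlated` holds pairs of reset nets that the user declares to be
    correlated (an assumption supplied by the designer, not derived here).
    """
    if src.async_reset is None:
        return False   # transmitting flop has no asynchronous reset
    if dst.async_reset is None:
        return True    # receiving flop has no reset at all
    if dst.async_reset == src.async_reset:
        return False   # both flops share the same reset
    # Different resets: a crossing unless the pair is declared correlated.
    return frozenset({src.async_reset, dst.async_reset}) not in correlated


ff1 = Flop("FF1", "RST_1")
ff2 = Flop("FF2", "RST_2")
ff3 = Flop("FF3", None)
print(is_rdc(ff1, ff2), is_rdc(ff1, ff3), is_rdc(ff3, ff1))
```

Note that the predicate never looks at clocks, which is consistent with the point that an RDC can exist inside a single clock domain.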
The result is that design teams are including RDC analysis and sign-off as part of their strategy to shift left, to gain confidence before going into silicon.
III. Four Fundamentals to Eliminate RDC Bug Escapes
1. RDC Sign-Off has Critical Differences from CDC Sign-Off
Both RDC and CDC sign-off use static methods to identify metastability, glitch, and functional correlation design errors.
However, the design principles for protection against RDC issues are very different, due to the underlying distinction between RDC and CDC design problems. Thus, RDC domain-specific functional analysis is required.
Four important ways in which RDC analysis differs from CDC analysis include:
RDC errors can occur even in the same clock domain.
RDC analysis scope is global, while CDC interfaces are localized.
Analysis customized for RDC is required to identify all RDC issues with low noise.
RDC’s mean time between failures is higher than for CDC.
Because of these factors, reset domain crossing analysis must be highly customized to protect against RDC design issues.
1a. Reset domain crossing errors can occur within the same clock domain
Asynchronous reset assertion is still an asynchronous event, even if the reset crossing occurs in the same clock domain.
CDC design problems only occur between asynchronous clock domains.
RDC design problems can occur between both synchronous and asynchronous clock domains.
In a sequential design, if the source flop makes an asynchronous transition to a reset state, it can cause a transition inside the setup and hold window of the destination flop and cause a metastability issue.
1b. Reset Domain Crossing needs global analysis. CDC has local analysis
Identifying design issues requires a different scope of analysis for RDC versus the analysis needed for CDC.
RDC metastability analysis & debug is GLOBAL. The RDC analysis path can go from source in one reset domain through an intermediate set of flops and modules to destination flop in a different reset domain.
CDC metastability analysis & debug is LOCAL. The CDC analysis path goes from source flop in one clock domain to destination flop in a different clock domain.
Global analysis required to find all RDC issues
Local analysis (CDC interfaces) required to find all CDC issues
1c. Analysis customized for RDC is required to identify all RDC issues with low noise
RDC domain-specific analyses, both structural and functional, are needed to identify all RDC design issues with low noise.
The example below shows how structural analysis alone can show a metastability error, while advanced RDC functional analysis of the design can recognize design protection such as a blocking signal.
Without this advanced RDC functional analysis, the number of false errors reported goes up tremendously. Engineers must review the false errors in the reported violations, so a high-noise violation report requires a lot more engineering time and effort to reach sign-off.
Additionally, if there are too many false positives reported, engineers can end up classifying actual errors as waivers – resulting in missed design bugs.
Structural analysis only: false-positive error
Analysis shows the assertion of RST_1 creates an untimed path from FF1 to FF2, which can cause metastability when RST_1 asserts and RST_2 to FF2 is de-asserted.
Running RDC-specific functional analysis of the combinational logic shows there is a blocking signal. The reset domain crossing path is proven safe, and no false error is reported.
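The idea behind the blocking-signal check can be sketched with a toy model. Everything below is an assumption made for illustration: the gate between FF1 and FF2 (`ff2_d`), the enable signal `EN`, and the two enable functions are hypothetical, and a real tool works on the actual netlist with formal methods rather than by enumerating a hand-written function.

```python
from typing import Callable


def ff2_d(ff1_q: bool, en: bool) -> bool:
    """Hypothetical combinational logic between FF1 and FF2: D = FF1.Q AND EN."""
    return ff1_q and en


def gated_en(rst_1: bool, req: bool) -> bool:
    """Hypothetical blocking logic: EN is forced low whenever RST_1 is asserted."""
    return req and not rst_1


def ungated_en(rst_1: bool, req: bool) -> bool:
    """Hypothetical unprotected logic: EN ignores RST_1."""
    return req


def crossing_is_blocked(en_fn: Callable[[bool, bool], bool]) -> bool:
    """With RST_1 asserted, check that FF2's D input cannot depend on FF1.Q.

    Enumerates every value of the free input `req`. If D is identical for
    FF1.Q = 0 and FF1.Q = 1 in all cases, an asynchronous change on FF1.Q
    cannot reach FF2, so the structural violation is a false positive.
    """
    for req in (False, True):
        en = en_fn(True, req)  # RST_1 asserted
        if ff2_d(False, en) != ff2_d(True, en):
            return False       # FF1.Q is observable at FF2.D
    return True


print(crossing_is_blocked(gated_en), crossing_is_blocked(ungated_en))
```

A purely structural check would flag both variants, because the FF1 to FF2 path exists in each; only the functional check separates the protected design from the unprotected one.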
1d. RDC mean time between failures (MTBF) is higher than CDC
One practical challenge faced by design teams is that chip failures due to RDC can be missed because the MTBF is much higher for RDC than for CDC.
For both RDC and CDC, the MTBF is correlated with the source flop change rate, the destination flop capture rate, and the setup-hold time window.
-The source flop change rate. Clocks toggle continuously; thus, if a CDC flop changes value once every 10 clock cycles, a clock domain crossing for a 1 GHz frequency clock can occur 100 million times per second. In contrast, resets only assert based on specific events.
-The destination flop capture rate, which is defined as the number of possible destination flop clock edges that can capture a transition from the source flop. The probability that a signal transition falls within the setup and hold time window of the capture flop is much higher for CDC as compared to RDC.
Because the mean time between failures is high for RDC issues, the frequency of RDC errors is highly unpredictable: they may occur once every hour, day, week, or month. Although they may not show up easily, they are still there.
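The gap between the two rates can be illustrated with a first-order estimate. This is a simplified sketch, not a full MTBF model: it only counts how often a source transition lands inside the destination's setup and hold window, and not every such hit becomes a failure. The 1 GHz clock and the one-change-per-10-cycles rate come from the text above; the 20 ps window, the 1 GHz destination clock, and the one-reset-per-hour rate are hypothetical numbers chosen for illustration.

```python
def window_hit_rate(src_change_hz: float, dst_capture_hz: float,
                    window_s: float) -> float:
    """Expected source transitions per second that land in the setup/hold window.

    First-order model: each transition falls in the window with probability
    dst_capture_hz * window_s (assumed to be at most 1).
    """
    return src_change_hz * dst_capture_hz * window_s


WINDOW_S = 20e-12        # assumed setup + hold window: 20 ps
DST_CLOCK_HZ = 1e9       # assumed destination clock: 1 GHz

# CDC: source flop changes once every 10 cycles of a 1 GHz clock.
cdc_rate = window_hit_rate(1e9 / 10, DST_CLOCK_HZ, WINDOW_S)

# RDC: assume the asynchronous reset asserts once per hour.
rdc_rate = window_hit_rate(1 / 3600, DST_CLOCK_HZ, WINDOW_S)

print(f"CDC window hits per second: {cdc_rate:.3g}")
print(f"RDC mean hours between window hits: {1 / rdc_rate / 3600:.3g}")
```

Under these assumed numbers the RDC exposure is many orders of magnitude rarer than the CDC exposure, which is why an RDC bug can survive long test runs and still appear in the field.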
These four differentiating aspects of RDC vs. CDC demonstrate why global static and functional RDC-specific analysis is necessary to ensure there are no reset domain crossing issues causing design failure.
Without this precise, RDC-specific analysis, reset domain crossing sign-off methods become unusable to designers due to too many false-positive design errors, creating the potential for missed errors.
The three primary types of design errors that result from incorrect implementation of reset domain crossings with asynchronous resets are described below.
2. Three Primary RDC Design Failures from Asynchronous Resets
Metastability when asynchronous resets are activated or deactivated. If the asynchronous reset propagates from one reset domain to another, it can cause metastability that leads to design failure.
Improper functional correlation. Reconverging synchronized resets can cause uncertainty and functional errors. When two events from one reset propagate through multiple synchronizers, they can create incorrect functional behavior where the flop being driven goes to an unexpected state.
Glitches for asynchronous resets. A design can have multiple sources toggling at different times resulting in a glitch at an asynchronous reset. This can cause an intermediate wrong value at the flop it drives and can cause a functional failure.
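The glitch case can be sketched with sampled waveforms. This is a toy model under stated assumptions: the active-low reset `rst_n` is assumed to be the OR of two hypothetical sources `a` and `b`, and each list entry is one time step. The designer intends `a` to fall at the same moment `b` rises, so `rst_n` should stay high throughout.

```python
from typing import List


def reset_waveform(a: List[int], b: List[int]) -> List[int]:
    """Hypothetical reset logic: rst_n = a OR b, evaluated at each time step."""
    return [x | y for x, y in zip(a, b)]


def has_glitch(rst_n: List[int]) -> bool:
    """An active-low reset that should stay high glitches if it ever reads 0."""
    return 0 in rst_n


a         = [1, 1, 0, 0, 0]   # source a falls at step 2
b_aligned = [0, 0, 1, 1, 1]   # source b rises at the same step (ideal)
b_late    = [0, 0, 0, 1, 1]   # source b arrives one step late

ideal = reset_waveform(a, b_aligned)
skewed = reset_waveform(a, b_late)
print(ideal, has_glitch(ideal))
print(skewed, has_glitch(skewed))
```

In the skewed case the one-step low pulse on `rst_n` would asynchronously reset the flop it drives, even though the logical intent never asked for a reset.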
3. Tool Usability Can Make or Break an RDC Methodology
The overall goal is to invest in and establish a reset domain crossing static sign-off methodology that is widely adopted by the development team.
To achieve this goal, it is critical to adopt an RDC sign-off tool that:
Is highly usable by the design team, minimizing designer effort and the risk of error.
Fits easily into the company’s existing tool flow, supporting industry-standard formats.
Below are several of the key metrics to consider when evaluating a reset domain crossing tool.
Finds ALL Targeted Failures
To be considered a sign-off tool, by definition, the tool must find and report all targeted problems. It cannot miss any issues. Practically speaking, the first metric is tightly linked to the second metric of having low noise level.
High Precision, Low Noise Reports
Designers must review all errors; a ‘noisy’ report (meaning a high ratio of false errors to reported violations) places an excess burden on designers. A low noise level is fundamental to the tool’s usability. The noise level difference between tools can easily be an order of magnitude; this can make a tool impractical to use and leave the design susceptible to missed errors.
Fast Runtime, High Capacity
Performance is another factor that varies dramatically between reset domain crossing tools — some tools run multiple times faster. Faster performance enables designers to quickly execute multiple analysis and debug iterations. Fast performance and high capacity are even more critical at the chip level.
Since all static sign-off tools have at least some noise, the tool should have violation filtering, e.g., for a particular reset scenario or for a block, to accelerate the identification of real errors. Efficient debug is further achieved by other features such as the tool displaying a schematic of the violation that is linked to the source code.
A multimode RDC sign-off tool requires only one set up for multiple reset scenarios, then runs them simultaneously, producing one consolidated report. This minimizes engineering effort and review time, which is a big factor in usability.
RDC Scenario Verification
An RDC tool should be able to perform setup checks and generate assertions for verifying constraints and scenarios in the industry-standard SystemVerilog Assertions (SVA) format. This enables designers to run the assertions in their preferred simulator and formal verification tool.
4. Using Asynchronous Resets Can Improve PPA (vs. Synchronous Resets)
4a. Asynchronous resets can consume less power & area.
Synchronous resets only work when the clock toggles; thus, they require additional logic (multiplexers, AND gates…) outside the flop so that the reset signal overrides the data signal.
Asynchronous resets do not need additional logic on the data path; they can connect directly to the flop. Asynchronous reset signals override the clock signal and are independent of the clock toggling.
The result is that asynchronous resets can consume less power and area compared with synchronous resets.
4b. Asynchronous resets help design performance
Synchronous reset assertions need the clock to be running to drive the flop to its reset state. Therefore, the flop reset is delayed until the next clock edge after the clock starts.
Asynchronous reset assertions drive the flop to its reset state immediately. This accelerates the reset sequence, decreasing initialization time.
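The timing difference can be seen in a small behavioral model. This is an illustrative sketch, not RTL: the `ResettableFlop` class, its `step` method, and the event list are hypothetical, and each event is a tuple of data input, reset level, and whether a clock edge occurs in that step.

```python
from typing import List, Tuple


class ResettableFlop:
    """Toy behavioral model of a flop with an active-high reset to 0."""

    def __init__(self, asynchronous: bool, q: int = 1) -> None:
        self.asynchronous = asynchronous
        self.q = q

    def step(self, d: int, rst: int, clk_edge: bool) -> int:
        # Async reset acts at once; sync reset waits for a clock edge.
        if rst and (self.asynchronous or clk_edge):
            self.q = 0
        elif clk_edge:
            self.q = d
        return self.q


def trace(flop: ResettableFlop,
          events: List[Tuple[int, int, bool]]) -> List[int]:
    """Apply each (d, rst, clk_edge) event and record the flop output."""
    return [flop.step(d, rst, edge) for d, rst, edge in events]


EVENTS = [
    (1, 0, True),    # normal clocked operation
    (1, 1, False),   # reset asserted while the clock is stopped
    (1, 1, False),   # clock still stopped
    (1, 1, True),    # first clock edge after the clock starts
]

sync_trace = trace(ResettableFlop(asynchronous=False), EVENTS)
async_trace = trace(ResettableFlop(asynchronous=True), EVENTS)
print("sync :", sync_trace)
print("async:", async_trace)
```

The asynchronous flop reaches its reset state in the step where reset asserts, while the synchronous flop holds its old value until the first clock edge arrives.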
IV. Conclusion: Eliminating Reset Domain Crossing Bugs while Maintaining PPA
Reset domain crossing sign-off is becoming part of the sign-off flow before tape out, due to complicated power management schemes, security management schemes, I/O connections, configurations, and faster designs.
It’s vital to deploy a methodology that eliminates RDC bug escapes due to metastability, glitches, and functional correlation losses — while still delivering the best possible PPA (performance, power, and area) designs in tight project windows.
About Real Intent Meridian RDC
Real Intent Meridian RDC reset domain crossing sign-off tool uses an engine specifically customized for global RDC static and functional analysis.
This approach and other unique technology enable the tool to address the key metrics discussed above, minimizing designer effort and eliminating missed errors.
1. Finds all targeted failures.
2. Violation report with 20x+ less noise (false errors) than competitive tools.
3. 4x faster runtime + high capacity.
4. Multimode / Multi-RDC scenarios — it requires only one set up, runs multiple reset scenarios in one run, and generates one consolidated, low-noise violation report.
5. Leverages Real Intent’s iDebug, with automatic filtering and pinpoint schematics.
6. Generates industry-standard SVA for RDC scenario verification.
7. Fits easily with existing tool flows, supporting industry-standard formats.
Dr. Prakash Narain shifted Real Intent’s technology focus to Static Sign-Off to better tackle verification inefficiencies back in 2010. Since then, Prakash has expanded this effort to now offer six tools focused specifically on different static sign-off domains, ranging from CDC to RDC to Linting — with more coming.
His career spans IBM, AMD and Sun where he had hands-on experience with all aspects of IC design, CAD tools, design and methodologies. He was the project leader for test and verification for Sun’s UltraSPARC III, and an architect of AMD’s Mercury Design System. He has architected and developed CAD tools for test and verification for IBM EDA.
Dr. Narain has a Ph.D. from the University of Illinois at Urbana-Champaign, where his thesis focused on algorithms for high-level testing and verification.