ASIC Design & Verification
1.0 Simulate Everything
While seemingly obvious to ASIC folks, this is step #1. FPGAs can often be tested very productively in the lab (that's part of the point, right?). With ASICs, you get only one shot, or at least you pay a penalty on the order of $100,000 per re-spin. A wise designer once said, "That which is not simulated will not operate correctly."
"Everything" is a little ambitious, of course. Formal test methodologies dictate that your design be specified in a document containing detailed, numbered paragraphs for every feature and function, with numbers for any minimums and maximums. They then require a test document specifying a matrix that cross-references every system feature/function against a very specific test verifying that feature/function, including those min/max parameters.
For FPGA folks, note that the heavy emphasis on simulation goes beyond just testing the design thoroughly. Chip vendors require the submission of test vectors, which requires this kind of testbenching.
2.0 Regression, Regression and Regression
Regression testing means you can run a test suite on your design at any time, in an easy manner, requiring little additional setup and little analysis. The idea is that you can retest everything whenever something changes, or periodically as a matter of course. This is sometimes referred to as the "Always Working" model, where the design is kept in an always-working state. "Always working" is made easier by using RCS or another source code control system, such that an official "release" directory contains only source code that has passed the full regression test.
Is this just mindless testing repetition? Here are some reasons for doing this kind of repeated regression testing:
1) A seemingly small tweak of the design will lead to a problem. On the surface, nobody believes the tweak justifies another round of tedious tests.
2) The design is really part of a larger system where another team's module has been fiddled with. They make a tweak and falsely believe it does not impact your block.
3) Netlist changes! Quick - does the design still work? (see section 10)
4) A new version of a tool was installed last night and will eventually cause a problem. Better to detect this sooner than later.
Regression tests should be repeatable and traceable. A test should first be repeatable. If randomness is built into a test, it may be important to specify an initial SEED so that an error detected on a previous run can be replicated. Traceability means that when an anomaly or error is reported in a test, enough additional information is reported to let you zero in on the problem. This means you should display messages about major events in the test and include the simulation time (e.g. use the %t format and the $time system function!). This allows you to rerun a test and selectively capture simulation data (e.g. waveforms or VCD) around the time the error occurs. In other words, if you have a test that runs for 9 hours and an error pops up 5 hours in, will you be able to diagnose that error?
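As a minimal Verilog sketch of this idea (the +seed= plusarg is just one possible convention, and the stimulus is a placeholder), a test can accept a seed and timestamp its major events:

```verilog
// Sketch: seedable stimulus with time-stamped event messages.
module tb;
  integer seed;
  reg [7:0] data;
  initial begin
    // Default seed; rerun a failing test with +seed=<n> to replicate it.
    if (!$value$plusargs("seed=%d", seed)) seed = 1;
    $display("%t INFO: starting test, seed=%0d", $time, seed);
    data = $random(seed);   // all randomness flows from the one seed
    $display("%t INFO: first data byte = %h", $time, data);
  end
endmodule
```

Because every random value derives from the single seed, logging that seed in the first message is enough to reproduce the entire run.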
A good goal is to pass the "Weekend Run Test": when you leave work on Friday afternoon, can you kick off your test suite knowing that you can easily review the results on Monday morning? Obviously, this requires an automated test suite, but it also implies that the results are easy to check at the end. Results that take a minimum of 4 hours to analyse discourage the casual "weekend run". So, the goal is to feel free to kick off the regression test suite at any time.
3.0 PASS or FAIL, Please!
Each test should report an unambiguous, succinct indication of pass or fail. This is also referred to as the Self-Checking Testbench. Viewing waveforms is great for diagnosing problems, but it is not the way to repeatedly assess whether your test and your design are working.
Establish a convention for reporting PASS/FAIL. Best is a one-line print statement containing the date/time the test completed, the name of the test, and either the word PASS or FAIL. More information is fine as long as this bottom-line PASS/FAIL is there. The line should contain some unique keyword like ENDTEST. Imagine a system where each test is in its own directory and a script runs a "test suite", a key set of these tests. After the last test runs, the script could do a 'grep ENDTEST */results | mail email@example.com'. You receive an email listing a one-line PASS/FAIL per test for the entire test suite that ran.
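A minimal Verilog sketch of this convention (the test name and message format here are just illustrative):

```verilog
// Illustrative self-checking test ending in one unambiguous ENDTEST line.
module tb;
  reg pass;
  initial begin
    pass = 1'b1;
    // ... run stimulus; clear 'pass' on any checker mismatch ...
    $display("ENDTEST %0t %s %s", $time, "example_test",
             pass ? "PASS" : "FAIL");
    $finish;
  end
endmodule
```

Because the keyword, test name, and verdict all sit on one line, the suite script's grep produces a clean one-line-per-test summary.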
4.0 Keeping the design robust and flexible
This is almost a design issue rather than a verification issue; however, the act of testing often reveals opportunities for beefing up design robustness. A robust design partially means that it can be tweaked to handle unforeseen circumstances. If a module ends up being problematic, it might be useful to disable it - always include a DISABLE bit. Include alternative paths or even ways to bypass modules and features. If there is a shred of doubt about data formats or parameters, provide programmable settings. Option bits can selectively complement data, change bit ordering, little/big-endianness, or byte ordering. Programmable wait states might be key to fixing a fragile interface. All this may have speed and area costs, but risk reduction is important, too. Just remember: that one option bit you add may save your design in the lab.
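As an illustration (the module and register names here are invented), a software-writable disable bit and bypass mux around a risky block might look like:

```verilog
// Hypothetical sketch: option bits guarding a risky module.
module maybe_bypass (
  input        clk,
  input        cfg_disable,  // software-writable: force the core idle
  input        cfg_bypass,   // software-writable: route data around it
  input  [7:0] din,
  output [7:0] dout
);
  wire [7:0] core_out;
  // risky_core is a placeholder for the block you might need to switch off.
  risky_core u_core (.clk(clk), .en(~cfg_disable),
                     .din(din), .dout(core_out));
  assign dout = cfg_bypass ? din : core_out;
endmodule
```

The mux and control bits cost almost nothing in area, but they give the lab team a way to route around the block if it misbehaves in silicon.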
5.0 Coverage and the "Time as a Tool" test
Are you testing all possible lengths, options, fields, rates, and cases? Probably not, but are you covering the minimums and the maximums and at least one example of each major case? If you think you are (maybe you have a test that is supposed to test this) but aren't sure, how can you verify that you're verifying what you think you are verifying?! This is all about coverage. There are tools that help with this. Code coverage tools (like Synopsys Covermeter) will analyse your tests and report, for every line of your RTL and test code, whether the line was executed at some point during the tests. There is also functional coverage, where you measure how much of your system functionality is exercised by the tests. Tools like Specman Elite from Verisity address this issue.
Consider adding pseudo-random stimulus creation. Randomly select parameters, options, etc. This is especially good if it is unrealistic to attempt to generate a test case for all possible combinations. One goal is to pass the "Time as a Tool" test. The question is: if you had another N hours to run a particular test, would you increase your test coverage? In other words, the longer the test runs, the more nooks and crannies will be explored, because you are randomly exploring more and more cases. Pseudo-random testing ties back into coverage. Beware of a false sense of security where you assume you really are covering all possible cases just because you have some randomness in your test. Verify your coverage!
A more homebrew scheme for test coverage can be done in your regular HDL. Instrument your code so that whenever it tests a particular case or sets some parameter value, it displays a message indicating this. This HDL can be activated with a '+define+SHOW_COVERAGE' option on the simulation command line. Some creative grepping of the log file after the test suite finishes can actually report a % coverage.
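A coverage message guarded by that compile-time switch might look like this fragment (the signal and parameter names are invented):

```verilog
// Coverage instrumentation compiled in only with +define+SHOW_COVERAGE.
`ifdef SHOW_COVERAGE
always @(posedge clk)
  if (pkt_valid && (pkt_len == MAX_LEN))
    $display("COVERAGE: max-length packet case hit at %t", $time);
`endif
```

A 'grep COVERAGE logfile | sort -u' after the suite finishes then lists exactly which instrumented cases were actually hit.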
6.0 Maximize Controllability
Control the design and the testbench. Can you inject data at almost any point in the design? Can you inject data only in the testbench, or can you do it in the real thing? Can you bypass any module? Can you disable modes? Can you emulate certain things at interfaces so that you can still simulate operation when certain external modules are absent (e.g. test "stubs" or behavioral models)?
7.0 Maximize Observability
Be able to observe data and operation anywhere in the design. Can you extract data at all major interfaces in the design (without constantly having to use a waveform viewer!)? Can you include test modes in the actual design that allow data streams or signals to be routed out to observable pins on the chip? Can you observe important events and signals easily? Can you observe those events and signals from C code running on your processor?
Consider adding HDL "monitors" to your testbench. For example, within your actual HDL code, you might include code like this:
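A representative monitor sketch (the signal names here are invented) is:

```verilog
// Illustrative monitor embedded in the RTL, off by default.
parameter MONITOR_ON = 0;
always @(posedge clk)
  if (MONITOR_ON && fifo_wr && fifo_full)
    $display("%t MONITOR: write to full FIFO in %m", $time);
```

The %m format prints the hierarchical instance name, so one monitor definition reports correctly from every instance. A testbench can then switch it on with something like 'defparam tb.dut.u_fifo.MONITOR_ON = 1;' (the hierarchical path here is hypothetical).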
You can easily enable this monitor code from a top-level testbench by using a defparam statement. Be careful: monitors embedded in the RTL may not be available during a gate-level simulation.
Counters can also greatly increase observability. Consider adding some counters within the actual design. Counters can count major system events, errors, etc., and are accessible to software running on a processor or even to HDL code. Counters cost area. One trick is to use a single test counter in a particular module but provide some control bits that select the event to be counted (another control bit can clear the counter).
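A sketch of that single-counter trick (event and register names are invented):

```verilog
// Hypothetical selectable event counter: one counter, many events.
reg  [15:0] evt_count;
wire        evt = (cfg_sel == 2'd0) ? crc_err   :
                  (cfg_sel == 2'd1) ? pkt_drop  :
                  (cfg_sel == 2'd2) ? fifo_ovfl : retry;
always @(posedge clk)
  if (cfg_clear)   evt_count <= 16'd0;
  else if (evt)    evt_count <= evt_count + 16'd1;
```

One 16-bit counter plus a few mux bits costs far less area than four dedicated counters, at the price of counting only one event type at a time.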
8.0 C Code anyone?
Many ASICs these days include or are controlled by a processor running C code. Some ASIC folks loathe C code, but consider the advantages of using C code in your testbenches in addition to your normal simulation HDL. Many people use Bus Functional Models (BFMs) to emulate the C code and the processor. BFMs emulate the reading and writing of memory and I/O addresses and can be made to emulate the ultimate C code exactly. However, using actual C code brings you closer to what the software engineer will actually be doing. Also, C code used in design verification tests can be reused on the lab bench.
C code can be brought into the HDL simulation using behavioral/RTL/gate-level processor models, cross compilers and linkers, and the Verilog $readmemh system task for loading code images into memory models. There are also commercial tools catering to co-simulation requirements, such as Seamless from Mentor. Such tools allow you to simultaneously use, for example, a source-level debugger for the C code running on a processor model alongside the traditional HDL simulator. Simpler, more homebrew methodologies exist, such as using PLI routines. Obviously, there are significant tool issues involved in bringing HDL and C code together, but there are high payoffs.
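The $readmemh step of that flow is simple; as a sketch (the file name and memory size are just examples):

```verilog
// Sketch: loading a cross-compiled C image into a behavioral memory model.
reg [31:0] code_mem [0:16383];   // 64 KB of instruction memory
initial begin
  $readmemh("firmware.hex", code_mem);  // hex image from the cross-linker
  $display("%t INFO: firmware image loaded", $time);
end
```

The cross compiler/linker output just needs a final conversion to the plain hex format $readmemh expects, and the processor model then fetches from code_mem as if it were the real ROM.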
9.0 The LAB..
Can all the hard work you expended creating these tests in the simulation world transition to the lab? Can the C code (if applicable) be easily ported to the real platform? Can data created in the simulation environment be introduced into the lab environment? Can live data be extracted and compared against the simulated data?
10.0 RTL, Gates, Timing.. Whatever..
ASIC folks are called upon to verify the design at various stages in the flow. Not only must the RTL code be tested, but often the gate-level netlist must be tested as well. The netlist can be tested before and after scan insertion, with back-annotated timing, and so on. Modern methodologies relying on static timing analysis sometimes suggest that gate-level simulation is unnecessary, but some amount of gate-level simulation should be considered in order to catch subtle synthesis problems and modeling issues. Make sure the tests can be run on these netlists as well as on the RTL. Sometimes this means that certain HDL monitors may not work well with a gate-level netlist, and embedding actual counters in the design is the better approach. Just be careful about how you construct the tests!