







Image removed due to copyright restrictions.

- Adding quality inspectors ("verification engineers") and giving them better tools was not the solution
- The Japanese auto industry showed the way
  - "Zero defect" manufacturing

February 22, 2005













## Example from a commercially available FIFO IP component

An error occurs if a push is attempted while the FIFO is full.

Thus, there is no conflict in a simultaneous push and pop when the FIFO is full. A simultaneous push and pop cannot occur when the FIFO is empty, since there is no pop data to prefetch. However, push data is captured in the FIFO.

A pop operation occurs when `pop_req_n` is asserted (LOW), as long as the FIFO is not empty. Asserting `pop_req_n` causes the internal read pointer to be incremented on the next rising edge of `clk`. Thus, the RAM read data must be captured on the `clk` following the assertion of `pop_req_n`.

Interface signals (from the block diagram): inputs `data_in`, `push_req_n`, `pop_req_n`, `clk`, `rstn`; outputs `data_out`, `full`, `empty`.

These constraints are taken from several paragraphs of documentation, spread over many pages and interspersed with other text.
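The push/pop constraints quoted above can be collected into one small behavioral model. The following is only a sketch under stated assumptions: the class name `FifoModel`, the `depth` parameter, and the `step` method are hypothetical, and the choice to drop the data of a rejected push is an assumption that the quoted text does not settle.

```python
from collections import deque


class FifoModel:
    """Cycle-level sketch of the quoted push/pop constraints (not the vendor's RTL)."""

    def __init__(self, depth):
        self.depth = depth      # hypothetical parameter: number of entries
        self.items = deque()

    @property
    def full(self):
        return len(self.items) == self.depth

    @property
    def empty(self):
        return len(self.items) == 0

    def step(self, push_req_n=1, pop_req_n=1, data_in=None):
        """Advance one clock edge. Requests are active LOW (0 means asserted).

        Returns (popped_value_or_None, error_flag).
        """
        push = push_req_n == 0
        # A pop only takes effect when the FIFO is not empty.
        pop = pop_req_n == 0 and not self.empty
        # Pushing into a full FIFO is an error unless a pop frees a slot.
        error = push and self.full and not pop
        popped = self.items.popleft() if pop else None
        if push and not error:
            # Assumption: the data of a rejected push is simply dropped.
            self.items.append(data_in)
        return popped, error
```

Writing the constraints as a single `step` function makes the corner cases explicit: a simultaneous push and pop on a full FIFO succeeds, while on an empty FIFO only the push takes effect.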








Courtesy of BlueSpec Inc. Used with permission.




## Programming with rules: A simple example

Euclid's algorithm for computing the Greatest Common Divisor (GCD):

| First value | Second value | Rule applied |
|-------------|--------------|--------------|
| 15          | 6            |              |
| 9           | 6            | subtract     |
| 3           | 6            | subtract     |
| 6           | 3            | swap         |
| 3           | 3            | subtract     |
| 0           | 3            | subtract     |

Answer: 3
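The trace above can be reproduced with a small rule-based sketch: each rule is a guard plus an action, and the loop fires any enabled rule until none applies. The names `gcd_by_rules`, `a`, and `b` are illustrative, the guards are inferred from the trace rather than taken from the slides, and non-negative integer inputs are assumed.

```python
def gcd_by_rules(a, b):
    """Euclid's algorithm as guarded rules; returns (answer, trace)."""
    # Each rule: (name, guard, action). Guards are inferred from the table.
    rules = [
        ("swap", lambda a, b: a < b and a != 0, lambda a, b: (b, a)),
        ("subtract", lambda a, b: a >= b and b != 0, lambda a, b: (a - b, b)),
    ]
    trace = [(a, b, None)]
    while True:
        for name, guard, action in rules:
            if guard(a, b):
                a, b = action(a, b)
                trace.append((a, b, name))
                break
        else:
            # No rule is enabled: one value is zero, the other is the answer.
            return (a or b), trace


answer, trace = gcd_by_rules(15, 6)
for first, second, rule in trace:
    print(first, second, rule or "")
print("answer:", answer)
```

The two guards are mutually exclusive, so at most one rule can fire in any state; that is what makes the order of the rule list irrelevant here.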




## Generated Verilog RTL: GCD

```
module mkGCD(CLK, RST_N, start__1, start__2, E_start_, ...)
  input CLK; ...
  output start__rdy; ...
  wire [31 : 0] x$get; ...
  assign result_ = x$get;
  assign _d5 = y$get == 32'd0;
  assign _d3 = (x$get ^ 32'h80000000) <= (y$get ^ 32'h80000000);
  assign C___2 = _d3 && !_d5;
  assign x$set = E_start_ | P___1;
  assign x$set_1 = P___1 ? y$get : start__1;
  assign P___2 = _d3 && !_d5;
  assign y$set_1 =
      {32{P___2}} & y$get - x$get | {32{_dt1}} & x$get |
      {32{_dt2}} & start__2;
  RegUN #(32) i_x(.CLK(CLK), .RST_N(RST_N), .val(x$set_1), ...)
  RegN #(32) i_y(.CLK(CLK), .RST_N(RST_N), .init(32'd0),
endmodule
```
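The `_d3` line XORs both operands with `32'h80000000` before comparing. Flipping the sign bit in this way is a standard trick that makes an unsigned comparator give the signed (two's complement) ordering. A minimal check of that equivalence, with hypothetical helper names, might look like this:

```python
MASK32 = 0xFFFFFFFF
SIGN_BIT = 0x80000000


def to_signed32(bits):
    """Interpret a 32-bit pattern as a two's complement integer."""
    bits &= MASK32
    return bits - (1 << 32) if bits & SIGN_BIT else bits


def signed_le_via_flip(x_bits, y_bits):
    """Signed x <= y, computed with an unsigned compare after flipping the sign bits."""
    return ((x_bits ^ SIGN_BIT) & MASK32) <= ((y_bits ^ SIGN_BIT) & MASK32)


# Compare against ordinary signed arithmetic on a few boundary patterns.
samples = [0x00000000, 0x00000001, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
for x in samples:
    for y in samples:
        assert signed_le_via_flip(x, y) == (to_signed32(x) <= to_signed32(y))
```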








## SW ("C") version of LPM

```
int lpm (IPA ipa)                  /* 3 memory lookups */
{
    int p;
    p = RAM [ipa[31:16]];          /* Level 1: 16 bits */
    if (isLeaf(p)) return p;
    p = RAM [p + ipa [15:8]];      /* Level 2: 8 bits */
    if (isLeaf(p)) return p;
    p = RAM [p + ipa [7:0]];       /* Level 3: 8 bits */
    return p;                      /* must be a leaf */
}
```

How to implement LPM in HW? Not obvious from C code!
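For readers who want to run the three-level lookup, here is a sketch in Python with the bit slices written as shifts and masks. The memory layout is entirely an assumption made for illustration: `ram` is a sparse dictionary, entries are `("leaf", result)` or `("ptr", base_address)` tuples, missing entries fall back to a hypothetical `DEFAULT` leaf, and the port names are invented.

```python
DEFAULT = ("leaf", "default-route")  # hypothetical result for unpopulated entries


def read(ram, addr):
    """One memory lookup; unpopulated addresses return the default leaf."""
    return ram.get(addr, DEFAULT)


def is_leaf(entry):
    return entry[0] == "leaf"


def lpm(ram, ipa):
    """Longest-prefix match with at most 3 memory lookups."""
    p = read(ram, (ipa >> 16) & 0xFFFF)         # Level 1: top 16 bits
    if is_leaf(p):
        return p[1]
    p = read(ram, p[1] + ((ipa >> 8) & 0xFF))   # Level 2: next 8 bits
    if is_leaf(p):
        return p[1]
    p = read(ram, p[1] + (ipa & 0xFF))          # Level 3: last 8 bits
    return p[1]                                 # must be a leaf


def ip(a, b, c, d):
    """Pack a dotted-quad address into a 32-bit integer."""
    return (a << 24) | (b << 16) | (c << 8) | d


# Illustrative table: level 1 occupies addresses 0x0000-0xFFFF.
ram = {
    0x0A01: ("ptr", 0x10000),            # 10.1.x.x   -> level-2 table
    0x0A02: ("leaf", "port-A"),          # 10.2.x.x
    0x10000 + 0x02: ("ptr", 0x10100),    # 10.1.2.x   -> level-3 table
    0x10000 + 0x07: ("leaf", "port-B"),  # 10.1.7.x
    0x10100 + 0x03: ("leaf", "port-C"),  # 10.1.2.3
}

print(lpm(ram, ip(10, 1, 2, 3)))
```

The number of lookups depends on the data, between one and three per address, which is part of why the hardware structure is not obvious from the sequential code.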














## But, what about all the potential race conditions?

- Reading from the register file at the same time a separate instruction is writing back to the same location
  - Which value to read?
- An instruction is being inserted into the ROB simultaneously with a dependent upstream instruction's result coming back from an ALU
  - Put a tag or the value in the operand slot?
- An instruction is being inserted into the ROB simultaneously with a branch mis-prediction
  - Must kill the mis-predicted instructions and restore a "consistent state" across many modules
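The first question, which value a read should see when a write-back hits the same register in the same cycle, can be made concrete with a toy model. This is only an illustration of the two possible answers, not a description of any particular design; the `RegFile` class and its `bypass` flag are hypothetical.

```python
class RegFile:
    """Toy register file with one read and one optional write-back per cycle."""

    def __init__(self, size, bypass):
        self.regs = [0] * size
        self.bypass = bypass  # True: a same-cycle read sees the value being written

    def cycle(self, read_addr, write_addr=None, write_data=None):
        """Perform one clock cycle and return the value the read port delivers."""
        old_value = self.regs[read_addr]
        if write_addr is not None:
            self.regs[write_addr] = write_data
        if self.bypass and write_addr == read_addr:
            return write_data  # forward the new value to the reader
        return old_value       # reader sees the state from before this cycle


no_bypass = RegFile(size=4, bypass=False)
with_bypass = RegFile(size=4, bypass=True)
print(no_bypass.cycle(read_addr=1, write_addr=1, write_data=42))
print(with_bypass.cycle(read_addr=1, write_addr=1, write_data=42))
```

Both policies are self-consistent; the difficulty raised on the slide is that every module touching the register file must agree on which one is in force.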



## Synthesizable model of IA64

CMU-Intel collaboration

- Develop an Itanium μarch model that is
  - concise and malleable
  - executable and synthesizable
- FPGA Prototyping
  - XC2V6000 FPGA interfaced to P6 memory bus
  - Executes binaries natively against a real PC environment (i.e., memory & I/O devices)
- An evaluation vehicle for:
  - Functionality and performance: a fast μarchitecture emulator to run real software
  - Implementation: a synthesizable description to assess feasibility, design complexity and implementation cost

Roland Wunderlich & James Hoe @ CMU; Steve Hynal (SCL) & Shih-Lien Liu (MRL)



