

# **Summer Training Presentation**

Student: Po-Jen Chen

Advisor: Professor Tsung-Te Liu

Date: 2019.09.12



#### **Outline**

- Frequency Analysis System (FAS) Block Diagram
  - Finite Impulse Response Filter (FIR)
  - Fast Fourier Transform (FFT)
  - Analysis
- Implementation Result
- Conclusion



## **FAS Block Diagram**





### Finite Impulse Response Filter (FIR)

- Using the transpose property of the filter to reduce the number of adders
- Pipelining multiplication and addition to shorten critical path
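
As a rough behavioral illustration of the transpose idea, the sketch below compares a direct-form FIR with a transposed-form FIR. The function names and the example coefficients are assumptions made for illustration, not the RTL used in this design. It checks functional equivalence only and does not model adder count, bit widths, or timing.

```python
def fir_direct(x, h):
    """Direct form: y[n] = sum over k of h[k] * x[n - k]."""
    y = []
    for n in range(len(x)):
        acc = 0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y


def fir_transposed(x, h):
    """Transposed form: every tap multiplies the current input sample,
    and partial sums travel through a chain of registers."""
    regs = [0] * (len(h) - 1)  # partial-sum registers, initially cleared
    y = []
    for sample in x:
        products = [hk * sample for hk in h]
        # Output is the first product plus the first partial-sum register.
        y.append(products[0] + (regs[0] if regs else 0))
        # Each register takes its tap product plus the next register's old value.
        regs = [
            products[i + 1] + (regs[i + 1] if i + 1 < len(regs) else 0)
            for i in range(len(regs))
        ]
    return y


if __name__ == "__main__":
    x = [1, 2, 3, 4]   # hypothetical input samples
    h = [1, 2, 3]      # hypothetical coefficients
    print(fir_direct(x, h))
    print(fir_transposed(x, h))
```

In the transposed form each adder sits between two registers, so the longest combinational path is one multiplication plus one addition regardless of the number of taps.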





## **Fast Fourier Transform (FFT)**

- Three complex multiplications are saved by choosing a suitable architecture $(10 \rightarrow 7)$.
- With a serial-to-parallel (S2P) buffer, I have about 15 cycles to compute the FFT. However, my FFT module currently uses 4 multipliers.
- With appropriate scheduling, only two multipliers are needed: 7 × 4 = 28 multiplications take 14 cycles on two multipliers, which fits in 15 cycles.
- Pipelining multiplication and addition to shorten the critical path
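
The multiplier budget above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes each complex multiplication costs four real multiplications (as the 7 × 4 figure implies) and ignores pipeline latency. The function names are hypothetical.

```python
import math


def complex_mul_4(a, b, c, d):
    """(a + jb) * (c + jd) using four real multiplications."""
    return (a * c - b * d, a * d + b * c)


def cycles_needed(complex_mults, real_per_complex, multipliers):
    """Cycles to issue all real multiplications, one per multiplier per cycle."""
    total_real = complex_mults * real_per_complex
    return math.ceil(total_real / multipliers)


if __name__ == "__main__":
    budget = 15  # approximate cycle budget provided by the S2P buffer
    for n_mult in (4, 2, 1):
        need = cycles_needed(7, 4, n_mult)
        print(n_mult, "multipliers:", need, "cycles, fits:", need <= budget)
```

Under these assumptions two multipliers are the smallest count that still fits the budget, since a single multiplier would need 28 cycles.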





### **Analysis**

- Similarly, I use an S2P buffer to collect the FFT outputs, so I have about 15 cycles to perform the analysis.
- In a single cycle, I compare two of the FFT outputs using 4 multipliers.  $(|y(i)|^2 = a_1^2 + b_1^2 > a_2^2 + b_2^2 = |y(j)|^2)$
  - The more registers are used to hold data, the fewer multipliers are needed.
- Less accuracy is needed in this stage.
- Pipelining multiplication and addition to shorten the critical path
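
The comparison above works on squared magnitudes, so no square root is required. The sketch below shows that comparison in software. The peak search built on top of it is an assumption about what the analysis stage might do with the comparison, and all names are hypothetical.

```python
def mag_sq(re, im):
    """Squared magnitude a^2 + b^2 of a + jb (two real multiplications)."""
    return re * re + im * im


def is_larger(y_i, y_j):
    """True if |y_i|^2 > |y_j|^2; four real multiplications in total."""
    return mag_sq(*y_i) > mag_sq(*y_j)


def peak_index(outputs):
    """Assumed use of the comparison: index of the largest-magnitude output."""
    best = 0
    for i in range(1, len(outputs)):
        if is_larger(outputs[i], outputs[best]):
            best = i
    return best


if __name__ == "__main__":
    fft_out = [(1, 0), (3, 4), (0, -2)]  # hypothetical (real, imag) pairs
    print(peak_index(fft_out))
```

If the squared magnitude of the current best candidate is kept in a register, each new comparison needs only two multiplications, which is one way to read the registers-versus-multipliers trade-off noted above.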



# **Implementation Result**

|                            | RTL | After synthesis | After DFT | Post-layout |
|----------------------------|-----|-----------------|-----------|-------------|
| Clock period (ns/cycle)    | -   | 4.4             | 4.4       | 4.4         |
| Total simulation time (ns) | -   | 4734.4          | 4734.4    | 4734.4      |
| Area (mm²)                 | -   | 0.332797        | 0.392863  | 0.654989    |
| A*T value (ns·mm²)         | -   | 1576            | 1860      | 3101        |
| Fault coverage             | -   | -               | 99.87%    | 99.87%      |
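
The A*T row is the product of area and total simulation time, rounded to the nearest integer. The short check below recomputes it from the table values. The variable names are illustrative only.

```python
SIM_TIME_NS = 4734.4   # total simulation time from the table
CLOCK_NS = 4.4         # clock period from the table

AREA_MM2 = {
    "After synthesis": 0.332797,
    "After DFT": 0.392863,
    "Post-layout": 0.654989,
}

# A*T value = area (mm^2) * total simulation time (ns)
at_value = {stage: round(SIM_TIME_NS * area) for stage, area in AREA_MM2.items()}

# Number of clock cycles implied by the simulation time and clock period
cycles = round(SIM_TIME_NS / CLOCK_NS)

if __name__ == "__main__":
    for stage, value in at_value.items():
        print(stage, value)
    print("cycles:", cycles)
```

Because the simulation time is identical in every column, the growth in A*T from synthesis to post-layout comes entirely from the area increase.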



#### **Conclusion**

- Don't leave any input/output port floating.
  - Otherwise, clock tree synthesis cannot complete successfully.
  - Remember to buffer all input/output ports.
- The cost of multiplication is much higher than that of other operations.
  - Try to reduce the number of multipliers used in every design, especially wide (large bit-width) multipliers.
  - Appropriate scheduling can optimize a design efficiently.
- P&R can be unpredictably time-consuming.
  - When any step in P&R fails, remember to look for the error or warning message hidden in the long log text.
  - But sometimes you will have no idea what the error message means.
  - Leave adequate time for the back-end process.