Hello Dear Readers,
Today, in this series of posts, I will provide some insight into the low-power VLSI design flow and the different techniques used to reduce the various components of power consumption.
It’s no secret that power is emerging as the most critical issue in system-on-chip
(SoC) design today. Power management is becoming an increasingly urgent
problem for almost every category of design, as power density—measured in
watts per square millimeter—rises at an alarming rate.
From a chip-engineering perspective, effective energy management for an SoC must be built into the design starting at the architecture stage, and low-power techniques need to be employed at every stage of the design, throughout the RTL-to-GDSII flow.
Fred Pollack of Intel first noted this trend in his keynote at MICRO-32 in 1999. He made the now well-known observation that power density is rising rapidly, approaching that of the hottest man-made objects on the planet, and graphed it as shown in Fig. 1.
Fig. 1: Power density with shrinking geometry
The power density trend versus the power requirements of modern SoC designs is shown in the image below. The widening gap between the two represents the most critical challenge that designers of wireless, consumer, portable, and other electronic products face today.
Meanwhile, the design effort spent on managing power is rising, because designs must now be optimized for low power as well as for performance and cost. This has ramifications for engineering productivity, as it impacts schedules and risk.
Power management is a must for all designs of 90nm and below. Aggressive management of leakage current can significantly impact design and implementation choices at smaller geometries. Indeed, for some designs and libraries, leakage
current exceeds switching currents, thus becoming the primary source of power
dissipation in CMOS, as shown in Fig. 2.
Fig. 2: Process technology vs. leakage and dynamic power
Until recently, designers were primarily concerned with improving the performance
of their designs (throughput, latency, frequency), and reducing silicon area to
lower manufacturing costs. Now power is replacing performance as the key
competitive metric for SoC design.
These power challenges affect almost all SoC designs. With the explosive growth
of personal, wireless, and mobile communications, as well as home electronics,
comes the demand for high-speed computation and complex functionality for
competitive reasons. Today’s portable products are expected not only to be small,
cool, and lightweight but also to provide extremely long battery life. And even
wired communications systems must pay attention to heat, power density, and
low-power requirements. Among the products requiring low-power management
are the following:
- Consumer, wireless, and handheld devices: cell phones, personal digital
assistants (PDAs), MP3 players, global positioning system (GPS) receivers,
and digital cameras
- Home electronics: game consoles, DVD/VCR players, digital media recorders, cable and satellite television set-top boxes, and network and telecom devices
- Tethered electronics such as servers, routers, and other products bound by
packaging costs, cooling costs, and Energy Star requirements supporting the
Green movement to combat global warming
For most designs being developed today, the emphasis on active low-power management—as well as on performance, area, and other concerns—is increasing.
Power Management:
Power Dissipation in CMOS
Let’s take a quick look at the sources of power dissipation. Total power is a
function of switching activity, capacitance, voltage, and the transistor structure
itself.
Fig. 3: Power dissipation in CMOS
Total power is the sum of dynamic and leakage power.
Dynamic power is the sum of two factors: switching power plus short-circuit power.
Switching power is dissipated when charging or discharging internal and net
capacitances. Short-circuit power is dissipated through the momentary conducting path between the supply voltage and ground while a gate switches state.
Fig. 4: Dynamic power in CMOS
Dynamic power can be lowered by reducing switching activity and clock frequency (which affects performance), and also by reducing capacitance and supply voltage. Cell selection helps as well: cells with faster output slew spend less time in the transition region and therefore dissipate less short-circuit power.
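For reference, the standard first-order expression for dynamic power makes these knobs explicit (a textbook formulation, not tied to any particular library or tool):

$$P_{dyn} = P_{switch} + P_{sc} \approx \alpha \, C_L \, V_{dd}^{2} \, f_{clk} \;+\; t_{sc} \, V_{dd} \, I_{peak} \, f_{clk}$$

Here α is the switching activity, C_L the switched capacitance, f_clk the clock frequency, t_sc the duration of the short-circuit current, and I_peak its peak value. The quadratic dependence on Vdd is why voltage reduction is the most powerful lever.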
Leakage power is a function of the supply voltage Vdd, the switching threshold
voltage Vth, and the transistor size.
Fig. 5: Leakage power in CMOS
Of the following leakage components, sub-threshold leakage is dominant:
- Diode reverse-bias current
- Sub-threshold current
- Gate-induced drain leakage
- Gate oxide leakage
While dynamic power is dissipated only when switching, leakage power is drawn continuously and must be dealt with using design techniques.
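The dominance of the sub-threshold component comes from its exponential dependence on the threshold voltage. A common device-level approximation (with V_T = kT/q the thermal voltage and n the sub-threshold slope factor) is:

$$P_{leak} \approx V_{dd} \cdot I_{sub}, \qquad I_{sub} = I_{0}\, e^{\frac{V_{GS}-V_{th}}{n V_{T}}} \left(1 - e^{-\frac{V_{DS}}{V_{T}}}\right)$$

This exponential sensitivity to Vth is exactly what the multi-Vth and substrate-bias techniques described below exploit.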
Techniques for Switching and Leakage Power Reduction:
The following sections describe some common power management techniques for reducing switching and leakage power:
Clock tree optimization and clock gating:
In normal operation, the clock signal continues to toggle at every clock cycle,
whether or not its registers are changing. Clock trees are a large source of
dynamic power because they switch at the maximum rate and typically drive large capacitive loads. If data is loaded into registers only infrequently, a significant amount of clock power is wasted. By shutting off the clock to registers or blocks that do not need to be active, clock gating ensures that this power is not dissipated during idle time.
Clock gating can occur at the leaf level (at the register) or higher up in the clock
tree. When clock gating is done at the block level, the entire clock tree for the
block can be disabled. The resulting reduction in clock network switching becomes
extremely valuable in reducing dynamic power.
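A rough back-of-the-envelope estimate shows why this matters. The sketch below is a minimal first-order model in Python; the capacitance, supply voltage, frequency, and the 10 percent enable ratio are assumed values for illustration only.

```python
# First-order estimate of clock-tree power saved by clock gating.
# All numbers below are illustrative assumptions, not library data.

def clock_power(c_load_f: float, vdd: float, f_hz: float, enable_ratio: float = 1.0) -> float:
    """Dynamic power of a clock net: one full charge/discharge of C per active cycle."""
    return c_load_f * vdd ** 2 * f_hz * enable_ratio

C_CLK = 50e-12   # 50 pF of clock-tree and register clock-pin capacitance (assumed)
VDD   = 1.0      # supply voltage in volts (assumed)
F_CLK = 500e6    # 500 MHz clock (assumed)

always_on = clock_power(C_CLK, VDD, F_CLK)                    # clocked every cycle
gated     = clock_power(C_CLK, VDD, F_CLK, enable_ratio=0.1)  # data loaded ~10% of cycles

print(f"ungated clock power: {always_on * 1e3:.1f} mW")
print(f"gated clock power  : {gated * 1e3:.1f} mW")
print(f"saving             : {(1 - gated / always_on) * 100:.0f} %")
```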
Operand Isolation:
Often, datapath computation elements are sampled only periodically. This
sampling is controlled by an enable signal. When the enable is inactive, the
datapath inputs can be forced to a constant value. The result is that the datapath
will not switch, saving dynamic power.
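A quick behavioral sketch makes the effect visible. The Python model below counts the bit toggles seen by a downstream datapath operator with and without AND-gating the operands; the random input stream and the 10 percent sampling pattern are made-up illustration values.

```python
# Behavioral sketch of operand isolation: while 'enable' is low, AND gates hold the
# datapath inputs at zero, so the downstream operator sees no switching activity.
import random

random.seed(0)
WIDTH = 16

def count_toggles(values):
    """Total number of bit flips between successive input words."""
    return sum(bin(a ^ b).count("1") for a, b in zip(values, values[1:]))

samples = [random.getrandbits(WIDTH) for _ in range(1000)]
enables = [i % 10 == 0 for i in range(1000)]     # result is sampled only every 10th cycle

raw      = samples                                              # no isolation
isolated = [v if en else 0 for v, en in zip(samples, enables)]  # inputs gated with enable

print("toggles without isolation:", count_toggles(raw))
print("toggles with isolation   :", count_toggles(isolated))
```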
Multi-Vth:
Multi-Vth optimization utilizes gates with different thresholds to optimize for power,
timing, and area constraints. Most library vendors provide libraries that have cells
with different switching thresholds. A good synthesis tool for low-power
applications is able to mix available multi-threshold library cells to meet speed and
area constraints with the lowest power dissipation. This complex task optimizes for
multiple variables and so is automated in today’s synthesis tools.
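As a rough illustration of what such a tool does internally, here is a toy greedy pass in Python: start from fast but leaky low-Vth cells everywhere and swap to high-Vth wherever the timing slack allows. The delay and leakage numbers, the cell names, and the flat one-cell-per-path model are all assumptions made for the example.

```python
# Toy sketch of multi-Vth optimization: swap cells from LVT (fast, leaky) to HVT
# (slower, low-leakage) wherever the available path slack absorbs the extra delay.

CELL = {                 # assumed characterization: (delay_ps, leakage_nw)
    "LVT": (40, 50.0),
    "HVT": (60, 5.0),
}

paths = [                # (cell_name, slack_ps) with everything mapped to LVT initially
    ("u_alu",  5),
    ("u_dec", 80),
    ("u_mux", 30),
    ("u_ctl", 120),
]

delay_penalty = CELL["HVT"][0] - CELL["LVT"][0]

choice = {}
for name, slack in paths:
    # Swap to HVT only if the extra delay still leaves non-negative slack.
    choice[name] = "HVT" if slack >= delay_penalty else "LVT"

leakage = sum(CELL[vt][1] for vt in choice.values())
print(choice)                        # only the timing-critical cell stays on LVT
print(f"total leakage: {leakage:.0f} nW")
```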
MSV:
Multi-supply voltage techniques operate different blocks at different voltages.
Running at a lower voltage reduces power consumption, but at the expense of
speed. Designers use different supply voltages for different parts of the chip based
on their performance requirements. MSV implementation is key to reducing power
since lowering the voltage has a squared effect on active power consumption.
MSV techniques require level shifters on signals that go from one voltage level to
another. Without level shifters, signals that cross voltage levels will not be sampled
correctly.
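A quick worked example shows the payoff of the squared dependence (the voltages are illustrative):

$$\frac{P_{0.9\,V}}{P_{1.2\,V}} = \left(\frac{0.9}{1.2}\right)^{2} \approx 0.56$$

That is, running a non-critical block at 0.9 V instead of 1.2 V cuts its switching power by roughly 44 percent at the same activity and frequency.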
DVS/DVFS/AVFS:
Dynamic voltage and frequency scaling (DVFS) techniques—along with
associated techniques such as dynamic voltage scaling (DVS) and adaptive
voltage and frequency scaling (AVFS)—are very effective in reducing power, since
lowering the voltage has a squared effect on active power consumption. DVFS
techniques provide ways to reduce power consumption of chips on the fly by
scaling down the voltage (and frequency) based on the targeted performance
requirements of the application. Since DVFS optimizes both the frequency and the voltage, it is one of the few techniques that is highly effective against both dynamic and static power.
Dynamic voltage scaling is a subset of DVFS that dynamically scales down the
voltage (only) based on the performance requirements.
Adaptive voltage and frequency scaling is an extension of DVFS. In DVFS, the
voltage levels of the targeted power domains are scaled in fixed discrete voltage
steps. Frequency-based voltage tables typically determine the voltage levels. It is
an open-loop system with large margins built in, and therefore the power reduction
is not optimal. On the other hand, AVFS deploys closed-loop voltage scaling and
is compensated for variations in temperature, process, and IR drop using
dedicated circuitry (typically analog in nature) that constantly monitors
performance and provides active feedback. Although the control is more complex,
the payoff in terms of power reduction is higher.
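To make the idea concrete, here is a minimal sketch of a DVFS policy in Python: pick the slowest operating point that still meets the current performance demand, and compare its dynamic power (proportional to f·Vdd²) with always running at the top point. The operating-point table and workload numbers are assumptions for illustration, not values from any real part.

```python
# Minimal sketch of a DVFS policy over a table of voltage/frequency operating points.

OPP = [            # (frequency_mhz, vdd_volts), sorted from slowest to fastest (assumed)
    (200, 0.80),
    (400, 0.90),
    (600, 1.00),
    (800, 1.10),
]

def pick_opp(demand_mhz: float):
    """Return the slowest operating point that satisfies the performance demand."""
    for f, v in OPP:
        if f >= demand_mhz:
            return f, v
    return OPP[-1]                  # saturate at the fastest point

def rel_dyn_power(f_mhz: float, vdd: float) -> float:
    """Dynamic power relative to the fastest point (activity and capacitance held constant)."""
    f_max, v_max = OPP[-1]
    return (f_mhz * vdd ** 2) / (f_max * v_max ** 2)

for demand in (150, 350, 750):      # made-up per-phase performance demands in MHz
    f, v = pick_opp(demand)
    print(f"demand {demand:>3} MHz -> run at {f} MHz / {v:.2f} V, "
          f"~{rel_dyn_power(f, v) * 100:.0f}% of max dynamic power")
```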
Power Shutoff (PSO):
One of the most effective techniques, PSO—also called power gating—switches
off power to parts of the chip when these blocks are not in use. This technique is
increasingly being used in the industry and can eliminate up to 96 percent of the
leakage current.
Power gating is employed to shut off power in standby mode. A specific power-down sequence is needed, which includes isolating the signals coming from the shut-down domain. Erroneous power-up/down sequences are the root cause of errors that can force a chip re-spin. This sequencing needs to be correctly and exhaustively verified along with the functional RTL, to ensure that the chip functions correctly with sections turned off and that the system can recover after powering these units back up.
Deploying power shutoff also requires isolation logic and possibly retention of key state elements, in other words, state retention power gating (SRPG). For multi-supply voltage (MSV) designs, level shifters are also needed.
Isolation logic is typically used at the output of a powered-down block to prevent
floating, unpowered signals (represented by unknown or X in simulation) from
propagating from powered-down blocks.
The outputs of blocks being powered down need to be isolated before power can
be switched off; and they need to remain isolated until after the block has been
fully powered up. Isolation cells are placed between two power domains and are
typically connected from domains powered off to domains that are still powered
up.
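Behaviorally, an isolation cell is just a clamp. The small Python sketch below models an AND-style cell that forces its output to 0 while the isolation enable is asserted; the signal values and the use of the string "X" for an unknown level are purely illustrative.

```python
# Behavioral sketch of an isolation cell: while iso_en is asserted, the output is
# clamped to a safe constant instead of the floating/"X" value from the off domain.

def iso_clamp_low(sig, iso_en: bool):
    """AND-style isolation cell: clamp to 0 while iso_en is asserted."""
    return 0 if iso_en else sig

driving_domain_out = "X"                                  # driving domain is powered down
print(iso_clamp_low(driving_domain_out, iso_en=True))     # -> 0, the X does not propagate

driving_domain_out = 1                                    # normal operation
print(iso_clamp_low(driving_domain_out, iso_en=False))    # -> 1, the real value passes
```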
In some cases, isolation cells may need to be placed at the block inputs to prevent connection to powered-down logic. If the driving domain can be OFF while the receiving domain is ON, the receiving domain needs to be protected by isolation. The isolation cells may be located in the driving domain, using special isolation cells, or they may be in the receiving domain.
State Retention:
In certain cases, the state of key control flops needs to be retained during power-off. To speed power-up recovery, state retention power gating (SRPG) flops can
be used. These retain their state while the power is off, provided that specific
control signaling requirements are met.
Cell libraries today include such special state retention cells. A key area of
verification is checking that these library-specific requirements have been satisfied
and the flop will actually retain its state.
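As a rough behavioral picture of what such a cell does, the Python sketch below models a retention flop with a shadow latch on an always-on supply; the SAVE/RESTORE control names and the simple sequencing are assumptions, since the exact requirements are library-specific.

```python
# Behavioral sketch of a state retention power gating (SRPG) flop: a shadow latch on an
# always-on rail keeps the value across power-off, provided SAVE is pulsed before the
# main supply is removed and RESTORE after it comes back.

class SrpgFlop:
    def __init__(self):
        self.q = 0            # main flop output (lost when the domain powers down)
        self._shadow = None   # always-on retention latch

    def clock(self, d, powered: bool):
        self.q = d if powered else "X"   # normal capture, or unknown when unpowered

    def save(self):
        self._shadow = self.q            # pulse SAVE before power-down

    def restore(self):
        self.q = self._shadow            # pulse RESTORE after power-up

ff = SrpgFlop()
ff.clock(1, powered=True)    # capture a value in normal operation
ff.save()
ff.clock(0, powered=False)   # domain off: main flop contents become unknown
ff.restore()
print(ff.q)                  # -> 1, the retained state
```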
Fig. 6: State retention power gating
Fig. 7: Isolation gate and power-down switch
Power Cycle Sequence:
For power-down, a specific sequence is generally followed: isolation, state
retention, and power shutoff. For the power-up cycle, the opposite sequence needs to be followed. The power-up cycle can also require a specific
reset sequence.
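The ordering itself can be captured in a few lines. The Python sketch below walks one switchable domain through the sequence just described: isolate, retain, shut off, and the reverse on the way back up; the step names and flags are illustrative, not tied to any particular power-intent format.

```python
# Sketch of a legal power-down / power-up ordering for one switchable power domain.

def power_down(domain: dict):
    domain["isolated"] = True      # 1. clamp the outputs first
    domain["retained"] = True      # 2. save key register state (SRPG SAVE)
    domain["powered"]  = False     # 3. open the power switch

def power_up(domain: dict, needs_reset: bool = False):
    domain["powered"]  = True      # 1. close the power switch and let the rail settle
    domain["retained"] = False     # 2. restore saved state (SRPG RESTORE)
    domain["isolated"] = False     # 3. release isolation only after the rail is good
    if needs_reset:                # 4. optional domain reset sequence
        domain["in_reset"] = True
        domain["in_reset"] = False

dom = {"powered": True, "isolated": False, "retained": False}
power_down(dom)
assert dom == {"powered": False, "isolated": True, "retained": True}
power_up(dom, needs_reset=True)
print(dom)                         # back to the normal operating state
```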
Fig. 8: Power-up/down sequence
Given that there are multiple, possibly nested, power domains, coupled with different power sequences, some of which may share common power control signals and multiple levels of gated clocks, the need for verification support is tremendous. The complexity and possible corner cases need to be analyzed thoroughly, and the functional behavior and the power intent must be verified together using advanced verification techniques.
Memory Splitting:
In many systems, the memory capacity is designed for peak usage. During normal
system activity, only a portion of that memory is actually used at any given time. In
many cases, it is possible to divide the memory into two or more sections, and
selectively power down unused sections of the memory.
As SoC memory capacity grows, reducing the power consumed by memories becomes increasingly important.
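As a first-order illustration (the sizes and the per-kilobyte leakage figure below are assumed, not vendor data), splitting a large SRAM into banks and keeping only the needed bank powered scales standby leakage roughly with the powered portion of the array:

```python
# First-order sketch of memory splitting: leakage scales with the amount of SRAM kept
# powered, so shutting off unused banks saves standby power.

LEAK_UW_PER_KB = 2.0   # assumed standby leakage per kilobyte of SRAM

def sram_leakage_uw(total_kb: float, banks: int, banks_in_use: int) -> float:
    """Leakage when only 'banks_in_use' of 'banks' equally sized banks stay powered."""
    bank_kb = total_kb / banks
    return banks_in_use * bank_kb * LEAK_UW_PER_KB

single_macro = sram_leakage_uw(512, banks=1, banks_in_use=1)   # everything always on
split_macro  = sram_leakage_uw(512, banks=4, banks_in_use=1)   # only one bank needed now

print(f"monolithic 512 KB: {single_macro:.0f} uW")
print(f"1 of 4 banks on  : {split_macro:.0f} uW")
```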
Substrate bias (Reverse body bias):
Since leakage currents are a function of device Vth, substrate biasing—also known
as back biasing—can reduce leakage power. With this advanced technique, the
substrate or the appropriate well is biased to raise the transistor thresholds,
thereby reducing leakage. In PMOS, the body of the transistor is biased to a voltage higher than Vdd; in NMOS, the body of the transistor is biased to a voltage lower than Vss.
Since raising Vth also affects performance, an advanced technique allows the bias
to be applied dynamically, so during an active mode of operation the reverse bias
is small, while in standby the reverse bias is stronger.
Area and routing penalties are incurred. An extra pin in the standard cell library is
required and special library cells are necessary. Body-bias cells are placed
throughout the design to provide voltages for transistor bulk. To generate the bias
voltage, a substrate-bias generator is required, which also consumes some
dynamic power, partially offsetting the reduced leakage.
The returns from substrate bias are diminishing at smaller process nodes. At 65nm and below, the body-bias effect decreases, reducing the leakage control benefits. TSMC has published information pointing to a factor of 4x reduction at 90nm, and only 2x moving to 65nm. Consequently, substrate biasing is predicted to be overshadowed by power gating.
In summary, there are a variety of power optimization techniques that attack
dynamic power, leakage, or both. Fig. 9 shows the effect of introducing several
power reduction techniques on a raw RTL design, on both active and static power.
Fig. 9: Power reduction techniques