Hello Dear Readers,
Today, I will explain what High-Level Synthesis (HLS) is.
The hardware design process has evolved significantly over the years. When circuits were small, hardware designers could specify every transistor, how the transistors were wired together, and their physical layout. Everything was done manually. As our ability to manufacture more transistors increased, hardware designers began to rely on automated design tools to help them create circuits. These tools gradually became more sophisticated and allowed hardware designers to work at higher levels of abstraction and thus become more efficient. Rather than specify the layout of every transistor, a hardware designer could instead specify digital circuits and have electronic design automation (EDA) tools automatically translate these more abstract specifications into a physical layout. The Mead and Conway approach of using a programming language (e.g., Verilog or VHDL) that compiles a design into physical chips took hold in the 1980s. Since that time, hardware complexity has continued to increase at an exponential rate, which forced hardware designers to move to even more abstract hardware programming languages. Register-transfer level (RTL) was one step in abstraction, enabling a designer to simply specify the registers and the operations performed on those registers, without considering how the registers and operations are eventually implemented.
EDA tools can translate RTL specifications into a digital circuit model and then into the detailed specification for a device that implements the digital circuit. This specification might be the files necessary to manufacture a custom device or the files necessary to program an off-the-shelf device, such as a field-programmable gate array (FPGA). Ultimately, the combination of these abstractions enables designers to build extraordinarily complex systems without getting lost in the details of how they are implemented. High-level synthesis (HLS) is yet another step in abstraction, one that enables a designer to focus on larger architectural questions rather than individual registers and cycle-to-cycle operations. The designer captures behavior in a program that does not include specific registers or cycles, and an HLS tool creates the detailed RTL micro-architecture. One of the first tools to implement such a flow was based on behavioral Verilog and generated an RTL-level architecture also captured in Verilog. Many commercial tools now use C/C++ as the input language. For the most part, the language is unimportant, assuming that you have a tool that accepts the program you want to synthesize! Fundamentally, algorithmic HLS does several things automatically that an RTL designer does manually:
• HLS analyzes and exploits the concurrency in an algorithm.
• HLS inserts registers as necessary to limit critical paths and achieve the desired clock frequency.
• HLS generates control logic that directs the data path.
• HLS implements interfaces to connect to the rest of the system.
• HLS maps data onto storage elements to balance resource usage and bandwidth.
• HLS maps computation onto logic elements, performing user-specified and automatic optimizations to achieve the most efficient implementation.
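To make the list above concrete, here is a minimal sketch of the kind of C++ function a designer might hand to an HLS tool. The function name `fir`, the constant `NUM_TAPS`, and the coefficient values are illustrative assumptions, not taken from any particular tool or design. Notice that the code mentions no clock, no registers, and no control state machine; those are the decisions the tool is expected to make.

```cpp
#include <cstdint>

constexpr int NUM_TAPS = 4;  // hypothetical filter length, chosen for illustration

// Computes one output sample of a small FIR filter per call.
// The code describes behavior only: nothing here says in which clock
// cycle a multiply happens or where a pipeline register is placed.
int32_t fir(int32_t sample) {
    // Illustrative coefficients (an assumption, not from a real design).
    static const int32_t coeff[NUM_TAPS] = {1, 2, 2, 1};
    // Previous samples; an HLS tool would map this storage onto
    // registers or on-chip memory.
    static int32_t delay_line[NUM_TAPS] = {0, 0, 0, 0};

    // Shift the older samples by one position and store the new one.
    for (int i = NUM_TAPS - 1; i > 0; --i) {
        delay_line[i] = delay_line[i - 1];
    }
    delay_line[0] = sample;

    // Multiply-accumulate. The products are independent of each other,
    // which is the kind of concurrency an HLS tool can exploit.
    int32_t acc = 0;
    for (int i = 0; i < NUM_TAPS; ++i) {
        acc += coeff[i] * delay_line[i];
    }
    return acc;
}
```

Feeding a single 1 followed by zeros into `fir` returns the coefficients one after another, which is a quick way to check the behavior in software before any hardware is generated.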
Generally, the goal of HLS is to make these decisions automatically based upon the user-provided input specification and design constraints. However, HLS tools differ greatly in their ability to do this effectively. Fortunately, there exist many mature HLS tools (e.g., Xilinx Vivado HLS, LegUp, and Mentor Catapult HLS) that can make these decisions automatically for a wide range of applications. This post uses Vivado HLS as its reference tool; however, the general techniques are broadly applicable to most HLS tools, though likely with some changes in input language syntax and semantics. In general, the designer is expected to supply the HLS tool with a functional specification, describe the interface, provide a target computational device, and give optimization directives. More specifically, Vivado HLS requires the following inputs:
• A function specified in C, C++, or SystemC
• A design testbench that calls the function and verifies its correctness by checking the results.
• A target FPGA device
• The desired clock period
• Directives guiding the implementation process
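The first two inputs in the list above can be sketched as follows. This is only an illustration under stated assumptions: `vec_add`, `run_testbench`, and the length `N` are hypothetical names, and the stimulus values are made up. The design function is what would be synthesized; the testbench runs purely in software, calls the function, and checks the results.

```cpp
#include <cstdio>

constexpr int N = 8;  // hypothetical vector length

// Design function: the part that would be handed to the HLS tool.
void vec_add(const int a[N], const int b[N], int out[N]) {
    for (int i = 0; i < N; ++i) {
        out[i] = a[i] + b[i];
    }
}

// Testbench: never becomes hardware. It generates stimulus, calls the
// design function, and compares the output against expected values.
// Returns the number of mismatches, so 0 means the test passed.
int run_testbench() {
    int a[N], b[N], out[N];
    for (int i = 0; i < N; ++i) {
        a[i] = i;
        b[i] = 10 * i;
    }

    vec_add(a, b, out);

    int errors = 0;
    for (int i = 0; i < N; ++i) {
        const int expected = 11 * i;  // i + 10 * i
        if (out[i] != expected) {
            // printf is fine here because the testbench is not synthesized.
            std::printf("Mismatch at %d: got %d, expected %d\n",
                        i, out[i], expected);
            ++errors;
        }
    }
    return errors;
}
```

In a real project the body of `run_testbench` would typically live in `main()`, with the return value telling the tool whether the test passed. The remaining inputs (target device, clock period, directives) are usually supplied through the tool's project settings or scripts rather than in the C++ source.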
In general, HLS tools cannot handle arbitrary software code. Many concepts that are common in software programming are difficult to implement in hardware. Yet a hardware description offers much more flexibility in how the computation is implemented. An HLS tool therefore typically requires additional information from the designer (suggestions or #pragmas) that gives the tool hints about how to create the most efficient design. Thus, HLS tools simultaneously limit and enhance the expressiveness of the input language. For example, it is common for a tool not to handle dynamic memory allocation. There is often limited support for standard libraries. System calls are typically avoided in hardware to reduce complexity. The ability to perform recursion is often limited. On the other hand, HLS tools can deal with a variety of different interfaces (direct memory access, streaming, on-chip memories). These tools can also perform advanced optimizations (pipelining, memory partitioning, bitwidth optimization) to create an efficient hardware implementation.
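As a rough sketch of what such hints look like, the function below adds pipelining, unrolling, and memory-partitioning directives to a dot product. The pragma spellings follow Vivado HLS conventions; other tools use different syntax, and the function name `dot_product`, the length `LEN`, and the factor of 4 are assumptions chosen for illustration. An ordinary C++ compiler ignores these pragmas (possibly with a warning), so the same file still runs as plain software.

```cpp
constexpr int LEN = 16;  // hypothetical vector length

int dot_product(const int a[LEN], const int b[LEN]) {
// Memory partitioning: ask the tool to split each array across four
// smaller memories so that four elements can be read in the same cycle.
#pragma HLS ARRAY_PARTITION variable=a cyclic factor=4 dim=1
#pragma HLS ARRAY_PARTITION variable=b cyclic factor=4 dim=1

    int acc = 0;
    for (int i = 0; i < LEN; ++i) {
// Pipelining: request that a new loop iteration start every clock cycle.
#pragma HLS PIPELINE II=1
// Unrolling: request four copies of the loop body working in parallel.
#pragma HLS UNROLL factor=4
        acc += a[i] * b[i];
    }
    return acc;
}
```

The directives do not change what the function computes, only how the hardware is organized. Whether the tool can actually meet a request such as `II=1` depends on the target device and the clock period, and the tool's report is where one would check. Bitwidth optimization usually relies on tool-specific arbitrary-precision types, which are left out here to keep the sketch compilable without vendor headers.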
We make the following assumptions about the input function specification, which generally adhere to the guidelines of the Vivado HLS tool:
• No dynamic memory allocation (no malloc(), free(), new, or delete)
• Limited use of pointers-to-pointers (e.g., they may not appear at the interface)
• System calls are not supported (e.g., abort(), exit(), printf()). They can be used in the code, e.g., in the testbench, but they are ignored (removed) during synthesis.
• Limited use of other standard libraries (e.g., common math.h functions are supported, but uncommon ones are not)
• Limited use of function pointers and virtual functions in C++ classes (the target of every function call must be resolvable at compile time).
• No recursive function calls.
• The interface must be precisely defined.
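A small sketch of code written within these restrictions is shown below. The names `sum_first_n` and `MAX_LEN` are hypothetical. A software programmer might write this sum recursively or over a heap-allocated buffer of arbitrary size; the version here instead uses a fixed-size array at the interface and a loop with a compile-time bound, which is a common way to keep such code synthesizable.

```cpp
constexpr int MAX_LEN = 32;  // hypothetical compile-time upper bound

// Software-style version, shown only as a comment because it relies on
// recursion and an unbounded pointer argument:
//   int sum(const int* data, int n) {
//       return n == 0 ? 0 : data[n - 1] + sum(data, n - 1);
//   }

// Restricted version: fixed-size array at the interface, no recursion,
// no dynamic allocation, and a loop whose trip count is a constant.
int sum_first_n(const int data[MAX_LEN], int n) {
    int acc = 0;
    for (int i = 0; i < MAX_LEN; ++i) {
        // The run-time length only gates the accumulation; the loop
        // bound itself stays constant so the tool can analyze it.
        if (i < n) {
            acc += data[i];
        }
    }
    return acc;
}
```

Values of `n` larger than `MAX_LEN` simply sum the whole array, and values of zero or below return 0, so the function never reads outside the fixed-size buffer.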
The primary output of an HLS tool is an RTL hardware design that is capable of being synthesized through the rest of the hardware design flow. Additionally, the tool may output test benches to aid in the verification process. Finally, the tool will provide some estimates on resource usage and performance.
Vivado HLS generates the following outputs:
• Synthesizable Verilog and VHDL
• RTL simulations based on the design testbench
• Static analysis of performance and resource usage
• Metadata at the boundaries of a design, making it easier to integrate into a system.
Once an RTL-level design is available, other tools are usually used in a standard RTL design flow. In the Xilinx Vivado Design Suite, logic synthesis is performed, translating the RTL-level design into a netlist of primitive FPGA logical elements. The netlist (consisting of logical elements and the connections between them) is then associated with specific resources in a target device, a process called place and route (PAR). The resulting configuration of the FPGA resources is captured in a bitstream, which can be loaded onto the FPGA to program its functionality. The bitstream contains a binary representation of the configuration of each FPGA resource, including logic elements, wire connections, and on-chip memories. A large Xilinx UltraScale FPGA will have over 1 billion configuration bits, and even the “smaller” devices have hundreds of millions of bits.