In 'Basic Principles' the VisualApplets idea of building FPGA applications using operators and links was explained. This chapter will explain the data flow model of FPGA data processing.
FPGA implementations differ from software programs run on a microprocessor. All functions implemented on an FPGA exist and can be used at the same time (i.e., in parallel), while the functions of a microprocessor program are executed sequentially. This is a very important difference and one of the major advantages of FPGA implementations. As everything is working in parallel, data is processed in a pipeline structure. As mentioned, modules (instantiated operators) are connected by links. Each module starts the processing of image or signal data as soon as data is available. The results are forwarded to the next module via a link. Most modules do not buffer the input image data. They output the calculated results as soon as the information is available. Let's have a look at our project from the Getting Started guide once again:
The project consists of three modules; namely, the camera, a buffer, and a DMA operator. The camera operator is an image source module. It receives the images sent from the camera. But instead of collecting a full frame and outputting the image after full acquisition, the module will forward the pixels to the next module as soon as they arrive. The buffer module ImageBuffer is a buffer which can store image data but will output available data as soon as the output is not blocked. The DMA module transfers the images to the host PC.
The result of this pipeline structure is that image pixels are transferred to the host PC while other pixels of the same image are still transferred from the camera. Thus, no images are stored inside the modules. The advantage of this non-buffered pipeline is that all modules can run in parallel and are efficiently used. Furthermore, the latency (the time a pixel needs until it is fully processed) is reduced to a minimum.
Always keep in mind that all modules can run in parallel and output their results as soon as they are available and the output is not blocked. Imagine the pipeline like a water pipeline with valves, branches, and small and wide tubes.
In contrast to a microprocessor program, the number of operations, i.e., the number of operators, does not influence the processing speed, as everything is running in parallel. However, the more operators are used in an applet, the more hardware resources are required.
The bandwidth of a pipeline in an applet depends on the processing speed of the operators and on the connecting links. A link has several “link properties” (see 'Link Properties'). One of them is the parallelism which defines the bandwidth. The parallelism defines how many pixels are transferred in parallel between two operators in one design clock cycle. The higher the parallelism, the higher the bandwidth. Operators are automatically adapted to meet the required bandwidth of a link; if this is not possible, they redefine the bandwidth.
Let's assume a simple example. A parallelism of four will result in four pixels being transferred in parallel. This means the first four consecutive pixels of an image are transferred from the camera module to its successive module (in our case the buffer) in parallel. Next, the next four pixels are transferred from the camera module to the buffer module. Meanwhile, the first four pixels have been processed in the buffer module and are forwarded to the next module in the image processing chain.
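The example above can be sketched in a few lines of Python. This is only an illustration of the idea, not VisualApplets code: the function names `group_pixels` and `pipeline_schedule` are hypothetical, and the sketch assumes that every stage needs exactly one clock cycle per group of pixels, which real operators do not guarantee.

```python
def group_pixels(pixels, parallelism):
    """Split a pixel sequence into groups of `parallelism` pixels.

    Each group stands for the pixels moved over a link in one clock cycle.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    return [pixels[i:i + parallelism]
            for i in range(0, len(pixels), parallelism)]


def pipeline_schedule(num_groups, stages):
    """Return, per clock cycle, which group index each stage is handling.

    Assumption for illustration: every stage needs exactly one cycle per group.
    """
    schedule = []
    for cycle in range(num_groups + len(stages) - 1):
        active = {}
        for position, stage in enumerate(stages):
            group_index = cycle - position
            if 0 <= group_index < num_groups:
                active[stage] = group_index
        schedule.append(active)
    return schedule


# Eight pixels of one line, transferred with a parallelism of four.
groups = group_pixels(list(range(8)), parallelism=4)
schedule = pipeline_schedule(len(groups), ["camera", "buffer", "dma"])
for cycle, active in enumerate(schedule):
    print(cycle, active)
```

In cycle 1 of this sketch, the camera stage handles the second group of four pixels while the buffer stage already handles the first group, which mirrors the overlap described in the text.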
As mentioned before, the parallelism defines the number of pixels transferred in parallel in one clock cycle of the design clock frequency. The frequency depends on the hardware device used. For example, the frame grabbers of the microEnable IV series use a design clock frequency of 62.5 MHz.
|Calculating the Bandwidth|
The bandwidth b of a link is determined by the product of the parallelism p and the frequency f:
$b = p \cdot f$
A parallelism of four at a design clock frequency of 62.5 MHz will therefore result in a bandwidth of $4 \cdot 62.5\,\text{MHz} = 250$ megapixels per second. If we assume a bit width of 8 bit per pixel, this will result in a bandwidth of 250 MB/s. A list of the basic design clock frequencies for all hardware platforms can be found in the appendix 'Device Resources'.
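The product of parallelism and design clock frequency can be checked with a short calculation. The following is a minimal sketch; the function name `link_bandwidth` and its parameters are chosen for illustration only and are not part of VisualApplets. It returns the theoretical values, not the actual throughput of a design.

```python
def link_bandwidth(parallelism, clock_hz, bits_per_pixel=8):
    """Return the theoretical link bandwidth.

    The result is a tuple (pixels per second, bytes per second).
    """
    pixels_per_second = parallelism * clock_hz
    bytes_per_second = pixels_per_second * bits_per_pixel / 8
    return pixels_per_second, bytes_per_second


# Parallelism 4 at the 62.5 MHz design clock of the microEnable IV series,
# with 8 bit per pixel.
pixels_per_s, bytes_per_s = link_bandwidth(4, 62.5e6, bits_per_pixel=8)
print(pixels_per_s / 1e6, "megapixels/s")  # 250.0 megapixels/s
print(bytes_per_s / 1e6, "MB/s")           # 250.0 MB/s
```

With 16 bit per pixel and the same parallelism, the pixel rate stays at 250 megapixels per second while the byte rate doubles.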
Operators may change the parallelism between the input and the output link, so the bandwidth is not constant throughout the design. This reflects changes in the required bandwidth: if an operator reduces the image size, for example, the required output bandwidth is reduced as well.
The bandwidth calculated with the formula above is a theoretical value. The actual bandwidth is slightly less than the theoretical value.
Some operators cannot process the full bandwidth given at the input link. You will find detailed information for all operators concerned in the Operator Reference.
Note the difference between the bandwidth and the latency. The latency is defined individually by each operator and mostly depends on the algorithmic implementation.
|Visualization on GUI|
To visualize the corresponding link properties, VisualApplets provides two GUI buttons in the toolbar of the program window:
Display Link Info displays the bit width and the parallelism for every link in the diagram:
Display Link Throughput displays the maximum pixel throughput in megapixels per second for every link in the diagram:
As previously explained, in VisualApplets, pixels are transferred through the pipeline one after another. If a link parallelism is greater than one, multiple pixels are transferred in parallel. The order of the transfer of pixels in images or lines is the following: In general, pixels of frames (two-dimensional images) are transferred from a camera starting with the first pixel at the upper left corner, and finishing with the last pixel at the bottom right corner. If VisualApplets operators require the pixel position for their processing, the same order, that is, left -> right and top -> down, is expected. The following figure illustrates this order:
However, some sources like cameras do not comply with this order. In these cases, VisualApplets operators and designs can be used to correct the pixel order. To do this, you should have some knowledge of the protocol of a pixel transfer. You will find the corresponding information in the section 'Image Protocols, Image Dimensions and Data Structure'.
The Image Protocol defines the image dimension and data structure of the transferred pixels and data. The Image Protocol is another link property besides the previously mentioned link property Parallelism.
There are three types of image protocols:
The 2D image protocol is used for the transfer of images, mostly used with area scan cameras. A link transports the information
- when a pixel is transferred (pixel valid signal)
- when a line is completed (end-of-line signal)
- when a frame is completed (end-of-frame signal)
Thus, a 2D image can have:
- an arbitrary number of pixels in a line
- an arbitrary image height
The pixel position itself within a line or row is not transferred. The position results from the number of preceding lines and the number of preceding pixels within the current line. It is not possible to have gaps between the pixels in a line. It is possible to have arbitrary line lengths. A line may have no pixels, i.e., empty lines are possible. As a minimum, a frame has to consist of one empty line.
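Since positions are implied by counting, a receiver can rebuild the image purely from the order of the signals. The following sketch models the 2D protocol as a list of events. The event names `"pixel"`, `"eol"`, and `"eof"` and the function `decode_2d` are hypothetical stand-ins for the pixel valid, end-of-line, and end-of-frame signals; the sketch also assumes that every line, including the last one of a frame, is closed by an end-of-line event before the end-of-frame event arrives.

```python
def decode_2d(events):
    """Rebuild frames from a stream of protocol events.

    A frame is a list of lines, a line is a list of pixel values.
    The x position of a pixel is its index within the line, the y position
    is the index of the line within the frame. Neither is transmitted.
    """
    frames = []
    frame = []
    line = []
    for event in events:
        kind = event[0]
        if kind == "pixel":
            line.append(event[1])
        elif kind == "eol":
            frame.append(line)
            line = []
        elif kind == "eof":
            frames.append(frame)
            frame = []
        else:
            raise ValueError(f"unknown event: {kind!r}")
    return frames


# One frame with three lines of different lengths, the second one empty.
events = [
    ("pixel", 10), ("pixel", 11), ("pixel", 12), ("eol",),
    ("eol",),
    ("pixel", 20), ("eol",),
    ("eof",),
]
frames = decode_2d(events)
print(frames)  # [[[10, 11, 12], [], [20]]]
```

The decoded frame has lines of length three, zero, and one, which illustrates that arbitrary line lengths and empty lines are valid in the 2D protocol.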
The 1D image protocol is used to transfer lines, mostly used with line scan cameras. A link transports the information
- when a pixel is transferred (pixel valid signal)
- when a line is completed (end-of-line signal)
Thus, a 1D image can have:
- an arbitrary number of pixels in a line
- an unlimited image height
Again, the pixel position itself within a line is not transferred. The position results from the number of preceding lines and the number of preceding pixels within the current line. It is not possible to have gaps between the pixels in a line. It is possible to have arbitrary line lengths. A line may have no pixels, i.e., empty lines are possible.
In the 0D image protocol, no information on image dimensions is preserved. It is simply a stream of pixels. Still, time-gaps between pixels may exist, i.e., a pixel comes with a valid signal. The 0D protocol is mostly used for data transfers such as measurement results.
In the signal protocol, transfers are reduced to single bit data transfers which are always valid (valid at every clock cycle). Thus, the protocol does not include any control signals such as pixel valid or line/frame completed. The signal protocol is used for signal operators used in trigger and signal processing systems.
The image protocol is a link property. Each operator decides individually which link properties are accepted at its inputs and available at its outputs. See 'Link Properties' for more information about link parameterization.
In the previous sections, the pipeline structure of VisualApplets designs was explained. As mentioned before, data is transferred between the modules of a project via links. As soon as a module has processed an input pixel and is finished with calculating the output value, the result is output at the output link(s). However, the next module might not be able to process the data as it is still processing another pixel. In this case, the flow control of VisualApplets is applied and the pipeline is blocked. Thus, modules can block their inputs. In this case the antecedent module will not output its results and will propagate the blocking state backwards in the pipeline.
Let's have a look at the simple example shown in Figure 28, 'Simple VisualApplets Design'. Suppose a slow PC is used which cannot process the bandwidth generated by the camera. In this case, the DmaToPC module will not be able to transfer the data to the host PC. Thus, it will block its input from time to time. The blocking signal is propagated in the design up to the image buffer module. The image buffer will then not output any data while the blocking is active. As the ImageBuffer module is a buffer, all further incoming data will be buffered and the fill level of the buffer will increase. When the camera stops sending data, no new input data is transferred to the buffer, and data will be output until the buffer is empty.
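The behavior of the buffer in this example can be illustrated with a small simulation. This is a simplified sketch, not a model of the actual ImageBuffer operator: the function `simulate_backpressure` and its parameters are hypothetical, the buffer is assumed to have unlimited capacity, and a word is assumed to be able to enter and leave the buffer within the same cycle.

```python
from collections import deque


def simulate_backpressure(camera_cycles, sink_ready, total_cycles):
    """Simulate a camera -> buffer -> DMA chain with a blocking sink.

    camera_cycles: number of cycles in which the camera delivers one word.
    sink_ready:    function(cycle) -> bool, False while the DMA blocks.
    Returns (buffer fill level after each cycle, words delivered in order).
    """
    fifo = deque()
    fill_levels = []
    delivered = []
    for cycle in range(total_cycles):
        if cycle < camera_cycles:
            fifo.append(cycle)  # the camera cannot be stopped, so it always writes
        if fifo and sink_ready(cycle):
            delivered.append(fifo.popleft())  # output is not blocked
        fill_levels.append(len(fifo))
    return fill_levels, delivered


# The sink accepts data only in every second cycle (a "slow PC").
fill_levels, delivered = simulate_backpressure(
    camera_cycles=6,
    sink_ready=lambda cycle: cycle % 2 == 0,
    total_cycles=12,
)
print(fill_levels)  # [0, 1, 1, 2, 2, 3, 2, 2, 1, 1, 0, 0]
print(delivered)    # [0, 1, 2, 3, 4, 5]
```

The fill level rises while the camera is sending faster than the sink accepts data and drains to zero after the camera stops, with all words delivered in their original order.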
Again, you can imagine the flow control like a pipeline of water. If a valve is closed, no more water can be transported. A buffer operator is like a reservoir which is filled with water if the drain cannot consume the input stream.