TagDetect¶
Introduction¶
The TagDetect modules are optional on-board modules of the rc_cube and require separate licenses to be purchased. The licenses are included in every rc_cube purchased after 01.07.2020.
The TagDetect modules run on board the rc_cube and allow the detection of 2D matrix codes and tags. Currently, there are TagDetect modules for QR codes and AprilTags. The modules, furthermore, compute the position and orientation of each tag in the 3D camera coordinate system, making it simple to manipulate a tag with a robot or to localize the camera with respect to a tag.
Note
These modules are not available in blaze
camera pipelines.
Note
These modules are pipeline specific. Changes to their settings or parameters only affect the respective camera pipeline and have no influence on other pipelines running on the rc_cube.
Tag detection is made up of three steps:
- Tag reading on the 2D image pair (see Tag reading).
- Estimation of the pose of each tag (see Pose estimation).
- Re-identification of previously seen tags (see Tag re-identification).
In the following, the two supported tag types are described, followed by a comparison.
QR code¶
QR codes are two-dimensional matrix codes that contain arbitrary user-defined data. There is wide support for decoding of QR codes on commodity hardware such as smartphones. Also, many online and offline tools are available for the generation of such codes.
The “pixels” of a QR code are called modules. Appearance and resolution of QR codes change with the amount of data they contain. While the special patterns in the three corners are always 7 modules wide, the number of modules between them increases the more data is stored. The lowest-resolution QR code is of size 21x21 modules and can contain up to 152 bits.
Even though many QR code generation tools support generation of specially designed QR codes (e.g., containing a logo, having round corners, or having dots as modules), a reliable detection of these tags by the rc_cube’s TagDetect module is not guaranteed. The same holds for QR codes which contain characters that are not part of regular ASCII.
AprilTag¶
AprilTags are similar to QR codes. However, they are specifically designed for robust identification at large distances. As for QR codes, we will call the tag pixels modules. Fig. 16 shows how AprilTags are structured. They have a mandatory white and black border, each one module wide. The tag families 16h5, 25h9, 36h10 and 36h11 are surrounded by this border and carry a variable amount of data modules in the center. For tag family 41h12, the black and white border is shifted towards the inside and the data modules are in the center and also at the border of the tags. Other than QR codes, AprilTags do not contain any user-defined information but are identified by a predefined family and ID. The tags in Fig. 16 for example are of family 16h5, 36h11 and 41h12 have id 0, 11 and 0, respectively. All supported families are shown in Table 16.
Family | Number of tag IDs | Recommended |
---|---|---|
16h5 | 30 | - |
25h9 | 35 | o |
36h10 | 2320 | o |
36h11 | 587 | + |
41h12 | 2115 | + |
For each family, the number before the “h” states the number of data modules contained in the tag: While a 16h5 tag contains 16 (4x4) data modules ((c) in Fig. 16), a 36h11 tag contains 36 (6x6) modules and a 41h12 tag contains 41 (3x3 inner + 4x8 outer) modules. The number behind the “h” refers to the Hamming distance between two tags of the same family. The higher, the more robust is the detection, but the fewer individual tag IDs are available for the same number of data modules (see Table 16).
The advantage of fewer modules (as for 16h5 compared to 36h11) is the lower resolution of the tag. Hence, each tag module is larger and the tag therefore can be detected from a larger distance. This, however, comes at a price: Firstly, fewer data modules lead to fewer individual tag IDs. Secondly, and more importantly, detection robustness is significantly reduced due to a higher false positive rate; i.e, tags are mixed up or nonexistent tags are detected in random image texture or noise. The 41h12 family has its border shifted towards the inside, which gives it more data modules at a lower number of total modules compared to the 36h11 family.
For these reasons we recommend using the 41h12 and 36h11 families and highly discourage the use of the 16h5 family. The latter family should only be used if a large detection distance really is necessary for an application. However, the maximum detection distance increases only by approximately 25% when using a 16h5 tag instead of a 36h11 tag.
Pre-generated AprilTags can be downloaded at the AprilTag project website (https://april.eecs.umich.edu/software/apriltag.html). There, each family consists of multiple PNGs containing single tags. Each pixel in the PNGs corresponds to one AprilTag module. When printing the tags of the families 36h11, 36h10, 25h9 and 16h5 special care must be taken to also include the white border around the tag that is contained in the PNG (see (a) in Fig. 16). Moreover, all tags should be scaled to the desired printing size without any interpolation, so that the sharp edges are preserved.
Comparison¶
Both QR codes and AprilTags have their up and down sides. While QR codes allow arbitrary user-defined data to be stored, AprilTags have a pre-defined and limited set of tags. On the other hand, AprilTags have a lower resolution and can therefore be detected at larger distances. Moreover, the continuous white to black border in AprilTags allow for more precise pose estimation.
Note
If user-defined data is not required, AprilTags should be preferred over QR codes.
Tag reading¶
The first step in the tag detection pipeline is reading the tags on the
2D image pair.
This step takes most of the processing time and its precision is crucial
for the precision of the resulting tag pose.
To control the speed of this step, the quality
parameter
can be set by the user.
It results in a downscaling of the image pair before reading the tags.
High
yields the largest maximum detection distance and highest
precision, but also the highest processing time.
Low
results in the smallest maximum detection distance and
lowest precision, but processing requires less than half of the time.
Medium
lies in between.
Please note that this quality parameter has no relation to the quality
parameter of
Stereo matching.
The maximum detection distance at quality High
can be
approximated by using the following formulae,
where is the focal length in pixels and is the size of a module in meters. can easily be calculated by the latter formula, where is the size of the tag in meters and is the width of the code in modules (for AprilTags without the white border). Fig. 17 visualizes these variables. denotes the number of image pixels per module required for detection. It is different for QR codes and AprilTags. Moreover, it varies with the tag’s angle to the camera and illumination. Approximate values for robust detection are:
- AprilTag: pixels/module
- QR code: pixels/module
The following tables give sample maximum distances for
different situations, assuming a focal length of
1075 pixels and the parameter quality
to be set to High
.
AprilTag family | Tag width | Maximum distance |
---|---|---|
36h11 (recommended) | 8 modules | 1.1 m |
16h5 | 6 modules | 1.4 m |
41h12 (recommended) | 5 modules | 1.7 m |
Tag width | Maximum distance |
---|---|
29 modules | 0.49 m |
21 modules | 0.70 m |
Pose estimation¶
For each detected tag, the pose of this tag in the camera coordinate frame is estimated. A requirement for pose estimation is that a tag is fully visible in the left and right camera image. The coordinate frame of the tag is aligned as shown below.
The z-axis is pointing “into” the tag. Please note that for AprilTags, although having the white border included in their definition, the coordinate system’s origin is placed exactly at the transition from the white to the black border. Since AprilTags do not have an obvious orientation, the origin is defined as the upper left corner in the orientation they are pre-generated in.
During pose estimation, the tag’s size is also estimated, while assuming the tag to be square. For QR codes, the size covers the full tag. For AprilTags, the size covers only the part inside the border defined by the transition from the black to the white border modules, hence ignoring the outermost white border for the tag families 16h5, 25h9, 36h10 and 36h11.
The user can also specify the approximate size () of tags. All tags not matching this size constraint are automatically filtered out. This information is further used to resolve ambiguities in pose estimation that may arise if multiple tags with the same ID are visible in the left and right image and these tags are aligned in parallel to the image rows.
Note
For best pose estimation results one should make sure to accurately print the tag and to attach it to a rigid and as planar as possible surface. Any distortion of the tag or bump in the surface will degrade the estimated pose.
Note
It is highly recommended to set the approximate size of a tag. Otherwise, if multiple tags with the same ID are visible in the left or right image, pose estimation may compute a wrong pose if these tags have the same orientation and are approximately aligned in parallel to the image rows. However, even if the approximate size is not given, the TagDetect modules try to detect such situations and filter out affected tags.
Below tables give approximate precisions of the estimated poses of AprilTags.
We distinguish between lateral precision (i.e., in x and y direction) and
precision in z direction.
It is assumed that quality
is set to High
, that the camera’s
viewing direction is parallel to the tag’s normal and that the images are well exposed
and do not suffer from motion blur.
The size of a tag does not have a significant effect on the lateral or z
precision; however, in general, larger tags improve precision.
With respect to precision of the orientation especially around the x and y
axes, larger tags clearly outperform smaller ones.
Distance | rc_visard 65 - lateral | rc_visard 65 - z | rc_visard 160 - lateral | rc_visard 160 - z |
---|---|---|---|---|
0.5 m | 0.05 mm | 0.5 mm | 0.05 mm | 0.3 mm |
1.0 m | 0.15 mm | 1.8 mm | 0.15 mm | 1.4 mm |
2.0 m | 1.5 mm | 14.5 mm | 0.5 mm | 3.7 mm |
Distance | 60 x 60 mm | 120 x 120 mm |
---|---|---|
0.5 m | 0.2° | – |
1.0 m | 0.8° | 0.3° |
2.0 m | 2.0° | 0.8° |
3.0 m | – | 1.8° |
Tag re-identification¶
Each tag has an ID; for AprilTags it is the family plus tag ID, for QR codes it is the contained data. However, these IDs are not unique, since the same tag may appear multiple times in a scene.
For distinction of these tags, the TagDetect modules also assign each detected tag a unique identifier. To help the user identifying an identical tag over multiple detections, tag detection tries to re-identify tags; if successful, a tag is assigned the same unique identifier again.
Tag re-identification compares the positions of the corners of the tags in the camera coordinate frame to find identical tags. Tags are assumed identical if they did not or only slightly move in that frame.
By setting the max_corner_distance
threshold, the user can specify how much a tag is allowed move in the static
coordinate frame between two detections to be considered identical.
This parameter defines the maximum distance between the corners of
two tags, which is shown in
Fig. 19.
The Euclidean distances of all four corresponding tag corners are computed in 3D.
If none of these distances exceeds the threshold, the tags are
considered identical.
After a number of tag detection runs, previously detected tag instances
will be discarded if they are not detected in the meantime.
This can be configured by the parameter forget_after_n_detections
.
Hand-eye calibration¶
In case the camera has been calibrated to a robot, the TagDetect module
can automatically provide poses in the robot coordinate frame.
For the TagDetect node’s Services, the frame of the
output poses can be controlled with the pose_frame
argument.
Two different pose_frame
values can be chosen:
- Camera frame (
camera
). All poses provided by the module are in the camera frame. - External frame (
external
). All poses provided by the module are in the external frame, configured by the user during the hand-eye calibration process. The module relies on the on-board Hand-eye calibration module to retrieve the sensor mounting (static or robot mounted) and the hand-eye transformation. If the sensor mounting is static, no further information is needed. If the sensor is robot-mounted, therobot_pose
is required to transform poses to and from theexternal
frame.
All pose_frame
values that are not camera
or external
are rejected.
Parameters¶
There are two separate modules available for tag detection,
one for detecting AprilTags and one for QR codes, named
rc_april_tag_detect
and rc_qr_code_detect
, respectively.
Apart from the module names they share the same interface definition.
In addition to the REST-API interface, the TagDetect modules provide pages on the Web GUI in the desired pipeline under and , on which they can be tried out and configured manually.
In the following, the parameters are listed based on the
example of rc_qr_code_detect
. They are the same for
rc_april_tag_detect
.
This module offers the following run-time parameters:
Name | Type | Min | Max | Default | Description |
---|---|---|---|---|---|
detect_inverted_tags |
bool | false | true | false | Detect tags with black and white exchanged |
forget_after_n_detections |
int32 | 1 | 1000 | 30 | Number of detection runs after which to forget about a previous tag during tag re-identification |
max_corner_distance |
float64 | 0.001 | 0.01 | 0.005 | Maximum distance of corresponding tag corners in meters during tag re-identification |
quality |
string | - | - | High | Quality of tag detection: [Low, Medium, High] |
use_cached_images |
bool | false | true | false | Use most recently received image pair instead of waiting for a new pair |
Via the REST-API, these parameters can be set as follows.
PUT http://<host>/api/v2/pipelines/<0,1,2,3>/nodes/<rc_qr_code_detect|rc_april_tag_detect>/parameters?<parameter-name>=<value>PUT http://<host>/api/v1/nodes/<rc_qr_code_detect|rc_april_tag_detect>/parameters?<parameter-name>=<value>
Status values¶
The TagDetect modules reports the following status values:
Name | Description |
---|---|
data_acquisition_time |
Time in seconds required to acquire image pair |
last_timestamp_processed |
The timestamp of the last processed image pair |
processing_time |
Processing time of the last detection in seconds |
state |
The current state of the node |
The reported state
can take one of the following values.
State name | Description |
---|---|
IDLE | The module is idle. |
RUNNING | The module is running and ready for tag detection. |
FATAL | A fatal error has occurred. |
Services¶
The TagDetect modules implement a state machine for starting and stopping.
The actual tag detection can be triggered via detect
.
The user can explore and call the rc_qr_code_detect
and rc_april_tag_detect
modules’ services,
e.g. for development and testing, using the
REST-API interface or
the rc_cube
Web GUI.
detect
¶
Triggers a tag detection.
Details
start
¶
Starts the module by transitioning from
IDLE
toRUNNING
.Details
stop
¶
Stops the module by transitioning to
IDLE
.Details
restart
¶
Restarts the module.
Details
trigger_dump
¶
Triggers dumping of the detection that corresponds to the given timestamp, or the latest detection, if no timestamp is given. The dumps are saved to the connected USB drive.
Details
reset_defaults
¶
Resets all parameters of the module to its default values, as listed in above table.
Details
Return codes¶
Each service response contains a return_code
,
which consists of a value
plus an optional message
.
A successful service returns with a return_code
value of 0
.
Negative return_code
values indicate that the service failed.
Positive return_code
values indicate that the service succeeded with additional information.
The smaller value is selected in case a service has multiple return_code
values,
but all messages are appended in the return_code
message.
The following table contains a list of common return codes:
Code Description 0 Success -1 An invalid argument was provided -4 A timeout occurred while waiting for the image pair -9 The license is not valid -11 Sensor not connected, not supported or not ready -101 Internal error during tag detection -102 There was a backwards jump of system time -103 Internal error during tag pose estimation -200 A fatal internal error occurred 200 Multiple warnings occurred; see list in message
201 The module was not in state RUNNING