Challenge 1: Perception

Part 1: Rosbag perception challenge

In this challenge the aim is to recognise and estimate the position of as many objects in the environment as possible from a rosbag containing images and information about the drone poses.
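
As a first step you will need to iterate over the images and poses stored in the bag. Below is a minimal sketch using the ROS 1 rosbag Python API; the bag filename and topic names are assumptions, so check the actual ones with rosbag info:

    # Sketch: iterate over image and pose messages in the practice bag.
    # Filename and topic names are assumptions; verify with `rosbag info`.
    import rosbag

    IMAGE_TOPIC = '/cf1/camera/image_raw'  # hypothetical topic name
    POSE_TOPIC = '/cf1/pose'               # hypothetical topic name

    bag = rosbag.Bag('dd2419_perception_training.bag')  # hypothetical filename
    for topic, msg, t in bag.read_messages(topics=[IMAGE_TOPIC, POSE_TOPIC]):
        if topic == IMAGE_TOPIC:
            # sensor_msgs/Image; convert with cv_bridge for OpenCV processing
            print('image at t=%.3f s' % t.to_sec())
        else:
            # geometry_msgs/PoseStamped giving the drone pose
            p = msg.pose.position
            print('pose at t=%.3f s: (%.2f, %.2f, %.2f)'
                  % (t.to_sec(), p.x, p.y, p.z))
    bag.close()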

For practice, and to ensure that you are familiar with the file format used in the actual evaluation, you will be given access to a rosbag to practice on. Along with this rosbag you will also be given the ground-truth positions of the objects and a script to score your system. The evaluation will be done on a rosbag disclosed close to the deadline, to make the task more realistic and to discourage overfitting. You will get the ground truth after the deadline.

The simplifications here with respect to the final task are:

  • you will be given the pose of the drone, so you do not need to perform localization; the error in object position will thus come mainly from your perception system (see the transformation sketch after this list)
  • you can adjust your system parameters given the actual data, whereas in the final task you can only tweak before your run. We want you to note the difference in qualitative performance before and after you start tweaking (you do not have exact ground truth until after the seminar), and to discuss this at the following progress seminar.
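
To make the first point concrete: once your detector gives an object position in the camera frame, the provided pose lets you express it in the map frame. A minimal numpy sketch follows, assuming the camera pose in the map frame is already known (e.g. combined from the bag's pose messages and a fixed camera mounting); all values below are made up, and in practice you would typically let TF do this:

    # Sketch: transform a detected point from the camera frame to the map frame.
    # The camera pose in the map frame is assumed known; values are made up.
    import numpy as np
    from tf.transformations import quaternion_matrix

    def camera_to_map(p_cam, t_cam_in_map, q_cam_in_map):
        """p_cam: 3-vector in the camera frame; returns a 3-vector in the map frame."""
        R = quaternion_matrix(q_cam_in_map)[:3, :3]  # rotation from (x, y, z, w) quaternion
        return R.dot(p_cam) + t_cam_in_map

    p_map = camera_to_map(np.array([0.1, 0.0, 1.5]),  # detection in the camera frame
                          np.array([2.0, 1.0, 0.4]),  # camera position in the map
                          [0.0, 0.0, 0.707, 0.707])   # camera orientation quaternion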

The comparison between different solutions will be based on two criteria, similar to the ones used in the final task. The first is the primary sorting criterion; the second is used only to break ties. A simplified sketch of such scoring appears after the list.

  1. Number of correctly classified objects within one meter of the correct location (points are deducted for detections farther away; if you report multiple detections of an object, we will use your worst one)
  2. Average position error.
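
The provided scoring script is authoritative, but to give a rough idea, here is a simplified sketch of scoring along these lines. It reflects one plausible reading of the criteria above, not the actual script:

    # Simplified scoring sketch; illustration only, use the provided script.
    import numpy as np

    def score(ground_truth, detections, radius=1.0):
        """Both arguments are lists of (sign_name, xyz) with xyz a numpy 3-vector."""
        correct, errors = 0, []
        for gt_sign, gt_pos in ground_truth:
            dists = [np.linalg.norm(pos - gt_pos)
                     for sign, pos in detections if sign == gt_sign]
            if not dists:
                continue
            worst = max(dists)  # with multiple detections, the worst one counts
            if worst <= radius:
                correct += 1
                errors.append(worst)
            else:
                correct -= 1  # minus points for detections outside one meter
        avg_error = float(np.mean(errors)) if errors else float('nan')
        return correct, avg_error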

 

Practice Dataset:

We have put together a practice dataset in the same format as the one used for the perception challenge, so that you can test your solution and verify that your result file has the correct format.

The practice dataset is available as a zip file: dd2419_perception_training.zip. It also contains scripts for sign pose visualization using TF, and the score evaluation we will use. If you're curious about the details, see the contained scripts. More information is in the README file, also contained in the zip file.

Please add your practice scores to this shared spreadsheet so that you can benchmark your solution against the other groups (and they against yours) and so that we can monitor your progress.

 

Timeline:

  • First rosbag, ground truth object positions, and scoring script available: March 6th
  • Second rosbag, for evaluation, available: March 19th, 13:00
    • Submission deadline: March 21st, 09:00
    • Ground truth object positions available: March 21st, 09:01

 

Submission:

The submission should be a text file containing the sign type and its pose; the orientation will be ignored for scoring. We use a format similar to the TUM trajectory format: each line contains a single pose as sign tx ty tz qx qy qz qw, with tabs for separation, where tx ty tz is the map position of the object's center and qx qy qz qw is the rotation quaternion of the object (an example appears after the list below). sign is one of:

  • airport
  • circulation_warning
  • dangerous_curve_left
  • dangerous_curve_right
  • road_narrows_from_left
  • road_narrows_from_right
  • no_bicycle
  • no_heavy_truck
  • junction
  • no_parking
  • no_stopping_and_parking
  • residential
  • stop
  • follow_left
  • follow_right

These are the same names as in the rest of the course.
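
For illustration, a result file with two detections might look like this (the numbers are made up; remember to separate the fields with tabs):

    stop        1.20    0.35    0.42    0.0     0.0     0.707   0.707
    airport     -0.80   2.10    0.50    0.0     0.0     0.0     1.0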

Submit results here.

 

Part 2: Real-world perception challenge

The second part of the challenge concerns integration into the full system. A system meeting the requirements of this part will be considered superior to a solution that does well on Part 1 but does not meet the requirements of Part 2.

Requirements:

  • The drone moves autonomously (this can be a preprogrammed path, and it can use dead reckoning only).
  • You can detect at least one object besides the stop sign in the environment from real camera images
    • without any manual intervention after starting the system,
    • after the drone has travelled at least 4 m and accumulated turns of at least 270 degrees, and
    • with the detection system running from start to finish (i.e., not only after having travelled 4 m).

The requirements on moving are there to ensure that you have integrated the vision system and that you did not just pre-program the system for one specific path and object. We are looking for robustness and integration.
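
If you want to verify the distance and turn requirements during a run, one simple way is to accumulate them from your pose estimates. A minimal sketch follows; how you obtain x, y, and yaw (e.g. from your dead reckoning) is up to you:

    # Sketch: accumulate travelled distance and total turn from pose estimates.
    import math

    class MotionTracker(object):
        def __init__(self):
            self.dist = 0.0   # accumulated path length (m)
            self.turn = 0.0   # accumulated absolute yaw change (degrees)
            self.last = None  # previous (x, y, yaw)

        def update(self, x, y, yaw):
            if self.last is not None:
                lx, ly, lyaw = self.last
                self.dist += math.hypot(x - lx, y - ly)
                dyaw = (yaw - lyaw + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi)
                self.turn += abs(math.degrees(dyaw))
            self.last = (x, y, yaw)

        def requirements_met(self):
            return self.dist >= 4.0 and self.turn >= 270.0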

 

Submit results here. The deadline is March 20th, 23:59.