This project is part of the camera course in Udacity's sensor fusion nanodegree. The following program schematic outlines what is included in this project.
This project implements the missing parts in the schematic. To accomplish this, there are four major tasks to complete:
- First, develop a way to match 3D objects over time by using keypoint correspondences.
- Second, compute the TTC based on Lidar measurements.
- Then do the same using the camera, which requires first associating keypoint matches with regions of interest and then computing the TTC based on those matches.
- And lastly, conduct various tests with the framework. The goal is to identify the most suitable detector/descriptor combination for TTC estimation and also to search for problems that can lead to faulty measurements by the camera or Lidar sensor.
- Project Specification
- Project Report
- Performance Evaluation
- Building and Running the Project
- References
- Additional Project Files
| # | Req | Criteria | Specification | Status |
|---|---|---|---|---|
FP.0 | Report | Provide a Writeup / README that includes all the rubric points and how you addressed each one. You can submit your writeup as markdown or pdf. | The writeup / README should include a statement and supporting figures / images that explain how each rubric item was addressed, and specifically where in the code each step was handled. | DONE |
FP.1 | Match 3D Objects | Implement the method "matchBoundingBoxes", which takes as input both the previous and the current data frames and provides as output the ids of the matched regions of interest (i.e. the boxID property). Matches must be the ones with the highest number of keypoint correspondences. | Code is functional and returns the specified output, where each bounding box is assigned the match candidate with the highest number of occurrences. | DONE |
FP.2 | Compute Lidar-based TTC | Compute the time-to-collision in seconds for all matched 3D objects using only Lidar measurements from the matched bounding boxes between current and previous frame. | Code is functional and returns the specified output. Also, the code is able to deal with outlier Lidar points in a statistically robust way to avoid severe estimation errors. | DONE |
FP.3 | Associate Keypoint Correspondences with Bounding Boxes | Prepare the TTC computation based on camera measurements by associating keypoint correspondences to the bounding boxes which enclose them. All matches which satisfy this condition must be added to a vector in the respective bounding box. | Code performs as described and adds the keypoint correspondences to the "kptMatches" property of the respective bounding boxes. Also, outlier matches have been removed based on the euclidean distance between them in relation to all the matches in the bounding box. | DONE |
FP.4 | Compute Camera-based TTC | Compute the time-to-collision in seconds for all matched 3D objects using only keypoint correspondences from the matched bounding boxes between current and previous frame. | Code is functional and returns the specified output. Also, the code is able to deal with outlier correspondences in a statistically robust way to avoid severe estimation errors. | DONE |
FP.5 | Performance Evaluation 1 | Find examples where the TTC estimate of the Lidar sensor does not seem plausible. Describe your observations and provide a sound argumentation why you think this happened. | Several examples (2-3) have been identified and described in detail. The assertion that the TTC is off has been based on manually estimating the distance to the rear of the preceding vehicle from a top view perspective of the Lidar points. | DONE |
FP.6 | Performance Evaluation 2 | Run several detector / descriptor combinations and look at the differences in TTC estimation. Find out which methods perform best and also include several examples where camera-based TTC estimation is way off. As with Lidar, describe your observations again and also look into potential reasons. | All detector / descriptor combinations implemented in previous chapters have been compared with regard to the TTC estimate on a frame-by-frame basis. To facilitate comparison, a spreadsheet and graph should be used to represent the different TTCs. | DONE |
The goal of this task is to implement the method `matchBoundingBoxes`, which takes as input both the previous and the current data frames and provides as output the IDs of the matched regions of interest (i.e. the boxID property). Matches must be the ones with the highest number of keypoint correspondences.
The task is complete once the code is functional and returns the specified output, where each bounding box is assigned the match candidate with the highest number of occurrences.
The method `matchBoundingBoxes` is called from `FinalProject_Camera.cpp` like this:
```cpp
map<int, int> boundingBoxBestMatches;
DataFrame & previousFrame = *(dataBuffer.end() - 2);
DataFrame & currentFrame = *(dataBuffer.end() - 1);

// associate bounding boxes between current and previous frame using keypoint matches
matchBoundingBoxes(matches,
                   boundingBoxBestMatches,
                   previousFrame,
                   currentFrame);
```
This is the implementation of the method `matchBoundingBoxes` in `camFusion_Student.cpp`:
```cpp
void matchBoundingBoxes(std::vector<cv::DMatch> & matches,
                        std::map<int, int> & boundingBoxBestMatches,
                        DataFrame & previousFrame,
                        DataFrame & currentFrame)
{
    // Candidate pairs of (current-frame box ID, previous-frame box ID), one per keypoint match
    std::multimap<int, int> boundingBoxMatches{};

    for(auto & match : matches)
    {
        cv::KeyPoint keypointsPreviousFrame = previousFrame.keypoints[match.queryIdx];
        cv::KeyPoint keypointsCurrentFrame = currentFrame.keypoints[match.trainIdx];

        // -1 marks a keypoint that does not fall into any bounding box
        int boxIdPreviousFrame = -1;
        int boxIdCurrentFrame = -1;

        for(auto & boundingBox : previousFrame.boundingBoxes)
        {
            if(boundingBox.roi.contains(keypointsPreviousFrame.pt))
            {
                boxIdPreviousFrame = boundingBox.boxID;
            }
        }

        for(auto & boundingBox : currentFrame.boundingBoxes)
        {
            if(boundingBox.roi.contains(keypointsCurrentFrame.pt))
            {
                boxIdCurrentFrame = boundingBox.boxID;
            }
        }

        // Only record the candidate pair if both keypoints lie inside a bounding box
        if(boxIdPreviousFrame != -1 && boxIdCurrentFrame != -1)
        {
            boundingBoxMatches.insert({boxIdCurrentFrame, boxIdPreviousFrame});
        }
    }

    std::vector<int> boundingBoxIdsCurrentFrame{};

    for(auto & boundingBox : currentFrame.boundingBoxes)
    {
        boundingBoxIdsCurrentFrame.push_back(boundingBox.boxID);
    }

    for(int boxIdCurrentFrame : boundingBoxIdsCurrentFrame)
    {
        // Count how often each previous-frame box was paired with this current-frame box
        auto range = boundingBoxMatches.equal_range(boxIdCurrentFrame);
        std::unordered_map<int, int> boundingBoxIdMatches;

        for(auto itr = range.first; itr != range.second; ++itr)
        {
            boundingBoxIdMatches[itr->second]++;
        }

        // Find the previous-frame box ID with the highest number of occurrences
        int maxOccurrences = 0;
        int matchingBoxID = -1;

        for(auto & boxIdMatch : boundingBoxIdMatches)
        {
            if(maxOccurrences < boxIdMatch.second)
            {
                matchingBoxID = boxIdMatch.first;
                maxOccurrences = boxIdMatch.second;
            }
        }

        // Store the best match as (previous-frame box ID, current-frame box ID)
        if(matchingBoxID != -1)
        {
            boundingBoxBestMatches.insert({matchingBoxID, boxIdCurrentFrame});
        }
    }
}
```
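With this implementation, each entry of `boundingBoxBestMatches` maps a previous-frame box ID to its best-matching current-frame box ID. As a small, purely illustrative usage snippet (not part of the project code), the result could be inspected like this:

```cpp
// Hypothetical inspection code: print each best match as (previous-frame boxID -> current-frame boxID)
for (const auto & bestMatch : boundingBoxBestMatches)
{
    std::cout << "prev box " << bestMatch.first << " -> curr box " << bestMatch.second << std::endl;
}
```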
The goal of this part of the project is to compute the time-to-collision for all matched 3D objects based on Lidar measurements alone.
The estimation is implemented in a way that makes it robust against outliers, which might be way too close and thus lead to faulty estimates of the TTC. The TTC is returned to the main function at the end of `computeTTCLidar`.
The task is complete once the code is functional and returns the specified output. Also, the code is able to deal with outlier Lidar points in a statistically robust way to avoid severe estimation errors.
```cpp
void computeTTCLidar(std::vector<LidarPoint> & lidarPointsPreviousFrame,
                     std::vector<LidarPoint> & lidarPointsCurrentFrame,
                     double frameRate,
                     double & TTC)
{
    std::cout << "Lidar Previous Frame: " << lidarPointsPreviousFrame.size() << " points"
              << "\t Lidar Current Frame: " << lidarPointsCurrentFrame.size() << " points" << std::endl;

    // auxiliary variables
    double dT = 1.0 / frameRate; // time between two measurements in seconds
    double laneWidth = 4.0;      // assumed width of the ego lane in metres

    // find the closest distance to Lidar points within the ego lane
    double minXPrev = 1e9, minXCurr = 1e9;

    for(auto it = lidarPointsPreviousFrame.begin(); it != lidarPointsPreviousFrame.end(); ++it)
    {
        if(std::abs(it->y) <= laneWidth / 2.0)
        { // 3D point within ego lane?
            minXPrev = minXPrev > it->x ? it->x : minXPrev;
        }
    }

    for(auto it = lidarPointsCurrentFrame.begin(); it != lidarPointsCurrentFrame.end(); ++it)
    {
        if(std::abs(it->y) <= laneWidth / 2.0)
        { // 3D point within ego lane?
            minXCurr = minXCurr > it->x ? it->x : minXCurr;
        }
    }

    std::cout << "Final minXPrev: " << minXPrev << "\t Final minXCurr: " << minXCurr << std::endl;

    // compute TTC from both measurements (constant-velocity model)
    TTC = minXCurr * dT / (minXPrev - minXCurr);
}
```
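As a purely illustrative example (the numbers are hypothetical, assuming a 10 Hz frame rate so that `dT` = 0.1 s): if the closest in-lane point was at `minXPrev` = 8.0 m in the previous frame and is at `minXCurr` = 7.9 m now, then

$$
\mathrm{TTC} = \frac{7.9 \cdot 0.1}{8.0 - 7.9} \approx 7.9\ \mathrm{s}
$$

A single stray point a few centimetres in front of the true rear bumper changes the denominator dramatically, which is why this minimum-based version turned out to be fragile in the evaluation further below.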
Before a TTC estimate can be computed in the next part of the project, we need to find all keypoint matches that belong to each 3D object.
We can do this by simply checking whether the corresponding keypoints are within the region of interest in the camera image. All matches which satisfy this condition should be added to a vector. The problem you will find is that there will be outliers among the matches. To eliminate those, we compute a robust mean of all the Euclidean distances between keypoint matches and then remove those that are too far away from the mean.
The task is complete once the code performs as described and adds the keypoint correspondences to the `kptMatches` property of the respective bounding boxes. Also, outlier matches have been removed based on the Euclidean distance between them in relation to all the matches in the bounding box.
```cpp
// associate a given bounding box with the keypoints it contains
void clusterKptMatchesWithROI(BoundingBox & boundingBox,
                              std::vector<cv::KeyPoint> & kptsPrev,
                              std::vector<cv::KeyPoint> & kptsCurr,
                              std::vector<cv::DMatch> & kptMatches)
{
    // Euclidean distances between matched keypoints whose current keypoint lies inside the ROI
    std::vector<double> euclideanDistance;

    for(auto it = kptMatches.begin(); it != kptMatches.end(); it++)
    {
        int currKptIndex = (*it).trainIdx;
        const auto & currKeyPoint = kptsCurr[currKptIndex];

        if(boundingBox.roi.contains(currKeyPoint.pt))
        {
            int prevKptIndex = (*it).queryIdx;
            const auto & prevKeyPoint = kptsPrev[prevKptIndex];

            euclideanDistance.push_back(cv::norm(currKeyPoint.pt - prevKeyPoint.pt));
        }
    }

    if(euclideanDistance.empty())
    {
        return; // no keypoint matches inside this bounding box
    }

    int pair_num = euclideanDistance.size();
    double euclideanDistanceMean = std::accumulate(euclideanDistance.begin(), euclideanDistance.end(), 0.0) / pair_num;

    // keep only the matches whose displacement is close to the mean (outlier removal)
    for(auto it = kptMatches.begin(); it != kptMatches.end(); it++)
    {
        int currKptIndex = (*it).trainIdx;
        const auto & currKeyPoint = kptsCurr[currKptIndex];

        if(boundingBox.roi.contains(currKeyPoint.pt))
        {
            int prevKptIndex = (*it).queryIdx;
            const auto & prevKeyPoint = kptsPrev[prevKptIndex];

            double temp = cv::norm(currKeyPoint.pt - prevKeyPoint.pt);
            double euclideanDistanceMean_Augment = euclideanDistanceMean * 1.3;

            if(temp < euclideanDistanceMean_Augment)
            {
                boundingBox.keypoints.push_back(currKeyPoint);
                boundingBox.kptMatches.push_back(*it);
            }
        }
    }
}
```
Once keypoint matches have been added to the bounding boxes, the next step is to compute the TTC estimate.
Once we have our estimate of the TTC, we return it to the `main` function at the end of `computeTTCCamera`.
The task is complete once the code is functional and returns the specified output. Also, the code must be able to deal with outlier correspondences in a statistically robust way to avoid severe estimation errors.
```cpp
// Compute time-to-collision (TTC) based on keypoint correspondences in successive images
void computeTTCCamera(std::vector<cv::KeyPoint> & kptsPrev,
                      std::vector<cv::KeyPoint> & kptsCurr,
                      std::vector<cv::DMatch> kptMatches,
                      double frameRate,
                      double & TTC,
                      cv::Mat *visImg)
{
    // compute distance ratios between all matched keypoints
    vector<double> distRatios; // stores the distance ratios for all keypoints between curr. and prev. frame

    for(auto it1 = kptMatches.begin(); it1 != kptMatches.end() - 1; ++it1)
    { // outer keypoint loop
        // get current keypoint and its matched partner in the prev. frame
        cv::KeyPoint kpOuterCurr = kptsCurr.at(it1->trainIdx);
        cv::KeyPoint kpOuterPrev = kptsPrev.at(it1->queryIdx);

        for(auto it2 = kptMatches.begin() + 1; it2 != kptMatches.end(); ++it2)
        { // inner keypoint loop
            double minDist = 100.0; // min. required distance

            // get next keypoint and its matched partner in the prev. frame
            cv::KeyPoint kpInnerCurr = kptsCurr.at(it2->trainIdx);
            cv::KeyPoint kpInnerPrev = kptsPrev.at(it2->queryIdx);

            // compute distances and distance ratios
            double distCurr = cv::norm(kpOuterCurr.pt - kpInnerCurr.pt);
            double distPrev = cv::norm(kpOuterPrev.pt - kpInnerPrev.pt);

            if(distPrev > std::numeric_limits<double>::epsilon() && distCurr >= minDist)
            { // avoid division by zero
                double distRatio = distCurr / distPrev;
                distRatios.push_back(distRatio);
            }
        } // eof inner loop over all matched kpts
    } // eof outer loop over all matched kpts

    // only continue if the list of distance ratios is not empty
    if(distRatios.size() == 0)
    {
        TTC = NAN;
        return;
    }

    // compute camera-based TTC from distance ratios
    //double meanDistRatio = std::accumulate(distRatios.begin(), distRatios.end(), 0.0) / distRatios.size();
    std::sort(distRatios.begin(), distRatios.end());
    long medIndex = floor(distRatios.size() / 2.0);
    double medDistRatio = distRatios.size() % 2 == 0 ? (distRatios[medIndex - 1] + distRatios[medIndex]) / 2.0
                                                     : distRatios[medIndex]; // compute median dist. ratio to remove outlier influence

    double dT = 1 / frameRate;
    TTC = -dT / (1 - medDistRatio);
}
```
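The expression in the last line is the standard distance-ratio form of the constant-velocity TTC. As a brief sketch of where it comes from (with $h_0$, $h_1$ the apparent size of the preceding vehicle on the image plane and $d_0$, $d_1$ the metric distances in the previous and current frame):

$$
\frac{h_1}{h_0} = \frac{d_0}{d_1}
\quad\Rightarrow\quad
\mathrm{TTC} = \frac{d_1}{v_0} = \frac{d_1 \cdot \Delta t}{d_0 - d_1}
= \frac{\Delta t}{h_1/h_0 - 1}
= \frac{-\Delta t}{1 - h_1/h_0}
$$

The median of the keypoint distance ratios (`medDistRatio`) serves as the estimate of $h_1/h_0$, which is why the code computes `TTC = -dT / (1 - medDistRatio)`; using the median instead of the mean is what provides the robustness against outlier correspondences.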
This exercise is about conducting tests with the final project code, especially with regard to the Lidar part. Look for several examples where you have the impression that the Lidar-based TTC estimate is way off. Once you have found those, describe your observations and provide a sound argumentation why you think this happened.
The task is complete once several examples (2-3) have been identified and described in detail. The assertion that the TTC is off should be based on manually estimating the distance to the rear of the preceding vehicle from a top view perspective of the Lidar points.
One immediate factor that is evident from the camera images is that the vehicles are facing a red light, which means the preceding vehicles are likely either already stopped at the red light, or slowing to a stop as they approach it. Hence, the TTC should not be very large, and we should look out for implausibly large TTC estimates.
The first implementation of `computeTTCLidar` did not sort the Lidar points. This resulted in the majority of Lidar TTC estimates looking way off. These results can be seen in the following table.
Frame | Lidar points | TTC Lidar (s) | TTC Camera (s) |
---|---|---|---|
1 | 340 | 9.99495 | 15.3278 |
2 | 309 | 46.9361 | 12.7634 |
3 | 326 | 8.76554 | 13.5261 |
4 | 322 | 13.9876 | 13.81 |
5 | 344 | 28.9112 | 13.0276 |
6 | 352 | 15.5119 | 11.426 |
7 | 317 | 40.7209 | 12.8408 |
8 | 312 | 7.13085 | 13.9921 |
9 | 311 | 8.57043 | 11.274 |
10 | 286 | 8.09783 | 35.585 |
11 | 302 | -43.923 | 12.7762 |
12 | 302 | 5.97072 | 11.5836 |
13 | 316 | 48.8604 | 11.4876 |
14 | 312 | 8.62499 | 12.2294 |
15 | 289 | 6.14571 | 10.225 |
16 | 297 | 6.82135 | 10.6857 |
17 | 279 | 702.551 | 10.0052 |
18 | 303 | 9.26663 | 10.5152 |
The following table shows the Lidar TTC estimates side by side with the Lidar top-view perspective. The markers in the top-view images are spaced 2 m apart. Clearly, the car detected in frame 2 is not 46.9361 seconds away; it is approximately 8 metres away. Frame 17 is an extreme example, where the Lidar TTC is 702.551 seconds, while the image shows the preceding vehicle is actually closer to 7 metres away.
The Lidar sensor in the following scenario (image courtesy of Udacity) gives the distance to the closest 3D point in the path of driving. In the figure below, the closest point is indicated by a red line emanating from a Lidar sensor on top of the measuring vehicle.
Based on the constant-velocity model, the velocity v0 can be calculated from two successive Lidar measurements as follows (image courtesy of Udacity):
Once the relative velocity v0 is known, the time to collision can easily be computed by dividing the remaining distance between both vehicles by v0.
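The referenced formulas amount to the following, with d0 the distance to the preceding vehicle in the previous frame, d1 the distance in the current frame, and $\Delta t$ the time between the two measurements (written `dt` in the code); this is consistent with the comment block in the revised code further below:

$$
v_0 = \frac{d_0 - d_1}{\Delta t}, \qquad
\mathrm{TTC} = \frac{d_1}{v_0} = \frac{d_1 \cdot \Delta t}{d_0 - d_1}
$$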
The following table shows a set of manual calculations based on estimates of the distance to the rear of the preceding vehicle, taken from the top-view perspective of the Lidar points. The calculations are in the results spreadsheet. The top-view perspectives of the Lidar points for each frame are shown below, and in the images in the lidar_top_view folder; these are used to estimate the distance to the preceding vehicle in each frame.
A graph of v0 shows the velocity per frame:
The negative velocities in frames 12 and 17 could mean the vehicle is moving backwards, but that is unlikely given that it is just one frame each time. There could be deceleration happening in general from around frame 10, which makes sense if the vehicle is approaching a red light.
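For reference, the per-frame velocities in that graph can be derived directly from the manually estimated distances. The following is a minimal, hypothetical helper (not part of the project code) that assumes a fixed time step of 1 / frameRate between frames:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: per-frame relative velocity v0 from a series of manually estimated
// distances to the rear of the preceding vehicle, using the constant-velocity model.
std::vector<double> computeFrameVelocities(const std::vector<double> & distances, double dt)
{
    std::vector<double> velocities;

    for (std::size_t i = 1; i < distances.size(); ++i)
    {
        // v0 = (d0 - d1) / dt; positive while the gap to the preceding vehicle is shrinking
        velocities.push_back((distances[i - 1] - distances[i]) / dt);
    }

    return velocities;
}
```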
The TTC estimates improved after sorting the Lidar points on closing distance (front-to-rear bumper). This is implemented in `computeTTCLidar` in `camFusion_Student.cpp`:
```cpp
void computeTTCLidar(std::vector<LidarPoint> & lidarPointsPreviousFrame,
                     std::vector<LidarPoint> & lidarPointsCurrentFrame,
                     double frameRate,
                     double & TTC)
{
    // Sort ascending on the x coordinate only
    std::sort(lidarPointsPreviousFrame.begin(), lidarPointsPreviousFrame.end(),
              [](const LidarPoint & a, const LidarPoint & b) { return a.x < b.x; });
    std::sort(lidarPointsCurrentFrame.begin(), lidarPointsCurrentFrame.end(),
              [](const LidarPoint & a, const LidarPoint & b) { return a.x < b.x; });

    /**
     * Using the constant-velocity model (as opposed to a constant-acceleration model):
     *     TTC = d1 * dt / (d0 - d1)
     * where:
     *     d0: the previous frame's closing distance (front-to-rear bumper)
     *     d1: the current frame's closing distance (front-to-rear bumper)
     *     dt: (delta t) the time elapsed between images (1 / frameRate)
     *
     * Note: this function does not take into account the distance from the Lidar origin to the
     * front bumper of our vehicle. It also does not account for the curvature or protrusions
     * from the rear bumper of the preceding vehicle.
     */

    // Take the median x value in each frame as a robust estimate of the closing distance
    double d0 = lidarPointsPreviousFrame[lidarPointsPreviousFrame.size() / 2].x;
    double d1 = lidarPointsCurrentFrame[lidarPointsCurrentFrame.size() / 2].x;
    double dt = 1.0 / frameRate;

    TTC = d1 * dt / (d0 - d1);
}
```
Sorting the Lidar points on the x dimension (distance to the preceding vehicle) before calculating the TTC produces a much better result, as can be seen in the next table.
Frame | Lidar points | TTC Lidar (s) | TTC Camera (s) |
---|---|---|---|
1 | 340 | 12.5156 | 15.3278 |
2 | 309 | 12.6142 | 12.7634 |
3 | 326 | 14.091 | 13.5261 |
4 | 322 | 16.6894 | 13.81 |
5 | 344 | 15.9082 | 13.0276 |
6 | 352 | 12.6787 | 11.426 |
7 | 317 | 11.9844 | 12.8408 |
8 | 312 | 13.1241 | 13.9921 |
9 | 311 | 13.0241 | 11.274 |
10 | 286 | 11.1746 | 35.585 |
11 | 302 | 12.8086 | 12.7762 |
12 | 302 | 8.95978 | 11.5836 |
13 | 316 | 9.96439 | 11.4876 |
14 | 312 | 9.59863 | 12.2294 |
15 | 289 | 8.57352 | 10.225 |
16 | 297 | 9.51617 | 10.6857 |
17 | 279 | 9.54658 | 10.0052 |
18 | 303 | 8.3988 | 10.5152 |
Using this data and the formulae above, we can calculate v0:
When we graph v0, the graph shows that velocity is steadily increasing. This is consistent with the TTC results in the table above: the TTC is steadily decreasing. So, as velocity increases, the TTC decreases, which means we are getting steadily closer to the preceding vehicle.
Again, these tables and graphs are in the FP5 results spreadsheet.
This performance evaluation is about running the different detector / descriptor combinations and looking at the differences in TTC estimation. The goal is to find out which methods perform best and also include several examples where camera-based TTC estimation is way off. As with Lidar, describe your observations again and also look into potential reasons.
The task is complete once all detector / descriptor combinations implemented in previous chapters have been compared with regard to the TTC estimate on a frame-by-frame basis. To facilitate the comparison, a spreadsheet and graphs are used to represent the different TTCs.
The spreadsheet with all results is in the FP6_performance_results.xlsx spreadsheet.
The following graph shows a selection of 9 detector/descriptor combinations for camera-based TTC estimation. The Lidar-based TTC estimate is shown in blue for comparison.
Even among these better-performing camera-based detector/descriptor combinations, several of them diverge significantly from the Lidar-based TTC around frames 4-5, and again around frames 9 and 12.
The following graphs show a selection of the best-performing detector/descriptor combinations for camera-based TTC estimation.
In general, the HARRIS and ORB keypoint detectors resulted in very poor camera-based TTC estimates.
The following 3 graphs show a selection of the worst-performing detector/descriptor combinations for camera-based TTC estimation, where the camera-based TTC is way off.
There are examples where the camera-based TTC is off only for certain frames, but not for others. The following graphs show a selection of these detector/descriptor combinations.
Although the camera-based TTC eventually aligns with the Lidar-based TTC in later frames, they are too inconsistent to be reliable.
According to Altendorfer and Wilkmann (2017) (see references below):
"There is currently no satisfying way to compute an automotive collision probability over a time period: there is a heuristic proposal to pick the maximal collision probability over that period as the collision probability for that time period [Jansson, et al. 2008], and there are calculations relying on strong assumptions (e. g. constant velocity models) that directly compute the collision probability over a time period [Nordlund et al, 2008]."
As noted earlier, the TTC calculations in this project use the constant-velocity model as opposed to a constant-acceleration model. In a real road traffic scenario both the measuring vehicle and the preceding vehicle accelerate and decelerate non-uniformly, and this affects the accuracy of TTC calculations.
- cmake >= 3.1
  - All OSes: click here for installation instructions
- make >= 4.1 (Linux, Mac), 3.81 (Windows)
  - Linux: make is installed by default on most Linux distros
  - Mac: install Xcode command line tools to get make
  - Windows: Click here for installation instructions
- Git LFS
  - Weight files are handled using LFS
- OpenCV >= 4.5.1
  - This must be compiled from source using the `-D OPENCV_ENABLE_NONFREE=ON` cmake flag for testing the SIFT and SURF detectors.
  - The OpenCV 4.1.0 source code can be found here
- gcc/g++ >= 5.4
  - Linux: gcc / g++ is installed by default on most Linux distros
  - Mac: same deal as make - install Xcode command line tools
  - Windows: recommend using MinGW
- Boost >= 1.75.0
  - The Boost C++ Libraries can be found here (see the circular-buffer sketch after this list)
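The Boost dependency, together with the Boost.Circular Buffer reference at the end of this README, suggests that the frame data buffer is a `boost::circular_buffer`. The following is only a sketch of that pattern (with `int` standing in for the project's `DataFrame` type), not the project's actual buffer code:

```cpp
#include <boost/circular_buffer.hpp>
#include <iostream>

int main()
{
    // A ring buffer that keeps only the two most recent frames;
    // pushing a new frame automatically drops the oldest one.
    boost::circular_buffer<int> dataBuffer(2);

    for (int frame = 0; frame < 5; ++frame)
    {
        dataBuffer.push_back(frame);

        if (dataBuffer.size() > 1)
        {
            // Same access pattern as in FinalProject_Camera.cpp: previous and current frame
            int previousFrame = *(dataBuffer.end() - 2);
            int currentFrame = *(dataBuffer.end() - 1);
            std::cout << "prev=" << previousFrame << " curr=" << currentFrame << std::endl;
        }
    }

    return 0;
}
```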
- Clone this repo.
- Make a build directory in the top level project directory: `mkdir build && cd build`
- Compile: `cmake .. && make`
- Run it: `./3D_object_tracking`
Rather than re-running the program manually for each detector-descriptor pair and then manually gathering the results from the console, I run the program once for all detector-descriptor pairs. The project is structured around the model of an `Experiment`. Each `Experiment` runs one detector-descriptor pair and gathers the results. The `main()` function invokes a function called `RunExperimentSet`, which runs a set of experiments, one for each valid detector-descriptor combination.
The `main()` function defines three variables: the hyperparameters, the set of detectors to use, and the set of descriptors to use in the experiments.
```cpp
int main()
{
    Hyperparameters hyperparameters = Hyperparameters();

    std::vector<string> detectors = {
        "Shi_Tomasi",
        "HARRIS",
        "FAST",
        "BRISK",
        "ORB",
        "AKAZE",
        "SIFT"
    };

    std::vector<string> descriptors = {
        "BRISK",
        "BRIEF",
        "ORB",
        "FREAK",
        "AKAZE",
        "SIFT"
    };

    // Run experiments for all combinations of detectors and descriptors
    std::vector<Experiment> experiments = RunExperimentSet(hyperparameters, detectors, descriptors);
    ...
    return 0;
}
```
To run the project for a subset of the detectors and descriptors, simply comment out the ones you do not want to run, for example:
```cpp
std::vector<string> detectors = {
    "Shi_Tomasi",
    // "HARRIS",
    // "FAST",
    // "BRISK",
    // "ORB",
    // "AKAZE",
    "SIFT"
};

std::vector<string> descriptors = {
    "BRISK",
    // "BRIEF",
    // "ORB",
    // "FREAK",
    // "AKAZE",
    "SIFT"
};
```
This scenario results in running a total of 4 experiments based on combinations of 2 detectors and 2 descriptors. The following tables illustrate what the results would look like for this scenario:
The `Hyperparameters` are implemented as a `struct` in `dataStructures.h`:
```cpp
struct Hyperparameters
{
    Hyperparameters() = default;

    string keypointDetector = "Shi_Tomasi"; // Shi_Tomasi, HARRIS, FAST, BRISK, ORB, AKAZE, SIFT
    string descriptor = "BRIEF";            // BRISK, BRIEF, ORB, FREAK, AKAZE, SIFT
    string matcherType = "MAT_BF";          // MAT_BF, MAT_FLANN
    string descriptorType = "DES_BINARY";   // DES_BINARY, DES_HOG
    string selectorType = "SEL_KNN";        // SEL_NN, SEL_KNN
};
```
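As a small, hypothetical usage fragment (in practice the project sets these fields inside `RunExperimentSet`), a single detector/descriptor combination could be selected by overriding the defaults:

```cpp
Hyperparameters hyperparameters;
hyperparameters.keypointDetector = "FAST"; // any detector listed above
hyperparameters.descriptor = "ORB";        // any compatible descriptor
// matcherType, descriptorType, and selectorType keep their defaults
```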
The `RunExperimentSet()` function calls the `RunExperiment()` function once for each valid detector-descriptor combination. The `RunExperiment()` function is where you will find much of the student assignment code.
```cpp
/*
 * This function encapsulates running an experiment with a given combination of detector, descriptor,
 * matcher, descriptor type, and selector.
 */
void RunExperiment(Experiment & experiment)
{
    ...
}
```
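For orientation, here is a minimal sketch of how `RunExperimentSet` could iterate over the combinations. The exact signature and the validity rules (e.g. the AKAZE descriptor only being paired with the AKAZE detector, and the SIFT detector being skipped with the ORB descriptor) are assumptions on my part, chosen so that 7 detectors × 6 descriptors reduce to the 35 experiments reported in the additional project files:

```cpp
// Sketch only (assumed signature): run one Experiment per valid detector/descriptor pair.
std::vector<Experiment> RunExperimentSet(const Hyperparameters & baseHyperparameters,
                                         const std::vector<std::string> & detectors,
                                         const std::vector<std::string> & descriptors)
{
    std::vector<Experiment> experiments;

    for (const auto & detector : detectors)
    {
        for (const auto & descriptor : descriptors)
        {
            // Assumed validity rules: AKAZE descriptors require the AKAZE detector,
            // and the SIFT detector is not combined with the ORB descriptor.
            if ((descriptor == "AKAZE" && detector != "AKAZE") ||
                (detector == "SIFT" && descriptor == "ORB"))
            {
                continue;
            }

            Experiment experiment;
            experiment.hyperparameters = baseHyperparameters;
            experiment.hyperparameters.keypointDetector = detector;
            experiment.hyperparameters.descriptor = descriptor;

            RunExperiment(experiment); // fills experiment.resultSet
            experiments.push_back(experiment);
        }
    }

    return experiments;
}
```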
An `Experiment` uses the `Hyperparameters` described earlier and stores the results. It also has some configuration options that can be set via the `Experiment` struct in `dataStructures.h`:
```cpp
struct Experiment
{
    Experiment() = default;

    std::vector<ResultSet> resultSet;
    Hyperparameters hyperparameters;

    // Visualization and image saving options
    bool displayImageWindows = false;              // visualize matches between current and previous image?
    bool isFocusOnPrecedingVehicleOnly = true;     // only keep keypoints on the preceding vehicle?
    bool saveKeypointDetectionImagesToFile = true; // save keypoint detection images to file
    bool saveKeypointMatchImagesToFile = true;     // save keypoint matching images to file
};
```
The configuration shown above means that the image windows are not displayed when running the program. This allows a set of experiments to run quickly without needing any human intervention. If you want to see the images during an experiment run, set the `displayImageWindows` variable to `true`:
```cpp
struct Experiment
{
    ...
    bool displayImageWindows = true; // visualize matches between current and previous image?
    ...
};
```
- Rangesh, A. and Trivedi, M.M., 2019. No blind spots: Full-surround multi-object tracking for autonomous vehicles using cameras and lidars. IEEE Transactions on Intelligent Vehicles, 4(4), pp.588-599.
- Kilicarslan, M. and Zheng, J.Y., 2018. Predict vehicle collision by TTC from motion using a single video camera. IEEE Transactions on Intelligent Transportation Systems, 20(2), pp.522-533.
- Huang, J.K. and Grizzle, J.W., 2019. Improvements to Target-Based 3D LiDAR to Camera Calibration. arXiv preprint arXiv:1910.03126.
- Raphael, E., Kiefer, R., Reisman, P. and Hayon, G., 2011. Development of a camera-based forward collision alert system. SAE International Journal of Passenger Cars-Mechanical Systems, 4(2011-01-0579), pp.467-478.
- Altendorfer, R. and Wilkmann, C., 2017. A New Approach To Estimate The Collision Probability For Automotive Applications. arXiv preprint arXiv:1711.07060.
- Jansson, J. and Gustafsson, F., 2008. A framework and automotive application of collision avoidance decision making. Automatica, 44(9), pp.2347-2351.
- Nordlund, P.J. and Gustafsson, F., 2008. Probabilistic conflict detection for piecewise straight paths. Linköping University Electronic Press.
- Jiang, B., You, S. and Neumann, U., 2000, July. Camera tracking for augmented reality media. In 2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings. Latest Advances in the Fast Changing World of Multimedia (Cat. No. 00TH8532) (Vol. 3, pp. 1637-1640). IEEE.
- IntoRobotics, April 1, 2013 (Updated November 25 2016) How to Detect and Track Object With OpenCV.
- Jeremy Behreandt, 2018. Cameras in Processing (2D and 3D). Medium.
- Jan Gaspar. Chapter 7. Boost.Circular Buffer. Boost C++ Libraries.
- StackOverflow. How to link C++ program with Boost using CMake.
This repository contains the following additional files and folders:
- 3DObjectTracking_Results.xlsx: A spreadsheet used to analyse the data and create the charts for FP5 performance analysis.
- FP6_performance_results.csv: A CSV file with all the raw data for FP6 performance analysis.
- FP6_performance_results.xlsx: A spreadsheet used to analyse the data and create the charts for FP6 performance analysis.
- output_perf_tables.md: A markdown file with data tables and output images from all 35 experiments.
- output_perf_recommended.md: A markdown file showing a log trace of running the program with detailed logging turned on. Useful for debugging.
- TTC_Comparison_Charts: A folder containing a set of images exported from the analysis spreadsheets.
- Combined Lidar and Camera TTC images: A set of images created by this program. There is an image for each detector/descriptor pair showing TTC estimates from camera and lidar for each frame.
- Lidar top view perspective images: A set of images created by this program. Each image shows the Lidar top view perspective, with distance markers, number of Lidar points, distance to preceding vehicle, and width of preceding vehicle.