Benchmark (surveying)

About

A benchmark in robotics and AI is a standardized dataset, task, or evaluation framework used to measure and compare the performance of algorithms, models, or systems under consistent conditions. Benchmarks typically consist of carefully collected data (such as camera images, LiDAR point clouds, depth sequences, or other sensor recordings) paired with ground-truth annotations and defined metrics, so that researchers can assess progress objectively. They drive development across core challenges including visual odometry, SLAM, object detection, pedestrian tracking, depth estimation, and autonomous driving perception. Landmark examples such as the KITTI vision benchmark suite, the TUM RGB-D dataset, and Argoverse provide real-world scenarios with precise ground truth, allowing fair head-to-head comparisons of competing approaches.

Benchmarks matter because they create shared goalposts that accelerate progress, expose the limitations of current methods, and discourage tuning to narrow, self-selected test conditions. Without rigorous benchmarks, claims of improvement are difficult to validate or reproduce. By establishing common evaluation protocols, benchmarks foster transparency, guide research priorities, and shorten the path from laboratory algorithms to reliable robotic systems deployed in real-world environments.
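The pairing of ground truth with a fixed metric can be illustrated with a small sketch. The example below scores two hypothetical trajectory estimators with a simplified absolute trajectory error (RMSE of position differences), a style of metric used in SLAM evaluation. It assumes the trajectories are already time-synchronized and expressed in the same frame, so it skips the alignment step that real benchmark tooling performs. The function names, method names, and data are invented for illustration and are not taken from any benchmark's official tools.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of translational error between time-matched 3D positions.

    Assumption: both trajectories are already aligned and synchronized,
    so position i in one corresponds to position i in the other.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rank_methods(results, ground_truth):
    """Score every method with the same metric and sort best (lowest) first."""
    scores = {name: ate_rmse(traj, ground_truth) for name, traj in results.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical ground truth and two hypothetical estimators (positions in meters).
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
results = {
    "method_a": [(0.0, 0.0, 0.0), (1.0, 0.1, 0.0), (2.0, 0.2, 0.0)],
    "method_b": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.1)],
}

for name, score in rank_methods(results, ground_truth):
    print(f"{name}: {score:.4f} m")
```

Because every method is scored against the same ground truth with the same function, the resulting ranking is comparable across submissions, which is the property the description above attributes to benchmarks.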

Top Researchers

Jean-Jacques Slotine

Institution: —

R.C. Eberhart

Institution: —

James Kennedy

Institution: —

Andreas Geiger

Institution: —

P Lenz

Institution: —

R. Urtasun

Institution: —

Vincent Vanhoucke

Institution: —

Oriol Vinyals

Institution: —

Raquel Urtasun

Institution: —

Sebastian Thrun

Institution: —

Top Institutes

Technical University of Munich (DE), 15 papers
Stanford University (US), 15 papers
Carnegie Mellon University (US), 13 papers
University of Freiburg (DE), 12 papers
Centre National de la Recherche Scientifique (FR), 12 papers
Google (United States) (US), 10 papers
ETH Zurich (CH), 8 papers
University of California, Berkeley (US), 8 papers

Top Cited Papers

A new optimizer using particle swarm theory

R.C. Eberhart, James Kennedy

Citations: 14853 • 2002

Are we ready for autonomous driving? The KITTI vision benchmark suite

Andreas Geiger, P Lenz, R. Urtasun

Citations: 14348 • 2012

VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection

Yin Zhou, Oncel Tuzel

Citations: 4542 • 2018

A benchmark for the evaluation of RGB-D SLAM systems

Jürgen Sturm, Nikolas Engelhard, Felix Endres, Wolfram Burgard, Daniel Cremers

Citations: 3918 • 2012

A State-of-the-Art Survey on Deep Learning Theory and Architectures

Md Zahangir Alom, Tarek M. Taha, Chris Yakopcic, Stefan Westberg, Paheding Sidike, Mst Shamima Nasrin, Mahmudul Hasan, Brian C. Van Essen, Abdul Ahad S. Awwal, Vijayan K. Asari

Citations: 1593 • 2019

Argoverse: 3D Tracking and Forecasting With Rich Maps

Ming-Fang Chang, John Lambert, Patsorn Sangkloy, Jagjeet Singh, Sławomir Bąk, Andrew T. Hartnett, De Wang, Peter Carr, Simon Lucey, Deva Ramanan, James Hays

Citations: 1420 • 2019

Pedestrian detection: A benchmark

Piotr Dollár, Christian Wojek, Bernt Schiele, Pietro Perona

Citations: 1339 • 2009

Electromyography data for non-invasive naturally-controlled robotic hand prostheses

Manfredo Atzori, Arjan Gijsberts, Claudio Castellini, Barbara Caputo, Anne-Gabrielle Mittaz Hager, Simone Elsig, Giorgio Giatsidis, Franco Bassetto, Henning Müller

Citations: 931 • 2014

3-D Mapping With an RGB-D Camera

Felix Endres, Jürgen Hess, Jürgen Sturm, Daniel Cremers, Wolfram Burgard

Citations: 822 • 2013

Learning to Track: Online Multi-object Tracking by Decision Making

Yu Xiang, Alexandre Alahi, Silvio Savarese

Citations: 716 • 2015

Large Scale Multi-view Stereopsis Evaluation

Rasmus Ramsbøl Jensen, Anders Bjorholm Dahl, George Vogiatzis, Engin Tola, Henrik Aanæs

Citations: 676 • 2014

Simultaneous Localization and Mapping with Sparse Extended Information Filters

Sebastian Thrun, Yufeng Liu, Daphne Koller, Andrew Y. Ng, Zoubin Ghahramani, Hugh Durrant‐Whyte

Citations: 648 • 2004

Swarm of micro flying robots in the wild

Xin Zhou, Xiangyong Wen, Zhepei Wang, Yuman Gao, Haojia Li, Qianhao Wang, Tiankai Yang, Haojian Lu, Yanjun Cao, Chao Xu, Fei Gao

Citations: 539 • 2022

Computer vision and deep learning techniques for pedestrian detection and tracking: A survey

Antonio Brunetti, Domenico Buongiorno, Gianpaolo Francesco Trotta, Vitoantonio Bevilacqua

Citations: 533 • 2018

Need for Speed: A Benchmark for Higher Frame Rate Object Tracking

Hamed Kiani Galoogahi, Ashton Fagg, Chen Huang, Deva Ramanan, Simon Lucey

Citations: 524 • 2017

First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations

Guillermo Garcia-Hernando, Shanxin Yuan, Seungryul Baek, Tae‐Kyun Kim

Citations: 511 • 2018

Self-Supervised Sparse-to-Dense: Self-Supervised Depth Completion from LiDAR and Monocular Camera

Fangchang Ma, Guilherme V. Cavalheiro, Sertaç Karaman

Citations: 471 • 2019

Policy search for motor primitives in robotics

Jens Kober, Jan Peters

Citations: 437 • 2010

The History Began from AlexNet: A Comprehensive Survey on Deep Learning Approaches

Md Zahangir Alom, Tarek M. Taha, Christopher Yakopcic, Stefan Westberg, Paheding Sidike, Mst Shamima Nasrin, Brian C. Van Essen, Abdul Ahad S. Awwal, Vijayan K. Asari

Citations: 436 • 2018

Recording and Playback of Camera Shake: Benchmarking Blind Deconvolution with a Real-World Database

Rolf Köhler, Michael Hirsch, Betty J. Mohler, Bernhard Schölkopf, Stefan Harmeling

Citations: 435 • 2012

Related Technologies

Generalization
Artificial intelligence
Consistency (knowledge bases)
Computer science
Machine learning
Variety (cybernetics)
Process (computing)
Function (biology)
Mathematics
Unsupervised learning