In this episode I sat down with Isaac to discuss RF-DETR, a new state-of-the-art family of real-time object detection and segmentation models from Roboflow. We cover the motivation for building models that are not just accurate but also fast, cost-efficient, and deployable across diverse hardware and data regimes, and why moving beyond fixed architectures is key to achieving that. Isaac explains how RF-DETR combines strong foundation backbones like DINOv2 with efficient neural architecture search to unlock novel speed–accuracy trade-offs, including dropping decoder layers and queries after training. We also discuss the model’s strong transfer performance on domains far from COCO, the introduction of a memory-efficient instance segmentation head, and the team’s unusually rigorous benchmarking approach, before closing on the challenges of open-source research and upcoming improvements to inference and platform integration.
Bio: Isaac Robinson is a Machine Learning Research Engineer at Roboflow. He’s worked across the field of computer vision, from real-time stereo depth estimation on household robots to biomedical research at the NIH to founding a zero-shot computer vision infrastructure startup. Isaac focusses on the intersection of low latency and high performance, with the goal of helping people unlock new capabilities through vision.











