Datasets for machine & deep learning

A short introduction to the remote-sensing-datasets repository

Discovering the right dataset for a new project can be a time-consuming and frustrating process. Information about datasets is scattered across the internet, and it can be hard to get started with a dataset without examples of how to load and use it. I created the remote-sensing-datasets repository to address this pain point, and over several years it has matured into a comprehensive source of information about remote sensing datasets. The datasets listed cover a wide range of challenges, from object detection to data fusion and change detection. They include both well-known benchmark datasets, such as xView and DOTA, and relatively unknown datasets, such as the Oil and Gas Tank Dataset and the AIST Building Change Detection dataset.

Several factors distinguish this repository from other resources on the internet:

  1. This repository is actively maintained and regularly updated; no dead links!

  2. Many datasets are accompanied by Jupyter notebooks, codebases and papers that use them, enabling you to hit the ground running 🏃

  3. You can start discussions about datasets on the repository itself

Additionally, the repository contains substantial sections on:

  • ANNOTATION - how to annotate raw imagery & create an ML training dataset?

  • MODEL TRAINING - how and where to train ML models?

  • MODEL DEPLOYMENT - how to serve an ML model to end users?

  • SOFTWARE - for working with image datasets

  • MISCELLANEOUS - covers general utilities, graphing and visualisation, and more!

👉 access remote-sensing-datasets 👈