Feb 6 • 11M

Vision Transformers for Satellite Image Time Series with Michail Tarasiou

An exciting step towards high-accuracy, automated crop mapping from space

Dive into the world of deep learning for satellite images with your host, Robin Cole. Robin meets with experts in the field to discuss their research, products, and careers in the space of satellite image deep learning. Stay up to date on the latest trends and advancements in the industry - whether you’re an expert in the field or just starting to learn about satellite image deep learning, this is a podcast for you. Head to https://www.satellite-image-deep-learning.com/ to learn more about this fascinating domain.
Episode details

In this episode, Robin catches up with Michail Tarasiou to discuss the new paper, ViTs for SITS: Vision Transformers for Satellite Image Time Series. The paper introduces the temporo-spatial vision transformer (TSViT) architecture, which incorporates novel design choices that make it suitable for time series tasks such as crop classification. TSViT crop classification and segmentation models are trained and evaluated on Sentinel-2 datasets and achieve state-of-the-art (SOTA) results on these tasks by a significant margin. This is an exciting step towards high-accuracy, low-cost, automated crop mapping using remote sensing imagery.

Paper authors: Michail Tarasiou, Erik Chavez, Stefanos Zafeiriou