New discoveries #21
Canopy height prediction from Sentinel-2 images, SATIN & SAMRS datasets, and technical factors to consider when designing neural networks for semantic segmentation
Welcome to the 21st edition of the newsletter. I'm delighted to share that the newsletter now has 7.3k subscribers 🔥 Please note that this edition of the newsletter does not have a sponsor. If you're interested in gaining visibility for your business or service, sponsoring a future edition of the newsletter is an excellent way to achieve this. As a sponsor, you'll receive a shout-out in the opening statement and a dedicated section in the newsletter, reaching a wide audience in the community. For more information on how to sponsor the newsletter please email me 📧
This month the White House issued an Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. This comes at a time of intense discussion about the role of regulation in AI development. Key figures in the AI community fall into two camps: one group argues that AI poses significant risks to humanity and must be strongly regulated, whilst the other argues that the risks are overstated and that excessive regulation will stifle innovation, ultimately benefiting large corporations at the expense of startups and open source initiatives. My own views on regulation reflect those of Andrew Ng, but whatever your views I believe it's crucial for those of us actively working with AI to engage in these discussions, as we bring a nuanced perspective that is often absent in regulatory and political circles.
Canopy height prediction from Sentinel-2 images
The paper A high-resolution canopy height model of the Earth presents a probabilistic deep learning model which has been developed to retrieve canopy top height from Sentinel-2 images anywhere on Earth. The model is an ensemble of convolutional neural networks (CNNs), trained using canopy top height data derived from NASA's GEDI LIDAR.
The uncertainty of model predictions is quantified by modelling both the data uncertainty and the model uncertainty. This is particularly significant given that neural networks are often criticised for their 'black box' nature. By addressing model uncertainty head-on, this approach may pave the way for more accountable and interpretable solutions in ecological studies, potentially increasing the adoption and trust of such approaches.
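To make the two kinds of uncertainty concrete, here is a minimal sketch of how they are commonly separated in an ensemble. It assumes each ensemble member predicts a mean height and a variance per pixel; the function name and array shapes are my own illustration and are not taken from the paper's code.

```python
import numpy as np


def combine_ensemble(means, variances):
    """Combine per-member predictions into one estimate with uncertainty.

    means, variances: arrays of shape (n_members, H, W), where each member
    predicts a canopy height and a variance for every pixel (an assumption
    made for this sketch).
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    # Ensemble prediction: average of the member means.
    mean = means.mean(axis=0)
    # Data uncertainty: average of the variances the members predict.
    data_var = variances.mean(axis=0)
    # Model uncertainty: how much the member means disagree with each other.
    model_var = means.var(axis=0)

    return mean, data_var, model_var, data_var + model_var


# Two hypothetical members predicting a single pixel.
mean, data_var, model_var, total_var = combine_ensemble(
    means=[[[10.0]], [[14.0]]],
    variances=[[[1.0]], [[3.0]]],
)
```

Pixels where the members disagree strongly get a large model variance, which is a useful signal for flagging predictions that should not be trusted.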
SATIN: A Multi-Task Remote Sensing Meta-dataset
The Satellite ImageNet (SATIN) meta-dataset is a comprehensive collection of resources designed to train, evaluate, and analyse vision-language (VL) models for satellite and aerial imagery. VL models are designed to generate meaningful outputs based on images and text, enabling functionalities like image captioning, visual question-answering, and object detection with contextual annotations.
Satellite imagery databases often go under-utilised because of their daunting size and complexity. VL models offer a more intuitive and accessible way to interact with this data, which could unlock the untapped potential of these invaluable resources.
SAMRS: Scaling-up Remote Sensing Segmentation Datasets with Segment Anything
The Segment Anything Model (SAM) received significant attention on social media for its ability to segment images across diverse visual domains. A key challenge in Remote Sensing (RS) is the laborious and costly process of pixel-level annotation, which leaves a wealth of RS data untapped.
The paper SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model showcases how SAM can be applied to existing object detection datasets to create SAMRS, a large-scale, efficiently-annotated segmentation dataset for RS applications. The potential for using SAM to enrich and broaden both current and upcoming datasets is extremely promising. In my own experience, the integration of SAM in the Roboflow platform has significantly accelerated dataset annotation.
Technical factors to consider when designing neural networks for semantic segmentation
The paper A review of technical factors to consider when designing neural networks for semantic segmentation of Earth Observation imagery presents a comprehensive discussion of the technical factors to consider when designing neural networks for semantic segmentation of satellite imagery. It focuses on Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs), and transformer models, discussing prominent design patterns for these networks and their implications for semantic segmentation.
Common pre-processing techniques for ensuring optimal data preparation are also covered. These include methods for image normalisation and chipping, as well as strategies for addressing data imbalance in training samples, and techniques for overcoming limited data, including augmentation techniques, transfer learning, and domain adaptation.
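As a rough illustration of three of these steps (normalisation, chipping, and class weighting for imbalanced data), here is a minimal NumPy sketch. The function names, the percentile range, and the chip size are assumptions chosen for the example; the review paper discusses these techniques in general terms and does not prescribe this implementation.

```python
import numpy as np


def normalise(image, low=2.0, high=98.0):
    """Scale each band to [0, 1] using percentile clipping.

    image: array of shape (bands, H, W).
    """
    image = image.astype(np.float32)
    lo = np.percentile(image, low, axis=(1, 2), keepdims=True)
    hi = np.percentile(image, high, axis=(1, 2), keepdims=True)
    # Guard against division by zero for constant bands.
    scaled = (image - lo) / np.maximum(hi - lo, 1e-6)
    return np.clip(scaled, 0.0, 1.0)


def chip(image, size=256, stride=256):
    """Cut an image of shape (bands, H, W) into square chips.

    Pixels at the right and bottom edges that do not fill a whole chip
    are dropped in this simple version.
    """
    _, height, width = image.shape
    chips = []
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            chips.append(image[:, top:top + size, left:left + size])
    return chips


def class_weights(mask, n_classes):
    """Inverse-frequency class weights, normalised to sum to 1.

    mask: integer array of class labels. Classes absent from the mask
    receive a weight of 0.
    """
    counts = np.bincount(mask.ravel(), minlength=n_classes).astype(np.float64)
    freq = counts / counts.sum()
    weights = np.where(counts > 0, 1.0 / np.maximum(freq, 1e-12), 0.0)
    return weights / weights.sum()


# Hypothetical 4-band scene and label mask.
scene = np.random.default_rng(0).integers(0, 10000, size=(4, 512, 512))
chips = chip(normalise(scene), size=256, stride=256)
weights = class_weights(np.array([[0, 0], [0, 1]]), n_classes=2)
```

Weights like these can be passed to a weighted loss function so that rare classes contribute more to training. A stride smaller than the chip size produces overlapping chips, which is one simple way to get more training samples from a limited scene.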
By encompassing both the technical aspects of neural network design and the data-related considerations, this review provides a comprehensive and up-to-date understanding of the factors involved in designing effective neural networks for semantic segmentation of satellite imagery.
📖 Paper
Poll
In the last poll, I explored the motivations for owning a GPU. The largest share of respondents (34%) cited lack of access to a cloud GPU, while 24% mentioned cost control as their primary reason. Data security and cloud-related challenges received fewer votes. Despite the availability of free cloud GPU platforms like Google Colab and Kaggle, some still face barriers to access. This poll aims to delve into those specific obstacles.
Consulting
If you need expert guidance on any of the following topics, I’m available for video call consulting:
Applying machine learning techniques to satellite and aerial imagery, including dataset selection, model training, and deployment
Building data processing pipelines in the cloud
Appraisal of your product offering
Building your brand and community for technical products
Personal career development
As an experienced consultant, I offer customised advice and practical solutions to help you achieve your goals in these areas. To discuss this service please email me 📧
Testimonials
Gregor Beyerle, HAKOM Time Series GmbH: Working with Dr Cole was exactly what we needed: uncomplicated and insightful. It is wonderful to see that the EO space is finally big enough for independent consultants that don’t have to sell you on their companies’ products.
Constantine Papadakis, Oak Wilt Mapper: It was an exceptional experience collaborating with Robin on my project at Oak Wilt Mapper. His expertise in machine learning, Python development, and space imaging technologies proved invaluable in fleshing out a technical solution that not only met but exceeded my expectations. Robin’s ability to answer my questions and provide clear and concise explanations of complex concepts was incredibly helpful. I would highly recommend him to anyone seeking a top-notch consultant.
Sebastian Gutierrez, Data Science Weekly: We reached out to Dr. Cole with a set of questions we thought he could answer and provide guidance upon. He over-delivered on the value he provided not only during our live chats but also with material for us to read through (before and after our chat). He compared different approaches without ‘putting his thumb on the scale’ and ultimately I think we were able to choose the right solution for us all due to his advice. 10/10, would highly recommend.
Pushkar Kopparla, University of Bern: I reached out to Robin for advice on deep learning self-study. We had an excellent discussion, and I got clear and insightful pointers on how I could leverage my existing expertise to step into deep learning projects. Highly recommended!