New discoveries #5
SAR landslide detection, 3DUNetGSFormer, Yosemite tree & RF100 datasets, Datasette
Welcome to the 5th edition of the newsletter. I am delighted to share that the newsletter has now passed 1530 subscribers 🥳 Please note that this edition of the newsletter does not have a sponsor. As a sponsor, you'll receive a shout-out in the opening statement and a dedicated section in the newsletter, reaching a wide audience in the community. If you're interested in gaining visibility for your business or service, sponsoring a future edition of the newsletter is an excellent way to achieve this. For more information on how to sponsor the newsletter, please email me 📧
SAR landslide detection
Synthetic Aperture Radar (SAR) is an active remote sensing technique that is unaffected by weather conditions. Training deep learning models typically requires large labeled datasets, which in the case of landslides are often not available for the specific region in which the event occurred. This paper demonstrates how deep learning algorithms for landslide segmentation on SAR products can benefit from pre-training on a simpler task and from data from different regions. A two-stage approach is taken: (1) a classifier is trained to identify whether a SAR image contains any landslides, (2) a segmentation model is trained in a sparsely labeled scenario where half of the data do not contain landslides. The authors find that this process leads to a significantly lower false positive rate in areas without landslides and an improved estimate of the average number of landslide pixels in a chip. It is encouraging to see progress on this application, as this technique could enable better disaster response. More generally, the paper demonstrates an approach that can be applied to a wide variety of problems where large amounts of labeled data are not available. Authors: Vanessa Böhm, Wei Ji Leong, Ragini Bal Mahesh, Ioannis Prapas, Edoardo Nemni, Freddie Kalaitzis, Siddha Ganju, Raul Ramos-Pollan
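To make the two reported metrics concrete, here is a minimal sketch of how they could be computed from binary masks. This is my own illustration, not the authors' evaluation code: the function names are hypothetical, the masks are tiny made-up chips, and plain Python lists are used for portability (in practice you would likely use NumPy arrays).

```python
def has_landslide(mask):
    """True if any pixel in a binary mask (list of rows of 0/1) is landslide."""
    return any(any(row) for row in mask)


def count_pixels(mask):
    """Number of pixels set to 1 in a binary mask."""
    return sum(sum(row) for row in mask)


def false_positive_rate_on_empty_chips(true_masks, pred_masks):
    """Share of pixels predicted as landslide, over chips with no true landslide."""
    false_positives = 0
    total_pixels = 0
    for true, pred in zip(true_masks, pred_masks):
        if has_landslide(true):
            continue  # only chips without any landslide are considered here
        false_positives += count_pixels(pred)
        total_pixels += sum(len(row) for row in pred)
    return false_positives / total_pixels if total_pixels else 0.0


def mean_landslide_pixels(masks):
    """Average number of landslide pixels per chip."""
    return sum(count_pixels(mask) for mask in masks) / len(masks)


# Two made-up 2x2 chips: the first has no landslide, the second has two pixels.
true_masks = [[[0, 0], [0, 0]], [[1, 1], [0, 0]]]
pred_masks = [[[0, 1], [0, 0]], [[1, 0], [0, 0]]]

print(false_positive_rate_on_empty_chips(true_masks, pred_masks))  # 0.25
print(mean_landslide_pixels(true_masks), mean_landslide_pixels(pred_masks))  # 1.0 1.0
```

Note how the two metrics capture different failure modes: in this toy example the predicted average pixel count matches the truth, yet a quarter of the pixels in the empty chip are false positives.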
3DUNetGSFormer: complex wetland mapping using GAN & Swin transformer
Wetlands are at risk due to climate change and human activities, and wetland mapping is critical for ecosystem monitoring. Modelling approaches are currently hampered by the limited availability of ground-truth data for large-scale wetland mapping. This paper proposes a 3D UNet Generative Adversarial Network Swin Transformer (3DUNetGSFormer) to adaptively synthesize wetland training data only for wetland classes with a limited amount of training data. Both real and synthesized training data are then fed into a deep learning architecture consisting of a CNN feature extractor and a Swin transformer classifier. This approach yielded meaningful improvements compared to using only the real wetland data. An ablation study demonstrated that the feature extractor plays a significant role in improving the accuracy of wetland mapping. However, the authors note that the proposed framework costs much more in time and hardware than the other algorithms they implemented, and suggest that future research should focus on developing a more efficient model. Overall, this is a very interesting state-of-the-art approach, and I anticipate further refinements which will make the solution more appropriate for real-world applications. Authors: Ali Jamali, Masoud Mahdianpari, Brian Brisco, Dehua Mao, Bahram Salehi, Fariba Mohammadimanesh
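The idea of synthesizing data only for under-represented classes can be sketched as a simple planning step. This is a rough illustration under my own assumptions, not the paper's method: the threshold rule, the function name `synthesis_plan`, the class names and the sample counts are all hypothetical, and the actual generation would be done by the GAN.

```python
def synthesis_plan(class_counts, min_samples):
    """Return how many synthetic samples to request for each scarce class.

    Classes that already have at least `min_samples` real samples are left out,
    so no synthetic data is generated for them.
    """
    return {
        name: min_samples - count
        for name, count in class_counts.items()
        if count < min_samples
    }


# Illustrative class names and counts only, not figures from the paper.
real_counts = {"bog": 1200, "fen": 150, "marsh": 900, "swamp": 300}
plan = synthesis_plan(real_counts, min_samples=500)
print(plan)  # {'fen': 350, 'swamp': 200}
```

A fixed threshold is the simplest possible rule; the paper describes its approach as adaptive, so the real criterion is presumably more involved.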
Yosemite tree dataset
The Yosemite Tree Dataset is a benchmark dataset for tree counting from aerial images. The study area is a 19,200 x 38,400 pixel image with 98,949 individual trees labeled. The study area is split into four regions of the same size. Regions B and D are used to train models to count trees. Regions A and C are used as the test set.
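For anyone wiring this dataset into a training script, the split described above could be captured in a small config like the sketch below. The variable and function names are my own, and the spatial layout of the four regions is not stated here, so the code only records which region belongs to which split and derives a couple of summary numbers.

```python
# Figures quoted from the dataset description.
IMAGE_SIZE_PX = (19_200, 38_400)  # axis order (width vs height) is not stated
TOTAL_TREES = 98_949
REGIONS = ("A", "B", "C", "D")

# Regions B and D are used for training, A and C for testing.
SPLIT = {"train": ("B", "D"), "test": ("A", "C")}


def split_of(region):
    """Return 'train' or 'test' for a region letter."""
    for split_name, regions in SPLIT.items():
        if region in regions:
            return split_name
    raise ValueError(f"unknown region: {region!r}")


total_pixels = IMAGE_SIZE_PX[0] * IMAGE_SIZE_PX[1]
pixels_per_region = total_pixels // len(REGIONS)
# Mean density over the whole image; individual regions may differ.
trees_per_megapixel = TOTAL_TREES / (total_pixels / 1e6)

print(pixels_per_region)  # 184320000
print(f"{trees_per_megapixel:.1f}")  # 134.2
print(split_of("B"), split_of("C"))  # train test
```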
RF100: A New Object Detection Benchmark
Roboflow is a platform for dataset annotation, model training & deployment. The team at Roboflow this week released a new object detection benchmark dataset called RF100. Currently most object detection papers/models are benchmarked on a single dataset such as the COCO dataset, but this does not always provide a good guide to performance on subdomains such as aerial or medical imaging. RF100 is compiled from 100 real-world datasets that span a range of domains. The aim is that evaluating on this dataset will give a more nuanced picture of how a model will perform in different domains. The GitHub page provides examples for fine-tuning and evaluating YOLOv5, YOLOv7 and CLIP on the dataset. I think the benefits of this new benchmark dataset are clear to people who work in subdomains such as aerial imaging, and it will be interesting to see if this benchmark dataset becomes adopted by the wider computer vision community.
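The value of a multi-domain benchmark is that scores can be reported per domain rather than as a single number. Here is a minimal sketch of that aggregation. It is not taken from the RF100 repository: the function name, dataset names, domain labels and scores are all placeholders I made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_domain(results):
    """Average a per-dataset score within each domain.

    `results` maps dataset name -> (domain, score).
    """
    by_domain = defaultdict(list)
    for domain, score in results.values():
        by_domain[domain].append(score)
    return {domain: mean(scores) for domain, scores in by_domain.items()}


# Placeholder values only, not real benchmark results.
results = {
    "dataset_a": ("aerial", 0.25),
    "dataset_b": ("aerial", 0.75),
    "dataset_c": ("medical", 0.50),
}

for domain, score in mean_score_by_domain(results).items():
    print(f"{domain}: {score:.2f}")
```

A model that looks strong on a single overall average could still be weak in one domain, which is exactly what a per-domain breakdown like this would reveal.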
Datasette for geospatial analysis
Datasette is an open-source tool for exploring and publishing data. Essentially, if your data/metadata can be put into a SQLite database file, Datasette can quickly create a powerful UI for exploring the data. I have used this tool myself for a couple of projects, and was delighted to discover it now has a number of plugins and tools that can be used to work with geospatial data. I think this could be particularly useful for managing geospatial datasets.
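To give a feel for the workflow, here is a minimal sketch that writes a few point locations into a SQLite file using only the Python standard library. The table name, column names, file name and example rows are my own assumptions for illustration; the coordinates are approximate.

```python
import sqlite3


def build_sites_db(path, sites):
    """Create (or update) a SQLite file with one row per named location."""
    conn = sqlite3.connect(path)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS sites ("
            "name TEXT PRIMARY KEY, latitude REAL, longitude REAL)"
        )
        conn.executemany(
            "INSERT OR REPLACE INTO sites (name, latitude, longitude) "
            "VALUES (?, ?, ?)",
            sites,
        )
    conn.close()


if __name__ == "__main__":
    # Illustrative rows only.
    example_sites = [
        ("site_a", 37.75, -119.59),
        ("site_b", 51.50, -0.12),
    ]
    build_sites_db("sites.db", example_sites)
```

With Datasette installed, running `datasette sites.db` from the command line should then serve the table in a browser. As far as I know, the datasette-cluster-map plugin will plot tables that have `latitude` and `longitude` columns on a map, which is why those column names are used here.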
Weekly poll
Last week I asked ‘Is Python your primary language?’, and 90% said yes, indicating this is the language to learn if you wish to collaborate with colleagues 🐍
Consulting with Robin
If you need expert guidance on any of the following topics, I’m available for hourly video call consulting:
Applying machine learning and deep learning techniques to satellite and aerial imagery, including dataset selection, model training, and deployment.
Understanding the physics of remote sensing imaging systems.
Building data processing pipelines in the cloud.
Building your brand and community for technical products.
Personal career development.
As an experienced consultant, I offer customised advice and practical solutions to help you achieve your goals in these areas. To discuss this service please email me 📧