Point annotation dataset of stranded whale and dolphin species identified in very high-resolution optical and SAR satellite imagery along offshore islands of New Zealand and Tasmania between 2018-2023
This dataset presents point annotations of stranded whale (sperm whales, Physeter macrocephalus) and dolphin (pilot whales, Globicephala melas edwardii) species identified in very high-resolution (VHR) optical and SAR satellite imagery, along offshore islands of New Zealand and Tasmania, between 2018 and 2023. Cetacean strandings offer significant conservation value for the assessment of ecosystems and serve as early warning of emerging concerns regarding animal, ocean, and human health. However, stranding monitoring programmes are scarce or non-existent along minimally populated areas, coastlines with limited economic resources, geographically remote areas, complex coastlines and areas of geopolitical unrest. VHR satellite imagery offers the prospect of improving monitoring in these regions. While VHR satellite imagery is able to detect large baleen whale strandings, mass strandings predominantly involve smaller-sized odontocetes (toothed whale and dolphin species). Detecting odontocetes is therefore crucial for VHR satellites to be useful for monitoring strandings globally. In addition, scaling up the use of VHR optical satellite imagery is limited by cloud cover, the primary environmental condition governing successful imagery collection. Synthetic Aperture Radar (SAR) satellites enable VHR imaging of Earth in cloudy regions and in darkness. This approach could facilitate the detection of strandings in cloudy regions and independently of daylight hours, which is critical for enabling timely emergency responses to unfolding stranding events. Here, we present data from four mass strandings of a smaller odontocete, the long-finned pilot whale (LFPW), on Chatham, Pitt and Stewart Islands, New Zealand, and one mass stranding of a large odontocete (sperm whale) on King Island, Tasmania, Australia, between 2018 and 2023. These data were used to detect and quantify large and small odontocete strandings in VHR optical and SAR satellite imagery.
This research has been supported by the Natural Environment Research Council (NERC) through a SENSE CDT studentship (grant no. NE/T00939X/1). The research was further supported by additional funding provided through the British Antarctic Survey (BAS) Innovation Voucher, Sentinel Hub and their #30MapChallenge competition, and BAS Ecosystems, and by the support and cooperation of Airbus and Maxar Technologies Ltd, whose rapid response and efforts enabled successful collection of the imagery analysed here.
- Date (Creation)
- 2025-08-13
- Date (Revision)
- 2025-08-13
- Date (Publication)
- 2025-08-13
- Date (released)
- 2025-08-13
- Edition
- 1.0
- Unique resource identifier
- https://doi.org/10.5285/b26c3a3d-73c8-4500-9a23-696011c20a45
- Codespace
- doi
- Unique resource identifier
- GB/NERC/BAS/PDC/02083
- Codespace
- https://data.bas.ac.uk/
- Unique resource identifier
- NE/T00939X/1
- Codespace
- award
- Other citation details
- Please cite this item as: Clarke, P.J., Cubaynes, H.C., Bowler, E., Jackson, J.A., Attard, M.R.G., Stockin, K.A., & Carlyon, K. (2025). Point annotation dataset of stranded whale and dolphin species identified in very high-resolution optical and SAR satellite imagery along offshore islands of New Zealand and Tasmania between 2018-2023 (Version 1.0) [Data set]. NERC EDS UK Polar Data Centre. https://doi.org/10.5285/b26c3a3d-73c8-4500-9a23-696011c20a45
- Credit
- No credit.
- Status
- completed Completed
https://www.bas.ac.uk/team/business-teams/information-services/uk-polar-data-centre/
- Maintenance and update frequency
- asNeeded As needed
- Maintenance note
- completed Completed
- Theme
-
- cetacean stranding
- dolphin
- machine learning
- satellite remote sensing
- synthetic aperture radar
- training data
- very high-resolution satellite imagery
- whale
- Place
-
- Mason Bay, Stewart Island/Rakiura New Zealand
- King Island, Tasmania Australia
- Maunganui Beach, Chatham Island/Wharekauri New Zealand
- Long Beach, Petre Bay, Chatham Island/Wharekauri New Zealand
- Waihere Bay, Pitt Island/Rangiauria New Zealand
- GEMET - INSPIRE themes, version 1.0
- Access constraints
- otherRestrictions Other restrictions
- Use constraints
- license License
- Other constraints
- Open Government Licence v3.0
- Use constraints
- otherRestrictions Other restrictions
- Other constraints
- Data supplied under Open Government Licence v3.0
- Use constraints
- otherRestrictions Other restrictions
- Other constraints
- These data are under embargo until the publication of the associated manuscript.
- Unique resource identifier
- doi
- Codespace
- doi
- Association Type
- crossReference Cross reference
- Unique resource identifier
- doi
- Codespace
- doi
- Association Type
- crossReference Cross reference
- Unique resource identifier
- doi
- Codespace
- doi
- Association Type
- crossReference Cross reference
- Unique resource identifier
- url
- Codespace
- url
- Association Type
- crossReference Cross reference
- Unique resource identifier
- url
- Codespace
- url
- Association Type
- dependency dependency
- Unique resource identifier
- url
- Codespace
- url
- Association Type
- dependency dependency
- Unique resource identifier
- url
- Codespace
- url
- Association Type
- dependency dependency
- Unique resource identifier
- doi
- Codespace
- doi
- Association Type
- crossReference Cross reference
- Unique resource identifier
- doi
- Codespace
- doi
- Association Type
- crossReference Cross reference
- Spatial representation type
- textTable Text, table
- Metadata language
- eng English
- Character set
- utf8 UTF8
- Topic category
-
- Biota
- Begin date
- 2018-11-24
- End date
- 2023-10-03
- Supplemental Information
- It is recommended that careful attention be paid to the contents of any data, and that the author be contacted with any questions regarding appropriate use. If you find any errors or omissions, please report them to polardatacentre@bas.ac.uk.
Distributor
https://www.bas.ac.uk/team/business-teams/information-services/uk-polar-data-centre/
- Name
- text/csv
- Name
- application/vnd.shp
- Units of distribution
- bytes
- Transfer size
- 7654605
- OnLine resource
-
Get Data
(
WWW:LINK-1.0-http--link
)
Download data
- Hierarchy level
- dataset Dataset
- Statement
-
Methodology:
Location
These data contain point annotations for five mass stranding events on remote islands in Tasmania and New Zealand, between 2018 and 2023. The events comprise: (1) a LFPW mass stranding of 162 individual animals comprising two pods approximately 2 km apart, reported on 24 November 2018 in Mason Bay, Stewart Island/Rakiura, New Zealand; (2) a sperm whale mass stranding of 14 individuals, reported on 19 September 2022 on King Island, Tasmania; (3) (a) a LFPW mass stranding of 243 animals, reported on 7 October 2022 on Maunganui Beach, Chatham Island/Wharekauri, New Zealand, and (b) a LFPW mass stranding of 136 animals, reported on 3 October 2023 on Long Beach, Petre Bay, Chatham Island/Wharekauri, New Zealand; and (4) a LFPW mass stranding of 245 animals, reported on 10 October 2022 in Waihere Bay, Pitt Island/Rangiauria, New Zealand.
Satellite imagery and ground data
Stranding events were selected based on the opportunistic availability of satellite imagery within image archives, as well as active tasking (ordering) of new imagery during real-time events. Optical satellite images were collected at their sensor-specific native spatial resolution (0.5 or 0.3 m) and were artificially down-sampled (reduced spatial resolution) and/or up-sampled (enhanced spatial resolution) by the satellite image provider using proprietary algorithms to achieve, where possible, imagery at three spatial resolutions: 0.5, 0.3 and 0.15 m. Due to licensing agreements, the satellite imagery is not shared here. However, the full metadata of the satellite imagery associated with the data, including the information necessary to access the imagery and reproduce the image pre-processing steps, are provided in supplemental S1 and S2 of the associated manuscript (doi link).
Ground data are essential to validate satellite data in these early stages of method development. For the sperm whale event in Tasmania, ground data were provided by the Department of Natural Resources and Environment, Tasmania. For all LFPW events, ground counts were provided by the New Zealand Department of Conservation and the non-governmental organisation Project Jonah. Aerial imagery for the Stewart and Chatham Island events, capturing part of the event extent, was provided by the Department of Conservation and a local photographer.
Open-source pre-processing and image annotation workflow
To facilitate reproducible analysis and annotation, an open-source workflow was developed using QGIS 3.28. The standardised protocol was adapted from Cubaynes et al. (2023). A template for recording annotation metadata (e.g., confidence scores, satellite metadata, feature information, and environmental conditions) and an accompanying training document were co-developed (S4_attribute_training_document.pdf, included here). This approach ensures that future stranding datasets can be collected and formatted under consistent data standards. In addition, a replicable version of the QGIS workflow was developed using Python 3.10 (https://github.com/PennyJClarke/strandings_from_space). The pre-processing step applied to the optical satellite imagery was pansharpening, using the Orfeo Toolbox QGIS plugin (OTB, BundleToPerfectSensor). The SAR image was georeferenced to the matching optical image (aligning both images) by assigning ground control points in the SAR image to known geolocations in the optical image, using the QGIS 'Georeferencer' tool in QGIS 3.16 (see supplemental S6 and S7 for guidance).
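The idea of aligning one image to another through ground control points (GCPs) can be illustrated with a minimal sketch. It assumes an affine transformation fitted exactly to three GCPs. The record does not state which transformation type was selected in the QGIS Georeferencer, so the affine model, the function names and the coordinates below are illustrative assumptions only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _det3(m: Sequence[Sequence[float]]) -> float:
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _solve3(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the 3x3 linear system a @ x = b with Cramer's rule."""
    d = _det3(a)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear; choose three that form a triangle")
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(_det3(m) / d)
    return solution


def fit_affine(src: List[Point], dst: List[Point]):
    """Fit x' = a*x + b*y + c and y' = d*x + e*y + f from exactly three GCP pairs."""
    if len(src) != 3 or len(dst) != 3:
        raise ValueError("this sketch expects exactly three control point pairs")
    a = [[x, y, 1.0] for x, y in src]
    row_x = _solve3(a, [p[0] for p in dst])
    row_y = _solve3(a, [p[1] for p in dst])
    return row_x, row_y


def apply_affine(transform, pt: Point) -> Point:
    """Map a point from the source (SAR) frame into the target (optical) frame."""
    (a, b, c), (d, e, f) = transform
    x, y = pt
    return (a * x + b * y + c, d * x + e * y + f)


# Hypothetical GCPs: SAR pixel positions and matching optical map coordinates.
sar_pixels = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
optical_xy = [(1000.0, 2000.0), (1050.0, 2000.0), (1000.0, 1950.0)]
transform = fit_affine(sar_pixels, optical_xy)
mapped = apply_affine(transform, (20.0, 40.0))
```

In practice more than three GCPs would normally be used with a least-squares fit, so that residuals reveal poorly placed points.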
Manual annotation
The dataset contains annotations from three observers, each experienced in reviewing satellite imagery for wildlife from space. Each observer independently reviewed the satellite imagery using the annotation workflow. Features were recorded with a georeferenced point and assigned one of three confidence scores: 'definite_(90-100%)', 'likely_(70-89%)' or 'possible_(50-69%)'. Additionally, observer 1 annotated all fea...(52)
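The confidence labels quoted above encode a class name and a percentage range. The following is a minimal sketch of how such labels could be parsed and used to filter annotations. The record structure and the field name `confidence` are assumptions for illustration and may not match the column names in the distributed CSV and shapefiles.

```python
import re
from typing import Dict, List, Tuple

# Matches labels of the form 'likely_(70-89%)'.
_LABEL = re.compile(r"^(?P<name>[a-z]+)_\((?P<lo>\d+)-(?P<hi>\d+)%\)$")


def parse_confidence(label: str) -> Tuple[str, int, int]:
    """Split a confidence label into (class name, lower bound %, upper bound %)."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised confidence label: {label!r}")
    return match["name"], int(match["lo"]), int(match["hi"])


def keep_at_least(records: List[Dict[str, str]], minimum_pct: int) -> List[Dict[str, str]]:
    """Keep records whose confidence range starts at or above minimum_pct."""
    return [r for r in records if parse_confidence(r["confidence"])[1] >= minimum_pct]


# Hypothetical records; 'confidence' is an assumed field name.
records = [
    {"id": "a", "confidence": "definite_(90-100%)"},
    {"id": "b", "confidence": "likely_(70-89%)"},
    {"id": "c", "confidence": "possible_(50-69%)"},
]
likely_or_better = keep_at_least(records, 70)
```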
Data collection:
Satellite imagery used to produce this dataset was acquired by the following sensors. Optical: GeoEye-1, WorldView-2 (WV2), WorldView-3 (WV3) and Pleiades. SAR: TerraSAR-X.
The technical specifications of the drones used to collect the drone imagery annotated here are unknown.
The ground data and the total counts of stranded cetaceans at each event were provided by:
- the Department of Conservation New Zealand
- Project Jonah, New Zealand, and
- the Marine Conservation Program, Department of Natural Resources and Environment Tasmania (NRE Tas), Tasmanian Government, Australia.
All satellite data were annotated using open-source QGIS 3.28 on Windows 10 (see supplemental S2, S4, and S5 for guidance). In addition to the user-interface-centred pipeline in QGIS, a replicable version of the workflow is available using Python 3.10 (https://github.com/PennyJClarke/strandings_from_space).
All satellite imagery was pansharpened using QGIS plugin Orfeo Toolbox (OTB, BundleToPerfectSensor) (see supplemental S2 for guidance in the associated manuscript (doi link)). The SAR image was georeferenced to the matching optical image (aligning both images) using the QGIS 'Georeferencer' tool in QGIS 3.16 (see supplemental S6 and S7 for guidance in the associated manuscript (doi link)).
All aerial data were annotated using VGG Image Annotator (VIA), as per the protocols in Cubaynes et al. (2024).
To compare observers' annotations, hierarchical clustering was performed using a Python script (Python 3.12) adapted from Attard et al. (2025) (available at: https://github.com/PennyJClarke/strandings_from_space).
For aerial imagery with an unknown spatial extent, clustering was performed using pixel-based measurements. Real-world distances were estimated from whales visible in the image, using average morphometrics calculated across several hundred male and female pilot whales sampled from New Zealand strandings (Betty et al., 2022; Betty et al., 2023). Measurements were taken in ImageJ (version 1.54g; Java 1.8.0_345 [64-bit]; see supplemental S8 for methodology in the associated manuscript [doi link]).
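As an illustration of how points from several observers can be grouped into shared features, the sketch below applies single-linkage hierarchical clustering cut at a distance threshold. Cutting a single-linkage dendrogram at a threshold is equivalent to linking every pair of points closer than that threshold and taking the connected groups, which is how it is implemented here. The linkage method, the threshold value and the coordinates are assumptions for illustration; the script in the linked repository may differ.

```python
from itertools import combinations
from math import hypot
from typing import Dict, List, Tuple

Annotation = Tuple[str, float, float]  # (observer, x, y) in a projected CRS, metres


def single_linkage_clusters(points: List[Annotation], threshold: float) -> List[List[Annotation]]:
    """Group points whose chain of pairwise distances stays within threshold."""
    parent = list(range(len(points)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        dx = points[i][1] - points[j][1]
        dy = points[i][2] - points[j][2]
        if hypot(dx, dy) <= threshold:
            parent[find(i)] = find(j)

    clusters: Dict[int, List[Annotation]] = {}
    for i, point in enumerate(points):
        clusters.setdefault(find(i), []).append(point)
    return list(clusters.values())


def observers_per_cluster(clusters: List[List[Annotation]]) -> List[int]:
    """Count how many distinct observers contributed to each cluster."""
    return [len({observer for observer, _, _ in cluster}) for cluster in clusters]


# Hypothetical annotations: three observers mark one whale, two mark another.
annotations = [
    ("obs1", 0.0, 0.0), ("obs2", 1.0, 0.5), ("obs3", 0.5, 1.0),
    ("obs1", 20.0, 20.0), ("obs2", 21.0, 20.0),
]
clusters = single_linkage_clusters(annotations, threshold=3.0)
agreement = observers_per_cluster(clusters)
```

A threshold close to the body length of the target species is one plausible choice, since points further apart than one animal are unlikely to mark the same individual.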
Data quality:
The satellite imagery reviewed by all observers was pre-processed and visualised with the same parameters. The three observers selected were experienced in reviewing satellite imagery for wildlife from space. Each observer followed the standardised protocols for annotation. Observers reviewed imagery independently and sequentially from low to high spatial resolution to avoid introducing biases when reviewing lower resolution satellite imagery. For SAR image counts, the matching optical and SAR images were split in half, and opposite halves of the corresponding images were provided to each observer in two separate experiments ('half_raster_1.shp' and 'half_raster_2.shp' are polygon shapefiles that provide the extents used to clip the matched SAR and optical images in half). This ensured that the observers' SAR counts were independent of visual aids from the optical imagery, while still allowing coastline alignment for reference. While measures were taken to ensure the highest level of accuracy, the annotation of stranded cetacean features in all imagery, particularly satellite imagery, may be affected by:
- observer experience with the target feature (the observers comprise an expert in detecting strandings from space, an expert in detecting whales and walruses from space, and an expert in detecting walruses and albatrosses from space)
- prevailing environmental conditions: cloud cover and diverse environmental backgrounds
- sensor specification: spatial resolution and nadir angle (angle of image collection, which can distort an image)
- target species: morphological characteristics, confounding features (features misidentified as cetaceans), and decomposition phase
- image quality: solar reflection/glare and image darkness
Resolution:
The annotation data presented here are associated with a specific image and spatial resolution. GSD refers to the 'Ground Sampling Distance' in metres: the distance on the ground represented by one pixel. A smaller GSD means a higher spatial resolution and more visible detail. GSD is provided per satellite image annotated. Where more than one value is listed, multiple spatial resolutions of the same image were assessed, and the first value indicates the native spatial resolution of the satellite sensor. The sensor-specific native spatial resolution (0.5 or 0.3 m) is the spatial resolution at which the sensor collects imagery. Images were artificially down-sampled (reduced spatial resolution) and/or up-sampled (enhanced spatial resolution) by the satellite image provider using proprietary algorithms to achieve, where possible, imagery at three spatial resolutions: 0.5, 0.3 and 0.15 m.
1. Stewart Island:
- DS_PHR1B_201811282258003_FR1_PX_E167S47_1106_03071 - GSD (m): 0.5
- 1030010089B22D00 - GSD (m): 0.5 & 0.3
2. King Island:
- 105001002F0CA400 - GSD (m): 0.5 & 0.3
3.a Chatham Island:
- 105001002F60AA00 - GSD (m): 0.5
- 10300100DC306300 - GSD (m): 0.5 & 0.3
- 104001007E5E7400 - GSD (m): 0.3, 0.5, & 0.15
3.b Chatham Island:
- 104001008D07E000 - GSD (m): 0.3
- C542_N85_D_ST_spot_029_R_2023-10-05T17:04:56.332868Z - GSD (m): 0.28
4. Pitt Island:
- 10300100DB012A00 - GSD (m): 0.5
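The listing above can be held in a small lookup structure when filtering annotations by resolution. This is a minimal sketch: the image identifiers and GSD values are transcribed from the list, with the first value treated as the native resolution as stated above. The helper names are hypothetical and are not part of the dataset or the linked repository.

```python
from typing import Dict, List, Tuple

# Image identifier -> GSD values in metres; the first value is the native resolution.
GSD_BY_IMAGE: Dict[str, Tuple[float, ...]] = {
    "DS_PHR1B_201811282258003_FR1_PX_E167S47_1106_03071": (0.5,),
    "1030010089B22D00": (0.5, 0.3),
    "105001002F0CA400": (0.5, 0.3),
    "105001002F60AA00": (0.5,),
    "10300100DC306300": (0.5, 0.3),
    "104001007E5E7400": (0.3, 0.5, 0.15),
    "104001008D07E000": (0.3,),
    "C542_N85_D_ST_spot_029_R_2023-10-05T17:04:56.332868Z": (0.28,),
    "10300100DB012A00": (0.5,),
}


def native_gsd(image_id: str) -> float:
    """Return the native spatial resolution (m) of an image."""
    return GSD_BY_IMAGE[image_id][0]


def resampled_gsds(image_id: str) -> Tuple[float, ...]:
    """Return the up- or down-sampled resolutions (m) also assessed for an image."""
    return GSD_BY_IMAGE[image_id][1:]


def images_assessed_at(gsd: float) -> List[str]:
    """List the images that were assessed at a given GSD, native or resampled."""
    return [image_id for image_id, values in GSD_BY_IMAGE.items() if gsd in values]
```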
For the aerial imagery annotated, the spatial extent and resolution were unknown. To perform clustering, pixel-based measurements were generated by estimating real-world distances from average morphometrics calculated across several hundred male and female pilot whales sampled from New Zealand strandings (Betty et al., 2022; Betty et al., 2023). The mean length of adult males is 5.5 m and the mean length of adult females is 4.32 m. As we were unable to distinguish sex within the aerial image, we used the average of the measurements for both sexes, 4.91 m. Measurements were taken from the tip of the rostrum to the notch of the fluke of whales visible in the image, using ImageJ (version 1.54g; Java 1.8.0_345 [64-bit], see supplemental S8 in associated manuscript (doi link) fo...(3)
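The scale estimate described above can be written out as a short calculation. The sketch below uses the two mean body lengths quoted in the text (5.5 m and 4.32 m) to derive the 4.91 m reference length, then converts pixel measurements to metres. The function names, the pixel values, and the choice to average over several measured whales are assumptions for illustration.

```python
from typing import Sequence

MALE_MEAN_LENGTH_M = 5.5     # mean adult male length quoted in the text
FEMALE_MEAN_LENGTH_M = 4.32  # mean adult female length quoted in the text

# Sex could not be distinguished in the aerial image, so both means are averaged.
REFERENCE_LENGTH_M = (MALE_MEAN_LENGTH_M + FEMALE_MEAN_LENGTH_M) / 2  # about 4.91 m


def metres_per_pixel(whale_lengths_px: Sequence[float]) -> float:
    """Estimate image scale from rostrum-to-fluke-notch lengths measured in pixels."""
    if not whale_lengths_px:
        raise ValueError("at least one pixel measurement is required")
    mean_px = sum(whale_lengths_px) / len(whale_lengths_px)
    return REFERENCE_LENGTH_M / mean_px


def pixels_to_metres(distance_px: float, scale_m_per_px: float) -> float:
    """Convert a pixel distance to an approximate real-world distance."""
    return distance_px * scale_m_per_px


# Hypothetical pixel lengths for three whales measured in one aerial image.
scale = metres_per_pixel([96.0, 98.2, 100.4])
gap_m = pixels_to_metres(200.0, scale)
```

Because the reference length is a population average, the resulting scale is approximate and is best suited to setting a clustering threshold rather than to measuring individual animals.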
- File identifier
- b26c3a3d-73c8-4500-9a23-696011c20a45 XML
- Metadata language
- eng English
- Character set
- utf8 UTF8
- Hierarchy level
- dataset Dataset
- Hierarchy level name
- dataset
- Date stamp
- 2025-08-13
- Metadata standard name
- ISO 19115 Geographic Information - Metadata
- Metadata standard version
- ISO 19115:2003(E)
Provided by
NERC Data Catalogue Service