The Weddell Sea Benthic Dataset: A computer vision-ready object detection dataset for in situ benthic biodiversity monitoring model development
We present the Weddell Sea Benthic Dataset (WSBD), a computer vision-ready collection of high-resolution seafloor imagery and corresponding annotations designed to support automated analysis of Antarctic benthic communities. The dataset comprises 100 top-down images captured during RV Polarstern Expedition PS118 (cruises 69-1 and 69-6) in 2019, using the Ocean Floor Observation and Bathymetry System (OFOBS) in the Weddell Sea, Antarctica. A subset of this imagery was manually annotated by ecologists at the British Antarctic Survey (BAS) to support ecological analyses, including benthic community composition and species interaction studies. These annotations were subsequently standardised into 25 morphotypes to serve as class labels for object detection tasks. Bounding box annotations are provided in COCO format, alongside the training, validation, and test splits used during model development at BAS. This dataset provides a benchmark for developing and evaluating machine learning models aimed at enhancing biodiversity monitoring in Antarctic benthic environments.
This work was funded by the UKRI Future Leaders Fellowship MR/W01002X/1 'The past, present and future of unique cold-water benthic (sea floor) ecosystems in the Southern Ocean' awarded to Rowan Whittle.
- Date (Creation)
- 2025-06-06
- Date (Revision)
- 2025-06-06
- Date (Publication)
- 2025-06-06
- Date (released)
- 2025-06-06
- Edition
- 1.0
- Unique resource identifier
- https://doi.org/10.5285/1ba97e4b-efb7-460b-9f2d-90437e33ce09
- Codespace
- doi
- Unique resource identifier
- GB/NERC/BAS/PDC/02069
- Codespace
- https://data.bas.ac.uk/
- Other citation details
- Please cite this item as: Trotter, C., Griffiths, H.J., Khan, T.M., Purser, A., & Whittle, R.J. (2025). The Weddell Sea Benthic Dataset: A computer vision-ready object detection dataset for in situ benthic biodiversity monitoring model development (Version 1.0) [Data set]. NERC EDS UK Polar Data Centre. https://doi.org/10.5285/1ba97e4b-efb7-460b-9f2d-90437e33ce09
- Credit
- No credit.
- Status
- completed Completed
https://www.bas.ac.uk/team/business-teams/information-services/uk-polar-data-centre/
- Maintenance and update frequency
- asNeeded As needed
- Maintenance note
- completed Completed
- Theme
-
- Benthos
- biodiversity monitoring
- computer vision
- deep learning
- marine ecology
- Place
-
- Weddell Sea, Powell Basin Southern Ocean
- GEMET - INSPIRE themes, version 1.0
- Access constraints
- otherRestrictions Other restrictions
- Other constraints
- no limitations to public access
- Access constraints
- otherRestrictions Other restrictions
- Other constraints
- no limitations
- Use constraints
- license License
- Other constraints
- Open Government Licence v3.0
- Use constraints
- otherRestrictions Other restrictions
- Other constraints
- Data are supplied under Open Government Licence v3.0
- Use constraints
- otherRestrictions Other restrictions
- Other constraints
- None
- Unique resource identifier
- doi
- Codespace
- doi
- Association Type
- crossReference Cross reference
- Unique resource identifier
- url
- Codespace
- url
- Association Type
- largerWorkCitation Larger work citation
- Unique resource identifier
- url
- Codespace
- url
- Association Type
- crossReference Cross reference
- Unique resource identifier
- doi
- Codespace
- doi
- Association Type
- crossReference Cross reference
- Unique resource identifier
- doi
- Codespace
- doi
- Association Type
- crossReference Cross reference
- Unique resource identifier
- doi
- Codespace
- doi
- Association Type
- crossReference Cross reference
- Unique resource identifier
- doi
- Codespace
- doi
- Association Type
- crossReference Cross reference
- Unique resource identifier
- url
- Codespace
- url
- Association Type
- dependency dependency
- Spatial representation type
- textTable Text, table
- Metadata language
- eng English
- Character set
- utf8 UTF8
- Topic category
-
- Biota
- Environment
- Oceans
- Begin date
- 2019-02-01
- End date
- 2019-04-30
- Supplemental Information
- It is recommended that careful attention be paid to the contents of any data, and that the author be contacted with any questions regarding appropriate use. If you find any errors or omissions, please report them to polardatacentre@bas.ac.uk.
Distributor
https://www.bas.ac.uk/team/business-teams/information-services/uk-polar-data-centre/
- Name
- application/json
- Name
- image/jpeg
- Units of distribution
- bytes
- Transfer size
- 139460608
- OnLine resource
- Get Data (WWW:LINK-1.0-http--link): Download data
- Hierarchy level
- dataset Dataset
- Statement
-
Methodology:
Data was collected during RV Polarstern Expedition PS118 (cruises 69-1 and 69-6) in 2019 [1] using the OFOBS [2] and manually labelled for use in benthic community analysis [3]. These labels were then condensed into 25 morphotypes, and the annotations were converted to COCO-formatted bounding boxes for use in object detection model development. Data was split into training, validation, and test sets based on substrate, depth, and seafloor inclination.
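As an illustration of how COCO-formatted annotations of this kind can be inspected, the following minimal Python sketch counts bounding boxes per class. It assumes only the standard COCO keys ("categories", "annotations", "category_id"); the file name, the morphotype names, and the function name count_per_class are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def count_per_class(coco: dict) -> dict:
    """Count bounding-box annotations per category name in a COCO-style dict."""
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    counts = Counter(id_to_name[a["category_id"]] for a in coco["annotations"])
    return dict(counts)


# Tiny hypothetical example. In practice the dictionary would come from
# json.load() on the annotation file shipped with the dataset (the actual
# file name is not stated in this record).
example = {
    "images": [{"id": 1, "file_name": "img_0001.jpg", "width": 3364, "height": 4545}],
    "categories": [
        {"id": 1, "name": "morphotype_a"},  # hypothetical class names
        {"id": 2, "name": "morphotype_b"},
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 20, 30, 40]},
        {"id": 2, "image_id": 1, "category_id": 1, "bbox": [50, 60, 15, 25]},
        {"id": 3, "image_id": 1, "category_id": 2, "bbox": [5, 5, 10, 10]},
    ],
}

print(count_per_class(example))
```

Running the same function separately on the training, validation, and test annotation files would give a quick check of how the 25 morphotypes are distributed across the splits.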
Imagery in this dataset is a subset of the imagery collected during expedition PS118 [1], available on PANGAEA [2]. Dataset annotations are a subset of those present in [4] for use in benthic community analysis [3]. This dataset was used for the development of an object detection model capable of automated benthic organism detection [in-prep]. For model weights, see Related URLs.
Some original source images [4] were not comprehensively annotated, e.g. due to distortion. For use in object detection model training, the unlabelled regions were cropped out, resulting in images of varying dimensions (average size = 3,364 × 4,545 px).
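Cropping an image changes the pixel coordinates of every bounding box that remains. The sketch below shows one common way to handle this for COCO [x, y, width, height] boxes: shift each box into the crop's coordinate frame, clip it to the crop, and drop boxes that fall entirely outside. This is an assumption about how such a step could be done, not a description of the procedure used for this dataset; the function name crop_bboxes and the example values are hypothetical.

```python
def crop_bboxes(bboxes, crop):
    """Shift COCO [x, y, w, h] boxes into a crop (x0, y0, x1, y1) and clip them.

    Boxes with no overlap with the crop are dropped.
    """
    x0, y0, x1, y1 = crop
    out = []
    for x, y, w, h in bboxes:
        # Intersect the box with the crop rectangle.
        left, top = max(x, x0), max(y, y0)
        right, bottom = min(x + w, x1), min(y + h, y1)
        if right <= left or bottom <= top:
            continue  # box lies entirely outside the crop
        # Re-express the clipped box relative to the crop's top-left corner.
        out.append([left - x0, top - y0, right - left, bottom - top])
    return out


# Hypothetical boxes: one fully inside, one fully outside, one partly inside.
boxes = [[100, 100, 50, 50], [10, 10, 20, 20], [180, 120, 40, 40]]
cropped = crop_bboxes(boxes, (90, 90, 200, 200))
print(cropped)
```

A stricter variant could discard boxes that lose most of their area to clipping, since a heavily truncated organism may no longer be recognisable.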
Data collection:
Data was collected using the OFOBS, a top-down towed camera system. Original labelling for benthic community analysis was performed in Inkscape v1.1. The resulting SVG files were then converted, using a custom Python v3.12.8 script, to JPGs with corresponding COCO bounding box JSON. Converted bounding boxes were then manually edited (resized, etc.) using LabelMe v5.8.1, then converted back to COCO format.
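The custom conversion script is not reproduced in this record. As a rough illustration of the general idea only, the sketch below reads rect elements from an SVG string and emits COCO-style annotations. It assumes each label is an axis-aligned rect whose class is stored in a "class" attribute and that no SVG transforms apply; the original Inkscape files may use other shapes, attributes, or transforms. The function name, the label "sponge", and the category mapping are all hypothetical.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"


def svg_rects_to_coco(svg_text, image_id, category_ids):
    """Convert SVG <rect> elements into COCO-style annotation dicts.

    Assumes the class label is in the rect's "class" attribute (hypothetical)
    and ignores any SVG transforms. Rects with unknown labels are skipped.
    """
    root = ET.fromstring(svg_text)
    annotations = []
    for rect in root.iter(f"{SVG_NS}rect"):
        label = rect.get("class", "")
        if label not in category_ids:
            continue
        x, y = float(rect.get("x", 0)), float(rect.get("y", 0))
        w, h = float(rect.get("width")), float(rect.get("height"))
        annotations.append({
            "id": len(annotations) + 1,
            "image_id": image_id,
            "category_id": category_ids[label],
            "bbox": [x, y, w, h],  # SVG rect and COCO both use top-left x, y, width, height
            "area": w * h,
            "iscrowd": 0,
        })
    return annotations


svg = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<rect class="sponge" x="10" y="20" width="30" height="40"/>'
    '<rect class="unlisted" x="0" y="0" width="5" height="5"/>'
    "</svg>"
)
annotations = svg_rects_to_coco(svg, image_id=1, category_ids={"sponge": 1})
print(annotations)
```

Because SVG rects and COCO boxes share the same top-left, width, height convention, the geometric part of such a conversion is direct; most of the real effort typically lies in mapping free-form labels onto a fixed class list.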
Data quality:
All bounding boxes were manually checked after conversion from SVG format. Class labels have been reviewed by ecologists at BAS for accuracy. Given the high densities of organisms in the dataset, the prevalence of small-bodied taxa, and the well-documented issues of fatigue and subjectivity in manual annotation of benthic imagery, it is likely that some valid organisms were omitted from the ground truth.
- File identifier
- 1ba97e4b-efb7-460b-9f2d-90437e33ce09 XML
- Metadata language
- eng English
- Character set
- utf8 UTF8
- Hierarchy level
- dataset Dataset
- Hierarchy level name
- dataset
- Date stamp
- 2025-06-06
- Metadata standard name
- ISO 19115 Geographic Information - Metadata
- Metadata standard version
- ISO 19115:2003(E)
https://www.bas.ac.uk/team/business-teams/information-services/uk-polar-data-centre/
Provided by
NERC Data Catalogue Service