Data Collection
Research Aim 1.1 – Develop approaches to collect and generate data for analysis
This research area will collect, synthesise, and harmonise a vast range of environmental, human movement, and biological data.
Over the last 50 years, significant amounts of Antarctic science data have accumulated. The rate of data collection has risen sharply in recent years owing to improved sensor technology and remotely sensed imagery. This increased availability has moved Antarctic terrestrial science from a data-poor to a data-rich state.
To take advantage of this large and growing body of data, we will collect many data types from a range of sources. Physical measurements will include: air, surface, and soil temperature; wind speed and direction; precipitation; snow height; solar radiation; relative humidity; depth to permafrost; soil samples; soil moisture, type, and electrical conductivity; water temperature and chemistry; specific conductivity; flow stage; and changes in glacier mass balance. From satellites we will collect: elevation, albedo, land surface temperature, snow cover, cloud cover, water vapour, and visible, multi-spectral, and hyper-spectral imagery. We will source records of the movements of scientists, tourists, and base staff, along with their associated activities. Finally, biological data will be collected via the Global Biodiversity Information Facility, the Antarctic database ‘biodiversity.aq’, science programmes, and other curated databases.
Data Synthesis
Some information will require synthesis before it yields useful data. We will use high performance computing (HPC) resources through the New Zealand eScience Infrastructure (NeSI) to produce these syntheses. For example, we will develop a computational processing engine to simplify synthesis from satellite imagery platforms. To create a land surface temperature layer for a single day across the entire Ross Sea Region (RSR), 16 individual satellite images must be stitched together. Using a year’s worth of daily stitched layers, we can create mean, minimum, maximum, 95th percentile, and 5th percentile temperature layers for the RSR. Done manually, and repeated for other layers, this process would be very time-consuming.
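The stitch-then-summarise workflow described above can be illustrated with a minimal Python sketch. It is not the proposed processing engine. It assumes that the 16 images arrive as equally sized, non-overlapping tiles on a regular 4 x 4 grid, already on a common projection, with missing pixels (for example under cloud) stored as NaN. Real satellite products would first need reprojection and overlap handling. The function names mosaic_tiles and summarise_year are hypothetical.

```python
import numpy as np


def mosaic_tiles(tiles, grid_shape=(4, 4)):
    """Stitch equally sized 2-D tiles, given in row-major order, into one layer.

    Assumption: tiles do not overlap and share one projection and resolution.
    """
    rows, cols = grid_shape
    if len(tiles) != rows * cols:
        raise ValueError(f"expected {rows * cols} tiles, got {len(tiles)}")
    # np.block joins a nested list of arrays: inner lists form rows of the mosaic.
    return np.block(
        [[tiles[r * cols + c] for c in range(cols)] for r in range(rows)]
    )


def summarise_year(daily_layers):
    """Reduce a sequence of daily layers to per-pixel summary layers.

    NaN-aware reductions are used so that a pixel missing on some days
    is summarised from the days on which it was observed.
    """
    stack = np.stack(daily_layers, axis=0)  # shape: (days, height, width)
    return {
        "mean": np.nanmean(stack, axis=0),
        "min": np.nanmin(stack, axis=0),
        "max": np.nanmax(stack, axis=0),
        "p95": np.nanpercentile(stack, 95, axis=0),
        "p05": np.nanpercentile(stack, 5, axis=0),
    }
```

In this sketch each day is processed independently before the final reduction, so the stitching step could be distributed across HPC nodes one day per task. That is a design assumption rather than a stated feature of the planned engine.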
Data Harmonisation
The final stage of this research area is the harmonisation of the data within a single spatial database. Combining existing data from multiple sources into a single resource presents multiple challenges, but it will provide insights that cannot be obtained from individual data sets. These insights may fundamentally change how we understand the Antarctic terrestrial environment. Ideally, existing data would have been collected using standardised methods. In practice, most data are collected for specific projects with little concern for re-use, and protocols vary widely. Harmonisation minimises systematic differences between data sources. Merging data from different sources without harmonising them, particularly in time, yields poor models and analyses. Harmonisation is therefore needed before the combined data can generate exciting new science.
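As a small illustration of harmonisation in time, the sketch below brings records logged at different intervals onto a common daily time step before they are merged. It uses pandas and assumes each source is a numeric series with a datetime index, already in the same units. Harmonising units, spatial reference, and measurement protocol would need separate handling. The function name harmonise_daily and the source labels are hypothetical.

```python
import pandas as pd


def harmonise_daily(series_by_source):
    """Resample each source to daily means and align them in one table.

    series_by_source: dict mapping a source label to a pandas Series
    indexed by timestamp. Days with no record in a source become NaN.
    """
    daily = {
        name: series.resample("D").mean()
        for name, series in series_by_source.items()
    }
    # Concatenating a dict along axis=1 uses its keys as column names
    # and aligns the rows on the union of the daily timestamps.
    return pd.concat(daily, axis=1)


# Usage: an hourly record and a 12-hourly record over the same two days.
hourly = pd.Series(
    [float(v) for v in range(48)],
    index=pd.date_range("2020-01-01", periods=48, freq="60min"),
)
twelve_hourly = pd.Series(
    [10.0, 20.0, 30.0, 40.0],
    index=pd.date_range("2020-01-01", periods=4, freq="720min"),
)
combined = harmonise_daily({"station_a": hourly, "station_b": twelve_hourly})
```

Daily means are only one possible choice of common time step and aggregation. The appropriate choice would depend on the variable and the analysis.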