Zooplankton Prac. 2: Classifying copepod communities at the IMOS National Reference Stations
PART 1: Multidimensional Data
What are multivariate statistics?
In ecology, we are faced with multivariate-type data on a regular basis:
In each of these instances, there is at least one explanatory variable for each of several response variables (the things we want to model). Last week we relied on univariate statistics; in other words, statistics pertaining to only a single response variable. What happens when we are interested in many response variables simultaneously? The answer is simple: we move to multivariate statistics. Multivariate statistics is an area comprising techniques with big and impressive-sounding names, but few are as scary as they might sound.
Although we are often taught to use statistical techniques to test hypotheses, we saw last week that model fitting is a bit of an art, and although it can involve statistical tests, it is sometimes more about identifying pattern than testing hypotheses. We’re going to continue the theme here, and use multivariate techniques to explore pattern rather than using them to test hypotheses (although they’re capable enough of doing so, if asked).
We use multivariate statistics in many, many contexts. But most commonly, these approaches are used to identify groupings (clusters) of sample objects (usually sample sites/stations) on the basis of (ecological) distance (dissimilarity), and then to figure out what predictor (often environmental) variables might be causing this grouping (clustering). Often, the first portion of this technique is exploratory, rather than inferential, and the second portion is correlative (it correlates patterns of predictor variables with patterns of sample-object clustering). There are many approaches, including Principal Components Analysis (PCA), Multidimensional Scaling (MDS), Canonical Correspondence Analysis (CCA) and Redundancy Analysis (RDA). We are going to focus here on MDS.
How it works
The method underlying MDS is relatively straightforward:
1. Start with a matrix of data consisting of samples (rows) and species (columns/variables)
2. Calculate all pairwise distances (dissimilarities) between samples with an appropriate distance measure (usually Bray-Curtis). The MDS ordination will be performed on this distance matrix
3. Decide on the number of dimensions for the MDS (usually 2-D, which is easy to plot). The choice is related to the suspected number of ecological gradients
4. Arrange initial configuration of objects (usually random)
5. Compare distances between objects (samples) on plots with the original dissimilarities between samples using Stress. Stress is a measure of badness of fit that measures how well the ordination distances match the resemblances. In R, Stress ranges between 0 and 1, and the smaller it is, the better the MDS represents your data
6. The samples are then moved on the ordination, one step at a time, by the method of steepest descent. The procedure then goes back to #5
7. Reach the final configuration. Further moving of objects will not lower the stress value
8. To ensure that you have not found a local rather than a global minimum, the procedure will start at #4 again with a random distribution of the samples in the ordination space. It will do this many times and save the best “fit” (lowest stress) configuration, which will be considered the “optimal” solution.
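Steps 2 and 5 above can be sketched in a few lines. The prac itself is run in R; the Python below is only an illustration of the arithmetic, and the abundance table and the two candidate configurations are made-up values. Note one loud simplification: the stress here compares ordination distances with the raw dissimilarities, whereas non-metric MDS routines typically compare them with a monotonic transformation of the dissimilarities.

```python
import math

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors.
    0 means identical composition, 1 means no shared species."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0

def distance_matrix(samples):
    """All pairwise Bray-Curtis dissimilarities (step 2)."""
    n = len(samples)
    return [[bray_curtis(samples[i], samples[j]) for j in range(n)]
            for i in range(n)]

def stress(points, diss):
    """Simplified Kruskal-style stress (step 5): mismatch between
    ordination distances and the raw dissimilarities (assumption:
    no monotone regression step, unlike non-metric MDS)."""
    num = den = 0.0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            num += (d - diss[i][j]) ** 2
            den += d ** 2
    return math.sqrt(num / den) if den else 0.0

# Hypothetical data: 3 samples (rows) x 3 species (columns)
samples = [
    [10, 0, 5],
    [8, 2, 5],
    [0, 12, 1],
]
diss = distance_matrix(samples)

# Two hypothetical 2-D configurations of the same three samples
good = [(0.0, 0.0), (0.13, 0.0), (0.92, 0.0)]   # similar samples close together
bad = [(0.0, 0.0), (0.92, 0.0), (0.13, 0.0)]    # similar samples far apart

print(round(diss[0][1], 3), round(diss[0][2], 3))
print(stress(good, diss) < stress(bad, diss))
```

Samples 1 and 2 share most of their species, so their dissimilarity is small; a configuration that places them close together has a much lower stress than one that separates them.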
UNDERSTANDING:
What do the MDS algorithm and a lost hiker have in common? More than you might think! The way an MDS searches for the global minimum is akin to a hiker lost in the fog atop a mountain range, trying to find their way to the bottom of the valley. The hiker would simply walk downhill (always following the steepest descent) and carry on doing so until stepping in any direction means going uphill again. However, the fog prevents the hiker from seeing where they are going, so the hiker could stop in a local dip and think they are at the bottom of the valley. The risk of this outcome could be reduced by the hiker repeating the walk many times, each time starting from a different location on the mountain, and keeping the walk that ends at the lowest point. This is exactly what the MDS algorithm does: it restarts the routine many times from a random configuration, each time rearranging the samples until the ordination distances match the dissimilarities in your data as closely as possible (lowest stress). The lowest-stress solution across all restarts is taken as the best estimate of the global minimum.
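The hiker analogy can be made concrete with a toy example. This is a minimal sketch, not the MDS routine itself: the one-dimensional "terrain" function, the step size and the starting points are all assumptions chosen so that the terrain has one shallow dip and one deeper valley.

```python
import random

def terrain(x):
    """Hypothetical 1-D landscape with a local dip near x = 1.13
    and a deeper valley near x = -1.30."""
    return x ** 4 - 3 * x ** 2 + x

def slope(x):
    """Derivative of the terrain function."""
    return 4 * x ** 3 - 6 * x + 1

def descend(start, step=0.01, max_iter=10000, tol=1e-8):
    """Walk downhill (steepest descent) until the ground is flat."""
    x = start
    for _ in range(max_iter):
        g = slope(x)
        if abs(g) < tol:
            break
        x -= step * g
    return x

def best_of_restarts(starts):
    """Repeat the walk from several starting points, keep the lowest end point."""
    return min((descend(s) for s in starts), key=terrain)

# A single walk from x = 2 gets stuck in the local dip
single = descend(2.0)

# Many walks from random starting points (seeded so the run is repeatable)
rng = random.Random(1)
starts = [rng.uniform(-2, 2) for _ in range(20)]
best = best_of_restarts(starts)

print(round(single, 3), round(terrain(single), 3))
print(round(best, 3), round(terrain(best), 3))
```

A single walk can only ever report the dip it happens to fall into. Restarting from many points does not guarantee the deepest valley is found, but it makes missing it much less likely.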
PART 2: Examining the Copepod Community
Copepods are sensitive indicators of water masses. As part of the Integrated Marine Observing System (IMOS), data on copepod species composition and concomitant environmental data have been collected at nine national reference stations around Australia. Samples are collected with a vertical zooplankton net dropped to 50 m depth. We are interested in investigating whether there are unique communities at the different national reference stations, and what might be driving these differences in community structure. Data from this series are in copepods.csv.
Here we perform an MDS on the copepod community and environmental data from the IMOS National Reference Stations around Australia to answer three questions:
1. Are there different communities at the different stations? (i.e., can we sample fewer stations?)
2. If there are different communities, what environmental variables might structure these communities?
3. What are the characteristic species in different communities?
Adding Environmental Data
Now let’s look at the relationship with environmental data. I have provided you with a file of satellite temperature and chlorophyll a (a proxy for phytoplankton biomass).
There will be a few NaNs due to gaps in satellite coverage. We can live with that, as long as there aren’t too many.
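One way to check whether there are "too many" NaNs is to count the missing fraction per variable before going further. The sketch below is in Python rather than the R used in the prac, and the station labels, column names (sst, chl) and values are invented for illustration; they are not taken from the provided file.

```python
import math

# Hypothetical environmental table: one row per sample
env = [
    {"station": "A", "sst": 21.4, "chl": 0.31},
    {"station": "B", "sst": float("nan"), "chl": 0.12},
    {"station": "C", "sst": 17.9, "chl": float("nan")},
    {"station": "D", "sst": 24.0, "chl": 0.08},
]

def nan_fraction(rows, column):
    """Fraction of rows in which the given column is NaN."""
    values = [row[column] for row in rows]
    return sum(1 for v in values if math.isnan(v)) / len(values)

# Missing fraction for each environmental variable
fractions = {col: nan_fraction(env, col) for col in ("sst", "chl")}

# Rows with no missing values in either variable
complete = [row for row in env
            if not any(math.isnan(row[c]) for c in ("sst", "chl"))]

print(fractions)
print(len(complete), "of", len(env), "rows are complete")
```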
What species are driving the patterns?
Finally, we know that the copepod communities are different for the different stations, but which species are typical of the different communities? In the package vegan there is a function called simper() (similarity percentage), which identifies the species that contribute most to the differences between groups, but sometimes the results are non-intuitive. We find that Indicator Species Analysis using the function indval() (from the package labdsv) is more robust and interpretable. The function indval() calculates the indicator value of a species as the product of its relative frequency (based on presence) and its relative average abundance in clusters/groups (here stations).
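The indicator value described above is simple enough to compute by hand. The following Python sketch implements only the product stated in the text (relative average abundance times relative frequency) for one species; it is not the indval() function itself, the function name indicator_values is hypothetical, and the abundances and station labels are made up.

```python
def indicator_values(abundances, groups):
    """Indicator value of one species in each group.

    abundances: abundance of the species in each sample
    groups:     group (station) label of each sample
    Returns a dict mapping group label to a value between 0 and 1.
    """
    labels = sorted(set(groups))
    mean_abund = {}
    rel_freq = {}
    for g in labels:
        vals = [a for a, lab in zip(abundances, groups) if lab == g]
        mean_abund[g] = sum(vals) / len(vals)
        # Relative frequency: share of samples in the group where the species occurs
        rel_freq[g] = sum(1 for v in vals if v > 0) / len(vals)
    total = sum(mean_abund.values())
    result = {}
    for g in labels:
        # Relative average abundance: group mean as a share of all group means
        rel_abund = mean_abund[g] / total if total else 0.0
        result[g] = rel_abund * rel_freq[g]
    return result

# Hypothetical species counted in three samples at each of two stations
abundances = [10, 8, 0, 1, 0, 0]
groups = ["A", "A", "A", "B", "B", "B"]

iv = indicator_values(abundances, groups)
print({g: round(v, 3) for g, v in iv.items()})
```

A species scores highly for a station only when it is both concentrated there (high relative abundance) and found in most samples from that station (high relative frequency). In this example the species is abundant at station A but absent from one of its three samples, which pulls its indicator value for A down to about 0.63.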
HOMEWORK
For homework, please submit answers to the following questions derived from our last two pracs. Where necessary, include model output, figures or references to justify your answers. Limit: 3 pages (incl. figures). Total marks: 15.
1. In your analysis, did you show that the NRSs have distinctive zooplankton communities around Australia? (1 mark) Describe, including a figure, how the various environmental covariates you have studied influence zooplankton community composition. (3 marks)
2. Outline the trend you found in herbivorous zooplankton with the CPR data. (1 mark) Explain why you think we see this pattern. (2 marks)
3. Does copepod species richness (not abundance) increase or decrease towards the poles? (1 mark) Using 2 examples from the terrestrial and/or aquatic literature, discuss some possible reasons for this. (2 marks)
4. Zooplankton are wonderful indicators of environmental condition and change. Give 3 reasons why this is true. (3 marks) Briefly discuss how the NRS zooplankton sampling program can be used to monitor climate change in Australia, including what we should look for and where. (2 marks)