Highlights
Import the dataset in the file “southernAppsRiverHerbs.txt”. The dataset shows the number of plants found in separate plots in the southern Appalachians (Becky Brown’s PhD dissertation data).
Convert the geom2 column of the dataset into a factor.
Separate out columns 50 to 70 of the entire dataset into a separate dataset that just contains the number of 20 plant species found in each plot (community matrix). Are there lots of zeros in this dataset?
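One way to answer the question about zeros is to compute the share of cells in the community matrix that equal zero. The sketch below is a language-neutral illustration in plain Python, not the R workflow the assignment expects; `zero_fraction` and `toy_counts` are hypothetical names, and the counts are made up for the example.

```python
def zero_fraction(matrix):
    """Return the fraction of cells in a sites-by-species table that are zero."""
    total_cells = sum(len(row) for row in matrix)
    if total_cells == 0:
        raise ValueError("matrix has no cells")
    zero_cells = sum(1 for row in matrix for value in row if value == 0)
    return zero_cells / total_cells


# Hypothetical toy counts: 3 plots (rows) by 4 species (columns).
toy_counts = [
    [0, 3, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 7, 2],
]

print(f"Fraction of zero cells: {zero_fraction(toy_counts):.2f}")
```

A high fraction of zeros is typical of community data and matters later, because methods based on Euclidean distance treat shared absences as similarity.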
Now run a principal components analysis (PCA) on this community matrix dataset. Generate a summary and a screeplot of the PCA. What proportion of the variation in the community matrix is explained by the first few principal components? Does the proportion of variation explained plateau after a set number of principal components? How many principal components should be used if we aim to explain 95% of the variation in the community matrix?
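The 95% question comes down to a cumulative sum over the per-component variances, which is the same arithmetic behind the cumulative proportion row of a PCA summary. The following is a minimal sketch in plain Python, assuming the variances are already sorted from largest to smallest; `components_needed` and the toy variances are hypothetical and are not results from the assignment dataset.

```python
def cumulative_proportions(variances):
    """Return the running share of total variance for each component."""
    total = sum(variances)
    if total <= 0:
        raise ValueError("total variance must be positive")
    running = 0
    shares = []
    for variance in variances:
        running += variance
        shares.append(running / total)
    return shares


def components_needed(variances, target=0.95):
    """Return the smallest number of leading components reaching the target share."""
    for count, share in enumerate(cumulative_proportions(variances), start=1):
        if share >= target:
            return count
    return len(variances)


# Hypothetical variances (eigenvalues), sorted in decreasing order.
toy_variances = [60, 20, 10, 6, 4]

print(cumulative_proportions(toy_variances))
print(components_needed(toy_variances, target=0.95))
```

A plateau in the screeplot corresponds to the point where successive entries of the cumulative list grow only slowly.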
Generate a biplot of the PCA you made above. Is there a good separation of the sites along the principal component axes? Do you suspect a horseshoe or arch effect? Why would that happen?
Based on the biplot, provide an example of a species that is positively correlated with PC1 and an example of a species that is negatively correlated with PC1. Provide an example of a species that is positively correlated with PC2, and one that is negatively correlated with PC2. Also provide an example of a site that is the most different from all other sites in terms of PC2.
Separate out columns 123 to 133 of the entire dataset into a separate environmental variable dataset that just contains the log-transformed values for concentration of various minerals in soil samples at each of the sites.
Now run a redundancy analysis (RDA) on the community matrix that you created in Question 3, using the environmental variables to constrain the ordination. Which environmental variables are strongly correlated, either positively or negatively, with the first or second RDA axis? Does the configuration of the sites look very different from the PCA biplot in Question 5? Why could they be different? What proportion of the variation in the ordination is explained by the environmental dataset matrix, based on the summary of the redundancy analysis?
Now run a correspondence analysis (CA) on the community matrix dataset. Generate a summary and a screeplot of the CA. Is most of the variation in the community matrix explained by the first few correspondence axes? Does the proportion of variation explained plateau after a set number of correspondence axes? How many axes should we use if we aim to explain 95% of the variation in the community matrix?
Generate a biplot of the CA you made above. Is there a good separation of the sites along the correspondence axes? Do you suspect a horseshoe or arch effect? Why would that happen? Does the biplot look exactly like the biplot you generated for the PCA in Question 5? What differences between the two methods would contribute to the difference?
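One difference worth keeping in mind when comparing the two biplots: PCA on raw counts works with centred abundances and Euclidean distances, whereas CA works with departures from row-by-column independence, so its total inertia equals the Pearson chi-square statistic of the table divided by the grand total. The sketch below illustrates that quantity in plain Python; `ca_total_inertia` and the toy tables are hypothetical, and the code skips cells whose expected count is zero on the assumption that empty rows or columns have been dropped.

```python
def ca_total_inertia(table):
    """Return the Pearson chi-square of a count table divided by its grand total."""
    grand_total = sum(sum(row) for row in table)
    if grand_total <= 0:
        raise ValueError("table must contain at least one positive count")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of sites and species.
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected > 0:
                chi_square += (observed - expected) ** 2 / expected
    return chi_square / grand_total


# Hypothetical tables: two sites sharing no species, and two identical sites.
no_overlap = [[10, 0], [0, 10]]
identical = [[5, 5], [5, 5]]

print(ca_total_inertia(no_overlap))
print(ca_total_inertia(identical))
```

Because each cell is weighted by its expected count, rare species can influence a CA more than they influence a PCA of raw counts.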
Based on the biplot, provide an example of a species that is positively correlated with CA1 and an example of a species that is negatively correlated with CA1. Provide an example of a species that is positively correlated with CA2, and one that is negatively correlated with CA2. Also provide an example of a site that is the most different from all other sites in terms of CA2.
Now run a canonical correspondence analysis (CCA) on the community matrix that you created in Question 3, using the environmental variables to constrain the ordination. Which environmental variables are strongly correlated, either positively or negatively, with CCA axis one or CCA axis two? Are these different from what you got with the RDA in Question 8? Does the configuration of the sites look very different from the CA biplot in Question 10? Why could they be different? What proportion of the variation in the ordination is explained by the environmental dataset matrix?
Create a new variable by converting the community matrix dataset into a matrix using the as.matrix() function. Then run a MANOVA using the manova() command with this matrix as the response variable, and the geom2 variable of the original dataset as the predictor variable. This will test the null hypothesis that there is no difference between the centroids of the multivariate distribution of the species composition of sites between the 5 different groups defined by the variable geom2.
Run the summary() function on the MANOVA object you created above. Can you reject the above null hypothesis using each of the four test statistics (Pillai’s trace, Wilks’ lambda, Hotelling-Lawley trace, Roy’s largest root)?
Finally, run a non-metric multidimensional scaling on the community matrix dataset. What is the stress level of the default configuration? Increase the number of axes until you get an acceptable level of stress. What was the number of axes that you chose?
Generate a biplot of the NMDS. Is there a good separation of the sites along the NMDS axes? Does it look similar to what we got with the PCA and the CA methods? What differences between these three methods would contribute to the difference?
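Part of the difference between the methods lies in the distance each one preserves. PCA preserves Euclidean distances among sites, while NMDS only tries to preserve the rank order of whatever dissimilarity it is given (Bray-Curtis if the NMDS is run with the default settings of metaMDS() from the vegan package, which is an assumption about how the analysis is done here). The sketch below, in plain Python with hypothetical toy sites, shows a case where the two measures disagree: Euclidean distance rates two sites with no species in common as closer than two sites with the same species in the same proportions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two abundance vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: summed absolute differences over summed totals."""
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both sites are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Hypothetical abundances of three species at three sites.
site_a = [0, 1, 1]
site_b = [1, 0, 0]  # shares no species with site_a
site_c = [0, 4, 4]  # same species as site_a, four times as abundant

print(euclidean(site_a, site_b), euclidean(site_a, site_c))
print(bray_curtis(site_a, site_b), bray_curtis(site_a, site_c))
```

Under Bray-Curtis the sites with no shared species sit at the maximum dissimilarity of 1, which is usually the more sensible ranking for sparse community data.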
Overlay the groupings of the sites by the geom2 variable on the NMDS ordination, using the ordispider command. Is there a separation of the sites according to the geom2 variable?
Using the adonis() command, test the null hypothesis that there is no difference in the species community between the 5 groups defined by the geom2 variable. Have you been able to reject this null hypothesis? How is this different from the MANOVA we just ran in Question 13?
Using the anosim() command, test the null hypothesis that there is no difference in the species community between the 5 groups defined by the geom2 variable. Have you been able to reject this null hypothesis? How is this different from the ADONIS we just ran above, and the MANOVA we ran in Question 13?
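A useful contrast for the comparison asked above is that ANOSIM works only with the rank order of the pairwise dissimilarities. Its statistic is R = (mean between-group rank − mean within-group rank) / (n(n − 1)/4), where n is the number of sites. The sketch below computes R in plain Python for a hypothetical four-site example; it leaves out the permutation step that produces the p-value, and `anosim_r`, `toy_dist` and `toy_groups` are illustrative names rather than part of any library.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def anosim_r(dist, groups):
    """ANOSIM R statistic from a square dissimilarity matrix and group labels."""
    n = len(groups)
    pairs = list(combinations(range(n), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    if not within or not between:
        raise ValueError("need both within-group and between-group pairs")
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (n * (n - 1) / 4)


# Hypothetical dissimilarities: sites 0-1 form group A, sites 2-3 form group B.
toy_dist = [
    [0.0, 0.1, 0.6, 0.7],
    [0.1, 0.0, 0.8, 0.9],
    [0.6, 0.8, 0.0, 0.2],
    [0.7, 0.9, 0.2, 0.0],
]
toy_groups = ["A", "A", "B", "B"]

print(anosim_r(toy_dist, toy_groups))
```

R near 1 means that sites within a group are more similar to each other than to sites in other groups, and R near 0 means that the grouping explains little.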
Based on all of these, which of the methods (PCA, CA, MANOVA, NMDS) would you include in a report, and why?