####################################################### # BIOSCI 109 Lab: Introduction to Tree-Thinking in R, # using phylogenies of tetrapods # # by: Nick Matzke, 2019 # School of Biological Sciences # University of Auckland ####################################################### ####################################################### # PART I. SET-UP # PART II. ULTRA-SHORT CRASH COURSE IN R # PART III. INTRODUCTION TO TREES IN R # PART IV. VERTEBRATE PHYLOGENY # PART V. MAPPING CHARACTER EVOLUTION ON A PHYLOGENY: # FEATHERED FLIGHT AND VIVIPARITY # PART VI. BONUS MATERIAL (OPTIONAL): THE PHYLOGENY AND # BIOGEOGRAPHY OF KIWIS, MOAS, AND RELATIVES ####################################################### ####################################################### # Summary: # # In this lab we will learn about "tree-thinking" -- # how to read evolutionary trees, and how to think # about evolution in terms of different depictions # of evolutionary trees. # # The lab will do this using "R", an open-source # statistical computing language that is now used # in science and statistics worldwide. NO PREVIOUS # KNOWLEDGE OF R IS REQUIRED -- all of the commands # used here can be copy-pasted from this script # into R. # # NOTE: Phylogenetics, and R, are two of the areas # for which New Zealand and the University of # Auckland are well-known in science. For example, # the original publication of R, Ihaka & # Gentleman (1996), now has over 10,000 citations # in the scientific literature. The authors, Ross # Ihaka and Robert Gentleman, are from the # Department of Statistics at the # University of Auckland. # # Assessment: # # All tree-thinking material (the specific # trees, the conceptual understanding of trees, etc.) # WILL BE FAIR GAME for assessments in this course, # and it will also be assumed knowledge for future # courses involving evolution. # # The "R" material WILL NOT be assessed -- however, # learning R can be a very useful skill for data # analysis, scientific graphics, and statistics. # Hopefully, by getting a taste of R here, you will # be encouraged to develop your R skills in future # courses and projects. (Note: knowing R can help # you get a job!) # ####################################################### ############################################################## # LICENSE: # # Free to use/redistribute under: # Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) # # This program is free software; you can redistribute it and/or # modify it under the terms of the above license, linked here: # # http://creativecommons.org/licenses/by-nc-sa/3.0/ # # Summary: # # You are free: # # * to Share -- to copy, distribute and transmit the work # * to Remix -- to adapt the work # # Under the following conditions: # # * Attribution -- You must attribute the work in the manner # specified by the author or licensor (but not in any way that # suggests that they endorse you or your use of the work). # * Noncommercial -- You may not use this work for commercial purposes. # # * Share Alike -- If you alter, transform, or build upon this work, # you may distribute the resulting work only under the same or # similar license to this one. # # http://creativecommons.org/licenses/by-nc-sa/3.0/ # ################################################################### ################################################################### # # Please link/cite if you use this, please email me if you have # thoughts/improvements/corrections: n.matzkeATauckland.ac.nz # # Citation: Matzke, Nicholas J. (2019). Introduction to # Tree-Thinking in R, using phylogenies of tetrapods and # paleognath birds. Lab for BIOSCI 109 at the University of # Auckland, New Zealand. # ################################################################### ################################################################### # PART I. SET-UP ################################################################### # # A. Installation of R # # Your lab computers should already have R (or RStudio) installed. # Find it and open it. # # If you are doing the lab on your own laptop or another machine, # you will have to install R first. Because R is free and open- # source, this is easy. # # Two options: # A1. Basic (terminal) R, or R.app for Macs: # Go to: https://cloud.r-project.org/ # # A2. RStudio: # https://www.rstudio.com/products/rstudio/download/ # # B. Installation of R packages # # R "packages" allow scientists to perform specialist analyses. # R packages are also open-source and shared freely worldwide. # Here, we will be using 2 packages designed for phylogenetics: # # 1. APE ("Analysis of Phylogenetics and Evolution") # 2. phangorn (yes, this is a nerd pun) # # These packages are already installed on the lab computers. # On other computers, you can install them by typing these # commands in R: # # (remove the "#", the comment character, to run the commands) # # install.packages("ape") # install.packages("phangorn") # ################################################################### ################################################################### # PART II. ULTRA-SHORT CRASH COURSE IN R ################################################################### # R has a "command-line interface", rather than a "point-and-click" # interface. Although "point-and-click" is easier at first, # command-line is better for science, because: # # (a) you can save all your commands in a text file, and repeat # the analysis easily by copying and pasting the saved commands; # this is much easier than trying to remember all the clicks you # did, in what order. # # (b) these saved commands constitute an "R script" that you can # share with other scientists, who can modify and improve the # script. This lab is in the form of an R script. # # (c) this is an easy form of beginning programming; eventually, # it leads to writing your own programs and R packages, thus # earning fame and fortune. # # We will not attempt a complete introduction to R here, just # the absolute basics. # # COMMENTS # # Any line that starts with a "#" is a comment. Comments, when # pasted into R, do nothing. # # Comments are where the script describes what is happening. # Well-commented code is valuable because other users, or # your future self, can more easily figure out what each # piece of code is doing. # # BASIC MATH # # Type these commands into R to see R do math: 1+1 30/3 5*3 # STORING AND RETRIEVING VARIABLES # # Variables can store numeric data # Typing the name of the variable returns what is stored: var1 = 1+1 var1 # Variables can also store character data var2 = "Hello, world!" var2 # LOGICAL TESTS # # The "==" operator tests if two things are equal, # and returns TRUE or FALSE 1 == 1 var1 == 1 var1 == 2 # R can also do greater than, less than, not equal, etc. 1 < 2 2 > 1 1 != 1 1 != 2 # VARIABLES STORING ARRAYS # A variable can store multiple items; this is a "vector" # or "array". Let's store a list of species names: var3 = c("human", "chimp", "gorilla", "orangutan") var3 # Items in an array can be accessed with square brackets: [ ] var3[1] var3[2] var3[3] var3[4] # FOR-LOOPS # A "for-loop" can be used to loop through an array, for # example to print or to search for a match: # ( copy-and-paste from "for" to "}" ) # Print each item in var3: for (i in 1:4) { print(var3[i]) } # PRINT ONLY MATCHES, WITH IF-ELSE STATEMENT # * "If-else" statements allow the code to # perform an action, *if* a particular # TRUE/FALSE statement is true. # # * Here, the variable "i" iterates through # 1, 2, 3, and 4. For each i, var3 is # checked to see if it equals "human". # If so, it prints "match found". If not, # it prints "match not found". # # * If-else statements allow more complex # analyses. # # ( copy-and-paste from "for" to the last "}" ) for (i in 1:4) { print(var3[i]) if (var3[i] == "human") { print("match found!") } else { print("match not found!") } } # FUNCTIONS # # Functions are pre-written code that takes an input # and produces an output. All functions end with # round brackets -- "(" and ")" -- which is where # the inputs go. # # We saw an example already, with the print() # function: print(var3) print(var3[1]) print(var3[3]) # Many R functions are statistical. For example, # random number generation: # Draw a random number from a normal distribution # with a mean of 1.0, standard deviation of 0.1 rnorm(n=1, mean=1.0, sd=0.1) # Do it several times rnorm(n=1, mean=1.0, sd=0.1) rnorm(n=1, mean=1.0, sd=0.1) rnorm(n=1, mean=1.0, sd=0.1) rnorm(n=1, mean=1.0, sd=0.1) rnorm(n=1, mean=1.0, sd=0.1) # Generate 100 such numbers and store in var4 var4 = rnorm(n=100, mean=1.0, sd=0.1) # R functions can also do statistical plots # Here is a histogram plot of var4 # Looking at var4 as a list of nubmers is difficult var4 # But, looking at a histogram is easy: hist(var4) # Repeat the histogram with 10,000 samples from # a normal distribution. Does this look # closer to the theoretical expectation for # a normal distribution (a "bell curve")? var5 = rnorm(n=10000, mean=1.0, sd=0.1) hist(var5) # LIBRARY-ING R PACKAGES, TO ACCESS MORE FUNCTIONS # # R packages store functions for specific scientific # purposes. For example, just like R can generate # random numbers, R can generate random phylogenetic # trees, with the "rtree()" function: # Paste into R: tr = rtree(n=10, rooted=TRUE) # Oops, that produced an error! The "rtree" (random tree) # function is part of the "ape" package. So we have to # library() the ape package first: library(ape) tr = rtree(n=10, rooted=TRUE) # No error that time, since we did "library(ape)". # Type "tr" to see what is in the variable "tr": tr # That's a tree with 10 tips! Let's see what it # looks like, with the plot() function: plot(tr) # Let's add a title title("This is a randomly-generated phylogenetic tree, with 10 tips.") # Let's make a bigger tree, with 40 tips: tr = rtree(n=40, rooted=TRUE) plot(tr) title("This is a randomly-generated phylogenetic tree, with 40 tips.") ########################################################## # CONCLUSION TO R CRASH COURSE # # That's enough basic R for now. The rest of the script # will use various R functions to load and display # phylogenies. Your job will be to copy-and-paste # the commands, look at the tree graphic, and # learn about the different ways phylogenetic trees # are displayed, interpreted -- and mis-interpreted! ########################################################## ########################################################## ########################################################## # HELPFUL R TRICKS # # 1. UP ARROW -- In RStudio and most other R platforms, # hitting the "up" arrow on your keyboard will bring # back your previous commands. This can be very # helpful for saving typing, going back to an earlier # step, etc. # # 2. Control-F or Command-F -- These keys are a shortcut # for "Find" in many programs. They let you search # a webpage, text file, PDf, or Word file for a # specific character string. This can be useful to # find a specific question in a large document. E.g. # to find Q32 in this long R script, type Control-F # and then type "Q32". (whether it is Control-F or # Command-F depends on the operating system. Typically # on Windows, it is Control-F). # # 3. dev.off() -- Many of the plots below will be done by: # a. Opening a PDF file for writing, with the "pdf" # command. # b. Sending graphics into the PDF with the "plot" # command # c. Stopping writing the PDF file with "dev.off()" # d. Opening the PDF for viewing with "system(cmdstr)" # # If you accidentally skip step (c), all future plotting # commands will be adding to the PDF opened in (a), # rather than into new PDFs. This can be very confusing, # and produce error messages about broken PDFs etc. # If it seems like this is happening: # * run "dev.off()" several times to close any # writing-open pdfs and any R graphics windows. # * Also, manually close (with clicking) any # graphics PDFs you are viewing, so that you # can see the new versions when they are made/opened. ########################################################## ########################################################## ####################################################### ####################################################### # INSTRUCTIONS: COPYING-AND-PASTING CODE CHUNKS ####################################################### ####################################################### # # For small commands, it is fine to paste in the # commands one, or a few, lines at a time. # # However, for larger plots, it is best to paste # in "chunks" of code. To make it clear where to # do this, we have labeled the "code chunks" like this: # ####################################################### # START CODE CHUNK #1 ####################################################### # # (some R code) # ####################################################### # END CODE CHUNK #1 ####################################################### # # When you see a code chunk, copy-and-paste the # whole chunk at once. # # There are 17 labeled code chunks in the lab exercise, # or 25 if the bonus material in part 6 is included. ####################################################### ####################################################### # PART III. INTRODUCTION TO TREES IN R ####################################################### # Let's make sure our R packages are library()-ed: library(ape) library(phangorn) # A. Phylogenies: Basic terminology ####################################################### # START CODE CHUNK #1 ####################################################### # Let's load a simple phylogeny great_ape_newick_string = "(((human:6,chimpanzee:6):1,gorilla:7):5,orangutan:12);" great_ape_phylogeny = read.tree(file="", text=great_ape_newick_string) great_ape_phylogeny # And plot the phylogeny to a PDF # Open the PDF for writing pdffn = "chunk01_Fig1_ape_phylo.pdf" pdf(file=pdffn, width=6, height=6) # Send the plot to the PDF plot(great_ape_phylogeny) axisPhylo() mtext(side=1, text="Millions of years ago", line=2) title("Chunk 1, Fig.1: Simple phylogeny of the great apes") # Close the writing of the PDF, open for viewing dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #1 ####################################################### # Terminology: # Phylogenies have several parts: # 1. Nodes. There are 3 kinds of nodes: # a. "Tip node", or "leaf" -- these are the "tips" of # the tree. Each tip represents a species. # b. "Internal node" -- these are the branching # points of the tree. Each represents a # speciation event. # c. "Root node". This is the internal node at the # "bottom" of the tree, i.e. the "roots" of # the tree. # # 2. Branches or edges. The branches are the lines # connecting the nodes. # a. As you can see, some branches have different # lengths. However (KEY POINT), whether or not # the displayed branch lengths are meaningful # depends on what kind of phylogeny is being # displayed. # b. In CLADOGRAMS, the branch lengths are # not meaningful. Cladograms just display # the hierarchical pattern of relationships # between species, also known as the # "tree topology". In cladograms, # the branches may differ in length, but # this is done only for graphical # convenience. # b. In PHYLOGRAMS, the branch lengths are # meaningful. The branch lengths can # represent: # - Amount of molecular change (e.g., the # amount of DNA change) # - Amount of morphological change (e.g., # the number of changes in discrete # characters) # - Amount of time elapsed. # # 3. Tip names, or "tip labels". Typically tip # names are the common or scientific names of # the species being studied, and the tree is a # tree of species, but tips can also be # gene names, specimen names, populations, # languages, or anything else that one # might want to study the phylogenetic # history of. # # 4. Phylogenetic graphics are scientific figures, # often used in the figures in textbooks, and # in scientific publications. As such they # often have additional information beyond # just the nodes, branches, and tip labels # of the phylogeny. These can include: # # a. Scale bar or timescale, showing the # units of branch length # b. Axis label # c. Descriptive title. # # Now, let's answer some basic questions, using # the information above, and the tree you have # plotted. # # Chuck 1, Q1: Is the phylogeny we are looking at a # phylogram or cladogram? # # Chuck 1, Q2: Which species are most-closely related # in this phylogeny? # # Chuck 1, Q3: Can a cladogram have a scale bar? # # Chuck 1, Q4: How many clades are shown in this tree? # # # R assigns each node a node number, in order # to keep track of the tree structure in the # computer. We can # display these node numbers on the plot. Run # this code and answer the following questions. ####################################################### # START CODE CHUNK #2 ####################################################### pdffn = "chunk02_Fig2_apes_node_numbers.pdf" pdf(file=pdffn, width=6, height=6) # Plot the phylogeny plot(great_ape_phylogeny) axisPhylo() mtext(side=1, text="Millions of years ago", line=2) title("Chunk 2, Fig. 2: Simple phylogeny of the great apes\n(with node numbers)") # Add the node numbers to the plot nodelabels() tiplabels() dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #2 ####################################################### # Chunk 2, Q5: Give the node numbers of the tip nodes # Chunk 2, Q6: Give the node numbers of the internal nodes # Chunk 2, Q7: Give the node number of the root node # # Chunk 2, Q8: Give the age of each of the 7 nodes # # Chunk 2, Q9: Which node represents the split between # humans and chimps? # Chunk 2, Q10: Which node represents the split between # humans and gorillas? # Chunk 2, Q11: Which node represents the split between # gorillas and chimps? # R also assigns numbers to the branches/edges. # Let's plot those. ####################################################### # START CODE CHUNK #3 ####################################################### # Plot the phylogeny pdffn = "chunk03_Fig3_apes_branch_numbers.pdf" pdf(file=pdffn, width=6, height=6) plot(great_ape_phylogeny) axisPhylo() mtext(side=1, text="Millions of years ago", line=2) title("Chunk 3, Fig. 3: Simple phylogeny of the great apes\n(with branch numbers)") # Add the branch numbers to the plot edgelabels() dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #3 ####################################################### # Chunk 3, Q12: Which branch is longest? How long is it? # (include the units in your answer) # Chunk 3, Q13: Which branch is shortest? How long is it? # (include the units in your answer) # Chunk 3, Q14: Why are there 6 branches, and not 9 or 12? # Displaying the same tree in another way # might help you answer (see code chunk 4): ####################################################### # START CODE CHUNK #4 ####################################################### pdffn = "chunk04_Fig4_apes_diagonal.pdf" pdf(file=pdffn, width=6, height=6) plot(great_ape_phylogeny, type="c") axisPhylo() mtext(side=1, text="Millions of years ago", line=2) title("Chunk 4, Fig. 4: Simple phylogeny of the great apes,\ndiagonal branch view") edgelabels() dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #4 ####################################################### # Cladograms versus Phylograms # Let's compare 2 versions of the great ape phylogeny: ####################################################### # START CODE CHUNK #5 ####################################################### great_ape_newick_string1 = "(((human:6,chimpanzee:6):1,gorilla:7):5,orangutan:12);" great_ape_newick_string2 = "(((human,chimpanzee),gorilla),orangutan);" # And, let's plot them both: great_ape_phylo1 = read.tree(file="", text=great_ape_newick_string1) great_ape_phylo2 = read.tree(file="", text=great_ape_newick_string2) # Plot 2 graphics on same page: dev.off() # (closes previous graphics windows) # Plot into a PDF pdffn = "chunk05_Fig5ab_compare_2_great_ape_phylogenies.pdf" pdf(file=pdffn, width=8.5, height=11) par(mfrow=c(2,1)) plot(great_ape_phylo1) title("Chunk 5, Fig. 5a: Tree from great_ape_newick_string1") plot(great_ape_phylo2) title("Chunk 5, Fig. 5b: Tree from great_ape_newick_string2") dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #5 ####################################################### # Chunk 5, Q15: Which tree is a cladogram? # # Phylogenetic trees are displayed in R, but they are # stored in a simple, textual format, called # parenthetical or "newick" format. In R, chunks # of text are called "strings" of characters, so # I called the variables "great_ape_newick_string1" # and "great_ape_newick_string2". # # Chunk 5, Q16: Look at the two newick_strings. What is the difference # between them? How does that connect to the difference # between a cladogram and a phylogram? # # Chunk 5, Q17: The "newick" format for phylogenies was invented # by some evolutionary biologists in the 1980s. # Google the origin of the name "newick" -- where # does the name come from? # ################################################################### # PART IV. VERTEBRATE PHYLOGENY ################################################################### # Now we will explore a more complex phylogeny: a backbone # tree of vertebrates, focusing on the tetrapods. Again, # in addition to the tree-thinking # skills, LEARN THE STRUCTURE OF THE VERTEBRATE PHYLOGENY. YOU # SHOULD BE ABLE TO DRAW IT FROM MEMORY (meaning, know the cladogram; # you should have a general idea of the divergence times, but # these are only known approximately and are still debated to # some extent). The vertebrate cladogram is basic # knowledge for all of your future work in biology and evolution. # # This phylogeny is simplified from Chiari et al. (2012). # ####################################################### # START CODE CHUNK #6 ####################################################### # Get the Newick string and load the tree. dev.off() # (closes previous graphics windows) pdffn = "chunk06_Fig6_compare_2_great_ape_phylogenies.pdf" pdf(file=pdffn, width=9, height=9) vert_newick_str = "(shark:471,(tuna:432,(lungfish:416,(frog:352.6457263,(((((kiwi:81.06707278,seagull:81.06707278):132.1992512,crocodile:213.266324):38.01256425,turtle:251.2788883):20.2733676,((wall_lizard:134.7070246,(snake:127.5268735,anole_lizard:127.5268735):7.180151025):85.293,Tuatara:220):51.5522313):42.65854432,(platypus:187.9246145,(opossum:147.3778793,human:147.3778793):40.5467352):126.2861857):38.43492607):63.35427375):16):41);" vert_phylo = read.tree(file="", text=vert_newick_str) plot(vert_phylo) axisPhylo() title("Chunk 6, Fig. 6: Backbone phylogeny of vertebrates") mtext(side=1, text="Ma (millions of years ago)", line=3) dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #6 ####################################################### # Answer these basic questions: # # Chunk 6, Q18: Is this a phylogram or cladogram? # # Chunk 6, Q19: In the term "tetrapod", what do "tetra" and "pod" mean? # # Chunk 6, Q20: Does everything in the tetrapod group fit this # description, or does the term describe # the inferred common ancestor of the group? # # Chunk 6, Q21: Approximately when did tetrapods begin to diversify? # # Chunk 6, Q22: I have put an example species of each major group at the tips. # Obviously, this phylogeny does not contain all of the species # in these groups. On a paper copy of Figure 6, write each # group name next to the tip with its representative species: # Names corresponding to a single tip in this tree: # Actinopterygii # Amphibia # Serpentes # Sphenodon # Testudines # Crocodilia # Paleognathae # Neognathae # Placentalia # Monotremata # Marsupalia # Chondrichthyes # Neoceratodus # # Name corresponding to 3 tips in the tree # Squamates # # # Chunk 6, Q23: For each of the groups above, look up the # approximate diversity (=number of species) # in the group, and write it next to the group # name. # # Chunk 6, Q24: Based on the phylogeny and further reading, # why is New Zealand's tuatara much # famous than most other reptiles? # # Chunk 6, Q25: Based on the phylogeny and further reading, # why are Australia's platypus and echidna # more famous than most other mammals? # # Chunk 6, Q26: Based on the phylogeny and further reading, # why is Australia's lungfish more famous than # most other fishes? # Polyphyletic groups versus monophyletic groups (=clades). # # Taxonomy is the science of classifying organisms. A taxon # (plural: taxa) is the name of a group of organisms. These # can be a species name, genus name, some other Linnaean # ranked taxon (order, class, etc.), or some unranked group # name. # # Linnaean taxonomy, which goes back to Linnaeus in the 1700s, # relies on a classification of groups-within-groups: species # are grouped into genera, genera into families, etc. These # groupings were typically done based on overall similarity of # shared characteristics. # # Phylogenetic systematics has evolved in recent decades, and # a key idea of phylogenetic systematics is that taxa should # be monophyletic groups, a.k.a. clades. A monophyletic taxon # includes a common ancestor and all of its descendants. If # a proposed taxon includes leaves out some of the # descendants of the common ancestor, it is paraphyletic. If # a proposed taxon leaves out the common ancestor, it is # polyphyletic. # # Phylogenetic systematists prefer monophyletic taxa, because # they are evolutionarily "real" -- they refer to coherent # groups on the Tree of Life (the phylogeny), which should # be discoverable by independent investigators using # independent datasets. Non-monophyletic groups are to some # extent arbitrary -- typically based on key characteristics # that humans find interesting, or overall appearance. # # During a phylogenetic analysis, researchers sometimes # discover Linnaean ranked taxa to be # monophyletic, and sometimes para- or polyphyletic. # Typically, monophyletic taxa are kept, and non- # monophyletic taxa are replaced with new names representing # proposed clades. Some scientists prefer to keep the # Linnaean ranked system (but modifying it to make all # groups monophyletic), some argue we should abandon # the Linnaean ranks (genus, family, etc.) and just # use unranked clade names. This debate is ongoing, so # you need to be aware that different scientific authors # take different positions. # # Here, we will practice our skills at identifying # monophyletic, polyphyletic, and paraphyletic groups. # # Chunk 6, Q27: Using brackets, label these traditional groups # (i.e., as understood in popular culture) on the # paper copy of the "Figure 6" tree: # # Names corresponding to one or more tips in the tree: # # mammals # reptiles # birds # amphibians # fishes # Chunk 6, Q28: Of the five groups, name the groups that are # are not monophyletic with respect to the other # four. # # Chunk 6, Q29: For the non-monophyletic groups, are they # polyphyletic or paraphyletic? # # Chunk 6, Q30: For each non-monophyletic group, which # other groups would have to be included # to make them monophyletic? # # Chunk 6, Q31: If a scientist says, "phylogenetically # speaking, birds are just a subgroup of # reptiles", what do they mean? # # Chunk 6, Q32: If a scientist says, "phylogenetically # speaking, tetrapods are just a subgroup # of fishes", what do they mean? # # Chunk 6, Q33: If a taxon "fishes" is defined as a # monophyletic group, are humans then # just an interesting type of fish, # phylogenetically speaking? # # Chunk 6, Q34: If a taxon "reptiles" is defined as a # monophyletic group, are humans then # just an interesting type of reptile, # phylogenetically speaking? # # Chunk 6, Q35: If a taxon "reptiles" is defined as a # monophyletic group, are birds then # just an interesting type of reptile, # phylogenetically speaking? # # Let's add some dinosaurs to this tree: # ####################################################### # START CODE CHUNK #7 ####################################################### vert_plusDinos_newick_str = "(shark:471,(tuna:432,(lungfish:416,(frog:352.6457263,(((((((((kiwi:71.06707278,seagull:71.06707278):60,Confuciusornis:5):20,Archaeopteryx:5):7.1992512,Velociraptor:80):30,Brontosaurus:50):25,crocodile:213.266324):38.01256425,turtle:251.2788883):20.2733676,((wall_lizard:134.7070246,(snake:127.5268735,anole_lizard:127.5268735):7.180151025):85.293,Tuatara:220):51.5522313):42.65854432,(platypus:187.9246145,(opossum:147.3778793,human:147.3778793):40.5467352):126.2861857):38.43492607):63.35427375):16):41);" vert_plusDinos_phylo = read.tree(file="", text=vert_plusDinos_newick_str) pdffn = "chunk07_Fig7_backbone_dinos_wFossils.pdf" pdf(file=pdffn, width=9, height=9) plot(vert_plusDinos_phylo) axisPhylo() title("Chunk 7, Fig. 7: Backbone phylogeny of vertebrates,\nsome dinosaurs added") mtext(side=1, text="Ma (millions of years ago)", line=3) dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #7 ####################################################### # Chunk 7, Q36: When someone says, "Actually, birds are # a type of dinosaur", what do they mean? # # Chunk 7, Q37: The smallest-known dinosaur is a species # living today. What is it? (use google) # # Chunk 7, Q38: What two groups of tetrapods living today can fly? # # Chunk 7, Q39: If "flying tetrapods" was proposed as a taxon, # would this be monophyletic, paraphyletic, or # polyphyletic? Why? # ####################################################### # Crown groups and stem groups # # As our ability to infer phylogenies has been improved, # it has become useful to distinguish between "crown" # and "stem" groups. These terms are most useful for # fossils, which often are related to, but fall outside # of, a living clade ("the crown"). # # # Definitions: # # To define a "crown group", pick 2 living taxa, and go back # to their common ancestral node. Anything descending from # this node is in the crown group. # # A "stem group" is any group that falls outside of this # crown. # # This distinction is useful because it avoids trying to solve # the problem of what defines "a bird", "a mammal", etc. # Because evolution is slow and gradual, the characters found # in these groups assembled gradually and not all a once. # # For example: all living birds descend from a common ancestor # that was warm-blooded, flew, had feathers, etc. However, we # find various fossils that have some, but not all, of the # characteristics of living birds. These are "transitional # fossils". Should we call them "birds", "bird-like # dinosaurs", "dinobirds", or what? We can pick a particular # characteristic, but this is arbitrary and sometimes produced # polyphyletic groups. # # The crown/stem distinction resolves this issue: fossils like # Archaeopteryx do not fall inside of crown birds. They are # stem groups, on the stem of crown birds. The crown group # they fall into is defined by the common ancestor of # crocodylians and living birds. # # # Chunk 7, Q40: Here, the two fossil birds are stem groups of # what extant crown clade? # (Hint: Google "crown group of birds". Yes, the # Wikipedia article on crown groups is accurate # as of May 7, 2019). # # Chunk 7, Q41a: What taxa in your tree represent the # smallest living crown clade that includes the # dinosaurs? # # Chunk 7, Q41b: What is the name of this group? # (You may have to google this!) (Hint: the # crown clade here is defined by the common # ancestor of 2 living groups: crococylians # and birds. So, google "crown group crocodiles # birds.") ####################################################### ####################################################### # Tree-thinking: avoiding notions of "progress" and # "higher" versus "lower" ####################################################### # # Going back to the ancient Greeks, it has been # common to think of life as a "Great Chain of # Being", or a ladder. It looks something like # this: # # gods # angels # humans # apes # mammals # reptiles # amphibians # fishes # other squishy things ("lower animals") # plants # rocks # # This way of thinking is so deeply embedded that # it often creeps into both popular and scientific # discussions of evolution and phylogeny. I even # did it by accident in our vertebrate phylogeny. # # Let's display the phylogeny in the way it is # often shown in textbooks: ####################################################### # START CODE CHUNK #8 ####################################################### pdffn = "chunk08_Fig8_subset_vert_cladogram.pdf" pdf(file=pdffn, width=9, height=9) plot(vert_phylo, type="cladogram", direction="upwards", use.edge.length=FALSE) title("Chunk 8, Fig. 8: Backbone cladogram of vertebrates") # Or, even more simplified: tips_to_drop = c("kiwi", "seagull", "turtle", "crocodile", "wall_lizard", "anole_lizard", "Tuatara") vert_phylo_subset = drop.tip(vert_phylo, tip=tips_to_drop) plot(vert_phylo_subset, type="cladogram", direction="upwards", use.edge.length=FALSE) title("Chunk 8, Fig.8: Backbone cladogram of vertebrates\n(subset)") dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #8 ####################################################### # # Here, because we read left-to-right, the display of # the cladogram tends to accidentally reinforce # the ladder-like, "Great-Chain-of-Being" misunderstanding # of evolution. # # To understand that this "linear view of evolution" is # a misconception, one has to understand that: # # IN VIEWING PHYLOGENIES, NODE-ROTATION IS ARBITRARY # # ...meaning, that any of the branches may be rotated # around the nodes, and EXACT SAME INFORMATION IS # ENCODED. A researcher might chose to arrange a # tree a certain way in order to emphasise a # particular point, or a particular clade of # interest, but these are basically subjective decisions. # ####################################################### # Demonstrating node rotations: ####################################################### # START CODE CHUNK #9 ####################################################### # Open PDF pdffn = "chunk09_Fig9_compare_2_vert_cladograms.pdf" pdf(file=pdffn, width=6, height=9) # Show 2 plots in one PDF par(mfrow=c(2,1)) plot(vert_phylo_subset, type="phylogram", direction="upwards", use.edge.length=FALSE) title("Chunk 9, Fig. 9a: Subset cladogram of vertebrates,\n(mammals at the right)") # (rotating some nodes) vert_phylo_subset2 = rotate(phy=vert_phylo_subset, node=c(12)) vert_phylo_subset2 = rotate(phy=vert_phylo_subset2, node=c(13)) vert_phylo_subset2 = rotate(phy=vert_phylo_subset2, node=c(14)) plot(vert_phylo_subset2, type="phylogram", direction="upwards", use.edge.length=FALSE) title("Chunk 9, Fig. 9b: Subset cladogram of vertebrates,\n(amphibians at the right)") # Stop writing PDF, open to screen dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #9 ####################################################### # Chunk 9, Q42: Does the mammal group "look" more primitive in # Figure 9a versus Figure 9b? Why might someone think this? # # Chunk 9, Q43: Is the mammal group ACTUALLY any more "primitive" # or "advanced" in Figure 9a versus Figure 9b of # the phylogeny? Why not? # # Arranging the full tree to put the most-recently diverging # groups at the right: ####################################################### # START CODE CHUNK #10 ####################################################### # Show two plots in one PDF # Open PDF pdffn = "chunk10_Fig10_compare_2_vert_cladograms.pdf" pdf(file=pdffn, width=6, height=9) # Show 2 plots par(mfrow=c(2,1)) plot(vert_phylo, type="phylogram", direction="upwards", use.edge.length=FALSE) title("Chunk 10, Fig. 10a: Backbone cladogram of vertebrates,\n(mammals at the right)") vert_phylo2 = ladderize(vert_phylo, right=FALSE) plot(vert_phylo2, type="phylogram", direction="upwards", use.edge.length=FALSE) title("Chunk 10, Fig. 10b: Backbone cladogram of vertebrates,\n(most-recent divergences at the right)") # Stop writing PDF, open to screen dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #10 ####################################################### # Chunk 10, Q44: What reptile species "appears" to be closest to # humans in Figure 10a? # # Chunk 10, Q45: What reptile species is ACTUALLY closest to humans, # evolutionarily speaking, according to a proper # reading of these phylogenies? (Note: trick question!) # # Chunk 10, Q46: Counting from the initial divergence of the tetrapods, # approximately how many million years of evolution # has each of these groups experienced? # # humans # opossum # platypus # snake # Tuatara # crocodile # frog # # Chunk 10, Q47: Which of these groups is "most evolved"? # # (Note: there may be some trick questions in here!) # ####################################################### # PART V. MAPPING CHARACTER EVOLUTION ON A PHYLOGENY: # FEATHERED FLIGHT AND VIVIPARITY ####################################################### # Here, we will practice mapping character evolution # on a phylogeny. This requires defining some terms: # # character: Any characteristic of a species that is # well-enough defined that it can be reliably # recognized in a species. Examples could includee: # feather colour, number of legs, presence of a tail, # etc. These would be "morphological" characters -- # characters about the physical morphology # (morphology = "study of shape") of an organism. # # Many other kinds of characters exist, e.g.: # - behavioural (parental care, mating behaviour) # - cytological (cell structure) # - karyotypic (chromosome number and structure) # - molecular (DNA and amino acid sequences) # # A "character map" is a reconstruction of the # evolutionary history of the character, on a # phylogeny. # # (As a statistical phylogeneticist, I prefer to # call these reconstructions "statistical estimates", # but "reconstruction" is an easier term to begin # with.) # # There are a number of methods to reconstruct # the history of a character on a phylogeny. A # method that is easy to understand intuitively # is "parsimony". A parsimonious reconstruction # is a history that explains the evolutionary # history of a character using the smallest # number of character-change events. # # Other ways to say this: "The most parsimonious # character history is the one that minimizes # the number of changes in the character." # # "The most parsimonious character history is # the one that minimizes the number of character # steps." # # If evolutionary changes in the character are # rare in evolutionary time, which is often # the case, then parsimony is a reasonable # way to make an educated guess at the # evolutionary history of a character. # # In future evolution classes, you will learn # more about how parsimony works as a computational # algorithm, as well as more sophisticated # alternatives (Maximum Likelihood and # Bayesian statistical approaches). # # For now, we will just do some intuitive # character reconstructions, and compare # them to the computer's reconstruction. # # Load some character data for: feathered flight # # Note: NEXUS format is a common computer # format for phylogenetic datasets ####################################################### # START CODE CHUNK #11 ####################################################### library(phangorn) # needed for NEXUS files # We are putting a big text string inside '', # then writing it to a temporary text file # (so that we don't have to mess with people # downloading / misplacing files) tetrapod_feathered_flight_nexus_format = '#NEXUS BEGIN DATA; DIMENSIONS NTAX=19 NCHAR=1; FORMAT DATATYPE = STANDARD GAP = - MISSING = ? SYMBOLS = "0 1"; MATRIX shark 0 tuna 0 lungfish 0 frog 0 kiwi 0 seagull 1 Confuciusornis 1 Archaeopteryx 1 Velociraptor 0 Brontosaurus 0 crocodile 0 turtle 0 wall_lizard 0 snake 0 anole_lizard 0 Tuatara 0 platypus 0 opossum 0 human 0 ; END; ' # Write to a temporary file datafn = "chunk11_tetrapod_feathered_flight.nexus" write(x=tetrapod_feathered_flight_nexus_format, file=datafn) # Load the feathered flight data into a variable # named "feathered flight" feathered_flight_ndf = read.nexus.data(file=datafn) state_names = c("no feathered flight", "feathered flight") data_levels = c(0, 1) feathered_flight = phyDat(feathered_flight_ndf, type = "USER", levels=data_levels) feathered_flight # Let's plot the character of "feathered flight" at the tips # of the tree state_colors = c("brown", "blue") nexus_to_tree_tiporder = match(x=names(unlist(feathered_flight_ndf)), table=vert_plusDinos_phylo$tip.label) colors_to_plot = state_colors[1+as.numeric(unlist(feathered_flight_ndf))[nexus_to_tree_tiporder]] pdffn = "chunk11_Fig11_distrib_feathered_flight.pdf" pdf(file=pdffn, width=9, height=9) plot(vert_plusDinos_phylo, label.offset=15) axisPhylo() mtext(side=1, text="Ma (millions of years ago)", line=3) tiplabels(text=NULL, tip=1:length(vert_plusDinos_phylo$tip.label), col=colors_to_plot, bg=colors_to_plot, pch=21, cex=3.5) legend(x="topleft", legend=state_names, fill=state_colors, cex=0.75) title("Chunk 11, Fig. 11: Distribution of the character of 'feathered flight'") dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) # Chunk 11, Q48: Looking at this phylogeny, and the distribution of the # "feathered flight" character, # # Chunk 11, Q48a: How many times do you think feathered flight evolved? On a # paper copy of Figure 11, mark the branch(es) where you think # feathered flight evolved. # # Chunk 11, Q48b: How many times do you think feathered flight was lost? On a # paper copy of Figure 11, mark the branch(es) where you think # feathered flight was lost. # # Let's run a parsimony analyses on this character parsimony_score = parsimony(tree=vert_plusDinos_phylo, data=feathered_flight, method="fitch", site="pscore") parsimony_score ####################################################### # END CODE CHUNK #11 ####################################################### # Chunk 11, Q49: The "parsimony score" is the number of changes reconstructed # in the character. Does this match your guess based on # intuition, above? # Let's do an MPR (Most Parsimonious Reconstruction) of this character # on the tree, then plot the map of the character on the tree. ####################################################### # START CODE CHUNK #12 ####################################################### # MPR analysis ancestral_states = ancestral.pars(tree=vert_plusDinos_phylo, data=feathered_flight, type="MPR", cost=NULL, return="prob") # Plot to a PDF pdffn = "chunk12_Fig12_feathered_flight_reconstruction_under_Fitch_parsimony.pdf" pdf(file=pdffn, width=7, height=7) plotAnc(tree=vert_plusDinos_phylo, data=ancestral_states, i=1, site.pattern=TRUE, col=state_colors, pos=NULL) axisPhylo() titletxt = paste0("Chunk 12, Fig. 12: Mapping of ancestral states under Fitch parsimony\n(equal costs).") title(titletxt) mtext(side=1, text="Ma (millions of years ago)", line=3) # Plot the legend legend(x="topleft", legend=state_names, fill=state_colors, cex=0.5) parsimony_txt = paste0("parsimony score\n(# of steps) = ", parsimony_score) text(x=-8, y=28, labels=parsimony_txt, pos=4) dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) # Chunk 12, Q50: Does the parsimony reconstruction match your reconstruction # based on intuition? Why or why not? # Chunk 12, Q51: Approximately when did feathered flight evolve? (Give an # approximate minimum and maximum date.) # Taxon sampling (which species are included in the phylogeny) # can strongly affect ancestral character reconstructions # # Let's do the same parsimony analysis, but leaving out # the fossil taxa. ####################################################### # START CODE CHUNK #13 ####################################################### # MPR analysis ancestral_states = ancestral.pars(tree=vert_phylo, data=feathered_flight, type="MPR", cost=NULL, return="prob") parsimony_score = parsimony(tree=vert_phylo, data=feathered_flight, method="fitch", site="pscore") parsimony_score # Plot to a PDF pdffn = "chunk13_Fig13_feathered_flight_reconstruction_without_fossils.pdf" pdf(file=pdffn, width=7, height=7) plotAnc(tree=vert_phylo, data=ancestral_states, i=1, site.pattern=TRUE, col=state_colors, pos=NULL) axisPhylo() titletxt = paste0("Chunk 13, Fig. 13: Mapping of ancestral states under Fitch parsimony\n(excluding fossils).") title(titletxt) mtext(side=1, text="Ma (millions of years ago)", line=3) # Plot the legend legend(x="topleft", legend=state_names, fill=state_colors, cex=0.5) parsimony_txt = paste0("parsimony score\n(# of steps) = ", parsimony_score) text(x=-8, y=28, labels=parsimony_txt, pos=4) dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #13 ####################################################### # Chunk 13, Q52: Approximately when did feathered flight evolve, according # to this reconstruction? ####################################################### # Map the appearance of feathers ####################################################### # Now, we will map the appearance of feathers # on the phylogeny. To do this, we have # to load the "feathers" character, by # loading the NEXUS text below. # # I have put "0" for all taxa, indicating # "no feathers". # # Chunk 14, Q53: For the fossil taxa that are coded as having # feathers, how do we know this? (use google) # ####################################################### # START CODE CHUNK #14 ####################################################### tetrapod_feathers_nexus_format = '#NEXUS BEGIN DATA; DIMENSIONS NTAX=19 NCHAR=1; FORMAT DATATYPE = STANDARD GAP = - MISSING = ? SYMBOLS = "0 1"; MATRIX shark 0 tuna 0 lungfish 0 frog 0 kiwi 1 seagull 1 Confuciusornis 1 Archaeopteryx 1 Velociraptor 1 Brontosaurus 0 crocodile 0 turtle 0 wall_lizard 0 snake 0 anole_lizard 0 Tuatara 0 platypus 0 opossum 0 human 0 ; END; ' # Write to a temporary file datafn = "chunk14_tetrapod_feathers.nexus" write(x=tetrapod_feathers_nexus_format, file=datafn) # Load the feathers character into a variable # named "feathers" feathers_ndf = read.nexus.data(file=datafn) state_names = c("no feathers", "feathers") data_levels = c(0, 1) feathers = phyDat(feathers_ndf, type = "USER", levels=data_levels) feathers # Let's run a parsimony analyses on this character parsimony_score = parsimony(tree=vert_plusDinos_phylo, data=feathers, method="fitch", site="pscore") parsimony_score ####################################################### # END CODE CHUNK #14 ####################################################### # Chunk 14, Q54: Does the parsimony score match your guess based on # intuition, above? NOTE: Look in the R window at the result when # you paste in "parsimony_score" (the end of code chunk 14). This # number is the number of changes reconstructed on the tree. # # Let's do an MPR (Most Parsimonious Reconstruction) of this character # on the tree, then plot the map of the character on the tree. ####################################################### # START CODE CHUNK #15 ####################################################### # MPR analysis ancestral_states = ancestral.pars(tree=vert_plusDinos_phylo, data=feathers, type="MPR", cost=NULL, return="prob") # Plot to a PDF pdffn = "chunk15_Fig15_feathers_reconstruction_under_Fitch_parsimony.pdf" pdf(file=pdffn, width=7, height=7) plotAnc(tree=vert_plusDinos_phylo, data=ancestral_states, i=1, site.pattern=TRUE, col=state_colors, pos=NULL) axisPhylo() titletxt = paste0("Chunk 15, Fig. 15: Mapping of feathers under Fitch parsimony\n(equal costs).") title(titletxt) mtext(side=1, text="Ma (millions of years ago)", line=3) # Plot the legend legend(x="topleft", legend=state_names, fill=state_colors, cex=0.5) parsimony_txt = paste0("parsimony score\n(# of steps) = ", parsimony_score) text(x=-8, y=28, labels=parsimony_txt, pos=4) dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #15 ####################################################### # Chunk 15, Q55: Approximately when did feathers evolve? (Give an # approximate minimum and maximum date.) # # Chunk 15, Q56: Did feathers evolve for the function of flight? # How do you know? If they didn't, what might # have the original function of feathers been? # ####################################################### # Another character: # Viviparity (live birth) # ...versus... # Oviparity (lays eggs externally) ####################################################### # The analysis we did with feathers and flight # can be done with any well-defined character. # # One interesting character is "viviparity": the # birthing of live young. Animals that lay # eggs external to the body exhibit "oviparity". # # The popular imagination tends to think "only # mammals have live birth". # # This is actually not true: various fishes, # amphibians, and lizards & snakes have # evolved viviparity (although, it is # true that oviparity is more common, and # represents the ancestral condition). # # But, is it true that live birth is a # defining characteristic of mammals? # ####################################################### # START CODE CHUNK #16 ####################################################### tetrapod_viviparity_nexus_format = '#NEXUS BEGIN DATA; DIMENSIONS NTAX=19 NCHAR=1; FORMAT DATATYPE = STANDARD GAP = - MISSING = ? SYMBOLS = "0 1"; MATRIX shark 0 tuna 0 lungfish 0 frog 0 kiwi 0 seagull 0 Confuciusornis 0 Archaeopteryx 0 Velociraptor 0 Brontosaurus 0 crocodile 0 turtle 0 wall_lizard 0 snake 0 anole_lizard 0 Tuatara 0 platypus 0 opossum 1 human 1 ; END; ' # Write to a temporary file datafn = "chunk16_tetrapod_viviparity.nexus" write(x=tetrapod_viviparity_nexus_format, file=datafn) # Load the feathers character into a variable # named "viviparity" viviparity_ndf = read.nexus.data(file=datafn) state_names = c("oviparity", "viviparity") data_levels = c(0, 1) viviparity = phyDat(viviparity_ndf, type = "USER", levels=data_levels) viviparity # Let's run a parsimony analyses on this character parsimony_score = parsimony(tree=vert_plusDinos_phylo, data=viviparity, method="fitch", site="pscore") parsimony_score ####################################################### # END CODE CHUNK #16 ####################################################### # Let's do an MPR (Most Parsimonious Reconstruction) of this character # on the tree, then plot the map of the character on the tree. ####################################################### # START CODE CHUNK #17 ####################################################### # MPR analysis ancestral_states = ancestral.pars(tree=vert_plusDinos_phylo, data=viviparity, type="MPR", cost=NULL, return="prob") # Plot to a PDF pdffn = "chunk17_Fig17_viviparity_reconstruction_under_Fitch_parsimony.pdf" pdf(file=pdffn, width=7, height=7) plotAnc(tree=vert_plusDinos_phylo, data=ancestral_states, i=1, site.pattern=TRUE, col=state_colors, pos=NULL) axisPhylo() titletxt = paste0("Chunk 17, Fig. 17: Mapping of viviparity under Fitch parsimony\n(equal costs).") title(titletxt) mtext(side=1, text="Ma (millions of years ago)", line=3) # Plot the legend legend(x="topleft", legend=state_names, fill=state_colors, cex=0.5) parsimony_txt = paste0("parsimony score\n(# of steps) = ", parsimony_score) text(x=-8, y=28, labels=parsimony_txt, pos=4) dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #17 ####################################################### # Chunk 17, Q57: Approximately when did viviparity evolve in mammals? # (Give an approximate minimum and maximum date.) # # Chunk 17, Q58: Did viviparity evolve at the same time as crown mammals? # How do you know? # # Chunk 17, Q59: Five living species of mammals lay eggs. Look them up, # and list them here. For each, give the common name, # followed by the scientific name (the Latin binomial) # WITH CORRECT SPELLING AND FORMAT FOR THE SCIENTIFIC NAMES. # # (a simple guide to scientific names is here: # http://entomology.ifas.ufl.edu/frank/kiss/kiss6.htm ) # # ...you should follow the formatting rules for scientific # names in all future academic work (assignments, exam # essays, etc.) # ####################################################### # PART VI. (BONUS MATERIAL) THE PHYLOGENY AND # BIOGEOGRAPHY OF KIWIS, MOAS, AND RELATIVES ####################################################### # # We know that New Zealand and other continental # blocks used to be connected in the ancient # supercontinent of Gondwanaland. The major # continental blocks split at approximately # these times: # # Gondwanaland breakup times (all approximate) # Sources: Gibbs 2016, Yonezawa et al. 2017 # # 130 mya - Africa splits from South America # (beginning of Atlantic Ocean) # 130 mya - India + Madagascar break off # from western Australia # 80 mya - Zealandia splits off from eastern Australia; # South America-Antarctica-Australia # remain connected # ~35 mya - Australia separates from Antarctica # ~30 mya - Drake passage opens, separating Antarctica & South America # # # Vicariance # # Speciation caused by geographic separation # is known as "allopatric" speciation, and # allopatric speciation caused by the # formation of a barrier due to external # factors (e.g., continental breakup) # is known as "vicariance". # # For many years, the geographic distribution # of the flightless ratites has been claimed # as a likely example of vicariance caused # by continental break-up. # # Paleognaths and Neognaths # # Living birds are divided into two clades: paleognaths # and neognaths. The paleognaths include many famous # flightless birds, including the kiwi, the extinct # moas, the Australian emu and cassowary, the # South American rhea, the ostrich, and the # extinct elephant birds of Madagascar. The # tinamous of South America, which can fly, are # also in the paleognath clade. # # The neognath clade contains all other living birds. # # For many years, the flightless paleognaths were # thought to form a clade, named the ratites, with # the flying tinamous as an outgroup. # # For example, a 1997 source (Dingus & Rowe 1997) gives # this cladogram of the ratites, based on an # analysis of morphological characters: ####################################################### # START CODE CHUNK #18 ####################################################### pdffn = "chunk18_Fig18_ratite_clado_1997.pdf" pdf(file=pdffn, width=6, height=6) ratite_newick_str = "(outgroup,(Tinamidae,(((Rhea,Struthio),(Dromaeus,Casuarius)),(Dinornis,Apteryx))));" ratite_phylo = read.tree(file="", text=ratite_newick_str) plot(ratite_phylo, use.edge.length=FALSE) title("Chunk 18, Figure 18: Cladogram of ratite birds\n(after Dingus & Rowe 1997)") dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #18 ####################################################### # Chunk 18, Q60: On a paper copy of Figure 18, look up the common names of # each tip, and the continent or island of origin, and write them # next to the scientific names. # # Chunk 18, Q61: In the 1997 cladogram, what is the closest relative # of the kiwi clade? # # Chunk 18, Q62: Based on this cladogram, how many times did # flightlessness evolve? Mark the change(s) # on a paper version of Figure 18. # # Map the flightlessness character on the cladogram # to check your work: ####################################################### # START CODE CHUNK #19 ####################################################### # Load the flightlessness data ratite_flight_nexus_format = '#NEXUS BEGIN DATA; DIMENSIONS NTAX=8 NCHAR=1; FORMAT DATATYPE = STANDARD GAP = - MISSING = ? SYMBOLS = "0 1"; MATRIX outgroup 0 Tinamidae 0 Dinornis 1 Apteryx 1 Casuarius 1 Dromaeus 1 Struthio 1 Rhea 1 ; END; ' # Write to a temporary file datafn = "chunk19_ratite_flight.nexus" write(x=ratite_flight_nexus_format, file=datafn) # Load the flight data ratite_flight_ndf = read.nexus.data(file=datafn) state_names = c("no flight", "flight") data_levels = c(0, 1) ratite_flight = phyDat(ratite_flight_ndf, type = "USER", levels=data_levels) ratite_flight # Let's run a parsimony analyses on this character parsimony_score = parsimony(tree=ratite_phylo, data=ratite_flight, method="fitch", site="pscore") parsimony_score ####################################################### # END CODE CHUNK #19 ####################################################### ####################################################### # START CODE CHUNK #20 ####################################################### # MPR (Most Parsimonious Reconstruction) of this character ancestral_states = ancestral.pars(tree=ratite_phylo, data=ratite_flight, type="MPR", cost=NULL, return="prob") names(ancestral_states) summary(ancestral_states) # Plot to a PDF pdffn = "chunk20_Fig20_ratite_flight_reconstruction_under_Fitch_parsimony.pdf" pdf(file=pdffn, width=7, height=7) plotAnc(tree=ratite_phylo, data=ancestral_states, i=1, site.pattern=TRUE, col=state_colors, pos=NULL) titletxt = paste0("Chunk 20, Fig. 20: Mapping of ratite flight under Fitch parsimony\n(equal costs).") title(titletxt) # Plot the legend legend(x="topleft", legend=state_names, fill=state_colors, cex=0.5) parsimony_txt = paste0("parsimony score\n(# of steps) = ", parsimony_score) text(x=-8, y=28, labels=parsimony_txt, pos=4) dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #20 ####################################################### # # A 2017 paper by Yonezawa et al. conducted a # comprehensive re-analysis of the phylogeny # of paleognaths, combining morphological # data with DNA from living species and # ancient DNA from extinct subfossils of # moas and elephant birds. This phylogeny # (following on several other studies) # suggested a different history of flight # in the paleognath clade, which in turn # impacts our ideas about its biogeographical # history. # # ####################################################### # Plot Yonezawa et al. 2017 paleognath phylogeny ####################################################### ####################################################### # START CODE CHUNK #21 ####################################################### pdffn = "chunk21_Fig21_paleognath_phylo_Yonezawa_etal_2017.pdf" pdf(file=pdffn, width=9, height=9) paleognaths_newick_str = "(outgroup:5,((Pseudocrypturus:52.16046818,(Paracathartes_howardae:38.45691818,Lithornis:31.21926818):10.4496):6.0879,((Struthio_camelus:83.16381818,Palaeotis:33.80436818):0.51765,(((((Aepynornis:37.21791818,Mullerornis:37.21791818):28.43295,((Apteryx_australis:6.150518182,(Apteryx_australis_mantelli:4.411718182,Apteryx_rowi:4.411718182):1.7388):7.0245,(Apteryx_owenii:4.503068182,Apteryx_haastii:4.503068182):8.67195):52.47585):4.62945,((Emuarius:4.697318182,Dromaius_novaehollandiae:29.26311818):4.46775,(Casuarius_casuarius:6.644018182,Casuarius_bennetti:6.644018182):27.08685):36.54945):2.90115,((Antarctica_fossil:10.22871818,((Eudromia_elegans:29.69886818,Nothoprocta_perdicaria:29.69886818):1.7976,(Crypturellus:23.76426818,Tinamus_major:23.76426818):7.7322):8.5239):16.1595,((Dinornis_giganteus:10.52796818,Megalapteryx_didinus:10.52796818):2.4024,(Anomalopteryx_didiformis:7.508168182,(Emeus_crassus:7.170068182,Pachyornis_australis:7.170068182):0.3381):5.4222):43.2495):17.0016):1.69785,(Diogenornis:9.764618182,(Rhea_americana:10.24236818,Pterocnemia_pennata:10.24236818):58.36635):6.2706):8.80215):26.2857):5);" paleognaths_phylo = read.tree(file="", text=paleognaths_newick_str) plot(paleognaths_phylo) axisPhylo() title("Chunk 21, Fig. 21: Phylogeny of paleognath birds\n(from Yonezawa et al. 2017)") mtext(side=1, text="Ma (millions of years ago)", line=3) dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #21 ####################################################### # Chunk 21, Q63: On a paper copy of Figure 21, write the continental/island # locations, and common names (or short descriptions) of the taxa # listed at the tips of the phylogeny. # # Chunk 21, Q64: In the 2017 phylogeny, what is the closest relative # of the kiwi clade? # # Chunk 21, Q65: Compare the 1997 and 2017 phylogenies. Flightless ratites are a # clade in the 1997 tree. Are they a clade in the 2017 tree? # If not, are ratites paraphyletic or polyphyeletic? ####################################################### # START CODE CHUNK #22 ####################################################### paleognath_flight_nexus_format = '#NEXUS BEGIN DATA; DIMENSIONS NTAX=30 NCHAR=1; FORMAT DATATYPE = STANDARD GAP = - MISSING = ? SYMBOLS = " 0 1"; MATRIX outgroup 1 Pseudocrypturus 1 Paracathartes_howardae 1 Lithornis 1 Struthio_camelus 0 Palaeotis 0 Aepynornis 0 Mullerornis 0 Apteryx_australis 0 Apteryx_australis_mantelli 0 Apteryx_rowi 0 Apteryx_owenii 0 Apteryx_haastii 0 Emuarius 0 Dromaius_novaehollandiae 0 Casuarius_casuarius 0 Casuarius_bennetti 0 Antarctica_fossil 0 Eudromia_elegans 1 Nothoprocta_perdicaria 1 Crypturellus 1 Tinamus_major 1 Dinornis_giganteus 0 Megalapteryx_didinus 0 Anomalopteryx_didiformis 0 Emeus_crassus 0 Pachyornis_australis 0 Diogenornis 0 Rhea_americana 0 Pterocnemia_pennata 0 ; END; ' # Write to a temporary file datafn = "chunk22_paleognath_flight.nexus" write(x=paleognath_flight_nexus_format, file=datafn) # Load the flightlessness vs. flight data into a variable # named "paleognath_flight" paleognath_flight_ndf = read.nexus.data(file=datafn) state_names = c("no flight", "flight") data_levels = c(0, 1) paleognath_flight = phyDat(paleognath_flight_ndf, type = "USER", levels=data_levels) paleognath_flight # Let's plot the character of flightlessness vs. flight at the tips # of the tree pdffn = "chunk22_Fig22_paleognath_phylo_with_flight_plotted.pdf" pdf(file=pdffn, width=9, height=9) state_colors = c("brown", "blue") nexus_to_tree_tiporder = match(x=names(unlist(paleognath_flight_ndf)), table=paleognaths_phylo$tip.label) colors_to_plot = state_colors[1+as.numeric(unlist(paleognath_flight_ndf))[nexus_to_tree_tiporder]] plot(paleognaths_phylo, label.offset=3) axisPhylo() mtext(side=1, text="Ma (millions of years ago)", line=3) tiplabels(text=NULL, tip=1:length(paleognaths_phylo$tip.label), col=colors_to_plot, bg=colors_to_plot, pch=21, cex=2) legend(x="topleft", legend=state_names, fill=state_colors, cex=0.75) title("Chunk 22, Fig. 22: Distribution of the character of 'flight' in paleognaths") dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #22 ####################################################### # Chunk 22, Q66: Intuitively, what does the Yonezawa et al. phylogeny # suggest about the evolution of flightlessness in # the paleognaths? # See if your intuition agrees with the parsimony reconstruction: # Parsimony score on the flight/flightlessness character parsimony_score = parsimony(tree=paleognaths_phylo, data=paleognath_flight, method="fitch", site="pscore") parsimony_score # Chunk 22, Q67: Does the parsimony score match your guess based on # intuition, above? # Let's do an MPR (Most Parsimonious Reconstruction) of this character # on the tree, then plot the map of the character on the tree. ####################################################### # START CODE CHUNK #23 ####################################################### # MPR analysis ancestral_states = ancestral.pars(tree=paleognaths_phylo, data=paleognath_flight, type="MPR", cost=NULL, return="prob") names(ancestral_states) summary(ancestral_states) # Plot to a PDF pdffn = "chunk23_Fig23_paleognath_flight_reconstruction_under_Fitch_parsimony.pdf" pdf(file=pdffn, width=11, height=11) plotAnc(tree=paleognaths_phylo, data=ancestral_states, i=1, site.pattern=TRUE, col=state_colors, pos=NULL) axisPhylo() titletxt = paste0("Chunk 23, Fig. 23: Mapping of paleognath flight under Fitch parsimony\n(equal costs)") title(titletxt) mtext(side=1, text="Ma (millions of years ago)", line=3) # Plot the legend legend(x="topleft", legend=state_names, fill=state_colors, cex=0.75) parsimony_txt = paste0("parsimony score\n(# of steps) = ", parsimony_score) text(x=-8, y=28.5, labels=parsimony_txt, pos=4, cex=0.75) dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #23 ####################################################### # # Chunk 23, Q68: Under this reconstruction, how many times was # flight lost? # # Chunk 23, Q69: Under this reconstruction, how many times was # flight gained? # # Chunk 23, Q70: Does this parsimony reconstruction of the evolution of # flight/flightlessness agree with your intuition? # # Chunk 23, Q71: Do you think it makes sense to think that it is # equally easy for a lineage to lose flight as to # gain flight? # # # Let's run another parsimony analysis, but one where # regaining flight "costs" 10 times as much as losing # flight. (Here the parsimony "cost" is the number of # steps counted per change.) ######################################################## # What if loss-of-flight is irreversible? ######################################################## ####################################################### # START CODE CHUNK #24 ####################################################### cost_of_losing_flight = 1 cost_of_regaining_flight = 10 # Set up a 2x2 cost matrix cost_matrix = matrix(data=c(0,cost_of_losing_flight,cost_of_regaining_flight,0), ncol=2, byrow=TRUE) colnames(cost_matrix) = c("flight","flightless") rownames(cost_matrix) = c("flight","flightless") # Look at the cost matrix cost_matrix # Now let's calculate a parsimony score, using "Sankoff" parsimony # (Before this, we have been using "Fitch" parsimony, which assumes # equal costs for all changes. Under Sankoff parsimony, the user # assigns a cost matrix.) parsimony_score = parsimony(tree=paleognaths_phylo, data=paleognath_flight, method="sankoff", site="pscore", cost=cost_matrix) parsimony_score ####################################################### # END CODE CHUNK #24 ####################################################### # Chunk 24, Q72: What is the parsimony score under this cost matrix? ####################################################### # START CODE CHUNK #25 ####################################################### # Reconstruct the ancestral states under this cost matrix ancestral_states = ancestral.pars(tree=paleognaths_phylo, data=paleognath_flight, type="MPR", cost=cost_matrix, return="prob") pdffn = "chunk25_Fig25_paleognath_ancestral_flight_under_Sankoff_parsimony.pdf" pdf(file=pdffn, width=11, height=11) plotAnc(tree=paleognaths_phylo, data=ancestral_states, i=1, site.pattern=TRUE, col=state_colors, pos=NULL) titletxt = paste0("Chunk 25, Fig. 25: Mapping of ancestral states under\nSankoff parsimony (unequal costs). Cost of regaining flight=", cost_of_regaining_flight) title(titletxt) axisPhylo() mtext(text="Millions of years ago (mega-annum, Ma)", side=1, line=2) # Plot the legend legend(x="topleft", legend=state_names, fill=state_colors) parsimony_txt = paste0("parsimony score\n(# of steps) = ", parsimony_score) text(x=-8, y=28, labels=parsimony_txt, pos=4) dev.off() cmdstr = paste0("open ", pdffn) system(cmdstr) ####################################################### # END CODE CHUNK #25 ####################################################### # Chunk 25, Q73: Under Sankoff parsimony with this cost matrix, how many times # was flight gained? # # Chunk 25, Q74: How many times was flight lost? # # Chunk 25, Q75: What does this reconstruction suggest about the vicariance # explanation for the distribution of the ratites? (You # don't have to have a "final" answer, these questions are # still being actively researched.) # ################################################################### # REFERENCES # # Chiari, Ylenia; Cahais, Vincent; Galtier, Nicolas; # Delsuc, Frederic (2012). "Phylogenomic analyses support the # position of turtles as the sister group of birds and # crocodiles (Archosauria)." BMC Biology, 10:65. # https://doi.org/10.1186/1741-7007-10-65 # # Dingus, Lowell; Rowe, Timothy (1998). The Mistaken # Extinction & CD-Rom: Dinosaur Evolution and the Origin # of Birds (Academic Version), 1st Edition. W. H. Freeman, # pp. 1-332 + CD-ROM. (ratite cladogram on CD-ROM) # # Gibbs, George C. (2016). Ghosts of Gondwana: The History of Life # in New Zealand. Fully revised edition. Potton & Burton, pp. 1-367. # # Ihaka, Ross; Gentleman, Robert (1996). "R: A Language # for Data Analysis and Graphics." Journal of # Computational and Graphical Statistics. Volume 5, # Issue 3, pages 299-314. # https://amstat.tandfonline.com/doi/abs/10.1080/10618600.1996.10474713 # # Yonezawa, Takahiro, et al. (2017). Phylogenomics and Morphology # of Extinct Paleognaths Reveal the Origin and Evolution of the # Ratites. Current Biology, 27, 68-77. # https://www.cell.com/current-biology/fulltext/S0960-9822(16)31214-3 ###################################################################