Application of Graph Theory in Phylogenetics: The Primate Approach

Image Courtesy: Shutterstock

The theory of evolution seemed puzzling, was counterintuitive and contradicted deeply held prejudices, yet it caused one of the major paradigm shifts in science. This seemingly simple theory, whose only tool of operation is a mechanism called ‘natural selection’, has challenged the ideas about the origin of life that different civilizations nurtured for thousands of years. The theory of evolution is no longer the domain of ‘biology’ alone: different interdisciplinary sciences have joined hands to create a faithful representation of the saga of life on this planet. This kind of cooperation is both unique and astonishing. Mathematics has also been a part of this endeavour, and in this paper I explore how the concepts of ‘graph theory’ can be exploited in studying ‘phylogeny’, which is at the heart of evolutionary analysis.




We have inherited a unique planet that hosts an incredible biochemical feature called life. Estimates of the number of species of living organisms on earth today range from about 5 million to 100 million, each with a different set of features. Yet they exhibit morphological, biochemical and behavioural similarities that point towards some form of genealogical relationship. The study of such evolutionary relationships among varied groups of living organisms is known as phylogeny. The relationship among a wide range of species is represented pictorially in the form of a graph called a ‘phylogenetic tree’: a branched diagram showing the evolutionary interrelations among a group of species that have evolved from a common ancestral form. Closely related species are placed close together, and by studying the degree of separation we can gauge how closely these species resemble one another. In this paper, we study various properties of the phylogenetic tree through the lens of graph theory using a concept called the ‘rooted tree’.


There are numerous biological, chemical and philosophical definitions of life based on its three prominent characteristics: living beings grow, exchange energy and reproduce their own kind. But the most ingenious definition of life from the point of view of physics was given by Erwin Schrödinger in 1944. He described life as an organised structure that appears to defy ‘the second law of thermodynamics’, the holy grail of classical physics. The celebrated second law says that the entropy of an isolated system never decreases.[i] In simple words, entropy is a measurable quantity that roughly describes the disorder of the atoms in an object. Thus the first living object on earth, if such a thing really existed, had a chemical combination that could act against this disorder. It could keep its atoms in the desired spatial and physical state so that they stayed stable, which according to Richard Dawkins is the first rule of life.[ii] And this organism must have had a way to pass on this characteristic to the next generation, and so on. In each passage of genetic material from one generation to the next, there are subtle variations called mutations, which bring minute changes in the offspring. Nature selects the ‘favourable changes’. ‘Natural selection not only enhances the reproductive success of favourable variants but also diminishes the reproductive success of unfavorable ones.’[iii] After a passage of time (which is generally very long), the accumulated changes in the favourable variants give rise to a new species.

The concept of natural selection, which holds the key to the existence of organisms on earth, seems too simple to be true. But this idea of Charles Darwin is well supported by evidence and has gained acceptance since its first publication in 1859 in the book titled ‘On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life’.

The most striking point of Darwin’s theory of evolution (Darwinism) is that a new species can originate only from a species that already exists. If that is so, then all living organisms are interlinked, and there must be a way to obtain the complete picture of all life forms through these connections.

This is done by using a concept called a ‘tree’.

The ‘tree’ is an idea that originated in Kirchhoff’s work on electrical networks. Applying this idea to phylogeny, we create the ‘phylogenetic tree’, which follows the rules of a ‘tree’ as defined in graph theory. Hence we first look at how a graph theorist defines a ‘tree’.


A tree is a connected undirected graph with no simple circuits. A rooted tree is a tree in which one vertex has been designated as the root and every edge is directed away from it. A vertex u is called the parent of the vertex v if there exists a directed edge from u to v. When u is the parent of v, we say that v is a child of u. Vertices with the same parent are called siblings. The ancestors of a vertex v are the vertices on the path from the root to v, excluding v itself. The descendants of a vertex d are those vertices that have d as an ancestor. A vertex of a tree is called a leaf if it has no children.[iv]

Now we list the three most important properties of a tree that will be exploited in this study:

  1. A tree is a connected graph. Hence there always exists a path between any two vertices. What is special about a tree is that this path is unique.

  2. A tree is a simple graph. It does not contain multiple edges.

  3. A tree does not contain any cycle or self-loop.
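The three properties above can be checked mechanically. The following is a minimal sketch in Python, assuming the graph is given as a list of vertex labels and a list of undirected edges; the function name `is_tree` and this input format are illustrative choices, not part of the original article.

```python
from collections import deque


def is_tree(vertices, edges):
    """Return True if the undirected graph (vertices, edges) is a tree.

    Checks: no self-loops, no multiple edges, connected, no cycle.
    For a connected simple graph, having exactly len(vertices) - 1
    edges is equivalent to having no cycle.
    """
    if not vertices:
        return False
    seen_edges = set()
    adjacency = {v: [] for v in vertices}
    for u, v in edges:
        if u == v:                      # self-loop
            return False
        key = frozenset((u, v))
        if key in seen_edges:           # multiple edge
            return False
        seen_edges.add(key)
        adjacency[u].append(v)
        adjacency[v].append(u)
    if len(edges) != len(vertices) - 1:
        return False
    # Breadth-first search from an arbitrary vertex tests connectivity.
    start = vertices[0]
    visited = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return len(visited) == len(vertices)


print(is_tree(["A", "B", "C"], [("A", "B"), ("A", "C")]))              # True
print(is_tree(["A", "B", "C"], [("A", "B"), ("A", "C"), ("B", "C")]))  # False
```

The second call fails because the extra edge between B and C closes a cycle, so there would be two different paths from B to C.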


In FIGURE 1, the vertex A is the parent of both B and C. Thus B and C are the children of A. B and C are siblings as they have a common parent, but E and F are not siblings as their parents are different. The vertices D, B and A are the ancestors of H and I. The descendants of B are D, E, H and I. In this figure E, F, G, H and I (drawn as circles) are leaves, as they have no children. A is the root of the tree, as all the other vertices are its descendants.
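These relations can be computed from a table that records the parent of each vertex. The sketch below is one possible encoding in Python. The figure itself is not reproduced here, so placing E directly under B, and F and G directly under C, is an assumption that is consistent with the description but not confirmed by it. The helper names are illustrative.

```python
# Parent of each vertex in the example rooted tree; the root A has no parent.
# Assumption: E is a child of B, and F and G are children of C.
PARENT = {
    "B": "A", "C": "A",
    "D": "B", "E": "B",
    "F": "C", "G": "C",
    "H": "D", "I": "D",
}
VERTICES = set(PARENT) | set(PARENT.values())


def children(v):
    """Vertices whose parent is v."""
    return sorted(c for c, p in PARENT.items() if p == v)


def ancestors(v):
    """Vertices on the path from v up to the root, excluding v itself."""
    result = []
    while v in PARENT:
        v = PARENT[v]
        result.append(v)
    return result


def descendants(v):
    """Vertices that have v among their ancestors."""
    return sorted(d for d in VERTICES if v in ancestors(d))


def are_siblings(u, v):
    """True if u and v are distinct vertices with the same parent."""
    return u != v and u in PARENT and PARENT[u] == PARENT.get(v)


def leaves():
    """Vertices with no children."""
    return sorted(v for v in VERTICES if not children(v))


print(ancestors("H"))          # ['D', 'B', 'A']
print(descendants("B"))        # ['D', 'E', 'H', 'I']
print(are_siblings("B", "C"))  # True
print(are_siblings("E", "F"))  # False
print(leaves())                # ['E', 'F', 'G', 'H', 'I']
```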


When the vertices of a rooted tree represent species and the edges represent their connections with other species, we call that tree a phylogenetic tree or evolutionary tree. Thus a phylogenetic tree is a branching diagram that shows the evolutionary relationships between related species that have a common ancestor. For example, if each lettered vertex in FIGURE 1 represents a distinct species, then that diagram is an example of a phylogenetic tree of those species.

This tree captures the entire upheaval that compels species to evolve, and it can represent a history spanning millions of years with an edge no more than five centimetres long! For example, if the paths to two different species branch from a single vertex, then we conclude that they had the same ancestor. Similarly, a long path between two species generally means that they resemble each other less.


In FIGURE 2, we present the phylogenetic tree of the different species of gorilla. Here all the vertices represent species (such as the Eastern lowland gorilla, the Mountain gorilla and so on). We see that all these variants of gorilla originated from a single species, which is the ancestor of all the other gorilla species. Likewise, we see that the Eastern gorilla is the parent of the Eastern lowland gorilla, and thus the latter is a child of the former. Also, as the Eastern lowland gorilla and the Mountain gorilla have a common parent, they are siblings. We notice that the Eastern lowland gorilla, Mountain gorilla, Western lowland gorilla and Cross River gorilla are ‘leaves’, as no species has evolved from them.
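Since the figure is not reproduced here, the gorilla tree can be written out as a small data structure and printed as an indented outline. This is a sketch in Python under two assumptions: the root is labelled "Ancestral gorilla", and the parent of the two western leaves is labelled "Western gorilla". Neither label appears in the text above; the remaining names do.

```python
# Children of each vertex. "Ancestral gorilla" and "Western gorilla" are
# labels assumed for this sketch; the other names come from the text.
GORILLA_TREE = {
    "Ancestral gorilla": ["Eastern gorilla", "Western gorilla"],
    "Eastern gorilla": ["Eastern lowland gorilla", "Mountain gorilla"],
    "Western gorilla": ["Western lowland gorilla", "Cross River gorilla"],
}


def outline(tree, vertex, depth=0):
    """Return the subtree rooted at `vertex` as indented lines of text."""
    lines = ["  " * depth + vertex]
    for child in tree.get(vertex, []):
        lines.extend(outline(tree, child, depth + 1))
    return lines


def leaves_below(tree, vertex):
    """Return the leaves of the subtree rooted at `vertex`, left to right."""
    kids = tree.get(vertex, [])
    if not kids:
        return [vertex]
    found = []
    for child in kids:
        found.extend(leaves_below(tree, child))
    return found


print("\n".join(outline(GORILLA_TREE, "Ancestral gorilla")))
print(leaves_below(GORILLA_TREE, "Ancestral gorilla"))
```

The outline lists each child one indentation level below its parent, so siblings appear at the same depth under the same vertex.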



Now we come to the ‘meat’ of the paper. We will study the phylogenetic tree from the perspective of graph theory and try to arrive at conclusions based simply on mathematical deduction. As a case study, we have selected the phylogenetic tree of the primates. There are two main reasons for this:

  1. The complete phylogenetic tree of all the species on earth is still a distant dream as thousands of new species are being discovered every year.

  2. As we also belong to the primate order, it will be a huge privilege to study the branch of the tree of life that hosts us.


The phylogenetic tree of the primates shown in FIGURE 3 will be our focal point of reference for the rest of this paper.


It is exciting to learn that all living forms on earth originated from some earliest primitive form of life-like complex biochemical compounds that formed in the sea some 3.5 billion years ago. Thus at some point humans, dogs, crows, lizards, dinosaurs, fishes, worms and cockroaches shared a common ancestor. Suppose we want to find the common ancestor of modern humans (that is, us) and orangutans, which are among the great apes. From the phylogenetic tree of the primates we find the ancestors of both species.










[Chart: the ancestors of Homo sapiens and of the orangutan, traced back to their common ancestor]










From the above chart, we see that at some point in time both the human and the orangutan had a common ancestor called Hominoidea. Thus in the distant past there lived a mammal that was neither human nor orangutan, but its descendants branched in two different directions to give rise to two quite distinct species.

Now let us find the relative evolutionary distance of the human and the orangutan from their common ancestor. We say that two species are closely related if they are found on the same branch of the phylogenetic tree. We see that there are three transitional species between the human and Hominoidea, whereas there is only one transitional species between the orangutan and Hominoidea. Measured by the number of branching steps, the extinct ancestor Hominoidea would therefore have resembled an orangutan more than a human being. Thus, using the simple concept of ‘ancestor’ in a phylogenetic tree, we can arrive at some startling conclusions.
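The search for a common ancestor and the count of transitional species can both be expressed as walks up the tree. Here is a minimal sketch in Python on a fragment of the tree. Only "Hominoidea", "Homo sapiens" and "Orangutan" are names from the text; T1 to T4 are hypothetical placeholders for the transitional vertices, whose real labels depend on the figure.

```python
# Parent of each vertex in a fragment of the primate tree.
# T1..T4 are placeholder names (assumptions) for the transitional vertices:
# three between Hominoidea and Homo sapiens, one before the orangutan.
PARENT = {
    "T1": "Hominoidea", "T2": "T1", "T3": "T2", "Homo sapiens": "T3",
    "T4": "Hominoidea", "Orangutan": "T4",
}


def path_to_root(v):
    """Return the vertices from v up to the root, starting with v."""
    path = [v]
    while v in PARENT:
        v = PARENT[v]
        path.append(v)
    return path


def common_ancestor(u, v):
    """Return the most recent vertex lying on both paths to the root."""
    on_u_path = set(path_to_root(u))
    for vertex in path_to_root(v):
        if vertex in on_u_path:
            return vertex
    return None


def transitional_count(descendant, ancestor):
    """Count the vertices strictly between descendant and ancestor."""
    path = path_to_root(descendant)
    if ancestor not in path:
        raise ValueError(f"{ancestor} is not an ancestor of {descendant}")
    return path.index(ancestor) - 1


ancestor = common_ancestor("Homo sapiens", "Orangutan")
print(ancestor)                                      # Hominoidea
print(transitional_count("Homo sapiens", ancestor))  # 3
print(transitional_count("Orangutan", ancestor))     # 1
```

Counting vertices is a deliberately crude measure of distance: it treats every edge as equal and ignores how much time or genetic change each edge represents.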


When we look at our surroundings, we are astonished to see the variety of life forms that have flourished on earth. What sets us apart is the intelligence we have inherited, relying on which we challenge nature at every step to claim superiority. It is hard to believe that such intelligence originated from a unicellular organism through the process of natural selection alone. An obvious question arises: ‘Is the passage of life from the unicellular root to its prime achievement, through countless transitional species, unique?’

Fortunately, graph theory has a very clear and confident answer to this mysterious question, and the answer is YES. This follows from the fact that the phylogenetic tree, being a tree by definition, contains exactly one path between any two vertices. For example, in the phylogenetic tree of the primates, the passage of life from the earliest primates to the modern human is unique. To illustrate this point, we observe the path of transition from the earliest primates that walked on this planet to the modern human, a journey of some 65 million years:










[Chart: the lineage from the Earliest Primates to Homo sapiens]

If we study the phylogenetic tree, we clearly see that this is the only way of coming down from the Earliest Primates to the Modern Human. Thus humans evolved on earth through a unique lineage, and graph theory confirms it.
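The uniqueness of the lineage can be verified by enumerating every path that repeats no vertex and counting the results. The sketch below does this in Python on a small illustrative tree. Apart from "Earliest Primates", "Hominoidea", "Homo sapiens" and "Orangutan", the vertex names (P1, P2, T1 to T4) are placeholders assumed for the example, not taken from the figure.

```python
# Undirected edges of a small illustrative tree. P1, P2 and T1..T4 are
# placeholder vertex names assumed for this sketch.
EDGES = [
    ("Earliest Primates", "P1"), ("P1", "Hominoidea"), ("P1", "P2"),
    ("Hominoidea", "T1"), ("T1", "T2"), ("T2", "T3"), ("T3", "Homo sapiens"),
    ("Hominoidea", "T4"), ("T4", "Orangutan"),
]


def build_adjacency(edges):
    """Map each vertex to the list of its neighbours."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    return adjacency


def simple_paths(adjacency, start, goal, visited=None):
    """Yield every path from start to goal that repeats no vertex."""
    visited = (visited or []) + [start]
    if start == goal:
        yield visited
        return
    for neighbour in adjacency[start]:
        if neighbour not in visited:
            yield from simple_paths(adjacency, neighbour, goal, visited)


tree = build_adjacency(EDGES)
paths = list(simple_paths(tree, "Earliest Primates", "Homo sapiens"))
print(len(paths))               # 1
print(" -> ".join(paths[0]))

# Adding one extra edge creates a cycle, and the path is no longer unique.
not_a_tree = build_adjacency(EDGES + [("P2", "T2")])
print(len(list(simple_paths(not_a_tree, "Earliest Primates", "Homo sapiens"))))  # 2
```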


The phylogenetic tree offers an excellent argument against those academicians who believe in pseudoscience such as ‘intelligent design’. As we move up the ladder of the phylogenetic tree towards the root, we observe less and less complex life forms. As the phylogenetic tree is rooted, there must be a simplest form of life from which all other life forms originated. This contradicts the intelligent design hypothesis that a designer’s hand was at work in the origin of life.


We must note that no species in the phylogenetic tree has existed forever. With the passage of time, each one has given way to newer and more advanced species. Thus all the species we list in the phylogenetic tree are transitional species: they are heading somewhere. Now where is the human heading? Being a ‘leaf’ in the evolutionary tree, our species currently does not have any child. This means that there is no species on earth that has descended from us, carrying along our genetic material. Now what is the probability that our species goes through ‘negative evolution’ and arrives at some ancestral point? In simple words, can a human go through some weird metamorphosis and become an ape again? Graph theory says that this is not possible: every edge of the rooted tree is directed away from the root, and the tree does not have any cycle. Hence no directed path exists from a descendant vertex back to an ancestor vertex. Thus there is no evolutionary track through which a human can go back to become an ape again.
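The one-way nature of the rooted tree can be illustrated with a reachability check that follows edges only in their direction, from parent to child. This is a minimal sketch in Python on a fragment of the tree; T1 to T4 are placeholder names assumed for the transitional vertices.

```python
# Directed edges (parent -> children) of a fragment of the primate tree.
# T1..T4 are placeholder names assumed for this sketch.
CHILDREN = {
    "Hominoidea": ["T1", "T4"],
    "T1": ["T2"],
    "T2": ["T3"],
    "T3": ["Homo sapiens"],
    "T4": ["Orangutan"],
}


def reachable(source, target):
    """Return True if a directed path leads from source to target.

    No visited set is needed: the structure is a tree, so following
    parent -> child edges can never revisit a vertex.
    """
    stack = [source]
    while stack:
        vertex = stack.pop()
        if vertex == target:
            return True
        stack.extend(CHILDREN.get(vertex, []))
    return False


print(reachable("Hominoidea", "Homo sapiens"))  # True: ancestor to descendant
print(reachable("Homo sapiens", "Hominoidea"))  # False: no way back
print(reachable("Homo sapiens", "Orangutan"))   # False: different branch
```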

Though it is true that we cannot go backward in an evolutionary tree, there is always a way forward. We are evolving into a newer species every moment, as evolution is a continuous process. In the distant future, humans will in all likelihood have to make way for some more advanced species descended from them, if the earth stays habitable until then. As Carl Sagan has put it, ‘We are the product of 4.5 billion years of fortuitous, slow biological evolution. There is no reason to think evolution has stopped. Humans are transitional animals; not the climax of evolution.’


The phylogenetic tree is a quick reference guide to the process of natural selection that has been happening on earth for the past 3.5 billion years. The use of graph theory in phylogeny not only makes the study simpler but also gives some astounding answers to some very important questions. This amalgamation of two different disciplines in studying the most vital feature of earth is a concrete example of the shared world we live in. Only through such comprehensive study can we explore newer horizons in understanding the mechanism of the ‘Greatest Show on Earth[2]: LIFE’.



[1] ‘The Ancestor’s Tale: A Pilgrimage to the Dawn of Life’ is the name of a book by Richard Dawkins.

[2] ‘The Greatest Show on Earth: The Evidence for Evolution’ is the name of a book by Richard Dawkins.

[i] E. Schrödinger, What Is Life? The Physical Aspect of the Living Cell

[ii] R. Dawkins, The Selfish Gene

[iii] Monroe W. Strickberger, Evolution, 2nd Edition

[iv] K. H. Rosen, Discrete Mathematics and Its Applications, Tata McGraw-Hill, 2012

4. and other internet sites.


This article was written by Dhiraj Kumar Deka and was originally published in the Asia Pacific Mathematics Newsletter. It is published here under special permission from World Scientific.