<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Archetype &#187; Cladistics</title>
	<atom:link href="http://roberto.kellerperez.com/category/cladistics/feed/" rel="self" type="application/rss+xml" />
	<link>http://roberto.kellerperez.com</link>
	<description>Ant reconstruction one homology at a time</description>
	<lastBuildDate>Thu, 03 Jun 2010 11:46:46 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Merriam-Webster on cladistics</title>
		<link>http://roberto.kellerperez.com/2010/06/merriam-webster-on-cladistics/</link>
		<comments>http://roberto.kellerperez.com/2010/06/merriam-webster-on-cladistics/#comments</comments>
		<pubDate>Thu, 03 Jun 2010 11:42:56 +0000</pubDate>
		<dc:creator>Roberto Keller</dc:creator>
				<category><![CDATA[Cladistics]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[Merriam-Webster]]></category>

		<guid isPermaLink="false">http://roberto.kellerperez.com/?p=2128</guid>
		<description><![CDATA[Google indexed this page today from the online version of the Merriam-Webster dictionary: Main Entry: cla·dis·tics Pronunciation: \klə-ˈdis-tiks, kla-\ Function: noun plural but singular in construction Date: 1965 : a system of biological taxonomy that defines taxa uniquely by shared characteristics not found in ancestral groups and uses inferred evolutionary relationships to arrange taxa in a [...]]]></description>
			<content:encoded><![CDATA[<p>Google indexed <a href="http://www.merriam-webster.com/dictionary/cladistics">this page</a> today from the online version of the Merriam-Webster dictionary:</p>
<blockquote><p>Main Entry: <strong>cla·dis·tics</strong><br />
Pronunciation: \klə-ˈdis-tiks, kla-\<br />
Function: <em>noun plural but singular in construction</em><br />
Date: 1965</p>
<p>: a system of biological taxonomy that defines taxa uniquely by shared characteristics not found in ancestral groups and uses inferred evolutionary relationships to arrange taxa in a branching hierarchy such that all members of a given taxon have the same ancestors</p>
<p>— <strong>cla·dist</strong> \ˈkla-dist, ˈklā-\ <em>noun</em><br />
— <strong>cla·dis·tic</strong> \klə-ˈdis-tik, kla-\ <em>adjective</em><br />
— <strong>cla·dis·ti·cal·ly</strong> \-ti-k(ə-)lē\ <em>adverb </em></p></blockquote>
<p>I don&#8217;t know when this entry was actually added to the dictionary, but it is nicely defined.</p>
<p><img class="aligncenter size-full wp-image-2132" title="hdr_mw_logo_area_160px" src="http://roberto.kellerperez.com/wp-content/uploads/2010/06/hdr_mw_logo_area_160px.jpg" alt="" width="160" height="198" /></p>
]]></content:encoded>
			<wfw:commentRss>http://roberto.kellerperez.com/2010/06/merriam-webster-on-cladistics/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>phyloseminar.org &#8211; February 24th, 1pm (PST)</title>
		<link>http://roberto.kellerperez.com/2010/02/phyloseminar-org-february-24th-1pm-pst/</link>
		<comments>http://roberto.kellerperez.com/2010/02/phyloseminar-org-february-24th-1pm-pst/#comments</comments>
		<pubDate>Tue, 23 Feb 2010 22:44:18 +0000</pubDate>
		<dc:creator>Roberto Keller</dc:creator>
				<category><![CDATA[Cladistics]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[Theory]]></category>
		<category><![CDATA[Noah Rosenberg]]></category>
		<category><![CDATA[phyloseminar]]></category>

		<guid isPermaLink="false">http://roberto.kellerperez.com/?p=2055</guid>
		<description><![CDATA[Do not forget to tune in to tomorrow&#8217;s phyloseminar where Noah Rosenberg will be speaking about consistency properties of species tree inference algorithms under the multispecies coalescent. February 24th at 1pm PST. You can watch him live from the comfort of your computer, but you may want to take some minutes before the seminar to [...]]]></description>
			<content:encoded><![CDATA[<div class="wp-caption aligncenter" style="width: 496px"><a href="http://phyloseminar.org/index.html"><img title="Rosenberg" src="http://phyloseminar.org/rosenberg.png" alt="" width="486" height="430" /></a><p class="wp-caption-text">.</p></div>
<p>Do not forget to tune in to tomorrow&#8217;s <a href="http://phyloseminar.org/index.html">phyloseminar</a> where Noah Rosenberg will be speaking about <em>consistency properties of species tree inference algorithms under the multispecies coalescent</em>. <strong>February 24th at 1pm PST.</strong></p>
<p>You can watch him live from the comfort of your computer, but you may want to take some minutes before the seminar to <a href="http://phyloseminar.org/connect.html">set up your computer</a> and microwave some <a href="http://en.wikipedia.org/wiki/Popcorn_bag">popcorn</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://roberto.kellerperez.com/2010/02/phyloseminar-org-february-24th-1pm-pst/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Pewter Leprechaun Awards Ceremony</title>
		<link>http://roberto.kellerperez.com/2009/12/pewter-leprechaun-awards/</link>
		<comments>http://roberto.kellerperez.com/2009/12/pewter-leprechaun-awards/#comments</comments>
		<pubDate>Tue, 15 Dec 2009 14:16:52 +0000</pubDate>
		<dc:creator>Roberto Keller</dc:creator>
				<category><![CDATA[Cladistics]]></category>
		<category><![CDATA[History of Science]]></category>
		<category><![CDATA[Humor]]></category>

		<guid isPermaLink="false">http://roberto.kellerperez.com/?p=1916</guid>
		<description><![CDATA[You may not fancy public humiliation of scientific papers (ah come on, who doesn&#8217;t?), but the Pewter Leprechaun Awards Ceremony is a fun read. If you want to know what this is all about, look here. You really need to know your history of systematics and biogeography well to fully enjoy the piece, but if [...]]]></description>
			<content:encoded><![CDATA[<div id="attachment_1917" class="wp-caption alignright" style="width: 210px"><img class="size-medium wp-image-1917" title="Haeckel" src="http://roberto.kellerperez.com/wp-content/uploads/2009/12/Haeckel-264x300.jpg" alt="Haeckel" width="200" height="226" /><p class="wp-caption-text">E. Haeckel</p></div>
<p>You may not fancy public humiliation of scientific papers (ah come on, who doesn&#8217;t?), but the <a href="http://urhomology.blogspot.com/2009/12/paraphyly-watch-2009-pewter-leprechaun.html">Pewter Leprechaun Awards Ceremony</a> is a fun read. If you want to know what this is all about, look <a href="http://urhomology.blogspot.com/2009/01/paraphyly-watch-2009.html">here</a>.</p>
<p>You really need to know your history of systematics and biogeography well to fully enjoy the piece, but if you don&#8217;t, you would do well to put Google to good use and run some searches on those names. On a side note, I do think Brazeau&#8217;s paper didn&#8217;t deserve the nomination, especially among the other contestants.</p>
<p>I hope they <em>do</em> send a pewter leprechaun to the winner (and blog about it).</p>
]]></content:encoded>
			<wfw:commentRss>http://roberto.kellerperez.com/2009/12/pewter-leprechaun-awards/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Phylogenetics through videoconferencing</title>
		<link>http://roberto.kellerperez.com/2009/12/phylogenetics-through-videoconferencing/</link>
		<comments>http://roberto.kellerperez.com/2009/12/phylogenetics-through-videoconferencing/#comments</comments>
		<pubDate>Tue, 08 Dec 2009 17:11:43 +0000</pubDate>
		<dc:creator>Roberto Keller</dc:creator>
				<category><![CDATA[Cladistics]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[Theory]]></category>
		<category><![CDATA[Direct optimization]]></category>
		<category><![CDATA[Dynamic homology]]></category>
		<category><![CDATA[Frederick Matsen]]></category>
		<category><![CDATA[phyloseminar]]></category>
		<category><![CDATA[POY]]></category>
		<category><![CDATA[Ward Wheeler]]></category>

		<guid isPermaLink="false">http://roberto.kellerperez.com/?p=1882</guid>
		<description><![CDATA[Last night I attended a talk in Lisbon given by Ward Wheeler at the AMNH in New York City and moderated by Frederick Matsen from his home institution in Berkeley, California. The talk was the second in a series of talks in phylogenetics held via videoconferencing. The idea behind phyloseminar.org is to hold regular live [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-1897" title="phyloseminar1" src="http://roberto.kellerperez.com/wp-content/uploads/2009/12/phyloseminar1.jpg" alt="phyloseminar1" width="267" height="63" />Last night I attended a talk in Lisbon given by <a href="http://research.amnh.org/scicomp/ward_wheeler.html">Ward Wheeler</a> at the AMNH in New York City and moderated by Frederick Matsen from his home institution in Berkeley, California. The talk was the second in a series of talks in phylogenetics held via videoconferencing.</p>
<p>The idea behind <a href="http://phyloseminar.org/">phyloseminar.org</a> is to hold regular <em>live</em> online seminars in phylogenetic methodology open to anyone around the globe. This is a challenge given the time zone differences of the possible participants, but it does make the whole event fun: I watched it after dinner at 9:00pm; the presenter gave it at his 4:00pm; while the moderator was there after lunch at his 1:00pm. I saw at least one person in the audience who watched it from the future, after breakfast in New Zealand the next day at 10:00am.<span id="more-1882"></span></p>
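<p>For the curious, the clock arithmetic above can be checked with a few lines of Python. This is only a sketch under stated assumptions: the fixed UTC offsets (standard time in Berkeley, New York and Lisbon, daylight time in New Zealand) and the date of 7 December 2009 are guesses based on when this post went up, not details taken from the seminar announcement.</p>

```python
from datetime import datetime, timedelta, timezone

# Assumed UTC offsets in hours for early December 2009 (not from the post).
OFFSETS = {
    "Berkeley (PST)": -8,
    "New York (EST)": -5,
    "Lisbon (WET)": 0,
    "New Zealand (NZDT)": 13,
}

# Assumed start: 1:00pm for the moderator in Berkeley on 7 December 2009.
START = datetime(2009, 12, 7, 13, 0, tzinfo=timezone(timedelta(hours=-8)))


def local_times(start, offsets):
    """Convert one instant into the local clock time of each listed place."""
    return {
        place: start.astimezone(timezone(timedelta(hours=hours)))
        for place, hours in offsets.items()
    }


for place, moment in local_times(START, OFFSETS).items():
    print(f"{place:20s} {moment:%a %d %b %H:%M}")
```

<p>With these offsets the same instant falls after lunch in Berkeley, mid-afternoon in New York, after dinner in Lisbon and on the following morning in New Zealand.</p>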
<p>We used the software <a href="http://evo.caltech.edu/evoGate/">EVO</a>, a free tool specifically designed for scientific communication (unlike the Internets that was designed for&#8230; nevermind). Prior to a seminar, you need to install it and create a user account so you can then join the phyloseminar channel. It works really well. You see a window with the slideshow and a window with a video stream for each participant (to keep things simpler, only the presenter and the moderator had video enabled last night).</p>
<p>For Wheeler&#8217;s talk we were twelve people, and looking at their user accounts (where you can set your location), there were people listening in California, Kansas, New York, Lisbon and New Zealand at least. The talk was 45 minutes long and went on for another 15 minutes of discussion. We could type questions using the chat tool of the software, which were then read by the moderator (again, rather than each person talking to keep things simpler).</p>
<div id="attachment_1899" class="wp-caption aligncenter" style="width: 510px"><img class="size-full wp-image-1899" title="phyloseminar2" src="http://roberto.kellerperez.com/wp-content/uploads/2009/12/phyloseminar2.jpg" alt="Looking right into Wheeler's desktop." width="500" height="530" /><p class="wp-caption-text">Looking right into Wheeler&#39;s desktop.</p></div>
<p>Wheeler&#8217;s talk, <em>Dynamic homology and phylogenetic systematics</em>, was about alignment, or rather methods to avoid having to perform an alignment for phylogenetic inference altogether, something he has been championing for many years now. The idea behind these methods, called <em>direct optimization</em> methods, is easy to understand: when you are comparing DNA sequences in order to reconstruct how species (or genes) are related to each other, you need to match them together to determine which positions along a sequence correspond to which positions in another one, a process called <a href="http://en.wikipedia.org/wiki/Sequence_alignment">sequence alignment</a>. Only then can you assess whether different species have the same or a different base composition in each position&#8211; the raw evidence for evolutionary relatedness. But it happens that, because those sequences are the result of a process of mostly branching evolution (where one species splits to give rise to two descendant ones), the proper format for comparison between multiple sequences is not a matrix of rows and columns but a phylogenetic tree. The problem is that we don&#8217;t know the shape of this tree because that is what we seek to reconstruct in the first place.</p>
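<p>To make the matching step concrete, here is a minimal sketch of a pairwise global alignment using the textbook Needleman-Wunsch dynamic programming scheme. It is not the method discussed in the talk, and the scoring values (match +1, mismatch -1, gap -1) and the two toy sequences are assumptions chosen only for illustration.</p>

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return (aligned_a, aligned_b, score)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First row and column: a prefix aligned against nothing but gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def substitution(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell keeps the best of three possible moves.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + substitution(i, j),  # pair two bases
                score[i - 1][j] + gap,                     # gap in b
                score[i][j - 1] + gap,                     # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + substitution(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[len(a)][len(b)]


row_a, row_b, total = needleman_wunsch("GATTACA", "GATACA")
print(row_a)
print(row_b)
print("score:", total)
```

<p>The two returned strings have the same length, so each column says which position in one sequence corresponds to which position in the other. With more than two sequences the question of which pairs to match first is where the tree comes in.</p>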
<p>The most common way to address this problem is to perform alignments using tree shapes that we know are good approximations. Once we find a satisfactory match between our sequences, we proceed with the phylogenetic reconstruction proper, searching for the tree(s) that maximize our optimality criterion (e.g., parsimony, likelihood). But one caveat of the procedure I just caricatured is that by running the analysis in two steps (alignment and tree search), you impose a restriction on the number of possible combinations you will evaluate. Direct optimization lifts this restriction by performing the sequence matching and tree evaluation in just one step, with the potential result that you may find more optimal solutions. In other words, direct optimization methods are able to perform a more thorough exploration of the space of possible solutions.</p>
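<p>As a rough illustration of the second step of the two-step procedure, the sketch below scores a fixed alignment on each candidate tree with Fitch&#8217;s parsimony counting and keeps the shortest tree. The four taxa, their three-base sequences and the nested-tuple tree format are all made up for the example; direct optimization itself, which also revises the matching while it evaluates each tree, is considerably more involved and is not shown here.</p>

```python
def fitch_site(node, column):
    """Return (possible ancestral states, changes needed) for one alignment column."""
    if isinstance(node, str):  # a leaf: its observed base, no changes yet
        return {column[node]}, 0
    left, right = node
    left_states, left_cost = fitch_site(left, column)
    right_states, right_cost = fitch_site(right, column)
    shared = left_states.intersection(right_states)
    if shared:  # the two children can agree, so no extra change is needed
        return shared, left_cost + right_cost
    # The children disagree: keep both options and count one change.
    return left_states.union(right_states), left_cost + right_cost + 1


def parsimony_length(tree, alignment):
    """Sum the minimum number of changes over all columns of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total


# Hypothetical aligned data: one row per taxon, one column per position.
alignment = {"A": "AAG", "B": "AAG", "C": "ACT", "D": "ACT"}

# The three possible groupings of four taxa, written as nested pairs.
candidate_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]

best = min(candidate_trees, key=lambda tree: parsimony_length(tree, alignment))
print(best, parsimony_length(best, alignment))
```

<p>Note that under parsimony the optimal tree is the one with the fewest changes, so the search minimizes the length. Because the alignment is frozen before the search starts, any tree that would only look good under a different matching of positions is never considered, which is the restriction described above.</p>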
<p>Now, while the method is easy enough to describe in a post, its mathematical and computational implementation is not simple at all. The number of operations needed to evaluate just a single tree shape increases exponentially in comparison with vanilla tree searches, and you will be better off performing these calculations on a computer cluster.</p>
<p>The aspect that caused the most unease during the talk was Wheeler&#8217;s explanation of the difference between truth and optimality inherent in all these methods (direct optimization or not). Apparently, when you simulate sequence data in order to run it through different programs and evaluate how well each alignment method does, they all invariably find solutions that are more optimal (more parsimonious, more likely or more probable) than the simulated one. That is, most of the time the optimal solution is different from the true one. The consequence of this is that, since in phylogenetic reconstruction we will never know for certain the true evolutionary history, we are forced to abandon the search for the true solution and will have to content ourselves with finding the optimal one.</p>
<p>If this talk was representative of the series, I would say the seminars are not for a general audience: you need a very good grasp of phylogenetic theory; alas, if you know Ward Wheeler you know that his brain runs as fast as his supercomputers. A good thing is that the seminars are being recorded and can be revisited anytime. You can watch the first one by <a href="http://phyloseminar.org/recorded.html">Marc A. Suchard here</a> and the one by Ward Wheeler there soon. The next seminar will address the problem of reconciling gene trees with species trees, and the next three seminars will be decided by <a href="http://phyloseminar.org/vote.html">popular vote</a>.</p>
<p>The whole experience was a first for me, and it was real fun.</p>
]]></content:encoded>
			<wfw:commentRss>http://roberto.kellerperez.com/2009/12/phylogenetics-through-videoconferencing/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Patenting cladistics</title>
		<link>http://roberto.kellerperez.com/2009/08/patenting-cladistics/</link>
		<comments>http://roberto.kellerperez.com/2009/08/patenting-cladistics/#comments</comments>
		<pubDate>Thu, 13 Aug 2009 23:50:34 +0000</pubDate>
		<dc:creator>Roberto Keller</dc:creator>
				<category><![CDATA[Cladistics]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[silly]]></category>

		<guid isPermaLink="false">http://roberto.kellerperez.com/?p=1410</guid>
		<description><![CDATA[Every once in a while someone comes and tries to file a patent on some of the very basic algorithms we all use to infer phylogenetic trees. This time it is a very &#8220;special&#8221; someone. The image to the left may give you a clue. Read more at Myrmecos blog.]]></description>
			<content:encoded><![CDATA[<p><a href="http://myrmecos.wordpress.com/2009/08/13/will-microsoft-own-phylogenetics/"><img class="alignleft" title="clippy" src="http://myrmecos.files.wordpress.com/2009/08/clippy1.jpg?w=238&amp;h=172" alt="" width="238" height="172" /></a>Every once in a while someone comes and tries to file a patent on some of the very basic algorithms we all use to infer phylogenetic trees.<br />
This time it is a very &#8220;special&#8221; someone. The image to the left may give you a clue. Read more at <a href="http://myrmecos.wordpress.com/2009/08/13/will-microsoft-own-phylogenetics/">Myrmecos blog</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://roberto.kellerperez.com/2009/08/patenting-cladistics/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Chris Humphries, botanist and founding fellow of the Willi Hennig Society, died on July 31st, aged 62</title>
		<link>http://roberto.kellerperez.com/2009/08/chris-humphries-botanist-and-founding-fellow-of-the-willi-hennig-society-died-on-july-31st/</link>
		<comments>http://roberto.kellerperez.com/2009/08/chris-humphries-botanist-and-founding-fellow-of-the-willi-hennig-society-died-on-july-31st/#comments</comments>
		<pubDate>Wed, 05 Aug 2009 14:28:17 +0000</pubDate>
		<dc:creator>Roberto Keller</dc:creator>
				<category><![CDATA[Cladistics]]></category>
		<category><![CDATA[Personalities]]></category>

		<guid isPermaLink="false">http://roberto.kellerperez.com/?p=1376</guid>
		<description><![CDATA[It is our sad task to record the death of Professor Chris Humphries, merit researcher in the Botany Department until his retirement in 2007, on Friday 31st July. Chris was a leading figure in the cladistic revolution in systematics and biogeography. Without his tireless efforts, systematic botany &#8211; perhaps systematic biology &#8211; would be a [...]]]></description>
			<content:encoded><![CDATA[<blockquote>
<div id="attachment_1378" class="wp-caption alignleft" style="width: 245px"><img class="size-full wp-image-1378  " title="Chris-Humphries" src="http://roberto.kellerperez.com/wp-content/uploads/2009/08/Chris-Humphries.png" alt="© The Systematic Association" width="235" height="280" /><p class="wp-caption-text">Photograph courtesy of Malte C. Ebach (http://urhomology.blogspot.com/)</p></div>
<p>It is our sad task to record the death of Professor Chris Humphries, merit researcher in the Botany Department until his retirement in 2007, on Friday 31st July. Chris was a leading figure in the cladistic revolution in systematics and biogeography. Without his tireless efforts, systematic botany &#8211; perhaps systematic biology &#8211; would be a very different beast.</p>
<p>Chris joined the Botany Department in 1972 as an assistant curator, a nearly-finished PhD student, coming directly from Vernon Heywood&#8217;s Botany Department in Reading University. With the exception of three sabbaticals &#8211; two of them at the University of Melbourne (1979-80, 1986) and a six month stay as a fellow at the Wissenschaftskolleg zu Berlin (Institute for Advanced Study, Berlin) in 1994 &#8211; Chris spent his entire career in the Museum.<br />
<span id="more-1376"></span><br />
Chris&#8217;s early botanical research was on Asteraceae (daisies) and Macaronesia but during the 1970s and 1980s most of his intellectual effort went into developing, exploring and promoting cladistic systematics and cladistic biogeography. These efforts yielded two much acclaimed books: Cladistic Biogeography (1986) (with Lynne Parenti, of the Smithsonian; a revised 2nd edition appeared in 1999) for biogeography, and Cladistics: A practical course in systematics (1992) (with staff of the Natural History Museum; a revised 2nd edition appeared in 1998 as Cladistics: the theory and practice of parsimony analysis). Both books became standard works in their field.</p>
<p>Chris&#8217;s interest in art made him the perfect choice for organising and annotating the first complete full-colour edition of Banks&#8217; Florilegium, published between 1980 and 1990. The project marked the beginning of Chris&#8217;s love affair with Australia and her flora, the enigmatic southern beeches and the problems of explaining organism distribution in the Southern Hemisphere. The Florilegium consists of over 700 botanical line engravings made from Sydney Parkinson&#8217;s watercolours, recording the plants collected by Joseph Banks and Daniel Carl Solander on Captain James Cook&#8217;s first voyage around the world (1768-1771).</p>
<p>After 1990, Chris (with Dick Vane-Wright and Paul Williams, both of the Entomology Department) put biogeographical matters to more practical use, addressing what they called the &#8220;Agony of Choice&#8221; &#8211; the conservationists&#8217; dilemma &#8211; with their &#8216;WorldMap&#8217; approach to conservation biology, combining taxonomic, ecological and biogeographic information into one system. After a decade of collaboration with many different and diverse groups of researchers working on many different organisms, Chris returned to more fundamental matters in biogeographical investigation and to the distribution of plants on Macaronesia, the islands he began with as a student.</p>
<p>During his career, Chris received many honours; the Linnean Society&#8217;s Bicentenary Medal in 1980 and their Gold Medal in 2001; he was also an Honorary Fellow of the American Association for the Advancement of Science. He was President of the Systematics Association (2001-2003) as well as its Treasurer (1996-9), and President of the Willi Hennig Society (1989-1991), being elected a Fellow honoris causa in 1998. Chris was also Vice-President and Botanical Secretary of the Linnean Society (1994-1998).</p>
<p>In 2008, a three-day Meeting was held in his honour at the Linnean Society; a Festschrift will be published in early 2010.</p>
<p><strong>David M. Williams &amp; Charlie Jarvis</strong><br />
<em>Botany Department<br />
The Natural History Museum<br />
Cromwell Road<br />
London SW7 5BD<br />
UK</em></p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://roberto.kellerperez.com/2009/08/chris-humphries-botanist-and-founding-fellow-of-the-willi-hennig-society-died-on-july-31st/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Bigger is better: the largest phylogenetic tree reconstructed.</title>
		<link>http://roberto.kellerperez.com/2009/05/bigger-is-better-the-largest-phylogenetic-tree-reconstructed/</link>
		<comments>http://roberto.kellerperez.com/2009/05/bigger-is-better-the-largest-phylogenetic-tree-reconstructed/#comments</comments>
		<pubDate>Sun, 03 May 2009 11:45:18 +0000</pubDate>
		<dc:creator>Roberto Keller</dc:creator>
				<category><![CDATA[Cladistics]]></category>
		<category><![CDATA[Personalities]]></category>
		<category><![CDATA[Phylogeny]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[GenBank]]></category>
		<category><![CDATA[James Farris]]></category>
		<category><![CDATA[Pablo Goloboff]]></category>
		<category><![CDATA[TNT]]></category>
		<category><![CDATA[Willi Hennig Society]]></category>

		<guid isPermaLink="false">http://roberto.kellerperez.com/?p=853</guid>
		<description><![CDATA[GenBank, the standard database for genetic information maintained by the National Center for Biotechnology Information, has been accumulating DNA sequences for some three decades now. Since its creation in the early 1980s, it has become the de facto repository for genetic information&#8211; genetic data must now be submitted to GenBank for a paper to be accepted [...]]]></description>
			<content:encoded><![CDATA[<p><span style="float: left; padding: 5px;"><a href="http://www.researchblogging.org"><img style="border:0;" src="http://www.researchblogging.org/public/citation_icons/rb2_large_gray.png" alt="ResearchBlogging.org" /></a></span><a href="http://www.ncbi.nlm.nih.gov/Genbank/index.html">GenBank</a>, the standard database for genetic information maintained by the <a href="http://www.ncbi.nlm.nih.gov/">National Center for Biotechnology Information</a>, has been accumulating DNA sequences for some three decades now. Since its creation in the early 1980s, it has become the <em>de facto</em> repository for genetic information&#8211; genetic data must now be submitted to GenBank for a paper to be accepted for publication. Most of the sequence data accumulated are the sum of many &#8220;local&#8221; taxonomic studies that have targeted a particular group of organisms for a relatively small but well-known collection of genes. Its contents now span hundreds of genes across all of life&#8217;s domains. So, what would happen if you were to take all the sequence information contained in GenBank and analyze it phylogenetically all together in a single, one-step study? Well, that is what Pablo A. Goloboff and coworkers just did, the results of which were published in last week&#8217;s online early edition of <a href="http://www3.interscience.wiley.com/journal/118512781/home">Cladistics</a>, the international journal of the <a href="http://www.cladistics.org/">Willi Hennig Society</a>.</p>
<p><span id="more-853"></span></p>
<p>The phylogenetic analysis comprises an astonishing 73,060 terminal eukaryotic taxa and 9535 molecular characters and, for good measure, the authors threw in 604 morphological characters. It is therefore the largest phylogenetic analysis published to date and almost six times larger than the former world record. Such a feat presented many technical challenges. The logistics required automating every step in the analysis, via computer scripts: retrieving and sorting thousands of GenBank entries, aligning the sequences to construct the data matrix, performing the actual searches for the optimal solutions, and interpreting the mammoth-size phylogenetic trees. The crux of the analysis, the search for the optimal phylogenetic trees, was done with the powerful parsimony phylogenetic program <a href="http://www.zmuc.dk/public/phylogeny/TNT/">TNT</a> running in parallel on three multi-processor computers for 2.5 months.</p>
<div id="attachment_879" class="wp-caption aligncenter" style="width: 407px"><img class="size-full wp-image-879" title="nf1" src="http://roberto.kellerperez.com/wp-content/uploads/2009/05/nf1.gif" alt="nf1" width="397" height="666" /><p class="wp-caption-text">Fig. 1. Pruned strict consensus tree for the combined data set (seven trees, 1879 taxa excluded). The bar shows the span of 5000 species.</p></div>
<p>The resulting phylogeny recovers most traditional taxonomic groups. This is interesting for various reasons. First, as noted above, our understanding of the tree of life is the result of many taxonomically localized efforts that have been informally pasted together<sup class='footnote'><a href='#fn-853-1' id='fnref-853-1'>1</a></sup>. This is the first time a phylogeny has been reconstructed from scratch, letting the data speak for itself, unconstrained, without assuming that certain evolutionary relationships must be true <em>a priori</em>. Second, it shows that there is enough historical information contained in the data so that the optimal solution is not a complete mess or a largely unresolved answer&#8211; consider that there are 9 &#215; 10<sup>345,593</sup> possible tree combinations for the number of terminals included. Third, it shows that we do have the current capacity, both in terms of software and hardware, to carry out such a large analysis. And last, but related to the previous two points, it shows that parsimony methods for phylogenetic reconstruction are up to the task. The latter point is worth noting because early simulations, based on just a few taxa (a grand total of four, actually), scared systematists into thinking that parsimony methods may result in erroneous reconstructions. Later studies using real data and a much larger collection of species have shown that this is not the case, and this 73,060-taxon analysis serves as the largest of these test cases.</p>
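<p>For a sense of where a number of that size comes from, here is a small sketch that counts tree shapes. It assumes the figure refers to unrooted, fully bifurcating trees, whose number for n terminals is the standard double factorial (2n-5)!!; the paper may have counted differently, so treat the match as approximate. The function names are my own.</p>

```python
import math


def count_unrooted_trees(n_taxa):
    """Exact number of unrooted bifurcating trees, (2n-5)!!, for n_taxa of 3 or more."""
    count = 1
    for odd in range(3, 2 * n_taxa - 4, 2):
        count *= odd
    return count


def log10_unrooted_trees(n_taxa):
    """log10 of (2n-5)!!, via (2n-4)! / (2**(n-2) * (n-2)!), for large n_taxa."""
    natural_log = (
        math.lgamma(2 * n_taxa - 3)          # ln (2n-4)!
        - (n_taxa - 2) * math.log(2)         # ln 2**(n-2)
        - math.lgamma(n_taxa - 1)            # ln (n-2)!
    )
    return natural_log / math.log(10)


for n in (4, 5, 10):
    print(n, "terminals:", count_unrooted_trees(n), "trees")

value = log10_unrooted_trees(73060)
exponent = math.floor(value)
mantissa = 10 ** (value - exponent)
print(f"73060 terminals: about {mantissa:.1f} x 10^{exponent} trees")
```

<p>Four terminals give only three possible trees, ten already give over two million, and the count for 73,060 terminals has an exponent in the hundreds of thousands, which is why exhaustive search is out of the question and heuristic searches are needed.</p>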
<p>The authors are no strangers when it comes to computer implementation of phylogenetic methods. <a href="http://www.nrm.se/en/menu/researchandcollections/departments/molecularsystematics/staff/jamesstevenfarris.1179_en.html">James S. Farris</a> is a pioneer in the field who developed the algorithmic foundations and produced some of the first phylogenetic programs in the late 1960s, when the character information for each taxon to be analyzed was contained in a <a href="http://en.wikipedia.org/wiki/Punch_card">punch card</a> and random addition sequence for the phylogenetic tree construction meant that the set of cards was shuffled by hand before feeding them into the terminal connected to the mainframe. Likewise, Pablo A. Goloboff has been responsible for many of the rapid search techniques developed from the 1990s up to the present, which seek to cover the searchable tree-space in a fast and efficient way.</p>
<p>It seems that, for phylogenetics, the only limit that remains is the availability of data.</p>
<p><strong>References and notes</strong></p>
<p><span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.jtitle=Cladistics&amp;rft_id=info%3Adoi%2F10.1111%2Fj.1096-0031.2009.00255.x&amp;rfr_id=info%3Asid%2Fresearchblogging.org&amp;rft.atitle=Phylogenetic+analysis+of+73+060+taxa+corroborates+major+eukaryotic+groups&amp;rft.issn=07483007&amp;rft.date=2009&amp;rft.volume=&amp;rft.issue=&amp;rft.spage=0&amp;rft.epage=0&amp;rft.artnum=http%3A%2F%2Fblackwell-synergy.com%2Fdoi%2Fabs%2F10.1111%2Fj.1096-0031.2009.00255.x&amp;rft.au=Goloboff%2C+P.&amp;rft.au=Catalano%2C+S.&amp;rft.au=Marcos+Mirande%2C+J.&amp;rft.au=Szumik%2C+C.&amp;rft.au=Salvador+Arias%2C+J.&amp;rft.au=K%C3%A4llersj%C3%B6%2C+M.&amp;rft.au=Farris%2C+J.&amp;rfe_dat=bpr3.included=1;bpr3.tags=Biology%2CComputer+Science%2CTaxonomy%2C+Phylogeny%2C+Cladistics">Goloboff, P., Catalano, S., Marcos Mirande, J., Szumik, C., Salvador Arias, J., Källersjö, M., &amp; Farris, J. (2009). Phylogenetic analysis of 73 060 taxa corroborates major eukaryotic groups <span style="font-style: italic;">Cladistics</span> DOI: <a rev="review" href="http://dx.doi.org/10.1111/j.1096-0031.2009.00255.x">10.1111/j.1096-0031.2009.00255.x</a></span>
</p>
<div class='footnotes'>
<div class='footnotedivider'></div>
<ol>
<li id='fn-853-1'>Only more recently have &#8220;supertree&#8221; methods been developed, which seek to construct a large phylogeny based on the consensus of multiple small, partially overlapping trees following a more precise set of rules. <span class='footnotereverse'><a href='#fnref-853-1'>&#8617;</a></span></li>
</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://roberto.kellerperez.com/2009/05/bigger-is-better-the-largest-phylogenetic-tree-reconstructed/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Cladistics wars 2.0</title>
		<link>http://roberto.kellerperez.com/2009/04/cladistics-wars-20/</link>
		<comments>http://roberto.kellerperez.com/2009/04/cladistics-wars-20/#comments</comments>
		<pubDate>Wed, 22 Apr 2009 18:36:44 +0000</pubDate>
		<dc:creator>Roberto Keller</dc:creator>
				<category><![CDATA[Cladistics]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[Metablogging]]></category>
		<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://roberto.kellerperez.com/?p=830</guid>
		<description><![CDATA[There is a skirmish going on at Dechronization blog right now. This is a coauthored blog about phylogenetics. I used to like this blog (it was right there on my blogroll &#8212;-&#62;). There are surprisingly few blogs about phylogenetic methods these days, despite the wide use that phylogenies currently have in evolutionary biology [...]]]></description>
			<content:encoded><![CDATA[<p>There is <a href="http://treethinkers.blogspot.com/2009/04/cladistics-workshop-announced.html">a skirmish going on at Dechronization blog</a> right now<sup class='footnote'><a href='#fn-830-1' id='fnref-830-1'>1</a></sup>. This is a coauthored blog about phylogenetics. I <span style="text-decoration: line-through;">like</span> used to like this blog (<span style="text-decoration: line-through;">its</span> was right there on my blogroll &#8212;-&gt;<sup class='footnote'><a href='#fn-830-2' id='fnref-830-2'>2</a></sup>). There are surprisingly few blogs about phylogenetic methods these days, despite the wide use that phylogenies currently have in evolutionary biology and beyond (e.g., linguistics). I will complain that, for nine authors, they post little, sometimes not a single post in a month.</p>
<p><span id="more-830"></span></p>
<p>The hot post in question mocks an announcement of a (to be honest, very successful) workshop in phylogenetic methods cosponsored by the Willi Hennig Society and <a href="http://www.cladistics.org/workshops.html">so far held on several continents</a>:</p>
<blockquote><p>The Ohio State University and the Willi Hennig Society have just announced this summer&#8217;s <span style="text-decoration: line-through;">Workshop in Phylogenetics</span> Indoctrination in Cladistics Workshop. Some twenty students will receive fellowships to attend this workshop from the Willi Hennig Society. With these fellowships, students will be able to receive four days of instruction on the proper use of outdated methodologies for only $600.</p></blockquote>
<p>Leaving the mockery aside, the bottom line seems to be that said poster objects to the fact that model-based methods, especially Bayesian methods, will be taught by Christopher Randle who, as pointed out, has been critical of some aspects of how Bayesian statistics is being implemented in phylogenetic reconstruction (in particular, the possibility of establishing equal priors).</p>
<blockquote><p>Instruction on model-based methods will be provided by <a href="http://www.shsu.edu/%7Ebio_www/randle.html">Dr. Christopher Randle</a>, whose only publications on Bayesian methods are critiques (<a href="http://www.informaworld.com/smpp/1913810350-85685367/content%7Econtent=a748911911%7Edb=all">1</a>, <a href="http://www.ingentaconnect.com/content/iapt/tax/2005/00000054/00000001/art00003">2</a>) and whose recent publications rely either exclusively on parsimony (<a href="http://www.amjbot.org/cgi/content/abstract/93/11/1699">3</a>) or give preference to parsimony over maximum likelihood when the two methods are largely congruent (<a href="http://www.ingentaconnect.com/content/iapt/tax/2008/00000057/00000001/art00010">4</a>). I&#8217;m sure Dr. Randle is an excellent scientist, but his presence as the sole instructor of model-based methods suggests that this workshop is going to be about as balanced as Fox News.</p></blockquote>
<p>Here are my thoughts on the issue. First, I agree that there may be no one better to teach a method or its software implementation than the person who developed it. It would be wonderful if one could learn during a workshop, say, <a href="http://evolution.genetics.washington.edu/phylip.html">Phylip</a> from Joe Felsenstein and <a href="http://mrbayes.csit.fsu.edu/">MrBayes</a> from John Huelsenbeck. Since this type of opportunity doesn&#8217;t come up very often, I contend that any systematist well acquainted with such methods is a good substitute, in the same way that I don&#8217;t need to be James Watson to teach you the structure of DNA (it&#8217;s the one with the uracil, right?). Now, I don&#8217;t know Christopher Randle personally, but I gather that as someone who has published papers criticizing Bayesian implementation in peer-reviewed journals and whose papers have elicited published responses, he is more than qualified to teach such methods. If his papers were really far off, if he didn&#8217;t understand the theory behind the Bayesian implementations or didn&#8217;t know how to use MrBayes, they would instead have met the worst response there is to a scientific paper: being ignored.</p>
<p>As it happens, at the Ohio workshop students will get to learn parsimony-based phylogenetic methods from leading authors and software programmers in the field, like <a href="http://www.zmuc.dk/public/Phylogeny/TNT/">TNT</a> from Kevin Nixon and Direct Optimization (<a href="http://research.amnh.org/scicomp/projects/poy.php">POY</a>) from Ward Wheeler.</p>
<p>But regardless of your opinion about the aptness of parsimony methods in phylogenetics, consider this. Parsimony methods are, literally, an elegant and simple algebraic approach to phylogenetic reconstruction. If you understand the basics of parsimony (optimization, tree search, etc.), you will be able to learn model-based approaches in a breeze. You can almost reduce all phylogenetic methods to Sankoff matrices. So, even if you are only interested in model-based methods, you should know your parsimony well.</p>
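<p>For readers who have not met a Sankoff matrix before, the parsimony optimization mentioned above can be sketched in a few lines of Python. This is only a minimal illustration for a single character: the nested-tuple tree encoding, the function names and the unit-cost matrix are assumptions made for the example, not code from any of the programs named in this post.</p>

```python
import math

STATES = "ACGT"


def unit_cost_matrix(states=STATES):
    """Hypothetical step matrix: every change between different states costs 1."""
    return {(x, y): (0 if x == y else 1) for x in states for y in states}


def sankoff(tree, cost, states=STATES):
    """Down-pass of Sankoff optimization for one character.

    A leaf is a one-letter string (the observed state); an internal node is a
    pair (left, right). Returns {state: minimum cost of the subtree when its
    root is assigned that state}.
    """
    if isinstance(tree, str):
        # Observed state costs nothing; any other state is impossible.
        return {s: (0 if s == tree else math.inf) for s in states}
    left, right = (sankoff(child, cost, states) for child in tree)
    return {
        s: min(cost[s, t] + left[t] for t in states)
        + min(cost[s, t] + right[t] for t in states)
        for s in states
    }


def parsimony_length(tree, cost, states=STATES):
    """Length of the tree for this character: the best root assignment."""
    return min(sankoff(tree, cost, states).values())


cost = unit_cost_matrix()
grouped = (("A", "A"), ("C", "C"))   # like states are sister to each other
split = (("A", "C"), ("A", "C"))     # like states are separated
print(parsimony_length(grouped, cost), parsimony_length(split, cost))
```

<p>With unit costs the grouped tree needs one step and the split tree needs two. Swapping in a different cost matrix (say, one where some changes cost more than others) changes the lengths without touching the algorithm, which is the sense in which so much can be folded into the matrix.</p>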
<p>The Dechronization crew may be surprised at the sudden popularity of that particular post, but timing explains much. Just the previous day, a different author there posted <a href="http://treethinkers.blogspot.com/2009/04/dechronization-interviews-jack-sullivan.html">an interview with Jack Sullivan</a>, editor-in-chief of the journal Systematic Biology, promising more interviews to come. This post was picked up by <a href="http://noticiasenfilogeneticaorg.blogspot.com/2009/04/dechronization-interviews-jack-sullivan.html">Noticias sobre Filogenética</a>, a popular Latin-American blog and forum about phylogenetics widely read in the region, which recommended the blog and encouraged its readers to check out the interesting interview. The next day, blam: an acid critique, intended to be humorous, that unfortunately reads more like a rant.</p>
<p>As a regular reader of Dechronization I was struck by the sudden change in tone. This is a problem with coauthored blogs. You read each post like a newspaper headline, before you know who wrote the story, if you find out at all. The blog feed reader that I use doesn&#8217;t even show the author of the post. I read the Dechronization blog, not only the items posted by Susan Perkins, for example. Something for Dechronization to think about.</p>
<p>Meanwhile the comments section of that post keeps filling up quickly.</p>
<p><strong>Short version of this long post:</strong> <a href="http://treethinkers.blogspot.com/2009/04/cladistics-workshop-announced.html">cripple fight!</a></p>
<p style="text-align: center;">&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;</p>
<div class='footnotes'>
<div class='footnotedivider'></div>
<ol>
<li id='fn-830-1'><p>1. <em>Update 12:00pm GMT, April 22nd, 2009.</em> <a href="http://treethinkers.blogspot.com/2009/04/cladistics-post-deleted.html">Original post on Dechronization deleted</a>.</p>
<p>1.2. <em>Update, May 2nd, 2009</em>. It seems that the original poster did not agree with the removal of his post and reposted the <a href="http://dechronization.blogspot.com/2009/04/cladistics-workshop-announced.html">Dechronization announcement of the Cladistics Workshop here</a>. <span class='footnotereverse'><a href='#fnref-830-1'>&#8617;</a></span></p></li>
<li id='fn-830-2'><em>Update April 23rd, 2009.</em> I took the link out of my blogroll to show a dear friend that I care more about him than a silly blog. <span class='footnotereverse'><a href='#fnref-830-2'>&#8617;</a></span></li>
</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://roberto.kellerperez.com/2009/04/cladistics-wars-20/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>The evolution of the web</title>
		<link>http://roberto.kellerperez.com/2009/04/the-evolution-of-the-web/</link>
		<comments>http://roberto.kellerperez.com/2009/04/the-evolution-of-the-web/#comments</comments>
		<pubDate>Fri, 03 Apr 2009 13:43:01 +0000</pubDate>
		<dc:creator>Roberto Keller</dc:creator>
				<category><![CDATA[Cladistics]]></category>
		<category><![CDATA[Phylogeny]]></category>
		<category><![CDATA[Direct optimization]]></category>
		<category><![CDATA[Implied weights]]></category>

		<guid isPermaLink="false">http://roberto.kellerperez.com/?p=696</guid>
		<description><![CDATA[Spider web, that is. This is an excellent example of the way systematic papers should be. In the latest issue of the Proceedings of the National Academy of Sciences (USA), Blackledge and coworkers assembled a comprehensive data set for cladistic analysis of orb web spiders that includes six different molecular loci, 143 morphological characters and [...]]]></description>
			<content:encoded><![CDATA[<p><span style="float: left; padding: 5px;"><a href="http://www.researchblogging.org"><img style="border:0;" src="http://www.researchblogging.org/public/citation_icons/rb2_large_gray.png" alt="ResearchBlogging.org" /></a></span>Spider web, that is.</p>
<p>This is an excellent example of the way systematic papers should be. In the latest issue of the <a href="http://www.pnas.org/">Proceedings of the National Academy of Sciences</a> (USA), <a href="http://www.pnas.org/content/106/13/5229.abstract">Blackledge and coworkers</a> assembled a comprehensive data set for cladistic analysis of orb web spiders that includes six different molecular loci, 143 morphological characters and behavior in the form of characters derived from web architecture.</p>
<p><span id="more-696"></span></p>
<p style="text-align: center;">
<div id="attachment_698" class="wp-caption aligncenter" style="width: 537px"><img class="size-full wp-image-698" title="Blackledge et al. 2009 - figure 2" src="http://roberto.kellerperez.com/wp-content/uploads/2009/04/blackledge_etal2009fig2.jpg" alt="Hypothesis of web architecture evolution as optimized in preferred phylogeny." width="527" height="671" /><p class="wp-caption-text">Hypothesis of web architecture evolution as optimized in preferred phylogeny (from Blackledge et al. 2009: fig. 2).</p></div>
<p>The resulting picture supports a single origin of aerial orb webs from irregular webs constructed in the ground. More economical, irregular aerial web architectures subsequently evolved from the more costly, regular orb types at least three times independently. And there seems to be an instance of evolution towards the simplified aerial web spun by bolas spiders.</p>
<p>However, apart from the nice evolutionary story, the real treat in this paper is hidden in the small print and supplementary information. All the data compiled would have been useless if analyzed with poor methods. Instead, the authors applied a series of sophisticated phylogenetic techniques that, besides vanilla Bayesian and parsimony analyses, included implied weighting for the morphology partition and direct optimization for the molecular data. Morphology and molecular data were analyzed separately and in combination for an impressive total of 64 different types of phylogenetic analyses.</p>
<p>Implied weights and direct optimization analyses are worth remarking on here because they are not well known and still rarely used in <span style="text-decoration: line-through;">flashy</span> high-ranking papers. While in a regular parsimony analysis all characters are given equal weights regardless of how well or how poorly they fit a given tree, in implied weights analysis characters are downweighted as a function of the amount of homoplasy (extra steps) that is required to explain their distribution on any given tree topology during the tree-search phase. It is an <em>a posteriori</em> type of character weighting. One of the rationales behind this method is that one extra step in a character that already performs very poorly (is very homoplasious) should not count the same as one extra step in a character with almost perfect fit. The method was developed in the early 1990s but has had a revival of late for morphological data, curiously because, with the rise of molecular analyses, many authors have noticed that the results of analyses of morphology under implied weights mirror the molecular phylogenies more closely.</p>
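<p>To make the downweighting concrete, here is a minimal Python sketch. It assumes the concave fit function k / (k + h) commonly used for implied weighting, where h is the number of extra steps a character needs on a tree and k is a concavity constant. The value k = 3, the function names and the two toy trees are assumptions chosen for illustration, not taken from the paper.</p>

```python
def character_fit(extra_steps, k=3.0):
    """Fit of one character with `extra_steps` of homoplasy: k / (k + h)."""
    return k / (k + extra_steps)


def tree_fit(extra_steps_per_character, k=3.0):
    """Total fit of a tree: the sum of its character fits (higher is better)."""
    return sum(character_fit(h, k) for h in extra_steps_per_character)


def cost_of_one_more_step(extra_steps, k=3.0):
    """How much fit is lost when a character gains one further extra step."""
    return character_fit(extra_steps, k) - character_fit(extra_steps + 1, k)


# Two hypothetical trees, two characters each, four extra steps in total.
# Equal-weights parsimony cannot tell them apart.
tree_a = [0, 4]   # one perfect character, one very homoplasious character
tree_b = [2, 2]   # homoplasy spread evenly over both characters

print(round(tree_fit(tree_a), 3), round(tree_fit(tree_b), 3))
print(round(cost_of_one_more_step(0), 3), round(cost_of_one_more_step(9), 3))
```

<p>Under these assumptions tree A scores about 1.43 against 1.2 for tree B, so the search prefers the tree that keeps one character clean. The second line shows why: the first extra step in a perfect character costs 0.25 of fit, while a tenth extra step in an already messy character costs about 0.02.</p>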
<p>For the molecular data, the use of direct optimization techniques is simply a way to push the limits on finding the optimal correspondences among the positions of the DNA sequences under comparison. <a href="http://en.wikipedia.org/wiki/Multiple_sequence_alignment">Multiple sequence alignment</a> methods tend to find just one of the many possible optimal solutions (if not a suboptimal one), on which a regular phylogenetic analysis (parsimony, maximum likelihood or Bayesian) is then performed in a second phase. In contrast, direct optimization side-steps alignment altogether by searching for the optimal correspondences among sequences during tree search, in a single simultaneous step, thus performing a much more aggressive evaluation of the multiple possible alternatives. It is computationally intensive, but that is hardly an excuse for skipping good science, as this paper exemplifies.</p>
<p>The result of applying these methods is a well supported phylogeny that allows the authors to make a rigorous reconstruction of the evolution of spider silk, bringing together information from silk chemistry, spider morphology and behavioral ecology.</p>
<p>Don&#8217;t forget to take a look at the supplementary information. There are lots of nice pictures showing all the types of web architectures. Oh, and it&#8217;s open access, so no subscription required.</p>
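<p>The basic ingredient can be sketched in Python. This is not direct optimization itself, only the pairwise dynamic-programming cost between two unaligned sequences, which is roughly the kind of calculation that has to be repeated at the nodes of every candidate tree instead of being fixed once in an alignment. The substitution and indel costs and the function name are assumptions for illustration.</p>

```python
def edit_cost(seq_a, seq_b, substitution=1, indel=1):
    """Minimum cost of turning seq_a into seq_b with substitutions and indels.

    Classic dynamic programming over two unaligned sequences, keeping only
    the previous row of the cost table in memory.
    """
    previous = [j * indel for j in range(len(seq_b) + 1)]
    for i, base_a in enumerate(seq_a, start=1):
        current = [i * indel]
        for j, base_b in enumerate(seq_b, start=1):
            diagonal = previous[j - 1] + (0 if base_a == base_b else substitution)
            current.append(
                min(
                    diagonal,                  # match or substitution
                    previous[j] + indel,       # gap in seq_b
                    current[j - 1] + indel,    # gap in seq_a
                )
            )
        previous = current
    return previous[-1]


# Toy sequences of unequal length: no alignment is supplied in advance.
print(edit_cost("ACGT", "AGT"), edit_cost("ACGT", "ACGT"))
```

<p>Because the cost is computed from the raw sequences, changing the tree changes which pairs of sequences get compared, and with them the implied correspondences among positions. That is where the extra computation comes from.</p>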
<p><strong>Reference</strong></p>
<p><span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.jtitle=Proceedings+of+the+National+Academy+of+Sciences&amp;rft_id=info%3Adoi%2F10.1073%2Fpnas.0901377106&amp;rfr_id=info%3Asid%2Fresearchblogging.org&amp;rft.atitle=Reconstructing+web+evolution+and+spider+diversification+in+the+molecular+era&amp;rft.issn=0027-8424&amp;rft.date=2009&amp;rft.volume=106&amp;rft.issue=13&amp;rft.spage=5229&amp;rft.epage=5234&amp;rft.artnum=http%3A%2F%2Fwww.pnas.org%2Fcgi%2Fdoi%2F10.1073%2Fpnas.0901377106&amp;rft.au=Blackledge%2C+T.&amp;rft.au=Scharff%2C+N.&amp;rft.au=Coddington%2C+J.&amp;rft.au=Szuts%2C+T.&amp;rft.au=Wenzel%2C+J.&amp;rft.au=Hayashi%2C+C.&amp;rft.au=Agnarsson%2C+I.&amp;rfe_dat=bpr3.included=1;bpr3.tags=Biology%2CTaxonomy%2C+Zoology%2C+Phylogeny%2C+Evolutionary+Biology">Blackledge, T., Scharff, N., Coddington, J., Szuts, T., Wenzel, J., Hayashi, C., &amp; Agnarsson, I. (2009). Reconstructing web evolution and spider diversification in the molecular era <span style="font-style: italic;">Proceedings of the National Academy of Sciences, 106</span> (13), 5229-5234 DOI: <a rev="review" href="http://dx.doi.org/10.1073/pnas.0901377106">10.1073/pnas.0901377106</a></span></p>
]]></content:encoded>
			<wfw:commentRss>http://roberto.kellerperez.com/2009/04/the-evolution-of-the-web/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
