
William S. DeWitt III


University of Washington
Department of Genome Sciences
Foege Building S-250, Box 355065
3720 15th Ave NE, Seattle WA 98195-5065


GCtree: Using genotype abundance to improve phylogenetic inference.

I was amazingly fortunate to work on my first project in phylogenetics under Erick Matsen (Fred Hutch) and Vladimir Minin (UW at the time, now UC Irvine). The aim was to incorporate genotype abundance information in phylogenetic inference for experimental systems in which unfolding evolutionary processes are sampled at single-cell resolution. For such data sets, we have counts of the number of individuals that bear each genotype, a situation that standard phylogenetics algorithms can’t handle (so this abundance information is generally ignored). Things began with chalkboard scribbles like this:

We used a whimsical mix of standard phylogenetics approaches and stochastic process likelihoods to capture the intuition that genotypes with higher observed abundance will tend to have more descendant genotypes. After validating with extensive simulations, I set out for The Rockefeller in NYC to collaborate with Luka Mesin and Gabriel Victora on empirical tests of our methods in the setting of B cell receptor affinity maturation. They have some amazing experimental windows into affinity maturation. Just check out this video about Gabriel’s recent MacArthur “Genius” award:

what a genius

Our validation efforts seem to show that we are productively integrating the abundance information to infer better trees. Although we were motivated by B cell affinity maturation, we’re hoping GCtree will be useful in other settings, such as single-cell tumor phylogenetics, and cell lineage tracing using genome editing (e.g. the GESTALT magic coming out of the nearby Shendure lab).
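To make the abundance intuition concrete, here is a toy simulation. It is only a sketch and not the GCtree code or its likelihood. It assumes a simple binary branching process in which each cell either stops and is sampled, or divides into two daughters that each independently mutate to a brand-new genotype. The function names and the values of `p` (division probability) and `q` (mutation probability) are hypothetical choices made for illustration. The script compares how many new genotypes arise directly from genotypes observed once versus genotypes observed two or more times.

```python
import random


def simulate(p, q, rng):
    """Simulate one toy lineage tree (illustrative model, not GCtree itself).

    Returns {genotype_id: [abundance, n_mutant_children]}, where abundance is
    the number of sampled cells carrying the genotype and n_mutant_children is
    the number of new genotypes that arose directly from it.
    """
    stats = {0: [0, 0]}   # the root cell carries genotype 0
    next_id = 1
    stack = [0]           # genotype of each cell still to be processed
    while stack:
        g = stack.pop()
        if rng.random() < p:
            # The cell divides into two daughters.
            for _ in range(2):
                if rng.random() < q:
                    # Daughter mutates to a genotype never seen before.
                    stats[g][1] += 1
                    stats[next_id] = [0, 0]
                    stack.append(next_id)
                    next_id += 1
                else:
                    stack.append(g)
        else:
            # The cell stops dividing and is sampled.
            stats[g][0] += 1
    return stats


def mean_children_by_abundance(n_trees=2000, p=0.45, q=0.3, seed=0):
    """Mean number of mutant child genotypes for abundance == 1 vs >= 2.

    p < 0.5 keeps the process subcritical, so every simulated tree is finite.
    """
    rng = random.Random(seed)
    low = [0, 0]    # [total children, genotype count] for abundance == 1
    high = [0, 0]   # same, for abundance >= 2
    for _ in range(n_trees):
        for abundance, children in simulate(p, q, rng).values():
            if abundance == 0:
                continue  # genotype was never sampled, so it is unobserved
            bucket = low if abundance == 1 else high
            bucket[0] += children
            bucket[1] += 1
    return low[0] / low[1], high[0] / high[1]


if __name__ == "__main__":
    low_mean, high_mean = mean_children_by_abundance()
    print("mean descendant genotypes, abundance == 1:", low_mean)
    print("mean descendant genotypes, abundance >= 2:", high_mean)
```

Under these assumptions, a genotype carried by more sampled cells has passed through more cell divisions, and so has had more chances to spawn mutant genotypes. That is the qualitative signal an abundance-aware tree ranking can exploit.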

Here’s the repo for the GCtree software, and the arXiv preprint of our manuscript. We’d be very happy to hear from others interested in inferring phylogenetic trees for data with genotype abundance information, and I think we’ve done a decent job of making the code general and usable.

And check out Erick’s nice blog post about the project!


* equal contribution


WS DeWitt, L Mesin, GD Victora, VN Minin, FA Matsen. Using genotype abundance to improve phylogenetic inference. arXiv:1708.08944 [q-bio.PE] (2017) (under review)

WS DeWitt, KK Quan, D Wilburn, A Sherwood, M Vignali, SC De Rosa, CL Day, TJ Scriba, HS Robins, W Swanson, RO Emerson, P Bradley, C Seshadri. An MHC-independent T-cell repertoire is enriched during active tuberculosis. bioRxiv 123174 (2017) (under review)

RO Emerson*, WS DeWitt*, M Vignali, J Gravley, JK Hu, EJ Osborne, C Desmarais, M Klinger, CS Carlson, JA Hansen, M Rieder, HS Robins. Immunosequencing identifies signatures of cytomegalovirus exposure history and HLA-mediated effects on the T cell repertoire. Nature Genetics 49, 659–665 (2017)


WS DeWitt*, P Lindau*, TM Snyder*, AM Sherwood, M Vignali, CS Carlson, PD Greenberg, N Duerkopp, RO Emerson, HS Robins. A Public Database of Memory and Naive B-Cell Receptor Sequences. PLoS ONE 11(8) (2016)


WS DeWitt, P Lindau, TM Snyder, M Vignali, RO Emerson, HS Robins. Replicate Immunosequencing as a Robust Probe of B Cell Repertoire Diversity. arXiv:1410.0350 [q-bio.QM] (2014)

RO Emerson, A Sherwood, WS DeWitt, B Howie, M Rieder, HS Robins. A next-gen pipeline for generation, error correction and annotation of high-throughput immunosequencing data. J Immunol 192:69.10 (2014)

WS DeWitt*, RO Emerson*, P Lindau, M Vignali, TM Snyder, C Desmarais, C Sanders, H Utsugi, EH Warren, J McElrath, KW Makar, A Wald, HS Robins. Dynamics of the Cytotoxic T Cell Response to a Model of Acute Viral Infection. J Virol 89:4517–4526 (2015)


X Xu, Y Shen, WS DeWitt, D Pandya, FZ Bischoff, KD Crew, DL Hershman, MA Maurer, R Parsons, K Kalinsky. Mutational analysis of circulating tumor cells in breast cancer patients by targeted clonal sequencing. J Clin Oncol 30, (suppl; abstr 10516) (2012)

WS DeWitt, K. Chu. Imaging Protein Statistical Substate Occupancy in a Spectrum-Function Phase Space. Phys. Rev. Lett. 105, 098101 (2010)

J. Wu, J. Pepe, WS DeWitt. Nonlinear Behaviors of Contrast Agents Relevant to Diagnostic and Therapeutic Applications. Ultrasound Med Biol. Apr;29(4):555-62. (2003)