Young Investigator Zachary Goecker on forensic hair analysis using proteomic genotyping
Zachary talks about HID and medical applications of sequencing the hair shaft proteome
June 19, 2020
The new Investigator blog shines a personal spotlight on individual scientists at the exciting early stages of a career working in the field of human identification and forensics. The passion and commitment revealed in their stories are an inspiration to all in our community.
This chapter in the series introduces Zachary Goecker, a Ph.D. candidate at the University of California, Davis. Zachary is working in the specialized area of proteomic genotyping, with reference to the hair shaft proteome. This has forensic applications for individual identification and ancestral classification and is useful in circumstances where DNA may be degraded.
Tell us about your background and how you became interested in forensic science?
My mother, father and sister all work (or worked) in criminal justice. Growing up, I was enamored with police officers chasing bad guys, the thrill of detectives investigating a crime, and the action-hero type story lines of criminal justice. However, I began to realize second-hand that working with justice-involved persons can be draining, stressful and sometimes quite dangerous depending on the setting. It was apparent that criminal justice was not the way to go for me.
In high school, biology was by far the most entertaining subject and so I began to develop a deeper interest in topics like genetic diseases, cloning and DNA sequencing. I also developed close friendships with others in my science classes. This encouraged me to be more science-literate and ultimately led me to my undergraduate degree choice. As an undergraduate, I studied forensic biology. During this time, I also developed a passion for organic chemistry and biochemistry but decided to stick with forensic science and obtained my master’s in forensic chemistry.
After this experience, I was sure I wanted to study pharmacology! There was nothing that was going to stop me! So, I set out for a Ph.D. During my first year, I saw an advertisement for a project in forensic proteomics and I just couldn’t resist. While my passions are biochemistry and genetics, I still have a soft spot for criminal justice.
Can you provide a summary of the project you are working on?
The project I am working on answers genetic questions using protein-based methods. In short, we are interested in analyzing nonsynonymous SNPs, which are SNPs that are translated into protein. These nonsynonymous SNPs may be inherited genetically (non-somatic) and therefore have population genetic information associated with them such as genotype frequency. Peptides associated with these SNPs we call genetically variant peptides (GVPs). The frequency of occurrence in populations can help to determine classic forensic statistics such as random match probability and biogeographic likelihood ratio. The process of imputing SNPs from protein sequences is called proteomic genotyping. This type of genotyping is especially useful in circumstances where DNA may be degraded, such as with environmentally challenged, aged, or naturally degraded tissue.
The scope of my project deals with the human hair shaft. More specifically, we sequence the proteome of the hair shaft for the purposes of individual identification, ancestral classification and disease characterization. The hair shaft does have nuclear DNA, but it is degraded into small fragments by a process known as cornification. Given that previous methods of forensic hair analysis occurred under the microscope, are subjective and deemed to have limited probative value, proteomics could increase the value of hair in a courtroom setting.
For individual identification and ancestral classification, we are interested in calculating random match probabilities and likelihood ratios from GVPs detected in the hair proteome. For disease characterization, we are interested in identifying the proteomic explanation for a disease known as trichothiodystrophy. Persons with trichothiodystrophy are photosensitive and can be classified under the acronym PIBIDS (for Photosensitivity, Ichthyosis, Brittle hair, Infertility, Decreased intelligence, and Short stature).
Please describe your typical day in the lab.
In my experience, proteomics is 20% lab work and 80% data analysis. The lab work is primarily proteomic sample processing in the form of biochemical and physical experimentation. While DNA methods are rigorous in terms of sterilizing and cleaning to prevent contamination, proteomics is even more so. After all, dust is composed of keratin and the smallest trace could easily produce false positives in our analysis. DNA or RNA free reagents may still have trace peptides in them which can be detected using the sensitive instruments we use. As a result, there is a lot of time spent passing reagents through solid-phase extraction filters to ensure proteomic-grade purity.
In terms of data analysis, the field of proteomics has a multitude of software. This creates a complicated roadmap for what analysis to perform based on what questions you are asking. For this project, we are mostly looking at single amino acid polymorphisms and calculating random match probability from genotype frequencies and also looking at protein abundance levels using softwares such as X! Tandem, PEAKs, and Scaffold. Another aspect of proteomics is working with analytical instrumentation, such as with LC-MS/MS.
Method optimization alone could take up a lot of your time in this type of work, but, luckily, we collaborate with a proteomics core which is very familiar with the methodology needed to run our samples.
The potential of this research is what is most interesting to me. The more we work on optimizing the processing chemistry or data acquisition, the higher our random match probabilities reach. Mitochondrial analysis from human hair is limited by database size to about 1 in 10,000. Alu element typing has recently gained attention for obtaining RMPs in the billions. With our current data acquisition strategies, proteomic genotyping is projected to reach RMPs of about 1 in 100 trillion.
These results are definitely surprising to us, since in the early development of this technique RMPs reached up to about 1 in 10,000. If a combination of mtDNA, Alu element typing, and proteomic genotyping is used on a single hair shaft, the total estimate of random match probability may rival that of STR analysis. Another surprising result is how little amount of hair we need to get useful results. We use only 2 cm of hair for routine analysis, while others have proven that 2 mm may even be sufficient. This is mostly due to the sensitivity of instrumentation that is used.
What are the benefits of your project?
The main advantage of this project is the inherent objectivity of the analysis. Proteomics can be used for human identification and ancestral classification, all without the use of subjective visual comparisons. This is a very much needed turn in forensic hair analysis. At this time, only mtDNA and Alu element typing have allowed for an objective analysis.
Another benefit of proteomic genotyping is how well protein persists through environmental challenges. DNA tends to break down when exposed to environmental stresses such as high temperature and UV irradiation. This leads to a loss of genetic information in both nuclear and mitochondrial DNA.
Although short amplicon methods are gaining traction in forensic genetics, these methods are only useful above a certain threshold of degradation. Protein is more chemically stable and resistant to these challenges, providing an alternative for degraded samples. This may also be of use in an archaeological context where natural degradation has occurred. Structural proteins in hair and bone may still carry genetic information in the form of GVPs.
The main challenges of this project involve sample collection, contamination and data interpretation. Sample collection can be very easy if you are studying a non-human system. However, sample collection can be much trickier when you are required to obtain it from a living person. Recruitment of individuals can be particularly hard when it comes to more invasive sampling methods such as body fluid collection or a haircut to collect hair shafts.
In terms of contamination, we are always worried about keratin contamination from dust particulates. The instrumentation we use is capable of detecting peptides at the attomole level. Considering the most common organic contaminant is keratin, this is especially an issue. As mentioned earlier, we take rigorous steps to prevent contamination in our hair digests.
Finally, while the proteomic process is objective, data interpretation can vary from person to person. The threshold on what to call a true positive GVP is yet to be set, although some groups are currently working on guidelines to remedy this.
Which QIAGEN products do you use and what do you like about the products?
Currently, we are using the Gentra Puregene Buccal Cell Kit to extract DNA from mouthwash. Proteomic genotyping can only be compared with DNA genotypes, not genotypes produced from proteomic profiles. This is because the sensitivity of proteomic genotyping is not 100 %, meaning there are gaps in the data. Therefore, there is a need to achieve fast, reliable, and convenient methods of genotyping to parallel to our proteomic method. The Gentra Puregene kit was the best candidate for quick and reliable DNA extraction from mouthwash. It is especially helpful that we can collect DNA from individuals with a non-invasive mouthwash method rather than having to stick a swab inside of their mouth.
In the future, we may also be exploring methods of targeted genotyping with QIAGEN using the QIAseq Targeted DNA Custom Panel to replace whole exome sequencing as the validation method. This would ensure an affordable method of validation where all of our SNPs of interest are genotyped, and not just a subset.
Outside of forensic science, what are your hobbies?
On clear nights, I enjoy stargazing and looking for deep sky objects through my telescope. While I am no astronomer, I find that looking through a telescope at distant galaxies gives me extraordinary meaning and a humble perspective on life. My favorite objects are the Pinwheel galaxy, Jupiter and the Orion nebula.
In my free time I also enjoy cooking. However, I am more of a sous-chef under my significant other. While meal-prep is a part of most working life, it makes a potentially monotonous process more enjoyable to experiment with a new recipe or food you have never tried. My go-to meals include beef stir-fry, lemon garlic tilapia and jambalaya, although I have recently been trying out some Mediterranean dishes.