We have used high-density peptide arrays displaying short peptides to interrogate the specificity of the antibody repertoires of Chagas disease patients across the Americas. A detailed description of this work is available in our original paper.

Introduction

Chagas disease

Chagas disease (American trypanosomiasis) is a lifelong infection caused by the protozoan parasite Trypanosoma cruzi. Although it was discovered more than 100 years ago, the condition remains a major social and public health problem in the Americas and is regarded as a neglected tropical disease by the World Health Organization.

Trypanosoma cruzi

Trypanosoma cruzi is a unicellular eukaryote that infects and replicates within cells. There are 6-7 evolutionary lineages of the parasite across the Americas, circulating in different ecoepidemiological cycles (domestic, peridomestic, sylvatic). All cause human infections, with a diversity of clinical manifestations.

High Density Peptide Arrays

Peptide arrays display short peptides at addressable positions. A very high density of probes is achieved by in situ peptide synthesis driven by digital light processors. Such arrays contain hundreds of thousands to millions of peptides.

Antibody Repertoires

The term “antibody repertoire” refers to the entire set of antibodies produced by an individual as part of the adaptive immune response to infections or vaccines.

Assays

We assayed high-density arrays displaying immobilized short peptides (candidate antigens and epitopes) with the same methodology used in indirect immunoassays. Arrays were first incubated with a primary antibody (a human serum sample or a pool of samples), then washed to remove unbound immunoglobulins. Arrays were next incubated with a fluorescently labeled secondary antibody that binds to human immunoglobulin G (total IgG). After washing to remove unbound antibodies, fluorescence was read in a scanner.

Step 1: Whole Proteome Discovery Screening

The discovery screening was aimed at displaying the complete proteome of T. cruzi as short peptides and detecting which of these peptides were bound by antibodies from subjects with Chagas disease. Two complete proteomes were displayed at this stage (2.84 million peptides), derived from T. cruzi strains CL-Brener and Sylvio X10, representative of lineages TcVI (CL-Brener, hybrid) and TcI (Sylvio X10). To maximize coverage for discovery, each peptide array was incubated with a pool of 5-6 serum samples.

Visual Summary

Visual Summary of the Discovery Screening. Schematic representation of the steps followed to analyze two T. cruzi proteomes (CL-Brener and Sylvio X10) using pooled serum samples from across the Americas (one pool from infected individuals and one from healthy subjects for each of Argentina, Bolivia, Brazil, Colombia, Mexico and the United States).

Experiment Design

Proteomes, Proteins & Microarrays

We used high-density peptide arrays to perform high-resolution antigen discovery. We designed an array that includes protein sequences encoded in the genomes of two T. cruzi strains: the genome reference from the CL Brener strain (19,668 proteins, Discrete Typing Unit (DTU) TcVI, hybrid), and the Sylvio X10 strain (10,832 proteins, DTU TcI, non-hybrid). A total of 30,500 proteins were displayed in our microarrays.

To create the microarray slides, each protein was split into peptides of 16 amino acids. The overlap between consecutive (neighboring) peptides was 12 amino acid residues (i.e. the offset between one peptide and the next was 4 residues, see Visual Summary). A total of 2,441,908 unique peptides were used in this design to display all proteins in these two proteomes. We called this design CHAGASTOPE-v1.
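The tiling described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used to build CHAGASTOPE-v1: the function name tile_protein is hypothetical, and the handling of short proteins and of the C-terminal end (an extra final window so that the last residues are always covered) is an assumption that the design description does not specify.

```python
def tile_protein(sequence, length=16, offset=4):
    """Split a protein sequence into overlapping peptides.

    With length=16 and offset=4, consecutive peptides share 12 residues.
    """
    # Assumption: proteins shorter than one peptide are kept as a single probe.
    if len(sequence) <= length:
        return [sequence]
    starts = list(range(0, len(sequence) - length + 1, offset))
    # Assumption: add one final window so the C-terminus is always covered.
    last_start = len(sequence) - length
    if starts[-1] != last_start:
        starts.append(last_start)
    return [sequence[s:s + length] for s in starts]


# Toy 24-residue sequence (not a real T. cruzi protein).
peptides = tile_protein("ACDEFGHIKLMNPQRSTVWYACDE")
for peptide in peptides:
    print(peptide)
```

Setting offset=1 in the same function yields maximally overlapped peptides.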

Serum Samples

To discover new antigens and epitopes, we assayed CHAGASTOPE-v1 whole-proteome microarray slides with pooled serum samples. These included pools from Chagas-positive donors (infected with T. cruzi) and matched negative pools from healthy subjects. Each pair of positive and matched negative pools was representative of one of 6 geographical regions across the Americas: Argentina, Bolivia, Brazil, Colombia, Mexico and the United States. Two additional pools were also profiled: a pool from Leishmania-positive individuals and a matched pool of Leishmania-negative (also Chagas-negative) samples from the same geographic area. All 14 pools were assayed in duplicate.

Sample pools were labeled as: AR (Argentina), BO (Bolivia), BR (Brazil), CO (Colombia), MX (Mexico), US (United States) and LE (the pool from Leishmania-positive individuals). Each pool was also tagged as either Positive (for Chagas/Leishmania-positive individuals) or Negative for healthy individuals.
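As a quick check of the design, the pools can be enumerated from the labels above. This is a minimal sketch: the tuple layout and variable names are hypothetical, and the count of arrays assumes that each duplicate was run on its own array.

```python
# Pool prefixes and tags as listed in the text.
POOL_PREFIXES = ["AR", "BO", "BR", "CO", "MX", "US", "LE"]
STATUSES = ["Positive", "Negative"]
REPLICATES = 2  # every pool was assayed in duplicate

# One (prefix, status) pair per pool.
pools = [(prefix, status) for prefix in POOL_PREFIXES for status in STATUSES]

# Assumption: one array per pool per replicate.
arrays = [
    (prefix, status, replicate)
    for prefix, status in pools
    for replicate in range(1, REPLICATES + 1)
]

print(len(pools), "pools;", len(arrays), "assays")
```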

Data Processing

Data processing of Chagastope arrays produces the following types of numerical values for peptides:

Raw Signal data were those obtained from scanning each microarray slide. From the experiments we obtained 2 raw signal values per peptide per sample (because they were assayed in duplicate). When browsing “All Peptide Data” these are shown under the “Replica” column.

Normalized Signal data were obtained by performing quantile normalization over two sets of microarray slides. All arrays assayed with Chagas-positive subjects were normalized together (positive set), and all arrays assayed with negative samples were normalized together in a separate set (negative set). All normalized values can be compared across replicas and experiments.
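For readers unfamiliar with quantile normalization, the sketch below shows the general technique in plain Python. It is not the pipeline used for the Chagastope data: the function name is hypothetical, ties are broken by position rather than averaged, and each input list stands for the raw signals of one array, with the same peptide at the same index in every list.

```python
def quantile_normalize(columns):
    """Quantile-normalize a set of arrays.

    columns: list of equal-length lists, one list of raw signals per array.
    Returns new lists in which every array has the same value distribution.
    """
    n = len(columns[0])
    # For each array, the peptide indices ordered from lowest to highest signal.
    orders = [sorted(range(n), key=lambda i, col=col: col[i]) for col in columns]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [
        sum(col[order[k]] for col, order in zip(columns, orders)) / len(columns)
        for k in range(n)
    ]
    normalized = []
    for order in orders:
        out = [0.0] * n
        # The peptide with rank k in this array receives the k-th reference value.
        for rank, index in enumerate(order):
            out[index] = reference[rank]
        normalized.append(out)
    return normalized


# Toy data: two arrays, three peptides each.
positive_set = [[5, 2, 3], [4, 1, 6]]
print(quantile_normalize(positive_set))
```

Under the scheme described above, such a function would be called once on the positive set and once, separately, on the negative set.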

Smoothed Signals were obtained by placing all peptides from the same protein in order, smoothing the normalized signal data of each replica with a rolling median, and then averaging the values obtained for the two replicas. This was done to remove outliers, and uses neighboring peptides in a protein sequence as pseudo-replicates (a given peptide shares 75% of its sequence with the previous/next peptide in the protein sequence). This resulted in a signal value for each peptide in a specific position inside each protein. The standard deviation (SD) in this case summarizes the dispersion of the smoothed values across the two replicates.
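The smoothing step can be illustrated as follows. This is a sketch under stated assumptions rather than the actual implementation: the window size (3 peptides), the truncated windows at the ends of a protein, and the use of the sample standard deviation are all assumptions, and the function names are hypothetical.

```python
from statistics import mean, median, stdev


def rolling_median(values, window=3):
    """Centered rolling median; windows are truncated at the protein ends."""
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def smooth_protein(replica_1, replica_2, window=3):
    """Return one (smoothed signal, SD) pair per peptide position.

    Each replica holds the normalized signals of one protein's peptides,
    ordered by their position in the protein sequence.
    """
    smoothed_1 = rolling_median(replica_1, window)
    smoothed_2 = rolling_median(replica_2, window)
    # Average the two smoothed replicas and report their dispersion.
    return [(mean(pair), stdev(pair)) for pair in zip(smoothed_1, smoothed_2)]


# Toy data: replica 1 has an isolated outlier at the third peptide.
replica_1 = [100, 110, 5000, 120, 130]
replica_2 = [100, 110, 120, 120, 130]
for signal, sd in smooth_protein(replica_1, replica_2):
    print(signal, round(sd, 2))
```

In the toy data, the isolated value of 5000 is replaced by the median of its neighborhood, which is the outlier-removal effect described above.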

Antigenicity threshold

We established an arbitrary but very conservative antigenicity threshold to classify peptides as antigenic. We calculated the mode and the standard deviation for all peptides in the arrays using Normalized Signals. The antigenicity threshold was defined as the statistical mode + 4 standard deviations (this equals a normalized signal value of 10,784.80 arbitrary fluorescence units). The antigenicity (signal) threshold of dynamic plots can be adjusted manually for visualization purposes using the options in the “Plot options” tab.
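The threshold rule (mode plus a number of standard deviations) can be sketched as below. How the mode of the continuous signal distribution was estimated is not described here, so the histogram-based estimate, the bin_width parameter, the use of the population standard deviation and the function names are all assumptions made for illustration.

```python
from collections import Counter
from statistics import pstdev


def antigenicity_threshold(signals, n_sd=4.0, bin_width=100.0):
    """Return mode + n_sd * SD for a list of normalized signals."""
    # Assumption: the mode is the center of the most populated histogram bin.
    bins = Counter(int(s // bin_width) for s in signals)
    top_bin = max(bins, key=bins.get)
    mode = (top_bin + 0.5) * bin_width
    return mode + n_sd * pstdev(signals)


def is_antigenic(signal, threshold):
    """Assumption: a peptide is antigenic if its signal reaches the threshold."""
    return signal >= threshold


# Toy data; the real calculation used all peptides in the arrays.
signals = [100, 110, 120, 130, 5000]
print(antigenicity_threshold(signals))
# Classification against the threshold reported in the text.
print(is_antigenic(12000.0, threshold=10784.80))
```

The same function with n_sd=2.4 corresponds to the rule used for the second experimental stage.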

Step 2: Detailed Epitope Analysis

The second experimental stage used a different array design and was aimed at performing a detailed analysis of the antigens and epitopes identified in the previous stage. At this stage all assays were performed with individual (not pooled) serum samples, which provided information on the seroprevalence of antigenic regions and epitopes. In addition, neighboring peptides were maximally overlapped (sliding offset of 1 amino acid residue).

Experiment Design

Antigenic Regions, Antigenic Peaks & Microarrays

At this stage, all assays were performed using sectorized high-density peptide arrays. These arrays can assay up to 12 samples (primary antibodies) in parallel on the same slide. All 12 sectors in each slide contained the same set of peptides, representative of 9,547 antigenic regions found in the discovery screening (241,772 unique peptides). We called this design CHAGASTOPE-v2.

Serum Samples

Array sectors were assayed with individual serum samples from the same regions as in the discovery screening. A total of 71 samples from Chagas-positive individuals were assayed (in duplicate) on 142 array sectors: 12 each from Argentina, Bolivia, Brazil, Mexico and the United States, and 11 from Colombia.

At the Chagastope web resource, plots and tables refer to these samples as, e.g., AR_P1, where the prefix AR means that the sample is from Argentina, P means that it was one of the sera used in the AR pool of Chagas-positive individuals from Argentina, and the trailing number is used to differentiate samples. Samples that were not part of their corresponding pools are marked with an E (as in AR_E1). Prefixes are: AR (Argentina), BO (Bolivia), BR (Brazil), CO (Colombia), MX (Mexico) and US (United States). These sera were all tagged as Positive for Chagas.
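For users who download data and need to interpret sample names programmatically, the naming scheme above can be parsed with a small helper. This is a hypothetical sketch: the function, the returned field names and the assumption that every label matches the PREFIX_LETTERNUMBER pattern exactly are illustrative and not part of the web resource.

```python
import re

COUNTRIES = {
    "AR": "Argentina",
    "BO": "Bolivia",
    "BR": "Brazil",
    "CO": "Colombia",
    "MX": "Mexico",
    "US": "United States",
}

# Country prefix, P (in the pool) or E (not in the pool), then a number.
LABEL_PATTERN = re.compile(r"^(AR|BO|BR|CO|MX|US)_([PE])(\d+)$")


def parse_sample_label(label):
    """Split a label such as 'AR_P1' into its components."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"Unrecognized sample label: {label!r}")
    prefix, kind, number = match.groups()
    return {
        "country": COUNTRIES[prefix],
        "in_pool": kind == "P",  # True if the serum was part of the pool
        "number": int(number),
    }


print(parse_sample_label("AR_P1"))
print(parse_sample_label("AR_E1"))
```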

Data Processing

Data were processed and analyzed in a manner similar to that used in the discovery screening.

Antigenicity threshold

A separate arbitrary threshold was calculated and defined for these experiments. After calculating the statistical mode and the standard deviation for all Normalized Signals in these experiments, we defined an antigenicity threshold equal to the statistical mode plus 2.4 standard deviations, which resulted in 5,814.81 arbitrary fluorescence units.