Marit K H van der Wiel, Ngoc Giang Le, Nanine de Groot, Natasja G de Groot, Ronald E Bontrop, Jesse Bruijnesteijn, Exploring the genetic mechanisms driving KIR diversification, The Journal of Immunology, Volume 214, Issue 4, April 2025, Pages 762–779, https://doi.org/10.1093/jimmun/vkae047
Abstract
Killer cell immunoglobulin-like receptors (KIRs) are key modulators of natural killer cell activity, displaying either activating or inhibitory potential upon recognition of major histocompatibility complex (MHC) class I molecules. The genomic organization of KIR genes is complex, involving copy number variation and allelic polymorphism, which is probably due to their coevolution with highly polymorphic MHC ligands. The KIR diversity is reflected by more than 70 similar region configurations encountered in humans, generated through meiotic recombination events. Rhesus macaques happen to display even more diversity, and over 100 distinct configurations were identified in a relatively small cohort of animals. More than half of these region configurations feature hybrid KIR genes, suggesting a more pronounced mode of diversification in macaques. The molecular mechanism facilitating meiotic rearrangements in the KIR region is poorly understood. Examination of 21 rhesus macaque and 14 human KIR region configurations revealed the presence of long terminal repeats and PRDM9 binding motifs associated with recombination hotspots. The variable DNA recognition patterns of PRDM9 could potentially contribute to the differing recombination activities documented for the KIR region in humans and macaques. The diversification process of the KIR repertoire in natural killer cells is fundamentally distinct from the mechanisms generating T and B cell receptor diversity or MHC polymorphisms. This sophisticated recombination machinery preserves the functional integrity by the frequent generation of in-frame KIR genes. A diverse KIR repertoire contributes to the protection of individuals and populations against pathogen evasion and subversion.
Introduction
Natural killer (NK) cells are lymphocytes that contribute to the early response during infection or tumor formation. In contrast to T and B lymphocytes involved in the adaptive immune response, the innate counterparts do not feature rearranged antigen-presenting receptors but monitor their environment via an array of germline encoded molecules.1–3 Important modulators of NK cell activity are the killer cell immunoglobulin-like receptors (KIRs), which exhibit activating or inhibitory potential upon recognition of polymorphic epitopes on major histocompatibility complex (MHC) class I molecules.4 Coevolution of these 2 immune families, driven by a constantly changing landscape of pathogens, propels their genetic conservation or diversification.5,6 In total, 17 KIRs are documented in humans, which display diversity in receptor structure, signaling potential, ligand binding specificity and affinity, expression levels, and cellular localization.7,8 This variation is extended by polymorphisms, as evidenced by a repertoire of 1,600 KIR allotypes.9 Only a limited set of KIRs are encoded within an individual’s genome, whereas the whole scale of allelic variation becomes visible at the population level. This strategy of the immune system likely minimizes the risk of an entire population being highly susceptible to a specific pathogen and reduces the potential for pathogen evasion. Even more, NK cells express 1 to 3 of these available KIRs in a stochastic manner, a process maintained by the epigenetic status of promoters, generating NK cell heterogeneity at an individual level.10–13
At a genomic level, the plasticity of the KIR system is further augmented by extensive gene copy number variation. Over 70 different gene configurations (unique sets of genes present on 1 chromosome) are defined for the human KIR gene cluster on chromosome 19q13, the majority of which consist of a centromeric and a telomeric region.14–16 The 7 most frequent KIR gene configurations are different combinations of 4 centromeric and 2 telomeric regions. The less common gene cluster organizations originate from chromosomal rearrangements that propel structural diversity by the introduction and deletion of genes, by recombining complete centromeric and telomeric regions, and by the reshuffling of gene segments. This latter molecular process may generate novel hybrid entities that might encode KIRs with distinct functional properties.15,17,18 These recombination events make the KIR cluster one of the most variable regions in the human genome, comparable to the extensive genetic diversity recorded for their MHC ligands.
The rapid evolution of this immunogenetic system is also substantiated by the existence of diverse KIR region configurations and gene repertoires in other primate species.19–22 For example, unparalleled KIR gene diversity is documented for rhesus macaques, a species that is relevant as a model for infectious diseases in biomedical research.18,19,23,24 Over 118 KIR haplotypes (gene configurations with allelic polymorphism taken into account as well), representing 102 region configurations, were recorded in a relatively small outbred colony of 300 rhesus macaques, indicating a highly unique KIR repertoire per individual.24 The large number of distinct gene configurations most likely originates from frequent recombination events, a phenomenon that appears to be more pronounced in the KIR region of rhesus macaques as compared with its human counterpart. This difference might be explained by a stronger diversifying selection acting on the macaque KIR region, potentially driven by several factors, such as the imperative to maintain immune recognition in the face of evolving pathogens, reproductive success, and coevolution with the dynamic nature of MHC class I genes, which, in contrast to humans, display substantial levels of copy number variation in macaques as well.25
The differential diversifying activity in the human and macaque KIR regions likely involves molecular aspects that promote recombination, but these putative features are poorly understood. Across the genome most recombination events are confined to narrow hotspots that are unevenly distributed on chromosomes.26 A higher hotspot density is in general observed at the terminal subtelomeric regions,27 where indeed on chromosome 19 the KIR gene cluster is situated in primate species. In addition to chromosomal location, the sequence similarity among KIR genes, the presence of retroviral elements and the dense packing of genes in the KIR cluster might also contribute to the seemingly high recombination activity.26 Other molecular features that could modulate recombination events involve the local GC content and the distribution of DNA-binding motifs for PR domain-containing protein 9 (PRDM9).28,29 This latter polymorphic zinc finger protein associates with meiotic recombination sites in mammals, contributing to the initiation of double-stranded breaks and thereby facilitating the exchange of genetic material. Allelic variation in PRDM9 concentrates in the DNA-binding domains, resulting in distinct DNA recognition specificities among individuals and species, ultimately shaping differential recombination landscapes.30
In this study, we explored the recombination intervals within the human and macaque KIR regions through pairwise sequence alignment of hybrid genes and recombined haplotypes. By intersecting these recombination hotspots with genomic features that potentially drive rearrangements, we mapped the factors that contribute to the differential diversification observed in the rhesus macaque and human KIR regions.
Materials and methods
DNA extraction, sequencing, and haplotype assembly
We isolated high-molecular-weight and ultra-high-molecular-weight DNA from peripheral blood mononuclear cell samples (±7 × 10⁶ cells) of 13 rhesus macaques of Indian origin that are housed at the Biomedical Primate Research Centre (Rijswijk, the Netherlands) using the Nanobind UL library prep kit (Circulomics). A computational enrichment approach, referred to as adaptive sampling, was applied in concert with sequencing on R9.4 flow cells and an Oxford Nanopore Technologies GridION device. Basecalling was performed with Guppy V3.4.1 software, and the generated reads were filtered on length (>3 kb) and quality score (q-score > 7) before they were mapped to a reference database containing KIR exon sequences from the IPD-KIR database (www.ebi.ac.uk/ipd/kir/) using minimap2. Mapped reads that were generated after adaptive sampling enrichment had an average length of 55 kb and were used to construct complete KIR haplotypes by de novo assembly and by alignment to genomic reference sequences using minimap2, reaching an average coverage of 40×. All KIR alleles were annotated based on exon reference sequences or were designated as novel entities based on phylogenetic trees.
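The read filter described above (length > 3 kb, q-score > 7) can be illustrated with a minimal Python sketch. It assumes Phred+33 encoded quality strings and computes the mean q-score by averaging per-base error probabilities, which is one common convention for nanopore reads; whether the original pipeline computed the score in exactly this way is an assumption, and the function names are hypothetical.

```python
import math


def mean_qscore(quality: str, offset: int = 33) -> float:
    """Mean read quality from a Phred-encoded string.

    Per-base error probabilities are averaged first and then converted
    back to a Phred score (assumption: Phred+33 encoding).
    """
    if not quality:
        return 0.0
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality]
    return -10 * math.log10(sum(probs) / len(probs))


def passes_filter(sequence: str, quality: str,
                  min_length: int = 3000, min_q: float = 7.0) -> bool:
    """Keep reads longer than min_length with a mean q-score above min_q."""
    return len(sequence) > min_length and mean_qscore(quality) > min_q
```

Averaging error probabilities rather than raw Phred values penalizes reads with stretches of very poor bases, which is why the two approaches can disagree on borderline reads.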
Gene density
The gene densities of the human and rhesus macaque KIR, LILR, and MHC-DRB regions were calculated for different haplotypes. The regions were defined from the first coding exon on the haplotype to the last coding exon. The length of each region was determined in combination with the number of functional genes and pseudogenes present. For comparison, the gene/length ratios were extrapolated to the number of genes per Mb sequence. Gene densities that were calculated for the different haplotypes were used as independent samples in a t test using RStudio v2024.12.0.31 The overall gene densities of the human and rhesus macaque genomes were determined using the total length and annotated genes documented for the reference genomes, GRCh38 and Mmul_10.32,33 The calculated overall gene density for the rhesus macaque genome might be underestimated due to less comprehensive annotation of the reference sequences compared with the human reference.
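As a worked illustration of the genes-per-Mb extrapolation, the following Python sketch scales a gene count by region length. The t statistic shown is the unequal-variance (Welch) form, which is an assumption because the text does not state which variant of the t test was applied; the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def genes_per_mb(n_genes: int, region_length_bp: int) -> float:
    """Extrapolate a gene count over a region to genes per megabase."""
    if region_length_bp <= 0:
        raise ValueError("region length must be positive")
    return n_genes / region_length_bp * 1_000_000


def welch_t(sample_a, sample_b) -> float:
    """Welch's t statistic for two independent samples of gene densities."""
    se = sqrt(variance(sample_a) / len(sample_a)
              + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se
```

For example, a region of 243 kb holding 17 genes corresponds to roughly 70 genes per Mb, and a region of 150 kb holding 5 genes to roughly 33 genes per Mb. These figures use the haplotype spans reported in the Results purely as example inputs; the densities in the study were computed on regions delimited by the first and last coding exon, so the actual values may differ.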
Gene similarity
The sequences of KIR genes and their corresponding intergenic stretches were extracted from the KIR haplotypes and aligned using MAFFT.34 Neighbor-joining trees were generated and plotted for the KIR genes and intergenic regions utilizing Geneious Prime software (version 2023.2.1), applying the Jukes-Cantor and the Kimura substitution models. Both models demonstrated similar clustering, substantiated by high bootstrap values. In addition, a python application, SimPlot++, was used to determine the global and local sequence similarity, applying the Jukes-Cantor distance model, a window length of 200 bp, and steps of 20 bp.35 Similarity networks were plotted using the same tool, with the global similarity threshold set to 95% and the local similarity threshold to 100%, analyzing the complete sequence range. The sequence similarity tables were extracted and used to calculate the number of 100% identical intervals between pairs of KIR genes.
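The sliding-window comparison can be sketched in Python as follows. This is not the SimPlot++ implementation; it is a simplified stand-in that takes two rows of an existing alignment, uses the window length (200 bp) and step (20 bp) stated above, skips gap columns (an assumption), and includes the standard Jukes-Cantor correction for reference. All function names are hypothetical.

```python
import math


def window_identity(seq_a: str, seq_b: str, window: int = 200, step: int = 20):
    """Return (window start, fraction identical) along two aligned sequences.

    Columns containing a gap ('-') in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        pairs = [(a, b)
                 for a, b in zip(seq_a[start:start + window],
                                 seq_b[start:start + window])
                 if a != "-" and b != "-"]
        if not pairs:
            continue
        matches = sum(a == b for a, b in pairs)
        results.append((start, matches / len(pairs)))
    return results


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor distance for a proportion p of differing sites."""
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)


def count_identical_windows(seq_a: str, seq_b: str) -> int:
    """Number of windows in which the two sequences are 100% identical."""
    return sum(1 for _, identity in window_identity(seq_a, seq_b)
               if identity == 1.0)
```

Counting windows with an identity of exactly 1.0 mirrors the idea of tallying 100% identical intervals between a pair of genes, although overlapping windows are counted individually here rather than merged into intervals.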
DNA-binding predictions for the PRDM9 zinc finger arrays in macaques
Sequence variation of exon 10, encoding the PRDM9 zinc fingers, was investigated for 55 rhesus macaques of Indian origin by amplicon sequencing. Forty of these individuals represented 14 sets of relatives, with at least 1 parent with 1 child, allowing segregation analysis to confirm novel sequences, whereas 15 others were unrelated individuals. Genomic DNA samples of these animals were available from the in-house biobank. Exon 10 was amplified using sequence specific primers and an optimized polymerase chain reaction protocol (Table S1). The amplicons were size selected using gel electrophoresis and extracted from gel using a GeneJet Gel extraction kit (Invitrogen). Tagged amplicons were pooled and purified twice using AMPure XP beads (Beckman Coulter) at a 1:1 bead-to-DNA volume ratio. SMRTbell libraries were generated according to the PacBio Amplicon Template Preparation protocol for circular consensus sequences and sequenced on a PacBio Sequel II platform with P6-C4 sequencing chemistry. An average of 2,000 CCS reads were obtained per individual, which were mapped to a PRDM9 reference sequence extracted from Mmul_10. Novel PRDM9 exon 10 sequences were confirmed by at least 2 independent polymerase chain reactions or by allele segregation in relatives. Eleven sequences encoding the zinc finger array were confirmed, all representing novel PRDM9 alleles (Fig. S1). The new alleles are designated as PRDM9*New, followed by a sequential numbering (e.g. PRDM9*New1). The nucleotide sequences were translated to amino acids for further analysis.
An online Cys2His2 zinc finger predictor was used to identify the zinc fingers encoded by the different PRDM9 alleles (HMMER bit scores > 17.7).36,37 The residues 1, 3, and 6 are reported to facilitate DNA-binding interactions for each zinc finger. The polynomial support vector machine algorithm was applied to predict the DNA-binding motifs for each zinc finger array, for which position weight matrixes were generated. The position weight matrixes were converted to the MEME-motif format, which were subsequently used to generate DNA sequence logos.38 To compare the different motifs, we first aligned them and subsequently stacked the motifs in a phylogenetic tree using the motifStack package in RStudio v2024.12.0 (R Foundation for Statistical Computing).39
Determination of recombination intervals
To determine the site of recombination, sequences of hybrid KIR genes and gene tandems were aligned with their putative donating gene sequences using MAFFT.34 Subsequently, the alignments were used for recombination analysis utilizing RDP4 (Recombination Detection Program version 4) with a sliding window of 30 bp.40 The start and end coordinates of the recombination intervals were determined with a 99% confidence interval.
Intersecting repeat elements and PRDM9 motifs with the recombination intervals
We used RepeatMasker (v4.1.4) utilizing the rmBLAST search engine and default settings to screen the generated consensus sequences of each KIR haplotype for repeats.41 The repeats were categorized into 3 classes: long interspersed elements (LINEs), such as the L1MA4 lineage; short interspersed elements (SINEs), like the AluSx subfamily; and long terminal repeats (LTRs), including MLT1D sequences. In addition, low-complexity stretches and simple repeats were also determined using the same tool. The repeat elements identified by RepeatMasker were listed into a BED file, containing information on the KIR haplotype ID, the coordinates of the repeat, and the type of repeat.
The occurrences of the DNA-binding motifs corresponding to the different PRDM9 allotypes were assessed for all assembled KIR haplotypes by utilizing the FIMO tool.42 The default settings were used, and the scan output was converted to a BED file, containing the KIR haplotype ID, the coordinates of the DNA-binding motif, and the ID of the corresponding PRDM9 allotype. The matches with the highest confidence levels (P value < 10 × 10⁻⁵ for rhesus macaque motifs and P value < 10 × 10⁻⁶ for human motifs) were split into a second BED file.
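A conversion of this kind can be sketched in Python as shown below. The sketch assumes the tab-separated FIMO output with the columns `motif_id`, `sequence_name`, `start`, `stop`, and `p-value`, and assumes that FIMO coordinates are 1-based and inclusive, so the start is shifted by one for the 0-based, half-open BED convention. The function name is hypothetical, and the P value cutoff is passed in by the caller rather than fixed.

```python
import csv


def fimo_to_bed(tsv_text: str, p_threshold: float):
    """Convert FIMO TSV text into BED-like tuples.

    Returns two lists of (haplotype ID, start, end, motif ID):
    all matches, and the subset with a P value below p_threshold.
    """
    # Drop blank lines and the trailing '#' comment lines of the FIMO output.
    lines = [ln for ln in tsv_text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    reader = csv.DictReader(lines, delimiter="\t")
    all_hits, best_hits = [], []
    for row in reader:
        record = (row["sequence_name"],
                  int(row["start"]) - 1,  # 1-based inclusive -> 0-based
                  int(row["stop"]),
                  row["motif_id"])
        all_hits.append(record)
        if float(row["p-value"]) < p_threshold:
            best_hits.append(record)
    return all_hits, best_hits
```

Keeping both lists corresponds to the two BED files described in the text: one with every match and one restricted to the highest-confidence matches.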
The coordinates of the repeat elements and the PRDM9 binding motifs were intersected with the coordinates of the recombination intervals on the KIR haplotypes using BEDTools.43 The “closest” flag was used to identify the features overlapping with the recombination interval, or, when no feature was overlapping, the nearest feature was determined.
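The logic of reporting an overlapping feature, or otherwise the nearest one, can be sketched in Python. This is a simplified stand-in for the BEDTools operation, not its implementation. It assumes 0-based, half-open coordinates on a single haplotype, counts adjacent features as 1 bp apart so that only true overlaps yield 0, and uses a sign convention chosen for this sketch (negative when the feature lies at lower coordinates than the interval), which is not claimed to match the convention used in the tables. The function names are hypothetical.

```python
def signed_distance(interval, feature) -> int:
    """Distance in bp from a recombination interval to a feature.

    Both arguments are (start, end) pairs. Returns 0 on overlap, a negative
    value if the feature lies before the interval, positive if after.
    """
    i_start, i_end = interval
    f_start, f_end = feature
    if f_start < i_end and i_start < f_end:
        return 0
    if f_end <= i_start:
        return -(i_start - f_end + 1)
    return f_start - i_end + 1


def closest_feature(interval, features):
    """Return (feature, signed distance) for the nearest feature.

    `features` is a list of (start, end, name) tuples.
    """
    best = min(features,
               key=lambda f: abs(signed_distance(interval, (f[0], f[1]))))
    return best, signed_distance(interval, (best[0], best[1]))
```

A distance of 0 therefore marks a feature that intersects the recombination interval, and any other value gives the gap to the nearest feature on either side.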
Scanning haplotypes for recombination-associated features
Based on the intersections with recombination intervals, 3 LTR elements and 2 broad groups of PRDM9 binding motifs were determined to represent recombination-associated features in rhesus macaques. The 3 LTR elements, 1 within intron 3 and 2 within intron 6, were aligned and plotted in a phylogenetic tree, applying the Jukes-Cantor and Kimura substitution models. Both models demonstrated similar clustering, substantiated by high bootstrap values. The sequence alignment and phylogenetic tree were combined into an illustration using the ggtree package in R.44 The 2 PRDM9 motif groups represent predicted binding specificities of 6 allotypes, including the 2 predominant variants in our characterized rhesus macaque cohort. In addition to these features, sequence motifs were determined that were present in all or in a selection of the recombination intervals, referred to as interval motifs. This motif discovery was performed by utilizing the MEME tool, either forcing the tool to discover motifs shared by all intervals (OOPS flag) or by a selection of the intervals (ZOOPS flag).45 The interval motifs with high confidence (P value < 1 × 10⁻²⁰) were selected, aligned, and plotted into a phylogenetic tree as described previously.
The distribution of the recombination-associated features was assessed for all assembled KIR haplotypes. The LTR sequences identified in rhesus macaques were aligned to the haplotypes using minimap2, with the “asm5” flag to only map at a threshold of 95% sequence similarity. The coordinates of the LTRs were extracted from the BAM files and saved as a BED file. In addition, the rhesus macaque and human KIR haplotypes were also scanned for the 2 groups of PRDM9 motifs and the 7 human PRDM9 motifs reported by Altemose et al.,46 respectively, using the FIMO tool as described above. The rhesus macaque KIR haplotypes were scanned for the 2 sets of interval motifs as well. To uncover the distribution, the number of occurrences was determined per feature and per KIR gene. This distribution was visualized with a custom script that utilizes the different generated BED files and the circlize package in R.47
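Tallying the occurrences per feature and per KIR gene amounts to an overlap count between two coordinate lists. The Python sketch below is an illustration under stated assumptions and is not the custom script used in the study: gene and feature coordinates are taken to be 0-based, half-open, and located on the same haplotype, and the function name and tuple layout are hypothetical.

```python
from collections import Counter


def count_features_per_gene(genes, features) -> Counter:
    """Count overlapping features per (gene name, feature type) pair.

    `genes` holds (start, end, gene name) tuples and `features` holds
    (start, end, feature type) tuples from the same haplotype.
    """
    counts = Counter()
    for f_start, f_end, f_type in features:
        for g_start, g_end, g_name in genes:
            if f_start < g_end and g_start < f_end:
                counts[(g_name, f_type)] += 1
    return counts
```

The resulting counts could then be passed to a plotting routine; features that fall in intergenic sequence are simply not counted in this sketch.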
Results
Characterization of rhesus macaque KIR haplotypes using long read sequencing
The genomic organization of various human KIR haplotypes is well documented,14,16,48–50 whereas the characterization of the rhesus macaque equivalent remains relatively limited.23 Initially, we have deduced over 100 distinct rhesus macaque KIR haplotypes utilizing full-length transcript sequencing in concert with segregation analysis.18,19,24 This approach revealed the presence of variable numbers and combinations of transcribed KIR genes per deduced region configuration, in addition to allelic polymorphism. We selected 13 of these deduced rhesus macaque haplotypes that display strong indications of recombination events for further analysis. As most macaques are heterozygous for their KIR haplotypes, 3 additional KIR regions with putative standard organizations complemented the selection. The detailed genetic organization of these 16 distinct genomic KIR regions was resolved using long-read sequencing on an ONT platform in combination with adaptive sampling (Fig. 1). In addition, we included 5 KIR haplotypes that we characterized utilizing a Cas9-mediated enrichment protocol.23

Overview of the rhesus macaque KIR haplotype panel. The 16 rhesus macaque KIR haplotypes are illustrated together with the 5 previously characterized haplotypes (indicated with an asterisk).23 Haplotype H15 was characterized in both studies. (Top) A schematic outline is depicted of a putative standard region organization. In this standard organization, KIR3DL20 (yellow) and KIR2DP (gray) are framework genes present in the centromeric region. A third gene, KIR1D, is present on approximately half of the haplotypes. The telomeric region features the presence of KIR2DL04 (green) and a diverse set of KIR3D genes, with inhibitory (blue) or activating (red) potential. Five KIR haplotypes follow the putative standard organization. The other haplotypes are classified based on 3 markers of meiotic recombination: the presence of hybrid KIR genes or multiple copies of particular KIR genes, or deviating region organizations. On 10 haplotypes, 1 or 2 hybrid KIR genes (orange) were identified. Two or more copies of a particular gene (bold outlined boxes) were found on 7 KIR haplotypes. The standard organization, with a clear separation of centromeric and telomeric haplotype segments, deviated on 3 haplotypes. Several haplotypes display multiple indications for recombination (e.g. haplotype H107 contains hybrid KIR genes and multiple copies of genes).
As compared with humans, the rhesus macaque KIR haplotypes display a condensed centromeric and a variably expanded telomeric region, which are separated by a noncoding stretch of DNA. Two framework genes, KIR3DL20 and the pseudogene KIR2DP, are present on nearly all centromeric regions, except for configurations where these loci have been involved in recombination events (H14, H16, H19). The only macaque KIR gene that shares an apparent ortholog with humans is KIR2DL04, which is encountered on most configurations at the start of the telomeric segment. The shortest haplotype contains 5 KIR genes (H4-C) and spans 150 kb, whereas the largest region comprises 17 genes (H107) and measures 243 kb. Except for the conserved KIR2DP gene, no other pseudogenes are identified, which indicates a well-orchestrated and highly efficient recombination machinery that diversifies the KIR gene cluster. This contrasts, for example, with the macaque MHC class II region that has many pseudogenes and truncated remnants.51
To elucidate the molecular mechanisms driving diversification, we set out to identify potential recombination hotspots within the panel of rhesus macaque KIR haplotypes and 14 distinct human KIR region configurations, representing group A and B haplotypes that were previously published by our team and other groups (Fig. 2).23,48,49

Overview of the selected human KIR haplotypes. In total, 14 distinct human KIR region configurations were included in our comparative study. For each configuration, the corresponding accession number is provided, together with the scientific haplotype designation, indicating the diverse centromeric and telomeric segments. The group A and group B haplotype classification for haplotypes encoding a more inhibitory or activating potential, respectively, are indicated in the designation of the centromeric or telomeric haplotype segments. Each region contains KIR3DL3 (yellow), whereas most haplotypes also contain the other framework genes, KIR3DP1 (gray), KIR2DL4 (green), and KIR3DL2 (blue). The genes encoding inhibitory and activating receptors are illustrated with blue and red boxes, respectively. Most haplotypes contain 2 pseudogenes (gray), whereas copies of KIR2DL5 (green) are only identified on group B haplotype segments. Hybrid genes (orange) are identified on 5 region configurations. The large noncoding stretch separating the centromeric and telomeric haplotype segments is depicted as an interrupting break. One haplotype, cA01-tB04, contains 2 telomeric haplotype segments, and is the only configuration that contains multiple copies of particular KIR genes.
Primate KIR haplotypes display markers of recombination events
Meiotic recombination events that have occurred within the KIR region are characterized by 3 distinct haplotype features: (1) the presence of hybrid KIR genes, in which a donor and acceptor gene were fused in frame; (2) the presence of multiple copies of a specific KIR gene; and (3) the emergence of novel haplotype organizations (Fig. 1). The first haplotype feature marks meiotic rearrangements within genes, whereas the other features may also originate from recombination events in intergenic sequences. In our panel of rhesus macaque KIR haplotypes, we identified 11 hybrid KIR genes, originating from distinct fusion events, distributed over 10 haplotypes (Fig. 1 and Table 1). Three haplotypes, H107, H121, and H130, contain 2 hybrid KIR genes. The generation of these hybrid entities involved 12 different donor genes, including pseudogene Mamu-KIR2DP. Most frequently involved in intragenic recombination events are KIR3DL05 and KIR3DL08, suggesting the presence of strong motifs that promote recombination within their sequences. On 7 haplotypes the presence of multiple copies of 1 or more similar KIR genes indicates the introduction of genes by recombination events. For example, haplotype H9 has 3 copies of KIR3DL07 and 2 copies of KIR3DL05 and KIR3DS02, respectively. We hypothesize that this region configuration resulted from different successive chromosomal rearrangements. Copies of KIR3DL07 and 3DS02 are frequently involved in the expansion of region configurations, again suggesting the presence of strong recombination motifs. Three macaque KIR haplotypes display a peculiar region organization (H14, H16, H19), as their centromeric and telomeric regions are fused, lacking the large noncoding stretch that generally separates KIR2DP and KIR2DL04 (Fig. 1).
Recombination intervals identified for the hybrid KIR genes in rhesus macaques.
Haplotype | Hybrid gene | Donor 1 | Exons (donor 1) | Donor 2 | Exons (donor 2) | Recombination interval 99% CI | Interval length | Interval location | Repeat | Distance repeat | PRDM9 motif | Distance motif |
---|---|---|---|---|---|---|---|---|---|---|---|---|
H4-C | 3DL20–1D | 3DL20 | 1–9 | 1D | 9 | 15,548–15,754 bp | 206 bp | Intron 8/Exon 9 | LTR | −797 bp | New10 | 0 bp |
H4-D, H130 | 3DLW43 | 3DL02 | 1–3 | 3DL08 | 4–9 | 3,396–3,468 bp | 72 bp | Intron 3 | LTR | 0 bp | New2 | −95 bp |
H14 | 3DL20–2DL04 | 3DL20 | 1–7 | 2DL04 | 8 and 9 | 12,855–12,947 bp | 92 bp | Intron 7 | LTR | −275 bp | New1 | −81 bp |
H16 | 3DLW36 | 3DL08 | 1–3 | 3DL05 | 4–9 | 3,963–4,101 bp | 138 bp | Intron 3 | LTR | 0 bp | New7 | −296 bp |
H19 | 2DP/3DL07 | 2DP | — | 3DL07 | 5–9 | 1,834–1,879 bp | 46 bp | — | SINE | −1,182 bp | New2 | +91 bp |
H27 | 3DLW44 | 3DL01 | 1–4 | 3DL08 | 5–9 | 4,754–4,989 bp | 235 bp | Intron 4 | LTR | −774 bp | New1 | +22 bp |
H87 | 3DLW47 | 3DL05 | 1–3 | 3DL07 | 4–9 | 5,625–6,068 bp | 443 bp | Intron 3/Exon 4 | LTR | −369 bp | New10 | 0 bp |
H107 | 3DLW45 | 3DL05 | 1–3 | 3DL10 | 4–9 | 4,480–4,761 bp | 281 bp | Intron 3 | LTR | −312 bp | New10 | +183 bp |
H121 | 3DLW45/3DSW08 | 3DLW45 | 1–4 | 3DSW08 | 4–9 | 5,071–5,267 bp | 196 bp | Exon 4 | LTR | −405 bp | New6 | +166 bp |
H121 | 3DLW51 | 3DL08 | 1–4 | 3DL07 | 4–9 | 4,964–5,081 bp | 117 bp | Exon 4 | LTR | −597 bp | New1 | 0 bp |
H130 | 3DS05/2DL04 | 3DS05 | 1–6 | 2DL04 | 7–9 | 11,437–11,586 bp | 149 bp | Intron 6 | LTR | 0 bp | New6 | +204 bp |
A list is provided for all hybrid KIR genes identified in our rhesus macaque cohort, indicating the donating genes, the involved exons, the coordinates of the identified recombination interval in the pairwise alignment, determined at a 99% CI, and the length and position of the intervals. In addition, the class of the intersecting repeat element is indicated, or the position of the closest element when no repeat was intersecting. Furthermore, the PRDM9 allotypes whose DNA-binding motifs intersect with a recombination interval are listed, or, when no overlapping motif was identified, the distance to the nearest motif is provided. The positions of the closest repeat elements and PRDM9 motifs are denoted by a positive or negative value, indicating the sequence's location upstream or downstream of the recombination interval, respectively.
Abbreviation: CI, confidence interval.
The 3 markers of recombination events were also recorded on the selected human KIR haplotypes (Fig. 2). Four distinct hybrid KIR genes were present on 5 region configurations (Table 2), each of them displaying a reduced gene content (Fig. 2). Three of those hybrid KIR entities emerged from reshuffling the segments encoding the binding and signaling domains, while the hybrid KIR2DS1 gene experienced rearrangement only in the exons encoding the leader peptide, originating from KIR2DL1. Several human region configurations containing multiple copies of particular KIR genes have been deduced using standard typing methods in combination with segregation studies,15 but only 1 of these has been completely characterized at the genomic DNA level (cA01-tB04) (Fig. 2). This expanded region configuration is characterized by the introduction of an entire telomeric segment (tB01). The other selected human KIR region configurations represent standard organizations (groups A and B) or display minor alterations with regard to gene content.
Recombination intervals identified for the hybrid KIR genes on the selected human haplotypes.
Haplotype | Acc. number | Hybrid gene | Donor 1 | Exons | Donor 2 | Exons | Recombination interval 99% CI | Interval length | Interval location | Repeat | Distance repeat | PRDM9 motif | Distance motif |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
cA01-tA02 | MN167530 | 3DL2 | 3DL1 | 1–5 | 3DL2 | 6–9 | 8,734–8,849 bp | 115 bp | Intron 5 | LINE | 0 bp | Human #6 | −115 bp |
cA03-tB02 | KP420443 | 2DS1 | 2DL1 | 1 and 2 | 2DS1 | 3–9 | 1,318–3,067 bp | 1,749 bp | Intron 2 | LTR, SINE | 0 bp | Human #1, 2, 3, 4, 5, 6, 7 | 0 bp |
cA04 | KU645198 | 2DL1 | 2DL1 | 1–6 | 3DL2 | 7–9 | 11,207–11,438 bp | 231 bp | Intron 6 | SINE | +156 bp | Human #4, 6 | 0 bp |
cB05-tA01 | KP420437 | 2DS2*005 | 2DS2 | 1–6 | 2DS3 | 7–9 | 12,557–13,169 bp | 612 bp | Intron 6 | LTR | 0 bp | Human #1 | −31 bp |
cB05-tB01 | NA24385 |
A list is provided for all hybrid KIR genes identified on the selected human haplotypes, indicating the accession number of the involved haplotype, the donating genes, the involved exons, the coordinates of the determined recombination interval at a 99% confidence interval (CI), and the length and position of the intervals. In addition, the class of the intersecting repeat element is indicated, or the position of the closest element when no repeat intersected. Furthermore, the PRDM9 allotypes whose DNA-binding motifs intersect with a recombination interval are listed, or, when no overlapping motif was identified, the distance to the nearest motif is provided. Multiple repeat elements and/or PRDM9 motifs were found to intersect with 2 recombination intervals. The positions of the closest repeat elements and PRDM9 motifs are denoted by a positive or negative value, indicating the sequence's location upstream or downstream of the recombination interval, respectively.
High KIR gene density and sequence similarity may impact recombination rates
In most organisms, an elevated recombination rate is positively correlated with gene density.52–54 The architecture of the KIR region is characterized by a head-to-tail organization in which densely packed genes are separated only by relatively short intergenic stretches. The studied rhesus macaque KIR haplotypes display a significantly higher gene density than their human counterparts (Fig. 3). This difference is mostly due to the longer sequences encountered for introns 5 and 6 in the human KIR genes, which extend the haplotype length. In comparison with other segmentally duplicated immune regions, such as the genes encoding the LILRs (leukocyte immunoglobulin-like receptors), located adjacent to the KIR cluster, and the MHC-DRB genes, the physical proximity of genes is 2-fold higher in the KIR regions of both humans and macaques (Fig. 3). Whether the significantly denser arrangement of rhesus macaque KIR genes, relative to their human equivalents, increases recombination activity remains unresolved.

A comparison of the gene density across the KIR, LILR, and MHC-DRB regions of humans and rhesus macaques. The gene density (genes/Mb) is plotted for the KIR, LILR, and MHC-DRB regions in humans (red) and rhesus macaques (blue). The level of significance between the 2 species is indicated with asterisks (*P < 0.05, ****P < 0.00005). The overall gene densities of the rhesus macaque and human reference genomes are indicated with dashed lines.
Sequence similarity and recombination rates are also interconnected factors that modulate the genomic landscape.55 The KIR genes and their associated intergenic sequences cluster independently in phylogenetic trees (Fig. S2), reflecting a complex evolutionary history as evidenced by distinctive structural diversity (indels) and point mutations. Beyond these variations, numerous short and long intervals demonstrate a high degree of sequence similarity across KIR genes in rhesus macaques and humans, encompassing exons, introns, and intergenic regions (Fig. S3 and Tables S2 and S3). These highly similar stretches may facilitate unequal alignment of homologous chromosomes during meiosis, thereby enabling the rearrangement of genetic material. In humans, for example, the intergenic sequences associated with KIR2DL5B and 3DP1 are highly similar (Fig. S4), which may facilitate recombination events that reduce the gene content of the centromeric cB02 region (Fig. 2). Similarly, in rhesus macaques, KIR3DS02 is often involved in rearrangements and shares at least 1 identical stretch of 200 bp with 13 different KIR genes. In contrast, KIR3DL20 and KIR2DL04 are stable framework genes on most macaque haplotypes and share identical sequences with only 1 other KIR gene, KIR3DS06 and KIR3DS01, respectively. However, these latter genes are not involved in the formation of the hybrid entities that are documented for the framework genes. Additionally, more identical stretches are shared among sets of human KIR genes than among rhesus macaque genes (Table S3), but there are no indications that they frequently trigger recombination events at present. Therefore, it remains uncertain whether the high sequence similarity between KIR genes contributes to local recombination activity.
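The notion of an identical stretch shared between 2 genes can be made concrete with a short sketch. This is not the procedure used in the study; it is a minimal dynamic-programming illustration on toy sequences, and the function name `longest_shared_stretch` is an assumption introduced here.

```python
def longest_shared_stretch(seq_a: str, seq_b: str) -> tuple[int, int, int]:
    """Return (length, start_in_a, start_in_b) of the longest identical stretch.

    Classic longest-common-substring recurrence: the cell for (i, j) holds the
    length of the identical run that ends at seq_a[i - 1] and seq_b[j - 1].
    """
    best_len, best_a, best_b = 0, 0, 0
    prev = [0] * (len(seq_b) + 1)
    for i in range(1, len(seq_a) + 1):
        curr = [0] * (len(seq_b) + 1)
        for j in range(1, len(seq_b) + 1):
            if seq_a[i - 1] == seq_b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_a = i - best_len
                    best_b = j - best_len
        prev = curr
    return best_len, best_a, best_b


# Toy sequences (hypothetical, not KIR data).
length, start_a, start_b = longest_shared_stretch("AAACGTACGTTT", "GGCGTACGTCC")
print(length, start_a, start_b)
```

With real gene sequences, a pair would count as sharing an identical stretch when the returned length reaches the chosen cutoff (200 bp in the text above). The quadratic run time is acceptable for a sketch but would be slow for full-length genes.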
The architectural features involving gene density and sequence similarity make the KIR regions in primates susceptible to meiotic recombination. However, these features alone do not account for the precisely orchestrated process, which only generates in-frame hybrid KIR genes and intact expanded and contracted haplotype organizations. Instead, it is likely that motifs at or near recombination sites play a more dominant role in guiding the rearrangement of KIR region configurations.
Determination of the recombination sites
Recombination sites, defined as short sequence intervals in which the DNA strand breaks during meiosis, can be identified within the KIR region through the pairwise alignment of rearranged segments and their potential precursor sequences. We mapped the recombination interval for the 11 rhesus macaque hybrid KIR genes in our panel (Table 1). For example, a recombination interval of 72 bp is located in intron 3 of KIR3DLW43, a hybrid entity that consists of segments originating from KIR3DL02 and KIR3DL08 (Fig. 4A). Other intervals range from approximately 35 to 450 bp in length and occur both in introns and exons. The length of recombination intervals increases with the level of sequence similarity between the 2 donating genes, as discriminative SNPs are required to pinpoint putative meiotic breakpoints. Recombination intervals were also determined for the 4 human hybrid KIR genes present in our panel (Table 2).
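Why the interval widens when the donors are more similar can be illustrated with a simplified sketch. It assumes 3 pre-aligned sequences of equal length and a single clean crossover, and it does not reproduce the RDP4 analysis or its confidence intervals; the function name and the toy sequences are hypothetical.

```python
def recombination_interval(hybrid: str, donor1: str, donor2: str):
    """Bound a single crossover by the flanking discriminative sites.

    Returns (last position supporting donor1, first position supporting donor2),
    0-based, or None when the data do not fit one 5'-donor1 / 3'-donor2 crossover.
    """
    if not (len(hybrid) == len(donor1) == len(donor2)):
        raise ValueError("sequences must be pre-aligned to equal length")

    donor1_sites, donor2_sites = [], []
    for pos, (h, a, b) in enumerate(zip(hybrid, donor1, donor2)):
        if a == b:
            continue  # site does not discriminate between the donors
        if h == a:
            donor1_sites.append(pos)
        elif h == b:
            donor2_sites.append(pos)

    if not donor1_sites or not donor2_sites:
        return None  # no evidence for a contribution from both donors
    last_d1, first_d2 = max(donor1_sites), min(donor2_sites)
    if last_d1 > first_d2:
        return None  # interleaved support: not a single clean crossover
    return last_d1, first_d2


# Toy example: the donors differ at positions 1, 7, and 10.
narrow = recombination_interval("ACGTACGAACCT", "ACGTACGTACGT", "ATGTACGAACCT")
# Same crossover, but the donors now differ only at positions 1 and 10.
wide = recombination_interval("ACGTACGTACCT", "ACGTACGTACGT", "ATGTACGTACCT")
print(narrow, wide)
```

The breakpoint can only be placed between the 2 nearest discriminative sites, so removing 1 informative SNP in the second call stretches the interval even though the underlying crossover is unchanged.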

Pairwise identity plots for the identification of recombination intervals. (A) The pairwise sequence identity of KIR3DL08 and KIR3DL02, KIR3DL08 and KIR3DLW43, and KIR3DL02 and KIR3DLW43 are depicted by yellow, blue, and red lines, respectively. (B) A similar plot is generated for the alignments of 3 gene tandems, involving KIR3DL05–KIR3DL07, KIR3DL01–KIR3DS02, and KIR3DL05–KIR3DS02. The identified recombination intervals, containing the putative double-stranded break, are indicated at 95% and 99% confidence intervals by gray highlighting. The nucleotide position is illustrated on the x-axis, with boxes indicating the position of the different exons in the sequence alignments. The striped black bar at the top represents regions where the aligned sequences deviate from one another, which helps to determine the recombination interval. The plots are generated using RDP4.40
In addition to the generation of hybrid genes, recombination events are also responsible for the expansion and contraction of KIR region configurations by the introduction and deletion of genes. The breakpoints of these rearrangements are likely situated near or within intergenic sequences. To identify such recombination intervals, we first assessed the occurrence of pairs of neighboring KIR genes. Notably, most KIR genes exhibit strong linkage with their adjacent gene (Table S4). For example, KIR3DL08 is adjacent to KIR3DL01 in all of the haplotypes containing this gene in our macaque panel. Uncommon KIR gene tandems may mark meiotic recombination events; 10 such tandems were present in our cohort (Table 3). For instance, KIR3DL07 is often linked to the 3′ side of KIR3DL05, but in one case this position was occupied by KIR3DS02 (H69), which may indicate a recombination event. The associated breakpoint is probably located in a recombination interval of 382 bp that was determined for the KIR3DL05–KIR3DS02 tandem in intron 6 of the former gene (Fig. 4B). Most recombination intervals in the other potentially rearranged gene tandems also start within intron 6 of the former gene and may extend into introns 7 and 8. The location of this putative hotspot is supported by pairwise sequence alignments of 3 distinct gene tandems involving KIR2DL04 (Fig. S5). The recombination events mapping to intron 6 may have generated previously unrecognized hybrid KIR genes, in which the exons encoding the signaling domains are swapped. This intragenic process is further substantiated in humans by the tight linkage of intergenic sequences and their associated genes (Fig. S4), suggesting a restriction on the shuffling of KIR promoters. Two of the human hybrid KIR genes, KIR2DL1 (KU645198) and KIR2DS2 (KP420437), also share recombination intervals within intron 6, but this hotspot seems to be less active.
Recombination intervals identified by the alignment of KIR gene tandems in rhesus macaques.
Gene couple | Donor 1 | Donor 2 | Recombination interval 99% CI | Interval length | Location | Repeat | Distance repeat | PRDM9 motif | Distance motif |
---|---|---|---|---|---|---|---|---|---|
3DL01–3DS02 | 3DL01–3DS06 | 3DL05–3DS02 | 9,815–10,925 bp | 1,109 bp | Intron 6–Intron 8 | LTR | 0 bp | New7 | 0 bp |
3DS02–3DS04 | 3DS02–3DS06 | 3DS02–3DS04 | 12,053–13,666 bp | 1,614 bp | Intron 7–Intergenic region | LINE | 0 bp | New3/New7 | 0 bp |
3DS02–3DL05 | 3DS02–3DS06 | 3DS04–3DL05 | 12,215–12,831 bp | 617 bp | Intron 6 | LTR | 0 bp | New4/New9 | 0 bp |
3DL05–3DL07 | 3DL05–3DS02 | 3DL01–3DS02 | 9,681–10,062 bp | 382 bp | Intron 6 | LTR | 0 bp | New7 | 0 bp |
3DL02–3DL07 | 3DL02–3DS01 | 3DL05–3DL07 | 11,544–12,014 bp | 471 bp | Exon 9–intergenic region | LINE | 0 bp | New10 | +107 bp |
3DL07–3DL07 | 3DL07–3DSW09 | 3DL05–3DL07 | 14,564–15,054 bp | 490 bp | Intron 6–Intron 7 | LTR | −64 bp | New7 | −194 bp |
3DLW45–3DL07 | 3DLW45–3DL01 | 3DL02–3DL07 | 9,560–9,834 bp | 275 bp | Intron 5–Intron 6 | SINE | 0 bp | New7 | +542 bp |
3DS01–3DL05 | 3DS01–3DL07 | 3DS02–3DL05 | 14,543–15,430 bp | 888 bp | Intron 6–Intron 7 | LTR | 0 bp | New2/New9 | 0 bp |
3DS04–3DL07 | 3DS04–3DL05 | 3DS01–3DL07 | 15,131–15,794 bp | 663 bp | Intron 6–Intron 7 | LTR | 0 bp | New2/New9 | −330 bp |
3DS05–3DLW47 | 3DS05–3DL10 | 3DS02–3DL05 | 13,093–13,831 bp | 730 bp | Intron 6–Intron 7 | LTR | 0 bp | New9 | −283 bp |
Several recombination intervals were revealed by the pairwise alignment of KIR gene tandems. The 3 involved gene tandems are listed, together with the coordinates of the determined recombination interval at a 99% CI, and the length and location of the interval. In addition, the class of the intersecting repeat element is indicated, or the position of the closest repeat element when no repeat intersected. Furthermore, the PRDM9 allotypes whose DNA-binding motifs intersect with a recombination interval are listed, or, when no overlapping motif was identified, the distance to the nearest motif is provided. The positions of the closest repeat elements and PRDM9 motifs are denoted by a positive or negative value, indicating the sequence's location upstream or downstream of the recombination interval, respectively.
Abbreviation: CI, confidence interval.
Although precise meiotic breakpoints could not be determined, the identified recombination intervals indicate that all rearrangements in the macaque KIR region generate hybrid gene entities. Aligning sequence features that may promote recombination with the defined intervals might pinpoint the elements that contribute to the dynamic nature of the macaque and human KIR regions.
DNA-binding motifs of rhesus macaque and human PRDM9 allotypes
An important modulator of meiotic recombination events, PRDM9, exhibits allelic variation in its zinc finger array, which may affect its DNA recognition motif.56 So far, only a single rhesus macaque PRDM9 sequence has been published, and the extent of allelic variability remains to be explored. We investigated the genetic diversity of the PRDM9 zinc finger array in 55 Indian rhesus macaques and identified 11 novel alleles, 2 of which were predominantly present in our cohort (Fig. 5). The nucleotide variation is mostly confined to the stretches encoding the different zinc fingers, and in particular concentrates around the 3 DNA-contacting residues (Fig. S1). The discovered alleles encode 8 distinct zinc finger arrays, and their corresponding DNA-binding motifs were predicted computationally (Fig. 6). This revealed 4 broad motif groups predicted to be binding targets for the different PRDM9 allotypes detected in rhesus macaques. The 2 most frequent alleles in our cohort, New1 and New2, have distinct predicted binding specificities.

The zinc finger arrays of different PRDM9 alleles. The zinc finger arrays corresponding to different rhesus macaque (top) and human (bottom) PRDM9 alleles are illustrated. Each zinc finger is depicted as an oval, in which the 3 DNA-contacting residues (1, 3, and 6) are provided. Different zinc fingers are indicated by distinct colors. The 2 rhesus macaque zinc fingers (SFQ, DSY) with a bold lining are also found in human PRDM9 zinc finger arrays. The frequency of each allele in the characterized rhesus macaque cohort is provided, demonstrating 2 dominant allotypes. Three alleles, PRDM9*New1, PRDM9*New8, and PRDM9*New11, encode identical zinc finger arrays.

Phylogenetic tree of the predicted PRDM9 binding motifs in rhesus macaques. Eight distinct zinc finger arrays are encoded by the PRDM9 alleles identified in the panel of 55 rhesus macaques. Their amino acid sequences were used to determine the different zinc finger arrays, from which the binding motifs were predicted using a software tool.37 The motifs are depicted as sequence logos, in which the height of the letters positively correlates with their confidence level. The different motifs were aligned and plotted into a phylogenetic tree together with 1 human PRDM9 motif for comparison. Four broad motif groups could be identified, numbered in a sequential order. The number of occurrences of a motif from a specific group overlapping with or near a recombination interval is indicated within brackets, identifying PRDM9 motif groups 1 and 2 to be predominantly present at the recombination hotspots in the KIR region of rhesus macaques.
In humans, over 40 PRDM9 alleles have been recorded.57,58 The DNA recognition motifs of the 3 most prevalent alleles, designated as PRDM9-A, PRDM9-B, and PRDM9-C, have been examined extensively.46,57 The A and B alleles broadly recognize the same 13-mer sequence and are predominant in non-African populations. Within African cohorts, the PRDM9-A allele occurs at a frequency of 50%, complemented by a 12% prevalence of PRDM9-C and a wide diversity of alleles with lower frequencies.59 Seven motifs with a close internal match to the 13-mer sequence are recognized by the PRDM9-B allele.46 These binding motifs are likely shared with PRDM9-A, which only differs by 1 DNA-contacting residue in its zinc finger array.
The PRDM9 binding domains in rhesus macaques may contain 9 to 11 zinc fingers, whereas the 3 prevalent human alleles exhibit 13 or 14 zinc fingers (Fig. 5). In addition, except for the proximal zinc finger that is not involved in DNA-binding, only 1 zinc finger (DSY) is conserved in the arrays of rhesus macaques and humans, sharing identical DNA-contacting residues. These observations support the distinct binding motifs that are determined (or predicted) for their PRDM9 alleles (Fig. 6). Consequently, different recombination landscapes are expected for rhesus macaques and humans, which is in line with a comparative study highlighting a variable distribution of recombination hotspots.60 This suggests that allelic variability of PRDM9 may impact the differential diversification of KIR haplotypes at a species level.
Repeat elements and PRDM9 motifs associate with recombination intervals
A total of 21 recombination intervals were identified in the panel of rhesus macaque KIR haplotypes, whereas 4 intervals were determined in the selected human equivalents. For each interval, we assessed the proximity to, or overlap with, repeat elements and PRDM9 motifs.
Twelve macaque intervals intersected with repeat elements, including all recombination sites within intron 6, except for 1 (Tables 1 and 3). Predominantly, the repeat elements represent LTRs, complemented by 2 LINE and 1 SINE repeats. Two distinct LTR sequences starting in intron 6 comprise a relatively short stretch of approximately 240 bp and a longer one extending up to 910 bp, respectively (Fig. S6). Both LTRs in intron 6 differ from the intersecting LTRs present in intron 3. An example of this latter LTR maps within the interval determined for the hybrid KIR3DLW43 gene (Fig. 7). If not overlapping, the nearest repeat elements are situated at an average distance of 551 bp from the recombination intervals, with proximities ranging from 64 to 1,182 bp. Notably, all of these proximal elements belong to the LTR family, with a single exception that belongs to the SINE family (Table 1). Three of the 4 human recombination intervals intersected with a repeat element from the SINE, LINE, or LTR class (Table 2). The largest interval aligned with 2 repeat elements, a SINE and an LTR. The LTR found in intron 6 of the hybrid KIR2DS2*005 gene exhibits a sequence similarity of up to 88% with its counterparts in rhesus macaques, indicating a retroviral element that integrated long before speciation. These types of ancient LTR integrations are more often associated with the diversification of immune regions, as has been demonstrated for the MHC-DRB genes in rhesus macaques.51

Distribution of repeat elements and PRDM9 motifs in KIR3DLW43. A schematic overview in which the sequence of KIR3DLW43 is circularized, with the exons (dark gray) and recombination interval (red) indicated at scale in the outer ring. The distribution of the different repeat elements, including LTRs, LINEs, and SINEs, is indicated in the first inner circle, whereas the positions of the different PRDM9 binding motifs (blue colors) are provided in the second inner circle. The recombination interval intersects with an LTR element, whereas a DNA-binding motif corresponding to PRDM9*New2 is found 95 bp upstream (Table 1).
Eight recombination intervals in rhesus macaques intersect with a predicted PRDM9 motif, 5 of which also align with a repeat element (Tables 1 and 3). Most intersecting binding motifs correspond to PRDM9*New7, PRDM9*New10, and PRDM9*New9 (Fig. 6). When no motif overlapped with a recombination interval, we identified the nearest motif at an average distance of 200 bp, with proximities ranging from 22 to 542 bp. For example, a motif corresponding to PRDM9*New2 was identified 95 bp upstream of the recombination interval of KIR3DLW43 (Table 1 and Fig. 7). Hence, PRDM9 typically does not bind directly at the exact site where double-stranded breaks form during meiosis. Instead, it commonly binds to DNA sequences nearby, usually within a few hundred base pairs from the double-stranded break site, as was previously documented in humans.60,61 Consequently, proximal motifs that do not directly overlap with a recombination interval might also contribute to a recombination event. Sets of similar PRDM9 binding motifs, categorized into motif groups 1 and 2, were primarily found near or overlapping with the 21 recombination intervals (Tables 1 and 3 and Fig. 6). In contrast, the motifs associated with the New3 and New4 allotypes were only identified once near an interval. The predominant motif groups both associate with 1 of the 2 most frequently identified PRDM9 alleles (New1 and New2) present in our cohort of macaques. The DNA-binding motifs of human PRDM9 intersected with 2 recombination intervals, whereas for the other 2 putative rearrangements a motif was identified in close proximity (Table 2). Given the slight differences among the human PRDM9 binding specificities, multiple motifs intersected with 2 of the intervals.
When neither a repeat element nor a PRDM9 recognition motif overlapped with a recombination interval in rhesus macaques, which was the case for 6 intervals, the nearest feature was identified at a minimal distance ranging from 22 to 183 bp. This mostly involved a proximal PRDM9 binding motif.
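The overlap counts and motif distances reported in this section can be re-derived from the distance columns of Tables 1 and 3. The sketch below is illustrative only: the tuples were transcribed by hand from those tables as (repeat distance, PRDM9 motif distance) in bp, with 0 denoting overlap and signs discarded, and all variable names are assumptions rather than part of the original analysis.

```python
from statistics import mean

# (repeat distance, PRDM9 motif distance) in bp; 0 means the feature overlaps.
# Rows follow the order of Table 1 (11 hybrid genes) and Table 3 (10 tandems).
TABLE_1 = [(797, 0), (0, 95), (275, 81), (0, 296), (1182, 91), (774, 22),
           (369, 0), (312, 183), (405, 166), (597, 0), (0, 204)]
TABLE_3 = [(0, 0), (0, 0), (0, 0), (0, 0), (0, 107), (64, 194),
           (0, 542), (0, 0), (0, 330), (0, 283)]
intervals = TABLE_1 + TABLE_3

# Intervals that intersect a repeat element, a PRDM9 motif, or both.
repeat_overlap = sum(1 for repeat, _ in intervals if repeat == 0)
motif_overlap = sum(1 for _, motif in intervals if motif == 0)
both_overlap = sum(1 for repeat, motif in intervals if repeat == 0 and motif == 0)

# Distance to the nearest motif when no motif overlaps the interval.
motif_gaps = [motif for _, motif in intervals if motif != 0]

# Nearest feature of either kind when neither overlaps the interval.
nearest_when_neither = [min(repeat, motif) for repeat, motif in intervals
                        if repeat != 0 and motif != 0]

print("intervals:", len(intervals))
print("repeat overlap:", repeat_overlap)
print("motif overlap:", motif_overlap, "of which with repeat:", both_overlap)
print("mean motif gap:", round(mean(motif_gaps)), "bp",
      "range:", min(motif_gaps), "-", max(motif_gaps))
print("neither overlaps:", len(nearest_when_neither),
      "range:", min(nearest_when_neither), "-", max(nearest_when_neither))
```

Discarding the sign keeps the summary independent of the upstream/downstream convention used in the tables.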
Predicting putative recombination sites on KIR haplotypes
The characterization of complete KIR haplotypes facilitated the determination of recombination intervals and the identification of diverse molecular features associated with them. By mapping these features on the genomically characterized KIR haplotypes, we could assess the presence of other potential breakpoints that may initiate recombination during meiosis. In addition, we determined 2 sequence motifs that recurred in all 21 recombination intervals in rhesus macaques, as well as 2 motifs that were recurring in at least 6 of the intervals (Fig. S7). The low conservation across all recombination intervals suggests that diversification of the KIR region is not driven by a single molecular feature. Nevertheless, these recurring motifs might serve as indicators for the presence of potential recombination hotspots on the KIR haplotypes.
Mapping LTR sequences, PRDM9 binding motifs, and recurring interval motifs onto the KIR haplotypes revealed variable distributions of recombination-associated features at the gene level, exemplified by rhesus macaque KIR haplotype H5 (Fig. 8 and Table 4). Overall, the LTR element intersecting with recombination intervals in intron 3 was found in 9 distinct KIR genes (Table 4). Other genes may have an LTR element at this position, but these are distinct from the sequence that was identified to intersect with recombination intervals (<95% sequence similarity). In our macaque panel, 3 hybrid KIR genes originate from a rearrangement with a double-stranded break in intron 3 (Table 1), and their generation may have been facilitated by the presence of these LTR elements. Other recombination-promoting LTR elements are situated in intron 6 and are present in the majority of KIR genes, appearing as either short or long variants. These highly similar repeats possibly contribute to the continuous expansion and contraction of KIR haplotypes by promoting the reshuffling of genes through recombination. This observation is substantiated by the highly conserved organization of the centromeric haplotype, containing KIR3DL20 and KIR1D, which both lack the recombination-associated LTR element in intron 6. Furthermore, KIR3DL08, also lacking the LTR in intron 6, consistently appears adjacent to the 5′ site of KIR3DL01 (Table S4). This suggests that any gene lacking this repeat element may not participate in rearrangements. Approximately two-thirds of the macaque KIR genes possesses at least 1 PRDM9 motif with high confidence, mostly involving recognition patterns of the PRDM9*New2, PRDM9*New6, and PRDM9*New9 (group 2) allotypes. Most recognition motifs are located in introns 4 and 6 and thereby may contribute to the formation of hybrid KIR genes and the expansion and contraction of haplotypes. 
This is supported by the observation that most PRDM9 motifs are located in KIR3DS02, which is indeed frequently involved in the expansion of haplotypes. Predicting the potential involvement of PRDM9 motifs identified with lower confidence in initiating recombination events is difficult. However, the detection of some of these motifs near recombination intervals makes it plausible that they play an active role as well. The motifs that were generated from the 21 recombination intervals in rhesus macaques, or a selection thereof, were identified in most KIR genes. These motifs mostly mapped to exon 4, the end of intron 5, and intron 6, predicting potential recombination hotspots that are supported by several detected recombination events.

Fig. 8. Distribution of recombination-associated features on rhesus macaque KIR haplotype H5. The outer ring represents the sequence of KIR haplotype H5, with the introns (light yellow) and exons (dark yellow) indicated at scale. The first inner circle illustrates the distribution of the recombination-associated LTRs, with positions in introns 3 and 6 of different KIR genes. The second inner circle displays the positions where PRDM9 motifs are identified, distinguishing motifs categorized into groups 1 and 2 (Fig. 6). The 2 innermost circles display the placement of motifs that were generated from the recombination intervals (Fig. S7), categorizing motifs that recur in all intervals or in a selection of them.
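Table 4. The gene distribution of the different recombination-associated features in rhesus macaques.
Gene | LTR intron 3 | LTR intron 6, short | LTR intron 6, long | PRDM9 group 1 (high) | PRDM9 group 1 (low) | PRDM9 group 2 (high) | PRDM9 group 2 (low) | PRDM9 total (high) | Interval motif, all (high) | Interval motif, all (low) | Interval motif, selection (high) | Interval motif, selection (low) |
---|---|---|---|---|---|---|---|---|---|---|---|---|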
3DL20 | – | – | – | 0–1 | 11–14 | 1 | 7, 8 | 0–2 | 1 | 7, 8 | 0–2 | 6, 7 |
1D | – | – | – | 1 | 14–16 | 0–1 | 8–10 | 0–2 | 1 | 8, 9 | 0–1 | 6–9 |
2DL04 | – | – | 1 | – | 6–8 | 1 | 3, 5–7 | 1 | 0–1 | 5–8 | 0–1 | 3, 4 |
3DL01 | – | 1 | – | 0–1 | 7–9, 11 | – | 2, 5–7 | 0–1 | 0–2 | 10–14 | 1–2 | 3, 4 |
3DL02 | – | 1 | – | – | 5 | – | 4 | – | 1 | 12, 13 | 1–2 | 4, 5 |
3DLW03 | – | – | – | 1 | 6 | – | 1 | 1 | 1 | 11 | 1 | 5 |
3DL05 | 1 | 1 | – | – | 8 | – | 2, 3 | – | 0–2 | 13–15 | 0–2 | 3, 4 |
3DL07 | – | 1 | – | 0–1 | 3, 7–11 | 0–1 | 2–5 | 0–2 | 1–2 | 13–15 | 0–2 | 3, 5, 6 |
3DL08 | 1 | – | – | – | 9–11 | – | 7, 9, 10 | – | 0–1 | 14–16 | – | 4, 5 |
3DL10 | – | 1 | – | – | 3, 4, 7, 8 | – | 0, 2 | – | 0–1 | 1, 13, 15 | – | 0, 4 |
3DLW36 | 1 | 1 | – | – | 10 | – | 5 | – | 2 | 14 | 1 | 3 |
3DLW44 | – | – | – | – | 10 | – | 9 | – | – | 13 | – | 3 |
3DLW45 | 1 | 1 | – | – | 6 | – | 2, 4 | – | 1 | 1, 2 | 0–2 | 3 |
3DS01 | 1 | – | 1 | 1 | 8, 9 | 1 | 4 | 2 | 1–2 | 10, 11 | 2 | 6 |
3DS02 | – | – | 1 | 2 | 6, 7 | 1 | 5, 6 | 3 | 0–2 | 10–14 | 0–2 | 5–8 |
3DS04 | – | 1 | 1 | – | 10 | 1 | 5 | 1 | 0–1 | 13 | 2 | 7 |
3DS05 | – | – | 1 | 1 | 8 | 1 | 3 | 2 | 1–2 | 8–10 | 1–2 | 6, 7 |
3DS06 | – | – | 1 | 0–1 | 6, 8, 9 | 1–2 | 4–6 | 1–3 | 0–2 | 9, 10, 12 | 1–2 | 5, 6 |
3DSW08 | – | 1 | 1 | – | 10 | 1 | 8 | 1 | – | 10 | 1 | 7 |
3DSW09 | – | – | 1 | 1 | 11 | 1 | 6 | 2 | – | 11, 12 | 1 | 5, 6 |
3DLW45/3DSW08 | 1 | 1 | 1 | – | 9 | 1 | 10 | 1 | – | 12 | 1 | 5 |
3DLW47 | 1 | 1 | – | – | 8 | 1 | 7 | 1 | 2 | 12 | 2 | 4 |
3DLW43 | 1 | – | – | – | 8 | – | 7, 8 | – | – | 14 | – | 5 |
3DSW39 | – | 1 | 1 | – | 9 | 1 | 6 | 1 | 1 | 12 | 2 | 3 |
3DLW51 | 1 | 1 | – | – | 11 | – | 6 | – | 2 | 15 | 1 | 4 |
2DP/3DL07 | – | 1 | – | 1 | 3 | – | 3 | 1 | 2 | 13 | 2 | 3 |
The number of recombination-associated features identified within the different KIR genes is provided. The LTR sequences in introns 3 and 6 display at least 95% sequence similarity to the LTRs that were identified at the intervals. The PRDM9 motifs are divided into 2 broad motif groups, in which group 1 represents the binding motifs corresponding with the PRDM9*New1, PRDM9*New7, and PRDM9*New10 allotypes, whereas group 2 includes the motifs associated with PRDM9*New2, PRDM9*New6, and PRDM9*New9. For each PRDM9 group, the most confident motif matches (P < 10 × 10⁻⁶; black in the original table) are listed first, followed by the lower confidence matches (gray). The matches of the interval-derived motifs are likewise listed by high (P < 10 × 10⁻²⁵) and then lower confidence scores.
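The 95% similarity criterion applied to the LTR elements can be illustrated with a small calculation. The sketch below is not the pipeline used in this study. It assumes two sequences that have already been aligned to equal length (with `-` marking gaps), and the function names, the gap handling, and the toy sequences are all hypothetical choices made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns with identical bases.

    Assumption: any column containing a gap ('-') counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def passes_similarity_cutoff(candidate: str, reference: str,
                             threshold: float = 95.0) -> bool:
    """True if the candidate reaches the similarity threshold (95% by default)."""
    return percent_identity(candidate, reference) >= threshold


# Toy 40-nt sequences (hypothetical, not real LTR sequences).
REFERENCE = "ACGT" * 10
NEAR = "ACGT" * 9 + "ACGA"               # 1 mismatch in 40 columns
DISTANT = "TCGA" + "ACGT" * 8 + "ACGA"   # 3 mismatches in 40 columns

if __name__ == "__main__":
    for name, seq in (("NEAR", NEAR), ("DISTANT", DISTANT)):
        print(name, percent_identity(seq, REFERENCE),
              passes_similarity_cutoff(seq, REFERENCE))
```

With these toy inputs, the sequence with a single mismatch stays above the cutoff, whereas the one with three mismatches falls below it. Real comparisons would first require an alignment step, which is omitted here.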
Several sets of KIR genes display patterns of co-occurrence and co-exclusion.24 A strong linkage is, for example, documented for KIR3DS05, KIR3DL10, and KIR3DL01 in macaques, indicating the absence of recombination events within this cluster (Fig. 1). A recombination-associated LTR was not identified in intron 3 of these linked genes. However, in intron 6, either a short or long LTR was detected. Furthermore, while PRDM9 motifs are absent in KIR3DL10, 1 and 2 of these motifs were identified in KIR3DS05 and KIR3DL01, respectively. Other genes are involved in similarly linked clusters, like KIR3DS02, KIR3DL05, and KIR3DL07, with again the central gene lacking PRDM9 motifs, while different LTR sequences are present in all genes. These observations suggest that not all recombination-associated elements may be active, or that selection avoids the uncoupling of these genes.
Equivalents of the LTR in intron 3 were only identified in human KIR3DL2, whereas the LTR elements associated with intron 6 were found in 7 distinct human KIR genes (Table 5). However, the rhesus macaque and human LTR elements deviate in their sequence length, which might have an impact on the initiation of recombination.62 A PRDM9 motif was identified with high confidence in all human KIR genes, except for KIR2DL5 and KIR2DS4. However, due to the limited number of recombination intervals determined in the available human KIR haplotypes, establishing a correlation with these molecular features is at present challenging.
Table 5. The gene distribution of the different recombination-associated features in humans.
Gene | LTR intron 3 | LTR intron 6 | PRDM9 |
---|---|---|---|
3DL1 | – | 1 | 3 |
3DL2 | 1 | – | 1–3 |
3DL3 | – | – | 0, 3 |
2DL1 | – | 1 | 0–1 |
2DL2 | – | 1 | 3 |
2DL3 | – | 1 | 3, 5 |
2DL4 | – | 1 | 2, 3 |
2DL5 | – | – | – |
2DS1 | – | 1 | 0, 2 |
2DS2 | – | – | 1, 2 |
2DS3 | – | – | 1 |
2DS4 | – | 1 | – |
3DS1 | – | – | 2 |
The number of recombination-associated features identified within the different KIR genes is provided. The LTR sequences in introns 3 and 6 display at least 95% sequence similarity to the LTRs that were identified at the recombination intervals of rhesus macaques. Furthermore, the number of matches with the 7 distinct PRDM9 binding motifs, reported by Altemose et al.,46 is provided for each KIR gene.
Overall, it appears that recombination in the KIR region is driven by a combination of molecular features, including specific LTR elements and PRDM9 binding motifs that may play a dominant role.
Discussion
The thorough characterization of complex immune regions, such as the KIR region, has always been challenging with short-read sequencing platforms, due to copy number variation, high levels of sequence similarity, and extensive allelic polymorphism. The most common technique to characterize the KIR repertoire in humans is based on the determination of the presence or absence of gene segments using sequence-specific primers, which has its limitations.63 More in-depth studies in other primate species involved partial or full-length KIR transcript sequencing, occasionally in concert with segregation studies to resolve haplotype information.18 This approach revealed different levels of complexity, but precise KIR haplotype organizations at the genomic level remained an enigma. Long-read sequencing strategies opened avenues to assess and phase complex immune regions. We applied, for instance, a Cas9-mediated enrichment approach in combination with ONT sequencing to assemble phased rhesus macaque and human KIR haplotypes.23 Although considerable progress was made, this strategy is relatively expensive and time-consuming. In this communication, a rapid software-controlled enrichment strategy known as adaptive sampling is presented, which successfully resolved completely phased KIR haplotypes in a panel of rhesus macaques. These animals have an expanded MHC class I region as compared with humans.25 We aimed to determine whether this expansion impacts the extent and evolution of the KIR repertoire. On average, the evolution of the human KIR region seems to be under less selective pressure than that of rhesus macaques. Moreover, the growing number of completely characterized KIR haplotypes offers the opportunity to explore the molecular features that drive the expansion and contraction of this immune region.
The elevated recombination activity observed in the KIR regions of rhesus macaques and humans, compared with other segmentally duplicated immune gene systems, may involve their high levels of gene density and sequence similarity (Fig. 3; Fig. S3). However, it is more likely that an additional, more sophisticated mechanism is at play, one that maintains functional stability by producing in-frame recombination events while generating genetic diversity. This mechanism appears to involve the presence of PRDM9-binding sites and LTR elements, which are located within introns, while the exon boundaries remain conserved. These features may shape the recombination landscape of KIR genes and haplotypes by facilitating the exchange of genomic material at specific hotspots during meiosis.
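Why intronic breakpoints combined with conserved exon boundaries tend to yield in-frame hybrids can be illustrated with a toy model. The following sketch is purely illustrative: the two "genes", their exon sequences, and the helper names are hypothetical and do not correspond to real KIR sequences. It assumes that both genes share identical exon lengths, so that exchanging whole exons at an intron preserves the reading frame, whereas an unequal exchange inside an exon does not.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Hypothetical paralogs with conserved exon boundaries (exon lengths 6, 4, 8).
GENE_A = ["ATGGCT", "GAAC", "TGTTCTAA"]
GENE_B = ["ATGGCA", "GATC", "AGTACTGA"]


def splice(exons):
    """Join exons into a coding sequence, as splicing would."""
    return "".join(exons)


def is_in_frame(cds):
    """True if the sequence is one open reading frame from ATG to a terminal stop."""
    if len(cds) % 3 != 0 or not cds.startswith("ATG"):
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    has_terminal_stop = codons[-1] in STOP_CODONS
    has_premature_stop = any(c in STOP_CODONS for c in codons[:-1])
    return has_terminal_stop and not has_premature_stop


def intronic_hybrid(gene_a, gene_b, intron_index):
    """Hybrid from a breakpoint in intron `intron_index` (1-based):
    exons upstream of the intron come from gene A, the rest from gene B."""
    return gene_a[:intron_index] + gene_b[intron_index:]


if __name__ == "__main__":
    # Breakpoints in intron 1 or intron 2 exchange whole exons.
    for intron in (1, 2):
        hybrid = splice(intronic_hybrid(GENE_A, GENE_B, intron))
        print(f"intron {intron}:", hybrid, is_in_frame(hybrid))

    # Contrast: an unequal exchange inside an exon (7 nt of A joined to B
    # from position 8 onward) drops one nucleotide and shifts the frame.
    shifted = splice(GENE_A)[:7] + splice(GENE_B)[8:]
    print("exonic, unequal:", shifted, is_in_frame(shifted))
```

In this toy setting both intronic hybrids remain open reading frames, while the exonic, unequal exchange does not. Whether a real hybrid gene is functional depends on many additional factors, such as promoter and splice-site integrity, that the sketch ignores.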
However, not all of these recombination-associated elements are consistently active. For example, PRDM9 is expressed in germ cells during meiosis, but only a fraction of the recombination hotspots are activated per cell. Moreover, the binding of PRDM9 alone does not guarantee the occurrence of a double-stranded break, as the interplay of various factors within the recombination machinery is essential for facilitating successful rearrangement.64 Active binding sites of PRDM9 in gametes may be identified by a parallel detection of DMC1 binding, a meiosis-specific recombinase that catalyzes strand invasion necessary for recombination.65 While this combined information is currently unavailable for the KIR regions in humans or macaques, future studies employing this approach could offer valuable insights into the active PRDM9 hotspots that drive diversification of KIR haplotypes. In addition, this exercise would provide a more accurate pattern of the PRDM9 binding motifs. In this communication, we applied a widely used and validated prediction tool to define the binding motifs of the various PRDM9 alleles identified in the panel of rhesus macaques.36 This software tool correctly predicted the binding of the most prevalent human variant, PRDM9-A, to a 13-mer motif that is enriched at recombination hotspots.30 Although these in silico predictions offer indicative binding motifs, their reliability may be limited due to the complexity of the interaction between the diverse array of zinc fingers and DNA.
This is exemplified by the diverse set of DNA-binding motifs identified for human PRDM9-B, sharing a close match to the 13-mer motif but displaying distinct internal spacings.46 The diverse binding specificities may be explained by different factors, including the varied combinations of zinc fingers and the potential formation of PRDM9 multimers.66,67 The binding specificities of rhesus macaque and human PRDM9 are clearly distinct, stemming not only from differences in the type of zinc fingers but also in the number of fingers within each array. The smaller array in rhesus macaques correlates with a narrower DNA-binding motif, potentially leading to its increased occurrence compared with the longer motifs associated with human PRDM9. Overall, the overlap of PRDM9 binding targets with, or in proximity to, all 21 examined recombination intervals in the KIR region of rhesus macaques supports the role of this zinc finger protein in facilitating recombination events.
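As a simple illustration of how a degenerate binding motif can be located on a sequence, the sketch below scans both strands for exact matches to the human 13-mer hotspot motif, commonly written as CCNCCNTNNCCNC. This is a deliberately simplified stand-in and not the approach used in this study, which relied on a prediction tool and P value-scored matches. The macaque motifs are not reproduced here, and the function name `find_motif` and the test sequences are hypothetical.

```python
import re

# IUPAC nucleotide ambiguity codes expressed as regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_motif(sequence, motif="CCNCCNTNNCCNC"):
    """Return sorted (start, strand) tuples for exact motif matches.

    Starts are 0-based positions on the forward strand; '-' marks a match
    found on the reverse complement. Overlapping matches are reported.
    """
    seq = sequence.upper()
    # A lookahead lets finditer report overlapping occurrences.
    pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in motif) + "))")
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]

    reverse_complement = seq.translate(COMPLEMENT)[::-1]
    n, k = len(seq), len(motif)
    # Convert reverse-strand starts back to forward-strand coordinates.
    hits += [(n - m.start() - k, "-")
             for m in pattern.finditer(reverse_complement)]
    return sorted(hits)


if __name__ == "__main__":
    print(find_motif("TTTCCACCGTAACCTCGGG"))  # motif instance on the forward strand
    print(find_motif("AAGAGGTTACGGTGGTT"))    # motif instance on the reverse strand
```

Exact matching of this kind cannot rank partial matches, which is one reason why position weight matrices with significance thresholds are generally preferred for analyses such as the one summarized in Table 4.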
The other feature implicated in modulating the diversification of the KIR region is the distribution of specific LTR elements. These retroviral elements appear to be predominantly involved in recombination events within introns 3 and 6 of most KIR genes, represented by either short or long LTR sequences. Recently, we reported a similar LTR-driven recombination mechanism in the MHC-DRB region of rhesus macaques.51 In this context, 2 relatively long LTR sequences within and adjacent to the pseudogene DRB6 were associated with the shuffling of sets of paralogous DRB genes. In contrast to the KIR region, a relatively high number of inactive gene copies are generated by the diversification of the MHC-DRB cluster. This discrepancy indicates that a more sophisticated process diversifies the KIR region. Variation in the lengths and distributions of LTR elements in these 2 immune regions might contribute to the differential recombination events.
In addition to variation in molecular features that may promote rearrangements, the continuous diversification of the KIR gene content in macaques, compared with the more confined haplotypes in humans, is likely modulated by evolutionary selection. Selective pressure favoring functional KIR genes and haplotypes would probably drive the elimination of nonproductive rearrangements from the population. Consequently, out-of-frame recombination events may occur but are less likely to be detected. However, an example of a dysfunctional gene that originated from a rearrangement and remained in human populations is KIR2DL5B*002.68,69 Although this gene is in-frame, it is not expressed due to the introduction of the promoter region from pseudogene KIR3DP1. This suggests that inactive alleles can occasionally arise from recombination events and remain in the gene repertoire, even though this occurrence is rare.
Different selective pressures may have an effect on the KIR gene repertoire. In humans, the selection pressures may largely involve successful pregnancy, as KIRs play a pivotal role in maternal immune tolerance.70,71 Specifically, KIRs bind to certain epitopes on HLA-C molecules, which are preferentially expressed on trophoblasts. This interaction facilitates processes like trophoblast invasion and vascular remodeling, both essential for a successful pregnancy. The evolutionary pressure to optimize these interactions has led to the selection of specific KIR haplotypes that promote reproductive success, resulting eventually in a less diverse but more specialized KIR repertoire in humans. In contrast, the KIR system in rhesus macaques is primarily driven by the need to maintain a functional relationship with the expanded MHC-A and MHC-B molecules, thereby indirectly adapting to the rapid evolution of pathogens, including an array of retroviruses.5,72 This difference in primary selective pressures is further amplified by the education pathways of NK cells. Unlike in humans, where KIR education is heavily reliant on HLA-C interactions, rhesus macaque NK cells are primarily educated through the NKG2A pathway interacting with MHC-E molecules.72,73 The shift in NK cell education pathways may reduce the selective pressure on specific KIR-MHC interactions in macaques. This might allow for greater expansion and diversification of the KIR gene repertoire without the reproductive constraints seen in humans.
In conclusion, the expansion and contraction of the primate KIR region appear to enhance evolutionary adaptability, while the underlying molecular process and selective pressure maintain the functional integrity of KIR genes. We identified sequence features that may collectively contribute to the sophisticated diversification within the KIR region of primate species, including specific LTR elements and sets of PRDM9 binding motifs. Variation in these sequences across species may account for the contrasting levels of diversity recorded for the rhesus macaque and human KIR haplotypes.
Acknowledgements
The authors thank F. van Hassel for design of figures and artwork.
Author contributions
Marit K.H. van der Wiel (Formal analysis [equal], Investigation [equal], Methodology [equal], Validation [equal]), Ngoc Giang Le (Formal analysis [equal], Software [lead], Visualization [lead]), Nanine de Groot (Investigation [supporting]), Natasja G. de Groot (Conceptualization [supporting]), Ronald E. Bontrop (Conceptualization [supporting]), and Jesse Bruijnesteijn (Conceptualization [lead], Formal analysis [lead], Investigation [equal], Methodology [lead], Project administration [equal], Supervision [lead], Visualization [equal])
Supplementary material
Supplementary material is available at The Journal of Immunology online.
Funding
None declared.
Conflicts of interest
None declared.
Data availability
All sequencing data generated in this study has been submitted to the European Nucleotide Archive (https://www.ebi.ac.uk/ena/browser/home) under project number PRJEB77196. The annotated consensus sequences of the rhesus macaque KIR haplotypes have been submitted under accession numbers ERZ24798745 to ERZ24798765.
References
de Groot N, van der Wiel M, Le NG, de Groot NG, Bruijnesteijn J, Bontrop RE. Unraveling the architecture of major histocompatibility complex class II haplotypes in rhesus macaques. Genome Res. 2024;