-
PDF
- Split View
-
Views
-
Cite
Cite
Silvia Britto Barreto, L. Lacey Knowles, Paulo Roberto Antunes de Mello Affonso, Henrique Batalha‐Filho, Riverscape properties contribute to the origin and structure of a hybrid zone in a Neotropical freshwater fish, Journal of Evolutionary Biology, Volume 33, Issue 11, 1 November 2020, Pages 1530–1542, https://doi.org/10.1111/jeb.13689
- Share Icon Share
Abstract
Understanding the structure of hybrid zones provides valuable insights about species boundaries and speciation, such as the evolution of barriers to gene flow and the strength of selection. In river networks, studying evolutionary processes in hybrid zones can be especially challenging, given the influence of past and current river properties along with biological species‐specific traits. Here, we suggest that a natural hybrid zone between two divergent lineages of the sexually dimorphic Neotropical fish Nematocharax venustus was probably established by secondary contact as a result of a river capture event between the Contas and Pardo river basins. This putative river capture is supported by hydrogeological evidence of elbows of capture, wind gaps and geological faults. The morphological (colour pattern) and genetic (mtDNA and RADseq) variation reveal a clinal transition between parental lineages along the main river, with predominance of F2 hybrids at the centre of the hybrid zone, absence of early generation backcrosses and different levels of hybridization in the tributaries. We highlight that different sources of information are crucial for understanding how the riverscape spatial history influences the connectivity between and within river systems and, consequently, the dynamics of gene flow between freshwater lineages/species.
Abstract
INTRODUCTION
As windows to the evolutionary consequences of genetic exchange between divergent lineages, hybrid zones can provide useful insights about the forces involved in the speciation process, such as the evolution of barriers to gene flow and the strength of selection (Gompert, Mandeville, & Buerkle, 2017; Hewitt, 1988). These zones can arise under a variety of scenarios, ranging from those originating within a continuous population in response to environmental differences or, more commonly, after secondary contact between formerly allopatric populations (Hewitt, 1988). A focus on the origin and structure of a hybrid zone forms the central framework for understanding the consequences of hybridization, including processes related to: (a) reinforcement of reproductive barriers; (b) introgression and fusion between parental groups; or (c) stability of the hybrid zone (Albert, Jónsson, & Bernatchez, 2006; Stewart, Hudson, & Lougheed, 2017).
Most hybrid zones consist of clines (i.e. gradients of variation) maintained by a balance between selection against hybrids within the zone and dispersal of parental individuals into the zone, hence the term 'tension zone' (Barton & Hewitt, 1985). However, there are other theoretical models (e.g. bounded hybrid superiority and mosaic models) that seek to explain how these zones are maintained (see Arnold, 1997). The bounded hybrid superiority model is characterized by a smooth transition between parental groups throughout the hybrid zone, in which hybrids can exhibit equivalent or higher fitness compared to parental individuals in intermediate habitats (Moore, 1977). Mosaic hybrid zones, in turn, correspond to parental individuals adapted to different environments that are patchily distributed, and hybrid individuals occurring on the boundaries or in intermediate habitats (Harrison & Rand, 1989). Under these circumstances, endogenous selection (e.g. genetic incompatibilities) may act against hybrids and exogenous selection (e.g. environmental heterogeneity) may favour hybrids in the hybrid zone (De La Torre, Wang, Jaquish, & Aitken, 2014). Thus, each model encompasses a particular condition depending on how selection acts on parental and hybrid individuals, even though some hybrid zones can be represented by a mixture of models (Curry, 2015).
In river systems, applying such models to understand the structure of hybrid zones is particularly challenging because river networks are organized into a spatial hierarchy with downstream flow that makes equal (isotropic) dispersal in all directions unlikely (Fullerton et al., 2010; Hughes, 2007). In this sense, architectural and functional particularities of river systems should be considered in evolutionary analyses of obligate freshwater organisms because these physical characteristics interact with the way organisms disperse through space thereby influencing both diversity within populations and differentiation among populations (Chaput‐Bardy, Fleurant, Lemaire, & Secondi, 2009; Thomaz, Christie, & Knowles, 2016). For instance, although hybridization occurs widely in freshwater species, especially fishes (Scribner, Page, & Bartron, 2000), the position of populations within a river network and the geomorphological features within and between basins can affect the connectivity among populations and, consequently, the spatial distribution of genetic and/or phenotypic variation (Duvernell & Schaefer, 2014; Hughes, Schmidt, & Finn, 2009; Mandeville et al., 2017).
Here, we analyse a combination of phenotypic and genomic data, providing an ideal opportunity for investigating the complexity of natural hybrid zones in riverscapes. Specifically, we focus on the sexually dimorphic characid fish Nematocharax venustus Weitzman, Menezes, & Britski, whose males have elongated rays in their dorsal, pelvic and anal fins. The phenotypic and genetic variation found in this group shows a contact zone between lineages in drainages of the Gongogi River sub‐basin, in the Contas River basin (Figure 1), Northeastern Brazil. The sympatry of the two divergent mitochondrial DNA (mtDNA) lineages was first detected in the Cambiriba Stream, a tributary of the Gongogi River (Barreto et al., 2016). Interestingly, a new putative species (N. costai) was also described for the same locality (Bragança, Barbosa, & Mattos, 2013), although it was later synonymized with its only congener at the time (N. venustus) in a subsequent taxonomic treatment (Menezes, Zanata, & Camelier, 2015). The synonymization was justified by the authors because the species shared overlapped phenotypic features, including secondary sexual traits (Menezes et al., 2015). In sympatry, differences in secondary sexual traits can determine the evolution of reproductive barriers, given the direct link between these characters and processes such as species recognition and mate choice (Questiau, 1999). Therefore, without considering the dynamics of speciation in this putative hybrid zone, the taxonomic issue surrounding N. venustus and N. costai may reflect an incomplete assessment of the phenotypic geographic variation in Nematocharax.

Map of the Contas River basin, Bahia, Brazil (at the top right corner, in green) showing the stretch of the Gongogi River sub‐basin where the sampling was performed. The 14 sampled locations are situated in the Gongogi River (sites 1, 4, 6, 7, 11 and 14), Cambiriba Stream (sites 8, 9 and 10) and other four tributaries (sites 2, 3, 5, 12 and 13). Sampled locations are colour‐coded according to which of the divergent mitochondrial lineages are present, with the Southern lineage in yellow and the Northern lineage in green. The circles are divided proportionally according to the number of individuals belonging to each mitochondrial lineage in cases of sympatry (i.e. sites 6, 7, 8, 9, 10 and 11), corresponding to the putative hybrid zone between divergent lineages. Small black arrows indicate the water flow direction of the Gongogi River and Cambiriba Stream
With extended sampling efforts along the Gongogi River sub‐basin, we quantify the spatial structure of this hybrid zone. Specifically, we (a) evaluate the extent of hybridization between the divergent populations, (b) characterize how morphological (colour pattern) and genetic variation is structured across the hybrid zone, including genomic data based on RADseq, (c) verify whether there is coincidence and concordance among clines for the different data sets, and (d) indirectly infer the strength of selection in the maintenance of this hybrid zone based on cline analyses. We combined these results with geological and geomorphological data to infer the potential factors responsible for the origin and structure of this freshwater fish hybrid zone.
MATERIALS AND METHODS
Sampling
We collected 193 individuals of Nematocharax venustus comprising the two previously identified divergent mtDNA lineages (Barreto, 2019; Barreto et al., 2016), hereafter called Northern and Southern lineages, in 14 locations along the Gongogi River sub‐basin (Contas River basin); our sampling included the Cambiriba Stream (where the contact zone was first identified), the main river (Gongogi) and four other nearby tributaries (Figure 1). All individuals were photographed alive in the field under standard conditions (see Figure 2) to register their colour pattern. A small fragment of muscle tissue was removed and preserved in absolute ethanol at −20ºC in the laboratory, and the specimens were deposited in the ichthyological collection of the Universidade Federal da Bahia (UFBA), Brazil (Table S1). Collections were authorized by the Instituto Chico Mendes de Conservação da Biodiversidade (ICMBio/SISBIO; licence number 51856–2), and the euthanasia and experimental procedures were approved by the Ethics Committee of Utilization of Animals from the Universidade Estadual do Sudoeste da Bahia (CEUA/UESB, number 71/2014).

Male representatives of Nematocharax venustus highlighting some of the variation in the eye colour (A), caudal peduncle (B), pectoral fins (C), and tip of the pelvic fins (D) based on photographs taken in the field at sites 14, 7 and 4 (see Figure 1), corresponding to phenotypes representative of (a) the Northern lineage, (b) a putative hybrid and (c) the Southern lineage, respectively
Twenty individuals of Nematocharax venustus previously collected by Barreto et al. (2016) from two locations (sites 1 and 10 on the map in Figure 1) were included in our study, consisting of cytochrome c oxidase subunit I (COI) sequences and photographs taken in the field. COI sequences are available in BOLD (Barcode of Life Data Systems; http://www.boldsystems.org/) under accession numbers PIABA028‐14 to PIABA035‐14 and PIABA050‐14 to PIABA061‐14.
DNA extraction and sequencing
Total DNA was extracted from all collected individuals using the Wizard Genomic DNA Purification kit (Promega, Madison, WI, USA). To identify the mtDNA lineage of each individual, a 650 base‐pair (bp) fragment of the COI gene was amplified and sequenced with the primers FishF2_t1 and FishR2_t1 (Ward, Zemlak, Innes, Last, & Hebert, 2005) following Barreto et al. (2016), with approximately 15 samples per location. Sequencing was performed at the Gonçalo Moniz Research Center (FIOCRUZ‐Bahia) using the BigDyeTerminator v3.1 Cycle Sequencing Ready Reaction kit (Applied Biosystems, Foster City, CA, USA). The COI gene was selected as an appropriate molecular marker due to its general utility for studying population‐level phenomena in fish (e.g. Cunha et al., 2019; Lima et al., 2017; Thomaz, Malabarba, Bonatto, & Knowles, 2015), its previously proven ability to reveal sympatric divergence in Nematocharax (Barreto et al., 2016) and the availability of published sequences from populations analysed here.
A subset of 55 samples of the sequenced individuals was selected for the restriction site‐associated DNA sequencing (RADseq), with 2–5 individuals per location. Whenever possible, this selection included males and females for each location and individuals from both mtDNA lineages in sites of sympatry. Locations from the main river course (sites 1, 4, 6, 7, 11 and 14; see Figure 1), which correspond to the transect of the cline analyses, represented the largest sample sizes (i.e. five each). We generated the ezRAD libraries according to Toonen et al. (2013) and Knapp et al. (2016), extracting DNA using the DNeasy Blood and Tissue kit (Qiagen, Hilden, Germany), digesting the DNA with the restriction endonuclease DpnII (New England Biolabs, Ipswich, MA, USA) and preparing libraries using the Illumina TruSeq Nano kit (Illumina, San Diego, CA, USA), selecting fragments between 150 and 350 bp. Quantitative and qualitative validation of libraries was carried out using Qubit dsDNA BR (Broad Range) Assay kit (Thermo Fisher Scientific, Waltham, MA, USA) and Agilent 2,100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA), respectively. Paired‐end sequencing of fragments (2 × 75 bp) was performed using the Illumina NextSeq 550 System with two Mid Output v2 kits (150 cycles) (Illumina, San Diego, CA, USA) at the Genome Investigation and Analysis Laboratory (GENIAL) core facility (CEFAP‐USP, São Paulo, Brazil).
Sequence processing
Consensus COI sequences were obtained by comparing forward and reverse electropherograms in the software CodonCode Aligner 7.1.1 (CodonCode Corporation) and aligned using the ClustalW Multiple Alignment tool (Thompson, Higgins, & Gibson, 1994) in BioEdit 7.2.6.1 (Hall, 1999). All new COI sequences generated as part of this study are deposited in GenBank (accession numbers MN011189‐MN011202 and MN011364‐MN011542).
For the RADseq data, reads of each individual were demultiplexed using bcl2fastq 1.8.4 (Illumina; http://support.illumina.com/downloads.html). The toolbox ipyrad 0.7.28 (Eaton, 2014; http://ipyrad.readthedocs.io) was used to process the genomic sequences; details regarding read filtering, clustering within samples, joint estimation of heterozygosity and error rate, consensus base calls and clustering across samples are given in the Table S2. We used a de novo assembly method to filter data with reads trimmed 15 bp from each 3’ edge to reduce low quality bases, retaining all loci with less than 30% missing data.
Coloration data acquisition
The coloration of both the pelvic‐fin filament and the horizontal mark on the caudal peduncle represents the main differences found in individuals of Nematocharax venustus from the Gongogi River sub‐basin (Bragança et al., 2013; Menezes et al., 2015). We quantified the variation in these two characters, as well as in two other colour‐based characters that exhibited noticeable variation in field‐caught individuals (both males and females), specifically the colour of pectoral fins and a mark in the eye (see Figure 2). However, only the marks on the caudal peduncle and in the eye showed distinctive colour patterns with high frequencies in parental populations (≥0.8 in one parental population and ≤ 0.2 in the other), thus being useful as diagnostic traits for the hybrid zone analysis.
Variation in the red mark in the eye was described as five categories based on its intensity, ranging from intense to absent; a similar approach was applied to the pink mark on the caudal peduncle, which ranged from dark pink to lack of pink (see Figure 2). The colour assignment into categories was visually performed by the same observer from photographs taken in the field for all specimens. These data were used to estimate geographic clines across the hybrid zone (described below).
Hybrid zone analyses
Relationships among mtDNA haplotypes were evaluated using a median‐joining network built with the COI sequences in PopART (Leigh & Bryant, 2015). We identified the mtDNA lineage (Northern or Southern) of each individual and, based on this information, we mapped the locations of sympatry and allopatry of mitochondrial lineages.
Structure in the RADseq data was estimated using the variational Bayesian framework implemented in fastSTRUCTURE (Raj, Stephens, & Pritchard, 2014). We tested the number of clusters (K) from 1 to 5 with 10 independent runs each and using the default prior. The algorithm ‘chooseK.py’ was run to choose the appropriate number of model components explaining the structure in the data set.
To estimate the posterior probability (pp) that each individual belongs to a distinct hybrid class (i.e. pure1, pure2, F1, F2, BC1 and BC2; Anderson & Thompson, 2002), we used the R package parallelnewhybrid (Wringe, Stanley, Jeffery, Anderson, & Bradbury, 2017a) to run NewHybrids in parallel with a burn‐in of 50,000 and 100,000 sweeps. This analysis was performed with a panel of the 200 most informative SNPs (based on global Weir and Cockerham (1984)’s FST) generated by the R package hybriddetective (Wringe, Stanley, Jeffery, Anderson, & Bradbury, 2017b). The z option in NewHybrids was set as prior information for individuals from parental populations (i.e. locations 1 and 14 for the Southern and Northern lineages, respectively). We confirmed the convergence of the Markov Chain Monte Carlo (MCMC) chains using the hybriddetective.
Measures of genomic admixture within individuals were also obtained from the R package gghybrid (Bailey, 2018) by calculating a hybrid index that estimates the proportion of alleles that were inherited from one of the two parental groups (Anderson, 1949; Buerkle, 2005). To ensure that we only capture loci that show significant variation along the hybrid zone, we set the gghybrid parameters ‘max.S.MAF’ and ‘min.diff’ to 0.2 and 0.6, respectively. This implies that we only kept loci for which allele frequencies were no greater than 0.2 in one parental set and not lower than 0.8 in the other, and for which the difference in allele frequency between parental sets was greater than 0.6; parental populations correspond to sites 14 and 1 on the map in Figure 1 for the Northern and Southern lineages, respectively.
To test how genetic and morphological characters vary along the Gongogi River, we used HZAR (hybrid zone analysis for R), an R package that provides functions for fitting the traits to equilibrium geographic cline models (Gay, Crochet, Bell, & Lenormand, 2008; Szymura & Barton, 1986) and allows cline parameters to be estimated using the Metropolis‐Hastings algorithm (Derryberry, Derryberry, Maley, & Brumfield, 2014). We fit a combination of equations (15 models) that describe the shape of each cline as defined by Derryberry et al. (2014) following Szymura and Barton (1986, 1991), with a sigmoidal curve at the centre and two exponential decay curves (i.e. tails) on either side. The fit of the 15 models was compared to a null model with no clinal transition using Akaike Information Criterion (AIC) corrected for small sample size (AICc). The maximum likelihood parameters were extracted from the best‐fitting model (i.e. the model with the lowest AICc score). Clinal coincidence (same centre, c, measured in km from the sampling location 1) and concordance (same width, w, measured as 1/maximum slope) were evaluated based on the confidence intervals for each trait and SNP. The mtDNA cline was also modelled along the Gongogi River by using a site of the COI segregating between parental lineages.
We used the ancestry coefficients (q) estimated in fastSTRUCTURE considering the optimal value of K (i.e. K = 2, see Results) to estimate the cline that represents all genomic loci combined. The genomic loci were also analysed separately based on two filters: (1) exclusion of mtDNA loci after sequence similarity analysis with the BLAST tool in the NCBI database (http://www.ncbi.nlm.nih.gov/BLAST/) and (2) taking into account allele frequencies present in at least two individuals per location and partially diagnostic loci (i.e. those with frequency differences ≥ 0.6 between parental populations). This procedure resulted in a final data set of 99 putatively unlinked SNPs (i.e. one SNP per locus) from which we estimated cline parameter values independently. This subsampling was made to strictly analyse diagnostic SNPs (i.e. those that are distinct between parental populations) and to avoid estimates of allele frequency using less than four alleles per locus per location, thus reducing potential biases related to high amounts of missing data.
The hybrid zone transect was established by considering only the locations from the main river course (sites 1, 4, 6, 7, 11 and 14; see Figure 1) because HZAR requires data collected along one‐dimensional transects, with minimal variation perpendicular to the cline (Derryberry et al., 2014). MCMC sampling was run considering the length of this transect along the river course (63 km, including its bends) and using three independent chains with 1.0 × 106 generations for each model.
Geomorphological inference
Considering that the Southern lineage is also known to occur in a watershed adjacent to the Contas River basin (Barreto, 2019), we used GIS techniques to infer putative topographic changes that may have affected the surveyed area over time. We focused on investigating the occurrence of drainage rearrangements (i.e. river captures) between the Gongogi River sub‐basin and neighbouring basins. River captures correspond to the natural diversion of waters from one river to another adjacent one, which connects and isolates drainages and, consequently, aquatic populations (Albert, Craig, Tagliacollo, & Petry, 2018; Bishop, 1995). Thus, to infer such captures we used the following data sources: (a) the Continuous Cartographic Base of the Brazilian hydrography at 1:250,000 scale (DGC, 2017) to visualize the current configuration of rivers and detect elbows of capture (abrupt changes in the river course at the point of capture; Bishop, 1995); (b) SRTM 90 m Digital Elevation Data (Jarvis, Reuter, Nelson, & Guevara, 2008) to detect putative wind gaps (dry areas that correspond to ancient river beds; Ollier & Pain, 2000); and (c) the SD.24 Salvador sheet (DGC, 2016) to identify the presence and location of geological faults (areas subjected to tectonic reactivations; de Oliveira, 2010). All these data were analysed using the QGIS 3.4.1 software (QGIS Development Team, 2019).
RESULTS
Population structure
Two divergent mtDNA lineages were identified from the haplotype network for the COI gene in Nematocharax venustus, separated by 24 mutational steps (4.1% of divergence) (Figure S1). The Northern lineage includes a larger number of individuals (N = 164) and a broader distribution in the sampled area. Population assignment indicated sympatry of mtDNA lineages in six locations including the main river (Gongogi River) and a single tributary (Cambiriba Stream). Moreover, we found a frequency gradient for the two lineages along the main river, with the Northern and Southern lineages predominating at opposite stretches of the Gongogi River and co‐occurring near the confluence with the Cambiriba Stream (Figures 1‐2).
Regarding the RADseq data, we obtained a total of 133.8 M raw reads (1.3–4.3 M per sample), which were filtered into 129.5 M reads encompassing 1,441 loci and 4,226 SNPs (see Table S3 for a final summary of the statistics provided by ipyrad). FastSTRUCTURE results indicated that K = 2 best fit the data, with clusters (taxa) clearly corresponding to the two mtDNA lineages. The analysis showed individuals of mixed ancestry (Figure 3a), particularly in locations 6–13, whereas significant admixture was absent in the locations assumed as parental (1 and 14 for the Southern and Northern lineages, respectively).

Results of fastSTRUCTURE (a) and NewHybrids (b) analyses based on RADseq data. In the Structure plot (a) for the best K‐value (=2), each bar represents the proportion of ancestry of each individual with respect to each potential ancestor; yellow and green correspond to the divergent lineages. Squares below the bars indicate the mitochondrial group of each individual. In the NewHybrids plot (b), each bar shows the probability of each individual to belong to each hybrid category (i.e. pure 1, pure 2, F1, F2, backcross 1 and backcross 2). Numbers (4492–0005) on the top of the figure and within hexagons (1–14) are the sample codes and localities, respectively (see Figure 1). Black and grey hexagons refer to sites from the main river and tributaries, respectively
The NewHybrids analysis assigned nine, 12 and 33 individuals to the pure 1, pure 2 and F2 categories, respectively (Figure 3b), with high posterior probabilities (pp >0.99 ) for all samples except one (sample code 0022, location 4), which had pp values equal to 0.4 and 0.6 for the pure 1 and F2 categories, respectively. Geographically, samples classified as pure 1 were mostly detected in locations 1 and 4, whereas pure 2 individuals were found in locations 14, 5, and 3; F2 offspring were shown to occur in locations 2 and 6–13, particularly near the confluence with the Cambiriba Stream. No evidence of F1 hybrids or early generation backcrosses was detected.
The hybrid index estimated with gghybrid varied from zero to one, ranging from pure individuals of the Southern lineage to pure individuals of the Northern lineage and following an almost linear transect of hybridization along the main river (Figure 4). Note that the tributaries were not considered in the clinal analyses, especially those associated with locations 2, 3 and 5, also because of their geographic disjunction (i.e. they are geographically closer to location 1, where the Southern lineage predominates, but are genetically similar to location 14, where the Northern lineage predominates).

Distribution of the hybrid index score (and 95% credible intervals) estimated with the gghybrid for the 55 individuals across the hybrid zone. Horizontal dashed lines mark the innermost credible interval for each parental reference set. Numbers within hexagons correspond to the locations according to the map (see Figure 1). Black hexagons specify locations from the main river (Gongogi), whereas grey hexagons indicate locations from the tributaries
Hybrid zone dynamics
According to the AICc calculated by HZAR, the best‐fit model for the colour of the mark on the caudal peduncle was model IV (which sets trait interval [pMin and pMax] to observed values with two exponential tails mirrored on the cline centre; Figure 5a), and model I (pMin and pMax set to observed values with no exponential tails fitted) had the lowest AICc score for both the mark in the eye (Figure 5b) and the mtDNA (Figure 5c). Cline centre and width for these traits were: caudal peduncle c = 33.72 (27.46–43.40) km and w = 33.18 (18.44–50.58) km; eye c = 40.23 (35.88–45.27) km and w = 20.36 (11.70–32.51) km; and mtDNA c = 28.60 (24.21–33.22) km and w = 24.65 (16.34–39.17) km (Figure 5a–c). These values place the cline centre for the three traits around location 7, in the confluence area with the Cambiriba Stream (Figure 1), with the narrowest cline observed for the eye colour trait, followed by the mtDNA. The other two morphological traits (colour of the pectoral fin and colour of the tip of the pelvic fin) were not informative because they showed no clinal variation across the hybrid zone (data not shown).

Plots of the maximum likelihood clines and the 95% credible cline region estimated with the HZAR for the (a) mark on the caudal peduncle and (b) the mark in the eye, as well as (c) the mitochondrial haplotypes, and (d) the ancestry coefficients estimated in fastSTRUCTURE
Considering the cline estimated from the ancestry coefficients (q) inferred by fastSTRUCTURE, AICc score in HZAR indicated model I as the best‐fit model, which recovered a smoother and wider cline compared to the mtDNA and colour traits, with centre estimated at 35.55 (28.43–43.98) km (near location 7) and width of 40.13 (25.30–70.28) km (Figure 5d). The geographic cline analysis for the 99 RADseq loci revealed smooth clines for most of them, but 10 loci showed nonconcordant abrupt changes in frequency consistent with stepped clines along the transect (Figure S2). Overall, we found broad confidence intervals for cline widths and, consequently, extensive overlap among them (details on cline centre and width values for each SNP are shown in Table S4).
Riverscape evolution
Based on the investigation of geological and geomorphological data around the hybrid zone, we found evidence of topographical changes in the studied area, specifically in the headwaters of the Gongogi River sub‐basin (Figure 6). These changes suggest a past connection between the Contas and Pardo river basins through a river capture event. It is important to notice that the Pardo River basin is a separate but adjacent basin where populations belonging to the Southern lineage of Nematocharax also occur (Barreto, 2019). Thus, the drainage rearrangement may have allowed the dispersal of the Southern lineage from the Pardo to the Contas river basin. The putative river capture between these adjacent watersheds was inferred from three sources of evidence: (1) presence of drainages with abrupt changes in the flow direction (elbows of capture); (2) wind gaps at the boundary between river basins; and (3) two geological faults near the capture area (Figure 6).

Putative river capture (highlighted by the red circle) between the Contas and Pardo river basins (with boundaries shown in black; see inset for map of the entire region) in the focal area around the Gongogi River (i.e. sampled locations 1–5 from the hybrid zone; see also Figure 1). Evidence of drainage rearrangement includes changes in the relief (based on the location of geological faults, represented by the white dashed lines), elbows of capture (based on hydrographic layers, in blue) and wind gaps (based on Digital Elevation Model, DEM, where areas of low and high elevation are shown in dark and light grey, respectively). Note that the Pardo River basin is associated with the yellow lineage (shown here by two dots), which is the presumed source of the Southern lineage in the Contas River basin via river capture
DISCUSSION
Our analyses based on genomic data and colour traits support the occurrence of hybridization between lineages of Nematocharax that clearly correspond to two divergent gene pools, with clinal transition between parental forms across the Gongogi River and different levels of hybridization in the tributaries (Figures 1 and 3). Our findings also help elucidate the origin of this hybrid zone, given the geological and geomorphological evidence of a river capture event that probably caused secondary contact in that stretch of the Contas River basin (Figure 6).
Although population genomics studies with freshwater fishes have provided valuable insights into the ecological and evolutionary contexts of hybridization (e.g. Mandeville et al., 2017; McKelvey et al., 2016; Sotola, Ruppel, Bonner, Nice, & Martin, 2019), investigations on the influence of past and current river properties, especially using geomorphological data, are still scarce. In addition to inferences about the origin and structure of the hybrid zone in Nematocharax, we discuss the taxonomic implications of these results, including what our genomic data suggest about species boundaries in this fish genus.
Origin of the hybrid zone
Distinguishing between differentiation in a continuous population as a direct response to the environment versus divergence in allopatry followed by secondary contact is not an easy task when determining the origin of a hybrid zone (Harrison & Larson, 2016). In this study, the evidence points to a scenario of secondary contact and interbreeding between lineages of N. venustus that diverged in allopatry, forming a hybrid zone around the Cambiriba Stream, in the Gongogi River sub‐basin. The most likely explanation is that the current configuration of the rivers was not stable over time, with the Southern lineage reaching the Contas River basin (where the Northern lineage was already present) via dispersal allowed by a river capture (Figure 6). Our geomorphological data show typical evidence of this type of event, such as elbows of capture, wind gaps and geological faults (de Oliveira, 2010), thus providing a potential route for dispersal of the Southern lineage from the Pardo to the Contas river basin. This hypothesis is reinforced by the sharing of the same mtDNA haplotype of the Southern lineage between both river basins and by evidence of demographic expansion for the Southern lineage when including populations outside the Contas River basin (Figure 6; Barreto et al., in preparation).
Phylogeographic analyses encompassing lineages of Nematocharax throughout the entire distribution of the genus and using both COI sequences and RADseq data show that the Northern and Southern lineages are not sister groups (Barreto et al., in preparation). According to the COI, the time to the most recent common ancestor (TMRCA) is 0.48 (0.27–0.62) Mya for the Northern lineage and 0.43 (0.31–0.67) Mya for the Southern lineage. This period corresponds to the Pleistocene, during which tectonic reactivations of geological faults caused several topographic changes in coastal drainages of eastern Brazil, including drainage rearrangements (Ribeiro, 2006; Saadi et al., 2002), which may have promoted secondary contact between aquatic lineages.
Intriguingly, the hybrid zone analysed here includes the type locality (site 10 on the map; Figure 1) of a putative new species of Nematocharax (N. costai) that was synonymized with N. venustus (Bragança et al., 2013; Menezes et al., 2015). Despite the clear differentiation between the Northern and Southern lineages in mtDNA, RADseq and colour pattern, our data do not support N. costai in the way it was described by Bragança et al. (2013) based on morphological and morphometric data from five specimens that our analyses show to occur in an area with a high frequency of hybrids (site 10; Figure 1). As such, our work corroborates the synonymization of N. costai and N. venustus (see Menezes et al., 2015), but highlights the need for a taxonomically focused study on what is currently recognized as a single taxon, N. venustus.
Dynamics of the hybrid zone
Geographic clines allow verification of how the frequency of alleles or phenotypes vary along a transect, where smooth transitions suggest weak selection and less effective reproductive barriers, whereas stepped clines indicate strong selective pressures and substantial barriers to gene flow between lineages (Barton & Gale, 1993). Indeed, the clines estimated from the ancestry coefficients (Figure 5d) and for most individual RADseq loci (Figure S2) are more gradual and largely concordant, indicating that the strength of selection is probably weak and uniform across these loci. The presence of a few abruptly stepped clines (Figure S2) along the hybrid zone, in turn, could indicate that some loci are in regions of strong selection, which may be preventing genetic homogenization between parental groups (Feder, Flaxman, Egan, Comeault, & Nosil, 2013). These results exemplify the 'semi‐permeable' nature of hybrid zones, that is the idea that gene flow and reproductive isolation can be thought of in terms of genomic regions rather than entire genomes (Harrison & Larson, 2014; Kawakami & Butlin, 2012).
Regarding the colour traits (Figures 2and5a, b), it is possible that the differences between parental lineages have diverged before the secondary contact. Thus, the presence of individuals with mixed colour patterns (Figure 2b) at the centre of the hybrid zone would be a clear evidence of hybridization. However, because the abundance of F2 hybrids and absence of F1 offspring and early generation backcrosses, we may have detected a hybrid lineage that is becoming reproductively independent from the parental lineages, which means that the rates of hybridization and introgression have decreased over time. Colour traits would be of fundamental importance in this case, given their potential role on species recognition and mating success (e.g. Couldridge & Alexander, 2002; Houde, 1987).
It is worth noting that MCMC chains in the NewHybrids software can sometimes fail to converge, causing nearly all individuals to be recovered as F2 hybrids (Wringe et al., 2017b). However, our MCMC chains converged properly, as shown by hybriddetective, and the predominance of F2 hybrids is in accordance with our other findings, such as the clear spatial structuring of hybrid individuals within the transition zone between areas where parental lineages are found. Although the scale of dispersal for Nematocharax is unknown, it is plausible to assume that the spatial segregation between hybrids and parental populations is due to the small scale of movement generally described for small characins (Lucas & Baras, 2001). Therefore, the diagnosable genetic and morphological differences between the Southern and Northern lineages could be justified. Additionally, clinal variation across the hybrid zone may reflect isolation by distance in a linear riverine system and gradual admixture of the divergent lineages.
Even though mating behaviour has not yet been studied in Nematocharax, sexual selection might also play a role on the maintenance of this hybrid zone. Indeed, empirical and theoretical studies have demonstrated that assortative mating can influence the incidence and rate of hybridization (e.g. Culumber, Ochoa, & Rosenthal, 2014; MacCallum, Nürnberger, Barton, & Szymura, 1998; Vines, 2002). The sexual dimorphism found in Nematocharax, including the presence of hooks and spinules on fins and elongated fins in maturing and mature males, is also commonly reported in small characid species (Dagosta, Marinho, & Camelier, 2014; Marinho, Dagosta, & Birindelli, 2014; Zanata & Camelier, 2009) being usually related to male display behaviour and female mate choice (Bischoff, Gould, & Rubenstein, 1985). Further investigation can directly test whether assortative mating of hybrids contributes to reproductive isolation from the parental lineages.
Possible role of riverscape properties
Small South American characins have been described as resident or small‐scale migrants (Lucas & Baras, 2001). If this is true for Nematocharax, then it could imply restricted movement relative to the width of the hybrid zone, meaning that few pure individuals are likely to be found in the centre. Supporting this idea, field and aquarium observations (pers. obs.) indicate that Nematocharax has a territorial behaviour (particularly males), spending most of its lifetime in a very restricted area (Gerking, 1953). For these species, riverscape architecture may have a pronounced effect on the structure of the hybrid zone, influencing the spatial distribution of hybrids and pure individuals. For example, the Southern lineage probably entered the Gongogi River sub‐basin via river capture (Figure 6) and dispersed mainly along the main river channel. However, hybridization only occurs in three of the five tributaries (Figures 1 and 3), which might be associated with differential dispersal opportunities related to the altitudinal profile and the angle of connection to the main river. To illustrate this hypothesis, Figure 1 shows that, at the confluence with the Cambiriba Stream (site 7), the Gongogi River forms a sharp bend, thus creating a counterflow and favouring the formation of flooded and shallow areas particularly suitable for N. venustus (pers. obs.). This may help understand why secondary contact was predominant in tributaries such as the Cambiriba Stream, but absent in others (Figures 1 and 3).
Interestingly, artificial reservoirs have been built in the hybrid zone analysed here, specifically in location 10 on the map (Figure 1; Balneário Guaíra, type locality of the synonymized species) and in the connection point between the main river and the tributary where the location 5 is situated (Figure 1; Balneário Beach Park). Thus, although they are very recent, these places may have been responsible for imposing additional physical barriers or habitat disturbances (e.g. changes in river depth, temperature and waterflow), with potential influence on fish population structure (Valenzuela‐Aguayo, McCracken, Manosalva, Habit, & Ruzzante, 2020).
Overall, our findings highlight that biological traits and the past and current riverscape architectures can interact to originate and structure hybrid zones in freshwater fish species, even though it is difficult to disentangle the individual effects of a particular factor. Also, our data shed light on the dynamics of hybridization between lineages of N. venustus, with direct relevance to the study of interspecific boundaries in Nematocharax. Our work illustrates that the understanding of a hybrid zone is not dependent on the taxonomic status, but rather on the nature of differences between lineages or groups (Gompert & Buerkle, 2016; Mallet, 1995). Additional studies using, for example, behavioural data and experimental measures of relative fitness of hybrid and parental individuals are required to expand our knowledge on the ecological and evolutionary consequences of hybridization in this system.
ACKNOWLEDGMENTS
The authors thank the Rede de Plataformas Tecnológicas for the use of its Sequencing Facility in FIOCRUZ‐Bahia, the Core Facility for Scientific Research—University of São Paulo (CEFAP‐USP/GENIAL) for the use of its Illumina NextSeq sequencer, and the National Laboratory for Scientific Computing (LNCC/MCTI, Brazil) for providing HPC resources of the SDumont supercomputer, which have contributed to the research results reported within this paper. This study was financed in part by CAPES (Finance Code 001, PDSE proc. n. 88881.186858/2018‐01 and proc. n. 23038.000776/2017‐54), FAPESB (RED0045/2014 and JCB0026/2016) and CNPq (Research Productivity Fellowship n. 307037/2018‐5 and proc. n. 465767/2014‐1). The collection licence (number 51856‐2) was provided by the Instituto Chico Mendes de Conservação da Biodiversidade (ICMBio/SISBIO). This study was approved by the Ethics Committee of Utilization of Animals from the Universidade Estadual do Sudoeste da Bahia (CEUA/UESB, number 71/2014). All authors would like to thank Stuart J. E. Baird and two anonymous reviewers for their helpful comments that improved the manuscript.
CONFLICT OF INTEREST
The authors have no conflicts of interest to declare.
Peer Review
The peer review history for this article is available at https://publons.com/publon/10.1111/jeb.13689.
DATA AVAILABILITY STATEMENT
Data supporting this publication can be found on Dryad: https://doi.org/10.5061/dryad.mpg4f4qwx.