It matters how you google it? Using agent-based testing to assess the impact of user choices in search queries and algorithmic personalization on political Google Search results

Overview of experimental conditions and search terms

Condition	Algorithmic personalization	User choice
Immigration
Pro	vluchteling	vluchtelingen (pro_1)
	vluchtelingen nederland	vluchtelingencrisis (pro_2)
	vluchtelingen in nederland	vluchtelingenproblematiek (pro_3)
	… (n = 14)
Neutral	immigranten	immigratie (neutral_1)
	immigratie nederland	imigranten (neutral_2)
	migratie	immigratiecijfers (neutral_3)
	… (n = 13)
Anti	opvang asielzoekers	asielzoekers (anti_1)
	illegalen	azc (anti_2)
	asiel beleid	criminaliteit onder asielzoekers (anti_3)
	… (n = 10)
Mixed	vluchteling	–
	vluchtelingen nederland
	opvang asielzoekers
	… (n = 20)
Climate
High	zonnepanelen	klimaat politiek (high_1)
	politiek klimaat	politiek en klimaat (high_2)
	klimaattop	klimaatakkoord (high_3)
	… (n = 27)
Neutral	klimaatverandering	klimaat (neutral_1)
	klimaat verandering	milieu (neutral_2)
	klimaatveranderingen	klimaat veranderingen (neutral_3)
	… (n = 4)
Low	opwarming	opwarming aarde (low_1)
	opwarming van de aarde	zeespiegel stijging (low_2)
	stijging zeespiegel	aarde opwarming (low_3)
	… (n = 26)
Mixed	opwarming	–
	opwarming van de aarde
	zonnepanelen
	… (n = 52)
Both issues
Unrelated	npo	–
	nederland
	rtl
	… (n = 25)
None	–	–

Note. The full list of search terms used for each algorithmic personalization condition is included in Supplementary Tables SM2 (immigration) and SM3 (climate).

Table 1.

Additionally, three other algorithmic personalization conditions were created. Mixed search behavior combines queries from both the Pro and Anti or the High and Low conditions in equal shares, capturing the seeking of divergent opinions (Menchen-Trevino et al., 2023). Unrelated search behavior indicates an interest in (news) topics other than immigration and climate. It is constructed by extracting the top 20 most frequent named entities in Dutch news headlines.⁵ Finally, the None condition represents no search history, measuring the absence of algorithmic personalization.

Data collection

We used ScrapeBot (Haim, 2020; see also the GitHub repository⁶), an open-source Selenium-based tool, to implement ABT. ScrapeBot, which allows for the simulation of human interaction with web pages in a visual browser, has been utilized in similar research designs (Haim et al., 2017; Haim et al., 2018).

We created 114 agents for each of the 18 conditions per political issue (N_immigration = 2052; N_climate = 2052). As one server can only conduct a single search for one agent at a time, we parallelize the data collection across 19 servers (i.e., 108 agents per server), ensuring each condition is equally represented on every server. This approach allows us to scale up the number of agents while ensuring limited time-related influence on the search results.
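The agent counts above can be checked with a short sketch. It assumes that the 18 conditions are the product of six search histories and three user choice conditions (shown here with the immigration labels), and that agents are spread over servers round-robin; the function name and the round-robin rule are illustrative assumptions, not the study's actual allocation code.

```python
from collections import Counter
from itertools import product

# Assumed labels: six search histories x three user choice conditions = 18 conditions.
HISTORIES = ["pro", "neutral", "anti", "mixed", "unrelated", "none"]
USER_CHOICES = ["pro", "neutral", "anti"]
AGENTS_PER_CONDITION = 114
N_SERVERS = 19


def allocate_agents():
    """Return (server, history, user_choice, agent_index) tuples.

    Round-robin assignment is an assumption; it puts 114 / 19 = 6 agents
    of every condition on every server.
    """
    allocation = []
    for history, choice in product(HISTORIES, USER_CHOICES):
        for index in range(AGENTS_PER_CONDITION):
            server = index % N_SERVERS
            allocation.append((server, history, choice, index))
    return allocation


allocation = allocate_agents()
agents_per_server = Counter(server for server, _, _, _ in allocation)
agents_per_server_condition = Counter((s, h, c) for s, h, c, _ in allocation)

print(len(allocation))                            # total agents per issue
print(set(agents_per_server.values()))            # agents on each server
print(set(agents_per_server_condition.values()))  # agents per condition per server
```

Under these assumptions every server hosts the same number of agents from every condition, which is the balance property the design relies on.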

The data collection is performed consecutively for each of the two issues. The procedure is divided into two phases: training and testing (see Figure 1 for an overview). During the training phase, we created a search history for each agent by conducting multiple Google searches with cookie collection turned on. The program randomly selects an agent, retrieves its saved cookies, and randomly selects a search query from the agent’s assigned set of search queries. It then searches this query on www.google.nl and retrieves the autocomplete predictions, the first SERP’s HTML, and the cookies. It then moves on to the next agent to repeat the process. For an agent’s first run, a fresh browser environment (without histories or cookies) is initiated, the cookie statement is accepted, and the training procedure is followed. The agent’s cookies are then stored in a central database, from which they are retrieved and added to the browser environment for every subsequent run that agent performs.
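The training loop can be sketched as follows. This is a simplified simulation, not ScrapeBot code: `training_step`, `fake_search`, and the in-memory `cookie_store` dictionary are hypothetical stand-ins for the browser automation and the central cookie database.

```python
import random


def training_step(agents, cookie_store, search_fn, rng):
    """Run one training search for a randomly selected agent.

    agents maps an agent id to its assigned list of queries;
    cookie_store maps an agent id to its saved cookies.
    """
    agent_id = rng.choice(sorted(agents))
    cookies = cookie_store.get(agent_id)   # None on the agent's first run
    query = rng.choice(agents[agent_id])
    result = search_fn(query, cookies)     # hypothetical browser call
    cookie_store[agent_id] = result["cookies"]  # persist for the next run
    return agent_id, query, result


def fake_search(query, cookies):
    """Stand-in for a real search: records the query in a fake cookie."""
    history = list(cookies["history"]) if cookies else []
    history.append(query)
    return {
        "autocomplete": [],
        "serp_html": "<html>" + query + "</html>",
        "cookies": {"history": history},
    }


rng = random.Random(0)
agents = {"agent_1": ["vluchteling"], "agent_2": ["immigranten", "migratie"]}
cookie_store = {}
for _ in range(10):
    training_step(agents, cookie_store, fake_search, rng)
```

Each call loads whatever the agent accumulated so far and writes the updated cookies back, so an agent's search history grows by one query per run.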

Overview of the data collection phases, with two main phases: the training phase on the left and testing phase on the right. The figure shows the sequence of different steps in these phases.

Figure 1.

Overview of data collection procedure.

Note. The data collection procedure is parallelized across 19 servers.

Google identifies each agent via its associated cookies; we did not create individual Google accounts for the agents, following the methodology of similar studies (Haim et al., 2017; Haroon et al., 2023). Google Search’s documentation suggests that while Google Search personalizes search results based on activity saved in Google accounts (Google, 2023), it also personalizes search results based on cookies, even for users who are not logged in but have accepted the cookie statement (Google, 2024). The training phases ended after 13 days (Immigration: August 18–31, 2023) or 11 days (Climate: September 4–15, 2023), during which each agent was trained on average M_immigration = 257.8 and M_climate = 217.6 times.⁷

The testing phase began within an hour of the training phase ending for all agents. Agents trained according to the five different search histories are assigned to user choice conditions. During the testing phase, each agent conducts three testing searches, one for each of the three search queries. For the condition without search history, we use browser environments without any cookies loaded. These agents lack any search history, preventing personalization based on past searches.

During the testing procedure, the program randomly selects an agent and query combination, loads their training phase cookies (if applicable), navigates to www.google.nl, performs a search using a query from the agent’s assigned set, retrieves the autocomplete predictions and the first SERP’s HTML, and proceeds to the next agent. The testing phases ran on one day (Immigration: August 31, 2023; Climate: September 15, 2023).

Following best practices for ABT (Schwabl et al., 2024) and based on insights from Google Search’s documentation (Google, 2023), previous studies (Haim, 2020), and pretesting, we control for location, language settings, and browser settings in all searches: We set the browser-language settings to Dutch, use servers located in Amsterdam, and use the same display size, the same browser type (Chrome), and the most common Dutch user-agent string. While it is difficult to fully control for time-related influence on the search results, we minimized the time difference of testing searches between agents by parallelization across 19 servers: All testing searches are completed within 3 hours and 39 minutes.

Data preparation

Using the WebSearcher parsers⁸ (Robertson & Wilson, 2020), we extracted the URLs, textual content (e.g., title, snippet text), and SERP features, and their (sub)rank positions from the HTML of each SERP collected during the testing phase. This process created a dataset where each SERP is represented by multiple rows, each corresponding to a search result. We identified regular search results and 12 SERP features. On average, each SERP contained 23.19 search results (SD = 4.89), 12.18 components (SD = 1.19) and 4.11 distinct SERP features (SD = 1.61).⁹ Fewer than 1% of cases involve unparseable SERP features, and in fewer than 1% of both climate and immigration cases the content of features could not be parsed.

We extracted the information source (domain) from all URLs. The type of information source was determined using an existing list of domain categorizations (Loecherbach, 2023). Initially, this categorized approximately 80% of search results. An additional coding round was conducted to achieve full coverage. The sources are classified into news, background information (about public events and figures, e.g., Wikipedia, governmental or non-profit organizations), gateway (e.g., social media, search engines), and other websites unrelated to news and politics.
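A minimal sketch of this step is given below. It assumes that the information source is the last two labels of the host name, which is a simplification (it would mishandle suffixes such as .co.uk), and it uses a tiny illustrative lookup table in place of the Loecherbach (2023) list; the function names and the "uncoded" label are hypothetical.

```python
from urllib.parse import urlparse

# Illustrative subset only; the study used an existing categorization list
# plus an additional manual coding round.
DOMAIN_TYPES = {
    "nos.nl": "news",
    "wikipedia.org": "bg info",
    "twitter.com": "gateway",
}


def extract_domain(url):
    """Return the last two labels of the host name, e.g. 'wikipedia.org'."""
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


def classify_source(url):
    """Look up the source type; unknown domains are left for manual coding."""
    return DOMAIN_TYPES.get(extract_domain(url), "uncoded")


print(extract_domain("https://nl.wikipedia.org/wiki/Klimaat"))
print(classify_source("https://www.nos.nl/artikel/voorbeeld"))
print(classify_source("https://example.com/"))
```

Domains that fall through to "uncoded" correspond to the roughly 20% of results that required the additional coding round.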

Dissimilarity metrics

To study whether user choices in search queries and algorithmic personalization lead to divergent search results, we need metrics to assess the (dis)similarity of search results across experimental conditions. Pairwise similarity metrics, commonly used in information retrieval literature, are suitable for this purpose. An appropriate similarity measure should account for two curation decisions by search engines: the selection of search results from the index, and their ranking. Two SERPs can be identical in selection, but differently ranked. This is crucial, given the strong order effects in individuals’ interaction with search results, with lower-ranked results being less likely to be selected (Urman & Makhortykh, 2023).

We use Rank-Biased Overlap (RBO), which was developed for comparing search results (Webber et al., 2010) and is increasingly employed in similar research (Makhortykh et al., 2020; Robertson et al., 2018; Urman et al., 2022). RBO addresses both selection and ranking by assigning greater weight to top results. The p (persistence) parameter determines this weighting, with smaller values emphasizing top results more. Following prior research (Urman et al., 2022), we computed RBO (p = .8) for a ranking-sensitive metric. To measure differences in SERPs, we inverted the RBO (1 − RBO) to obtain Inverted RBO (IRBO), ranging from 0 (identical) to 1 (completely dissimilar). IRBO is computed on the sources listed on the SERP, excluding search results without information sources like People Also Ask and images, across all possible SERP pairs, resulting in approximately 19 million comparisons per issue.
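The metric can be illustrated with a short sketch. It implements the extrapolated form of RBO from Webber et al. (2010) under two simplifying assumptions that may differ from the study's implementation: repeated sources within a SERP are collapsed to their first occurrence, and both lists are truncated to the length of the shorter one. The function names are illustrative.

```python
import math


def _dedupe(items):
    """Keep the first occurrence of each item, preserving order."""
    seen, unique = set(), []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


def rbo_ext(list_s, list_t, p=0.8):
    """Extrapolated RBO, evaluated at the depth of the shorter list."""
    s, t = _dedupe(list_s), _dedupe(list_t)
    k = min(len(s), len(t))
    if k == 0:
        return 1.0 if len(s) == len(t) else 0.0
    seen_s, seen_t = set(), set()
    overlap = 0        # size of the intersection of the two prefixes
    weighted = 0.0     # running sum of (overlap / d) * p**d
    for d in range(1, k + 1):
        a, b = s[d - 1], t[d - 1]
        if a == b:
            overlap += 1
        else:
            overlap += (a in seen_t) + (b in seen_s)
        seen_s.add(a)
        seen_t.add(b)
        weighted += (overlap / d) * p ** d
    return (overlap / k) * p ** k + ((1 - p) / p) * weighted


def irbo(list_s, list_t, p=0.8):
    """Inverted RBO: 0 for identical rankings, 1 for disjoint ones."""
    return 1.0 - rbo_ext(list_s, list_t, p)


print(irbo(["ad.nl", "nrc.nl"], ["nrc.nl", "ad.nl"]))  # same sources, swapped order

# Number of unordered SERP pairs, assuming 2052 agents x 3 testing searches each.
n_pairs = math.comb(2052 * 3, 2)
print(n_pairs)  # 18945090, i.e. roughly 19 million
```

With p = .8 the swapped two-item example yields an IRBO of about 0.2, which shows that the metric penalizes reordering even when the selection of sources is identical.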

As a robustness check, we replicate our analyses using IRBO (p = .95), which is less attuned to ranking (Urman et al., 2022), and (Inverted) Jaccard Index, another common similarity measure for search results (e.g., Hannak et al., 2013; Kliman-Silver et al., 2015; Makhortykh et al., 2020; Puschmann, 2019). These additional analyses are detailed in the Supplementary material.
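For comparison, the Inverted Jaccard Index ignores ranking entirely and only considers the set of sources. The sketch below assumes the standard set-based definition; treating two empty SERPs as identical is our own convention for the edge case.

```python
def inverted_jaccard(sources_a, sources_b):
    """1 - |A ∩ B| / |A ∪ B| on the sets of sources of two SERPs."""
    a, b = set(sources_a), set(sources_b)
    if not a and not b:
        return 0.0  # assumed convention: two empty SERPs count as identical
    return 1.0 - len(a & b) / len(a | b)


# One shared source out of three distinct sources in total.
print(inverted_jaccard(["ad.nl", "nos.nl"], ["nos.nl", "cbs.nl"]))
```

Because order plays no role, two SERPs with the same sources in a different order receive a score of 0 here, whereas IRBO would register a difference.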

Results

Algorithmic personalization versus user choices in search queries

Figures 2 (immigration) and 3 (climate change) illustrate the average dissimilarity of information sources on SERPs returned to different search queries, grouped by user choice conditions on both axes. To assess whether algorithmic personalization resulted in divergent search results, we can examine the standard deviations of the mean dissimilarity scores across algorithmic personalization conditions within each search query-pair. These are presented in brackets in Figures 2 and 3 (see Supplementary Figures SM2 (immigration) and SM3 (climate change) for dissimilarity scores by algorithmic personalization-pairs per search query). For both issues and across all search queries, the standard deviations consistently approach zero, indicating minimal variation in dissimilarity scores across different algorithmic personalization conditions. This implies that the differences in the selection and ranking of information sources on the SERP can be attributed very little, if at all, to the agents’ differently trained search histories. Hence, algorithmic personalization did not lead to divergent information sources.
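The summary statistic described above can be sketched as follows. The input layout, the function name, and the use of the population standard deviation are assumptions made for illustration; the toy scores are invented and are not results from the study.

```python
from collections import defaultdict
from statistics import mean, pstdev


def sd_across_personalization(rows):
    """rows: iterable of (query_pair, personalization_pair, irbo_score).

    For every query pair, average the scores within each personalization
    pair and return the standard deviation of those averages.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for query_pair, personalization_pair, score in rows:
        grouped[query_pair][personalization_pair].append(score)
    return {
        query_pair: pstdev([mean(scores) for scores in by_pair.values()])
        for query_pair, by_pair in grouped.items()
    }


# Invented toy scores for two query pairs.
toy_rows = [
    (("pro_1", "anti_1"), ("pro", "anti"), 0.80),
    (("pro_1", "anti_1"), ("pro", "none"), 0.80),
    (("pro_1", "pro_1"), ("pro", "anti"), 0.10),
    (("pro_1", "pro_1"), ("pro", "none"), 0.30),
]
print(sd_across_personalization(toy_rows))
```

A standard deviation near zero for a query pair means that the average dissimilarity hardly depends on which search histories the two agents were trained with.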

A triangular heatmap displaying the average dissimilarity. The x axis and y axis list search queries (anti_1 to pro_3).

Figure 2.

Immigration: Average dissimilarity of sources between and within search queries.

Note. Values represent the average IRBO (p = 0.8) for each search query-pair, grouped by user choice condition. The scores on the diagonal represent the dissimilarity within search queries, while the scores below the diagonal represent the dissimilarity between search queries. Values in brackets represent the standard deviation of the mean dissimilarity scores for algorithmic personalization conditions, grouped by search query-pair. The standard deviations are close to zero, indicating little impact of algorithmic personalization.

A triangular heatmap displaying the average dissimilarity. The x axis and y axis list search queries (low_1 to high_3).

Figure 3.

Climate: Average dissimilarity of sources between and within search queries.

Note. Values represent the average IRBO (p = 0.8) for each search query-pair, grouped by user choice condition. The scores on the diagonal represent the dissimilarity within search queries, while the scores below the diagonal represent the dissimilarity between search queries. Values in brackets represent the standard deviation of the mean dissimilarity scores for algorithmic personalization conditions, grouped by search query-pair. The standard deviations are close to zero, indicating little impact of algorithmic personalization.

To assess the impact of user choices in search queries, we compare the dissimilarity scores between different search queries (i.e., the scores below the diagonal) to the scores within each search query (i.e., the scores on the diagonal). We also examine the scores of search queries within the same or different user choice conditions (i.e., clustered together on the axes).

For immigration-related queries (Figure 2), we observe a relatively high dissimilarity between search results for different search queries. The scores between different immigration-related search queries range from .75 to .99, indicating that 75% to nearly all of the information sources near the top of the SERPs are unique to each query. This level of dissimilarity is substantially higher than the dissimilarity observed within the same search query, which ranges from .09 to .56. Some dissimilarity in search results for the same query is likely, possibly due to factors like randomization by Google Search (Urman et al., 2022), minor time differences in data collection, or other unknown factors. The dissimilarity scores belonging to the same user choice conditions are not substantially lower than between queries, indicating that search results are unique at the granular level of search queries rather than aggregated user choices.

These patterns are less clear for climate-related search results (Figure 3). In line with the immigration search results, we generally observe higher dissimilarity scores between different search queries compared to within the same search queries. However, there are a few exceptions. The dissimilarity between search results returned to low_1, low_3, and neutral_3 is relatively low, equivalent to the level of dissimilarity within the same search queries. This indicates that these search result pages share many of the same information sources in top-ranked positions. Similar to immigration-related search results, these scores do not clearly cluster at the level of user choices. While the similarity between search results returned for low_1 and low_3 might suggest clustering at the user choice level, this is rather explained by the nearly identical nature of these search queries (i.e., same words in different order).

The Supplementary material presents the analyses using two alternative metrics: Inverted Jaccard Index and IRBO (p = .95). The patterns of dissimilarity are similar between IRBO (p = .80), presented in Figures 2 and 3, and Inverted Jaccard Index, while for IRBO (p = .95) the dissimilarity between search results for the same queries is substantially higher. This indicates a high degree of overlap between search results returned to the same search query, especially among top-ranked results, and that differences in search results for identical queries occur in the “long tail.” In other words, while the most relevant and highly ranked search results are similar, the results start to diverge with less relevant and lower-ranked search results, which is consistent with findings from previous research (Steiner et al., 2022; Urman et al., 2022).

SERP composition

The previous results described the extent to which search results vary due to user choices in search queries and algorithmic personalization (RQ1); this section examines how the SERPs vary with different search queries, specifically focusing on the information source, type of source, and the display of SERP features (RQ2). Since algorithmic personalization showed no impact on search results, we focus on user choices in search queries specifically by selecting only agents without search history. In this section, we present a weighted proportion to account for rank order, assigning weights to search results inversely proportional to their rank, giving higher weights to top-ranked search results (i.e., $\frac{1}{rank + 1}$), and summing these weights for each source (type).
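The weighting can be illustrated with a small sketch. It assumes that ranks start at 0 (so the top result receives weight 1) and that the summed weights are divided by the total weight to obtain a proportion; both points, like the function name, are assumptions rather than details confirmed in the text.

```python
from collections import defaultdict


def weighted_proportions(results):
    """results: iterable of (rank, source) with rank starting at 0 (assumed).

    Each result contributes 1 / (rank + 1); weights are summed per source
    and normalized by the total weight.
    """
    totals = defaultdict(float)
    for rank, source in results:
        totals[source] += 1.0 / (rank + 1)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {source: weight / grand_total for source, weight in totals.items()}


# Toy SERP: cbs.nl at ranks 0 and 2, nos.nl at rank 1.
serp = [(0, "cbs.nl"), (1, "nos.nl"), (2, "cbs.nl")]
print(weighted_proportions(serp))
```

In the toy SERP, cbs.nl collects weights 1 and 1/3 against 1/2 for nos.nl, giving it 8/11 of the total weight, which shows how a top-ranked source dominates the proportion.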

The large variation in source types and SERP features displayed in Figure 4 reflects the high dissimilarity scores between immigration queries. A few points stand out. While news sources are prevalent for two search queries (i.e., pro_1 and anti_1) due to the display of the Top Stories feature, news is mostly absent or has a much lower weighted proportion for other queries. Furthermore, the SERPs for anti_2, anti_3, and neutral_1 contain a relatively high weighted proportion of gateway websites like social media. Specifically, Twitter emerges as a frequent top-ranked source for anti_2 and neutral_1 (see Table 2), related to the presence of a Twitter feature showcasing “top tweets” about the queried topic. A manual inspection of these tweets reveals that those for anti_2 contain mostly news headlines, whereas the tweets for neutral_1 convey strong negative opinions about immigration. Furthermore, anti_3 features a scholarly articles SERP feature, making google.nl a frequent and top-ranked gateway source.

Subfigure a is a horizontal bar chart showing weighted proportions for four types of information sources: background information, news, gateway, and other, across different search queries (low_1 to high_3). Subfigure b is a heatmap displaying the share of SERPs containing a SERP feature. The y axis lists search queries (low_1 to high_3), and the x axis lists SERP features: Knowledge Panel, Local results, Top Stories, Twitter, Videos, and Featured Snippet.

Figure 4.

Immigration: SERP composition. (a) Information source types. (b) SERP features.

Note. (a) Weights are assigned to search results inversely proportional to their rank. (b) Values indicate the share of SERPs containing a specific SERP feature for each search query.

Table 2.

Immigration: Most frequent information sources per user choice condition by weighted proportion

anti_1	prop^w	type	anti_2	prop^w	type	anti_3	prop^w	type
ad.nl	0.13	news	twitter.com	0.41	gateway	google.nl	0.63	gateway
nrc.nl	0.11	news	coa.nl	0.20	bg info	rijksoverheid.nl	0.10	bg info
rtlnieuws.nl	0.11	news	wikipedia.org	0.12	bg info	quest.nl	0.05	other
volkskrant.nl	0.11	news	youtube.com	0.06	gateway	vpro.nl	0.04	news
looopings.nl	0.10	other	azczutphen.nl	0.04	other	nos.nl	0.03	news

neutral_1	prop^w	type	neutral_2	prop^w	type	neutral_3	prop^w	type
twitter.com	0.34	gateway	cbs.nl	0.40	bg info	cbs.nl	0.54	bg info
ind.nl	0.31	other	encyclo.nl	0.13	gateway	rijksoverheid.nl	0.13	bg info
rijksoverheid.nl	0.10	bg info	woorden.org	0.09	other	nos.nl	0.08	news
cbs.nl	0.08	bg info	rijksoverheid.nl	0.08	bg info	adviesraadmigratie.nl	0.07	bg info
wikipedia.org	0.07	bg info	uu.nl	0.07	other	europa.eu	0.06	bg info

pro_1	prop^w	type	pro_2	prop^w	type	pro_3	prop^w	type
ad.nl	0.14	news	wikipedia.org	0.48	bg info	rijksoverheid.nl	0.40	bg info
telegraaf.nl	0.14	news	vluchteling.nl	0.14	bg info	universiteitleiden.nl	0.13	other
trouw.nl	0.14	news	rodekruis.nl	0.08	other	amnesty.nl	0.10	bg info
nu.nl	0.13	news	universiteitleiden.nl	0.06	other	unhcr.org	0.08	bg info
wikipedia.org	0.12	bg info	nrc.nl	0.05	news	bnnvara.nl	0.07	news

Note. prop^w: Weighted proportion. Weights are assigned to search results inversely proportional to their rank, giving higher weights to higher ranked search results (i.e., $\frac{1}{rank + 1}$), which are then summed for each source.

The SERP composition for climate queries is presented in Figure 5, and it shows that the search results generally consist of mostly background information sources (e.g., governmental websites, Wikipedia, see Table 3) or websites unrelated to news or politics. Unlike for immigration queries, gateway websites and, to a lesser extent, news are rarely among the frequent and highly ranked search results. An exception is the search results for neutral_1, which show a large share of news sources due to the Top Stories feature.

Figure 5.

Climate: SERP composition. (a) Information source types. (b) SERP features.

Note. (a) Weights are assigned to search results inversely proportional to their rank. (b) Values indicate the share of SERPs containing a specific SERP feature for each search query.

Table 3.

Climate: Most frequent information sources per user choice condition by weighted proportion

low_1	prop^w	type	low_2	prop^w	type	low_3	prop^w	type
klimaatakkoord.nl	0.37	bg info	wikipedia.org	0.36	bg info	klimaatakkoord.nl	0.25	bg info
klimaat.be	0.11	bg info	knmi.nl	0.23	other	europa.eu	0.17	bg info
wwf.nl	0.10	other	rijksoverheid.nl	0.08	bg info	klimaat.be	0.12	bg info
nos.nl	0.09	news	nos.nl	0.08	news	wwf.nl	0.10	other
wikipedia.org	0.08	bg info	deltaprogramma.nl	0.06	bg info	rijksoverheid.nl	0.07	bg info

neutral_1	prop^w	type	neutral_2	prop^w	type	neutral_3	prop^w	type
telegraaf.nl	0.22	news	wikipedia.org	0.44	bg info	rijksoverheid.nl	0.43	bg info
knmi.nl	0.17	other	milieucentraal.nl	0.12	other	wwf.nl	0.13	other
ad.nl	0.14	news	rijksoverheid.nl	0.06	bg info	klimaatadaptatienederland.nl	0.12	bg info
fd.nl	0.12	news	wiktionary.org	0.05	other	europa.eu	0.09	bg info
warmte365.nl	0.04	other	apple.com	0.04	other	wur.nl	0.06	other

high_1	prop^w	type	high_2	prop^w	type	high_3	prop^w	type
rijksoverheid.nl	0.61	bg info	rijksoverheid.nl	0.52	bg info	wikipedia.org	0.35	bg info
wwf.nl	0.08	other	wwf.nl	0.10	other	klimaatakkoord.nl	0.23	bg info
groenlinks.nl	0.07	bg info	klimaatlabelpolitiek.nl	0.07	bg info	rijksoverheid.nl	0.20	bg info
d66.nl	0.06	bg info	d66.nl	0.06	bg info	vng.nl	0.05	bg info
klimaatlabelpolitiek.nl	0.06	bg info	ipsos.com	0.05	other	emissieautoriteit.nl	0.05	other

Climate: Most frequent information sources per user choice condition by weighted proportion

low_1	prop^w	type	low_2	prop^w	type	low_3	prop^w	type
klimaatakkoord.nl	0.37	bg info	wikipedia.org	0.36	bg info	klimaatakkoord.nl	0.25	bg info
klimaat.be	0.11	bg info	knmi.nl	0.23	other	europa.eu	0.17	bg info
wwf.nl	0.10	other	rijksoverheid.nl	0.08	bg info	klimaat.be	0.12	bg info
nos.nl	0.09	news	nos.nl	0.08	news	wwf.nl	0.10	other
wikipedia.org	0.08	bg info	deltaprogramma.nl	0.06	bg info	rijksoverheid.nl	0.07	bg info

*neutral_1*	prop^w	type	*neutral_2*	prop^w	type	*neutral_3*	prop^w	type

telegraaf.nl	0.22	news	wikipedia.org	0.44	bg info	rijksoverheid.nl	0.43	bg info
knmi.nl	0.17	other	milieucentraal.nl	0.12	other	wwf.nl	0.13	other
ad.nl	0.14	news	rijksoverheid.nl	0.06	bg info	klimaatadaptatienederland.nl	0.12	bg info
fd.nl	0.12	news	wiktionary.org	0.05	other	europa.eu	0.09	bg info
warmte365.nl	0.04	other	apple.com	0.04	other	wur.nl	0.06	other

*high_1*	prop^w	type	*high_2*	prop^w	type	*high_3*	prop^w	type

rijksoverheid.nl	0.61	bg info	rijksoverheid.nl	0.52	bg info	wikipedia.org	0.35	bg info
wwf.nl	0.08	other	wwf.nl	0.10	other	klimaatakkoord.nl	0.23	bg info
groenlinks.nl	0.07	bg info	klimaatlabelpolitiek.nl	0.07	bg info	rijksoverheid.nl	0.20	bg info
d66.nl	0.06	bg info	d66.nl	0.06	bg info	vng.nl	0.05	bg info
klimaatlabelpolitiek.nl	0.06	bg info	ipsos.com	0.05	other	emissieautoriteit.nl	0.05	other

Discussion and conclusion

Search engines are a major pathway to political information and news (Wojcieszak et al., 2022). Given their significant influence on political information exposure and behavior (Epstein & Robertson, 2015; Pan et al., 2007), this study addressed concerns about the extent to which user choices and algorithmic personalization lead to divergent search results about political issues (RQ1).

Our findings confirm prior research on filter bubbles in information search (e.g., Courtois et al., 2018; Puschmann, 2019): Algorithmic personalization alone does not lead to divergent information sources in political search results. Put simply, identical queries produce the same search results, regardless of variations in users’ search histories. Instead, our study shows that the search queries people formulate play a key role in shaping their exposure to search results. When (artificial) users input different queries about the same political issue, they encounter largely distinct information sources. This pattern is evident in searches related to both immigration and climate, although search results for climate-related queries tend to be more similar.
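To make the notion of "largely distinct information sources" concrete, the following minimal sketch compares the top-five domains listed in Table 3 for three climate queries using Jaccard similarity. This is an illustrative assumption, not the similarity measure used in the study; the helper name `jaccard` and the choice to compare only the five most frequent domains are introduced here for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets of domains."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # convention chosen for this sketch: two empty lists count as identical
    return len(set_a & set_b) / len(set_a | set_b)


# Top-five domains per query, copied from Table 3 (climate, user choice conditions).
top_sources = {
    "low_1": ["klimaatakkoord.nl", "klimaat.be", "wwf.nl", "nos.nl", "wikipedia.org"],
    "low_3": ["klimaatakkoord.nl", "europa.eu", "klimaat.be", "wwf.nl", "rijksoverheid.nl"],
    "neutral_1": ["telegraaf.nl", "knmi.nl", "ad.nl", "fd.nl", "warmte365.nl"],
}

# Two queries from the same condition share three of seven distinct domains.
print(round(jaccard(top_sources["low_1"], top_sources["low_3"]), 2))  # 0.43

# A query from another condition shares none of its top-five domains with low_1.
print(jaccard(top_sources["low_1"], top_sources["neutral_1"]))  # 0.0
```

A set-based measure such as this ignores both ranking and the weighted proportions reported in the table, so it should be read only as a rough indication of overlap.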

These findings have several important implications. First, they speak to the ongoing discourse around information exposure via search engines and other digital platforms, which often emphasizes the influence of technology. Previous algorithm auditing studies typically rely on a set of standardized queries generated by researchers to examine political search results. While these studies are valuable for understanding general patterns in search engine outputs and the influence of algorithmic factors, they might not accurately reflect the variety of search queries used in real life. Our study adopted a user-centered approach by basing the research design on a typology of political queries derived from real users, which can better address questions about how people actually seek out and encounter political information through search engines.

Second, previous research indicates that the use of different search queries is associated with individuals’ political attitudes (e.g., Slechten et al., 2022; Trielli & Diakopoulos, 2022; van Hoof et al., 2024), underscoring the importance of considering a diversity of search terms. However, our study did not find that the attitude-related clusters of queries identified in previous research (van Hoof et al., 2024) shape search results in the same way. Instead, we found that the formulation of search queries is the primary factor driving the diversity of search results encountered by users. This might suggest that, for these two issues, clusters of people are not pronounced enough in their search queries to produce distinct clusters in search results. Alternatively, this could be an artifact of the design used by van Hoof et al. (2024) or of how we adapted their design to our research.

Regardless, these intriguing findings call for replication and extension. This is a case study of Google Search covering searches on two political topics during a non-campaign period in the Netherlands. The specific search queries and search results may vary based on factors like timing, country, topic, and search engine. For instance, our findings differ from those of Trielli and Diakopoulos (2022), who found a mostly similar selection of search results for divergent partisan search queries about US political candidates, possibly because dominant news stories and outlets influence results the day before elections. This indicates that further research is necessary to understand how these mechanisms unfold in different contexts.

Third, our findings complement existing scholarship on user selections of search results, which has identified ranking biases (Pan et al., 2007; Ulloa & Kacperski, 2023; Urman & Makhortykh, 2023) and confirmation biases (i.e., political selective exposure) (Knobloch-Westerwick et al., 2015; Westerwick et al., 2013). We contribute to this literature by focusing on an earlier stage in the search process: Before users can select any search result, they need to formulate a query, which determines the results that are even available for selection. While recent studies have given more attention to political search queries (e.g., Blassnig et al., 2023; Menchen-Trevino et al., 2023; Slechten et al., 2022; van Hoof et al., 2024), they have rarely examined how these queries translate into information exposure on the search result page. Future research could combine multiple steps in the search process, for example by examining whether user selections of search results further amplify biases introduced by query formulations (also see Slechten et al., 2022).

In the second part of our study, we demonstrated that search queries and search topics (immigration or climate) can yield different distributions of information sources and SERP features (RQ2). This is important because different types of sources, such as news or social media content, and the SERP features that highlight them, may inform users in varied ways. Furthermore, (AI-generated) SERP features have become integral parts of search engines, often making the SERP more influential than the linked pages (Epstein et al., 2022; Gleason et al., 2023). Interestingly, these features tend to be binary: They either consistently appear for specific keywords or topics, or not at all. This highlights the need to better understand how users contribute to these patterns.

These findings offer qualitative insights into how varying search queries on similar topics influence the type of sources users encounter when seeking political information. While interesting, these results also call for replication to assess how robust they are, for example across contexts. Moreover, with our focus on source (types), we could not determine the extent to which the SERPs actually provided different information. To uncover these patterns, future studies could develop new metrics to analyze the dissimilarity of textual content presented on SERPs or the underlying web pages.
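As a rough illustration of what a text-based dissimilarity metric could look like, the sketch below computes the cosine distance between bag-of-words representations of two result snippets. It is one possible starting point under simple assumptions (lowercased word tokens, raw term counts), not a metric proposed or validated in this study; the function names and the example snippets are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine_distance(text_a, text_b):
    """Cosine distance between term-count vectors: 0 = same word profile, 1 = no shared words."""
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # convention for this sketch: an empty text is maximally distant
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical snippets, invented for illustration only.
snippet_a = "Het klimaatakkoord bevat afspraken over CO2-reductie"
snippet_b = "Afspraken over CO2-reductie in het klimaatakkoord"
snippet_c = "Zeespiegel stijgt sneller dan verwacht"

print(cosine_distance(snippet_a, snippet_b) < cosine_distance(snippet_a, snippet_c))  # True
```

In practice, such a measure would likely need stop-word handling, term weighting, or embedding-based representations before it could capture substantive differences in the information provided.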

This study has limitations in scope and method. Our methodological approach uses survey-based search queries, which is an important step toward understanding how search query choice may affect political information exposure. However, survey-based search queries may not fully capture real-life search behavior (Blassnig et al., 2023), which affects the study’s ecological validity.

Furthermore, our implementation of ABT enabled us to isolate the effects of search history on search results, giving us more experimental control than more ecologically valid alternatives such as crowdsourced algorithm audits. This comes with several implications. First, we focused on one signal of algorithmic personalization: search history. Despite efforts to enhance ecological validity, including using real queries and an extensive training phase, it remains uncertain whether this was sufficient for search engine algorithms to pick up on. Second, as we used non-logged-in virtual agents in our implementation, our measurement of algorithmic personalization is restricted to that which occurs via cookies, and we cannot account for personalization based on activity saved in Google accounts. Third, while our interest was in understanding the “long-term” impact of search history on information exposure, as suggested by the filter bubble hypothesis, the short-term or “carry-over effects” of search history (Hannak et al., 2013), that is, the impact of searches within a brief time window, warrant further research. Therefore, we encourage future research to explore other and related signals of algorithmic personalization in information search. Fourth, while it is impossible to control for all potentially confounding factors that impact search results, we followed best practices for ABT and controlled for known technical confounding variables (Schwabl et al., 2024). Yet, there are potentially other unknown factors that may have influenced how realistic our virtual agents were, such as multiple searches coming from the specific IP address of the servers we used. Regardless, data access facilitated by platforms like Google Search, and methodological innovations when such research opportunities remain unavailable, are required to further understand the interplay between users, algorithms, and search results in future studies.

While we should stay conscious of how search engines tailor information based on our past online behavior, especially for critical topics like politics, our findings deflate the notion of algorithm-driven information bubbles in Google Search. Our study emphasizes that focusing solely on algorithms may lead to an underestimation of user-driven effects. To understand how search engines deliver political information, it is essential to account for human choices in search queries. We suggest that future research in this field prioritizes human choices in information search rather than controlling for them.

Supplementary material

Supplementary material is available at Journal of Computer-Mediated Communication online.

Data availability

The data used for this article are available in the Open Science Framework at https://osf.io/h3a94/.

Funding

The data collection of search queries was supported by the Amsterdam School of Communication Research and its Digital Communication Methods Lab. The experiment was carried out on the Dutch national e-infrastructure with the support of the SURF Foundation (Grant No. EINF-5514).

Conflicts of interest: The authors declare there is no conflict of interest.

Notes

https://googleblog.blogspot.com/2009/12/personalized-search-for-everyone.html

https://osf.io/h3a94/

https://github.com/mariekevh/ItMattersHowYouGoogleIt

https://osf.io/yu64r/

News headlines (n = 296,355) published between April 1, 2021 and June 30, 2022 by six major Dutch newspapers (De Telegraaf, NRC Handelsblad, De Volkskrant, Algemeen Dagblad, Trouw, and Het Financieele Dagblad) were obtained via AmCAT (Amsterdam Content Analysis Tool, https://github.com/amcat).

https://github.com/MarHai/ScrapeBot

Due to some initial technical errors, there is a minor two-day difference between the training phases of both issues. This discrepancy is unlikely to impact our results, as algorithmic personalization has minimal influence (see the Results section). Additionally, in extremely rare instances, training phase runs failed (24 and 38 cases for immigration and climate, respectively). Since these occurred around the same time and were not isolated to a specific server or agent, we suspect platform-related issues (e.g., extended loading times). We also do not expect these unsuccessful runs to affect our findings.

https://github.com/gitronald/WebSearcher

Search results are different from components in that a search result, e.g. a news item, can be included within a higher-level component, e.g. Top Stories.

References

Araujo, Boukes, Trilling, van Hoof, Wald, & Zhang (2020). Communication in the Digital Society Survey in the Netherlands. https://doi.org/10.17605/OSF.IO/YU64R

Arendt, Haim, & Scherr (2020). Investigating Google’s suicide-prevention efforts in celebrity suicides using agent-based testing: A cross-national study in four European countries. Social Science & Medicine, 262, 112692. https://doi.org/10.1016/j.socscimed.2019.112692

Bandy (2021). Problematic machine behavior: A systematic literature review of algorithm audits. Proceedings of the ACM on Human-Computer Interaction, (CSCW1).

Blassnig, Mitova, Pfiffner, & Reiss, M. V. (2023). Googling referendum campaigns: Analyzing online search patterns regarding Swiss direct-democratic votes. Media and Communication. https://doi.org/10.17645/mac.v11i1.6030

Boomgaarden, H. G., & Vliegenthart (2007). Explaining the rise of anti-immigrant parties: The role of news media content. Electoral Studies, 404–417. https://doi.org/10.1016/j.electstud.2006.10.018

Cardenal, A. S., Aguilar-Paredes, Galais, & Pérez-Montoro (2019). Digital Technologies and Selective Exposure: How Choice and Filter Bubbles Shape News Media Exposure. The International Journal of Press/Politics, 465–486. https://doi.org/10.1177/1940161219862988

Courtois, Slechten, & Coenen (2018). Challenging Google Search filter bubbles in social and political information: Disconforming evidence from a digital methods case study. Telematics and Informatics, 2006–2015. https://doi.org/10.1016/j.tele.2018.07.004

Cozza, Hoang, V. T., Petrocchi, & Spognardi (2016). Experimental Measures of News Personalization in Google News. In Casteleyn, Dolog, & Pautasso (Eds.), Current Trends in Web Engineering (pp. –104). Springer International Publishing. https://doi.org/10.1007/978-3-319-46963-8_8

Diakopoulos, Trielli, Stark, & Mussenden (2018). I vote for—How search informs our choice of candidate. In Moore & Tambini (Eds.), Digital Dominance: The Power of Google, Amazon, Facebook, and Apple (pp. 320–341). Oxford University Press.

Epstein, Lee, Mohr, & Zankich, V. R. (2022). The Answer Bot Effect (ABE): A powerful new form of influence made possible by intelligent personal assistants and search engines. PLOS ONE, e0268081. https://doi.org/10.1371/journal.pone.0268081

Epstein & Robertson, R. E. (2015). The search engine manipulation effect (SEME) and its possible impact on the outcomes of elections. Proceedings of the National Academy of Sciences, 112, E4512–E4521. https://doi.org/10.1073/pnas.1419828112

Flaxman, Goel, & Rao, J. M. (2016). Filter bubbles, echo chambers, and online news consumption. Public Opinion Quarterly, 298–320.

Fletcher & Nielsen, R. K. (2018). Are people incidentally exposed to news on social media? A comparative analysis. New Media & Society, 2450–2468. https://doi.org/10.1177/1461444817724170

Garrett, R. K. (2009). Politically motivated reinforcement seeking: Reframing the selective exposure debate. Journal of Communication, 676–699. https://doi.org/10.1111/j.1460-2466.2009.01452.x

Gleason, Robertson, R. E., & Wilson (2023). Google the gatekeeper: How search components affect clicks and attention. Proceedings of the International AAAI Conference on Web and Social Media, 245–256. https://ojs.aaai.org/index.php/ICWSM/article/view/22142

Google. (2023). Ranking results—How google search works. Retrieved May 3, 2023, from https://www.google.com/search/howsearchworks/how-search-works/ranking-results/

Google. (2024). Your choice for cookies & data. Retrieved July 19, 2024, from https://support.google.com/accounts/answer/12065353

Haim (2020). Agent-based testing: An Automated approach toward artificial reactions to human behavior. Journalism Studies, 895–911. https://doi.org/10.1080/1461670X.2019.1702892

Haim, Arendt, & Scherr (2017). Abyss or shelter? On the relevance of web search engines’ search results when people google for suicide. Health Communication, 253–258. https://doi.org/10.1080/10410236.2015.1113484

Haim, Graefe, & Brosius, H.-B. (2018). Burst of the filter bubble?: Effects of personalization on the diversity of Google News. Digital Journalism, 330–343. https://doi.org/10.1080/21670811.2017.1338145

Haim, Scherr, & Arendt (2021). How search engines may help reduce drug-related suicides. Drug and Alcohol Dependence, 226, 108874. https://doi.org/10.1016/j.drugalcdep.2021.108874

Hannak, Sapiezynski, Molavi Kakhki, Krishnamurthy, Lazer, Mislove, & Wilson (2013). Measuring personalization of web search. Proceedings of the 22nd international conference on World Wide Web, 527–538. https://doi.org/10.1145/2488388.2488435

Haroon, Wojcieszak, Chhabra, Liu, Mohapatra, & Shafiq (2023). Auditing YouTube’s recommendation system for ideologically congenial, extreme, and problematic recommendations. Proceedings of the National Academy of Sciences, 120, e2213020120. https://doi.org/10.1073/pnas.2213020120

Helberger (2019). On the democratic role of news recommenders. Digital Journalism, 993–1012. https://doi.org/10.1080/21670811.2019.1623700

Jiang, E. Robertson, & Wilson (2019). Auditing the partisanship of google search snippets. The World Wide Web Conference, 693–704. https://doi.org/10.1145/3308558.3313654

Kliman-Silver, Hannak, Lazer, Wilson, & Mislove (2015). Location, location, location: The impact of geolocation on web search personalization. Proceedings of the 2015 Internet Measurement Conference, 121–127. https://doi.org/10.1145/2815675.2815714

Knobloch-Westerwick, Johnson, B. K., & Westerwick (2015). Confirmation bias in online searches: Impacts of selective exposure before an election on political attitude strength and shifts. Journal of Computer-Mediated Communication, 171–187.

Maragh, Ekdale, High, Havens, & Shafiq (2019). Measuring political personalization of google news search. The World Wide Web Conference, 2957–2963. https://doi.org/10.1145/3308558.3313682

Loecherbach (2023). Diversity of news consumption in a digital information environment (Doctoral dissertation). Vrije Universiteit Amsterdam.

Makhortykh, Urman, & Ulloa (2020). How search engines disseminate information about COVID-19 and why they should do better. Harvard Kennedy School Misinformation Review. https://doi.org/10.37016/mr-2020-017

Menchen-Trevino, Struett, Weeks, B. E., & Wojcieszak (2023). Searching for politics: Using real-world web search behavior and surveys to see political information searching in context. The Information Society, –111. https://doi.org/10.1080/01972243.2022.2152915

Nechushtai, Zamith, & Lewis, S. C. (2023). More of the same? Homogenization in news recommendations when users search on Google, YouTube, Facebook, and Twitter. Mass Communication and Society. https://doi.org/10.1080/15205436.2023.2173609

Oliveira & Lopes, C. T. (2023). The evolution of web search user interfaces—An Archaeological analysis of google search engine result pages. Proceedings of the 2023 Conference on Human Information Interaction and Retrieval. https://doi.org/10.1145/3576840.3578320

Pan, Hembrooke, Joachims, Lorigo, Gay, & Granka (2007). In google we trust: Users’ decisions on rank, position, and relevance. Journal of Computer-Mediated Communication, 801–823. https://doi.org/10.1111/j.1083-6101.2007.00351.x

Pariser (2011). The Filter Bubble: What The Internet Is Hiding From You. Penguin UK.

Puschmann (2019). Beyond the bubble: Assessing the diversity of political search results. Digital Journalism, 824–843. https://doi.org/10.1080/21670811.2018.1539626

Robertson, R. E., Green, Ruck, D. J., Ognyanova, Wilson, & Lazer (2023). Users choose to engage with more partisan news than they are exposed to on Google Search. Nature, 618(7964), 324–348. https://doi.org/10.1038/s41586-023-06078-5

Robertson, R. E., Lazer, & Wilson (2018). Auditing the personalization and composition of politically-related search engine results pages. Proceedings of the 2018 World Wide Web Conference on World Wide Web—WWW ’18, 955–965. https://doi.org/10.1145/3178876.3186143

Robertson, R. E., & Wilson (2020). Websearcher: Tools for auditing web search. Proceedings of the 2020 Computation+ Journalism Symposium (Boston, MA, USA)(C+ J 2020).

Sandvig, Hamilton, Karahalios, & Langbort (2014). Auditing algorithms: Research methods for detecting discrimination on internet platforms. “Data and Discrimination: Converting Critical Concerns into Productive Inquiry,” a preconference at the 64th Annual Meeting of the International Communication Association.

Schwabl, Haim, & Unkel (2024). Aligning agent-based testing (ABT) with the experimental research paradigm: A literature review and best practices. Journal of Computational Social Science. https://doi.org/10.1007/s42001-024-00283-6

Slechten, Courtois, Coenen, & Zaman (2022). Adapting the selective exposure perspective to algorithmically governed platforms: The case of Google Search. Communication Research, 1039–1065. https://doi.org/10.1177/00936502211012154

Steiner, Magin, Stark, & Geiß (2022). Seek and you shall find? A content analysis on the diversity of five search engines’ results on political queries. Information, Communication & Society, 217–241. https://doi.org/10.1080/1369118X.2020.1776367

Stroud, N. J. (2008). Media use and political predispositions: Revisiting the concept of selective exposure. Political Behavior, 341–366. https://doi.org/10.1007/s11109-007-9050-9

Sunstein, C. R. (2001). Republic.com. Princeton University Press.

Thurman & Schifferes (2012). The Future of personalization at news websites. Journalism Studies, (5–6), 775–790. https://doi.org/10.1080/1461670X.2012.664341

Trielli & Diakopoulos (2019). Search as news curator: The role of google in shaping attention to news information. Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3290605.3300683

Trielli & Diakopoulos (2022). Partisan search behavior and Google results in the 2018 U.S. midterm elections. Information, Communication & Society, 145–161. https://doi.org/10.1080/1369118X.2020.1764605

Ulloa & Kacperski, C. S. (2023). Search engine effects on news consumption: Ranking and representativeness outweigh familiarity in news selection. New Media & Society. https://doi.org/10.1177/14614448231154926

Unkel & Haim (2021). Googling politics: Parties, sources, and issue ownerships on google in the 2017 German federal election campaign. Social Science Computer Review, 844–861. https://doi.org/10.1177/0894439319881634

Urman & Makhortykh (2023). You are how (and where) you search? Comparative analysis of web search behavior using web tracking data. Journal of Computational Social Science, 741–756. https://doi.org/10.1007/s42001-023-00208-9

Urman, Makhortykh, & Ulloa (2022). The matter of chance: Auditing web search results related to the 2020 U.S. presidential primary elections across six search engines. Social Science Computer Review, 1323–1339. https://doi.org/10.1177/08944393211006863

van Hoof, Meppelink, C. S., Moeller, & Trilling (2024). Searching differently? How political attitudes impact search queries about political issues. New Media & Society, 3728–3750. https://doi.org/10.1177/14614448221104405

Webber, Moffat, & Zobel (2010). A similarity measure for indefinite rankings. ACM Transactions on Information Systems. https://doi.org/10.1145/1852102.1852106

Westerwick, Kleinman, S. B., & Knobloch-Westerwick (2013). Turn a blind eye if you care: Impacts of attitude consistency, importance, and credibility on seeking of political information and implications for attitudes. Journal of Communication, 432–453.

Wojcieszak, Menchen-Trevino, Goncalves, J. F. F., & Weeks (2022). Avenues to news and diverse news exposure online: comparing direct navigation, social media, news aggregators, search queries, and article hyperlinks. The International Journal of Press/Politics, 860–886. https://doi.org/10.1177/19401612211009160

). https://policyreview.info/articles/analysis/should-we-worry-about-filter-bubbles

Wonneberger, Meijers, M. H. C., & Schuck, A. R. T. (2020). Shifting public engagement: How media coverage of climate change conferences affects climate change audience segments. Public Understanding of Science, 176–193. https://doi.org/10.1177/0963662519886474

Zuiderveen Borgesius

F. J.

Trilling

Möller

Bodó

De Vreese

C. H.

Helberger

(

2016

Should we worry about filter bubbles?

Internet Policy Review

(