Abstract

The longstanding human assumption that machines outperform humans has resulted in the concept of “machine heuristic” (MH), a mental shortcut that individuals apply to automated systems. This article provides a formal explication of this concept and develops a new scale based on three studies (combined N = 1,129). Measurement items were derived from the explication and an open-ended survey (Study 1, N = 270). These were then administered in a closed-ended survey (Study 2, N = 448) to identify their dimensionality through exploratory factor analysis (EFA). Lastly, we conducted another survey (Study 3, N = 411) to verify the factor structure obtained in Study 2 by employing confirmatory factor analysis (CFA). Analyses resulted in a validated scale of seven items that reflect the level of MH in individuals and identified six sets of descriptive labels for machines (expert, efficient, rigid, superfluous, fair, and complex) that serve as formative indicators of MH. Theoretical and practical implications are discussed.

Lay Summary

Are machines better than humans? People have long thought that machines are accurate and precise. They seem objective and fair. Such attributes make them better than humans at performing rule-based tasks, but worse at performing creative tasks, because they lack emotions. When people use this logic to say that machines are better or worse than humans, it is called the “machine heuristic.” In this study, we explored different meanings of this term and arrived at a complete definition. Based on the definition, we created survey questions for measuring “machine heuristic” in people. We tested these questions among several hundred individuals to make sure they are valid and reliable. We developed a scale that can be applied to both human tasks and mechanical tasks. In this article, we explain why our scale is useful and describe how to use it in research.

Alan Turing, the father of artificial intelligence (AI), posed the question: “Can machines think?” (Turing, 1950). He developed the Turing test to assess a machine’s ability to communicate in a manner indistinguishable from a human. Since a machine’s intelligence cannot be directly observed, its communication—a product of intelligence—is typically evaluated (Gunkel, 2012). When a machine communicates with us, our mental generalizations about the strengths and weaknesses of machine performance may be triggered and thereby influence our perceptions and behaviors. This rule of thumb is called the “machine heuristic.”

Machine heuristic (MH) refers to “the mental shortcut wherein we attribute machine characteristics or machine-like operation when making judgments about the outcome of an interaction” (Sundar & Kim, 2019, p. 2). MH is triggered in either machine-mediated communication or human-machine communication. In machine-mediated communication (e.g., computer-mediated communication or AI-mediated communication), humans use machines that support and facilitate communication between humans (Elsayed, 2006). Their pre-existing beliefs about the communication facilitators (i.e., machines in this case) form a cognitive heuristic (i.e., MH) that can affect communication outcomes. This cognitive heuristic is often triggered by interface cues revealing that a machine, rather than a human, is creating or mediating the communication. In their 4C’s framework outlining the involvement of AI in human communication, Sundar and Lee (2022) describe how machines can serve as communicators and mediators in the mass communication context by creating and curating content for mass consumption and in the interpersonal communication context by conversing with users and co-authoring messages with them, respectively. When a machine is identified on the interface as the ostensible source and/or medium of communication, our pre-existing beliefs about machines are made salient, thus affecting our perceptions of the information delivered in that interaction. This notion was first introduced with the forerunner to MH, “computer heuristic,” which was defined as an assumption that computers were accurate and objective in their role as news aggregators (Sundar & Nass, 2001). Based on this, Sundar (2008) coined the term “machine heuristic” and identified it as one of the cognitive heuristics cued by the affordances of digital media in his Modality, Agency, Interactivity, and Navigability (MAIN) model. According to the model, the machine-like interface of a digital medium cues users’ MH wherein they attribute mechanical characteristics to the performance of the medium, which influences their judgment of the credibility of the medium and its content (Sundar, 2008).

Empirically, a long line of research has shown that attributing the source of communication to machines affects the perceptions of communication receivers. For example, compared to human journalists, machine journalists are believed to be less biased, which leads to higher credibility of the resulting news article (Waddell, 2019). In the health domain, AI’s automated decisions are perceived to be more useful than human experts’ decisions (Araujo et al., 2020). On the other hand, in the context of emotional support, a human partner is perceived to be more supportive than a chatbot (Meng & Dai, 2021). Similarly, individuals trust and like in-person customer service more than automated customer service (Mays et al., 2022). A human instructor using a telepresence robot is perceived to be more credible than a social robot acting as a teacher, with positive effects on affective learning (Edwards et al., 2016). These studies show that attributing the source to a machine can elicit either more positive or more negative outcomes than attributing it to a human. Such source attribution cues also figure in the HAII-TIME model, “an adaptation of the theory of interactive media effects (TIME) for the study of human-AI interaction” (HAII) (Sundar, 2020). The HAII-TIME model predicts that the underlying machine attributes of AI will trigger users’ MH, which will affect their perceptions, experiences, and trust in AI systems.

Even though the theoretical and empirical utility of MH is well recognized in both machine-mediated communication research and human–machine communication research, the concept of MH has not yet been rigorously explicated, and there is no validated scale for measuring MH. To be sure, some researchers have developed measures of MH, but those have not been statistically validated. For example, Waddell (2018) developed four reliable items (e.g., “if a machine does a job, then the task will be done objectively,” α = 0.75) to measure MH when he examined the mediating effects of MH on the relationship between machine authorship and news credibility. These items were subsequently used by Wang (2021) in the context of content moderation (α = 0.86). Sundar and Kim (2019) also developed five reliable items to measure belief in MH (e.g., “when machines perform a task, the results are more objective than when humans perform the same task,” α = 0.87) when they investigated the moderating effects of belief in MH on the relationship between machine cues on the interface and the intention to disclose personal information. Banks et al. (2021) developed a reliable seven-item scale for measuring MH (e.g., “machines are objective,” α = 0.852) to examine the effects of an agent’s cued ontological category and the nature of its behavior on the invocation of MH and the nature heuristic. However, these MH items have not been formally validated. The current research aims to rectify this by explicating the concept of MH and constructing a measurement scale that is not only reliable but also valid.

Typically, a measurement scale consists of formative indicators, reflective indicators, or both. Formative indicators compose a concept, i.e., they are the building blocks that contribute to it, so they depend on scholars’ explication and interpretation of different strands of meaning (Coltman et al., 2008). The reflective indicators of a concept are driven by the concept, so a change in that concept causes a change in its reflective indicators (Coltman et al., 2008). MH is a multi-stage cognition wherein individuals’ characterizations of general machines stored in their memory bias their perceptions of general machine performance, which then translate to their evaluations of a specific machine and its performance.

Since this is the first attempt to construct a scale for measuring MH, our goal is to develop both formative and reflective indicators of MH, i.e., which machine characteristics influence the formation of an individual’s MH and what perceptions change as a function of their MH. Accordingly, we explicated the concept of MH and conducted three empirical studies by running (1) open-ended data analysis, (2) exploratory factor analysis (EFA), and (3) confirmatory factor analysis (CFA) with validity tests and reliability tests to develop a measurement scale of MH.

Literature review

Machine heuristic explicated

To operationally define and construct a scale, the concept of MH has to be explicated first. MH is a rule of thumb based on characterizations of general machines stored in memory. Human characterizations of machines have a long history and have been manifested in the form of folk theories or naïve conceptions of how machines work under the hood. Folk theories “embody cognitive biases that influence thought and action” (Gelman & Legare, 2011, p. 380). As such, folk theories of machines contribute to MH that influences perceptions and behaviors. Literature in various disciplines suggests the historical existence of folk theories of machines. In 1947, Turing referred to general beliefs in the infallibility of machines (Copeland, 2004). This reference implies the public’s characterizations of machines, i.e., people generally think machines are accurate, which is echoed in the philosophy literature as well. In the 17th century, René Descartes proposed the machine metaphor (Ruse, 2002), likening the world to a machine, which he described as “efficiently functioning” (p. 286) and operating “according to unbroken law” (p. 287) (Ruse, 2005). This machine metaphor suggests that there has been a long-held view that machines are functional, efficient, and exact. It has also been believed that machines lack human traits, such as emotionality, interpersonal warmth, cognitive openness, and agency (Haslam et al., 2005). Such notions about machines appear to be ingrained and automatically applied when individuals make a decision or judgment in machine-mediated communication or human–machine communication. Such ingrained views about machines constitute the “machine heuristic.” Heuristics, or more formally “cognitive heuristics,” are rules of thumb learned and stored in individuals’ memory, which are activated and applied when making a judgment (Chaiken, 1980; Tversky & Kahneman, 1973). They serve as mental shortcuts that help individuals make quick decisions. MH is individuals’ rule of thumb about technology in general, which is triggered when they make a quick decision about a specific device or system. When individuals communicate via or with a specific machine, such as an AI assistant or a self-driving car, they may apply their rough rule of thumb about machines in general (ranging from factory machines to interactive kiosks), in order to make a quick decision about the specific machine they encounter. Hence, this construct is labeled “machine heuristic” rather than AI heuristic or self-driving-car heuristic. Regardless of the specific device or system, individuals apply the same rule of thumb about overall machine performance when they make a decision or judgment.

Formative indicators of machine heuristic

When Sundar (2008) first coined the term “machine heuristic,” he exemplified it with news readers’ perceptions of automated content curation by aggregators as not having any ideological bias and therefore being more credible. In a similar vein, scholars in the humanities have alluded to the fairness (i.e., lack of prejudice) of machines by describing mechanically produced images as being free from suspect human intervention (Daston & Galison, 1992). The nursing literature also shows that there have long been characterizations of machines as unrelated to values, preferences, and decisions, and thus socially and culturally neutral (Barnard, 1997). Astobiza’s (2023) experiment empirically supports this literature, as it revealed that the general public perceives machines, compared to humans, as not having a mind, or the capacity for thought. This implies a tendency to perceive machines as being free from prejudice.

Across time and disciplines, machines have been known not only for fairness but also for efficiency (i.e., functioning in a manner that does not waste time). In keeping with Descartes’ description of machines, today’s computer science literature also considers machines as entities that work efficiently (Subramaniam et al., 2009). Similarly, the business literature notes that managers expect machines to improve efficiency (Mueller et al., 1986).

People have characterized machines as free from not only preferences but also feelings and emotions. For instance, unemotionality (i.e., lacking emotion) was identified as the result of mechanization by Montague and Matson (1983). Similarly, Turkle (1984) noted that, in the 1970s and 1980s, computers were viewed as lacking emotion. Even in contemporary literature, emotionality is considered a uniquely human trait that machines lack (Haslam, 2006). For example, robots are regarded as unable to understand human emotions (Blar et al., 2014).

Machines are also described as programmed and therefore rigid (i.e., lacking flexibility) (Haslam, 2006), requiring human operators to compensate by adapting themselves to machines (Chignell & Hancock, 1986). That is, people characterize machines as adhering strictly to set rules.

Historically, machines have been seen as mere tools serving human operators rather than having their own agency. The literature in nursing describes machines as entities that do not make decisions by themselves (Barnard, 1997); machines are viewed simply as good mechanical servants that are used and controlled by human nurses (Ashworth, 1987). Astobiza’s (2023) experiment supports this literature by showing that the general public perceives machines as lacking free will, quite unlike humans. Accordingly, machines have been considered tools and servants under human control (i.e., lacking free will). That is, machines are seen as generally controllable, unlike humans.

On the contrary, the education literature shows that machines are generally seen as complex (i.e., difficult to use). For example, Al-Senaidi et al. (2009) found that teachers are often skeptical of the benefits of machines. Specifically, some teachers believe that technology makes their tasks more difficult, and they lack the time to adopt new technologies (Al-Senaidi et al., 2009).

That said, machines are also seen as entities superior to humans. Perhaps the most touted characteristic of machines is precision. Machines have long been considered accurate (i.e., free from errors). As Copeland (2004) notes, Turing believed that people have higher expectations of machines than of humans in terms of following rules. Very often, the stated goal of introducing computers and automated decision aids is to reduce human errors (Skitka et al., 1999), implying that machines will commit fewer errors than humans. Likewise, Pew Research Center (Tyson et al., 2023) found that a greater proportion of the American public believes that using AI would reduce, rather than increase, errors made by human healthcare providers.

As technologies have become more autonomous, people perceive machines to be smarter than humans (Skitka et al., 1999). The literature on automation bias suggests that individuals consider automated decision aids as more authoritative sources than humans in decision-making (Parasuraman & Manzey, 2010). Accordingly, expertise (i.e., having authoritative knowledge or skill) is one of the characteristics that individuals increasingly attribute to machines.

In addition to beliefs in the fairness and accuracy of machines, Sundar and Kim’s (2019) conceptualization of MH includes beliefs about their security (i.e., ensuring the safety of information). Their study documented individual differences in beliefs about machines being superior to humans in handling their personal information. Likewise, Jian et al. (2000) conceptualized human–machine trust as perceived security of machines.

Combining these various references to machines in a variety of disciplines, we identified fair, efficient, unemotional, rigid, under human control, complex, accurate, expert, and secure as the attributes people generally think machines have, which could “form” their rule of thumb—the formative indicators of machine heuristic. Following the guidance provided by Chaffee (1991), we arrived at these formative indicators through a process of distillation (integral to meaning analysis in concept explication) by sorting the full variety of meanings of the concept into groups.

Reflective indicators of machine heuristic

Individuals’ unelaborated and rough characterizations of machines as fair, efficient, unemotional, rigid, under human control, complex, accurate, expert, and secure may be reflected in their judgments and decisions. For example, the concept of automation bias is manifested when an individual sees an automated decision aid as “a heuristic replacement for vigilant information seeking and processing” (Mosier et al., 1996, p. 204), i.e., a tendency to over-rely on automation (Goddard et al., 2012). Thus, individuals’ characterizations of machines in general may make them rely or depend on machines when it comes to performing a task, especially one requiring machine attributes, such as fairness and accuracy. Such over-reliance on machines is associated with over-trust in them (Parasuraman et al., 2008). Madhavan and Wiegmann (2007) define trust as “the expectation of, or confidence in, another, and is based on the probability that one party attaches to co-operative or favorable behavior by other parties” (p. 280). Accordingly, strong attribution of machine characteristics may elicit not only strong trust in machines but also high expectation of, and strong confidence in, machines’ work performance. Conversely, some machine attributes (e.g., lack of emotion) may lead individuals to have low levels of reliance on, dependence on, trust in, expectation of, and confidence in machines for performing tasks that require human traits and skills.

Taken together, we hypothesize that individuals’ characterizations of machines in general as fair, efficient, unemotional, rigid, under human control, complex, accurate, expert, and secure—the formative indicators of MH—“form” their MH, which will be “reflected” in terms of their level of reliance on, dependence on, trust in, expectation of, and confidence in machines’ task performance—the reflective indicators of MH—depending on whether the given task is seen as benefiting from machine or human attributes.

Mechanical and human tasks

A machine is “a tool containing one or more parts that uses mechanical, chemical, thermal, or electrical energy to perform an intended action” (Kiran, 2019, p. 32). Such purposive actions are called tasks (Thomas & Velthouse, 1990). The nature of the task can vary widely and include not only mechanical ones, such as calculating data, but also human ones, like providing companionship. Individuals will likely evaluate a machine’s ability to perform these tasks quite differently. They may perceive machines to be superior to humans in performing tasks that require machine attributes but inferior to humans in performing tasks that require human traits. For example, Lee (2018) found that individuals perceive algorithmic decisions as less trustworthy for human tasks compared to mechanical tasks. Likewise, human journalists believe that while creativity and flexibility are their irreplaceable attributes, the more routine aspects of their job can be mechanized because they are more scalable in terms of breadth and volume (van Dalen, 2012). Literature on computational art identifies art requiring creativity as a symbol of human tasks (Ragot et al., 2020) and describes a widely prevalent (Lamb et al., 2018) negative bias toward machine-generated art (Colton, 2008; Hong, 2018; Ragot et al., 2020). The widely held view is that it is impossible for machines to cover the domains of human understanding beyond rule-governed operations (Gaut, 2010) and simulate humans’ creative thinking (Strothotte & Schlechtweg, 2002). In addition, intuitive judgment, navigating social nuances, and emotional expression and understanding are all human skills that machines lack (Lee, 2018; Waytz & Norton, 2014). Accordingly, we can predict that individuals will show evidence of MH that reflects stronger trust in machines than humans when performing mechanical tasks and weaker trust in machines when performing human tasks. Extending Lee’s (2018) definitions of human and mechanical tasks and Nass et al.’s (1995) conceptualization of anthropocentrism with respect to computers, we view mechanical tasks as rule-governed tasks that require the ability to manipulate symbols, such as mathematical operations and extensive information storage, and human tasks as those that transcend rules and require creativity, flexibility, intuition, and the ability to understand and express emotion and navigate social nuances.

Definition of machine heuristic

Taken together, the explication in the present study extends Sundar and Kim’s definition of MH—“the mental shortcut wherein we attribute machine characteristics or machine-like operation when making judgments about the outcome of an interaction” (Sundar & Kim, 2019, p. 2). This mental shortcut can be positive or negative depending upon the appropriateness of applying machine attributes to the feature or activity at hand: (1) the rule of thumb that machines have certain attributes, such as speed and precision, leading individuals to have higher reliance on, dependence on, trust in, expectation of, and confidence in machines than humans for performing rule-governed tasks that require mechanical skills; and (2) the rule of thumb that machines lack human attributes, leading individuals to have lower reliance on, dependence on, trust in, expectation of, and confidence in machines than humans for performing tasks that are not based on rules, but instead require human skills.

As with most concepts, MH can be operationalized with formative indicators and reflective indicators. The characteristics that individuals roughly attribute to machines as a whole would be the reason why they think machines are better or worse than humans in performing a task. In this regard, unelaborated and biased beliefs in machine attributes would be the formative indicators of MH. The comparisons of machines and humans in performing mechanical and human tasks would be the reflective indicators of MH. We consider both in this article.

Study 1

Method

The purpose of Study 1 was to expand our initial pool of measurement items by developing additional formative indicator items, which could be either associated with the hypothesized factors of MH identified in our concept explication (e.g., efficient) or suggestive of new factors. We conducted an online survey that comprised six open-ended questions, which asked participants to describe six types of machines in the form of adjectives, followed by demographic questions.

Participants

Amazon Mechanical Turk workers (MTurkers) were recruited as participants (N = 270). MTurkers are more Internet-savvy than the general online population (Marshall & Shipman, 2013). However, their savviness is limited to one technology—the Internet—whereas MH concerns machines in general rather than any specific technology. MTurkers’ education levels are also higher than those of the general online population (Marshall & Shipman, 2013), but this is not problematic, as higher education does not guarantee higher literacy or expertise across a wide range of machines. For example, in Redmiles et al.’s (2019) study, only 48% of MTurkers, compared to 59% of the U.S. population, reported that they had enough knowledge about protecting their devices when using public WiFi. Likewise, only 61% of MTurkers, compared to 71% of the U.S. population, reported that they had enough knowledge of protecting their devices from malware and viruses.

We recruited United States citizens aged 18 or older to ensure that all participants were familiar with American English, which was the language used in all the study materials. They completed an online survey constructed on Qualtrics and received US$0.25 as compensation for participating in each of the three studies (i.e., Study 1, Study 2, and Study 3).

Procedure

The current research (including Study 1, Study 2, and Study 3) was approved by the Institutional Review Board (IRB) at our university. At the beginning of the survey, participants were asked to complete an informed consent form as required by the IRB. Upon consenting, they were administered screening questions (age and citizenship), followed by open-ended questions about machines. At the end of the survey, they were administered demographic questions.

Measures

Open-ended questions

Since MH is a rule of thumb about machines as a general entity, stored in individuals’ memory and effortlessly triggered for a quick decision or judgment, we wanted to find out what people thought about the attributes of machines in general, rather than a specific device or system. Thus, with six open-ended questions (see Supplementary Table 3), we asked participants to write down adjectives that immediately came to their mind when they thought about six kinds of machines operated by mechanical, chemical, thermal, and/or electrical energy, namely “machines in assembly-lines of factories,” “weather forecast devices and computer systems,” “automatic teller machine (ATM),” “computer software that does statistical analyses,” “online/mobile calendar app (e.g., Google Calendar),” and “Apple’s Siri.” These are commonly known to the general public and together capture Kiran’s (2019) definition of “machine” that we adopted for our concept explication.

Demographics

Participants were asked to report their age, gender, ethnicity, and education level.

Results

Descriptive analysis

Supplementary Table 1 shows participants’ demographics (see Supplementary material).

Analysis of responses to open-ended questions

DeVellis (2012) encourages researchers to develop a large item pool for a measurement scale, but within reason, given the negative effects of survey length on response rates. Dillman (1978) recommends that the number of items be about 125 or fewer in order to ensure sufficient response rates of good quality. Similarly, Yan et al. (2011) found that individuals who responded to an online survey questionnaire consisting of 101 items were less likely to break it off than those who responded to a longer online survey questionnaire consisting of 155 items. Thus, we aimed to develop an initial measurement item pool of about 101 to 125 items. Our concept explication yielded 71 items in total, including 57 formative items, seven reflective items for mechanical tasks, and seven reflective items for human tasks (see Supplementary Table 2). This left us with room for about 30–54 more items to be added, based on open-ended responses to Study 1. Across the six different machines, we counted how many times each adjective was mentioned by participants. Supplementary Table 3 shows how many new adjectives were obtained from each open-ended question and how many existing adjectives (from the concept explication) were obtained again from each open-ended question. In examining the adjectives based on frequency of mentions, we found 10 mentions to be an optimal cut-off point, as it resulted in 52 items (e.g., “machines are intelligent”), comprising 38 new adjectives and their 14 synonyms (see Supplementary Table 4). Together with the 71 items from the explication, we generated a total of 123 items for measuring the concept.
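
To illustrate this frequency-based winnowing, the sketch below counts adjective mentions and applies the cut-off of 10. It assumes a hypothetical long-format file of Study 1 responses (one adjective per row); it is an illustration of the step, not the authors’ actual analysis script.

```python
# A minimal sketch of the adjective cut-off step, assuming a hypothetical
# long-format table of open-ended responses with one adjective per row.
import pandas as pd

# Hypothetical file with columns: participant, machine, adjective.
responses = pd.read_csv("study1_adjectives.csv")

# Count how many times each adjective was mentioned across all six machines.
counts = responses["adjective"].str.lower().value_counts()

# Retain adjectives mentioned at least 10 times (the cut-off reported above).
retained = counts[counts >= 10].index.tolist()

# Turn each retained adjective into a candidate formative item.
items = [f"Machines are {adj}." for adj in retained]
print(f"{len(items)} candidate items generated")
```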

Discussion

Based on participants’ responses to the open-ended questions in Study 1 (i.e., the 38 adjectives used by participants to describe machines), we developed 52 formative items for measuring MH. Among the items using the adjectives, we expected that “machines are intelligent,” “machines are informative,” “machines are advanced,” and “machines are innovative” would fall into the expert dimension, which is one of the formative indicators of MH hypothesized in the concept explication. We expected that “machines are complex,” “machines are complicated,” and “machines are difficult” would fall into the complex dimension, and “machines are cold” would belong in the unemotional dimension. The rest of the new items did not seem like they would belong to any of these dimensions of formative indicators hypothesized in the concept explication, thus we anticipated that the EFA in Study 2 would reveal additional formative factors of MH.

Study 2

Method

The purpose of Study 2 was to identify the dimensionality of the formative and reflective indicators of the concept by conducting an EFA of the 123 measures.

Participants

According to Cattell (1978), the minimum sample size for factor analysis is three times the number of variables. Since we were planning to run an EFA of 123 variables, we needed at least 369 participants. For Study 2, we recruited 640 participants from MTurk. After eliminating those who had failed our attention-check measures and those who did not complete the survey (except for optional demographic items), we were left with 448 participants in the dataset.

Procedure

Participants who agreed to the informed consent form and passed the screening questions were asked to respond to formative items, reflective items, and demographic questions in an online survey. Additionally, we administered instructed response items and an instructional manipulation check item to check their attention to the survey, given empirical evidence of significantly improved scale fit after excluding inattentive participants (Abbey & Meloy, 2017).

Measures

Instructed response

To check participants’ attention to the survey, the instruction and items1 were borrowed from Egelman and Peer (2015).

Formative indicators

Following DeVellis’s (2012) guidelines for scale development, we generated formative indicator items based on our explication of MH (see Supplementary Table 2). We expected the following attributes to comprise the formative indicators of MH: fair, efficient, unemotional, rigid, under human control, complex, accurate, expert, and secure. Measurement items (e.g., “machines are uncompromising”) that tap into fair, efficient, unemotional, rigid, accurate, and expert were newly constructed with synonyms of those six words obtained from the Oxford English Dictionary. Two fair items (e.g., “machines are not prejudiced”) were constructed based on Barnard (1997). An efficient item (i.e., “machines are time-saving”), a rigid item (i.e., “machines do only what they are asked to do”), three accurate items (e.g., “machines make correct predictions”), and four secure items (e.g., “machines provide security”) were newly created based on the concept explication. Three under human control items were adapted from Ashworth’s (1987) statements mentioned by Barnard (1997). Four complex items (e.g., “machines make task performance more complex”) were adapted from Al-Senaidi et al. (2009). Three expert items (e.g., “machines are skilled”) were adapted from the expertise dimension of an established source-credibility measure (Ohanian, 1990). Three secure items (e.g., “machines do not gossip”) were adapted from Sundar and Kim (2019). We also developed 52 additional items (e.g., “machines are intelligent”) using the 38 adjectives obtained from Study 1 (see Supplementary Table 4) and their 14 synonyms. During data collection, participants were asked to rate the degree to which they agreed with the items on a 7-point Likert scale.

Instructional manipulation check

To check participants’ attention to the survey, instructions and questions2 were borrowed from Kung et al. (2018).

Reflective indicators

Following the outcomes of our explication, we developed reflective items to measure individuals’ perceived overall comparison of machines with humans and their reliance on, dependence on, trust in, confidence in, belief in the competence of, and expectation of machines, compared to humans, in performing tasks. The reliance and dependence items (e.g., “machines are more reliable than humans in performing these tasks”) were adapted from the trust-in-automation scale (Jian et al., 2000). The trust, confidence, and belief in competence items (e.g., “I trust machines more than humans in performing such tasks”) were adapted from a scale measuring trust in automatic weapons detection (Merritt, 2011). Lastly, expectation and perceived overall comparison items (i.e., “I have a higher expectation for machines than humans in performing such tasks” and “machines are better than humans in performing such tasks”) were newly constructed based on our concept explication.

In answering these reflective items, participants were asked to imagine that machines and humans perform mechanical tasks. In the concept explication, we defined mechanical tasks as rule-governed tasks that require the ability to manipulate symbols, such as mathematical operations and extensive information storage. Based on this definition, we developed and provided the following examples to participants: Calculating mathematical problems; Statistically analyzing quantitative data; Memorizing all the Chinese characters in the world; Measuring someone's blood pressure; Detecting the current geographic location; and Reminding me of an important meeting. Similarly, participants were also asked to imagine that machines and humans perform human tasks and rate the degree to which they agreed with the statements about the comparison of machines with humans. In our explication, we defined human tasks as those that transcend rules and require creativity, flexibility, intuition, and the ability to understand and express emotion and navigate social nuances. Based on this definition, we developed and provided the following examples to participants: Creating a visual artwork; Improvising a jazz piece; Falling in love with someone; Consoling someone; Judging someone's first impression; and Assessing someone's sexual attractiveness.

Demographics

Participants were asked to report their age, gender, ethnicity, and education level.

Results

Descriptive analysis

Supplementary Table 1 shows participants’ demographics (see the Supplementary material).

Exploratory factor analysis

To test the suitability of the data for an EFA with the formative indicator items, we ran the Kaiser–Meyer–Olkin (KMO) sampling adequacy test and Bartlett’s sphericity test. The KMO value was .94 and the p-value of Bartlett’s test was less than .001, which met the criteria (KMO between .8 and 1.0, and Bartlett’s test p < .05) laid out by Shrestha (2021). All the communalities were higher than .56. Since we assumed, based on the concept explication, that the factors would be correlated with each other, and since we wanted to minimize variable complexity, we used quartimin rotation for the EFA with the formative indicator items, following the guidance of Sass and Schmitt (2010). We employed the principal axis factoring method for the EFA given that, as Mvududu and Sink (2013) note, “it is best suited for exploring the underlying factors theorized by the researcher” (p. 85). We simplified the structure by removing weak factors. Specifically, referring to Costello and Osborne (2005), we removed factors with fewer than three items and, using Field’s (2005) cut-off, retained only items whose highest loading was at least .60. After applying these criteria, there were no cross-loading items. A scree test (Velicer & Jackson, 1990) suggested the presence of six dimensions of formative indicators of MH, based on both the eigenvalues-greater-than-1.0 criterion and visual inspection of the scree plot. The factors cumulatively accounted for 51.54% of the variance. Supplementary Table 5 shows the names of the six factors, the individual items for each factor, and the loading of each item on the factor to which it belongs (see the Supplementary material). The six factors were labeled expert (α = .90, M = 4.84, SD = 1.43), efficient (α = .93, M = 5.76, SD = .99), rigid (α = .90, M = 4.40, SD = 1.41), superfluous (α = .92, M = 2.96, SD = 1.54), fair (α = .83, M = 5.41, SD = 1.22), and complex (α = .80, M = 4.59, SD = 1.36). As such, these six dimensions constitute the formative indicators of the concept of MH.
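
For readers who wish to reproduce this pipeline, the sketch below shows the equivalent steps in Python with the factor_analyzer package. The data file and item columns are hypothetical placeholders, and the sketch is an illustrative translation of the procedure rather than the authors’ code.

```python
# A minimal sketch of the EFA pipeline described above; the data file and
# item columns are hypothetical placeholders.
import pandas as pd
from factor_analyzer import FactorAnalyzer
from factor_analyzer.factor_analyzer import (
    calculate_bartlett_sphericity, calculate_kmo)

data = pd.read_csv("study2_formative_items.csv")  # one column per item

# Suitability checks: KMO should fall between .8 and 1.0, and Bartlett's
# test should be significant (p < .05), per Shrestha (2021).
chi2, p = calculate_bartlett_sphericity(data)
_, kmo_total = calculate_kmo(data)
print(f"KMO = {kmo_total:.2f}, Bartlett p = {p:.3f}")

# Principal axis factoring with an oblique (quartimin) rotation, since the
# factors are expected to correlate.
efa = FactorAnalyzer(n_factors=6, method="principal", rotation="quartimin")
efa.fit(data)

# Scree-test input: eigenvalues greater than 1.0 suggest retainable factors.
eigenvalues, _ = efa.get_eigenvalues()
print((eigenvalues > 1.0).sum(), "factors with eigenvalue > 1.0")

# Retain items whose highest loading is at least .60 (Field, 2005).
loadings = pd.DataFrame(efa.loadings_, index=data.columns)
retained = loadings[loadings.abs().max(axis=1) >= .60]
print(retained.round(2))
```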

For the reflective indicator items, the KMO value was .94 and the p-value of Bartlett’s test was less than .001, meeting the criteria for running an EFA with these items. All the communalities were higher than .63. As with the formative items, we used quartimin rotation and employed the principal axis factoring method to explore the underlying factors. The analysis indicated the presence of two dimensions, accounting for 81.28% of the variance. Supplementary Table 6 provides the names of the factors, the individual items for each factor, and the loading of each item on the factor to which it belongs (see the Supplementary material). The first factor, labeled human tasks (α = .98, M = 3.00, SD = 2.02), refers to the application of MH to performing human tasks. The second factor, labeled mechanical tasks (α = .93, M = 5.37, SD = 1.13), refers to the application of MH to mechanical tasks. Even though the same items were used for capturing MH for mechanical and human tasks, they factored along different dimensions, thus necessitating the construction of two distinct scale models: the mechanical task (MT) model and the human task (HT) model.

Discussion

The EFA revealed six dimensions of formative indicators of MH (i.e., expert, efficient, rigid, superfluous, fair, and complex). Among the dimensions of formative indicators hypothesized in the earlier concept explication, accurate, unemotional, under human control, and secure were eliminated. Instead, the additional measures developed in Study 1 came together to constitute a new factor labeled superfluous (see Supplementary Table 13). It is interesting that accurate was replaced by superfluous. This implies that people generally think machines are not always perfect, which stands to reason given “hallucinating” AIs and autopilot failures (e.g., Krisher & Johnson, 2024)—a sign of changing public consciousness that is moving away from the old tendency of viewing machines as infallible. Similarly, they seem to be moving away from the notion that machines are under human control given the recent proliferation of autonomous machines powered by AI. Likewise, secure is also not a reliable indicator of machine-ness given that cyberattacks have become commonplace (e.g., Bohannon, 2024). Finally, recent advances in social robotics, conversational agents and related technologies have served to make many devices quite capable of exhibiting emotions, even if only synthetically, thus unemotional may no longer be a good descriptor of machines.

The EFA also resulted in two dimensions of reflective indicators of MH (i.e., human tasks and mechanical tasks). The two dimensions had the same items (e.g., “machines are better than humans in performing such tasks”) although the given scenarios (i.e., human task scenarios and mechanical task scenarios) were different. The results suggested two structural models of MH, one for each type of task, for testing with CFA in Study 3.

Study 3

Method

Study 3 was conducted to confirm the factor structure from Study 2 by running a CFA and constructing a scale to measure MH.

Participants

There is a rule of thumb that five to ten observations are needed per estimated parameter (Bentler & Chou, 1987, as cited in Wolf et al., 2013). In this study, each structural model has 44 parameters, so we aimed to collect data from at least 220 participants. We recruited 624 participants from MTurk. After eliminating those who had failed the attention checks and/or had not completed the survey, data from 411 participants remained in the dataset.

Procedure

The procedure followed in Study 2 was repeated in Study 3. In addition, participants were asked to respond to items from related scales for testing the validity of the new scale.

Measures

In addition to all the measures used in Study 2, we administered the following scales for testing the validity of our MH items:

Behavioral intentions

According to the theories reviewed earlier (the MAIN model and the HAII-TIME model), cognitive heuristics triggered by interface cues predict behavioral intentions toward the technology. To assess criterion validity, we tested whether MH positively correlates with behavioral intention, the logically anticipated outcome of MH. Thus, we developed six items to measure behavioral intentions to use machines for performing mechanical tasks (e.g., “I am willing to use a machine to calculate mathematical problems rather than doing it myself or asking a human to do it”), adapted from the six examples of mechanical tasks given to participants for the reflective indicator items. We developed another set of six items to measure behavioral intentions to use machines for performing human tasks (e.g., “I am willing to ask a machine to create a visual artwork rather than doing it myself or asking a human to do it”), adapted from the six examples of human tasks given to participants for the reflective indicator items.

Technology dependence

To assess convergent validity and discriminant validity, we identified technology dependence (TD) as a concept similar to but distinct from MH. TD refers to “the extent to which users rely on the technology to solve a problem or perform specific functions” (Fan et al., 2017, p. 116). Both TD and MH are relevant to reliance on technology/machines, but TD is operationalized as human behavior whereas MH is conceptualized as human cognition. Furthermore, TD is a one-way concept that focuses only on reliance on technology, whereas MH is a two-way concept that encompasses both reliance on and reluctance toward using machines. Thus, we adapted the three smartphone dependence items used by Fan et al. (2017) to measure TD (e.g., “I tend to use machines as often as I can”).

Automation bias

To test convergent validity and discriminant validity, we identified automation bias (AB) as another concept similar to but distinct from MH. AB is cognitive overreliance on automation, whereas MH is a cognitive heuristic based on the belief that machines are generally better or worse than humans at performing a particular task. Hence, we developed three automation bias items (e.g., “when something is automated, humans can relax”) for our validity tests, based on Mosier and Skitka (1999).

Results

Descriptive analysis

Supplementary Table 1 shows participants’ demographics (see the Supplementary material).

Confirmatory factor analysis

We conducted a CFA with maximum likelihood estimation in IBM SPSS Amos to comprehensively test the measurement and structural portions of the models that we hypothesized based on the EFA results obtained in Study 2. According to Hu and Bentler (1999), the fit of a model is good when its χ2 is not significant, root mean square error of approximation (RMSEA) is below .08, comparative fit index (CFI) is above .95, and standardized root mean square residual (SRMR) is below .08. However, Hoe (2008) notes that the χ2-significance test has a limitation: It is too sensitive to sample size, especially when there are 200 or more observations. Thus, Kline (2015) suggests that when the ratio of χ2/df is 3 or less, the model fit is acceptable. In addition, Hair et al. (2010) note that each factor loading estimate should be higher than 0.5.

Measurement models

Formative measurement model

A CFA revealed that the model fit of the initial formative measurement model of MH3 was acceptable (χ2 = 897.85, df = 362, p < .001; χ2/df = 2.48; RMSEA = .07, 90% confidence interval [CI] = [.07, .08]; CFI = .90; SRMR = .05). To improve the model fit, we drew covariances between two pairs of correlated errors within factors, as suggested by modification indices. Since the covariances were drawn within factors identified based on the literature, this was conceptually justified. The overall fit of this modified model was better as a result (χ2 = 670.51, df = 342, p < .001; χ2/df = 1.96; RMSEA = .06, 90% CI = [.05, .06]; CFI = .94; SRMR = .05).
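
Although the analyses reported here were run in IBM SPSS Amos, the same kind of measurement model can be fit in open-source SEM software. The sketch below uses the Python semopy package with invented, abbreviated item names (three per factor for brevity); it illustrates the procedure rather than reproducing the authors’ model files.

```python
# A hypothetical sketch of a CFA for the six-factor formative measurement
# model, using semopy in place of Amos; all item names are placeholders.
import pandas as pd
import semopy

data = pd.read_csv("study3_items.csv")  # hypothetical item-level dataset

# lavaan-style syntax: each latent factor is measured by its retained items
# (abbreviated to three per factor here; the full model has 29 items).
desc = """
expert =~ exp1 + exp2 + exp3
efficient =~ eff1 + eff2 + eff3
rigid =~ rig1 + rig2 + rig3
superfluous =~ sup1 + sup2 + sup3
fair =~ fai1 + fai2 + fai3
complex =~ com1 + com2 + com3
"""

model = semopy.Model(desc)
model.fit(data)  # maximum likelihood estimation by default

# Judge fit against the cut-offs cited above: chi2/df <= 3 (Kline, 2015),
# RMSEA below .08 and CFI above .95 (Hu & Bentler, 1999).
print(semopy.calc_stats(model).T)
```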

Reflective measurement model of MH for mechanical tasks

A CFA revealed that the model fit of the initial reflective measurement model of MH for mechanical tasks4 was good (χ2 = 37.36, df = 14, p < .01; χ2/df = 2.67; RMSEA = .08, 90% CI = [.05, .11]; CFI = .97; SRMR = .04). As before, the model was respecified by correlating error terms within factors as suggested by modification indices. As a result, the overall fit of the modified model was better (χ2 = 12.31, df = 9, p = .20; χ2/df = 1.37; RMSEA = .04, 90% CI = [.00, .08]; CFI = 1.00; SRMR = .02).

Reflective measurement model of MH for human tasks

A CFA revealed that the model fit of the initial reflective measurement model of MH for human tasks5 was good (χ2 = 32.69, df = 14, p < .01; χ2/df = 2.34; RMSEA = .07, 90% CI = [.04, .10]; CFI = .99; SRMR = .02). As with the previous models, covariances were drawn between correlated error terms within factors, as suggested by modification indices, after which the overall fit was better (χ2 = 6.12, df = 10, p = .81; χ2/df = .61; RMSEA = .00, 90% CI = [.00, .04]; CFI = 1.00; SRMR = .01).

Structural models

Next, we constructed two structural models—the mechanical task (MT) model and the human task (HT) model—because we expected, based on our concept explication, that individuals would have positive MH toward machines performing mechanical tasks but negative MH toward machines performing human tasks. In the MT model, the concept of MH comprises the characteristics that individuals attribute to machines (formative indicators) and their comparison of machines and humans in terms of performing mechanical tasks (reflective indicators). The HT model is the same except that the referent tasks are human tasks rather than mechanical ones.
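
To make this structure concrete, the sketch below expresses the MT model as a MIMIC-style specification in semopy, in which the six factors predict a latent MH that is in turn reflected in the seven mechanical task items; the HT model would be identical except for the task referent. Item names are hypothetical, and the specification is offered as an approximation of the Amos models, not a faithful reproduction.

```python
# A hypothetical MIMIC-style sketch of the MT structural model: six
# formative sub-concepts predict latent MH, which is reflected in seven
# mechanical-task items. All item names are invented placeholders.
import pandas as pd
import semopy

data = pd.read_csv("study3_items.csv")

desc = """
expert =~ exp1 + exp2 + exp3
efficient =~ eff1 + eff2 + eff3
rigid =~ rig1 + rig2 + rig3
superfluous =~ sup1 + sup2 + sup3
fair =~ fai1 + fai2 + fai3
complex =~ com1 + com2 + com3
MH =~ mt1 + mt2 + mt3 + mt4 + mt5 + mt6 + mt7
MH ~ expert + efficient + rigid + superfluous + fair + complex
"""

model = semopy.Model(desc)
model.fit(data)
print(model.inspect())           # structural paths, e.g., efficient -> MH
print(semopy.calc_stats(model))  # chi2, RMSEA, CFI, etc.
```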

Mechanical task model

A CFA revealed that the model fit of the initial MT model6 was acceptable (χ2 = 1262.28, df = 573, p < .001; χ2/df = 2.20; RMSEA = .07, 90% CI = [.06, .07]; CFI = .89; SRMR = .05). As before, the model was respecified by correlating error terms within factors as suggested by modification indices, yielding a better fit (χ2 = 1014.55, df = 549, p < .001; χ2/df = 1.85; RMSEA = .05, 90% CI = [.05, .06]; CFI = .93; SRMR = .05). This final MT model shows that the more individuals agree with the notion that machines are efficient (.721), fair (.034), complex (.386), and expert (.057), the more likely they are to form stronger MH toward mechanical tasks, reflecting their thinking that machines are better than humans in performing mechanical tasks. The model also shows that the more individuals agree with the notion that machines are rigid (−.142) and superfluous (−.048), the more likely they are to form weaker MH toward mechanical tasks.

Human task model

A CFA revealed that the model fit of the initial HT model7 was also acceptable (χ2 = 1310.56, df = 573, p < .001; χ2/df = 2.29; RMSEA = .07, 90% CI = [.06, .07]; CFI = .90; SRMR = .05). As with the previous models, covariances were drawn between correlated error terms within factors, as suggested by modification indices, after which the overall fit was better (χ2 = 1057.79, df = 549, p < .001; χ2/df = 1.93; RMSEA = .06, 90% CI = [.05, .06]; CFI = .93; SRMR = .05). This final HT model shows that the more individuals agree with the notion that machines are efficient (.331), superfluous (.960), and expert (.283), the more likely they are to form stronger MH toward human tasks, reflecting their thinking that machines are better than humans in performing human tasks. On the contrary, the model also shows that the more individuals agree with the notion that machines are rigid (−.045), fair (−.313), and complex (−.271), the more likely they are to form weaker MH toward human tasks.

Based on the two final models, MH has six formative sub-concepts (based on 29 formative items) and seven reflective items (see Supplementary Figures 1 and 2).

Validity and reliability tests

We ran a series of analyses to test criterion and construct validity based on participants’ ratings on items from related concepts. In addition, we also computed item and scale reliability.

Criterion validity

To assess the criterion validity of our new MH scale (i.e., whether it is statistically related to theoretical outcomes of MH), we examined the correlations between the mean value of the reflective indicator items from each model of the scale and logically possible outcomes of MH. In line with the prediction made by theory that cognitive heuristics have effects on behavioral trust, we found that compared to the HT model, the MT model is more strongly correlated with willingness to use/ask machines to perform mechanical tasks rather than doing it oneself or asking humans to do it. We also found that compared to the MT model, the HT model is more strongly correlated with willingness to use/ask machines to perform human tasks rather than doing it oneself or asking humans to do it (see Supplementary Table 7). This demonstrates good criterion validity of the new scale.
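
As an illustration, a criterion-validity check of this kind reduces to simple correlations between mean scale scores and the behavioral intention items. The sketch below assumes hypothetical column names for the MT reflective items and the mechanical task intention items.

```python
# A minimal sketch of the criterion-validity check; column names are
# hypothetical placeholders, not the study's actual variable names.
import pandas as pd
from scipy.stats import pearsonr

data = pd.read_csv("study3_items.csv")

# Mean of the seven MT reflective items and of the six mechanical-task
# behavioral intention items.
mt_score = data[[f"mt{i}" for i in range(1, 8)]].mean(axis=1)
bi_mech = data[[f"bi_mt{i}" for i in range(1, 7)]].mean(axis=1)

r, p = pearsonr(mt_score, bi_mech)
print(f"MT model vs. mechanical-task intentions: r = {r:.2f}, p = {p:.3f}")
```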

Construct validity

To investigate whether the new MH scale possesses good construct validity, we tested its convergent validity (i.e., whether it is statistically related to other constructs that are theoretically related to MH) and discriminant validity (i.e., whether it is non-identical to those other constructs). According to Fornell and Larcker (1981), discriminant validity is established when all the squared correlations involving a concept are less than the average variance extracted (AVE) for the concept, and convergent validity is adequate when composite reliability (CR) is greater than .60 and AVE is not less than .50. Given that AVE is a more conservative measure, Fornell and Larcker (1981) also note that convergent validity can be considered adequate when the CR is above .60 even if the AVE is less than .50. Supplementary Tables 8 and 9 show the correlations, AVE, and CR values, indicating that the MH scale for both MT and HT has good convergent as well as discriminant validity, thereby verifying its construct validity.
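
The Fornell–Larcker quantities are straightforward to compute from standardized loadings, as in the sketch below; the loading values shown are placeholders rather than the article’s estimates.

```python
# A minimal sketch of the Fornell-Larcker checks: composite reliability (CR)
# and average variance extracted (AVE) from standardized factor loadings.
import numpy as np

def cr_and_ave(loadings):
    """Return (CR, AVE) for one factor given its standardized loadings."""
    lam = np.asarray(loadings, dtype=float)
    err = 1 - lam**2                                 # item error variances
    cr = lam.sum()**2 / (lam.sum()**2 + err.sum())   # composite reliability
    ave = np.mean(lam**2)                            # avg variance extracted
    return cr, ave

cr, ave = cr_and_ave([.80, .75, .85, .70])           # placeholder loadings
print(f"CR = {cr:.2f}, AVE = {ave:.2f}")
# Convergent validity: CR > .60 (ideally AVE >= .50 as well).
# Discriminant validity: AVE exceeds the squared correlations between this
# concept and every other concept in the model.
```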

Item and scale reliability

Communality estimates, obtained by squaring the factor loadings, show that each reflective item captures a significant portion of the variance in each model of our new MH scale (see Supplementary Table 6). This suggests high item reliability. Furthermore, as mentioned above, the Cronbach’s alpha of the seven reflective items was .93 in the MT model of the MH scale and .98 in the HT model. Thus, both models have high scale reliability.
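
Cronbach’s alpha for the seven reflective items can be computed directly from an item-score matrix, as in the short sketch below (toy random data for illustration; real Likert responses would be used in practice).

```python
# A minimal sketch of the scale-reliability check: Cronbach's alpha for a
# respondents-by-items score matrix.
import numpy as np

def cronbach_alpha(items):
    """items: 2-D array, rows = respondents, columns = scale items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()  # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the sum scores
    return (k / (k - 1)) * (1 - item_vars / total_var)

rng = np.random.default_rng(0)               # toy data for illustration only
scores = rng.integers(1, 8, size=(411, 7))   # random 7-point responses will
print(cronbach_alpha(scores))                # yield a near-zero alpha
```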

Discussion

This is the first known attempt at explicating, operationalizing, and validating the concept of MH. Based on analyses of survey responses, we have offered an MH scale with two models: (1) the MT model, wherein individuals’ beliefs in the efficiency, fairness, complexity, and expertise of machines form MH that reflects their view that machines are better than humans in performing mechanical tasks; and (2) the HT model, wherein individuals’ beliefs in the efficiency, superfluousness, and expertise of machines form MH that reflects their view that machines are better than humans in performing human tasks.

Specifically, in addition to beliefs in the attributes of machines we identified as the formative indicators through our concept explication (i.e., fair, efficient, rigid, complex, and expert), we identified beliefs in one additional attribute, superfluousness, through Study 1 (i.e., open-ended survey) and Study 2 (i.e., EFA). It signifies the belief that machines are redundant, boring to use, and generally annoying because they do not always contribute accurate or meaningful information, and are therefore of limited utility. They have imperfections just like humans as we are still in the era of artificial “narrow” intelligence, i.e., although machines have become more advanced in recent times, they still fall short of human expectations.

On the other hand, some of the hypothesized formative indicators we identified through concept explication (i.e., unemotional, under human control, accurate, and secure) did not end up being statistically significant formative factors of MH in Study 2. All the final formative indicators of our MH scale (i.e., expert, efficient, rigid, superfluous, fair, and complex) are beliefs in machine characteristics related to “task performance.” That could be the reason why beliefs in machines’ unemotionality were statistically excluded from the construct of MH during our EFA. Beliefs that machines are under human control might have led individuals to attribute the source of machine-generated outputs to humans, not machines, because machines being under human control could imply that machine performance is influenced by human intention. It might be too early to include beliefs in machines’ accuracy in the construct of MH because, as mentioned above, we are still in an era when machines are considered superfluous despite recent technological advances. Finally, unlike Sundar and Kim’s (2019) MH measures, our EFA failed to identify security as a significant aspect of MH. Even though accuracy and security were integral to our initial conceptualization of MH, they may not be applicable to all machines and are therefore unviable traits for a general scale of the concept. They may, however, contribute to the meaning of the heuristic in certain narrow uses of machines that pertain directly to accuracy of information (e.g., Sundar & Nass, 2001) and security of personal information (Sundar & Kim, 2019), respectively.

We also discovered that beliefs in machines’ efficiency and expertise are significantly associated with the perception that machines do a better job than humans of performing both mechanical and human tasks, and that beliefs in machines’ rigidity are negatively associated with that perception. However, beliefs in machines’ fairness and complexity are positively associated with the thought that machines are better than humans in performing mechanical tasks, but negatively associated with the thought that machines are better than humans in performing human tasks. As mechanical tasks are rule-governed, fairness is a positive contributor to perceived task performance. Also, mechanical tasks often require calculation or information storage, so complexity could be a strength as well. However, when these attributes are applied to machines in the context of performing human tasks, they serve to undermine the perception that machines are better than humans. Lastly, beliefs in machines’ superfluousness are negatively associated with the perception that machines are better than humans in performing mechanical tasks, but positively associated with the perception that machines are better than humans in performing human tasks. The superfluousness indicator consists of attributes that people find lacking in machines. Such attributes serve to contradict the notion that machines are better than humans, especially in performing mechanical tasks. In contrast, what people do not consider perfect machine attributes could improve perceived task performance in the human domain, which tends to transcend rule-governed mechanical tasks, in line with the saying “to err is human.” It seems machines ought to be seen as fallible to be considered superior to humans in performing human tasks.

Guidelines for scale use

Supplementary Table 10 shows our finalized items for measuring MH (see the Supplementary material). Supplementary Tables 11 and 12 are our metrics for gauging MH toward mechanical and human tasks, respectively (see the Supplementary material). The formative indicators in Supplementary Table 10 are the building blocks of MH, while the reflective indicators are reflections of triggered MH and therefore more directly predictive of MH-prompted biased processing. The specific tasks under consideration in a given study could be inserted in place of the words “these tasks” in the reflective measures. Alternatively, broad classes of tasks could be referenced in these items, for example, by replacing “these tasks” with “mechanical tasks such as typing and counting.” The wording of the items would also need to be changed depending upon whether MH is measured as a moderator, i.e., a belief, or as a mediator, i.e., in direct reference to a specific machine used in the study. The wording in Supplementary Table 10 lends itself to the former, to assess each study participant’s degree of belief in MH. When using this construct as a mediator, it is important to modify it to refer to the specific machine with which participants interacted in the study. For example, an item in a study could read “this AI system has machine-like reliability” if using a reflective indicator, or “this AI system has machine-like efficiency” or “this AI system has human-like biases” when using a formative indicator. Molina and Sundar (2022) followed this strategy in their measurement of positive machine heuristic (e.g., “has machine-like accuracy”) and negative machine heuristic (e.g., “has human-like subjective judgments,” which was reverse-coded).

Utility of the Machine Heuristic scale

Machine heuristic has been measured as a mediator, a moderator, or a control variable in a variety of studies on the psychology of communication technologies. As a mediator, MH unpacks the causal relationship between machine agency cues (e.g., source attribution cues and system-generated cues) and users' psychological outcomes: a communicative entity's machine attribution triggers the user's MH as a state variable that affects the user's perceptions positively or negatively. Research has found that machine attribution increases the perceived credibility of news articles (Waddell, 2019) and the perceived usefulness of health-related decisions (Araujo et al., 2020), but decreases perceived emotional support (Meng & Dai, 2021), trust in customer service (Mays et al., 2022), and the perceived credibility of instructors in the context of affective learning (Edwards et al., 2016). Such findings are in line with the "cue route" of the HAII-TIME model (Sundar, 2020, p. 81), which predicts that interface manifestations and underlying attributes of AI trigger cognitive heuristics that affect user perceptions, expectations, and trust. Our new MH scale could help discover how a communicative entity's machine attribution on the interface, either as a source cue or via algorithmic attributes, affects users' perceptions as predicted by the HAII-TIME model.

In addition, system-generated cues, such as sociometric indicators about users displayed on social media, also affect users' perceptions, such as the perceived attractiveness, popularity, and self-confidence of other users (Edwards et al., 2013; Kleck et al., 2007; Tong et al., 2008). Likewise, cues generated by AI could trigger users' MH while they communicate with other users via AI. This can affect user perceptions of AI-mediated communication (AI-MC), which refers to "interpersonal communication that is not simply transmitted by technology, but modified, augmented, or even generated by a computational agent to achieve communication goals" (Hancock et al., 2020, p. 90). Thus, our MH scale can contribute to the design of machine interfaces and the assessment of user experience in the context of AI-MC. Some research has also treated MH as a moderator (e.g., Sundar & Kim, 2019) or a control variable (e.g., Jia & Liu, 2021), measuring individual differences in prior belief in MH. The new MH scale can be used to investigate these individual differences and help us causally predict the role of MH when it is triggered by interface cues conveying machine agency (Bellur & Sundar, 2014).

Limitations and directions for future research

Machine heuristic is a cognitive heuristic wherein people attribute machine characteristics or machine-like operation when making judgments about their interaction with a machine. Since the abilities of machines change as technologies develop, the characteristics people attribute to machines could change across generations, which suggests that scales capturing MH may need to be updated over time. Our scale is necessarily constrained by the six examples of machines used to prompt our respondents. While every effort was made to cover the wide gamut of machines, these examples do not include newer AI technologies that have seen rapid development and diffusion in recent years, performing tasks that were hitherto squarely in the human domain, such as creating new artworks, generating original prose, and composing music. Over time, these emerging abilities could alter human conceptions of machines and therefore of MH. For now, however, we are still in the era of artificial narrow intelligence (ANI) (e.g., Kalota, 2024), wherein AI is used only for specific tasks (Kaplan & Haenlein, 2019). As we enter the era of artificial general intelligence (AGI), wherein AI systems are "able to reason, plan, and solve problems autonomously for tasks they were never even designed for" (Kaplan & Haenlein, 2019, p. 16), and the era of artificial super intelligence (ASI), comprising "truly self-aware and conscious systems that, in a certain way, will make humans redundant" (Kaplan & Haenlein, 2019, p. 16), we may need to reexamine the characteristics people attribute to machines and reshape the MH scale accordingly. Future research would benefit from tracking the changing meaning of MH over time by rerunning our factor analysis with new and emergent technologies.

As the first formal attempt to develop a valid scale for measuring MH, we aimed to construct a comprehensive scale of seven items. While the seven reflective items will suffice for most research purposes, scholars interested in exploring formative aspects of MH may be guided by the six formative factors (comprising 29 items) identified in our study. Depending on circumstances, respondents might perceive these scales as too long to complete, and the questionnaire may risk causing fatigue. Thus, the research community would benefit from a shorter version of the scale that captures MH more efficiently. For example, the formative MH scale could comprise just six items (instead of 29), with one representative measure from each of the six factors that best fits the interaction scenario in the study. The reflective MH scale could likewise be shortened to three items (from the current seven) that parsimoniously address dependability, competence, and expectations, or even to a single item ("machines are better than humans in performing such tasks"). The literature has numerous examples of shorter measurement scales developed subsequent to the construction of a longer original version. We therefore expect scholars to employ more parsimonious versions of our scale and test their content validity; a sketch of how such a short form might be scored follows.
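To make the parsimony point concrete, scoring a shortened reflective scale reduces to averaging the retained items. Below is a minimal Python sketch; the response keys are hypothetical, and any short form would need its own validation, as noted above.

```python
from statistics import mean

# Hypothetical keys for the three retained reflective items
# (dependability, competence, expectations).
SHORT_FORM = ["mh_dependability", "mh_competence", "mh_expectations"]

def short_form_score(response: dict[str, int]) -> float:
    """Average the retained 7-point items into one reflective MH score."""
    return mean(response[item] for item in SHORT_FORM)

print(short_form_score(
    {"mh_dependability": 6, "mh_competence": 5, "mh_expectations": 4}
))  # 5.0
```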

Future research would also do well to complement our perceptual measurement of MH by exploring processual aspects of the concept. Since heuristics are mental shortcuts that are easily retrieved from the cognitive system, measures of construct accessibility, such as response/reaction time, can be quite helpful in assessing the trait strength of MH and thereby predicting the likelihood of its application in a given interaction situation.
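As one concrete, purely illustrative possibility (not a procedure used in our studies), response latency could be logged alongside each scale response; faster endorsement of MH items would suggest greater accessibility of the heuristic. A toy command-line sketch in Python, with hypothetical item wording:

```python
import time

def timed_item(prompt: str) -> tuple[str, float]:
    """Present one item and record the response latency in seconds."""
    start = time.perf_counter()
    answer = input(prompt + " (1-7): ")
    return answer, time.perf_counter() - start

answer, latency = timed_item("Machines are better than humans at such tasks.")
print(f"Response: {answer}; latency: {latency:.2f}s")
```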

Conclusion

This study provides a validated scale for measuring machine heuristic, featuring seven statements comparing machines with humans that can be applied to both mechanical and human tasks. We expect this measurement scale to be useful for a wide variety of studies pertaining to human–machine communication and machine-mediated communication.

Supplementary material

Supplementary material is available at Journal of Computer-Mediated Communication online.

Conflicts of interest. None declared.

Data availability

The data underlying this article will be shared on reasonable request to the first author.

Notes

1

This study requires you to voice your opinion using the scales below. It is important that you take the time to read all instructions and that you read questions carefully before you answer them. Previous research on preferences has found that some people do not take the time to read everything that is displayed in the questionnaire. The questions below serve to test whether you actually take the time to do so. Therefore, if you read this, please answer ‘three’ on the first question, add three to that number and use the result as the answer on the second question. Thank you for participating and taking the time to read all instructions.

I would prefer to live in a large city rather than a small city.

I would prefer to live in a city with many cultural opportunities, even if the cost of living was higher.

Participants were asked to rate their agreement with these two instructed-response items on a 7-point Likert scale. Participants who did not choose "3" for the first item or "6" for the second item were considered to have failed the attention check.

2

Workplace Facilities

Most modern theories of psychology recognize the fact that social perceptions do not take place in a social vacuum. Individual preferences and knowledge, along with situational variables can greatly impact the perception process. In order to facilitate our research on perceptions of workplace behaviors, we are interested in knowing certain factors about you, the perceiver. Specifically, we are interested in whether you actually take the time to read the directions; if not, then some of our manipulations that rely on changes in the instructions will be ineffective. So, in order to demonstrate that you have read the instructions, please ignore the facility items below. Instead, simply click the other option and in the corresponding box, enter the text: I read the instructions.

Which of these facilities are available at your workplace?

(Click on all that apply)

Right below this instruction and question, participants were provided the following options: Canteen/vending machine, lounge, coffee maker, air conditioning/heating, storeroom, washroom, windows, parking, childcare facilities, and other with a blank.
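The scoring logic for both attention checks (Notes 1 and 2) can be stated compactly. A minimal Python sketch, assuming responses are stored in a dictionary with hypothetical keys:

```python
def passes_attention_checks(resp: dict) -> bool:
    """Apply the scoring rules from Notes 1 and 2.

    Note 1: the two instructed Likert items must be answered 3 and 6.
    Note 2: only 'other' may be selected, with the exact requested text.
    """
    likert_ok = resp["city_item_1"] == 3 and resp["city_item_2"] == 6
    facilities_ok = (
        resp["facilities"] == ["other"]
        and resp["facilities_other_text"].strip().lower()
        == "i read the instructions"
    )
    return likert_ok and facilities_ok

print(passes_attention_checks({
    "city_item_1": 3,
    "city_item_2": 6,
    "facilities": ["other"],
    "facilities_other_text": "I read the instructions",
}))  # True
```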

3

The normality of the data for this model was checked based on Kline's (2015) guidelines. We found no substantial skewness or kurtosis, and the data were multivariate normal: Mardia's coefficient = 202.48, which is less than P(P + 2) = 899, where P = the number of observed variables in the formative measurement model of MH = 29.
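This and the following notes apply the same decision rule: Mardia's multivariate kurtosis coefficient is compared against the P(P + 2) benchmark. For readers who wish to replicate the check on their own data, here is a minimal NumPy sketch; the function names are ours, and responses are assumed to form a respondents-by-items matrix:

```python
import numpy as np

def mardia_kurtosis(X: np.ndarray) -> float:
    """Mardia's multivariate kurtosis b2p for an (n, p) response matrix."""
    n, p = X.shape
    centered = X - X.mean(axis=0)
    S = centered.T @ centered / n  # ML covariance, per Mardia's definition
    S_inv = np.linalg.pinv(S)
    # Squared Mahalanobis distance of each respondent from the centroid
    d2 = np.einsum("ij,jk,ik->i", centered, S_inv, centered)
    return float(np.mean(d2 ** 2))

def within_benchmark(X: np.ndarray) -> bool:
    """Check b2p against the P(P + 2) cutoff used in these notes."""
    p = X.shape[1]
    return mardia_kurtosis(X) < p * (p + 2)

rng = np.random.default_rng(0)
sample = rng.normal(size=(400, 7))  # stand-in for 7 reflective items
print(mardia_kurtosis(sample), 7 * (7 + 2))
```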

4

The normality of the data for this model was checked. We found no substantial skewness or kurtosis, and the data were multivariate normal: Mardia's coefficient = 26.12, which is less than P(P + 2) = 63, where P = the number of observed variables in the reflective measurement model of MH = 7.

5

The normality of the data for this model was checked. We found no substantial skewness or kurtosis, and the data were multivariate normal: Mardia's coefficient = 13.83, which is less than P(P + 2) = 63, where P = the number of observed variables in the reflective measurement model of MH = 7.

6

The normality of the data for this model was checked. We found no substantial skewness or kurtosis, and the data were multivariate normal: Mardia's coefficient = 302.77, which is less than P(P + 2) = 1368, where P = the number of observed variables in the full measurement model of MH (formative and reflective) = 36.

7

The normality of the data for this model was checked. We found no substantial skewness or kurtosis, and the data were multivariate normal: Mardia's coefficient = 291.98, which is less than P(P + 2) = 1368, where P = the number of observed variables in the full measurement model of MH (formative and reflective) = 36.

References

Abbey, J. D., & Meloy, M. G. (2017). Attention by design: Using attention checks to detect inattentive respondents and improve data quality. Journal of Operations Management, 53–56(1), 63–70.
Al-Senaidi, S., Lin, L., & Poirot, J. (2009). Barriers to adopting technology for teaching and learning in Oman. Computers & Education, 53(3), 575–590.
Araujo, T., Helberger, N., Kruikemeier, S., & De Vreese, C. H. (2020). In AI we trust? Perceptions about automated decision-making by artificial intelligence. AI & Society, 35(3), 611–623.
Ashworth, P. (1987). Technology and machines—bad masters but good servants. Intensive Care Nursing, 3(1), 1–2.
Astobiza, A. M. (2023). Do people believe that machines have minds and free will? Empirical evidence on mind perception and autonomy in machines. AI and Ethics, 1–9.
Banks, J., Edwards, A. P., & Westerman, D. (2021). The space between: Nature and machine heuristics in evaluations of organisms, cyborgs, and robots. Cyberpsychology, Behavior, and Social Networking, 24(5), 324–331.
Barnard, A. (1997). A critical review of the belief that technology is a neutral object and nurses are its master. Journal of Advanced Nursing, 26(1), 126–131.
Bellur, S., & Sundar, S. S. (2014). How can we tell when a heuristic has been used? Design and analysis strategies for capturing the operation of heuristics. Communication Methods and Measures, 8(2), 116–137.
Bentler, P. M., & Chou, C. P. (1987). Practical issues in structural modeling. Sociological Methods & Research, 16(1), 78–117.
Blar, N., Idris, S. A., Jafar, F. A., & Ali, M. M. (2014, July). Robot and human teacher. Proceedings of 2014 International Conference on Computer, Information and Telecommunication Systems (CITS), 1–3. IEEE.
Bohannon, M. (2024, February 23). Change Healthcare cyberattack disrupts services nationwide—Here's what to know. Forbes. https://www.forbes.com/sites/mollybohannon/2024/02/23/change-healthcare-cyberattack-disrupts-services-nationwide-heres-what-to-know/?sh=7f9758eb85b9
Cattell, R. B. (1978). The scientific use of factor analysis in behavioral and life sciences. Springer.
Chaffee, S. H. (1991). Explication. SAGE Publications.
Chaiken, S. (1980). Heuristic versus systematic information processing and the use of source versus message cues in persuasion. Journal of Personality and Social Psychology, 39(5), 752.
Chignell, M. H., & Hancock, P. A. (1986). Knowledge-based load leveling and task allocation in human-machine systems. NASA Technical Reports Server. https://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/19860023513.pdf
Coltman, T., Devinney, T. M., Midgley, D. F., & Venaik, S. (2008). Formative versus reflective measurement models: Two applications of formative measurement. Journal of Business Research, 61(12), 1250–1262.
Colton, S. (2008, March). Creativity versus the perception of creativity in computational systems. Proceedings of AAAI Spring Symposium: Creative Intelligent Systems, 8, 7. https://www.aaai.org/Papers/Symposia/Spring/2008/SS-08-03/SS08-03-003.pdf
Copeland, B. J. (Ed.). (2004). The essential Turing. Oxford University Press.
Costello, A. B., & Osborne, J. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical Assessment, Research, and Evaluation, 10(1), 7.
Daston, L., & Galison, P. (1992). The image of objectivity. Representations, 40, 81–128.
DeVellis, R. F. (2012). Scale development: Theory and applications. SAGE Publications.
Dillman, D. A. (1978). Mail and telephone surveys: The total design method (Vol. 19). Wiley.
Edwards, A., Edwards, C., Spence, P. R., Harris, C., & Gambino, A. (2016). Robots in the classroom: Differences in students' perceptions of credibility and learning between "teacher as robot" and "robot as teacher". Computers in Human Behavior, 65, 627–634.
Edwards, C., Spence, P. R., Gentile, C. J., Edwards, A., & Edwards, A. (2013). How much Klout do you have… A test of system generated cues on source credibility. Computers in Human Behavior, 29(5), A12–A16.
Egelman, S., & Peer, E. (2015, April). Scaling the security wall: Developing a security behavior intentions scale (SeBIS). Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, 2873–2882.
Elsayed, A. (2006). Machine-mediated communication: The technology. Proceedings of the Sixth International Conference on Advanced Learning Technologies (ICALT'06), 1194–1195.
Fan, L., Liu, X., Wang, B., & Wang, L. (2017). Interactivity, engagement, and technology dependence: Understanding users' technology utilisation behaviour. Behaviour & Information Technology, 36(2), 113–124.
Field, A. (2005). Discovering statistics using SPSS (2nd ed.). Sage.
Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 18(1), 39–50.
Gaut, B. (2010). The philosophy of creativity. Philosophy Compass, 5(12), 1034–1046.
Gelman, S. A., & Legare, C. H. (2011). Concepts and folk theories. Annual Review of Anthropology, 40, 379–398.
Goddard, K., Roudsari, A., & Wyatt, J. C. (2012). Automation bias: A systematic review of frequency, effect mediators, and mitigators. Journal of the American Medical Informatics Association, 19(1), 121–127.
Gunkel, D. J. (2012). Communication and artificial intelligence: Opportunities and challenges for the 21st century. Communication +1, 1(1), 1–25.
Hair, J. F., Anderson, R. E., Babin, B. J., & Black, W. C. (2010). Multivariate data analysis (7th ed.). Pearson.
Hancock, J. T., Naaman, M., & Levy, K. (2020). AI-mediated communication: Definition, research agenda, and ethical considerations. Journal of Computer-Mediated Communication, 25(1), 89–100.
Haslam, N. (2006). Dehumanization: An integrative review. Personality and Social Psychology Review, 10(3), 252–264.
Haslam, N., Bain, P., Douge, L., Lee, M., & Bastian, B. (2005). More human than you: Attributing humanness to self and others. Journal of Personality and Social Psychology, 89(6), 937.
Hoe, S. L. (2008). Issues and procedures in adopting structural equation modelling technique. Journal of Quantitative Methods, 3(1), 76. https://ink.library.smu.edu.sg/sis_research/5168
Hong, J. W. (2018, July). Bias in perception of art produced by artificial intelligence. Proceedings of International Conference on Human-Computer Interaction, 290–303.
Hu, L. T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1–55.
Jia, C., & Liu, R. (2021). Algorithmic or human source? Examining relative hostile media effect with a transformer-based framework. Media and Communication, 9(4), 170–181.
Jian, J. Y., Bisantz, A. M., & Drury, C. G. (2000). Foundations for an empirically determined scale of trust in automated systems. International Journal of Cognitive Ergonomics, 4(1), 53–71.
Kalota, F. (2024). A primer on generative artificial intelligence. Education Sciences, 14(2), 172.
Kaplan, A., & Haenlein, M. (2019). Siri, Siri, in my hand: Who's the fairest in the land? On the interpretations, illustrations, and implications of artificial intelligence. Business Horizons, 62(1), 15–25.
Kiran, D. R. (2019). Production planning and control: A comprehensive approach. Elsevier Science.
Kleck, C. A., Reese, C. A., Behnken, D. Z., & Sundar, S. S. (2007, May 24–28). The company you keep and the image you project: Putting your best face forward in online social networks [Conference presentation]. The 57th Annual Conference of the International Communication Association, San Francisco, CA, United States. http://www.homeworkgain.com/wp-content/uploads/edd/2019/08/20190612225929thecompanyyoukeep.pdf
Kline, R. B. (2015). Principles and practice of structural equation modeling (4th ed.). Guilford Press.
Krisher, T., & Johnson, G. (2024, April 24). Tesla driver in Seattle-area crash that killed motorcyclist told police he was using Autopilot. The Associated Press. https://apnews.com/article/tesla-crash-washington-autopilot-motorcyclist-killed-a572c05882e910a665116e6aaa1e6995
Kung, F. Y., Kwok, N., & Brown, D. J. (2018). Are attention check questions a threat to scale validity? Applied Psychology, 67(2), 264–283.
Lamb, C., Brown, D. G., & Clarke, C. L. (2018). Evaluating computational creativity: An interdisciplinary tutorial. ACM Computing Surveys (CSUR), 51(2), 1–34.
Lee, M. K. (2018). Understanding perception of algorithmic decisions: Fairness, trust, and emotion in response to algorithmic management. Big Data & Society, 5(1), 1–16.
Madhavan, P., & Wiegmann, D. A. (2007). Similarities and differences between human–human and human–automation trust: An integrative review. Theoretical Issues in Ergonomics Science, 8(4), 277–301.
Marshall, C. C., & Shipman, F. M. (2013, May). Experiences surveying the crowd: Reflections on methods, participation, and reliability. Proceedings of the 5th Annual ACM Web Science Conference, 234–243.
Mays, K. K., Katz, J. E., & Groshek, J. (2022). Mediated communication and customer service experiences: Psychological and demographic predictors of user evaluations in the United States. Periodica Polytechnica Social and Management Sciences, 30(1), 1–11. http://hdl.handle.net/10125/64079
Meng, J., & Dai, Y. N. (2021). Emotional support from AI chatbots: Should a supportive partner self-disclose or not? Journal of Computer-Mediated Communication, 26(4), 207–222.
Merritt, S. M. (2011). Affective processes in human–automation interactions. Human Factors, 53(4), 356–370.
Molina, M. D., & Sundar, S. S. (2022). When AI moderates online content: Effects of human collaboration and interactive transparency on user trust. Journal of Computer-Mediated Communication, 27(4), zmac010.
Montague, A., & Matson, F. W. (1983). The dehumanization of man. McGraw-Hill.
Mosier, K. L., & Skitka, L. J. (1999, September). Automation use and automation bias. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 43(3), 344–348. Sage.
Mosier, K. L., Skitka, L. J., Burdick, M. D., & Heers, S. T. (1996, October). Automation bias, accountability, and verification behaviors. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 40(4), 204–208. Sage.
Mueller, W. S., Clegg, C. W., Wall, T. D., Kemp, N. J., & Davies, R. T. (1986). Pluralist beliefs about new technology within a manufacturing organization. New Technology, Work and Employment, 1(2), 127–139.
Mvududu, N. H., & Sink, C. A. (2013). Factor analysis in counseling research and practice. Counseling Outcome Research and Evaluation, 4(2), 75–98.
Nass, C. I., Lombard, M., Henriksen, L., & Steuer, J. (1995). Anthropocentrism and computers. Behaviour & Information Technology, 14(4), 229–238.
Ohanian, R. (1990). Construction and validation of a scale to measure celebrity endorsers' perceived expertise, trustworthiness, and attractiveness. Journal of Advertising, 19(3), 39–52.
Parasuraman, R., & Manzey, D. H. (2010). Complacency and bias in human use of automation: An attentional integration. Human Factors, 52(3), 381–410.
Parasuraman, R., Sheridan, T. B., & Wickens, C. D. (2008). Situation awareness, mental workload, and trust in automation: Viable, empirically supported cognitive engineering constructs. Journal of Cognitive Engineering and Decision Making, 2(2), 140–160.
Ragot, M., Martin, N., & Cojean, S. (2020, April). AI-generated vs. human artworks. A perception bias towards artificial intelligence? Proceedings of Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems, 1–10.
Redmiles, E. M., Kross, S., & Mazurek, M. L. (2019, May). How well do my results generalize? Comparing security and privacy survey results from MTurk, web, and telephone samples. Proceedings of 2019 IEEE Symposium on Security and Privacy, 1326–1343.
Ruse, M. (2002). Robert Boyle and the machine metaphor. Zygon, 37(3), 581–596.
Ruse, M. (2005). Darwinism and mechanism: Metaphor in science. Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 36(2), 285–302.
Sass, D. A., & Schmitt, T. A. (2010). A comparative investigation of rotation criteria within exploratory factor analysis. Multivariate Behavioral Research, 45(1), 73–103.
Shrestha, N. (2021). Factor analysis as a tool for survey analysis. American Journal of Applied Mathematics and Statistics, 9(1), 4–11.
Skitka, L. J., Mosier, K. L., & Burdick, M. (1999). Does automation bias decision-making? International Journal of Human-Computer Studies, 51(5), 991–1006.
Strothotte, T., & Schlechtweg, S. (2002). Non-photorealistic computer graphics: Modeling, rendering, and animation. Morgan Kaufmann.
Subramaniam, S. K. A. L., Husin, S. H. B., Yusop, Y. B., & Hamidon, A. H. B. (2009). Machine efficiency and man power utilization on production lines. Proceedings of the 8th WSEAS International Conference on Electronics, Hardware, Wireless and Optical Communications, 70–75.
Sundar, S. S. (2008). The MAIN model: A heuristic approach to understanding technology effects on credibility. In M. J. Metzger & A. J. Flanagin (Eds.), Digital media, youth, and credibility (pp. 72–100). The MIT Press.
Sundar, S. S. (2020). Rise of machine agency: A framework for studying the psychology of human–AI interaction (HAII). Journal of Computer-Mediated Communication, 25(1), 74–88.
Sundar, S. S., & Kim, J. (2019, May). Machine heuristic: When we trust computers more than humans with our personal information. Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 1–9.
Sundar, S. S., & Lee, E. J. (2022). Rethinking communication in the era of artificial intelligence. Human Communication Research, 48(3), 379–385.
Sundar, S. S., & Nass, C. (2001). Conceptualizing sources in online news. Journal of Communication, 51(1), 52–72.
Thomas, K. W., & Velthouse, B. A. (1990). Cognitive elements of empowerment: An "interpretive" model of intrinsic task motivation. Academy of Management Review, 15(4), 666–681.
Tong, S. T., Van Der Heide, B., Langwell, L., & Walther, J. B. (2008). Too much of a good thing? The relationship between number of friends and interpersonal impressions on Facebook. Journal of Computer-Mediated Communication, 13(3), 531–549.
Turing, A. M. (1950). Computing machinery and intelligence. Mind, LIX(236), 433–460. https://www.jstor.org/stable/2251299
Turkle, S. (1984). The second self: Computers and the human spirit. Simon & Schuster.
Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5(2), 207–232.
Tyson, A., Pasquini, G., Spencer, A., & Funk, C. (2023, February 22). 60% of Americans would be uncomfortable with provider relying on AI in their own health care. Pew Research Center. https://www.pewresearch.org/science/2023/02/22/60-of-americans-would-be-uncomfortable-with-provider-relying-on-ai-in-their-own-health-care/
van Dalen, A. (2012). The algorithms behind the headlines: How machine-written news redefines the core skills of human journalists. Journalism Practice, 6(5–6), 648–658.
Velicer, W. F., & Jackson, D. N. (1990). Component analysis versus common factor analysis: Some further observations. Multivariate Behavioral Research, 25(1), 97–114.
Waddell, T. F. (2018). A robot wrote this? How perceived machine authorship affects news credibility. Digital Journalism, 6(2), 236–255.
Waddell, T. F. (2019). Can an algorithm reduce the perceived bias of news? Testing the effect of machine attribution on news readers' evaluations of bias, anthropomorphism, and credibility. Journalism & Mass Communication Quarterly, 96(1), 82–100.
Wang, S. (2021). Moderating uncivil user comments by humans or machines? The effects of moderation agent on perceptions of bias and credibility in news content. Digital Journalism, 9(1), 64–83.
Waytz, A., & Norton, M. I. (2014). Botsourcing and outsourcing: Robot, British, Chinese, and German workers are for thinking—not feeling—jobs. Emotion, 14(2), 434.
Wolf, E. J., Harrington, K. M., Clark, S. L., & Miller, M. W. (2013). Sample size requirements for structural equation models: An evaluation of power, bias, and solution propriety. Educational and Psychological Measurement, 73(6), 913–934.
Yan, T., Conrad, F. G., Tourangeau, R., & Couper, M. P. (2011). Should I stay or should I go: The effects of progress feedback, promised task duration, and length of questionnaire on completing web surveys. International Journal of Public Opinion Research, 23(2), 131–147.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Associate Editor: Adam Joinson