Nicole Gotzner, Diana Mazzarella, Negative strengthening: The interplay of evaluative polarity and scale structure, Journal of Semantics, Volume 41, Issue 1, February 2024, Pages 103–117, https://doi.org/10.1093/jos/ffae004
Abstract
This work investigates absolute adjectives in the not very construction and how their pragmatic interpretation depends on the evaluative polarity and the scale structure of their antonymic pairs. Our experimental study reveals that evaluatively positive adjectives (clean) are more likely to be strengthened than evaluatively negative ones (dirty), and that maximum standard adjectives (clean or closed) are more likely to be strengthened than minimum standard ones (dirty or open). Our findings suggest that both evaluative polarity and scale structure drive the asymmetric interpretation of gradable adjectives under negation. Overall, our work adds to the growing literature on the interplay between pragmatic inference, valence and semantic meaning.
1. Introduction
Since the seminal work of Osgood et al. (1957), research in psychology has explored the role of connotation in language understanding. Osgood and colleagues identified three basic components to describe the affective dimension of word meaning across cultures: valence (pleasantness evoked by a stimulus), dominance (degree of control exerted by a stimulus), and arousal (intensity of emotion evoked by a stimulus) (see for example Warriner et al., 2013; Mohammad, 2018 for emotion lexicons). Nevertheless, connotation has been largely overlooked in formal studies of meaning (but see Nouwen, 2021; Beltrama et al., 2023, a.o.): “In formal linguistics, the focus is rather on the denotative meanings, which are generally void of affective content” (Nouwen, 2021: 1). This is all the more surprising given that some word classes, such as antonymic adjectives, exhibit strong biases in terms of whether they are perceived as positive or negative (e.g. Boucher & Osgood, 1969; Paradis et al., 2012; Mohammad, 2018).
The interpretation of adjectives presents a fertile ground for exploring the interplay of connotative and truth-conditional meaning, especially when their valence interacts with the pragmatic effects of negation (see for example Israel, 2004). Here, we focus on a particular pragmatic inference—negative strengthening—which has been argued to be socially motivated (Brown et al., 1987; Horn, 1989). For example, an utterance like Joe’s suit is not very clean may implicate that “Joe’s suit is dirty” as the more hedged form avoids the direct expression of a negative property. In this case, the speaker chose a complex expression involving a negated adjective over a simple antonymic expression. The key finding underpinning the assumption that negative strengthening is socially motivated is that the interpretation of negated statements tends to be asymmetric: positive adjectives like clean (in not very clean) or happy (in not happy) are strengthened more readily than corresponding negative antonyms. The speaker’s choice of a negative antonym (not sad) is arguably not motivated by avoiding the direct expression of a positive adjective (happy) as there is no social reason to do so.
This so-called polarity asymmetry in negative strengthening has been corroborated experimentally (e.g. Ruytenbeek et al., 2017; Gotzner & Mazzarella, 2021; Mazzarella & Gotzner, 2021). Yet most experimental work on negative strengthening has focused on relative adjectives under negation (e.g. not happy), which, due to their inherent vagueness, can be easily exploited for such pragmatic functions (Gotzner et al., 2018a, 2018b; Leffel et al., 2019). As Horn (1989) shows, such strengthening implicatures occur in a wide range of constructions and may underpin a variety of linguistic phenomena. However, the vagueness of the expressions involved may be crucial for such an implicature to arise (Krifka, 2002; Leffel et al., 2019).
In what follows, we investigate the interpretation of absolute adjectives in the not very construction, as the modifier very arguably coerces them into vague relative interpretations. We provide experimental evidence that both evaluative polarity and scale structure (what type of standard underlies the adjective’s semantics) affect negative strengthening.
Our paper is structured as follows. First, we present the theoretical background on antonymy, polarity, and negative strengthening. We then discuss how scale structure affects truth-conditional and pragmatic meaning (Section 2). In Section 3, we present experimental evidence that evaluative polarity and scale structure are crucial to negative strengthening. Finally, we discuss these findings in the context of the growing literature on the interplay between valence and truth-conditional meaning (Section 4).
2. Theoretical Background
2.1. Antonymy and Polarity
Many lexical items participate in an antonymic relation with another item. But not all antonyms express the concept of opposition in a similar way (for seminal work on antonymy in philosophy and linguistics see Vendler, 1963; Givón, 1970; Lehrer & Lehrer, 1982; Cruse, 1986 and Horn, 1989). Non-gradable antonyms like dead or alive obey the Law of the Excluded Middle: one cannot be neither dead nor alive at the same time. In contrast, a subset of gradable adjectives, relative adjectives like large and small, allow for a middle ground between the extension of the positive term and its negative antonym, where neither term applies (see for example Cruse, 1986).
Each member of an antonymic pair represents a polar opposite. Polarity is typically defined based on three distinct criteria, which often but not always converge (see Cruse, 1986 and Ruytenbeek et al., 2017 for a discussion). The first one, evaluative polarity, is based on the perceived valence resulting from desirability judgements. While clean is considered to be a desirable property by most people, and hence evaluatively positive (E+), dirty is judged as undesirable and thus evaluatively negative (E−) (see Boucher & Osgood, 1969; Mohammad, 2018; Buechel et al., 2020 for valence judgments across multiple languages). The second notion, dimensional polarity, is based on the measurement scale associated with the antonymic pair. For each antonymic pair, the positive pole is the one mapping onto the relevant dimension for the measurement on the scale. Interestingly, the pair dirty/clean is one in which the evaluative and dimensional notions of polarity mismatch: dirty is positive (Dim+) and clean is negative (Dim−), as we quantify the amount of dirt (something with a greater amount of dirt counts as dirtier). The third notion of polarity is based on markedness. Typically the positive pole is morphologically unmarked, whereas the negative one is marked (e.g. certain/uncertain).1
2.2. Negative strengthening, face management, and the positivity bias
It is well established that polarity influences the interpretation of adjectives under negation. Research on negative strengthening, for instance, has identified polarity as a crucial factor affecting the availability of a contrary reading (see, e.g. Ducrot, 1973; Horn, 1989). That is, the negation of an adjective is more likely to be strengthened to convey the affirmation of the antonym when the negated adjective is positive (not happy to mean “unhappy”) than when it is negative (not unhappy to mean “happy”).
Recent experimental studies provide evidence for such a polarity asymmetry, while mainly looking at the class of relative adjectives like happy (Colston, 1999; Fraenkel & Schul, 2008; Ruytenbeek et al., 2017; Gotzner & Mazzarella, 2021; Mazzarella & Gotzner, 2021; Gotzner & Kiziltan, 2022; cf. Paradis & Willners, 2006 for asymmetric interpretations of negated absolute adjectives).2
While these studies tested antonymic pairs in which different notions of polarity converge (see especially Ruytenbeek et al., 2017), standard explanations of the asymmetric interpretation of negated positive and negative adjectives attribute a crucial role to speakers’ and hearers’ sensitivity to evaluative polarity.3
Existing proposals agree that hearers should expect negative evaluations to be conveyed via the negation of a positive adjective (but not the other way around), but diverge with respect to the underlying pragmatic reasoning. Hearers may assume that speakers intend to convey negative evaluations indirectly to i) leave themselves a loophole (Seright, 1966), ii) preserve plausible deniability (Keenan, 1976; Krifka, 2002), iii) mitigate a face-threatening act (Brown et al., 1987; Horn, 1989, 2017), or iv) simply avoid straightforwardly negative expressions (e.g. Terkourafi et al., 2020; Mazzarella & Gotzner, 2021; building on the Pollyanna Principle from Boucher & Osgood, 1969). Due to their inherent vagueness, relative adjectives are ideal candidates to serve these types of pragmatic reasoning, as their meaning leaves several interpretative possibilities open. As Keenan (1976) and Krifka (2002) have argued “one is not easily proven wrong if one stays vague”. This opens up the question of the generalizability of the polarity asymmetry of negative strengthening to other types of adjectives. As we discuss in the next section, so-called absolute adjectives represent an interesting test case: while they typically do not show characteristics of vagueness, they can be coerced into relative-like vague interpretations by modifiers like very. This allows us to examine whether valence interacts with other semantic features, such as scale structure, which arguably play a role in modulating pragmatic inference.
2.3. Scale structure and pragmatic inferences
The distinction between relative and absolute gradable adjectives is based on the types of standards they invoke (see especially Bierwisch, 1989; Kamp & Rossdeutscher, 1994; Rotstein & Winter, 2004; Kennedy & McNally, 2005; Solt, 2015a). Relative adjectives involve context-dependent standards, which vary depending on the comparison class (e.g. the cut-off point for classifying someone as tall will be different among the class of basketball players and average American males).4 In turn, absolute adjectives have a fixed standard of comparison, which either corresponds to the minimum (dirty) or maximum degree (clean) of a scale. Evidence for such a distinction comes from linguistic tests involving the propensity of different adjective types to combine with modifiers like completely, which references the end points on a scale, or slightly, which indicates a minimum degree (Kennedy & McNally, 2005). Relative adjectives only combine with modifiers like very that shift the standard on the corresponding scale but do not reference a lower or upper bound (see for example Klein, 1980; Kennedy & McNally, 2005).
The relative/absolute distinction goes hand in hand with distinct entailment patterns and the propensity for different implicatures to be derived. As shown in (3) and (4), respectively, the negation of an absolute adjective like clean entails the assertion of its antonym dirty while this is not the case for relative adjectives like large and small (Cruse, 1986, Rotstein & Winter, 2004, Kennedy, 2007, a.o.). Interestingly, absolute adjectives in the not very construction pattern with relative adjectives: they also lack the entailment to their antonym, as shown in (5). Not very clean in (5) could mean “clean but not very much so,” “slightly dirty,” or “filthy,” leaving room for pragmatic inference. Arguably, not very clean could be used to avoid expressing the evaluatively negative dirty directly, as typical for relative adjectives.
(3) The shirt is not clean ⇒ The shirt is dirty.
(4) The door is not large ⇏ The door is small.
(5) The shirt is not very clean ⇏ The shirt is dirty.
In principle, absolute adjectives should be infelicitous with vague modifiers like very. Yet Kennedy & McNally (2005: 369ff.) discuss counterexamples to this generalization in which very coerces absolute adjectives into relative-like imprecise interpretations. The acceptability of very with absolute adjectives correlates with different readings and entailment patterns. For example, when dry describes a stable property such as the average degree of moisture in the atmosphere, it has a relative interpretation. Thus dry is felicitous with very in (6-a) and the negation in (6-b) does not generate an entailment to the antonym wet, hence the continuation is felicitous. If, however, dry is used to describe a transient property like the amount of moisture on a surface (7-a), it has an absolute interpretation, and the negation in (7-b) entails that the glasses are wet (Kennedy & McNally, 2005: 371).
(6) a. This region of the country is very dry.
b. This region of the country is not dry (but it’s not wet either)
(7) a. The glasses are very dry.
b. The glasses are not dry (#but they are not wet either)
The relative/absolute distinction has been shown to be crucial to the derivation of pragmatic inferences under negation (Gotzner et al., 2018a, 2018b; Leffel et al., 2019; Alexandropoulou & Gotzner, 2022; Gotzner & Kiziltan, 2022). One common finding of these previous studies is that relative adjectives are preferably negatively strengthened (not pretty to mean “rather ugly”) while absolute adjectives are more likely to trigger variants of scalar implicatures. These different kinds of implicature are logically incompatible and anticorrelated, which means that participants either derive negative strengthening or scalar implicatures for a given adjective (Gotzner et al., 2018b).
Leffel et al. (2019) provided an account of the role of vagueness in implicature via a borderline constraint:
Borderline constraint: if a sentence S and its alternative S1 are such that S ∧ ¬ S1 could be a borderline contradiction, then the (scalar) implicature resulting from the negation of S1 is not derived.
The basic idea is that relative adjectives in their bare and modified form, such as large and very large, are more difficult to distinguish. Since there are few degrees that clearly count as large but not very large, there could be a borderline contradiction. Not very large therefore does not typically license a scalar implicature to large but tends to be interpreted as rather small (negative strengthening). In the case of minimum standard adjectives such as late, on the other hand, there is a clear cut-off point for times that count as late and one can easily make a distinction between late and very late arrivals. Thus, there is no potential borderline contradiction and the indirect scalar implicature is derived. For example, an utterance like Joe was not very late would convey that “he was late but not very much so.”
Leffel et al.’s account explains the difference between relative and minimum standard adjectives based on the competition between the modified forms (not very late) and negated variants (not late) that are informationally stronger. Maximum standard adjectives have not been tackled by Leffel et al. and they present an intriguing theoretical puzzle. The combination of “very” with absolute adjectives results in different entailment patterns for minimum and maximum standard adjectives. As shown in (8), very dirty entails the minimum standard dirty.
However, as shown in (9), because of the potential borderline contradiction between full and not very full, the borderline constraint should make the scalar implicature of maximum standard adjectives less available (e.g. “The glass is full” from The glass is not very full in 10). As a result, The glass is not very full should be more likely to be negatively strengthened to “The glass is rather empty”.5
(8) The shirt is very dirty ⇒ The shirt is dirty.
(9) #The glass is (completely) full and not very full.
(10) The glass is very full ⇏ The glass is full (strictly speaking)
The present study seeks to test this generalization in order to evaluate accounts that assume vagueness to be crucial in the derivation of negative strengthening (Krifka, 2002; Leffel et al., 2019). By shifting our focus from relative to absolute adjectives, we plan to assess the robustness of the polarity asymmetry in negative strengthening and the extent to which it carries over to cases in which the modifier very coerces absolute adjectives into a relative-like interpretation.
3. The present study
3.1. Goals and predictions
The present study pursues two goals. Goal (1) is to establish whether there is a polarity asymmetry for absolute adjectives in the not very construction. Goal (2) is to test whether evaluative polarity and scale structure jointly modulate negative strengthening. Examples (11)–(14) exemplify the kinds of antonyms tested in our experiment. We tested two within-subject factors: evaluative polarity (E+, E−) and adjective type (minimum standard, maximum standard).
(11) The shirt is not very dirty (E−, minimum standard)
(12) The shirt is not very clean (E+, maximum standard)
(13) The door is not very open (E+, minimum standard)
(14) The door is not very closed (E−, maximum standard)
Our main hypothesis, preregistered at https://osf.io/t7ar6, is that evaluativity should give rise to a polarity asymmetry as reflected in a main effect: E+ adjectives should be strengthened more than E− adjectives. If this hypothesis is confirmed, the presence of additional effects related to adjective type will suggest that:
H1: adjective type also affects the degree of negative strengthening (resulting in a main effect of adjective type), or
H2: the strength of the polarity asymmetry is modulated by adjective type (resulting in an interaction between evaluativity and adjective type).
3.2. Methods
3.2.1. Participants
We recruited 100 participants with US IP addresses on Mechanical Turk, based on a power simulation with pilot data. Participants were screened for native language and only included in the analysis if their self-reported native language was English. Based on our exclusion criteria, 96 participants were included in the final analysis (68 male, 27 female, 1 no gender reported). Their mean age was 37.57, with a standard deviation of 9.68 (age range 24–65). The experiment lasted 10 minutes and participants were paid $0.80. The experiment was conducted following the ethics policy of the Deutsche Forschungsgemeinschaft (DFG). The protocol was approved by the Ethics Board of the Deutsche Gesellschaft für Sprachwissenschaft (DGfS). Participants' consent was obtained at the start of the survey and their data were fully anonymized.
3.2.2. Materials
We selected ten antonymic pairs belonging to the class of absolute adjectives from the literature and verified that the combination of the different adjectives with the modifier very is felicitous (checking co-occurrences in COCA and Google search). A research assistant and the first author of this paper annotated the list of adjectives for polarity and scale structure. In particular, the following linguistic tests were used: subjective judgment of desirability as a measure of evaluative polarity and modification with slightly/completely/fully as a test of scale structure (Kennedy & McNally, 2005). Only adjective pairs on which the two annotators’ judgments converged were selected as experimental materials. Appendix 1 shows all items and test sentences. We further extracted valence values from the VAD corpus (Mohammad, 2018; see also Nouwen, 2021 for a similar application of this corpus in semantics). This was used as a further criterion to select from the initial list of items tested in a pilot study (n = 78). In this pilot study, which used a larger item set, we found a main effect of evaluative polarity but not of dimensional polarity.6 We selected five items from the pilot based on the valence metric in Mohammad (2018) and added a further set of five items, in which evaluative polarity and adjective type were in opposite directions. We prioritized the armchair judgments of polarity within the sentence context since the corpus measures do not distinguish among different word senses and since the theoretical hypotheses are about binary judgments of polarity.
Before conducting our main experiment, we ran an entailment pretest, in which the items were used without the modifier very. This was done to ensure that participants judged corresponding antonyms as entailing each other to a similar degree. Any asymmetry potentially arising in our main experiment with the not very construction would thus be due to pragmatic strengthening (and not to baseline differences in entailment patterns). The results of this pretest are available at https://osf.io/9vctf/.
Our experiment had a 2 (Polarity: E+, E−) × 2 (Adjective type: minimum standard, maximum standard) within-subject, within-item design. Each participant saw both members of each of the ten antonymic pairs and hence completed twenty experimental trials. Table 1 shows an example item as used in the main experiment.
Imagine you are being told “Joe’s suit is not very dirty”. According to the statement, Joe’s suit is:
dirty 1 2 3 4 5 6 7 clean
Participants’ task was to indicate what the speaker wanted to communicate on a scale ranging from the adjective used in the sentence frame to its antonym. For example, in the example item, participants judged the extent to which—according to the statement—the suit is dirty/clean. Judgments were given on a 7-point Likert scale anchored at the negated adjective of the test sentence (1) and its antonym (7).7 Hence, we measured the degree of negative strengthening as a function of the likelihood with which the antonym of a pair is taken to be conveyed by the speaker’s utterance.
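The design and the anchoring of the response scale can be made concrete with a minimal sketch that derives the twenty critical trials from the ten pairs in Appendix 1. The names `Trial` and `build_trials` and the sentence-frame representation are hypothetical choices made for illustration; this is not the script used to run the experiment, and filler trials are omitted.

```python
from dataclasses import dataclass

# Antonymic pairs from Appendix 1: (minimum standard, maximum standard,
# evaluativity of the pair in min-max order, sentence frame).
PAIRS = [
    ("open", "closed", "pos-neg", "Peter's door is not very {}."),
    ("familiar", "foreign", "pos-neg", "Sue's approach is not very {} to her colleagues."),
    ("acquainted with", "ignorant of", "pos-neg", "Nick is not very {} foreign languages."),
    ("hairy", "bald", "pos-neg", "Joe's newborn is not very {}."),
    ("visible", "hidden", "pos-neg", "Kylie's bar is not very {}."),
    ("damaged", "intact", "neg-pos", "Emily's reputation is not very {}."),
    ("dirty", "clean", "neg-pos", "Joe's suit is not very {}."),
    ("drunk", "sober", "neg-pos", "Liz is not very {}."),
    ("dangerous", "safe", "neg-pos", "Jake's neighborhood is not very {}."),
    ("sick", "healthy", "neg-pos", "Jim is not very {}."),
]


@dataclass(frozen=True)
class Trial:  # hypothetical container, for illustration only
    sentence: str
    polarity: str        # "E+" or "E-"
    adjective_type: str  # "min" or "max"
    anchor_1: str        # scale point 1: the adjective used in the sentence
    anchor_7: str        # scale point 7: its antonym


def build_trials(pairs=PAIRS):
    """Create one trial per pair member (10 pairs x 2 members = 20 trials)."""
    trials = []
    for min_adj, max_adj, evaluativity, frame in pairs:
        # "pos-neg" means the minimum standard member is E+ and the
        # maximum standard member is E-; "neg-pos" is the reverse.
        min_pol, max_pol = ("E+", "E-") if evaluativity == "pos-neg" else ("E-", "E+")
        for adj, antonym, pol, adj_type in (
            (min_adj, max_adj, min_pol, "min"),
            (max_adj, min_adj, max_pol, "max"),
        ):
            trials.append(Trial(
                sentence=frame.format(adj),
                polarity=pol,
                adjective_type=adj_type,
                # Drop complements such as "with"/"of" for the scale labels.
                anchor_1=adj.split()[0],
                anchor_7=antonym.split()[0],
            ))
    return trials


trials = build_trials()
```

Because the anchors are flipped for each member of a pair, a higher rating always indicates a response closer to the antonym of the negated adjective, whichever member appears in the sentence.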
In addition to the critical items, participants were presented with 6 filler statements not involving negation such as John is gorgeous (where the response scale was anchored at the adjectives gorgeous and ugly). The filler sentences also served as attention checks. Experimental trials and filler trials were randomized for each participant using an in-built randomization function.
The experiment was programmed in HTML and run via MTurk’s in-built environment. The experimental procedures and predictions were preregistered with the AsPredicted.org template on the Open Science Framework (https://osf.io/t7ar6).
3.2.3. Procedure
Participants read instructions explaining the task with an example. The running example used an adjective not included in the stimulus set (Imagine you are being told: “John is not very tall”). For each stimulus, the 1- to 7-point scale was anchored to the adjective used in the statement (1) and its antonym (7). The instructions told participants to judge what the speaker wanted to communicate with the statement.
3.3. Results
We excluded four participants based on inconsistent responses in the filler trials (more than 50% responses not in line with the adjective used in the filler statements). Figure 4 shows the mean responses by Polarity and Adjective type. Figure C.1 in Appendix 3 shows the results for each individual adjective.

Proportion of response choices with 95% confidence intervals, broken down by Polarity and Adjective type.
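The participant exclusion rule described above can be sketched as follows. The text does not spell out what counts as a response "not in line with the adjective used", so the criterion below (a rating above the scale midpoint, that is, on the antonym's side, with the stated adjective at point 1) is an assumption, as are the function names.

```python
def is_inconsistent(response, midpoint=4):
    """Assumed criterion: a filler rating is inconsistent with the stated
    adjective (scale point 1) if it falls on the antonym's side (> midpoint)."""
    return response > midpoint


def exclude_participant(filler_responses, threshold=0.5):
    """Exclude a participant when more than `threshold` (here 50%) of their
    filler responses are inconsistent with the filler statements."""
    n_inconsistent = sum(is_inconsistent(r) for r in filler_responses)
    return n_inconsistent / len(filler_responses) > threshold


# Hypothetical ratings on the six filler trials, for illustration only.
attentive = [1, 2, 1, 6, 2, 1]    # 1 of 6 inconsistent: kept
inattentive = [7, 6, 5, 6, 1, 2]  # 4 of 6 inconsistent: excluded
```

Under this reading, a participant with exactly half of the fillers inconsistent is retained, since the rule requires more than 50%.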
The results were analyzed with cumulative link mixed-effects models using the function clmm() in the R package ordinal. We first ran a model with the sum-coded factors Polarity and Adjective type and their interaction as fixed effects, as well as random slopes for items and participants (maximal random effects structure, Barr et al., 2013). The results of the model showed a main effect of Polarity (B = 0.33, SE = 0.065, z = 5.0, P < .001) and a main effect of Adjective type (B = 0.32, SE = 0.064, z = 4.97, P < .001). There was no interaction between Polarity and Adjective type (P = .87). A summary of the model is presented in Table B.1 in Appendix 2 and all data are made available on OSF (https://osf.io/9vctf/).
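For readers less familiar with cumulative link models, the general form of a model of this kind can be written schematically as below. The notation is introduced here for illustration and is not taken from the model summary: \theta_j are the six thresholds separating the seven response categories, x_Pol and x_Type are the sum-coded predictors, and u_p and w_i stand in for the participant- and item-level random terms. The exact random-effects specification is an assumption.

```latex
\operatorname{logit} P(Y \le j)
  \;=\; \theta_j \;-\; \left( \beta_{1}\, x_{\mathrm{Pol}}
        + \beta_{2}\, x_{\mathrm{Type}}
        + \beta_{3}\, x_{\mathrm{Pol}}\, x_{\mathrm{Type}}
        + u_{p} + w_{i} \right),
  \qquad j = 1, \dots, 6
```

With this parametrization, a positive coefficient shifts probability mass toward higher ratings, that is, toward the antonym end of the response scale.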
The main effect of Polarity again confirms the existence of a polarity asymmetry. This extends the polarity asymmetry found in previous studies (e.g. Ruytenbeek et al., 2017) to absolute adjectives in the not very construction, which are arguably coerced into relative interpretations. Further, the main effect of Adjective type indicates that maximum standard adjectives (not very closed) are more likely to be negatively strengthened than minimum standard ones (not very open). Since the two factors did not interact, we have no evidence that the effect of evaluativity differs across adjective types in the not very construction. Overall, we conjecture that both evaluativity and scale structure are crucial to negative strengthening and that the effect of evaluativity is comparable across adjective types.
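One way to read two main effects without an interaction is that the effects add up on the latent (logit) scale of the model. The sketch below illustrates this using the two reported estimates. Everything else is an assumption made for illustration: the ±1 sum coding with E+ and maximum standard coded as +1, the evenly spaced thresholds (the fitted thresholds are not reported in the text), the function names, and the omission of random effects.

```python
import math

B_POLARITY = 0.33  # reported estimate for Polarity
B_ADJ_TYPE = 0.32  # reported estimate for Adjective type

# Hypothetical thresholds: six cut-points for a 7-point scale.
# These are NOT the fitted values; they only make the example concrete.
THRESHOLDS = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear_predictor(polarity, adj_type):
    """Additive latent-scale effect under assumed +1/-1 sum coding."""
    pol = 1 if polarity == "E+" else -1
    typ = 1 if adj_type == "max" else -1
    return B_POLARITY * pol + B_ADJ_TYPE * typ


def category_probs(eta, thresholds=THRESHOLDS):
    """Cumulative logit model: P(Y <= j) = logistic(theta_j - eta)."""
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    return [cumulative[0]] + [
        cumulative[j] - cumulative[j - 1] for j in range(1, len(cumulative))
    ]


def expected_rating(polarity, adj_type):
    """Mean rating on the 1-7 scale implied by the sketch."""
    probs = category_probs(linear_predictor(polarity, adj_type))
    return sum((k + 1) * p for k, p in enumerate(probs))
```

Under these assumptions, the latent difference between E+ and E− adjectives is the same for minimum and maximum standard adjectives, which is what the absence of an interaction amounts to in this model.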
4. General discussion and conclusions
We provided experimental evidence that the polarity asymmetry in negative strengthening is driven by evaluative polarity. Our results are therefore compatible with the existence of a positivity bias in language. According to the Pollyanna principle, there is a “universal human tendency to use positive words more frequently” (Boucher & Osgood, 1969). The present study indicates that this positivity bias can extend to complex utterances. If speakers have an overall tendency to use negated positive adjectives in order to avoid expressing a negative evaluation directly, interpreters should indeed be more likely to strengthen the negation of an evaluatively positive antonym than that of an evaluatively negative one.
We have further extended previous findings on the relevance of the absolute/relative distinction to the derivation of pragmatic inferences under negation (Gotzner et al., 2018a, 2018b; Leffel et al., 2019; Alexandropoulou & Gotzner, 2022; Gotzner & Kiziltan, 2022). On a strict semantic view, absolute adjectives should not be compatible with strengthened meanings (but see also Paradis & Willners, 2006; Alexandropoulou & Gotzner, 2022). However, one may argue that the modifier very either elicits a relative-like interpretation of absolute adjectives or introduces a greater granularity level/higher standard of precision (see Alexandropoulou & Gotzner, 2022 for a similar argument).8 The results of the present study contribute to this research by showing that absolute adjectives can indeed be subject to negative strengthening in the not very construction.
Concerning adjective type, the higher rates of negative strengthening for maximum standard adjectives may be due to the fact that the indirect scalar implicature for maximum standard adjectives (not very clean ⇝ “clean”) is not derivable. Based on Leffel et al.’s (2019) account, we may argue that there are fewer degrees in [clean; very clean] than there are in [dirty; very dirty] (since dirty is minimum standard, any small amount of dirt counts as “dirty”). Therefore, the indirect scalar implicature of maximum standard adjectives (not very clean to convey “clean”) should be more difficult to satisfy than that of minimum standard ones (not very dirty to convey “dirty”), and maximum standard adjectives will thus favor negative strengthening. This finding provides additional evidence in favor of the borderline constraint on vague implicatures proposed by Leffel et al. (2019).
The current study highlighted that both evaluative polarity and adjective type (minimum vs. maximum standard) modulate negative strengthening, thus suggesting that multiple aspects of previous accounts of negative strengthening should be integrated. Let us conclude by discussing the relevance of the current work to the role of valence in language and the communicative function of vagueness. As noted by many scholars, the use of vague expressions runs counter to efficient language use (Lipman, 2009; van Deemter, 2009). Many answers to the question of why we use vague expressions in the first place have been proposed (see for example Solt, 2015b for an overview). Most relevant to the current investigation is the view that vague language may help to save the speaker’s and hearer’s face (Keenan, 1976; Krifka, 2002). As Krifka (2002) suggests, a speaker who communicates vaguely can hardly be proven wrong. Opting for the negated expression, however, comes at the cost of being less informative (which flouts Grice’s Quantity Maxim, see Grice, 1975 and Horn, 1984, 1989 for the interplay of Quantity with other Maxims). In some contexts, speakers have a strong motivation not to be proven wrong, which is often based on social motives (see Keenan, 1976 for seminal work). Here, we have looked at adjectives that are not vague per se but only allow for a loophole in the specific construction (not very) we tested. We found that evaluative polarity, and thus a socioemotive aspect of language, played a crucial role in the interpretation of such expressions, in addition to scale structure.
Overall, our work demonstrates the importance of evaluativity in language interpretation, in conjunction with recent investigations in other domains (e.g. Beltrama et al., 2023; Nouwen, 2021). The view that connotation is fundamental to language, advocated by Osgood et al. (1957), has had considerable impact in computer science and psychology. However, the linguistic tradition has mostly tried to strip away connotation from the study of meaning. We argue that looking at evaluativity provides exciting avenues for future research in semantics and pragmatics, and that valence corpora are a useful tool to quantify the role of connotation in language (see also Nouwen, 2021 and van Tiel & Pankratz, 2021 for recent applications of this kind).
CONFLICT OF INTEREST STATEMENT
None declared.
FUNDING
This work was supported by the DFG (Emmy Noether grant awarded to NG, Nr. GO 3378/1–1).
Footnotes
1. For an alternative characterization of markedness based on presuppositional behaviors, see Rett (2015).
2. Giora et al. (2005), in turn, did not find an asymmetric interpretation pattern for positive and negative terms.
3. As for markedness, Ruytenbeek et al. (2017) showed that the polarity asymmetry is typically stronger for morphological pairs (containing negative morphemes such as “happy” and “unhappy”) than for non-morphological pairs (involving lexical antonyms like “happy” and “sad”).
4. See for example Solt & Gotzner (2012) for experimental evidence of how participants shift their interpretation depending on the comparison class for relative but not absolute adjectives. Further evidence for a distinction between relative and absolute adjectives can be found in Syrett, Kennedy, and Lidz (2010), among others.
5. We thank an anonymous reviewer for making us aware of this fact.
6. This study established that dimensional polarity did not drive the polarity asymmetry in negative strengthening. Yet our dimensionally positive adjectives were all maximum standard and the dimensionally negative ones minimum standard adjectives. There is no sensible way in which dimensional polarity could be separated from scale structure/adjective type, which is why we did not investigate this further. Note that the preregistration for the current study did not include any hypothesis about dimensional polarity.
7. An anonymous reviewer raised the issue that the response scale could be interpreted in different ways (e.g. as a measurement scale or a metalinguistic similarity scale). We assume that participants interpret the scale as a metalinguistic negative strengthening scale, as indicated by the prompt. That is, we take participants to reason about the degree to which the speaker intended to convey the antonym. Note that the adjectives anchoring the scale end points are flipped depending on the adjective presented in the test sentence.
8. Recent frameworks of adjective meaning take different priors to be at the basis of the relative/absolute distinction (e.g. Lassiter & Goodman, 2013) rather than modeling this distinction categorically (but see Xiang et al., 2022 for evidence that priors alone cannot capture this distinction).
References
Appendix 1
List of antonymic pairs and test sentences.
Min | Max | Evaluativity | Test sentence |
---|---|---|---|
Open | Closed | Pos-neg | Peter's door is not very open/closed. |
Familiar | Foreign | Pos-neg | Sue's approach is not very familiar/foreign to her colleagues. |
Acquainted | Ignorant | Pos-neg | Nick is not very acquainted with/ignorant of foreign languages. |
Hairy | Bald | Pos-neg | Joe's newborn is not very hairy/bald. |
Visible | Hidden | Pos-neg | Kylie's bar is not very visible/hidden. |
Damaged | Intact | Neg-pos | Emily's reputation is not very damaged/intact. |
Dirty | Clean | Neg-pos | Joe's suit is not very dirty/clean. |
Drunk | Sober | Neg-pos | Liz is not very drunk/sober. |
Dangerous | Safe | Neg-pos | Jake's neighborhood is not very dangerous/safe. |
Sick | Healthy | Neg-pos | Jim is not very sick/healthy. |
Appendix 2
Summary of cumulative link mixed-effects models including the sum-coded fixed effect Polarity (Experiment 1). Model formula: clmm(negative strengthening ~ type * polarity + (1+ type * polarity|item) + (type * polarity|participant))
Random effects | Term | Variance | SD | Correlations |
---|---|---|---|---|
Participants | (Intercept) | 6.29168 | 2.5083 | |
Participants | Type | 0.06537 | 0.2557 | −0.963 |
Participants | Polarity | 0.07089 | 0.2662 | 0.050 0.220 |
Participants | Type:polarity | 0.78529 | 0.8862 | −0.394 0.627 0.898 |
Items | (Intercept) | 0.05960 | 0.2441 | |
Items | Type | 0.07277 | 0.2698 | 0.379 |
Items | Polarity | 0.05840 | 0.2417 | 0.309 0.997 |
Items | Type:polarity | 0.23538 | 0.4852 | 0.153 −0.856 −0.892 |
Coefficients
Fixed effect | Estimate | SE | z-value | P-value |
---|---|---|---|---|
Type | 0.31774 | 0.06384 | 4.929 | .001 |
Polarity | 0.327837 | 0.06526 | 4.926 | .001 |
Type:polarity | −0.02022 | 0.13373 | −0.13373 | .88 |
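To make the coefficient estimates easier to read, the following is a minimal Python sketch of how a cumulative link (proportional odds) model maps a linear predictor η onto response-category probabilities via logit(P(Y ≤ j)) = θ_j − η. The fixed-effect estimates are copied from the coefficient table; the thresholds θ_j, the number of response categories, and the ±1 coding of the two factors are illustrative assumptions that are not reported in this appendix, and all random effects are set to zero.

```python
import math

# Fixed-effect estimates copied from the coefficient table in Appendix 2.
BETA_TYPE = 0.31774
BETA_POLARITY = 0.327837
BETA_INTERACTION = -0.02022

# HYPOTHETICAL thresholds (cut points): the appendix does not report them,
# so these values and the resulting five response categories are assumptions.
THRESHOLDS = [-1.5, -0.5, 0.5, 1.5]


def logistic(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def linear_predictor(type_code, polarity_code):
    """Linear predictor for sum-coded factors.

    The codes are assumed to be +1 / -1; which level of each factor
    receives +1 is an assumption, not something stated in the appendix.
    """
    return (BETA_TYPE * type_code
            + BETA_POLARITY * polarity_code
            + BETA_INTERACTION * type_code * polarity_code)


def category_probabilities(eta, thresholds=THRESHOLDS):
    """Probability of each ordered response category given eta.

    Uses P(Y <= j) = logistic(theta_j - eta) and takes successive
    differences of the cumulative probabilities.
    """
    cumulative = [logistic(theta - eta) for theta in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for value in cumulative:
        probs.append(value - previous)
        previous = value
    return probs


if __name__ == "__main__":
    for type_code in (1, -1):
        for polarity_code in (1, -1):
            eta = linear_predictor(type_code, polarity_code)
            probs = category_probabilities(eta)
            print(type_code, polarity_code, [round(p, 3) for p in probs])
```

Under this parameterization, a larger linear predictor shifts probability mass toward the higher response categories, which is how positive estimates for Type and Polarity translate into stronger negative strengthening ratings for the levels coded +1.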
Appendix 3: Results broken down by item

Proportion of response choices with 95% confidence intervals, broken down by Polarity and Adjective type, per item.