Abstract

Agrammatism is a disorder of language production characterized by short, simplified sentences, the omission of function words, an increased use of nouns over verbs and a greater use of heavy verbs. Although these phenomena have been observed for decades, accounts of agrammatism have not converged. Here, we propose and test the hypothesis that the lexical profile of agrammatism results from a process that opts for words with a lower frequency of occurrence to increase lexical information. Furthermore, we hypothesize that this process is a compensatory response to patients’ core deficit in producing long, complex sentences. In this cross-sectional study, we analysed speech samples of patients with primary progressive aphasia (n = 100) and healthy speakers (n = 65) as they described a picture. The patient cohort included 34 individuals with the non-fluent variant, 41 with the logopenic variant and 25 with the semantic variant of primary progressive aphasia. We first analysed a large corpus of spoken language and found that the word types preferred by patients with agrammatism tend to have lower frequencies of occurrence than less preferred words. We then conducted a computational simulation to examine the impact of word frequency on lexical information as measured by entropy. We found that strings of words that exclude highly frequent words have a more uniform word distribution, thereby increasing lexical entropy. To test whether the lexical profile of agrammatism results from patients’ inability to produce long sentences, we asked healthy speakers to produce short sentences during the picture description task. We found that, under this constrained condition, a lexical profile similar to that of agrammatism emerged in the short sentences of healthy individuals, including fewer function words, more nouns than verbs and more heavy verbs than light verbs. This lexical profile of short sentences resulted in a lower average word frequency than that of unconstrained sentences. We extended this finding by showing that, in general, shorter sentences are packaged with lower-frequency words as a basic property of efficient language production, evident in the language of healthy speakers and all primary progressive aphasia variants.


Introduction

Impaired functioning of the left hemisphere's inferior frontal areas is associated with a distinct style of language production known as agrammatism. Common symptoms of agrammatism include short, simplified sentences, the omission of function words, a decreased use of verbs relative to nouns and an increased use of heavy verbs relative to light verbs.1–9 Although some of these symptoms have been observed for over two centuries, accounts aiming to explain the underlying mechanism of agrammatism have not converged.

One of the most consistently reported features of agrammatism is the omission of function words, such as pronouns, auxiliary verbs and determiners.4,5,10–12 In contrast to content words, which carry the main message of a sentence, function words primarily play a grammatical role, with storage and access processes that are likely distinct from those for content words.13–15 Because agrammatism is thought to disrupt the syntactic structure of a sentence, access to function words has been hypothesized to be more affected than access to content words.16–18 Another key symptom of agrammatism is the increased use of nouns over verbs.19–26 Since both nouns and verbs are content words, the dual system for retrieving function/content words cannot explain the noun/verb dissociation. Instead, this verb deficit has been attributed to the greater syntactic complexity of verbs in processing the relationships among sentence elements.6,21,27–29 Lastly, a third and more recently described case of lexical dissociation in agrammatism is an increased use of heavy verbs relative to light verbs compared to healthy language production.30,31 Light verbs such as go, do, get and take are semantically more general and associated with less specific objects.32–34 It has been suggested that heavy verbs are more resistant to disruption as they are semantically richer and more specific to a particular context than light verbs.31 The heavy/light verb distinction thus represents yet another case of lexical dissociation, with a proposed underlying mechanism distinct from those for the content/function word and noun/verb dissociations.35 In the absence of a converging account, agrammatism is mainly conceptualized as a multi-component disorder in which each component results from a distinct mechanism. The co-occurrence of various symptoms is thus ascribed to the proximity of their underlying neural circuitries.6

In this article, we introduce an alternative conceptual framework based on information theory that explains the lexical profile of agrammatism not as a conglomerate of independent disorders but as a monolithic compensatory response. According to this framework, the agrammatic lexical profile arises from a process that opts for lower-frequency words in language production to increase sentence information. This strategic choice of words is a response to the agrammatic patients’ core deficit in producing long, complex sentences. We evaluate this hypothesis by analysing the language of patients with primary progressive aphasia (PPA) and healthy speakers as they participate in a picture description task. PPA has three subtypes, with the non-fluent variant (nfvPPA) characterized by agrammatic/effortful speech and two other variants, logopenic (lvPPA) and semantic (svPPA), by lexicosemantic deficits.36

First, we test the hypothesis that the three cases of lexical dissociation in agrammatism are the products of choosing low-frequency words. We analyse a large corpus of spoken language consisting of over 115 million words to show that the word types preferred by nfvPPA patients—content over function words, nouns over verbs and heavy over light verbs—indeed occur less frequently than the alternative type. This work will complement previous literature showing function words are more common than content words by extending the analyses to other word types in the spoken modality.13,37

Next, we show how choosing low-frequency words increases the lexical information of sentences based on concepts from Shannon's Mathematical Theory of Communication.38 The information content of a phenomenon relates to its predictability. Highly predictable phenomena yield little information, and more surprising ones offer more information. Using low-frequency words increases the information content of a sentence in two ways. First, low-frequency words are generally less predictable than high-frequency words in a sentence. For example, the lower-frequency word Oxalis is also less predictable than the word plant from the context, ‘In my garden, I grow this type of ….’. The relationship between the frequency of a word and its predictability in a context is statistically significant39 but not absolute because low-frequency words can become more predictable in certain contexts.40

The second way in which using low-frequency words enhances the information content of a sentence is by increasing its lexical entropy. Entropy is another way to measure predictability, with phenomena of high entropy being less predictable, hence more informative. The maximum lexical entropy of a sentence occurs when its words have an equal probability of appearance or uniform distribution, for example, when each word appears only once. Conversely, a skewed word distribution decreases lexical entropy, for example, when a word appears more than others. Now the question is how selecting low-frequency words from the lexicon of all possible words would result in higher lexical entropy of sentences. The answer to this question lies in the properties of the word distribution of the lexicon. As famously shown through Zipf's law, the lexicon of all words of a language contains few words with a very high frequency of occurrence and the rest with a much lower frequency.41,42 This pattern results in a highly skewed distribution for the high-frequency words and a long uniform tail for the low-frequency words. As a result, we expect selecting words from the more uniformly distributed tail of the lexicon for making sentences to result in higher lexical entropy than choosing from the skewed part of the distribution. We will run a computational simulation to show the statistical relationship between the use of low-frequency words and higher lexical entropy at the sentence level.
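As a toy illustration (the strings are invented, not drawn from the study data), compare two six-word strings. In the string ‘the dog and the cat and’, the words the and and each appear twice, so the word probabilities are 2/6, 2/6, 1/6 and 1/6, and the normalized lexical entropy is

$$
H = \frac{-\left(2 \cdot \tfrac{1}{3}\log_2\tfrac{1}{3} + 2 \cdot \tfrac{1}{6}\log_2\tfrac{1}{6}\right)}{\log_2 6} \approx \frac{1.92}{2.58} \approx 0.74,
$$

whereas a six-word string in which every word occurs exactly once reaches the maximum value of 1.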

Lastly, we test the hypothesis that the lexical profile of agrammatism arises as a response to the central deficit of patients in producing long, complex sentences. We ask healthy individuals to describe the same picture using short sentences of only one to two words. Our expectation is that the constraint on sentence length will induce a similar lexical profile of agrammatism. We then test the general hypothesis that shorter sentences contain lower-frequency words as a basic property of language production in healthy speakers and all three variants of PPA.

Materials and methods

Participants

For this cross-sectional study, we recruited 100 patients with PPA from an ongoing longitudinal study conducted in the Frontotemporal Disorders Unit of Massachusetts General Hospital (MGH). Comprehensive clinical and language assessments were used to characterize and subtype patients into nfvPPA (n = 34), svPPA (n = 25) and lvPPA (n = 41), as previously described.43 All patients were native English speakers. The number of patients in this cohort reflects the data available when we started this study. We included ratings on the Progressive Aphasia Severity Scale (PASS), which uses the clinician's best judgement, integrating information from the patient's examination and a companion's description of routine daily functioning.44 The PASS includes boxes for fluency, syntax, word retrieval and expression, repetition, auditory comprehension, single-word comprehension, reading, writing and functional communication. The PASS Sum-of-boxes is the sum of the box scores. The study also includes 34 age-matched healthy controls enrolled through the Speech and Feeding Disorders Laboratory at the MGH Institute of Health Professions. These participants passed a cognitive screen, were native English speakers and had no history of neurologic or developmental speech or language disorders. For the constrained language task, we recruited a separate cohort of 31 individuals from Amazon's Mechanical Turk. These participants filled out a short survey about their neurological and language backgrounds. Only language samples from participants who were native English speakers with no self-reported history of brain or speech-language disorder, either developmental or acquired, were included in the analyses. These participants had an average age of 47.6 years and an average of 16.1 years of education. In this cohort, 21 participants were female, and 27 were right-handed. The clinical and demographic information on the participants is shown in Table 1. All participants provided informed consent following guidelines established by the Mass General Brigham Healthcare System Institutional Review Boards, which govern human subjects research at MGH. The Brain Resilience in Aging: Integrated Neuroscience Studies (BRAINS) at MGH approved data collection for the Amazon's Mechanical Turk individuals.

Table 1

Clinical and demographic characteristics of participants performing the unconstrained task^a

| | nfvPPA | lvPPA | svPPA | Controls | Statistics (P-value) |
|---|---|---|---|---|---|
| Sample size | 34 | 41 | 25 | 34 | |
| Mean age (SD) | 65.87 (9.05) | 65.45 (6.92) | 60.93 (7.84) | 64.84 (8.44) | F = 1.384 (0.252) |
| Handedness, right | 87% | 80.6% | 88.9% | 73.3% | χ2 = 7.080 (0.313) |
| Mean years of education (SD) | 17.31 (7.43) | 16.63 (2.22) | 16.56 (1.71) | 15.87 (1.54) | F = 0.622 (0.592) |
| Female:male | 21:13 | 18:23 | 15:10 | 19:15 | χ2 = 2.911 (0.406) |
| PASS sum-of-boxes | 5.70 (3.56) | 5.77 (2.63) | 5.00 (2.21) | | F = 0.494 (0.612) |
| Mean MoCA score (SD) | 23.38 (4.34) | 20.52 (4.57) | 19.60 (6.91) | | F = 3.101 (0.052) |

MoCA, Montreal Cognitive Assessment; PASS, Progressive Aphasia Severity Scale; SD, standard deviation.

^a We report F-value for ANOVA and χ2 for chi-square tests.


Language samples

For the unconstrained condition, the participants were asked to look at a drawing of a family at a picnic from the Western Aphasia Battery–Revised45 and describe it using as many complete sentences as they could. Responses were audio-recorded in a quiet room and later transcribed by a researcher blind to the grouping. For the constrained condition, we asked the Amazon's Mechanical Turk participants to describe the same picture using one- or two-word sentences. A sentence was defined as an independent clause and all clauses dependent on it.46 As language data were sparse for sentences containing more than 20 words (∼1% of all utterances), the scatter plots show the 99% of sentences with a length of ≤20 words.

Text analysis of language samples

We used Quantitext, a text analysis toolbox developed in the MGH Frontotemporal Disorders Unit, to automatically generate a set of quantitative language metrics, with the aim of increasing the precision and objectivity of language assessments while reducing human labour, in line with the goals outlined previously.47 The toolbox uses several natural language processing tools, such as Stanza48 and text analysis libraries in R. Quantitext receives transcribed language samples as input and generates several metrics, such as sentence length, log word frequency, log syntax frequency,49 content units,50 the efficiency of lexical and syntactic items,51 part-of-speech tags and the distinction between heavy and light verbs. Nouns, verbs (except for auxiliary verbs), adjectives and adverbs are considered content words, and all other words are considered function words. The toolbox classifies the verbs go, have, do, come, give, get, make, take, be, bring, put and move as light verbs while excluding auxiliaries from this list.31 All other verbs are classified as heavy verbs.
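As an illustration of this classification step, the sketch below (in R, not the Quantitext code itself) computes the three lexical ratios from a vector of lemmas and universal part-of-speech tags such as those produced by Stanza; the function name lexical_profile and the decision to count proper nouns as nouns are our own assumptions.

```r
# Sketch (not the Quantitext implementation): classify content vs. function words
# and heavy vs. light verbs from lemmas and universal POS tags.
light_verbs  <- c("go", "have", "do", "come", "give", "get", "make",
                  "take", "be", "bring", "put", "move")
content_tags <- c("NOUN", "PROPN", "VERB", "ADJ", "ADV")   # AUX counts as a function word

lexical_profile <- function(lemma, upos) {
  is_content <- upos %in% content_tags
  is_noun    <- upos %in% c("NOUN", "PROPN")   # assumption: proper nouns count as nouns
  is_verb    <- upos == "VERB"                 # auxiliaries are tagged AUX, so they are excluded
  is_light   <- is_verb & lemma %in% light_verbs
  c(content_ratio = mean(is_content),                        # content words / all words
    noun_ratio    = sum(is_noun) / sum(is_noun | is_verb),   # nouns / (nouns + verbs)
    heavy_ratio   = sum(is_verb & !is_light) / sum(is_verb)) # heavy verbs / all verbs
}

# Example: "The man is fishing on the pier"
lexical_profile(c("the", "man", "be", "fish", "on", "the", "pier"),
                c("DET", "NOUN", "AUX", "VERB", "ADP", "DET", "NOUN"))
```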

Corpus analysis and measuring word frequency

For corpus analysis and the measurement of word frequency, we used the Corpus of Contemporary American English (COCA).52 The corpus comprises 1 billion words of contemporary American English across eight genres, including TV/movies, spoken, fiction, magazine, newspaper and academic. To best represent our study cohort, we used the spoken genre of COCA, which consists of transcripts of unscripted conversations from about 150 different TV and radio programmes with 115 937 138 lemmatized and 121 465 711 token words.
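For concreteness, one way to compute a sentence's average log word frequency against such a corpus is sketched below; coca_counts is a hypothetical named vector of spoken-genre lemma counts, and the use of the natural logarithm of raw counts is our assumption rather than a documented choice.

```r
# Sketch: average log frequency of the lemmas in a sentence, given a (hypothetical)
# named vector of COCA spoken-genre lemma counts, e.g. coca_counts["the"] = 7.5e6.
avg_log_freq <- function(lemmas, coca_counts) {
  counts <- coca_counts[lemmas]      # look up each lemma's corpus count
  mean(log(counts), na.rm = TRUE)    # natural log of raw counts (assumed scale)
}
```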

Measuring the lexical entropy normalized by word count

Given a string of words with probabilities p(x1) for the first word type to p(xn) for the nth word type, the lexical entropy of the string normalized by word count is calculated by the following formula.
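A standard form of this measure, assuming normalization by the maximum entropy of an N-word string, where N is the total number of words and p(x_i) the relative frequency of the ith word type, is

$$
H_{\mathrm{norm}} = \frac{-\sum_{i=1}^{n} p(x_i)\,\log_2 p(x_i)}{\log_2 N}.
$$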

Normalized entropy takes values from 0 to 1.

Statistical analysis

For the statistical analyses of this study, we used the R software version 4.1.2. We used independent t-tests to compare the log frequency of the lemmatized words of COCA. We used generalized additive models (GAM) to estimate the smooth but potentially non-linear relationship between sentence length and word frequency. GAM is a generalized linear model in which the mean of the outcome is a sum of unknown smooth univariate functions of continuous predictors. Spline functions are popular as bases in GAM because they can approximate smooth functions when the number of internal knots is large enough.53 Splines are piece-wise polynomial functions, and the places where two neighbouring pieces of the polynomial meet are known as the internal knots. A spline function becomes more flexible (i.e. capable of describing a wider range of non-linear functions) as the number of internal knots increases. However, too many internal knots usually lead to overfitting, and thus the number of internal knots should be selected carefully. To avoid overfitting and increase the generalizability of the fitted model, a smoothness penalty on the spline function is usually employed to prevent the model from interpolating. A commonly used class of penalties targets the L2 norm of the derivative of a given order and controls the complexity of the fitted GAM. We used thin plate regression splines54 as the basis functions. The value of effective degrees of freedom (EDF) estimated by the GAM model shows the degree of curvature of the relationship. A value of 1 for EDF is interpreted as a linear relationship, while values larger than 1 denote a more complex relationship between the predictor and outcome variables. We used the gam function in the mgcv package in R to fit the model.55 We included in the model separate spline functions of sentence length for each group of subjects (e.g. PPA variants versus healthy controls) and a subject-specific random slope. The model parameters were estimated via the restricted maximum likelihood method.56 To test whether the relationship between word frequency and sentence length was different in PPA variants compared to healthy controls, we performed a generalized likelihood ratio test for penalized splines.57 To compare the features of agrammatism across different groups, we used mixed-effects models with a subject-specific random intercept via the lme4 package in R.58
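A minimal sketch of this modelling approach is given below; the data frame sentences and its columns log_freq, sent_len, group and subject are hypothetical names, and the exact terms (e.g. random slope versus random intercept) may differ from the fitted models reported here.

```r
# Sketch of the modelling approach, assuming a data frame `sentences` with one row
# per sentence: log_freq (mean log word frequency), sent_len, group (factor) and
# subject (factor). Not the exact specification used in the study.
library(mgcv)
library(lme4)

# GAM with a separate thin plate regression spline of sentence length per group
# and a subject-level random effect, estimated by REML.
gam_fit <- gam(log_freq ~ group + s(sent_len, by = group, bs = "tp") + s(subject, bs = "re"),
               data = sentences, method = "REML")
summary(gam_fit)   # the EDF of each smooth term indexes the curvature of the relationship

# Mixed-effects model comparing groups, with a subject-specific random intercept.
lmm_fit <- lmer(log_freq ~ group + sent_len + (1 | subject), data = sentences)
```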

Results

Analysing a large corpus of spoken data

Here, we analysed a large corpus of spoken language from COCA. Using independent t-tests, we found that in spoken language, content words have a lower log frequency (mean = 3.44, SD = 2.32) than function words (mean = 5.70, SD = 3.56) [t(913.07) = −19.049, P < 0.001], consistent with the previous literature. Furthermore, we found that nouns, on average, have a lower log frequency of occurrence (mean = 3.58, SD = 2.34) than verbs (mean = 4.68, SD = 2.30) [t(8728.8) = −32.774, P < 0.001]. Similarly, heavy verbs showed a lower log frequency (mean = 4.67, SD = 2.27) than light verbs (mean = 12.84, SD = 1.47) [t(11.11) = −19.235, P < 0.001] in the spoken corpus. As such, the word types preferred by patients with nfvPPA have a lower frequency of occurrence than the less preferred words. Figure 1 shows the word distribution of COCA with colour-coded bars. Words of the type preferred by patients with agrammatism, bars in the red spectrum, are mainly located in the low-frequency tail of the distribution, while words of the less preferred type, bars in the grey spectrum, are in the skewed high-frequency part of the distribution.

Figure 1

Words of the type preferred by patients with nfvPPA and the impact of this choice on lexical entropy. The rank-ordered bar graph shows the word distribution of spoken language based on COCA. Bars in the grey spectrum show the words of the less preferred type by nfvPPA patients, and bars in the red spectrum depict words of the preferred type. The blue curve shows how the normalized lexical entropy of six-word strings increases as the sampling occurs from the less skewed sections of the distribution.

A computational simulation to show that making sentences from the low-frequency tail of the lexicon results in higher lexical entropy

Here, we use a computational simulation to show that excluding higher-frequency words from a string of words increases its normalized lexical entropy. In this simulation, we created strings of six words, the average sentence length produced by patients with nfvPPA in our cohort. First, we created a set of 10 000 strings with a length of six words by randomly sampling from the entire COCA, obeying the frequency weights of the corpus. This first set is the baseline, and the remaining sets are test sets. For the second set, we created another 10 000 six-word strings, excluding the most frequent word of the lexicon, ‘the’, from sampling. For the third set, we excluded the two most frequent words from sampling. We followed this procedure to create 100 sets of 10 000 strings of six words and calculated the normalized lexical entropy of each set.
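The following R sketch mirrors the logic of this simulation; coca is a hypothetical data frame of lemmas and their spoken-genre counts, and details such as the random seed are ours rather than the study's.

```r
# Sketch of the simulation: sample 10 000 six-word strings according to corpus
# frequency weights, excluding the k-1 most frequent lemmas in test set k.
# `coca` is a hypothetical data frame with columns lemma and count.
norm_entropy <- function(words) {            # normalized lexical entropy of a string
  p <- table(words) / length(words)
  -sum(p * log2(p)) / log2(length(words))    # assumes strings of length >= 2
}

simulate_set <- function(coca, k, n_strings = 10000, len = 6) {
  ordered <- coca[order(-coca$count), ]
  pool <- if (k > 1) ordered[-seq_len(k - 1), ] else ordered   # k = 1 is the baseline set
  replicate(n_strings, norm_entropy(
    sample(pool$lemma, len, replace = TRUE, prob = pool$count)))
}

set.seed(1)
baseline <- simulate_set(coca, 1)   # sampling from the full lexicon
no_the   <- simulate_set(coca, 2)   # excluding "the", the most frequent word
t.test(no_the, baseline)            # compare normalized lexical entropy of the two sets
```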

Interestingly, we found that by only excluding ‘the’ from sampling, the normalized lexical entropy of sentences significantly increased (t = 5.2178, P < 0.001). The difference became larger as more high-frequency words were excluded from sampling. The blue curve in Fig. 1 shows the increase in normalized lexical entropy as a function of sampling from the lower-frequency sections of the lexicon. Supplementary material shows the statistical significance of the difference between the normalized lexical entropy of the baseline set and each of the 99 test sets. We replicated these findings for strings with a length varying from 2 to 15, as shown in Supplementary Fig. 1 and Table 1.

Analysing the effect of length on the lexical profile of a sentence

Comparing language production of healthy individuals under the unconstrained and sentence length-constrained conditions

Here, we compare the lexical profile of sentences produced by healthy individuals under the constraint of one- to two-word sentences with that of the unconstrained condition. To predict each case of lexical dissociation, we fitted a mixed-effects model with a random effect for subjects and the condition of language production as a predictor. We found that the constrained sentences of healthy individuals contained a higher proportion of content words to all words (mean = 0.91, SD = 0.26) than the unconstrained condition (mean = 0.45, SD = 0.12) (β = 0.460, SE = 0.014, t = 33.34, P < 0.001). The proportion of nouns to nouns plus verbs was higher in the constrained (mean = 0.83, SD = 0.32) than the unconstrained condition (mean = 0.61, SD = 0.24) (β = 0.218, SE = 0.030, t = 7.24, P < 0.001). Lastly, constrained sentences had a higher proportion of heavy verbs to all verbs (mean = 0.94, SD = 0.24) than the unconstrained condition (mean = 0.63, SD = 0.42) (β = 0.308, SE = 0.046, t = 6.702, P < 0.001). Interestingly, constrained sentences also had more verbs in -ing form (mean = 0.80, SD = 0.40) than unconstrained sentences (mean = 0.44, SD = 0.43) (β = 0.308, SE = 0.068, t = 4.506, P < 0.001).

Furthermore, in a mixed-effects model with a random effect for subjects and with the condition of language production and sentence length as predictors, the log frequency of all words was lower in the constrained (mean = 8.03, SD = 2.10) than the unconstrained condition (mean = 11.86, SD = 0.89) (β = −3.508, SE = 0.171, t = −20.541, P < 0.001). The log frequency of the content words of a given sentence was also lower in the constrained (mean = 8.13, SD = 1.94) than the unconstrained condition (mean = 9.36, SD = 1.38) (β = −0.960, SE = 0.179, t = −5.36, P < 0.001). Figure 2 shows the lexical profile of the sentences of healthy speakers under the two conditions.

Figure 2

Box plots of the lexical profile of sentences under unconstrained and constrained conditions (n = 65). Boxes show the 25th, 50th and 75th percentile, and the whiskers represent the minimum and maximum values, excluding outliers.

Evaluating sentence length–word frequency relationship in healthy individuals

We tested the general hypothesis that shorter sentences contain lower-frequency words in the language production of healthy speakers. We fitted a GAM to the sentences of healthy speakers in the unconstrained condition with a subject-specific random intercept to model the relationship between sentence length and word frequency. We found that sentence length could predict the average log frequency of all words within that sentence (EDF = 6.97, P < 0.001). As can be seen in Fig. 3A, the sentence length–word frequency relationship is approximately linear until the curve starts to plateau. To determine the sentence length at which the curve plateaus, we created a random data set in which sentence length varied from 1 to 20 in 0.1 increments. We then used the GAM of the sentence length–word frequency relationship fitted to the unconstrained language of healthy speakers to predict the word frequency of this data set. We found that the maximum predicted word frequency occurs at a sentence length of 11.4 (Fig. 3B).
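One way to carry out this step is sketched below, reusing the hypothetical objects from the sketch in the Statistical analysis section; the exclude argument drops the subject-level term so the prediction reflects the population-level curve.

```r
# Sketch: fit the healthy-control GAM and locate where the fitted curve peaks
# (hypothetical objects as above; controls is the healthy-speaker subset of `sentences`).
library(mgcv)
controls  <- subset(sentences, group == "Control")
gam_hc    <- gam(log_freq ~ s(sent_len, bs = "tp") + s(subject, bs = "re"),
                 data = controls, method = "REML")
grid      <- data.frame(sent_len = seq(1, 20, by = 0.1),
                        subject  = controls$subject[1])        # placeholder, excluded below
grid$pred <- predict(gam_hc, newdata = grid, exclude = "s(subject)")
grid$sent_len[which.max(grid$pred)]   # sentence length at which predicted frequency peaks
```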

Figure 3

Sentence length–word frequency relationship in the healthy control group (n = 34). (A) The scatter plot shows the relationship between the average log frequency of words within a given sentence and the sentence length in the unconstrained language of healthy individuals. (B) The GAM partial effect plot of the smooth term for word frequency is used to determine where the curve begins to plateau (dashed line). The shaded areas indicate the pointwise 95% confidence intervals of the fitted curves.

Since longer sentences tend to contain more function words, we re-examined the sentence length–word frequency relationship after adding the proportion of function words to all words to the statistical model. We used a multivariable GAM with a subject-specific random intercept to predict the average frequency of all words of a sentence from its length and the proportion of function words to all words. Both sentence length (EDF = 5.12, P < 0.001) and the proportion of function words to all words (EDF = 2.90, P < 0.001) remained significant predictors of the average log frequency of all words. Furthermore, we evaluated the relationship between sentence length and the average log frequency of the content words within a sentence in a GAM similar to that for the average log frequency of all words. We found that sentence length could predict the average log frequency of content words within a sentence (EDF = 4.05, P = 0.002).

Lexical information across the four groups

First, we compare the average log frequency of words and the normalized lexical entropy of sentences among PPA variants and healthy controls. We fitted a mixed-effects model with random effects for subjects to predict the average log frequency of all words with group and sentence length as predictors. We found that patients with nfvPPA produce sentences with a lower average log frequency of all words (mean = 10.88, SD = 1.92) than that of healthy controls (mean = 11.86, SD = 0.89) (β = −0.659, SE = 0.164, t = −4.015, P < 0.001), lvPPA (mean = 11.47, SD = 1.53) (β = −0.435, SE = 0.152, t = −2.845, P = 0.005) and svPPA (mean = 11.60, SD = 1.40) (β = −0.540, SE = 0.172, t = −3.131, P < 0.001). We ran a similar model by adding the proportion of function words to all words as a third predictor and continued to find that patients with nfvPPA produce sentences with a lower log frequency of all words than that of healthy controls (β = −0.572, SE = 0.149, t = −3.842, P < 0.001), lvPPA (β = −0.280, SE = 0.138, t = −2.047, P = 0.042) and svPPA (β = −0.311, SE = 0.157, t = −1.981, P = 0.049). Furthermore, we fitted a mixed-effects model with random effects for subjects to predict the average log frequency of content words with group and sentence length as predictors. We found that patients with nfvPPA produce sentences with a lower log frequency of content words (mean = 8.40, SD = 1.84) than those of healthy controls (mean = 9.36, SD = 1.38) (β = −0.700, SE = 0.180, t = −3.899, P < 0.001), lvPPA (mean = 9.81, SD = 1.99) (β = −0.124, SE = 0.166, t = −7.467, P = 0.040) and svPPA (mean = 10.09, SD = 1.68) (β = −0.147, SE = 0.187, t = −7.820, P < 0.001).

To compare normalized lexical entropy, we fitted a mixed-effects model with random effects for subjects to predict normalized lexical entropy with group as a predictor. We found that patients with nfvPPA produce sentences with higher normalized lexical entropy (mean = 0.995, SD = 0.11) than controls (mean = 0.993, SD = 0.01) (β = 0.003, SE = 0.001, t = 2.251, P = 0.026), lvPPA (mean = 0.993, SD = 0.01) (β = 0.002, SE = 0.001, t = 2.100, P = 0.038) but not different from svPPA (mean = 0.994, SD = 0.01) (β < 0.001, SE = 0.001, t = 0.684, P = 0.495).

Lastly, in a mixed-effects model with random effects for subjects to predict entropy from word frequency, we found a significant negative relationship between the two, showing sentences with lower-frequency words have higher lexical entropy (β = 0.001, SE < 0.001, t = −3.894, P < 0.001).

Comparing sentence length–word frequency relationships among healthy speakers and three PPA variants

Here, we evaluate the relationship between the log frequency of all words in a given sentence and sentence length in the three variants of PPA and compare it with that of healthy controls. We fitted a GAM to sentences produced by all groups with a subject-specific random intercept to model the relationship between sentence length and word frequency. The results showed that the average log frequency of all words could be predicted from sentence length (EDF = 7.54, P < 0.001) (Fig. 4A). To examine whether the word frequency–sentence length relationship interacted with group, we performed a generalized likelihood ratio test for penalized splines. We found no interaction between the average log word frequency–sentence length curve and group (d.f. = −9.247, Deviance = 14.03, P = 0.927), showing that the curves have a similar overall shape across the four groups.

Figure 4

Sentence length and word frequency relationship across four groups (n = 165). (A) The scatterplot shows the relationship between sentence length and the average log frequency of all words within a given sentence in all groups. (B) The scatterplot shows the relationship between sentence length and the average log frequency of content words within a given sentence in all groups combined.

We repeated these analyses for content words and found that the average log frequency of content words could be predicted from sentence length (EDF = 5.75, P < 0.001) (Fig. 4B). Similarly, we found no interaction between the shape of the sentence length–content word frequency curve and group (d.f. = −8.20, Deviance = −29.25, P = 0.265).

Discussion

In this study, we provided a parsimonious account of agrammatism based on information theory. We showed that the particular lexical profile of agrammatism arises from a process that favours low-frequency words in response to patients’ core difficulty in producing long, complex sentences. Previous research has shown that speakers of a language are sensitive to the probability distribution of words and syntactic rules of that language.59–61 This sensitivity to the statistical properties of language allows learners, including infants, to discover words, syntactic structures and sound patterns from the ground up.62 Here, we show that sensitivity to word frequency can further assist patients with agrammatism in optimizing the lexical information of their short sentences. While sentences generated from this cognitive process may appear disjointed, they capture the essence of the intended meaning. Furthermore, as an integral part of our proposal, we provided a computational simulation to show how selecting low-frequency words increases the lexical entropy of sentences. We found that even excluding the single most frequent word of the lexicon, ‘the’, increases the lexical entropy of the resulting word strings. Our simulation further delineated the relationship between word frequency and lexical entropy. Although the average word frequency and lexical entropy of sentences are two related measures of predictability, we found word frequency to be a more specific index of lexical information for clinical purposes as it better differentiated PPA variants.

Our work, initially outlined in Rezaii et al.,63 revives a series of accounts from the past century based on compensation. According to ideas regarding the ‘economy of effort’, the intensive effort required to articulate speech forces non-fluent patients to plan short strings of only essential words that exclude function words.64,65 Despite its plausibility, this idea did not fare well during the pre-information theory era, likely because it lacked a robust way to measure information. Our study operationalizes the measuring of lexical information and extends the compensation idea to other cases of lexical dissociation in agrammatism beyond just the function/content word dissociation. The theoretical foundation of this proposal was further supported by a review paper that regarded the compensatory response of patients with agrammatism as a rational behaviour in the face of their increased cost of language production.66

The conceptual framework of this study shifts away from syntax-centric accounts that consider a deficit in syntax processing to be the cause of the lexical profile of agrammatism. Under such accounts, it remained unclear why patients with agrammatism could have intact online access to the verb lexicon,67,68 access to all possible argument structures of verbs during online sentence processing,67,68 and minimal errors in using function words in sentence completion tasks.69 These results, which indicate near-normal lexical production in agrammatism, are difficult to reconcile with syntax-centric accounts but fit well within the information-theoretic account of the disorder.

Our study further showed that the lexical compensation strategy in agrammatism is not unique to patients with nfvPPA but rather highlights a fundamental property of normal language production. When healthy speakers were constrained to produce one- and two-word sentences, their language exhibited features similar to those of patients with agrammatism, including an increase in the proportion of content words to all words, nouns to verbs and heavy verbs to light verbs. In a constrained production, it is logical to expect function words and light verbs to be dropped due to their semantic emptiness. However, the relationship between the increased use of nouns and information compression may not be readily apparent. Previous research has shown that in writing, particularly in scientific abstracts with strict word limits, the ratio of nouns to verbs tends to increase.70–72 The information compression capacity of nouns stems from several of their inherent properties.73–76 Nominalization involves transferring the information content from a clause to a noun phrase, which can lead to shorter and simpler sentences. This process also allows for stacking groups of meaning within noun phrases, which is a more economical approach compared to stacking clauses, as the latter can result in verbosity and complexity. Moreover, using noun phrases in place of clauses anonymizes the agent of the action and shifts the focus from the actor to the action while saving words.

Also similar to patients with agrammatism,77–80 the constrained language of healthy speakers resulted in more verbs in -ing form. For example, in describing a man who was fishing on a pier, both groups used the sentence ‘man fishing’. Deviations in verb inflection in agrammatism reflect a complex phenomenon involving multiple factors, such as the type of production task and the cognitive load associated with it, the accessibility of certain verb forms and the verb properties specific to a language.81–83 Our findings on the increased use of participles under production constraints suggest that attempts to optimize information transfer could be another determining factor in verb morphology. Using participles without auxiliaries might offer a concise way of expressing an action while emphasizing its progressive aspect.

Following the observation of the lexical profile of the length-constrained condition, we tested a general relationship between the average word frequency and the length of a sentence. We found that up to a sentence length of ∼11 words, shorter sentences predict lower-frequency words. The sentence length–word frequency relationship proved to be a fundamental property of language production that is preserved even in patients with lexicosemantic deficits. This fundamental property can be explained by forces that shape human language to transfer the maximum amount of information with the least effort.84 When speakers plan to communicate a particular idea, they have multiple options regarding the choice of words, syntax and sentence length.85,86 However, some formulations may be inefficient due to redundancy or verbosity and are therefore avoided. Constructing longer sentences87–91 and retrieving lower-frequency words92–94 are two cognitively demanding processes that share the same goal of conveying more information. Therefore, balancing the two sources of information would be an efficient way of communicating a message. This efficiency of communication draws on essential properties of language production. One property is the existence of a self-monitoring system that tracks how well the message of a sentence is conveyed.95 The other property is the close interactivity between the lexical and structural sources of information.35,96,97 This interactivity enables patients with sentence structure deficits to compensate by using low-frequency words and patients with lexicosemantic impairments by choosing more complex structures to convey their message.98

From this work, we cannot determine the functional locus of the bottleneck99 in producing long, complex sentences in patients with nfvPPA. Candidate explanations include impairments at the conceptualization and language planning stages, poor executive function,100 impaired working memory, deficient phonological processing23,101 and motor planning deficits.65 Future work is needed to elucidate the core mechanism underlying the fundamental limitation on sentence production in patients with ‘agrammatism’.

Supplementary material

Supplementary material is available at Brain Communications online.

Acknowledgements

We thank Drs Arash Afraz and Nick Chater for their comments on this work and Dr Jordan Green for sharing speech samples from control participants. We express special appreciation to the participants in this study and their family members for their time and effort, without which this research would not have been possible.

Funding

This work was supported by the National Institutes of Health (R01 DC014296, R01 DC013547, R21 DC019567 and R21 AG073744).

Competing interests

The authors report no competing interests.

Data availability

The language scores of anonymized patients and healthy individuals and the R code used for data analysis are available at the following link: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/SOMVGP.

References

1. Damasio AR. Aphasia. N Engl J Med. 1992;326(8):531-539.
2. Saffran EM, Berndt RS, Schwartz MF. The quantitative analysis of agrammatic production: Procedure and data. Brain Lang. 1989;37(3):440-479.
3. Goodglass H. Agrammatism in aphasiology. Clin Neurosci. 1997;4(2):51-56.
4. Goodglass H, Berko Gleason J. Agrammatism and inflectional morphology in English. J Speech Hear Res. 1960;3:257-267.
5. Zurif EB, Caramazza A, Myerson R. Grammatical judgments of agrammatic aphasics. Neuropsychologia. 1972;10(4):405-417.
6. Miceli G, Silveri MC, Villa G, Caramazza A. On the basis for the agrammatic's difficulty in producing main verbs. Cortex J Devoted Study Nerv Syst Behav. 1984;20(2):207-220.
7. Goodglass H, Geschwind N. Language disorder (in aphasia). In: Carterette E, Friedman M, eds. Handbook of perception. Vol. 7. Academic Press; 1976:389-428.
8. Kussmaul A, ed. Handbuch Der Speciellen Pathologie und therapie: Die storungen der sprache versuch einer pathologie der sprache. Vol. 12. FCW Vogel; 1877.
9. Deleuze JPF, ed. Histoire critique du magnétisme animal. Chez Belin-Leprieur; 1819.
10. de Villiers J. Quantitative aspects of agrammatism in aphasia. Cortex. 1974;10(1):36-54.
11. Caramazza A, Zurif EB. Dissociation of algorithmic and heuristic processes in language comprehension: Evidence from aphasia. Brain Lang. 1976;3(4):572-582.
12. Goodenough C, Zurif EB, Weintraub S. Aphasics' attention to grammatical morphemes. Lang Speech. 1977;20(1):11-19.
13. Gleason HA, ed. An introduction to descriptive linguistics. Rinehart and Winston; 1961.
14. Garrett MF. Processes in language production. In: Newmeyer FJ, ed. Linguistics: The Cambridge Survey: Vol. III. Language: psychological and biological aspects. Cambridge University Press; 1988:69-96.
15. Garrett MF. The analysis of sentence production. In: Bower GH, ed. Psychology of learning and motivation. 1st edn. Vol. 9. Academic Press; 1975:133-177.
16. Bradley DC, Garrett MF, Zurif EB. Syntactic deficits in Broca's aphasia. In: Caplan D, ed. Biological studies of mental processes. MIT Press; 1980:1969-1986.
17. Segalowitz SJ, Lane KC. Lexical access of function versus content words. Brain Lang. 2000;75(3):376-389.
18. Pulvermüller F. Brain mechanisms linking language and action. Nat Rev Neurosci. 2005;6(7):576-582.
19. Bates E, Chen S, Tzeng OJ, Li P, Opie M. The noun-verb problem in Chinese aphasia. Brain Lang. 1991;41(2):203-233.
20. Bird H, Franklin S. Cinderella revisited: A comparison of fluent and non-fluent aphasic speech. J Neurolinguistics. 1996;9(3):187-206.
21. Zingeser LB, Berndt RS. Retrieval of nouns and verbs in agrammatism and anomia. Brain Lang. 1990;39(1):14-32.
22. Daniele A, Giustolisi L, Silveri MC, Colosimo C, Gainotti G. Evidence for a possible neuroanatomical basis for lexical processing of nouns and verbs. Neuropsychologia. 1994;32(11):1325-1341.
23. Hillis AE, Caramazza A. Converging evidence for the interaction of semantic and sublexical phonological information in accessing lexical representations for spoken output. Cogn Neuropsychol. 1995;12(2):187-227.
24. Williams SE, Canter GJ. Action-naming performance in four syndromes of aphasia. Brain Lang. 1987;32(1):124-136.
25. Druks J. Verbs and nouns—A review of the literature. J Neurolinguistics. 2002;15(3-5):289-315.
26. Miceli G, Silveri MC, Nocentini U, Caramazza A. Patterns of dissociation in comprehension and production of nouns and verbs. Aphasiology. 1988;2(3-4):351-358.
27. Saffran EM, Schwartz MF, Marin OSM. The word order problem in agrammatism: II. Production. Brain Lang. 1980;10(2):263-280.
28. Lapointe SG. A theory of verb form use in the speech of agrammatic aphasics. Brain Lang. 1985;24(1):100-155.
29. Berndt R, Haendiges AN, Mitchum CC, Sandson J. Verb retrieval in aphasia. 2. Relationship to sentence processing. Brain Lang. 1997;56(1):107-137.
30. Bencini G, Ronald D. Verb access difficulties in agrammatic aphasic narratives. Paper presented at the 70th Annual Meeting of the Linguistic Society of America, San Diego, CA; 1996.
31. Breedin SD, Saffran EM, Schwartz MF. Semantic factors in verb retrieval: An effect of complexity. Brain Lang. 1998;63(1):1-31.
32. Jespersen O, ed. Modern English grammar on historical principles: Part V Syntax. Allen & Unwin; 1965.
33. Maouene J, Laakso A, Smith LB. Object associations of early-learned light and heavy English verbs. First Lang. 2011;31(1).
34. Kegl J. Levels of representation and units of access relevant to agrammatism. Brain Lang. 1995;50(2):151-200.
35. Gordon JK, Dell GS. Learning to divide the labor: An account of deficits in light and heavy verb production. Cogn Sci. 2003;27(1):1-40.
36. Gorno-Tempini ML, Hillis AE, Weintraub S, et al. Classification of primary progressive aphasia and its variants. Neurology. 2011;76(11):1006-1014.
37. Dell GS. Effects of frequency and vocabulary type on phonological speech errors. Lang Cogn Process. 1990;5(4):313-349.
38. Shannon CE. A mathematical theory of communication. Bell Sys Tech J. 1948;27:379-423.
39. Rezaii N, Michaelov J, Josephy-Hernandez S, et al. A computational approach for measuring sentence information via surprisal: Theoretical implications in nonfluent primary progressive aphasia. medRxiv. Cold Spring Harbor Laboratory Press; 2022.
40. Piantadosi ST, Tily H, Gibson E. Word lengths are optimized for efficient communication. Proc Natl Acad Sci USA. 2011;108(9):3526-3529.
41. Zipf GK, ed. The psycho-biology of language. MIT Press; 1936.
42. Zipf GK, ed. Human behavior and the principle of least effort. Addison-Wesley Press; 1949.
43. Sapolsky D, Domoto-Reilly K, Negreira A, Brickhouse M, McGinnis S, Dickerson BC. Monitoring progression of primary progressive aphasia: Current approaches and future directions. Neurodegener Dis Manag. 2011;1(1):43-55.
44. Sapolsky D, Domoto-Reilly K, Dickerson BC. Use of the Progressive Aphasia Severity Scale (PASS) in monitoring speech and language status in PPA. Aphasiology. 2014;28(8-9):993-1003.
45. Kertesz A, Kertesz A, Raven JC, eds. Psychcorp (firm). WAB-R: Western aphasia battery-revised. PsychCorp; 2007.
46. Hunt KW, ed. Grammatical structures written at three grade levels. National Council of Teachers of English; 1965.
47. Rezaii N, Wolff P, Price BH. Natural language processing in psychiatry: The promises and perils of a transformative approach. Br J Psychiatry. 2022;220(5):251-253.
48. Qi P, Zhang Y, Zhang Y, Bolton J, Manning CD. Stanza: A Python natural language processing toolkit for many human languages. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. Association for Computational Linguistics; 2020:101-108.
49. Rezaii N, Mahowald K, Ryskin R, Dickerson B, Gibson E. Syntactic rule frequency as a measure of syntactic complexity: insights from primary progressive aphasia. In: 34th Annual CUNY Conference on Human Sentence Processing. Accessed 14 June 2022. https://www.cuny2021.io/2021/02/24/251/.
50. Josephy-Hernandez S, Rezaii N, Jones A, et al. Automated analysis of functional written communication in the three variants of primary progressive aphasia (P7-3.001). Neurology. 2022;98:3240.
51. Rezaii N, Ryskin R, Cordella C, Quimby M, Dickerson B, Gibson E. An information-theoretic characterization of language production in primary progressive aphasia. J Neuropsychiatry Clin Neurosci. 2020;32:E19.
52. Davies M. The Corpus of Contemporary American English (COCA). 2008. https://www.english-corpora.org/coca/.
53. Schumaker L, ed. Spline functions: Basic theory. 3rd edn. Cambridge University Press; 2007.
54. Wood SN. Thin plate regression splines. J R Stat Soc Ser B Stat Methodol. 2003;65(1):95-114.
55. Wood SN. Mixed GAM computation vehicle with automatic smoothness estimation; 2012.
56. Corbeil RR, Searle SR. Restricted maximum likelihood (REML) estimation of variance components in the mixed model. Technometrics. 1976;18(1):31-38.
57. Crainiceanu C, Ruppert D, Claeskens G, Wand MP. Exact likelihood ratio tests for penalised splines. Biometrika. 2005;92(1):91-103.
58. Bates D, Mächler M, Bolker B, Walker S. Fitting linear mixed-effects models using lme4. J Stat Softw. 2015;67(1):1-48.
59. Hale J. A probabilistic Earley parser as a psycholinguistic model. In: Proceedings of the Second Meeting of the North American Chapter of the Association for Computational Linguistics on Language Technologies. Association for Computational Linguistics; 2001:1-8.
60. Jurafsky D. A probabilistic model of lexical and syntactic access and disambiguation. Cogn Sci. 1996;20(2):137-194.
61. Levy R. Expectation-based syntactic comprehension. Cognition. 2008;106(3):1126-1177.
62. Divjak D. Measuring exposure: Frequency as a linguistic game changer. In: Divjak D, ed. Frequency in language: Memory, attention and learning. Cambridge University Press; 2019:40-71.
63. Rezaii N, Ren B, Quimby M, Hochberg D, Dickerson B. Less is more in language production: Shorter sentences contain more informative words. medRxiv; 2022.
64. Isserlin M. Über agrammatismus. Zeitschrift für Neurologie und Psychiatrie. 1922;75:332-416.
65. Pick A, ed. Die Agrammatischen Sprachstörungen: Studien zur Psychologischen Grundlegung der Aphasielehre. Springer; 1913.
66. Fedorenko E, Ryskin R, Gibson E. Agrammatic output in non-fluent/Broca's aphasia as a rational behavior. PsyArXiv; 2022.
67. Shapiro LP, Levine BA. Verb processing during sentence comprehension in aphasia. Brain Lang. 1990;38(1):21-47.
68. Shapiro LP, Gordon B, Hack N, Killackey J. Verb-argument structure processing in complex sentences in Broca's and Wernicke's aphasia. Brain Lang. 1993;45(3):423-447.
69. Kolk HHJ, Van Grunsven MJF, Keyser A. On parallelism between production and comprehension in agrammatism. In: Kean M, ed. Agrammatism. Elsevier; 1985:165-206.
70. Halliday MAK, Christie F, eds. Spoken and written language. 2nd edn. Oxford University Press; 1989.
71. Yue L, Zhang Y. Realization of nominalization functions in abstracts. Int J Lang Linguist. 2019;6(4):v6n4p22.
72. Zhu Y, Dong H. Lexical metaphor, grammatical metaphor and their complementarity in technical discourses. Shandong Foreign Lang Teach. 2001;4:5-8.
73. Halliday M, ed. An introduction to functional grammar. Edward Arnold; 1994.
74. Kazemian B, Behnam B, Ghafoori N. Ideational grammatical metaphor in scientific texts: A Hallidayan perspective. Int J Linguist. 2013;5(4):146.
75. Li Q. Functions of nominalization in scientific news discourse. In: 4th International Conference on Education, Management and Computing Technology. Atlantis Press; 2017.
76. Ravelli LJ, ed. Metaphor, mode and complexity: An exploration of co-varying patterns. Nottingham Trent University, Department of English and Media Studies; 1999.
77. Bastiaanse R, Thompson CK. Verb and auxiliary movement in agrammatic Broca's aphasia. Brain Lang. 2003;84(2):286-305.
78. Centeno JG. Use of verb inflections in the oral expression of agrammatic Spanish-speaking aphasics. Unpublished doctoral dissertation, CUNY Acad Works; 1996:119.
79. Centeno J, Obler L. Agrammatic verb errors in Spanish speakers and their normal discourse correlates. Publ Res. Published online January 1, 2001. https://academicworks.cuny.edu/gc_pubs/55.
80. Arslan S, Bamyacı E, Bastiaanse R. A characterization of verb use in Turkish agrammatic narrative speech. Clin Linguist Phon. 2016;30(6):449-469.
81. Faroqi-Shah Y. Are regular and irregular verbs dissociated in non-fluent aphasia? A meta-analysis. Brain Res Bull. 2007;74(1-3):1-13.
82. Faroqi-Shah Y, Friedman L. Production of verb tense in agrammatic aphasia: A meta-analysis and further data. Behav Neurol. 2015;2015:983870.
83. Friedmann N. Moving verbs in agrammatic production. In: Bastiaanse R, Grodzinsky Y, eds. Grammatical disorders in aphasia: A neurolinguistic perspective. Wiley; 2000:152-170.
84. Gibson E, Futrell R, Piantadosi ST, et al. How efficiency shapes human language. Trends Cogn Sci. 2019;23(12):1087.
85. Bock K, Ferreira VS. Syntactically speaking. In: Nathan P, ed. The Oxford handbook of language production. Oxford University Press; 2014:21-46.
86. Levelt WJ, Roelofs A, Meyer AS. A theory of lexical access in speech production. Behav Brain Sci. 1999;22(1):1-38; discussion 38-75.
87. Sigurd B, Eeg-Olofsson M, van Weijer J. Word length, sentence length and frequency—Zipf revisited. Stud Linguist. 2004;58(1):37-52.
88. Futrell R, Mahowald K, Gibson E. Large-scale evidence of dependency length minimization in 37 languages. Proc Natl Acad Sci USA. 2015;112(33):10336-10341.
89. Gibson E. Linguistic complexity: Locality of syntactic dependencies. Cognition. 1998;68(1):1-76.
90. Grodner D, Gibson E. Consequences of the serial nature of linguistic input for sentenial complexity. Cogn Sci. 2005;29(2):261-290.
91. Tsizhmovska NL, Martyushev LM. Principle of least effort and sentence length in public speaking. Entropy. 2021;23(8):1023.
92. Kittredge AK, Dell GS, Verkuilen J, Schwartz MF. Where is the effect of frequency in word production? Insights from aphasic picture naming errors. Cogn Neuropsychol. 2008;25(4):463-492.
93. Oldfield RC, Wingfield A. Response latencies in naming objects. Q J Exp Psychol. 1965;17(4):273-281.
94. Jescheniak JD, Levelt WJM. Word frequency effects in speech production: Retrieval of syntactic information and of phonological form. J Exp Psychol Learn Mem Cogn. 1994;20(4):824-843.
95. Levelt WJM. Self-monitoring and self-repair. In: Levelt WJM, ed. Speaking: From intention to articulation. Bradford Books; 1993:458-499.
96. Thorne J, Faroqi-Shah Y. Verb production in aphasia: Testing the division of labor between syntax and semantics. Semin Speech Lang. 2016;37(1):23-33.
97. Bates E, Goodman JC. On the inseparability of grammar and the lexicon: Evidence from acquisition, aphasia and real-time processing. Lang Cogn Process. 1997;12(5-6):507-584.
98. Rezaii N, Mahowald K, Ryskin R, Dickerson B, Gibson E. A syntax–lexicon trade-off in language production. Proc Natl Acad Sci USA. 2022;119(25):e2120203119.
99. Christiansen MH, Chater N. The now-or-never bottleneck: A fundamental constraint on language. Behav Brain Sci. 2016;39:e62.
100. Grossman M, Irwin DJ. Primary progressive aphasia and stroke aphasia. Contin Minneap Minn. 2018;24(3):745-767.
101. Kean ML. Agrammatism: A phonological deficit? Cognition. 1979;7(1):69-83.

Abbreviations

COCA = Corpus of Contemporary American English
EDF = effective degrees of freedom
GAM = generalized additive models
lvPPA = logopenic variant primary progressive aphasia
MGH = Massachusetts General Hospital
nfvPPA = non-fluent variant primary progressive aphasia
PASS = Progressive Aphasia Severity Scale
PPA = primary progressive aphasia
svPPA = semantic variant primary progressive aphasia

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
