Christian Baden, Giovanni Motta, Evolutionary correspondence analysis of the semantic dynamics of frames, Journal of the Royal Statistical Society Series A: Statistics in Society, Volume 187, Issue 4, October 2024, Pages 1065–1095, https://doi.org/10.1093/jrsssa/qnae022
Abstract
We introduce and implement a novel dimension-reduction method for high-dimensional, time-varying contingency tables: Evolutionary Correspondence Analysis (ECA). ECA enables a comparative analysis of high-dimensional, diachronic processes by identifying a small number of shared latent variables that shape co-evolving data patterns. ECA offers new opportunities for the study of complex social phenomena, such as co-evolving public debates: its capacity to inductively extract time-varying latent variables from the observed contents of evolving debates permits an analysis of meanings shared by linked sub-discourses, such as linked national public spheres or the discourses led by distinct political camps within a shared public sphere. We illustrate the utility of our approach by studying how the Greek and German right-, centre-, and left-leaning news coverage of the European financial crisis evolved from its outbreak in 2009 until its institutional containment in 2012. Comparing the use of 525 unique concepts in six German and Greek outlets with different political leanings over an extended period of time, we identify two common factors accounting for those evolving meanings and analyse how the different sub-discourses influenced one another over time. We allow the factor loadings to be time-varying, and fit to the latent factors a time-varying vector-auto-regressive model with time-varying mean.
1 Introduction
The growing availability of ‘big data’ capturing complex social behaviour has raised numerous challenges for statistical analysis. In social science text analysis, one key application of statistical modelling is the study of public debates. Generally, public debates are characterized by a rapid succession of events and topics, embedded within a slowly evolving process of cultural meaning-making. At any time, salient issues in the news are interpreted against a finite set of widely recognized frames (Motta & Baden, 2013), which might (for instance) cast rising inflation as an economic threat to personal livelihoods, a challenge for monetary policy making, or an indicator of macro-economic developments. Frames are constantly negotiated and evolve in a path-dependent manner, responding to both external events and to concurrent debates on related issues, which are interconnected across national, cultural, and linguistic borders (Trenz, 2004; Wessler et al., 2016).
In recent years, a growing awareness of such transnational inter-dependencies in public debates has resulted in a comparative turn in social science text analysis (Baden et al., 2022). Relying on automated Natural Language Processing technologies to extract high-dimensional co-occurrence and covariance data from recorded debates (e.g. Burscher et al., 2016), however, most comparative strategies restrict available analyses to cross-sectional data, requiring researchers to either disregard important over-time variation or to construct artificial phase-wise comparative designs. As a consequence, such approaches face important limitations whenever comparable contents do not appear synchronously across juxtaposed debates, and are unable to identify diachronic influences (e.g. agenda building, frame building; Sheafer & Gabay, 2009) and inter-dependencies (e.g. synchronization, consensus formation, polarization; Baden & Tenenboim-Weinblatt, 2017; Yardi & Boyd, 2010). Inversely, most longitudinal strategies are unfit for comparative analysis and remain restricted to low-dimensional data, such as the time-varying salience of few pre-defined topics, frames, or sentiments (e.g. Vliegenthart & Walgrave, 2008). One strategy that has recently gained popularity is Structural Topic Modelling (Jacobi et al., 2016; Roberts et al., 2014), which permits a treatment of time as a covariate in the extraction of regular patterns from high-dimensional evolutionary data. However, this strategy obscures the long-term evolutionary, path-dependent nature of dynamic framing processes, while its dependency on language limits its utility for cross-national comparison (Chan et al., 2020). Where researchers aim to study the time-varying qualities and configurations of meanings expressed and exchanged across linked debates, there is a need for new tools that enable a comparative analysis of time-changing, internally complex processes.
Most available statistical models run into difficulties when deployed to study dynamic processes that are both high-dimensional (big N) and non-stationary (big T). When N is large compared to T, coefficient estimates suffer from high variance. When N is larger than T, standard methods (such as ordinary least squares) become unavailable. Supervised answers to these problems are penalized regression methods such as the lasso (Beltran et al., 2021), variable selection, and partial least squares (Fatema et al., 2022). When N is large and no response variable is available, it is desirable to reduce the cross-sectional dimension.
When, in addition to N, T is also large, parameters are unlikely to remain constant over time (Motta et al., 2011). In contrast to the classical stationary approach, where parameters are estimated globally, non-stationarity requires estimating the parameters locally in time (Burscher et al., 2016; Rosin & Radinsky, 2019). At the same time, without an explicit modelling of time-dependent dynamics, such local estimation risks misrepresenting evolving frames in the debate as discrete patterns in time, separated by the constraints of the chosen estimation method rather than by true discontinuities (Baumgartner et al., 2008).
In this paper, we introduce Evolutionary Correspondence Analysis (ECA), a model-based unsupervised method for high-dimensional time-varying contingency tables. In ECA, both T (time-series size) and N (cross-section size) are permitted to be large, while the innate contingency table structure supports a structured comparison of time-changing similarities and differences, inter-dependencies and persistent idiosyncrasies between co-evolving high-dimensional diachronic processes. Statistically, this translates into a double asymptotic framework where both T and N diverge to infinity, with N growing at least as fast as T, so as to meet the characteristic challenge presented by ‘big data’. By proposing an inductive, unsupervised method, we specifically address the common situation that the nature of key patterns in the data (such as frames) is neither known ex ante nor deductively definable, owing both to the scale and evolutionary nature of the data.
Conceptually, ECA draws upon the basic idea behind Evolutionary Factor Analysis (EFA), a method for the identification of time-varying latent structures in high-dimensional diachronic data (Motta & Baden, 2013). However, it extends the basic EFA framework in two critical ways.
ECA permits researchers to jointly model the distinct and common latent structures shaping multiple co-evolving high-dimensional processes (such as public debates). It thereby permits a structured comparative analysis of the ways in which different discourses debate the same issues, identifying similarities and differences on two levels of abstraction: (i) the configuration of specific latent patterns (e.g. similar but not fully identical, evolving frames) within each debate; and (ii) the relative contribution of such patterns to structuring each debate over time.
ECA extends existing approaches to time-series analysis to enable the study of both synchronous and (cross-)lagged, directed, or mutual influences between latent processes extracted from co-evolving high-dimensional processes. In this way, ECA allows identifying inter-dependencies between linked debates both in the form of semantic inductions (e.g. the spill-over of specific associations and frames from one debate to another) and dialogical interactions (e.g. the formation or rise of specific frames in one debate, in response to different frames emerging in another debate).
In analogy to EFA, ECA thus extends conventional Correspondence Analysis (CA) to account for the fact that many real-world processes undergo meaningful over-time transformation. Hence, where CA assumes that any associations captured in a contingency table remain valid over the entire period under observation, ECA permits a dynamic, inductive analysis that reveals both areas and times of relative stability and change within the high-dimensional data.
In the following, we introduce ECA and illustrate its contribution to the analytic toolbox using data from the Greek and German news coverage about the European financial crisis. We start by presenting the data used for this demonstration (Section 2). We then review the basic idea of EFA (Section 3), and develop our novel ECA (Section 4) as a time-varying version of CA. We then provide a factor-model representation of the (rescaled) matrix of counts (Section 5.1), and illustrate the identification properties of our model (Section 5.2).
In Section 6, we detail four important benefits of interpretation that CA enjoys as compared to Principal Components Analysis (PCA) from a statistical point of view, and explain how ECA offers important new avenues for the comparative study of high-dimensional diachronic processes. Finally, we discuss key implications for social scientific research and methodology (Section 7).
The online supplementary material contains three appendices. In Appendix A, we provide simulation results that illustrate the performance of our estimates. In Appendix B, we derive a novel transition formula between the eigenvectors of the covariance matrix and the eigenvectors of the product between row-profiles and column-profiles. In Appendix C, we derive a new asymptotic expression for the mean squared error of the estimated smooth loadings, which can be used to select the smoothing parameter.
2 The data at hand
For the present application, we study how the German and Greek public debate interpreted the 2009–2012 financial crisis. Both countries were pivotal players in the development and resolution of the crisis, directly affecting one another’s policy decisions and paying close attention to evolving public debates in both countries. In this context, studying the evolution and alignment of frames between both debates offers valuable insights into the varying degrees of public resonance of proposed crisis policies and may reveal valuable windows of opportunity for joint problem definition and policy responses (Schön, 1994). We accessed all news published between 1 October 2009 (four days after the elections in Germany, three days before the early elections in Greece) and 30 June 2012 (two weeks after the June elections in Greece, the second elections called in that year) in three Greek and three German major news broadsheets (Kleinnijenhuis et al., 2015). In both countries, we selected leading news outlets to represent the preferred framing strategies in politically left-leaning, centrist, and right-leaning sub-discourses: Within Greece, we collected all news coverage pertaining to the European financial crisis in the left-leaning Eleftherotypia (ET), the centrist Ta Nea (TN), and the right-leaning Kathimerini (KA). Within Germany, we did the same for the left-leaning Frankfurter Rundschau (FR) and Tageszeitung (TZ), the centrist Süddeutsche Zeitung (SZ), and the right-leaning Die Welt (including its Sunday edition Welt am Sonntag; DW). In total, 43,589 relevant articles were included in the analysis.
Within this coverage, we used a large dictionary to automatically recognize the presence of unique semantic concepts, using a coding routine implemented in the AmCAT autocoding software environment (van Atteveldt, 2008). Coded concepts included key actors (e.g. politicians, banks, international organizations) and issues (e.g. debt, unemployment, specific policies), as well as a wide variety of actions, qualities, and evaluations relevant to the financial crisis. This extraction of concepts prior to estimation serves to focus the analysis on meaningful patterns in the text, thus avoiding that the factors to be estimated are dominated by variations in the use of common but uninformative words (e.g. Nicholls & Culpepper, 2021). Concept frequencies were aggregated per sub-discourse and week, treating the two German left-leaning outlets as one to counterbalance their lesser volume of relevant coverage. In this way, we obtained a combined, three-dimensional array (concepts × sub-discourses × weeks), representing the presence of the N concepts in the contents of the P sub-discourses over the T weeks covered by the analysis. At time t, each row of the combined matrix refers to one high-dimensional, diachronic process, representing the path-dependent representation of the financial crisis in the left-leaning, centrist, or right-leaning sub-discourse within the Greek or the German public debate. That is, the combined matrix juxtaposes a Greek and a German block,
each comprising the three corresponding sub-discourses. For our subsequent analysis, we rely on a subset of concepts for which we registered sufficient variance in their presence across all six sub-discourses. The time-varying matrix of counts has as its general entry the frequency of occurrences of concept n in sub-discourse p at week t, with $n = 1, \ldots, N$, $p = 1, \ldots, P$, and $t = 1, \ldots, T$. In particular, for our application we have $P = 6$ sub-discourses (three for Greece and three for Germany), $N = 525$ concepts selected among the original 975 available ones, and T weeks covered by our analysis. Illustrating the analytic contribution of ECA, we study the co-evolution and inter-dependencies between (a) the respective sub-discourses both within Greece and within Germany, and (b) the aggregated media discourses in the Greek and German news as a whole. In particular, we use ECA to reduce the dimensionality of the data and extract both shared and idiosyncratic latent factors accounting for the bulk of variance in the use of each concept, within each sub-discourse, at each point in time. In place of 525 variables expressing the highly correlated use of many manifest concepts, we thus obtain a small number of latent variables expressing time-varying meaning. Each latent variable represents one set of concepts whose occurrence temporarily covaries within the observed data, which can be interpreted as a potential frame or framing choice (e.g. when a factor distinguishes between two opposing ways of framing an issue).1 The main factors jointly express those major variations in framing that structure each sub-discourse at a specific point in time. For each factor, ECA then detects the time-varying structure of relationships between these variables, which can be used to directly compare the respective frames and debates, or to study the simultaneous and cross-lagged influences between debates over time.
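To make the data structure concrete, the following minimal Python sketch assembles such a weekly concept-by-sub-discourse array from a tidy table of coded mentions. The record layout, column names, and toy values are illustrative assumptions, not the actual AmCAT output format.

```python
# Illustrative sketch (not the actual AmCAT pipeline): building the N x P x T
# array of weekly concept frequencies from a tidy table of coded mentions.
import numpy as np
import pandas as pd

# Hypothetical record layout: one row per (week, sub-discourse, concept) count.
records = pd.DataFrame({
    "week":    [0, 0, 1, 1, 2],
    "outlet":  ["DE_centre", "GR_left", "DE_centre", "GR_right", "GR_left"],
    "concept": ["debt", "austerity", "debt", "troika", "debt"],
    "count":   [12, 7, 9, 3, 5],
})

concepts = sorted(records["concept"].unique())   # the N retained concepts
outlets  = sorted(records["outlet"].unique())    # the P sub-discourses
T = int(records["week"].max()) + 1               # number of weeks

# X[n, p, t] = frequency of concept n in sub-discourse p at week t
X = np.zeros((len(concepts), len(outlets), T))
for _, row in records.iterrows():
    n = concepts.index(row["concept"])
    p = outlets.index(row["outlet"])
    X[n, p, int(row["week"])] += row["count"]
```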
Throughout the paper, we use bold upright letters for matrices, bold italic letters for vectors, and normal (non-bold) letters for scalars. We denote by $\mathbf{A}^{\top}$ the transpose of the matrix $\mathbf{A}$, by $\lVert \mathbf{A} \rVert$ the Frobenius (Euclidean) norm of $\mathbf{A}$, by $\mathbf{0}$ the null matrix, by $\mathbf{I}_m$ the identity matrix of order m, by $\mathbf{1}_m$ the vector of ones, and by $\mathbb{1}\{\cdot\}$ the indicator function. Raw (or unsmoothed) estimators are denoted by hats, whereas smoothed estimators are denoted by tildes.
3 The logic behind the evolutionary factor analysis
Linear factor models have been widely applied inside and outside communication research to model high-dimensional data whose structures can be explained by means of a few common latent factors (e.g. Doise et al., 1993; Landauer & Dumais, 1997; Leydesdorff & Vaughan, 2006; Semetko & Valkenburg, 2000). Consider the following model for the observations $\boldsymbol{X}_t$:
$$\boldsymbol{X}_t = \boldsymbol{\chi}_t + \boldsymbol{\varepsilon}_t,$$
where $\boldsymbol{\chi}_t$ is the vector of common components and $\boldsymbol{\varepsilon}_t$ is the vector of idiosyncratic components. Neither $\boldsymbol{\chi}_t$ nor $\boldsymbol{\varepsilon}_t$ is observed. Linear factor analysis is based on the idea that the common components are linear functions of a few common latent low-dimensional factors $\boldsymbol{F}_t$ through the so-called loadings $\boldsymbol{\Lambda}$,
$$\boldsymbol{\chi}_t = \boldsymbol{\Lambda}\,\boldsymbol{F}_t,$$
whereas the idiosyncratic errors explain information that is specific to each particular series, and thus uncorrelated with the latent factors. Factor analysis allows for dimension reduction since $\boldsymbol{\Lambda}$ is $N \times R$ and $\boldsymbol{F}_t$ is $R \times 1$, with $R$ much smaller than $N$.
The EFA was introduced by Motta (2009), and its asymptotic properties have been studied by Motta et al. (2011). So far EFA has been employed in Econometrics (Eichler et al., 2011), Biostatistics (Motta & Ombao, 2012) and Communication Science (Motta & Baden, 2013). Motta (2017) summarizes the main opportunities for comparative analysis in social science and distinguishes between different versions of EFA for public debates. EFA derives from three simple propositions regarding the dynamic structure of high-dimensional data:
In high-dimensional, time-ordered data, a large number of observations is structured by latent processes, which can be detected based on the patterns of systematic co-variation over time (Baumgartner et al., 2008; Hellsten et al., 2010);
These latent processes change over time, in an evolutionary, i.e. gradual, path-dependent manner relative to their own past (Kleinnijenhuis & Fan, 1999; Leydesdorff, 2011);
Relatively few latent processes suffice to explain the bulk of informative co-variation in high-dimensional, dynamic data (Motta & Baden, 2013).
As a consequence of these propositions, the evolutionary, latent structure of high-dimensional data can be modelled. As a starting point, we need to modify basic linear factor models to take into account the diachronic variability of the data and latent processes. Evolutionary Factor Analysis extends the modelling and estimation methodology of linear factor analysis to the case where the parameters are time-varying. In Motta and Baden (2013), we permit time-varying loadings $\boldsymbol{\Lambda}_t$ on otherwise stationary and static factors $\boldsymbol{F}_t$:
$$\boldsymbol{X}_t = \boldsymbol{\Lambda}_t\,\boldsymbol{F}_t + \boldsymbol{\varepsilon}_t, \qquad \boldsymbol{F}_t = \mathbf{D}\,\boldsymbol{Z}_t, \qquad (1)$$
where $\mathbf{D}$ is a diagonal matrix with time-invariant parameters and $\boldsymbol{Z}_t$ is a zero-mean, unit-variance process uncorrelated over time, that is,
$$\operatorname{E}(\boldsymbol{Z}_t) = \mathbf{0}_R, \qquad \operatorname{E}\!\left(\boldsymbol{Z}_t\,\boldsymbol{Z}_t^{\top}\right) = \mathbf{I}_R, \qquad (2)$$
where $\mathbf{0}_R$ and $\mathbf{I}_R$ are, respectively, the vector of zeros and the identity matrix of order R. The matrix $\mathbf{D}$ is assumed to be time-invariant and diagonal to reflect, respectively, the stationarity of the latent factors and their mutual orthogonality. As a consequence of (2), $\operatorname{E}(\boldsymbol{F}_t) = \mathbf{0}_R$ and $\operatorname{E}\!\left(\boldsymbol{F}_t\,\boldsymbol{F}_t^{\top}\right) = \mathbf{D}^2$ do not depend on time.
Since $\mathbf{D}$ is time-invariant, the factors are stationary in the sense that their means, their variances, and their auto-covariances do not change over time. Since $\boldsymbol{Z}_t$ is uncorrelated over time, the factors are static in the sense that they are uncorrelated over time.
In this paper, we generalize the approach introduced by Motta and Baden (2013) in three different directions. Firstly, we distinguish, within the same set of observations, between P different sub-discourses; that is, our observed matrix is $N \times P$ rather than $N \times 1$. This allows us to distinguish, within the same set of latent factors, between P different sub-discourses; hence, our vector of latent factors has size $RP$ rather than $R$. Secondly, we allow the latent factors to be auto-correlated. Finally, we permit the factors to be non-stationary. As explained below, these three generalizations can be understood by comparing the representations of the factors in (1) and (5).
In Motta and Baden (2013) we adopted a solution with static, stationary factors and smoothly time-varying loadings. As a consequence, in model (1) the dynamics is fully captured by the time-varying, high-dimensional loadings $\boldsymbol{\Lambda}_t$. In this paper, we still allow the loadings to be time-varying,
$$\mathbf{X}_t = \boldsymbol{\Lambda}_t\,\mathbf{F}_t + \boldsymbol{\varepsilon}_t, \qquad (3)$$
and, moreover, we allow the latent factors to have a time-varying auto-regressive structure:
$$\mathbf{F}_t - \boldsymbol{\mu}_t = \mathbf{A}_t\left(\mathbf{F}_{t-1} - \boldsymbol{\mu}_{t-1}\right) + \boldsymbol{\xi}_t, \qquad (4)$$
with $\operatorname{E}(\boldsymbol{\xi}_t) = \mathbf{0}$ and $\operatorname{E}\!\left(\boldsymbol{\xi}_t\,\boldsymbol{\xi}_t^{\top}\right) = \boldsymbol{\Sigma}_t$ for all $t$.
Model (4) aims at modelling multivariate time series characterized by non-stationarity. VARIMA models are popular tools that are widely employed for this purpose. One important advantage of VAR models is that they can be written in a standard form that admits a unique representation. This property is in sharp contrast to the VARMA (or VARIMA) case, where a standard form is not unique (see Lütkepohl, 2005, Chapter 12). Moreover, VAR models can be estimated by least squares, whereas parameter estimation for VARMA (or VARIMA) models requires iterative algorithms. The non-stationarity modelled by ARIMA (or VARIMA) processes is limited to the case where the (multivariate) time series is a random walk. For example, a VARIMA(0,1,0) is a VAR(1) with auto-regressive matrix equal to the identity matrix. The 'I' in ARIMA stands for 'integrated', and differencing the time series is often used to make the time series stationary by eliminating linear time-trends. The non-stationarity is modelled in a more adaptive and flexible way by a locally stationary VAR, for two reasons: it allows for (i) a non-diagonal and time-varying auto-regressive matrix rather than the diagonal and time-invariant identity matrix, and (ii) a non-linear time-trend.
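As a concrete illustration of the kind of non-stationarity such a model can capture, the short sketch below simulates a VAR(1) whose mean and (non-diagonal) coefficient matrix vary smoothly in rescaled time u = t/T. The particular curves and noise scale are arbitrary choices for demonstration, not the model fitted in this paper.

```python
# Sketch: simulating a locally stationary VAR(1) with a smoothly time-varying
# mean mu(u) and a non-diagonal coefficient matrix A(u), u = t/T (toy curves).
import numpy as np

rng = np.random.default_rng(0)
T, P = 300, 3

def mu(u):
    # smooth, non-linear time trend
    return np.array([np.sin(2 * np.pi * u), u ** 2, 0.5 * u])

def A(u):
    # smooth, non-diagonal AR matrix with spectral radius well below one
    base = np.array([[0.5, 0.1, 0.0],
                     [0.0, 0.4, 0.2],
                     [0.1, 0.0, 0.3]])
    return (0.6 + 0.3 * u) * base

F = np.zeros((T, P))
F[0] = mu(0.0)
for t in range(1, T):
    u, u_prev = t / T, (t - 1) / T
    F[t] = mu(u) + A(u) @ (F[t - 1] - mu(u_prev)) + 0.1 * rng.standard_normal(P)
```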
In Section 5.1, we prove that the factors are identified by the main principal components of the covariance matrix of the (rescaled) observations, whereas in Section 5.2 we prove that the factors are invariant to changes in the scale of the observations.
We assume that model (4) holds for all $j = 1, \ldots, R$ and all $t$. For all j and all t, the vector $\mathbf{F}_{j,t}$ is $P \times 1$ and the corresponding matrix $\mathbf{A}_{j,t}$ is $P \times P$. Since the factors are mutually orthogonal, the cross-factor blocks of the auto-regressive matrix are matrices of zeros. As a consequence, we have, for all $j$ and all $t$, the following Vector Auto-Regressive (VAR) model for each of the latent factors:
$$\left(\mathbf{I}_P - \mathbf{A}_{j,t}\,B\right)\left(\mathbf{F}_{j,t} - \boldsymbol{\mu}_{j,t}\right) = \boldsymbol{\xi}_{j,t}, \qquad (5)$$
where $\boldsymbol{\mu}_{j,t} = \operatorname{E}(\mathbf{F}_{j,t})$, and B is the back-shift operator: $B\,\mathbf{F}_{j,t} = \mathbf{F}_{j,t-1}$. The factors in (4) are dynamic and non-stationary. They are dynamic because they are auto-correlated over time, that is, the structure of each factor at time t depends on its own past values through the auto-regressive matrix $\mathbf{A}_{j,t}$. The factors in (4) are also non-stationary in the sense that their mean and their auto-covariance are time-varying. Comparing the representations of the factors in (1) and (5), it is easy to see that $\mathbf{A}_{j,t}$ is the jth observation-set-specific, dynamic, and time-varying version of the indistinct-sets, static, and time-invariant matrix $\mathbf{D}$. Model (3)–(4) reduces to (1) when the time-varying means vanish and the auto-regressive structure disappears for all j and all t.
In order to have meaningful asymptotic theory, we work in the framework of local stationarity introduced by Dahlhaus (1997), where the time-varying parameters are defined in rescaled time $u = t/T \in [0, 1]$. Consider the locally stationary version of the vector auto-regression of order 1 in (5),
$$\mathbf{F}_{t,T} - \boldsymbol{\mu}\!\left(\tfrac{t}{T}\right) = \mathbf{A}\!\left(\tfrac{t}{T}\right)\left(\mathbf{F}_{t-1,T} - \boldsymbol{\mu}\!\left(\tfrac{t-1}{T}\right)\right) + \boldsymbol{\xi}_{t}, \qquad (6)$$
where the parameter curves $\boldsymbol{\mu}(\cdot)$ and $\mathbf{A}(\cdot)$ are defined on $[0, 1]$, with $u = t/T$. For ease of notation, we skip the dependence on the index j that refers to the factor we are smoothing. Locally stationary means that if the functions $\boldsymbol{\mu}(\cdot)$ and $\mathbf{A}(\cdot)$ are 'smooth' in u and T is large, then the parameters, and hence the behaviour of the process, are approximately constant for values in real time close to each other. Model (6) is a locally stationary process written in rescaled time in a way that, for all t, as T grows we observe more and more 'observations' of the same type around t. In terms of representation, the triangular array (which is implicit in the locally stationary framework) provides a unique definition of the transfer function of the process in the time and frequency domains. In terms of estimation, defining the process in rescaled time is useful for non-parametric estimation.
In a factor model such as (3), the entries of the observed matrix are continuous measures, whereas CA deals with the observed frequency table, obtained by dividing the contingency table defined in Section 2 by the sum of its elements. In Section 4, we define ECA as a time-varying version of CA, and then in Section 5 we derive a novel factor-model representation of the (appropriately rescaled) frequencies, see (34).
4 Evolutionary correspondence analysis
Correspondence analysis was developed by Benzécri (1992). For an introductory description of CA, we refer to Greenacre (2016). Beh and Lombardo (2019) provide an overview of the many variants of CA, accompanied by an extensive list of references, and discuss both benefits and limitation of this technique. Beh and Lombardo (2021) explain how to extend CA to more than two categorical variables, and provide introductory remarks about non-symmetrical correspondence analysis (D’Ambra & Lauro, 1989).
CA is a technique designed for dimension reduction of frequency tables. It is conceptually similar to PCA, but scales the data (which should be non-negative) such that rows and columns are treated equivalently. In this paper, we generalize the CA to the case where the observed frequency table is time-varying.
Let the contingency table at time t be given and assume, without loss of generality, that $P \leq N$. Similarly to PCA, CA reduces the dimension from P to a smaller number of latent axes. From the table of counts, we obtain the matrix of (relative) frequencies: at each t, the sum of all its entries is equal to 1, and its $(p, n)$ entry is defined as the number of occurrences of concept n in sub-discourse p divided by the total number of counts at time t.
Let the marginal time-varying row-frequencies and the marginal column-frequencies be defined, respectively, as the row sums and the column sums of the matrix of frequencies. From these, we obtain two diagonal matrices: the $P \times P$ matrix containing the marginal row-frequencies and the $N \times N$ matrix containing the marginal column-frequencies. The main diagonal of the former contains the 'typical concept', which represents a hypothetical concept with average characteristics (average with respect to the sub-discourses). The main diagonal of the latter contains the 'typical set of observations', which represents a hypothetical sub-discourse with average characteristics (average with respect to the concepts). The matrix of frequencies is rescaled by the inverses of these two diagonal matrices, obtaining, respectively, the row-profiles and the column-profiles.
The rescaling allows comparing rows (sets of observations) with columns (concepts). For all t, the ECA divides:
- the pth row (the pth set of observations) by the corresponding time-varying row-frequency;
- the nth column (the nth concept) by the corresponding time-varying column-frequency.
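These profile computations are easy to carry out numerically. The sketch below applies them to a single, made-up weekly count table (rows are sub-discourses, columns are concepts); it illustrates the definitions above and is not the paper's code.

```python
# Row- and column-profiles for one time slice of a (toy) count table.
import numpy as np

counts_t = np.array([[12.,  0.,  3.,  5.,  1.],   # rows: sub-discourses
                     [ 4.,  8.,  0.,  2.,  6.],   # columns: concepts
                     [ 7.,  1.,  9.,  0.,  2.],
                     [ 0.,  5.,  2.,  4.,  3.]])

P_t = counts_t / counts_t.sum()      # relative frequencies; entries sum to 1
r_t = P_t.sum(axis=1)                # marginal row-frequencies (sub-discourse masses)
c_t = P_t.sum(axis=0)                # marginal column-frequencies (concept masses)

D_r = np.diag(r_t)                   # diagonal matrix of marginal row-frequencies
D_c = np.diag(c_t)                   # diagonal matrix of marginal column-frequencies

row_profiles = np.linalg.inv(D_r) @ P_t   # each row sums to 1 (cf. Table 1, in %)
col_profiles = P_t @ np.linalg.inv(D_c)   # each column sums to 1 (cf. Table 2, in %)
```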
Table 1 reports the row-profiles at a given time point, for the six sub-discourses and a selection of concepts. The last line is the Average Row-Profile, the main diagonal of the matrix of marginal column-frequencies. From Table 1, we can see that at this time point, the first concept (C1) makes up 0.0324% of all concepts mentioned by the centrist German newspaper Süddeutsche Zeitung (SZ), but only 0.0004% of those mentioned by the right-leaning paper Die Welt (DW). Independently of the sub-discourse considered, the first concept constitutes about 0.04% of all references across all newspapers at this point in time. Among the concepts shown in the table, the third concept is the one mentioned most by SZ (0.2271%).
Table 1. Row-profiles (in %) at a given week.

| Sub-discourses | C1 | C2 | C3 | C4 | … | C525 | Total |
|---|---|---|---|---|---|---|---|
| Germany, left (FR) | 0.0011 | 0.1134 | 0.1134 | 0.1134 | … | 0.0011 | 100 |
| Germany, centre (SZ) | 0.0324 | 0.0003 | 0.2271 | 0.0649 | … | 0.0973 | 100 |
| Germany, right (DW) | 0.0004 | 0.0856 | 0.1285 | 0.3854 | … | 0.1285 | 100 |
| Greece, left (ET) | 0.0569 | 0.0003 | 0.0003 | 0.0285 | … | 0.0285 | 100 |
| Greece, centre (TN) | 0.0006 | 0.0006 | 0.0006 | 0.0006 | … | 0.0649 | 100 |
| Greece, right (KA) | 0.1390 | 0.0007 | 0.0007 | 0.0007 | … | 0.0007 | 100 |
| Average Profile | 0.0393 | 0.0238 | 0.0862 | 0.1018 | … | 0.0627 | 100 |
Table 2 reports the column-profiles at the same time point, for the six sub-discourses and a selection of concepts. The last column is the Average Column-Profile, the main diagonal of the matrix of marginal row-frequencies. From Table 2, we can see that of all the references to the first concept recorded at this time point, almost 40% each derive from the Greek left-leaning Eleftherotypia (ET) and the right-leaning Kathimerini (KA), with another 20% found in the German centrist SZ. Across all concepts, the Greek centrist Ta Nea (TN) accounts for about 12% of all concept references.
Table 2. Column-profiles (in %) at the same week.

| Sub-discourses | C1 | C2 | C3 | C4 | … | C525 | Average Profile |
|---|---|---|---|---|---|---|---|
| Germany, left (FR) | 0.1988 | 32.8947 | 9.0662 | 7.6805 | … | 0.1247 | 6.8945 |
| Germany, centre (SZ) | 19.8807 | 0.3289 | 63.4633 | 15.3610 | … | 37.4065 | 24.0928 |
| Germany, right (DW) | 0.1988 | 65.7895 | 27.1985 | 69.1244 | … | 37.4065 | 18.2505 |
| Greece, left (ET) | 39.7614 | 0.3289 | 0.0907 | 7.6805 | … | 12.4688 | 27.4697 |
| Greece, centre (TN) | 0.1988 | 0.3289 | 0.0907 | 0.0768 | … | 12.4688 | 12.0494 |
| Greece, right (KA) | 39.7614 | 0.3289 | 0.0907 | 0.0768 | … | 0.1247 | 11.2431 |
| Total | 100 | 100 | 100 | 100 | … | 100 | 100 |
It is easy to verify that weighting each row-profile by the corresponding marginal row-frequency and summing over the rows recovers the Average Row-Profile; symmetrically, weighting each column-profile by the corresponding marginal column-frequency and summing over the columns recovers the Average Column-Profile.
Table 3 shows that the Average Row-Profile, which is the main diagonal of the matrix of marginal column-frequencies as well as the last line in Table 1, is a weighted average of the row-profiles with weights given by the Average Column-Profile, which is the main diagonal of the matrix of marginal row-frequencies as well as the last column in Table 2. Analogously, the Average Column-Profile is a weighted average of the column-profiles with weights given by the Average Row-Profile.
Table 3. The six row-profiles of Table 1 and the weights with which they combine into the Average Row-Profile; the weights are the entries of the Average Column-Profile of Table 2.

| Row-profile | Weight |
|---|---|
| Germany, left (FR) | × 0.068945 |
| Germany, centre (SZ) | × 0.240928 |
| Germany, right (DW) | × 0.182505 |
| Greece, left (ET) | × 0.274697 |
| Greece, centre (TN) | × 0.120494 |
| Greece, right (KA) | × 0.112431 |
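A quick numerical check of this barycentric relation, using the same kind of toy table as above (an illustration only, not the paper's data):

```python
# Check: the Average Row-Profile equals the weighted average of the row-profiles,
# with weights given by the marginal row-frequencies (toy data).
import numpy as np

counts_t = np.array([[12.,  0.,  3.,  5.,  1.],
                     [ 4.,  8.,  0.,  2.,  6.],
                     [ 7.,  1.,  9.,  0.,  2.]])

P_t = counts_t / counts_t.sum()
r_t = P_t.sum(axis=1)                     # row masses (Average Column-Profile)
c_t = P_t.sum(axis=0)                     # column masses (Average Row-Profile)
row_profiles = P_t / r_t[:, None]

assert np.allclose(r_t @ row_profiles, c_t)   # weighted average of row-profiles
```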
4.1 Extracting factors and loadings
Correspondence analysis is based on the spectral decomposition of the rescaled and centred frequencies. The matrix of frequencies is rescaled on both sides, dividing rows and columns by the square roots of the corresponding marginal frequencies, and the (time-varying) sample covariance matrix of the rescaled data is then formed. It is possible to show (see Saporta, 2006) that the largest eigenvalue of this covariance matrix is identically equal to one (identically meaning for all t), because of the constraint that the frequencies sum up to one. The same constraint implies that the corresponding factor is trivial (constant), and therefore the first principal component is useless for the interpretation of the results.2 For this reason, practitioners perform CA on the (first rescaled and then) centred data matrix, computed by subtracting from the frequencies the outer product of the marginal row- and column-frequencies before rescaling, cf. Section 5.1. Consider the spectral decomposition of the resulting matrix, with eigenvectors collected as columns of an orthonormal matrix and eigenvalues collected in decreasing order in a diagonal matrix. In our application $P = 6$ and $N = 525$, and therefore the rank of this matrix is at most $P - 1 = 5$ for all t. In order to choose the number of latent factors to retain, we calculated the time-varying trace ratio, that is, the sum of the leading eigenvalues divided by the sum of all eigenvalues, which measures the amount of variance that is captured by the first axes. In our application, the first two eigenvalues can explain more than 70% of the total variability in the data, and thus we retain two axes and focus on two factors: the first and the second.
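The per-week eigen-analysis and the trace-ratio criterion can be sketched as follows (toy data; the sketch diagonalizes the small problem on the sub-discourse side, whose non-zero eigenvalues coincide with those of the covariance matrix discussed above):

```python
# One time slice of the ECA eigen-analysis and the trace-ratio criterion (toy data).
import numpy as np

counts_t = np.array([[12.,  0.,  3.,  5.,  1.],
                     [ 4.,  8.,  0.,  2.,  6.],
                     [ 7.,  1.,  9.,  0.,  2.],
                     [ 0.,  5.,  2.,  4.,  3.]])

P_t = counts_t / counts_t.sum()
r = P_t.sum(axis=1)
c = P_t.sum(axis=0)

# Centring with the outer product of the margins removes the trivial unit eigenvalue.
Z = np.diag(r ** -0.5) @ (P_t - np.outer(r, c)) @ np.diag(c ** -0.5)

eigvals = np.linalg.eigvalsh(Z @ Z.T)[::-1]      # small P x P eigen-problem
eigvals = np.clip(eigvals, 0.0, None)            # guard against tiny negatives

k = 2
trace_ratio = eigvals[:k].sum() / eigvals.sum()  # share of variability kept by k axes
```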
Correspondence analysis seeks to represent the interrelationships of categories of row and column variables (the sub-discourses and the concepts, respectively) on a two-dimensional map (more generally, in a lower-dimensional map). It can be thought of as trying to plot a cloud of data points (the cloud having height, width, and thickness) on a single plane, so as to give a reasonable summary of the relationships and variation within them. More concretely, loadings and factors can be represented on the same axes, see Figure 4. Moreover, for each of the P rows (the sub-discourses) and each of the N columns (the concepts), we can compute the contribution (of that row or that column) to the variability of the new (latent) axes. The estimated time-varying factors and loadings are defined in (10) and (11), respectively, for all t. It is easy to verify (see Benzécri, 1992) the formulas in (12), which explain the meaning of the so-called barycentric (or dual) relationship that exists between factors and loadings. The duality in (12) is a direct consequence of the symmetric role played by row- and column-profiles. Performing PCA on the row-profiles is equivalent to a PCA on the column-profiles: the factors of one analysis are (up to a rescaling) the loadings of the other analysis, and therefore they can be represented on the same factorial space (see Bouroche & Saporta, 1980, pp. 93–94). In Section 6.3, we apply this joint representation to our data and interpret the results.
For all $j$, the time-varying contributions of the sets of observations and of the concepts to the variability of the jth principal component (the jth new axis) are defined in (13). In particular, for the trivial axis:
- the contribution of the pth set of observations to the variability of the trivial factor is the pth row profile, and
- the contribution of the nth concept to the variability of the trivial loading is the nth column profile.
It is possible that a point that is close to the origin has a stronger contribution than a point that is far from the origin. Indeed:
- The strength of the contribution of the pth set of observations to the variability of the jth axis depends on both its mass and its squared coordinate on that axis. Evaluating its position on the jth axis will thus not be sufficient to determine its contribution.
- The strength of the contribution of the nth concept to the variability of the jth axis depends on both its mass and its squared coordinate on that axis. Evaluating its position on the jth axis will thus not be sufficient to determine its contribution.
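In standard CA these contributions are computed as mass times squared principal coordinate, divided by the inertia of the axis; the sketch below applies those textbook formulas to a toy table as a stand-in for the time-varying version in (13).

```python
# Standard CA contributions of rows (sub-discourses) and columns (concepts)
# to the inertia of one axis, on a toy table.
import numpy as np

counts_t = np.array([[12.,  0.,  3.,  5.,  1.],
                     [ 4.,  8.,  0.,  2.,  6.],
                     [ 7.,  1.,  9.,  0.,  2.]])
P_t = counts_t / counts_t.sum()
r = P_t.sum(axis=1)
c = P_t.sum(axis=0)

Z = np.diag(r ** -0.5) @ (P_t - np.outer(r, c)) @ np.diag(c ** -0.5)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)

F = np.diag(r ** -0.5) @ U * s           # principal row coordinates (factors)
G = (np.diag(c ** -0.5) @ Vt.T) * s      # principal column coordinates (loadings)

j = 0                                    # first non-trivial axis
row_contrib = r * F[:, j] ** 2 / s[j] ** 2   # sums to 1 across rows
col_contrib = c * G[:, j] ** 2 / s[j] ** 2   # sums to 1 across columns
```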
In the next section, we present two important transition formulas for the computation of the matrices of factors and loadings defined in (10) and (11), respectively. These formulas permit the transition between the space of the variables (the concepts) and the space of the observations (the sub-discourses). Equation (14) is useful whenever one of the sizes (P or N) is large compared with the other, whereas (15) allows us to obtain factors and loadings directly from row-profiles and column-profiles.
4.2 Transition formulas
For ease of notation, let us drop the time dependence and define, analogously to the matrix diagonalized in Section 4.1, its lower-dimensional counterpart, whose eigenvalues are likewise collected in decreasing order in a diagonal matrix. Analogously to Principal Components Analysis (cf. Härdle & Simar, 2015), it is possible to prove that the factors and loadings in (10) and (11) can be obtained, via (14), from the matrix whose columns are the eigenvectors corresponding to the largest eigenvalues, collected in decreasing order in the associated diagonal matrix. In our case $P \ll N$ and, therefore, it is computationally faster to diagonalize the $P \times P$ matrix (rather than the $N \times N$ one) and then obtain factors and loadings using (14). In Proposition 4 of Appendix B of the online supplementary material, we show that three matrices (the covariance of the rescaled frequencies and the products between row-profiles and column-profiles, in either order) share the same eigenvalues, and that the orthonormal eigenvectors of the covariance matrix can be obtained, via (15), from the unit-length eigenvectors of the two products. Both factors and loadings in (10) and (11) depend on the eigenvectors of the covariance matrix. Equation (15) can therefore be useful to obtain these eigenvectors from the eigenvectors of either the row-profiles or the column-profiles. The result in (15) represents a novel contribution to CA: it permits obtaining loadings and factors from row-profiles and column-profiles rather than from the matrix of frequencies. Indeed, from row- and column-profiles we can obtain the required eigenvectors through (15); then, since the three matrices share the same eigenvalues, we can find the factors and the loadings via (10) and (11), respectively.
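The computational point, that the small eigen-problem suffices when P is much smaller than N, can be illustrated with a generic SVD-duality argument; the sketch below is a stand-in for, not a reproduction of, the transition formulas (14)–(15).

```python
# Recovering the N-dimensional eigenvectors from the small P x P eigen-problem.
import numpy as np

rng = np.random.default_rng(1)
P, N = 6, 525
Z = rng.standard_normal((P, N))      # stands in for the rescaled, centred matrix

vals, U = np.linalg.eigh(Z @ Z.T)    # small P x P problem
order = np.argsort(vals)[::-1]
vals, U = vals[order], U[:, order]

# Eigenvectors of the large N x N matrix Z.T @ Z, obtained without
# diagonalizing it directly:
V = Z.T @ U / np.sqrt(vals)

big_vals = np.linalg.eigvalsh(Z.T @ Z)[::-1][:P]
assert np.allclose(big_vals, vals)                    # shared non-zero eigenvalues
assert np.allclose(V.T @ V, np.eye(P), atol=1e-8)     # orthonormal columns
```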
4.3 Smoothing the loadings
In principle, there are two approaches to smoothing the loadings: the first is the one we followed in Motta and Baden (2013), the second is the one presented in this paper.
In the first approach, given our observations, we compute the cross-products and smooth them to obtain the time-varying covariance matrix. We then obtain the latent matrices (loadings and factors) from the smoothed covariance matrix. Here the smoothing is applied in the first step, to the cross-products; that is:
we first compute and smooth the matrix of cross-products, and
we then derive the latent matrices from the smoothed covariance matrix.
In the second approach, given our observations, we compute the cross-products (without smoothing) to obtain the time-varying raw covariance matrix defined in (8). We then obtain the latent quantities (loadings and factors) from this raw (or unsmoothed) covariance matrix, and finally we smooth the latent quantities. Here the smoothing is not applied to the cross-products; that is:
we first derive the latent matrices from the unsmoothed cross-products, and then
we smooth loadings and factors.
It is well known that the unit-norm eigenvectors of a time-invariant matrix are unique up to a ± sign. In the case of time-invariant matrices, this indeterminacy can be resolved by fixing the first row of the eigenvector matrix to be positive (see Lawley & Maxwell, 1971, page 18). However, when dealing with matrix-valued functions, following this convention leads to unsmooth eigenvector functions. A matrix-valued function is a path of matrices whose entries depend on a real variable (see, e.g. Bunse-Gerstner et al., 1991). If the matrix-valued function is smooth in t, there exist a smooth orthonormal matrix function and a smooth diagonal matrix function providing its eigendecomposition for all t (see Dieci & Eirola, 1999, Proposition 2.4). Nevertheless, although mathematically such a smooth eigendecomposition does exist, the identification up to sign renders the eigenvectors computationally unsmooth. That is, the eigenvectors obtained from a conventional eigendecomposition computed at each t are generally not differentiable in t. In Appendix A.2 of the online supplementary material, we illustrate this phenomenon through a numerical example.
In order to remove the discontinuities over time of the time-varying loadings, we use the criterion in (16): for a given time point, we change the sign of the jth vector of raw loadings if it is closer to the negative of the loadings at the preceding time point than to those loadings themselves.
For more details about the estimator in (16), we refer the reader to Motta et al. (2023). When using (16) to pick 'the right sign' of the loadings at a certain point in time, we also need to change the sign of the corresponding value of the factors at that point in time, since at each time point the relationship between factors and loadings must hold.
We apply the criterion in (16) to the 'raw loadings' rather than to the smoothed loadings defined in (17) below, since we are then more likely to detect abrupt changes (smoothing reduces the size of the jumps at the points of discontinuity). To smooth the loadings, we follow two steps: in step (a) we make sure to avoid spurious discontinuities that are only due to the up-to-sign misidentification of eigenvectors, and in step (b) we smooth the loadings in a way that accounts only for present and past values of the raw loadings; a sketch of both steps follows below.
- (a) For each time point, we use the criterion in (16) to 'pick the right sign' of the jth vector of raw loadings, and we change the sign of the corresponding value of the factor. We do not need to flip all the factors at once: we do it for each j separately.
- (b) Define the smoothed loading matrix at time t as the one-sided weighted average in (17) of the present and the m − 1 most recent past values of the raw loadings, where m is the length of our one-sided window and the weights are decreasing in ℓ, with the denominator chosen so that they sum to one.
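The two-step treatment of the raw loadings can be sketched as follows; the sign rule (flip when the inner product with the previous loading vector is negative) and the linearly decreasing one-sided weights are plausible stand-ins for (16) and (17), not the exact formulas of the paper.

```python
# Sketch of the two-step treatment of the raw loadings: sign alignment, then
# one-sided smoothing with decreasing weights.
import numpy as np

def align_signs(raw_loadings):
    """raw_loadings: (T, N) array; flip the sign at t when the vector is closer
    to the negative of its aligned predecessor (negative inner product)."""
    aligned = raw_loadings.copy()
    signs = np.ones(len(aligned))
    for t in range(1, len(aligned)):
        if np.dot(aligned[t], aligned[t - 1]) < 0:
            aligned[t] = -aligned[t]
            signs[t] = -1.0          # the factor value at t must be flipped too
    return aligned, signs

def smooth_one_sided(aligned, m):
    """One-sided weighted moving average over the present and m - 1 past values."""
    T, N = aligned.shape
    w = np.arange(m, 0, -1, dtype=float)      # linearly decreasing weights
    w /= w.sum()
    out = np.empty_like(aligned)
    for t in range(T):
        lo = max(0, t - m + 1)
        block = aligned[lo:t + 1][::-1]       # present first, then the past
        ww = w[:len(block)] / w[:len(block)].sum()
        out[t] = ww @ block
    return out

rng = np.random.default_rng(2)
raw = np.cumsum(rng.standard_normal((120, 10)), axis=0)
raw /= np.linalg.norm(raw, axis=1, keepdims=True)    # unit-norm 'eigenvectors'
raw[40:80] *= -1                                     # artificial sign flips
aligned, signs = align_signs(raw)
smoothed = smooth_one_sided(aligned, m=17)
```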
The properties of the smoothed loadings in (17) depend on the smoothing parameter m. In Appendix C of the online supplementary material, we derive an asymptotic expression for the mean squared error (MSE) of the smoothed loadings.
The MSE is the sum of two terms: the first is the squared bias, whereas the second is the variance. The larger the value of m, the larger the bias and the smaller the variance (and vice versa). A small value of m allows capturing the local curvature of the loadings, at the cost of a larger variance. In non-parametric statistics, the bandwidth is a sequence that depends on the sample size T. In order to balance the trade-off between bias and variance, we select the optimal value of the smoothing parameter m as the minimizer of the MSE. This minimizer shows how the optimal value of the parameter m (adopted to smooth the loadings) relates to the overall length T of the time series.
For our application, we find that the MSE is approximately flat for m between 16 and 18 weeks, see Figure 1. We choose a window of one quadrimester of weekly observations, which lies within this range.

Figure 1. The mean squared error of the smoothed loadings as a function of the smoothing parameter m; we minimize it with respect to m to select the window length.
4.4 Smoothing the factors
In our application, we extract two factors, the first and the second, each of dimension P at every time point. After extracting the factors in (10) and adjusting their sign according to (16), we fit to each factor a time-varying VAR(1) model according to the specification in (5): equation (18) relates the demeaned factor at time t to its own lagged value through the time-varying auto-regressive matrix, where the time-varying means are defined in (19) and the time-varying VAR(1) coefficient curves in (20). For each j, the jth estimated factor is extracted using CA, its time-varying mean is estimated as in (19), and the jth demeaned estimated factor is obtained by subtracting the estimated mean from the estimated factor. For example, writing out one row of (18) gives, as in (21), the equation of a single sub-discourse, whose current value depends on the lagged values of all P sub-discourses.
For all j, we estimate jointly the P entries of the time-varying mean vector and the $P^2$ entries of the time-varying auto-regressive matrix in (18) by means of local linear regression, according to the weighted least-squares (WLS) approach introduced by Motta (2021). For a fixed j, consider the locally stationary VAR(1) in (18), written in rescaled time as in (22), with smooth parameter curves. If the largest eigenvalue of the auto-regressive matrix lies inside the unit circle for all rescaled times, the process is locally stationary and causal. Our goal is to estimate the parameter curves at a fixed rescaled time point u by WLS. We can rewrite (22) in regression form, see (23), stack the intercept and the lagged values into a single design, see (24), and define the estimator as the minimizer of the weighted loss function in (25), where the local weights are rescaled kernel functions and the bandwidth sequence tends to zero more slowly than $1/T$. Using the local-linear approximation of the parameter curves in a neighbourhood of u, Motta (2021) proved that the local-linear minimizer of (25) has the closed form given in (26)–(27). The WLS equations (26) and (27) generalize the LS estimators of the time-invariant VAR(1) model (see Reinsel, 1997, Section 4.3.1) to the locally stationary framework. For our application, the means defined in (19) are obtained from (27) and are plotted in Figure 4. The VAR(1) curves defined in (20) and estimated according to (27) are presented in Figure 6. We emphasize that the VAR matrix in (20) is 'full' rather than diagonal: the VAR(1) coefficients of sub-discourse k in (21) explain how sub-discourse k depends on its own lagged value as well as on the lagged values of the other five sub-discourses.
We estimate the matrix in (20) by the last P columns of the matrix obtained in (26)–(27). The VAR(1) matrix depends on the bandwidth h, the smoothing parameter. Using the theory of local polynomials (Fan & Gijbels, 1996), it is possible to prove the asymptotic expression (28) for the mean squared error, in which the bias is of order $h^2$ and the variance is of order $(Th)^{-1}$. The mean squared error in (28) shows how the variance of the smoothed factors relates to the overall length T of the time series. The optimal bandwidth minimizing (28) has the usual rate $T^{-1/5}$, and the corresponding optimal rate of the Mean Squared Error in (28) is $T^{-4/5}$.
In practice the bandwidth is chosen by cross-validation. For non-parametric regression, in the case of dependent observations, cross-validation is known to be severely affected by dependence. In order to adjust for the effect of (possible) dependence on bandwidth selection, we select the smoothing parameter h by means of the ‘leave-(2ℓ +1)-out’ version of cross-validation by Chu and Marron (1991).
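The following is a hedged sketch of a kernel-weighted, local-linear least-squares fit of a time-varying VAR(1) with a time-varying intercept, in the spirit of the WLS estimator (26)–(27); the design, kernel, fixed bandwidth, and toy data are illustrative choices (in practice the bandwidth would be selected by the modified cross-validation just described).

```python
# Local-linear, kernel-weighted least-squares fit of a time-varying VAR(1).
import numpy as np

def epanechnikov(x):
    return np.where(np.abs(x) <= 1.0, 0.75 * (1.0 - x ** 2), 0.0)

def local_linear_var1(F, u0, h):
    """F: (T, P) factor path. Returns (local intercept, A(u0)) at rescaled time u0."""
    T, P = F.shape
    y = F[1:]                                  # responses F_t
    x_lag = F[:-1]                             # lagged values F_{t-1}
    u = np.arange(1, T) / T                    # rescaled time of each response
    w = epanechnikov((u - u0) / h) / h         # kernel weights

    # Design: [1, F_{t-1}] plus the same block times (u - u0) for the local-linear part.
    base = np.hstack([np.ones((T - 1, 1)), x_lag])
    design = np.hstack([base, base * (u - u0)[:, None]])

    WX = design * w[:, None]
    theta = np.linalg.lstsq(WX.T @ design, WX.T @ y, rcond=None)[0]
    intercept = theta[0]                       # local intercept of the fit
    A_u0 = theta[1:P + 1].T                    # P x P time-varying VAR(1) matrix
    return intercept, A_u0

rng = np.random.default_rng(3)
F = np.cumsum(0.1 * rng.standard_normal((400, 3)), axis=0)   # toy factor paths
intercept_mid, A_mid = local_linear_var1(F, u0=0.5, h=0.15)
```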
5 Understanding ECA as a time-varying factor model: identification and invariance
Correspondence analysis may be defined as a special case of principal components analysis (PCA) applied to the rows and columns of a table, especially a cross-tabulation. However, CA and PCA are used under different circumstances: PCA is used for tables consisting of continuous measurements, whereas CA is applied to contingency tables (i.e. cross-tabulations). In Section 5.1, we establish the connection between CA and PCA, and show that it is possible to write the rows-and-columns rescaled matrix of frequencies as a factor model, see (33). Also, the same matrix can be approximated by a factor model of lower rank, see (34). In Section 5.2, we show that in our model, changes to the observations translate into changes of the same scale in the latent factors.
5.1 Correspondence analysis as a factor model: identifying the common components
For all t, define the rescaled frequencies by dividing each frequency by the square roots of the corresponding marginal row- and column-frequencies. In this section, we show that there exists a representation of the rescaled frequencies as a factor model, with loadings given by the eigenvectors of their sample covariance matrix, factors given by the corresponding principal components, and a low-rank idiosyncratic component. Without loss of generality, in what follows we assume that $P \leq N$. Then define, for all t, the matrix whose columns are the principal components corresponding to the eigenvalues of the sample covariance matrix, as in (31). Using the 'reconstruction' formula in (32) (see Benzécri, 1992; van der Heijden & de Leeuw, 1985), we can prove the exact factor representation of the rescaled frequencies given in (33). Hence, retaining only the leading axes, we can approximate the rescaled frequencies by the truncated expansion in (34). Therefore, the eigenvectors and the principal components in (34) are, respectively, loadings and factors in model (3). If all axes are retained, we reconstruct the matrix exactly (no dimension reduction). Notice that equations (31) and (33), together with the ortho-normality of the columns of the eigenvector matrix, imply that the idiosyncratic components are orthogonal to the loadings. Equation (32) above shows that CA decomposes the departure from independence in a contingency table. Indeed, under the assumption of independence between rows and columns (sub-discourses and concepts, respectively), each frequency would equal the product of the corresponding marginal row- and column-frequencies, which would imply that the centred, rescaled matrix vanishes.
If we define the matrix of common components, of rank equal to the number of retained axes, and the matrix of idiosyncratic components, of rank one, we can rewrite (33) and (34) as the decomposition in (37), where the approximation is due to the truncation of the smallest axes (with no truncation, the decomposition is exact). The two matrices in (37) are, respectively, the common components and the idiosyncratic components in model (3). In other words, the idiosyncratic part contains the (rescaled) frequencies we would observe if rows and columns were independent, and hence the common part measures the departure from independence, or commonness, between rows and columns. The approximation in (37) shows that
- the first principal component of the rescaled frequencies, which corresponds to the unit eigenvalue, represents the idiosyncratic components, whereas
- the main principal components of the centred, rescaled frequencies, which correspond to the largest eigenvalues, represent the common components.
The approximation in (37) is thus a time-varying factor model for the rescaled frequencies, with
- idiosyncratic components of rank one and
- common components of rank equal to the number of retained axes.
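This decomposition into an 'independence' part and a low-rank 'commonness' part is easy to verify numerically; the following sketch uses a toy table and generic SVD notation rather than the paper's equations (33)–(37).

```python
# Rank-one 'independence' part plus a low-rank common part recovered from the
# leading axes of the centred, rescaled frequencies (toy table).
import numpy as np

counts_t = np.array([[12.,  0.,  3.,  5.,  1.],
                     [ 4.,  8.,  0.,  2.,  6.],
                     [ 7.,  1.,  9.,  0.,  2.],
                     [ 0.,  5.,  2.,  4.,  3.]])
P_t = counts_t / counts_t.sum()
r = P_t.sum(axis=1)
c = P_t.sum(axis=0)

indep = np.outer(r, c)                         # frequencies under independence
Z = np.diag(r ** -0.5) @ (P_t - indep) @ np.diag(c ** -0.5)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)

k = 2                                          # keep the k leading axes
Z_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]
P_approx = indep + np.diag(r ** 0.5) @ Z_k @ np.diag(c ** 0.5)

# With all axes retained, the reconstruction is exact:
Z_full = U @ np.diag(s) @ Vt
assert np.allclose(indep + np.diag(r ** 0.5) @ Z_full @ np.diag(c ** 0.5), P_t)
```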
5.2 Correspondence analysis as a factor model: invariance of the scale of the factors
The modelling of time-varying latent factors suggests that the constructs being evaluated are changing with time. The use of time-varying loadings allows both the scale and the meaning of the latent variables to change across time. In this section, we show that, although both loadings and factors are time-varying, it is still possible to distinguish whether observed changes are due to changes in the latent variables (i.e. the factors) or to changes in what is being measured (i.e. the observations). Recalling (31), it follows from the definition of the Frobenius norm and from the ortho-normality of the loadings that the norm of the (centred, rescaled) observations decomposes, as in (38), into the contribution of the factors plus that of the idiosyncratic components. When all non-trivial axes are retained, (38) becomes (39). The decomposition in (38) allows us to measure what proportion of the observed changes is due to true changes in the latent variables, and shows that the observed changes are not affected by the scale of the ruler (i.e. the scale of the loadings). Equation (39) shows that allowing for time-variation in both latent factors and loadings simultaneously does not pose identification problems. Indeed, for all t the scale of the factors is uniquely determined by the scale of the observations; that is, the observed changes are not affected by the scale of the ruler (i.e. by the scale of the loadings).
Suppose now we multiply the observations by a scalar a. Due to the orthogonality between the common and the idiosyncratic parts (cf. Section 5.1), it follows from (8) that the covariance of the rescaled observations has the same eigenvectors as before and eigenvalues multiplied by $a^2$. Hence, applying (31) to the rescaled observations, we obtain (40) for all t. Equation (40) shows that multiplying the observations by a scalar translates into the same rescaling of the factors. Using the same arguments, it is possible to prove that (40) holds for observations premultiplied by any orthogonal matrix. That is, our approach disentangles changes in constructs (the factors) from changes in measurement (the loadings).
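A generic numerical illustration of this scale-invariance property (a PCA-style check on toy data, not the paper's exact construction):

```python
# Multiplying the observations by a scalar rescales the extracted principal
# components by the same scalar (up to the usual sign indeterminacy).
import numpy as np

rng = np.random.default_rng(4)
Y = rng.standard_normal((6, 50))
Y -= Y.mean(axis=1, keepdims=True)

def principal_components(Y, k):
    vals, vecs = np.linalg.eigh(Y @ Y.T)
    idx = np.argsort(vals)[::-1][:k]
    return vecs[:, idx].T @ Y          # k principal-component scores

a = 3.7
pc = principal_components(Y, 2)
pc_scaled = principal_components(a * Y, 2)
assert np.allclose(np.abs(pc_scaled), np.abs(a * pc))
```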
6 Using ECA to understand the co-evolution and inter-dependency of linked public debates
Compared to EFA in equations (1)–(2) of Section 3, ECA offers four key analytic opportunities that add depth to 'big data' analysis and interpretation. First, the decomposition of the overall observed variance into the contributions of each sub-discourse to each latent factor represents the extent to which different sub-discourses make use of the same or different combinations of concepts to interpret the studied issue. Second, ECA enables the identification of distinct factors for each sub-discourse, which permits a direct analysis of the extent to which different sub-discourses make use of similar or different interpretations over time. Third, the time-varying location of each sub-discourse and the time-varying loadings of concepts on each factor in the same barycentric representation capture the specific content of those interpretations distinguished by each factor. Fourth, the estimated cross-lagged auto-regressive coefficients express how a sub-discourse's use of specific interpretations at a given time is related to the earlier presence of interpretations in the same sub-discourse as well as in other sub-discourses' coverage. Importantly, the analysis does not focus on any specific constructs to be measured, but rather identifies evolving yet systematic patterns of concept references that account for variation in the news coverage, indicating differential framing choices. Unlike conventional CA, ECA permits all of the above analyses to depend on time, such that the information captured by each factor is permitted to evolve, as are any associations and influences between these factors.
In the following, we will discuss each of the above points in turn, using our application to the German and Greek news debates regarding the 2009–2012 financial crisis to illustrate possible interpretations of the presented data.3
6.1 Latent factors can be interpreted as the contributions of variables and cases
Figure 2a shows the distribution of the 'Typical' (or average) concept with respect to the six sub-discourses, obtained by averaging characteristics over the concepts.4 The distribution of the typical concept for sub-discourse p is the time-varying contribution in (13), that is, the row-profile of sub-discourse p. Figure 2b,c shows the time-varying contributions in (13) to the first and the second factor, respectively.

Figure 2. Horizontal axis: time in weeks, running from 1 October 2009 to 30 June 2012. Panel a: time-varying distribution of the 'Average Concept' with respect to the six sub-discourses. Panels b, c: time-varying contributions of the six sub-discourses to Factors 1 and 2, respectively.
For each factor, the contributions add up to 1 across both columns and rows at each given point in time. Accordingly, ECA provides a fast way for measuring the (time-varying) extent to which each concept or each sub-discourse are responsible for the variability of each latent factor.
The distribution of the average concept (Figure 2a) captures the rapid evolution of concepts required to describe the constantly shifting news agenda, which is shared across sub-discourses but may be present to a time-varying extent: the German centrist and the Greek leftist sub-discourses dominate the debate until the Greek leftist sub-discourse vanishes. After the left-leaning Eleftherotypia ceases to publish following its bankruptcy in December 2011, its role is (to some extent) taken over by the centrist Ta Nea. The German debate plays a slightly larger role in defining the common news agenda, especially at the beginning and end of the observed period, with a pronounced early influence of the conservative Welt, culminating with the initial riots in Greece; after that, the situation definitions presented by the centrist Süddeutsche Zeitung take over as the most typical formulation of the common news agenda.
The first and second factors account for differences in framing that distinguish between different sub-discourses over time: As Figure 2 shows, all sub-discourses contribute to Factor 1 in relatively stable proportions, suggesting the presence of evolutionary, but reasonably enduring, differences that distinguish the observed sub-discourses. These differences, whose nature cannot yet be known from this display alone, are most prominent initially in the discourse of the German centre and the Greek left, and later, after the latter's termination, in both of the remaining Greek sub-discourses. Factor 2, by contrast, draws upon the different sub-discourses in a rapidly changing fashion, suggesting a succession of different distinctions that organize the coverage over time: The longest phase marked by a relatively consistent contribution of different sub-discourses extends from week 59 to week 73, and appears to focus on distinct framing choices within the Greek media (primarily between the centre and right), while the differences identified in this phase barely matter for the German coverage. At most times, the second factor extracts differences that distinguish primarily within the German or the Greek sub-discourses, but not both. While the factor's varying attention to either Germany or Greece primarily responds to where more pronounced, evolving patterns emerge in the public debate, it plausibly reflects the changing power relations within either domestic debate. For example, Greece's centrist sub-discourse dominates throughout most of 2011, but loses dominance when the Social Democratic government collapses in week 110 (November 2011).
To illustrate the added value of ECA versus any arbitrary contingency summary, we have computed a contingency table of the number of news articles about the eurocrisis by party family (left, centrist, right) and by week, see Figure 3. This contingency table has the same visual appearance as Figure 2, but the percentages of Factor 2 by week (Figure 2c) are markedly different from the percentages in Figure 3, especially in the 'predominantly white' period. The contributions (of the six sub-discourses) obtained through ECA provide a synthesis which is joint and multi-dimensional at the same time. It is joint because it involves all the concepts; it is multi-dimensional because those contributions are distributed over two different factors, each capturing a separate, specific aspect of the phenomenon. In contrast, in Figure 3, we select five key concepts that capture competing ways of labelling the crisis to provide an individual and uni-dimensional synthesis: As Figure 3 shows, labels foregrounding the economic damage ('Economic crisis') are far more commonly invoked in Greece than in Germany, mostly by the Greek left; the Greek centre temporarily suspends its use of the label while the Social-democratic government tries to push its austerity agenda, but re-appropriates it as the crisis deepens. By contrast, interpretations as a 'Financial crisis' are irrelevant in Greece, but persistently dominate in Germany. The focus on 'Debt' is persistently shared across both countries, while interpretations as 'Bank' or 'Euro crisis' gain and lose focus in both countries over time. For example, with the introduction of a technocratic government in Greece after week 111 (December 2011), dominant use of the 'Euro crisis' label shifts from Greece to the German discourse. We can also see that under Greece's social-democratic government, emphasis shifted from economics to debt in weeks 51–58 (October and November 2010), just at the time when its centrist sub-discourse gained dominance on Factor 2 in Figure 2. While a conventional CA could also have identified the enduring contributions made by each concept and sub-discourse, it is evident that flattening the analysis into a single phase under investigation loses much of the nuance and insight offered by ECA.

Horizontal axis: time in weeks, running from 1 October 2009 to 30 June 2012. Vertical axis: weekly percentage of news articles about the five crisis labels, split by party family. Panel a, bank crisis; Panel b, debt crisis; Panel c, economic crisis; Panel d, euro crisis; Panel e, financial crisis.
6.2 Separate sets of factors can be obtained for each case
While factors are extracted jointly from all the sub-discourses, as in EFA, ECA permits us to derive a different set of factors for each case (in our case, each sub-discourse) directly from the analysis. Unlike EFA (Motta & Baden, 2013), where all counts are summed up across cases (so we would lose the distinction between sub-discourses) to obtain a single matrix of counts per concept, with ECA we are able to 'retain' the specific information brought by the different sub-discourses by looking at a time-varying frequency table with one row per sub-discourse and one column per concept (here, 6 × 525). As a result, for each sub-discourse, we obtain time-varying means (see Figure 4a) and time-varying auto-regressive coefficients (see Figure 6). By plotting the location of each sub-discourse relative to the joint factors, as can be seen in Figure 4a, we can analyse how the respective sub-discourses align with one another over time. We now interpret the smoothed factors in (19), plotted in Figure 4a, whereas in Section 6.3 we interpret the smoothed loadings in (17), presented in panel b of the same figure.
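As a rough illustration of this idea (a simplified sketch in Python with NumPy, not the authors' estimator in (27); the Gaussian kernel, the bandwidth, and the toy dimensions are our own choices), one can smooth the weekly count tables over time and apply a CA decomposition to each smoothed table, which yields one trajectory of factor scores per sub-discourse:

```python
import numpy as np

def smooth_tables(counts, bandwidth=6.0):
    """Gaussian-kernel smoothing over time of a T x P x N array of weekly count tables.
    A simplified stand-in for local estimation; the bandwidth here is arbitrary."""
    T = counts.shape[0]
    t = np.arange(T)
    out = np.empty(counts.shape)
    for s in range(T):
        w = np.exp(-0.5 * ((t - s) / bandwidth) ** 2)
        out[s] = np.tensordot(w / w.sum(), counts, axes=(0, 0))
    return out

def ca_axes(X, k=2):
    """First k principal row and column coordinates of a CA of a single table X."""
    P = X / X.sum()
    r, c = P.sum(axis=1), P.sum(axis=0)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))          # standardized residuals
    U, d, Vt = np.linalg.svd(S, full_matrices=False)
    F = (U * d / np.sqrt(r)[:, None])[:, :k]                    # rows: sub-discourses
    G = (Vt.T * d / np.sqrt(c)[:, None])[:, :k]                 # columns: concepts
    return F, G

# counts[t, p, n]: week t, sub-discourse p, concept n (toy data in place of the 6 x 525 tables)
rng = np.random.default_rng(1)
counts = rng.poisson(3.0, size=(140, 6, 30)).astype(float)
smoothed = smooth_tables(counts)
factors = np.stack([ca_axes(smoothed[s])[0] for s in range(len(smoothed))])   # T x 6 x 2 trajectories
```

In practice, the axes obtained at neighbouring time points must additionally be aligned, since the SVD determines each axis only up to sign (and, under near-equal inertias, up to rotation); the sketch above omits this step.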

Sub-discourses and primary crisis labels on common Factors 1 and 2 (barycentric representation). The integers denote the week numbers running from 1 October 2009 to 30 June 2012. Panel a: Sub-discourses as barycentres of their uses of concepts. We are plotting the curves defined in (19) and obtained from the estimates in (27). The projection of the pth row-profile on the jth axis is obtained by a weighted average of the coordinates of the pth row-profile, with weights given by the loadings on the jth axis. Panel b: Crisis labels as barycentres of their use across sub-discourses. We are plotting five selected smoothed loadings, that is, five selected rows of the matrix in (17). The projection of the nth column-profile on the jth axis is obtained by a weighted average of the 6 coordinates of the nth column-profile, with weights given by the factors on the jth axis.
Focusing on those two factors that represent time-varying distinctions between those frames foregrounded by different media, Figure 4a displays the position of each outlet’s coverage. The plot shows that Factor 1 durably distinguishes between the coverage presented by the German outlets (black, yellow, and red lines), which are persistently located on the right side of the figure, and the Greek outlets (blue, grey, and light blue lines), which remain on the left side at all times. Within each country, all three sub-discourses remain closely aligned with one another over time on this factor. We can thus identify Factor 1 as capturing persistent, country-specific perspectives and idiosyncrasies in the coverage.
By contrast, those time-varying distinctions captured by Factor 2 reflect a wide range of different similarities and contrasts in the debate regarding different ways of interpreting the crisis (notably, as a debt, Euro, or banking crisis; the distinction between interpretations as a financial or economic crisis is, as we have seen in Figure 3, part of the country-specific differences). At the outset, before the crisis escalated, for instance, the factor contrasts the discourse of the Greek right (blue, week 1) against that of the German left (red), with the German centre and right (yellow, red) leaning toward the Greek right, the Greek left (light blue) halfway toward the German left, and the Greek centrist Ta Nea (grey) at a neutral midpoint. As the crisis develops (ca. week 20, February–March 2010), the contrast between the Greek right and the other Greek sub-discourses remains, while all three German papers find themselves in between; with the Europeanization of the crisis (starting with Ireland around week 55, October 2010), the contrast between German left and Greek right is restored, with the remaining outlets lined up in between. In the run-up to the planned (and later cancelled) Greek referendum over the controversial European rescue plan (ca. week 110), the entire German discourse as well as the Greek right contrast against the Greek left. Within Greece, shifting frames mostly reconstitute a right–left conflict over changing concerns, with the centrist sub-discourse mostly taking an intermediate position (except for summer 2010, around week 44, when the conflict is between Ta Nea and the conservative Kathimerini). In Germany, we see phases opposing both left and right against the centrist SZ, led initially by the left-leaning outlets and later, on a different conflict, by the right-leaning Welt. Toward the end of the debate, both conservative media assume a distinctive position, while the rest gather close to the centre. Factor 2 captures time-varying distinctions in the framing of the news, which adjoin and separate different sub-discourses over time.
In order to show the added value of (the evolutionary) ECA versus (the time-invariant) CA, we have computed a standard (or 'ordinary') CA and plotted the results in Figure 5. In Section 6.3 we interpret and compare ECA (Figure 4) with CA (Figure 5), and highlight how ECA captures the evolutionary paths of smoothed loadings and smoothed factors, as compared to the static nature of CA.

Joint representation of the six political Sub-discourses and five crisis-related concepts, obtained by applying a time-invariant (standard) CA. The size of a circle is the contribution to the first factor, whereas the size of an asterisk is the contribution to the second factor.
6.3 Variables and cases can be represented dynamically within the same factor space
Unlike PCA, CA enjoys a symmetry property between factors and loadings: we can represent loadings and factors jointly, that is, on the same factorial space. This symmetry is due to the important barycentric property which is shared by the loadings and factors extracted by CA. More precisely, for a given axis $j$, the factors are the barycentre of the loadings, and the loadings are the barycentre of the factors, up to a scale factor given by the square root of the $j$th eigenvalue: writing $f_{pj}(t)$ for the factor of sub-discourse $p$ and $\ell_{nj}(t)$ for the loading of concept $n$ on axis $j$ at time $t$,
$$
f_{pj}(t) = \frac{1}{\sqrt{\lambda_j(t)}} \sum_{n=1}^{N} r_{pn}(t)\, \ell_{nj}(t),
\qquad
\ell_{nj}(t) = \frac{1}{\sqrt{\lambda_j(t)}} \sum_{p=1}^{6} c_{np}(t)\, f_{pj}(t),
$$
where the weights $r_{pn}(t)$ (with $\sum_{n=1}^{N} r_{pn}(t)=1$) and $c_{np}(t)$ (with $\sum_{p=1}^{6} c_{np}(t)=1$) are the time-varying row-profiles and column-profiles defined in (7), respectively.
Geometrically, this means that we can project the row-profiles in the factorial space of the loadings, and the column-profiles in the factorial space of the sub-discourses, such that both variables (our concepts) and cases (the six sub-discourses) can be located within the same coordinate system, and their locations can be interpreted in equivalent ways. In Figure 4, we make use of this property: Below panel a, which represents the row-profiles as the barycentres of the columns (see above), panel b represents the column-profiles as the barycentres of the rows on the same two factors as above. This display permits the joint interpretation of both representations, wherein the time-varying concept loadings can be tied to the composition of the news coverage of the respective sub-discourses, and each sub-discourse can be characterized based on its alignment within the space spanned by the respective concepts. The origin of the axes represents, at any point in time, the 'average (or typical) sub-discourse' and the 'average (or typical) concept', respectively. As the semantic meaning captured by each dimension is expressed by the time-varying factor loadings of all 525 concepts, tracking how key labels and charged concepts load on the respective axes offers fast and informative access to interpreting what framing choices are expressed by each identified factor. Illustrating the time-varying factor loadings of all 525 concepts would result in a dense configuration of overlapping curves. In Figure 4b, we thus represent the time-varying trajectories of five concepts selected to illustrate five different ways of framing the crisis. Specifically, we represent the time-varying loadings of the main labels used to define the crisis that had been coded among the 525 concepts: as Euro crisis (green line), debt crisis (black), financial crisis (pink), banking crisis (red), or economic crisis (blue). Reading the figure in conjunction with the insights obtained from the location of sub-discourses on the same axes (panel a), we can see that the Greek debate persistently focuses on the economic crisis (positive loadings on Factor 1), whereas the German debate is more concerned about those aspects related to the financial and banking system, as well as the debt crisis (negative loadings). At the same time, the between-country differences expressed by Factor 1 are not fully stable, as numerous other concepts align in different constellations over time, such as references to a Euro crisis, which contribute more to the German papers' framing (positive values) throughout the second year of the crisis (dominated by the controversy over Euro-bonds) but are more associated with the Greek coverage (negative values) before and after that.
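The following self-contained sketch (Python with NumPy; the toy table and all symbols are our own, not taken from the paper) verifies the two transition relations above numerically for a single table: the row coordinates equal the profile-weighted average of the column coordinates divided by the square root of the corresponding eigenvalue, and vice versa.

```python
import numpy as np

# toy contingency table: 6 sub-discourses (rows) by 10 concepts (columns)
rng = np.random.default_rng(4)
X = rng.poisson(4.0, size=(6, 10)) + 1.0

P = X / X.sum()
r, c = P.sum(axis=1), P.sum(axis=0)                    # row and column masses
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
U, d, Vt = np.linalg.svd(S, full_matrices=False)
F = U * d / np.sqrt(r)[:, None]                        # row (sub-discourse) coordinates
G = Vt.T * d / np.sqrt(c)[:, None]                     # column (concept) coordinates

R = P / P.sum(axis=1, keepdims=True)                   # row profiles: each row sums to 1
C = (P / P.sum(axis=0, keepdims=True)).T               # column profiles: each row sums to 1

j = 0                                                  # leading (non-trivial) axis
assert np.allclose(F[:, j], (R @ G)[:, j] / d[j])      # factors = barycentre of loadings / sqrt(eigenvalue)
assert np.allclose(G[:, j], (C @ F)[:, j] / d[j])      # loadings = barycentre of factors / sqrt(eigenvalue)
```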
Regarding those distinctions in the framing of the crisis captured by Factor 2, Figure 4 shows that none of the crisis labels contribute to the distinctive framing at the outset of the crisis, before it was recognized as such and further defined. By week 20 (February–March 2010), references to the banking crisis load positively, where the sub-discourse of the Greek left was located (see above), while the other Greek sub-discourses prefer a focus on the Euro crisis, which loads negatively. By week 55, references to an economic and Euro crisis (positive loadings, associated with the Greek right) contrast against references to a banking crisis (negative loadings, associated with the German left). In week 110 (November 2011), all crisis labels are again in a position close to the centre, reflecting their secondary role in defining present framing conflicts; financial, banking, and debt crisis load slightly above zero (associated with the shared view of the German debate and the Greek right), and economic crisis slightly below, associated with the Greek left. Beyond these five labels, which of course offer only a very cursory understanding of the respective interpretations, we could further examine the loadings of the remaining 520 concepts to reconstruct how the differences in interpretation captured by each factor evolve and are associated with the coverage of different outlets over time. While we will not discuss this in detail here, the analysis of concept loadings shows that a focus on micro- versus macro-economic effects and mechanisms structures the meaning expressed on Factor 2 at most times, albeit with shifting emphases (e.g. on employment, productivity, inflation, etc.). The analysis also shows that there is no persistent right–left cleavage, either within the Greek or within the German public debate.
As explained in Section 4, ECA is performed on the time-varying matrix of counts. It is interesting to investigate the results obtained by applying an 'ordinary' (time-invariant) CA to the same data. To this end, we extract loadings and factors from the time-invariant (aggregated) matrix of counts, and plot the results in Figure 5. The figure correctly identifies the two dominant meanings of both factors: Factor 1 still distinguishes the Greek from the German public debate, while Factor 2 captures differences of perspective between the domestic political camps. However, the loss of the temporal dimension hides not only the important contribution of the German domestic debate to Factor 2 (which can be seen plainly from Figure 2), but also the important evolution of cleavages and alliances in the domestic debates: As we have shown in Figure 4, the Greek right is not consistently opposed to the Greek left and centre, as Figure 5 indicates, nor are the three German sub-discourses remotely as well aligned over time as their positioning in Figure 5 would suggest. The time-invariant view also prevents us from recognizing the close alignment between the Euro crisis narrative and the Greek right, or the very variable use of the 'Bank crisis' and 'Euro crisis' labels among German media.
6.4 For each factor, separate auto-regressive coefficients measure dynamic causality within domestic sub-discourses as well as between foreign sub-discourses
Because in ECA, unlike in EFA, separate factors can be obtained for each sub-discourse, it is also possible to separately estimate the auto-regressive coefficients representing the relative evolution of these factors over time.
Granger (1969) defined a concept of causality which, under suitable conditions, is fairly easy to deal with in the context of VAR models. Granger causality is a statistical concept derived from the notion that causes cannot occur after effects, and that if one variable is the cause of another, knowing the status of the cause at an earlier point in time can enhance prediction of the effect at a later point in time (see Lütkepohl, 2005, Section 2.3). The VAR model has been widely employed in econometric analyses (Granger & Newbold, 1986) to elucidate underlying mechanisms using Granger causality. In a VAR model, such as our equation (18), causalities and non-causalities can be determined by looking at the coefficients of the auto-regressive matrix in (20). Self-influences are measured by the diagonal entries of this matrix, whereas cross-influences are measured by its off-diagonal entries. Cross-influences should be assumed only when changes cannot be predicted on the basis of auto-regression.
For our application, we compute confidence bands for the VAR coefficients obtained by fitting to the factors a time-invariant (or standard) VAR model (see Reinsel, 1997, Section 4.3.1), and we focus on those time-varying coefficients that (i) are significantly different from zero, and (ii) take values outside the bands. With this approach, we achieve two goals: (a) we can interpret the coefficients in terms of Granger causality, and (b) we can appreciate the added value of ECA versus CA.
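A minimal sketch of this diagnostic logic follows (ours, in Python with NumPy; it does not reproduce the local estimator in (27), but contrasts a kernel-weighted local VAR(1) fit with a global fit and an approximate 95% band around the latter; the simulated series, bandwidth, and function names are illustrative assumptions):

```python
import numpy as np

def var1_ols(Y, weights=None):
    """(Weighted) OLS fit of a VAR(1) with intercept, Y_t = mu + A Y_{t-1} + e_t.
    Returns mu, A, and approximate standard errors for the stacked coefficients."""
    X = np.column_stack([np.ones(len(Y) - 1), Y[:-1]])          # regressors: [1, Y_{t-1}]
    Z = Y[1:]
    W = np.eye(len(Z)) if weights is None else np.diag(weights[1:])
    XtWX = X.T @ W @ X
    B = np.linalg.solve(XtWX, X.T @ W @ Z)                      # (k+1) x k, one column per equation
    resid = Z - X @ B
    sigma = (resid.T @ W @ resid) / (W.trace() - X.shape[1])
    se = np.sqrt(np.diag(np.kron(sigma, np.linalg.inv(XtWX)))).reshape(B.shape, order='F')
    return B[0], B[1:].T, se                                    # mean, VAR matrix A, std. errors

# toy factor series with a smoothly drifting cross-influence of series 2 on series 1
rng = np.random.default_rng(2)
T, k = 140, 2
Y = np.zeros((T, k))
for t in range(1, T):
    A_t = np.array([[0.5, 0.4 * np.sin(2 * np.pi * t / T)], [0.0, 0.3]])
    Y[t] = A_t @ Y[t - 1] + rng.normal(scale=0.3, size=k)

mu_g, A_g, se_g = var1_ols(Y)                                   # global (time-invariant) VAR(1)
band = 1.96 * se_g[1:].T                                        # ~95% band for the entries of A

grid, bw = np.arange(T), 15.0
A_local = np.empty((T, k, k))
for s in grid:                                                  # kernel-weighted local VAR(1) per week
    w = np.exp(-0.5 * ((grid - s) / bw) ** 2)
    _, A_local[s], _ = var1_ols(Y, weights=w)

outside = np.abs(A_local[:, 0, 1] - A_g[0, 1]) > band[0, 1]     # cross-influence: series 2 -> series 1
print(outside.sum(), "weeks outside the time-invariant band")
```

Weeks at which the local cross-coefficient leaves the band around the global fit are the candidates for time-varying Granger-causal influence in the sense of criteria (i) and (ii) above.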
In Figure 6, we show these evolutionary VAR coefficients (coloured curves) alongside the coefficients obtained by an 'ordinary', time-invariant model (black flat lines). These coefficients can be interpreted as the extent to which one sub-discourse's use of the framing represented by the respective factor influences the coverage of the same and other sub-discourses in the subsequent week. Disregarding the very beginning and end of the lines, which respond heavily to the initial and final states of the recorded data, Figure 6a shows a profound influence of the Greek left onto the Greek right sub-discourse on Factor 1 (the country-specific perspective; weeks 23–58, from March to November 2010), marking the formation of the Greek popular opposition discourse against the social-democratic government's rapidly escalating measures to contain the growing crisis. However, as can be seen in panel b, the emerging shared interpretation of the economic crisis does not include the left's emphasis on the role of banks and debt: At no time is there a significant influence in the same direction on Factor 2 (the specific interpretation of the crisis). Instead, the Greek left-wing sub-discourse temporarily exerts its influence upon the Greek centre's interpretation of the crisis (Factor 2, panel c), raising pressure to consult the electorate about planned measures, and culminating in a referendum proposed in week 105, immediately preceding the collapse of the government. Earlier, during the rise of the debt- and austerity-focused discourse among the Greek centre, the centrist sub-discourse had markedly distanced itself from the left, which was at that time fanning mass protests against the government (week 72, April 2011). Panel d, finally, shows that following the establishment of Papademos' technocratic government in Greece (week 111, December 2011), the sub-discourse of the Greek right exerted a persistent positive influence upon its German counterpart, reflecting efforts by the German conservative government to support Greece's efforts at managing the crisis. Beyond these exemplary influences, an analysis of all possible interactions shows a rich web of temporary, mutual or one-sided influences, which are not limited to sub-discourses in one country alone, but also illustrate close inter-dependencies between the public debates in both countries. By contrast, the time-invariant measure of influences identifies only very few enduring influences (mostly of the Greek left on Factor 1), and some auto-regressive dynamics within the same sub-discourse on Factor 2, and it misrepresents the time-changing directionality and strength of mutual inter-dependencies.

Horizontal axis: time in weeks, running from 1 October 2009 to 30 June 2012. Vertical axis: selected entries of the time-varying VAR matrices defined in (20), estimated according to (27). Panel a: first factor of Greece-right caused by the first factor of Greece-left. Panel b: second factor of Greece-right caused by the second factor of Greece-left. Panel c: second factor of Greece-centre caused by the second factor of Greece-left. Panel d: second factor of Germany-right caused by the second factor of Greece-right.
7 Conclusions
In this paper, we identify and interpret the low-dimensional latent factors underlying the evolution over time of a high-dimensional set of semantic concepts with respect to three ideologically differentiated sub-discourses (left, centre, and right) in two countries (Greece and Germany). Our objective is threefold: (i) reducing the number of correspondences involved in a contingency table, (ii) describing the evolutionary (or smoothly time-varying) structure of the relationships between the reduced variables, and (iii) allowing N to grow at least as fast as T, so as to meet the typical challenge presented by 'big data'. The novel ECA introduced in this paper meets this threefold objective.
ECA enables an analysis of the co-evolution of linked debates that goes beyond the capacities of existing methodological approaches in several critical ways. Drawing upon the evolutionary perspective adopted from EFA (Motta & Baden, 2013), ECA permits a rigorous comparative analysis of expressed meanings, despite the fact that the specific contents of the debate are constantly changing (Baumgartner et al., 2008). By distilling latent variables as the dominant factors structuring co-variation in a much higher-dimensional data process, it enables an analysis of auto-regressive and cross-lagged dependencies without the need to focus on only a few, static, and pre-selected variables. Through the joint estimation of underlying factors, it is possible to delineate which meanings are shared (the trivial factor), which distinguish between enduringly different perspectives (Factor 1; notably, in our example, the labelling as primarily an economic or a financial crisis), and which constitute specific, transient frames (Factor 2, which in our case captured a variety of temporarily salient disagreements about the interpretation of the crisis). In particular, the barycentric properties of ECA allow an analysis that locates both the changing associations between semantic concepts and the variable alignments between different sub-discourses within the same low-dimensional space. In this way, we can understand not only how different interpretations shape the co-evolution of debates at different times, but also how each sub-discourse contributes to these controversies and where it positions itself therein.
Smoothness is a key assumption in our model. The loading matrix in (3), as well as the mean vector and the VAR matrix in (4), are allowed to vary smoothly (i.e. slowly) over time. Another assumption of our approach is that the factors in (4) are regressed upon their own first lagged value only, thus excluding the possible dependence of the factors upon higher-order lagged values. Our method might therefore not be appropriate when the underlying parameters undergo transitions that are abrupt (i.e. discontinuous) over time, and/or in the case of long-memory latent factors.
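To fix ideas, the following sketch (our own illustrative parametrization in Python with NumPy; the exact specifications of (3) and (4) are given in the paper, and the functional forms, dimensions, and noise scales below are assumptions) simulates smoothly time-varying loadings together with factors following a first-order VAR with time-varying mean and coefficients:

```python
import numpy as np

rng = np.random.default_rng(3)
T, N, k = 140, 30, 2
u = np.arange(T) / T                                   # rescaled time in [0, 1)

# smoothly (slowly) time-varying parameters: illustrative functional forms only
mu = np.column_stack([np.sin(2 * np.pi * u), 0.5 * u])                          # T x k mean
A = np.stack([np.array([[0.5, 0.3 * v], [0.0, 0.4 - 0.2 * v]]) for v in u])     # T x k x k VAR matrix
Lam = np.stack([np.column_stack([np.cos(np.pi * v + np.linspace(0, 3, N)),
                                 np.sin(np.pi * v + np.linspace(0, 3, N))])
                for v in u])                                                    # T x N x k loadings

# factors: first-order VAR with time-varying mean and coefficients (schematic version of (4))
f = np.zeros((T, k))
f[0] = mu[0]
for t in range(1, T):
    f[t] = mu[t] + A[t] @ (f[t - 1] - mu[t - 1]) + rng.normal(scale=0.2, size=k)

# observations: time-varying loadings applied to the factors, plus noise (schematic version of (3))
Z = np.einsum('tnk,tk->tn', Lam, f) + rng.normal(scale=0.1, size=(T, N))
```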
Given the methodological focus of the present paper and the limitations of space, we have only sketched some illustrative analyses above, which fall far short of the full analysis enabled by ECA. The presented analyses serve to demonstrate the specific capacities added by ECA: They illustrate how ECA enables us to trace complex patterns of textual co-variation across different sub-discourses and over time to understand the time-changing focus of salient controversies; to identify the differential power relations among sub-discourses, which gain and lose dominance in their capacity to define prevalent perspectives over time; and to reconstruct the inter-dependencies between the framing choices found in different sub-discourses, reflecting the formation of temporary alliances (e.g. among the Greek left and right early into the crisis) and the exertion of political pressure (e.g. forcing the Greek government to call a referendum in response to popular protests). Our findings not only underscore the critical value of modelling ongoing, evolutionary changes in these debates, but additionally reveal a rich web of reciprocal influences between different sub-discourses, which remain hidden in conventional CA. The variable grain of analysis afforded by ECA lends itself to both further statistical analysis (e.g. adding covariates, such as econometric data) and a nuanced qualitative investigation (e.g. adding event timelines or information about editorial lines) of comparative patterns.
Extending the gaze to social scientific research more generally, there exists a large variety of phenomena that share a similar data structure to the case documented here: From intergovernmental negotiations (e.g. delegations whose members hold time-varying preferences on numerous issues), to group-dynamic interactions (e.g. social media platforms whose users interact with one another in time-changing ways), to psychological processes (e.g. individuals whose beliefs or attitudes toward many objects evolve in inter-dependent ways), there are many social processes that can be characterized by time-variant data on a large number of variables obtained in equivalent fashion from multiple sites for the purpose of comparative analysis. The same is true for applications beyond the social sciences, be they complex patterns of neuronal activation, meteorological data, market surveys, or radio signals. ECA offers an avenue for studying such data in ways that do not require researchers to assume that the underlying processes responsible for complex observations are known and stable in order to obtain low-dimensional time-series data; it does not require the collapse of diachronic changes in order to obtain cross-sectional data matrices accessible to conventional dimension-reduction techniques; nor is it limited to the case-wise analysis of complex evolutionary change, as EFA has been. Despite its focus on the macro-level dynamics of the time-varying interrelations between a finite set of latent variables, ECA maintains the link to the underlying micro-level patterns of evolving manifest data. As the procedure is entirely formal and makes no assumptions about the specific nature of the represented processes, it can easily be adapted to the analysis of other linked, high-dimensional evolutionary processes both within and beyond the social sciences. Given that such processes are in fact quite common, and that existing methodological tools are often ill-equipped to permit a comparative analysis of complex, co-evolving data, ECA offers a valuable addition to the methodological toolbox.
Acknowledgments
Christian Baden would like to thank Dimitra Dimitrakopoulou for her help with collecting and interpreting the data.
Funding
Data collection for this study has been supported by the European Union, Marie Skłodowska-Curie Grant No. 627682.
Data availability
The dataset used in this article is available as supplementary material. Source code for the reproduction of our results is available at https://github.com/giovanni-motta/.
Supplementary material
Supplementary material is available online at Journal of the Royal Statistical Society: Series A.
References
Footnotes
It should be noted that no text-driven method can directly identify frames, which are cultural objects that transcend the text (Van Gorp, 2007); rather, any frame invoked inevitably leaves traces in the text, which can be recognized as lexical indicators of referenced semantic meanings. The interpretability of inductively detected patterns as frames must be subsequently validated against available cultural meanings (Baden, 2018; Nicholls & Culpepper, 2021).
The usual CA definition subtracts out the expected values at the start so that the so-called 'trivial' factor is eliminated; the remaining dimensions are then labelled from 1 onwards.
All interpretations are exemplary and serve to illustrate those analytic opportunities enabled by ECA. We do not aim to offer a fully-developed analysis of the data. For all discussed framing choices, we did ascertain that these are indeed interpretable as semantically coherent frames.
A figure with the 'typical sub-discourse' would be difficult to interpret, as we would have to distinguish among as many different colours as there are concepts.
Author notes
Conflict of interest: None declared.