Correspondence Analysis and Spectral Map Analysis

Biplot produced by (Canonical) Correspondence Analysis, showing row-profiles and column-profiles as points in factor space, ref.: ../sarobetsu_CCA

The oldest and most popular method of factor analysis of the second type (shape or profiles only) is Correspondence Analysis (CA), developed by Jean-Paul Benzécri in 1973.
The method was first applied to lexicography and sociology. In its simplest form it is applied to contingency tables in which the row- and column-entities represent categories of two variables (for example geographical location and level of income), and where the data represent counts (number of people in the various contingencies of the categories of the two variables).

Unlike in Principal Components Analysis (PCA), this method does away with the size component (the total number of people in each category of location and the total number in each category of level of income) by dividing each element of the contingency table jointly by its corresponding row- and column-totals.

The size component is defined here by the totals by rows and the totals by columns, usually displayed in the margins of the table, and therefore also referred to as the marginal totals of the contingency table. The operation is also called joint row- and column-closure or double-closure. It results in a table of row- and column-profiles.
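As an illustration of double-closure, the following minimal sketch divides each cell of a small contingency table by its row- and column-totals (expressed as proportions of the grand total). The table values and the function name double_closure are made up for this example and do not come from the original text.

```python
import numpy as np

def double_closure(counts):
    """Divide each cell jointly by its row- and column-totals.

    Returns the double-closed table together with the row and column
    masses (marginal totals as proportions of the grand total).
    """
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()          # relative frequencies
    row_mass = p.sum(axis=1)           # marginal totals by rows
    col_mass = p.sum(axis=0)           # marginal totals by columns
    closed = p / np.outer(row_mass, col_mass)
    return closed, row_mass, col_mass

# Hypothetical counts: 3 locations (rows) by 3 income levels (columns).
table = np.array([[20, 10, 5],
                  [15, 25, 10],
                  [5, 15, 30]])
closed, row_mass, col_mass = double_closure(table)
print(closed)
```

In the closed table a value of 1 means that the cell is exactly as large as its margins would suggest; values above or below 1 show where the table departs from that. The size of the rows and columns has been removed: the weighted average of every row and of every column of the closed table equals 1.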

A peculiar feature of Correspondence Analysis is the variable weighting of row- and column-categories in proportion to their corresponding marginal totals. This gives more emphasis (or weight) to the rows and columns associated with categories that have greater importance (or larger size) in the original contingency table. The weighted eigenvectors of the double-closed table are then computed and the result is displayed in a biplot.
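The weighted factor extraction can be sketched as follows. This is one common formulation of Correspondence Analysis (singular value decomposition of the standardized residuals, with the marginal totals as weights), not necessarily the exact computation behind the biplot shown above. The function name and the example counts are assumptions made for illustration.

```python
import numpy as np

def correspondence_analysis(counts):
    """Sketch of CA: weighted SVD with marginal totals as weights."""
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()
    r = p.sum(axis=1)                  # row masses (weights)
    c = p.sum(axis=0)                  # column masses (weights)
    # Deviations from the product of the margins, scaled by the weights.
    s = (p - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    u, sv, vt = np.linalg.svd(s, full_matrices=False)
    # Principal coordinates of rows and columns for the biplot.
    row_coords = (u * sv) / np.sqrt(r)[:, None]
    col_coords = (vt.T * sv) / np.sqrt(c)[:, None]
    return row_coords, col_coords, sv ** 2

# Hypothetical counts: 3 locations (rows) by 3 income levels (columns).
table = np.array([[20, 10, 5],
                  [15, 25, 10],
                  [5, 15, 30]])
row_coords, col_coords, inertias = correspondence_analysis(table)
print(row_coords[:, :2])   # first two factors for the rows
print(col_coords[:, :2])   # first two factors for the columns
```

The squared singular values returned as inertias add up to the chi-square statistic of the table divided by the grand total, which is why the distances on such a biplot are often described as chi-square distances.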

The method has interesting statistical properties, such as distributional equivalence, which states that rows with similar profiles can be combined without affecting the representations of the columns on the biplot (and vice versa).
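Distributional equivalence can be checked numerically. The sketch below, with invented counts, computes the chi-square distances between column-profiles before and after merging two rows whose profiles are identical. It compares distances rather than biplot coordinates, because coordinates are only defined up to a change of sign of each factor; this choice, like the function name, is an assumption of the example.

```python
import numpy as np

def column_chi2_distances(counts):
    """Chi-square distances between the column-profiles of a table."""
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()
    r = p.sum(axis=1)                  # row masses
    c = p.sum(axis=0)                  # column masses
    profiles = p / c                   # each column now sums to 1
    # diff[i, j, k] = profile of column j minus profile of column k, at row i
    diff = profiles[:, :, None] - profiles[:, None, :]
    return np.sqrt((diff ** 2 / r[:, None, None]).sum(axis=0))

# Rows 0 and 1 have the same profile (row 1 is twice row 0).
original = np.array([[10, 20, 30],
                     [20, 40, 60],
                     [30, 10, 5]])
# Same table with the two equivalent rows combined into one.
merged = np.array([[30, 60, 90],
                   [30, 10, 5]])

d_original = column_chi2_distances(original)
d_merged = column_chi2_distances(merged)
print(np.allclose(d_original, d_merged))
```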

A Google search on “Correspondence Analysis” returns about 230,000 hits.

An alternative method of factor analysis of the second type (shape or profiles only) is Spectral Map Analysis (SMA) or Spectral Mapping, described by Paul J. Lewi in 1967 and generalized in 1980. It was originally developed in the context of pharmacological tests on medicinal compounds. Although it has found applications in many diverse fields, including marketing and finance, it has had little impact so far. It was rediscovered by Michael Greenacre in 2005 and re-baptized Log-Ratio Analysis (LRA).

This approach differs from Correspondence Analysis in the way the size component is removed from the data. While Correspondence Analysis applies double-closure, Spectral Mapping, alias Log-Ratio Analysis, applies logarithmic transformation and double-centering of the data. All other features of the two methods are the same, with the exception that the weights for the row- and column-entities in Spectral Mapping are not strictly limited to the marginal totals of the original table, as is the case in Correspondence Analysis. In Spectral Mapping they can be defined in any sensible way, including equal weighting of the row- and column-entities.
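The logarithmic transformation and double-centering can be sketched as below. The function name spectral_map, its parameters and the example data are hypothetical. The weighted singular value decomposition that follows the centering is modelled on the Correspondence Analysis computation and should be read as an assumption rather than as the published algorithm. Weights default to equal weighting, one of the options mentioned above.

```python
import numpy as np

def spectral_map(data, row_weights=None, col_weights=None):
    """Sketch of Spectral Mapping: log transform, weighted double-centering,
    then a weighted SVD. Weights default to equal weighting."""
    x = np.asarray(data, dtype=float)
    if np.any(x <= 0):
        raise ValueError("all data must be strictly positive")
    n_rows, n_cols = x.shape
    r = np.ones(n_rows) if row_weights is None else np.asarray(row_weights, dtype=float)
    c = np.ones(n_cols) if col_weights is None else np.asarray(col_weights, dtype=float)
    r = r / r.sum()                    # normalize weights to sum to 1
    c = c / c.sum()
    y = np.log(x)
    row_means = y @ c                  # weighted mean of each row
    col_means = r @ y                  # weighted mean of each column
    grand_mean = r @ y @ c
    centered = y - row_means[:, None] - col_means[None, :] + grand_mean
    s = np.sqrt(r)[:, None] * centered * np.sqrt(c)[None, :]
    u, sv, vt = np.linalg.svd(s, full_matrices=False)
    row_coords = (u * sv) / np.sqrt(r)[:, None]
    col_coords = (vt.T * sv) / np.sqrt(c)[:, None]
    return centered, row_coords, col_coords, sv

# Hypothetical ratio-scale data: 4 compounds (rows) by 3 tests (columns).
potency = np.array([[1.0, 2.0, 4.0],
                    [10.0, 15.0, 60.0],
                    [0.5, 3.0, 1.0],
                    [8.0, 2.0, 2.0]])
centered, row_coords, col_coords, sv = spectral_map(potency)
print(centered)
```

Multiplying a whole row (or column) of the data by a constant leaves the centered table unchanged, because the logarithm turns the constant into an additive term that the centering removes. This is how the size component disappears in this approach.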

Spectral Map Analysis can be applied to data that are either in the form of a contingency table (adding up to a meaningful grand total) or that are defined on ratio scales (yielding meaningful ratios). A drawback is that all data must be strictly positive, although random zeroes (resulting from rounding small numbers or from values below the detection limits of measurement) can often be replaced by small positive values without significantly affecting the result of the analysis.
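A minimal sketch of such a zero replacement is given below. The text does not prescribe a particular replacement value; filling with a fraction of the smallest positive entry (here one half) is an assumption made only for this example, and the function name replace_zeros is hypothetical.

```python
import numpy as np

def replace_zeros(data, fraction=0.5):
    """Replace zeros by a small positive value before taking logarithms.

    The filler is `fraction` times the smallest positive entry; this rule
    is an illustrative assumption, not a recommendation from the text.
    """
    x = np.asarray(data, dtype=float)
    if np.any(x < 0):
        raise ValueError("negative values are not allowed")
    positive = x[x > 0]
    if positive.size == 0:
        raise ValueError("the table contains no positive values")
    filler = fraction * positive.min()
    return np.where(x == 0, filler, x)

# Hypothetical measurements with two values below the detection limit.
measurements = np.array([[0.0, 2.0, 4.0],
                         [1.0, 0.0, 0.2]])
cleaned = replace_zeros(measurements)
print(cleaned)
```

It is prudent to repeat the analysis with a different replacement value and to confirm that the resulting map does not change materially.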

Greenacre has shown that the method possesses the property of subcompositional coherence, in addition to the above-mentioned property of distributional equivalence. Subcompositional coherence means that the result of Spectral Mapping/Log-Ratio Analysis is invariant to the selection of a particular subcomposition from the original table. This makes the method also suitable for compositional data (such as occur in chemical analyses of foodstuffs, archaeological artifacts, and so on). It can be intuitively understood that removal of one or more columns from the data table does not affect the ratios among elements from the remaining columns.
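The intuition can be verified with a small numerical sketch. The compositions below are invented, and the helper close_rows is a hypothetical name. One column is removed and the remaining parts are rescaled to sum to 1 again: the proportions change, but the log-ratios among the remaining columns do not.

```python
import numpy as np

def close_rows(x):
    """Rescale each row so that its parts sum to 1 (a composition)."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=1, keepdims=True)

# Hypothetical compositions: 3 samples (rows) by 4 constituents (columns).
full = close_rows([[10, 30, 40, 20],
                   [25, 25, 10, 40],
                   [40, 5, 35, 20]])

# Subcomposition: drop the last column and close the rows again.
sub = close_rows(full[:, :3])

log_ratio_full = np.log(full[:, 0] / full[:, 1])
log_ratio_sub = np.log(sub[:, 0] / sub[:, 1])

print(np.allclose(full[:, :3], sub))               # proportions differ
print(np.allclose(log_ratio_full, log_ratio_sub))  # log-ratios agree
```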

An application of Spectral Mapping/Log-Ratio Analysis is discussed in more detail further down.


Date created: December 19, 2005         Date last modified: September 6, 2006