Anne Danielsen, Mats Johansson, Chris Stover, Bins, Spans, and Tolerance: Three Theories of Microtiming Behavior, Music Theory Spectrum, Volume 45, Issue 2, Fall 2023, Pages 181–198, https://doi.org/10.1093/mts/mtad005
Abstract
This study compares three recent theories of expressive microtiming in music. While each theory was originally designed to engage a particular musical genre—Anne Danielsen’s beat bins for funk, Neo-Soul, and other contemporary Black musical expressions, Chris Stover’s beat span for “timeline musics” from Africa and the African diaspora, and Mats Johansson’s rhythmic tolerance for Scandinavian fiddle music—we consider how they can productively coexist in a shared music-analytic space, each revealing aspects of musical structure and process in mutually reinforcing ways. In order to explore these possibilities, we bring all three theories to bear on a recording of Thelonious Monk’s “Monk’s Dream,” focusing on Monk’s piano gestures as well as the relationship between saxophonist Charlie Rouse’s improvised solo and Monk’s and bassist John Ore’s accompaniments.
The beat is dense, but its ambiguity of shadings opens it up and keeps it moving.1
For a great deal of the world’s metered music—meaning, oversimply, music that unfolds cyclically, where each cycle is clarified by the presence of a series of at least somewhat agreed-upon recurring structural “beats”2—variable beat durations, “muddy” beat positions, and timings that seem to stretch nominal isochrony are the norm rather than the exception. This is as true of hip-hop, neo-soul, salsa, and many other popular music genres as it is of jazz and innumerable global folk traditions. The popularity and within-cultural pervasiveness of these genres indicate that listeners do not find such features extraordinary: they are part of the essential fabric that defines each genre’s sonic signature. Indeed, the ease with which both performers and fans handle such rhythmic phenomena, which tend to cause trouble for both traditional theories of musical rhythm and meter and for basic research into the perception and cognition of music’s temporal unfolding, is striking.
The prevalence of these expressive microtimings also calls for a thorough reconsideration of the relationship between what we might call structural and expressive properties of musical rhythm. Conventional music theory suggests a Platonic model: a virtual, unmarked original precedes and grounds expressively varied copies, which can always be referred back to as the original prototype. Similarly, in classic music psychology, there is a tendency to approach microtiming as an expressive “addendum” to a “standard” structure, as we find, for example, in Carl Seashore’s claim that “the artistic expression of feeling in music consists in esthetic deviation from the regular.”3 Expressive microtiming is thus often characterized as variations on (or in negative terms, deviations from) a fixed norm. An alternative way of understanding expressive microtiming is to eschew this model and consider, following Ingmar Bengtsson, Bruno Repp, and others, that “lengthenings and shortenings are not deviations from the norm—they are the norm.”4 Following Eric Clarke, one might say that rhythm contains both its relevant structuring patterns and the potential for the significant or expressive variation of these patterns, but inverting the process: variations come first and then are categorized as patterns.5 Models in this way of thinking are backformed from a corpus of singular expressions as acts of ethological taxonomy; that is, we generalize by observing shared behaviors across similar musical contexts and then forming “types.”6 Such a reconsideration involves turning our analytic attention more fully to the flexibility and plasticity of rhythmic events at the microlevel of musical production and perception, and striving to understand how musical processes unfold by working outward from that minute level of detail.
Many empirical details of microtiming variation in music have been examined in previous research.7 This article furthers these studies by comparing three recent theories of how microtiming variation occurs. Mats Johansson’s theory of rhythmic tolerance emerged as part of an explanatory framework for timing variations in traditional Scandinavian fiddle music and concerns the flexibility of beat and measure durations as well as the experiential status of such temporal fluctuations—that is, the extent to which they are noticed and actively engaged by insider performers and listeners.8 Anne Danielsen’s theory of beat bins was originally developed in an analysis of microrhythmic relationships in neo-soul artist D’Angelo’s “Left and Right,” and claims that the internal pulse reference we use to structure and understand beat-based musical rhythms is not a series of points in time but has temporal extension and a particular shape.9 Chris Stover’s theory of beat span strives to explain temporal malleability in an extended family of African and Afrodiasporic music practices, referred to as timeline musics, by the ways in which different virtual pulse cycles are superimposed and pull on one another, creating a stretched space (that is, a span) within which several different event locations are possible.10
Despite originating from specific musical contexts, beat bins, beat spans, and rhythmic tolerance share a fundamental theoretical premise: rhythm involves an interaction between actual sounding events and “virtual” structuring mechanisms, such as meter, pulse, subdivision, or stylistic figures (the latter encompassing both style-specific rhythmic patterns and melodic-rhythmic formulas), which the perceiver projects onto sounding events.11 The role and indeed the ontological status of such non-sounding aspects of rhythm becomes clearer in the context of Gilles Deleuze’s notion of the “virtual.” According to Deleuze, the virtual and the actual (in our scenario, the actual sounded musical events) mutually constitute one another as different manifestations of what he calls the “real.” This co-constitution results from a continuous process of differentiation whereby the virtual and actual inflect and transform one another.12 From this perspective, actual sounded events are no more “real” than virtual ones, since the latter are constantly having an effect on the former in terms not only of how we might potentially perceive them but also how they interrelate in concrete, empirical terms. The opposite is true as well: actual events have an effect on virtual ones, potentially transforming the latter. This means that virtual reference structures like meter and pulse are fully real and must be defined as part of the phenomenon at hand, but also that sounded events play a role in constituting how those virtual reference structures take shape in the first place.
The aim of this article is twofold. First, we seek to clarify similarities and differences between our three related theoretical models, each of which attempts to address the structuring mechanisms that shape the perception and production of musical rhythm. Second, we want to demonstrate how the three analytical approaches can be combined to generate a richer understanding of musical microrhythm than any one of them could alone. In the first half of this article, we briefly outline the three theories in turn. In each case, we provide some musical context that explains how the theory took shape through careful attention to particular musical phenomena in a specific musical tradition and then coalesced into a nuanced explanatory model. In the second half, we bring our three theories into dialogue on somewhat neutral turf, analyzing temporal-relational unfoldings in Thelonious Monk’s 1962 recording of “Monk’s Dream.” This recording represents a different genre from those originally engaged in our research and showcases a playful but analytically challenging stretching and bending of rhythmic values. This makes the performance well suited to assessing the extent to which the different concepts are distinctive to particular genres or musical practices, as well as whether they can be successfully “exported” to new genres and combined to yield richer explanatory potential than any one of them offers alone.
Rhythmic tolerance
The concept of rhythmic tolerance implies that there is a context-dependent flexibility in the timing of rhythmic events as well as in the framework against which it is produced and perceived. The concept was initially developed to account for the performative and interpretive flexibility that characterizes a particular style of traditional Scandinavian fiddle music known as springar (Norway) or polska (Sweden), hereafter referred to as the springar genre.
Despite the fact that this is dance music, one of springar music’s distinctive features is a striking rhythmic-temporal variability, which raises questions about the relationship between seemingly inconsistent and/or ambiguous rhythmic behavior and the overall consistency or well-formedness of rhythmic patterns as experienced by performers, dancers, and listeners. To analytically account for this seeming contradiction, rhythmic tolerance first of all concerns the flexibility of the rhythmic framework: beat and measure durations as well as subdivision ratios may vary considerably across and within tunes/performances without compromising one’s experience of flow, tempo, and groove. This also pertains to the detectability of such temporal fluctuations—that is, the extent to which they are noticed and actively engaged. A second dimension of rhythmic tolerance arises from the observation that the experienced location of rhythmic events (often expressed as a “perceptual center” or P-center13) may vary between perceivers and contexts with no single interpretation regarded as universally correct. Under certain conditions, both these dimensions also pertain to synchronization behavior in the sense that co-occurring microrhythmic interpretations and behaviors may be experienced as coherent and synchronized despite substantial discrepancies in absolute terms.
The passage shown in Example 1 and heard in Audio Example 1—featuring a four-measure segment of the Swedish polska tune “Frisells storpolska” as played by Pers Hans Olsson (1942–2020) (melody) and Björn Ståbi (1940–2020) (second voice)—clearly illustrates these different aspects of rhythmic tolerance. First, there is substantial variation in beat durations between measures. The first beat, for example, fluctuates between 347 and 635 ms (a 288-ms spread). The scale of these fluctuations is enormous, especially considering that this is dance music, and also compared to findings from experimental contexts.14 Moreover, there is evidence suggesting that the characteristic beat-timing variation featured in the springar tradition is barely noticed by expert listeners, if at all.15 The fact that expert listeners tend not to notice these fluctuations is related, in turn, to two basic premises: that duration (short/long) is conflated with accentuation (light/heavy) in how springar rhythms are conceptualized and practiced, and that beats are produced in and through—rather than in relation to—the emerging melodic-rhythmic course of events. Taken together, the implication is that beat-timing variations are not in themselves a focus of attention, rhythmic tolerance pointing to the fact that beats are passively allowed to fluctuate as much as they are intentionally shaped to particular durations.

“Frisells storpolska,” as played by Pers Hans Olsson and Björn Ståbi (melody voice only). Four-measure segment with timing data
Second, in some instances, the perceived start of the beat is strikingly ambiguous. The third beat of the second measure, “A” in Example 2, and the transition to the first beat of the following measure (“B”) are two illustrative examples. The arrows in Example 2 represent potential beat onsets: 1 and 2 are potential locations for A and 3, 4, and 5 are potential locations for B.16 Starting with A, arrow 1 represents what would typically be a pick-up note rather than a beat onset. The pick-up interpretation is supported by the melodic-rhythmic context, in which the subsequent D (arrow 2) represents the structurally logical start of the beat, which is embellished by the preceding C♯. However, by the time the D arrives, there is too little time left for the remaining events to constitute a convincing beat from a listening perspective (the C♯ occupies as much as 37% of the whole figure), which supports an on-the-beat interpretation of the C♯ “grace note.” Similarly, we might question considering the C♯ (arrow 1) as part of the preceding beat, which decidedly consists of an eighth-note pair (D–C♯) and nothing more. In the case of B, the beat onset is even more ambiguous. Here, D clearly represents the beat onset, but which D? The fiddlers are gradually sneaking or surging into the beat with an elongated ornamented gesture which also seems to extend into the D of the following measure through dynamic accentuation with the bow (arrows 4 and 5). In this scenario, arrows 3, 4, and 5 are all viable alternatives for a beat onset location. However, instead of choosing between these points in time as a single “correct” beginning point, a more convincing interpretation is that the ambiguous physical representation increases the tolerance range, implying that the beat beginning itself has a certain extension.

Measures 2 and 3 of the four-measure segment shown in Example 1 (melody voice only): a more detailed notation of the transition between measures 2 and 3, with arrows indicating possible beat onset positions
Third, the interaction between the two fiddlers (not shown in Ex. 2, but audible in Audio Ex. 1) is overall tightly synchronized. This indicates a shared conception of the melodic-rhythmic structure of the tune, as well as of the expressive means through which that structure is communicated and highlighted through bow phrasing, dynamics, and ornamentation. Such a shared conception of the music’s expressive impetus suggests, in turn, that the synchronization between the fiddlers is not achieved by attending and adjusting to the shifting beat durations as such (a seemingly impossible task), but through a shared understanding of how the tune’s expressive potential is made manifest in each given instant, a process to which these variations in beat duration are intrinsic. Finally, while there are timing discrepancies between the voices on a note-to-note level, the overall impression remains that of a coherent and cohesive performance.
In summary, the concept of rhythmic tolerance concerns three dimensions of rhythmic production and perception: (1) the flexibility of the rhythmic framework; (2) the tolerance with which rhythmic events are perceptually identified on a time axis; and (3) how rhythmic events are synchronized between performers. While initially identified in an attempt to analytically explore the temporal peculiarities of the springar genre, rhythmic tolerance may be thought to manifest in all types of music, as will be addressed below.
Beat bins
The beat bin hypothesis grew out of musical analyses of African-American groove-based music of the last decades. In neo-soul, for example, rhythmic layers are often deliberately slightly displaced in relation to each other, producing multiple locations of the same beat at the micro-level of groove.17 The beat bin theory states that in response to such “multiple” beats, the perceptual beat, or tactus, will normally take the shape of a wider “bin” that encompasses all events that articulate the beat, merging them into one compound sound. This perceptual merging still allows for individual events to have a particular timing profile in relation to the beat, though, as they may fall within the beat bin, on the rim of the bin, be early or late in relation to the beat bin, or simply so off that they relate to the previous or succeeding bin or to a different pulse reference. This pulse reference is dynamic in the sense that the shape of the beat bin can change over the course of a song.
The groove of neo-soul artist D’Angelo’s “Left and Right,” from the album Voodoo (2000), is an example of how beat bin width can be manipulated and used as an important aesthetic aspect of the overall feel. In this song, the shape of the internal reference changes from a narrow to a wide beat bin that absorbs what might have been heard as discrepant beat-related rhythmic events.
“Left and Right” starts with a syncopated guitar and percussion part that implies a clear, regular quarter-note pulse. However, when the rhythmic layer consisting of kick drum, bass guitar, and snare drum enters, it articulates an alternate surface-level beat that is considerably earlier than what we just heard, calling into question the normative status of the guitar/percussion layer and possibly inviting us to ask which is the “real” beat. If we carefully examine the composite groove, we discover that the “glitch” or discrepancy between the two rhythmic layers is considerable: approximately 55 ms on beats 1 and 3 of the basic one-measure rhythmic pattern ( meter), and approximately 80 ms on beats 2 and 4—that is, between 8% and 12% of a quarter note at the song’s tempo of 92 beats per minute. As the notation and annotated waveform in Example 3 illustrate, the glitch is particularly salient on beats 2 and 4, where the sharp attack of the guitar, which plays a syncopated sixteenth note ahead of the beat, is extremely close to the equally sharp attack of the snare drum on the beat. This can be clearly heard in Audio Example 2.

Notation (a) and waveform representation (b) of the first bar of “Left and Right” (adapted from Danielsen, Haugen, and Jensenius 2015)
Put differently, the virtual or “structural” distance is one-sixteenth note, whereas the actual distance is closer to one thirty-second note. This introduces a characteristic “tilt,” an unevenness that at first calls for an immediate adjustment in phase, but whose effect changes over the course of the song. Accordingly, there may be three different experiential phases of this groove.18 The first corresponds to the guitar-percussion introduction. The second is the transition following the entrance of the drum kit and bass layer, when the perceiver is unsettled by the new micro-rhythmic design. The third is the experience of being fully synchronized with the beat-bin nature of the groove. This is where we begin to experience the “tilt” going away and the groove settling into a more rolling feel. According to the beat bin theory, this is because the listener is now adjusting to the multiple onsets: the width and shape of each pulsation in the listener’s internal (perceptual) pulse reference extends from a narrow, point-like to a wider, more saddle-shaped beat bin, as shown in Example 4.19
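The figures reported for “Left and Right” can be checked with a little arithmetic. The following Python sketch uses only the values stated in the text (a tempo of 92 beats per minute and discrepancies of roughly 55 and 80 ms) and assumes an idealized isochronous grid. The function and variable names are illustrative, and the subtraction in the last step reflects one reading of the passage, namely that the drum layer sits early relative to the pulse implied by the guitar.

```python
def note_ms(bpm: float, fraction_of_quarter: float = 1.0) -> float:
    """Duration in ms of a note value, given as a fraction of a quarter note."""
    return 60_000.0 / bpm * fraction_of_quarter

TEMPO_BPM = 92                          # tempo reported for "Left and Right"
quarter = note_ms(TEMPO_BPM)            # roughly 652 ms
sixteenth = note_ms(TEMPO_BPM, 1 / 4)
thirty_second = note_ms(TEMPO_BPM, 1 / 8)

# Approximate discrepancies between the two rhythmic layers (from the text).
glitch_ms = {"beats 1 and 3": 55.0, "beats 2 and 4": 80.0}

for label, ms in glitch_ms.items():
    print(f"{label}: {ms:.0f} ms = {100 * ms / quarter:.1f}% of a quarter note")

# Assumption: the guitar is notated a sixteenth ahead of the beat, and the
# snare arrives about 80 ms early, so the sounding gap shrinks by that amount.
actual_gap = sixteenth - glitch_ms["beats 2 and 4"]
print(f"structural gap: {sixteenth:.0f} ms, actual gap: {actual_gap:.0f} ms, "
      f"thirty-second note: {thirty_second:.0f} ms")
```

Under these assumptions the sounding gap on beats 2 and 4 comes out within a couple of milliseconds of a thirty-second note, which is consistent with the description of the actual distance as closer to one thirty-second note than to the notated sixteenth.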

In sum, the beat bin theory suggests that the precision with which we process beats in a groove-based context varies systematically with the width and shape of actual beat-related rhythmic events as well as with the listener’s stylistic expectations. It also pertains to musical contexts where “muddy” sounds often make the exact location of beats and other temporal events unclear.20 Generally, the beat bin can be defined as the perceptual counterpart to the width and shape—that is, the acoustic features—of the sound(s) that are located at beat-related metrical positions. A beat bin can therefore be produced by the “muddy” qualities of a single sound, for example, a sound with a slow or gradual attack that therefore lacks a clear perceptual center21—or, as in “Left and Right,” by the co-presence of several temporally proximate rhythmic events that stretch the start of the beat. Perceptually, multiple onsets falling within the boundaries of the beat bin will be heard as belonging to the same virtual beat, whereas onsets falling outside these boundaries will be heard as belonging to another category––namely, that of “not part of the beat.”22 A wide beat bin, then, increases the listener’s overall tolerance for “imprecise” locations of rhythmic events, producing an openness as to where rhythmic events can take place at the microlevel while still being heard as (part of) a given beat.
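As a rough illustration of the categorical distinction described above, the sketch below models a beat bin as a simple time window around a reference beat position and sorts onsets into “part of the beat” or “not part of the beat.” The rectangular window, the `BeatBin` class, and all numeric values are assumptions made for illustration only. The theory itself describes bins with a particular shape that can change over the course of a song, which a fixed window does not capture.

```python
from dataclasses import dataclass

@dataclass
class BeatBin:
    """Hypothetical rectangular model of a beat bin (times in ms)."""
    center: float   # reference position of the beat
    early: float    # how far before the center the bin extends
    late: float     # how far after the center the bin extends

    def contains(self, onset: float) -> bool:
        return self.center - self.early <= onset <= self.center + self.late

def classify(onsets, beat_bin):
    """Label each onset as inside or outside the given bin."""
    return {
        t: "part of the beat" if beat_bin.contains(t) else "not part of the beat"
        for t in onsets
    }

# Illustrative onsets: two layers about 80 ms apart, plus an offbeat event.
onsets = [-80.0, 0.0, 163.0]

narrow = BeatBin(center=0.0, early=20.0, late=20.0)
wide = BeatBin(center=0.0, early=90.0, late=20.0)

print(classify(onsets, narrow))
print(classify(onsets, wide))
```

With the narrow bin, the early onset is heard as falling outside the beat; widening the bin absorbs it into the same beat category, while the offbeat event stays outside in both cases.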
Beat span
Beat span was originally conceptualized to describe microtiming inflections in a family of African and Afro-diasporic musics collectively referred to as “timeline musics.”23 Its premise is that the virtual temporal substrate—the grid—is always in flux. This flux is a product of multiple metric forces—specifically, co-extensive triple and quadruple subdivisions of a basic four-count metric structure—which conspire to pull played events in one temporal direction or another as a performance unfolds. The “span” of beat span refers to the temporal extendedness produced by those forces. Beat span results from interactive performative actions that continually produce next forces, the precise nature of which at any given moment partially engenders the shape the beat span will take as the music continues to unfold. Meter in this sense amounts to a process of force relations continuously shaping one another. These two concepts—“opening up” the beat as an active, enacted process, and the way this process contributes to the particular way the music moves forward from beat to beat—are extremely important for the theory.
To understand how beat span operates, we might abstract away from performed reality and posit two idealized metric strata, each of which stands in for a richly plural, fluid, improvisatory phenomenon. These two strata can be described as coextensive 12-pulse and 16-pulse cycles, or 12-cycle and 16-cycle for short. Each of these cycles can be attended to from multiple metric and “metric-like” perspectives. For example, the 12-cycle can be parsed into triple, quadruple, and sextuple metric traversals, while a four-count metric stratum as well as the non-isochronous tresillo sequence are both ways to attend to the 16-cycle’s temporal unfolding. Example 5 shows some of these traversals.

Four-count and six-count traversals of the 12-cycle; four-count and tresillo traversals of the 16-cycle
In other words, beat span assumes that two kinds of performative multistability—what David Locke calls “multidimensionality”24—are operative; namely (1) that both the virtual 12- and 16-cycles are present either overtly or as felt effects, and (2) that multiple metric “paths” are found within each cycle. It also assumes that each of these cycle nexuses can be thought of as a stretched version of the other. For example, the isochronous six-count traversal of the 12-cycle stretches into two iterations of tresillo, shown in Example 6, which is not to say that tresillo is a form of meter, but that the line between meter and not-meter is blurry.

Six-count 12-cycle traversal and tresillo “stretched” into one another
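The “stretching” between these two traversals can be made concrete by placing both on a common time line. The sketch below is a minimal illustration that assumes perfectly isochronous pulses, a cycle duration normalized to 1, and the usual 3 + 3 + 2 pulse grouping for tresillo; the helper name `onsets_in_cycle` is hypothetical.

```python
from fractions import Fraction

def onsets_in_cycle(pulse_indices, pulses_per_cycle):
    """Onset positions as exact fractions of one cycle."""
    return [Fraction(i, pulses_per_cycle) for i in pulse_indices]

# Six-count traversal of the 12-cycle: every second pulse.
six_count = onsets_in_cycle(range(0, 12, 2), 12)

# Tresillo (3 + 3 + 2 pulses) stated twice across the 16-cycle.
tresillo_twice = onsets_in_cycle([0, 3, 6, 8, 11, 14], 16)

for a, b in zip(six_count, tresillo_twice):
    earlier, later = sorted((a, b))
    print(f"limits {float(earlier):.4f} to {float(later):.4f} of the cycle "
          f"(span = {later - earlier} of a cycle)")
```

In this idealized picture, the two traversals coincide at the start and midpoint of the cycle and diverge by 1/48 or 1/24 of a cycle elsewhere. Those small windows are one way to picture the temporally extended spaces within which performed events may fall.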
It is crucial to underscore that what appears here to be metric grids are virtual phenomena to an even greater extent than in most theories of meter. They are co-present as gravitational forces that have effects on the temporal location of performed events; the latter in this account tend to fall somewhere within a small stretch of time “operationalized” by the double pull of forces. In other words, these coextensive virtual strata contribute to the staking out of an ongoing series of temporally extended spaces within which performed events occur (which the triangles in Ex. 6 suggest), even though the strata themselves are seldom clearly articulated by any particular performance layer. They might be—that is always an option within the improvisational, interactive fabric of what Meki Nzewi calls the “Ensemble Thematic Cycle”25—but they needn’t be.
The virtual, isochronous 12- and 16-cycle strata, then, are only half the story, because those performed events are also doing the work of staking out that space. This is where the theory of beat span derives its explanatory power, since it is the cleavage of some given performed gesture toward one beat-span extreme or the other, and the effect such cleavages have on other performed events, that animates what matters about beat spans. In other words, actual and virtual events are more or less equal participants in the staking-out of the music’s microtemporal unfolding, and any given event has the capacity to affect the others around it.
The Afro-Cuban rumba columbia offers a compelling illustration of all these concepts at work. As the notation below illustrates, multiple metric strata—virtual 12-count and 16-count traversals of regularly recurring cycles—impinge on and are impinged upon by played events. Example 7 shows a brief passage from a recording of “Elegía a los columbianos” by the Cuban rumba ensemble Los Muñequitos de Matanzas. The excerpt can be heard in Audio Example 3. Here we see five performed strata: a bell articulating the 12-cycle “standard pattern”26 in the top staff, a chekere also articulating the 12-cycle with an alternating physical up-down motion (each “down” landing on beats 1 and 3 in alternation) in the second staff, two drums, the segundo and salidor, in the third and fourth staves, and a repeated driving figure called catá in the bottom staff (a more improvisatory lead drum, the quinto, is not shown, nor are voice or dance strata). All of these parts interact with one another and with the virtual cyclic grids in a continually evolving creative process.

Segundo and salidor parts “stretching” between beat span limits in “Elegía a los columbianos” (Los Muñequitos de Matanzas). Triangle noteheads are “bass” tones; small hash-mark noteheads refer to pitchless “muff” tones. Transcription by Chris Stover
From a beat span perspective, there is one structural and one processual point that need to be made. First, the co-presence of the 12-cycle (played by bell and chekere) and 16-cycle (played by catá), represented by and time signatures, respectively, is very clear and easy to hear. This is a matter of the music’s underlying structure. But from a processual perspective, it’s not nearly that simple: these layers are gently, continuously pulling each other out of alignment. This can be seen in the behavior of the two drums, as the example shows. The segundo is playing a repeated figure that seems to scan closely to the virtual 16-cycle grid. But its timing is fairly consistently “stretched”: each pair of onsets that occurs within the space of beats 2 and 4 is subtly displaced from the grid, occupying a liminal position between beat-span “limits.”27 (For this reason the segundo layer could probably just as easily have been notated in with arrows pointing in the opposite direction.) Likewise the salidor: starting on beat 2 of the third measure of the example, each trio of onsets is stretched such that they nearly resemble the triplet figures shown in parentheses above the staff.
To reiterate, then, beat span refers to short spans of time within which performed events occur. These are engendered by two kinds of related forces, virtual and actual. Virtual forces include the gravitational pulls of multiple n-cycle strata (usually 12- and 16-cycle strata, though others are possible28) that may or may not be materially present in the music. Actual forces are enacted by actual played events, which stake out positions within the beat span and affect other played events in a continuous interplay.
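One simple way to describe where a played onset sits within a beat span is to express it as a proportion of the distance between the two limits. The sketch below assumes a linear scale between an idealized 16-cycle position and an idealized 12-cycle position; the function `span_position` and the sample values are hypothetical and are not measurements from any recording.

```python
def span_position(onset, limit_16, limit_12):
    """Where an onset falls between two beat-span limits.

    Returns 0.0 at the 16-cycle limit, 1.0 at the 12-cycle limit, and values
    outside 0..1 if the onset lies beyond either limit. All arguments must
    use the same unit (here, fractions of a beat).
    """
    if limit_16 == limit_12:
        raise ValueError("limits coincide; there is no span to measure")
    return (onset - limit_16) / (limit_12 - limit_16)

# Hypothetical values: the last sixteenth of a beat (3/4) versus the last
# triplet partial (2/3), with a played onset somewhere in between.
limit_16 = 0.75
limit_12 = 2 / 3
played = 0.70

print(round(span_position(played, limit_16, limit_12), 2))
```

A value near 0 or 1 indicates an onset that leans toward one stratum, while intermediate values correspond to the liminal positions described above. A single number of this kind captures only the virtual limits, not the way played events act on one another.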
Bins, spans, and tolerance compared
All three of these concepts share an emphasis on “opening up” the beat as an active, enacted process––a focus on a virtual temporal substrate that is itself always in flux, and an insistence that it is played events and virtual forces in combination that do that opening-up work. It is important, therefore, to consider in what ways they overlap conceptually but also in what ways they differ, and—most importantly—how in differing they can mutually reinforce one another: how they can be used in tandem to create a multivalent analytic approach. Beat span theorizes temporal flux as a product of multiple metric forces that together contribute to the particular way the music moves forward from beat to beat. Similarly, but with a different explanatory mechanism, the theory of rhythmic tolerance suggests that the durational variation of beats and measures is produced bottom-up from an emerging and shifting melodic-rhythmic course of events. Beat bins shift the analytic focus slightly from the production of extended beats to their perceptual implications—in terms of what might be heard as “precise” and “imprecise” locations of rhythmic events—for cultural insiders. Narrow or wide beat bins are also, however, produced by those played events; indeed, the distinction between production and perception for all three theories is deliberately opaque.
Perhaps the main difference between the three theories lies in the produced/perceived features they bring into focus. Beat bins have a temporal extension which orbits around short but not punctual beats; the purview of beat bins is the shape of those beats, which undergird all rhythmic beat-based activity and form a relatively steady, yet dynamic, grid. The beats in rhythmic tolerance are construed differently: there is no steady grid, and what we might call a beat is the entirety of the interonset interval from one particular kind of salient event to the next. In beat span, there are multiple grids, but they are pulling to and fro as each exerts a gravitational pull on the other. We suggest that the three theories do not describe fully discrete phenomena but rather foreground different aspects of phenomena that may occur across diverse music-cultural contexts.
This is an important consideration because all three concepts rely on how specific features of different music-cultural practices are perceived among practitioners and enculturated expert listeners; that is, they all take into account the sometimes highly specific stylistic competences and expectations of performers and listeners. A shared assumption, then, is that the experience of musical rhythm often reflects a particular cultural or microcultural disposition, way of musical knowing, and genre-specific sensibility. At the same time, however, all three theories suggest that these kinds of enacted microtiming processes occur to some extent across a broader spectrum of musical practices. We therefore wish to address the extent to which each of the three concepts can be used to account for phenomena outside of the contexts they were devised to explain. To this end, in the second half of this article, we bring our three theories into dialogue around a shared example to see whether each can bring a set of perspectives that enriches the next.
Comparative analysis of “Monk’s Dream”
In this second half of the article, we analyze select passages from the Thelonious Monk Quartet’s 1962 recording of “Monk’s Dream.” This recording displays a number of analytically challenging rhythmic features that make it well suited for testing to what extent and in what ways the three concepts can be brought into dialogue to reveal what we hear as some essential temporal aspects of music from a different genre from those originally engaged in our research. We engage the three theories in reverse order. First, we apply a beat span perspective to the interplay between saxophonist Charlie Rouse and Monk in Rouse’s first improvised chorus, paying special attention to several ways in which their played onsets stretch alongside and impinge on one another. Second, we analyze the early moments of the performance using a beat bin methodology, to theorize how the microtiming relationships that transpire in the first improvised chorus might have originated. This, we suggest, illustrates how (actual) played events early on give rise to (virtual) structuring forces that, in turn, affect the succeeding improvisations in a profound way. Finally, we apply the model of rhythmic tolerance to explore subtle relationships between saxophone and bass timings in a related passage from the first improvised chorus. We discuss possible interpretations of the saxophone–bass nexus, including how rhythmic tolerance relates to and/or is generated by beat bin and beat span phenomena, respectively.
Beat Span Analysis of Rouse/Monk Interaction, First Improvised Chorus
The interplay between saxophone and piano (and, to an extent, bass and drums) in Charlie Rouse’s first improvised chorus offers several opportunities to consider how beat spans are enacted in performance. As with the rumba columbia example above, co-extensive 12- and 16-cycle undercurrents are present, which stake out two different “limits” on how swung eighth notes might be produced (which Fernando Benadon describes as beat-upbeat ratios [BURs] of 2:1 and 1:1, respectively29). Example 8 shows how these BURs scan to their respective underlying n cycles.

Example 8. 2:1 and 1:1 beat-upbeat ratios (BURs) with 12- and 16-cycle substrates, respectively
These strata interact in at least two ways in this passage. The first, as Example 8 suggests, is at the “eighth-note” level, which is the basic pulse level for the jazz of this period (and the subject of many studies of microtiming in jazz30). Rouse and Monk’s eighth notes are expressed in three distinct ways as the performance unfolds: as exaggerated trochee figures that scan closely to the 2:1 BUR, 12-cycle orientation, as nominally “straight” isochronous rhythmic articulations that express the 1:1 BUR, 16-cycle orientation, and somewhere in between the two. The second is a recurring figure from the song’s melody that Monk plays at the quarter-note triplet level; we’ll return to this below.
To illustrate the three kinds of eighth-note interpretations, Example 9 shows three brief passages transcribed into staff notation with a few additional analytic annotations (discussed below). All three passages can be heard in Audio Example 4.31 Rouse’s eighth notes in Example 9(a) express the 12-cycle fairly unequivocally (thus their notational rendering as triplets), whereas Monk’s melodic interjection in the beginning of the first measure of Example 9(c) aligns very closely with eighth notes in the 16-cycle. Several of Rouse’s onsets in Example 9(b) fall somewhere between the two cycle limits, as shown by arrows beneath the staff.32

Example 9. 12-cycle, 16-cycle, and liminal expressions. Measure numbers refer to Rouse’s first improvised chorus; timestamps refer to the full song track
In order to zoom in with greater precision, some relevant BUR data are provided. In the first measure of Example 9(a), Rouse’s first two onsets are very nearly perfectly in alignment with the 12-cycle (64:36 BUR) and each next pair of onsets stretches slightly further out of alignment with that stratum, to 59:41 and then 57:43.33 In the second measure Rouse’s three eighth-note pairs are all quite congruent, each tending toward the 12-cycle (61:39, 63:37, and 61:39, respectively). With the exception of the beat-upbeat pair that spans beat 3 of measure 1, then (with a BUR of 57:43, nearly precisely halfway between isochrony and a 2:1 ratio), each pair in this passage can be heard unproblematically as a 12-cycle expression. Note that while beat 4 of m. 2 resists easy BUR analysis due to the equivocal nature of its on-beat onset (that is, it is not easy to determine where Rouse’s first note begins), it is easy enough to hear how it scans to the same 12-cycle orientation as the rest of the passage.
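The BUR arithmetic used in this paragraph can be made explicit with a minimal sketch. It assumes that a BUR is obtained from three consecutive onset times (beat, upbeat, next beat) and expressed as two percentages summing to 100; the function names and the idea of a normalized "position in the span" are illustrative assumptions, not part of the published analysis. The percentages fed into the loop are the rounded values reported above.

```python
# Illustrative sketch only: names and the normalization are assumptions.

LIMIT_16_CYCLE = 50.0             # 1:1 BUR, i.e. 50:50
LIMIT_12_CYCLE = 100.0 * 2 / 3    # 2:1 BUR, i.e. roughly 66.7:33.3


def bur(beat_onset, upbeat_onset, next_beat_onset):
    """Return (beat %, upbeat %) for one beat-upbeat pair.

    Onsets may be in any consistent time unit (e.g. milliseconds).
    """
    beat_dur = upbeat_onset - beat_onset
    upbeat_dur = next_beat_onset - upbeat_onset
    total = beat_dur + upbeat_dur
    if beat_dur <= 0 or upbeat_dur <= 0:
        raise ValueError("onsets must be strictly increasing")
    beat_pct = 100.0 * beat_dur / total
    return beat_pct, 100.0 - beat_pct


def position_in_span(beat_pct):
    """Map a BUR onto the span between the two cycle limits.

    0.0 corresponds to the 16-cycle limit (1:1),
    1.0 to the 12-cycle limit (2:1).
    """
    return (beat_pct - LIMIT_16_CYCLE) / (LIMIT_12_CYCLE - LIMIT_16_CYCLE)


# Rounded BUR percentages reported for Example 9(a), mm. 1-2.
reported_beat_pcts = [64, 59, 57, 61, 63, 61]

for pct in reported_beat_pcts:
    print(f"{pct}:{100 - pct} -> span position {position_in_span(pct):.2f}")
```

On this normalization, values near 1.0 scan to the 12-cycle, values near 0.0 to the 16-cycle, and values near 0.5 are liminal, which is how the 57:43 pair is characterized in the text.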
In Example 9(b), we see that Rouse is expressing a more elastic time-feel, not only moving more freely between beat-span limits but also with more variety in terms of its meso-rhythmic surface. The first and last annotated BURs (beats 1 and 4) are right between beat span limits; beat 2 stretches considerably if we take the sounded E4 as the second term of the BUR.34 Monk’s melodic interjection shown in the first two measures of Example 9(c) (bottom staff), then, comes out of a passage of comparatively wild abandon in the chorus’s bridge: Rouse has been repeating and varying a single rising gesture, each iteration beginning with a timbrally dense, overblown mordent around B♭3 and rising two tritones, with Monk paraphrasing the song’s melody with a skeletal version of precisely what he played during his melodic exposition 32 measures earlier. Monk responds to this rhythmically divergent passage by beginning the AABA chorus’s third “A” section with four nearly precisely isochronous, crisply articulated 16-cycle onsets (BURs of 1:1 and 52:48), representing the stratum that all participants have been assiduously avoiding thus far, except as a virtual force pulling on their 12-cycle expressions, followed by a slow triplet that scans to the 12-cycle. As indicated above, this is a key figure from the melody of “Monk’s Dream,” which we will return to in the next section. Alongside Monk’s new utterance, Rouse plays a cross-rhythmic “double tresillo” gesture, the eighth-note expression of which cleaves to the 12-cycle, creating a stunningly complex microtiming relationship with Monk’s isochronous figure.35
Rouse’s response to Monk’s new beat-span information, which can be heard in Audio Example 5, is to pull his own rhythmic articulations further still from the 12-cycle and more toward a liminal temporality (starting in m. 3, beat 2 of Ex. 9[c]). In other words, Rouse plays a series of eighth-note onsets that pull back toward (but not all the way to) 16-cycle isochrony, as if Rouse is feeling the gravitational pull of that stratum but still slightly resisting its call. Monk, meanwhile, answers his own call with a varied response that reverses polarities in a sense (Ex. 9[c], mm. 3–4). Where his first gesture expressed unequivocally the 16-cycle followed by the 12-cycle, in the varied repetition he first plays a compressed triplet variation of the first gesture, and then a stretched eighth–quarter–eighth interpretation of his initial slow triplet. In both cases, one cycle limit is swapped for the other, as shown in Example 10.

These kinds of relational processes are at play throughout Rouse’s solo. Importantly, though, all of this microtiming activity was presaged in the early moments of the performance. In the next section, we will turn to the head to illustrate some of the ways in which “Monk’s Dream’s” specific performative context was established.
Revisiting the Head: Establishing Beat Bins and Beat Span Cycles
To a much greater degree than many of his contemporaries in bebop jazz (broadly conceived), Monk focused on repetition and variation of rhythmic and melodic elements of the composition for improvisational extemporization.36 Rouse internalized his mentor’s approach, which might explain his longevity in Monk’s band. Much of the material that is repeated and revised in “Monk’s Dream” is introduced in the first statement of the head. In these opening moments, the width of beat bins, the coexistence of two beat-span cycles, and their related flexibility windows—rhythmic tolerance—are established. Starting with the beat bins, the relationship between Monk’s piano and Frankie Dunlop’s kick drum forms beats of around 50-ms extension on the downbeats, which is more than usual in this jazz tradition (usually the microtiming distance between comping instruments in a jazz combo is 20–30 ms37). Beat 1 of m. 3, which can be heard in Audio Example 6, is a typical example, and also the first wide beat in the performance. The waveform, shown in the top part of Example 11, shows the distance between the kick drum onset (upper part) and the piano note (lower part). Both hits articulate beat 1, but the piano is a little late (around 50–60 ms). This discrepancy is also visible in the Example 11 sonogram, where the kick drum (with transients forming vertical “spikes”) is followed by the main piano note (with its harmonics). The beat bin is the perceptual counterpart to these extended beats. When the sounds articulating such a beat are heard as merged, one can assume that the beat bin is at least as wide as the extended beat in the music; all individual events falling within the beat bin therefore perceptually belong to the same compound sound.

Example 11. Waveform (top) and sonogram (0–12,000 Hz, bottom) of m. 3, beat 1 (0:04) of “Monk’s Dream” (Praat v. 6.1.15). Beat width ≈ 58 ms (box highlighted in light grey). Curve in sonogram refers to intensity
Many beats in the opening measures of the performance are articulated as compound sounds in this way. The first two measures are an exception, though, as if there is a need to synchronize in a tighter fashion before loosening up. However, from m. 3 onwards the first beat of each measure, a particularly salient metric position, tends to be wide. A shared tolerance for particularly wide beats seems, then, to be a feature of this emerging context, which sets up conditions for how events relate within beat bins through the rest of the performance.
At the first beat of m. 5 we hear a similar effect, but this time the beat is even more stretched out: the attacks of the two sounds are more than 100 ms apart. Consequently, this beat sounds less like a single composite sound and more like two separate notes that together articulate a compound event. The width of the beat also exceeds what has at this point been established as the “norm,” that is, it exceeds the virtual beat bin that has been articulated thus far.
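The comparison between a beat's width and the "norm" established so far can be sketched as follows. This is a deliberately crude operationalization, not the procedure used in the analysis: it assumes that a beat's width is the distance between the earliest and latest onsets articulating it, and that the provisional norm is simply the widest beat heard up to that point. Only the 58-ms value for m. 3 comes from Example 11; the values for m. 1 and m. 5 are hypothetical placeholders chosen to be consistent with "20–30 ms" and "more than 100 ms."

```python
# Illustrative sketch only. The running-maximum "norm" is an assumption
# made for this example, not a claim about how beat bins are perceived.


def beat_width_ms(onsets_ms):
    """Width of an extended beat: latest onset minus earliest onset."""
    return max(onsets_ms) - min(onsets_ms)


def exceeds_running_norm(widths_ms):
    """For each beat, report whether it is wider than every earlier beat."""
    flags = []
    norm = None  # widest beat heard so far
    for width in widths_ms:
        flags.append(norm is not None and width > norm)
        norm = width if norm is None else max(norm, width)
    return flags


# (kick drum onset, piano onset) in ms, relative to the kick drum.
beat_one_onsets = {
    "m. 1": (0, 25),    # hypothetical: a tight beat within the usual 20-30 ms
    "m. 3": (0, 58),    # from Example 11
    "m. 5": (0, 105),   # hypothetical: "more than 100 ms"
}

widths = [beat_width_ms(pair) for pair in beat_one_onsets.values()]
for label, width, wider in zip(beat_one_onsets, widths, exceeds_running_norm(widths)):
    print(f"{label}: width {width} ms, wider than all earlier beats: {wider}")
```

A perceptual beat bin presumably adapts more gradually than a running maximum does, so the flags should be read only as a way of marking where the played asynchronies outgrow what has come before.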
Beat spans are also established during these first repetitions of the AABA head theme. Toward the end of each repetition of the A theme, a triplet-like feel is superimposed on a straight beat. The figure is presented the first time in m. 6 (0:08–0:10), but is prepared by several immediately preceding features. First, the piano figure in the middle of m. 5 puts pressure on the duple subdivision on beats 2 and 3: this is indicated by the grey vertical lines to the left in Example 12(a). Second, as noted, the width of beat 1 in both mm. 5 and 6 is considerable. These metrically salient wide beats work to increase the tolerance exactly when needed: they open up a flexibility in the positioning of rhythmic events that facilitates hearing the cross-rhythmic gesture in m. 6 as a “stretched” slow triplet (see the grey vertical lines to the right in Ex. 12[a]).
Example 12(a). Waveform and sonogram (0–12,000 Hz) of mm. 5 and 6 (0:06–0:10). Curve in sonogram refers to intensity
This slow triplet figure, together with the stuttering figure that follows (see below), makes up what we call the rhythmic “signature” of the song. It starts just before m. 6 (0:08) (see Ex. 12[a]) with two proximate sounds: the first piano note and a snare stroke. The distance between the respective onsets of these two instruments, which express the same virtual position in the metric structure, is more than 130 ms. This asynchrony stretches the virtual beat even more than the previous beat 1 (m. 5), and as mentioned above, its resulting inconclusive position is crucial for the notes that follow to be perceived as the triplets we have provisionally notated. In fact, the events inducing this triplet feel are located quite far from where one would expect an evenly spaced triplet subdivision to happen, given the virtual structure indicated by the bars played so far. It is almost as if the triplets are being warped by what was up to then a dominating duple subdivision at the eighth-note level. And vice versa: the beat locations of the time-keeping ride cymbal and snare on beats two and four are also deformed or warped by the cross-rhythmic triplets; they depart slightly from their normal positions during the beats where the triplet subdivision (fast or slow) is virtually present (see the waveform/sonogram pair in Ex. 12[a]). This is precisely what we mean above when we say that actual and virtual events mutually affect one another.
Interestingly, the phrasing of the slow cross-rhythmic triplet becomes more even and aligned with the beat when the figure is repeated in m. 14 (see waveform/sonogram pair in Ex. 12[b]). The articulation of the ornament in m. 13 that precedes the cross-rhythmic figure is also different and now consists of a series of six more or less isochronous triplets at the sixteenth-note level, albeit triplets that lay back alongside the underlying meter (Ex. 12[b]).
The second repetition of the A theme leads to the B theme, or the bridge. This section has a more traditional, straightforward swing feel in the drums. When the last statement of the A theme returns, however, the cross-rhythmic tension recommences. The repeated hints of triplets in the last phrase are further enhanced by an overall staccato articulation. The articulation is probably inspired by the stuttering fast triplets (mm. 7, 15, and 31 of the head), which always succeed the slow cross-rhythmic triplet figure; see Example 13(a). However, even though the cross-rhythmic passages become more and more explicit as the head evolves, the rhythm never loses contact with the duple layer.

As mentioned above, the slow cross-rhythmic figure followed by stuttering faster triplets ending abruptly at beat four can be said to form the signature rhythmic motif of “Monk’s Dream.” This signature riff with its fast and slow triplets is shown in Example 13(a); the development of the fast triplet figure is shown in Example 13(b). All of these can be heard in Audio Example 7. Whereas the faster triplets are directly hinting at the 12-cycle, Monk’s slow cross-rhythmic triplet figure can be described as “borrowed” from the six-count traversal of the 12-cycle.38 The beat-span potential of all these triplets is manifest immediately. As we have seen, the wide beats facilitate these kinds of ambiguous cross-rhythmic passages and their characteristic fluctuations between subdivision beat cycles of 16 and 12. Together these features prepare us for what’s to come; they unfold the bins and spans that are to be explored by the soloists later in the performance.
Mutual Tolerance: The Relationship between Saxophone and Bass Timing
In the final part of this analysis, we return to the first improvised chorus of “Monk’s Dream” (cf. the beat span analysis above), this time focusing on the flexible and changing rhythmic-temporal relationship between Rouse’s saxophone and Ore’s walking bass. Employing the concept of rhythmic tolerance, we will examine the mechanisms behind and provide possible interpretations of the observed temporal variability, which includes considering how tolerance relates to beat-bin and beat-span phenomena, respectively.
In mm. 27–28 of his first improvised chorus, Rouse plays a line of swung eighth notes over Ore’s quarter notes, the latter representing a rhythmic baseline of recurring beats. This is shown in Example 14 and can be heard in Audio Example 8. An analysis of the timing between the two lines shows significant discrepancies.39 Rouse’s first played beat is 41 ms behind Ore’s; this distance increases to around 90 ms (89, 94, 94, 90) for the following four beats. For the last two beats the distance decreases, with the saxophone, respectively, 71 and 48 ms behind the bass. While this type of rhythmic behavior is associated with laid-back timing,40 the scale and variability of the timing values raise a number of questions about their experiential status. What follows are five potential interpretations, each plausible to a greater or lesser degree, and all inviting different kinds of questions about perception, interpretation, and the relationship between rhythmic tolerance, beat bins, and beat spans.

Example 14. Rouse’s first improvised chorus, mm. 27–29 (1:28–1:30). Vertical lines on the spectrogram indicate note onsets for Rouse’s saxophone and Ore’s bass line, showing timing discrepancies between the two lines
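Before weighing the interpretations, the saxophone–bass lags reported above can be laid out numerically. The sketch below uses only the seven values given in the text; the helper `beats_beyond` and the 58-ms threshold passed to it are hypothetical, borrowed from the kick drum and piano width in Example 11 purely for illustration, since no beat-bin width is reported for this passage.

```python
# Illustrative sketch only. Lag values are those reported in the text;
# the bin-width threshold is a hypothetical assumption.
from statistics import mean

# Saxophone onset minus bass onset, in ms, for seven successive beats.
lags_ms = [41, 89, 94, 94, 90, 71, 48]

# Beat-to-beat change in lag: positive means the saxophone falls further behind.
changes_ms = [later - earlier for earlier, later in zip(lags_ms, lags_ms[1:])]

# The four middle beats, described in the text as "around 90 ms".
plateau_ms = lags_ms[1:5]


def beats_beyond(lags, bin_width_ms):
    """Flag beats whose lag is larger than a given (hypothetical) bin width."""
    return [lag > bin_width_ms for lag in lags]


print("changes between beats (ms):", changes_ms)
print("mean lag on the plateau (ms):", mean(plateau_ms))
print("overall range (ms):", max(lags_ms) - min(lags_ms))
print("beyond a 58-ms bin:", beats_beyond(lags_ms, 58))
```

Whether a lag that exceeds such a threshold is heard as falling outside the beat bin is exactly what the interpretations below leave open, so the final line is a prompt for listening rather than a result.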
(1) The two lines are experientially simultaneous and synchronized.
Under this explanation, the saxophone timing is heard as a form of rhythmic-accentual coloring within a coherent rhythmic structure. This interpretation is contingent on an experiential convergence between temporal and accentual properties in the sense that the rhythmic behavior is conceived as a manipulation of weight distribution between notes.41 In this scenario, the location of beats can fluctuate within a certain tolerance without implying that timing as such is actively or consciously manipulated. While the near-simultaneous occurrence of rhythmic events falling within the boundaries of a perceived beat implies a widened beat bin, this interpretation highlights the fact that such “width” might also be experienced as accentuation. This is reminiscent of how experienced springar performers assign weight and prominence to certain beats/notes by means of compound, asynchronous note onsets.42 Judging from the performers’ discourse, these rhythmic events are best described in terms of “large” beat onsets, which feel accentuated as a consequence of the increase in coinciding rhythmic information.
(2) The saxophone line is syncopated against the referential beat represented by the bass line.
For this alternative to be feasible, there has to be a rhythmic reference structure (played or virtual) that supports a categorical shift to off-beat articulation.43 This presumption seems rather questionable considering the overall rhythmic fabric of the segment. That is, while Rouse’s melodic line is out of sync with the referential beat, it is not aligned with subdivisions, either. Moreover, the fact that the saxophone timing is not consistent but fluctuates within and across beats further reinforces the impression of the timing residing in between metrical reference points.44 On the other hand, the “residing in between” interpretation raises the question of whether the saxophone indeed temporarily departs from the beat (as we will see in interpretations [3] and [4] below). Considering the premise that beat bins are perceptual categories that are established gradually through a process of listener engagement, as opposed to fluctuating back and forth from one beat to the next (a sudden late or early attack does not automatically imply a widened beat bin), the shift from around 40- to around 90-ms discrepancy between onsets in Example 14 might indicate that the saxophone moves outside the beat bin. However, it does not necessarily move outside the groove (as in the case of interpretation [3] below), which suggests a possible middle position where the note is simply heard as behind the beat. This implies a different type of rhythmic tolerance than in interpretation (1), in that the timing discrepancies between the two lines are heard as temporal asynchronies but are allowed to occur without compromising the rhythmic integrity and groove appeal of the segment.
(3) The saxophone timing is phase-shifted in the sense of neither being syncopated nor belonging to the beat.
This alternative suggests that the saxophone notes do not belong to the beat, which in turn implies a low tolerance for temporal discrepancies. That is, the beat bin is not wide enough to harbor the delayed attacks and there are no alternative metric categories to which they can be assigned in any musically meaningful way. It could, of course, be argued that a phase-shifted rhythmic layer makes musical sense in its own right, forming a separate auditory stream (Bregman 1990).45 However, this notion speaks against the fact that the playful stretching and bending of rhythmic values in “Monk’s Dream” contributes to, rather than contradicts, the groove. A phase-shifting interpretation denies this important point, since it suggests that two different temporal streams are unfolding concurrently, rather than conjoining to form cohesive—albeit “stretched” or “wide”—rhythmic events.
(4) The saxophone timing is best understood in terms of rubato,
meaning that it temporarily departs from the common rhythmic reference structure, only to return again by “landing” on the same beat location as the bass at the end of the segment.46 In terms of rhythmic tolerance, the implication is that such timing deviances are musically acceptable as long as they are confined to brief sections, do not disturb or become confused with the rhythmic baseline, and create a rhythmic tension that is clearly resolved by realigning with the other instruments. While this alternative may appear experientially sound in general terms, we might argue that the saxophone line lacks the characteristics of a free rhythmic interpretation. Rather, it seems stretched or pulled, whether against the other instruments or its own rhythmic pattern (cf. interpretation [5]). Another caveat with this interpretation is the implication that the durational variation of the played beats occurs due to tempo shifts.47 While this is a conventional and well-established understanding, we question whether longer/shorter beats necessarily imply slowing down/speeding up. To illustrate, Example 15 shows how the duration of the first note of each eighth-note pair fluctuates considerably (an 85-ms spread) while the length of the second note remains relatively stable (only a 27-ms spread). This could be interpreted as if the short second note were consistently timed to serve as a reference point for the flexible timing, and that the variation in total beat duration was attributable to the relative stretching of the first note. In other words, a form of rhythmic reshaping of the beat, rather than a tempo change, is taking place.

Example 15. Rouse’s first improvised chorus, mm. 27–28 (1:28–1:30). Subdivision timing of Rouse’s eighth notes
(5) In terms of overall rhythmic articulation, the saxophone line is internally consistent and prominent enough to temporarily constitute its own reference structure.
In this interpretation, the saxophone line does not so much depart from as completely disregard the common rhythmic reference structure. This alternative bears some similarity with the phase-shifting explanation in the sense that the saxophone line seems to attain the status of an autonomous rhythmic pattern operating alongside the main groove. However, whereas the phase-shifted interpretation implies that the saxophone creates a disturbance in the overall rhythmic flow by introducing rhythmic events that do not align experientially with any reference points in the rhythmic structure, in interpretation (5) there is no tension (positive or negative) between the layers of rhythmic events. In this scenario, descriptions such as “ahead of” or “behind” the beat become misleading. Instead, beats are simply longer and shorter according to the internal dynamics established by the layer they belong to, much in the same way as with the springar tunes described above.
These provisional explanations represent possible interpretations of sounded events, and differently enculturated listeners may be primed to choose distinct interpretations of the relation between Rouse’s saxophone onsets and Ore’s bass stratum in this passage. While interpretations (1) and (5) might seem foreign to a contemporary jazz musician, both of these interpretations could be relevant to springar performers, who are primed to treat timing and accentuation as converging features, and who may conceive of the melody (rather than, for example, a rhythm section) as carrying and defining the beat. Interpretations (2) and (4) would, on the other hand, most likely represent how the performance would be accounted for using traditional music-analytical approaches, since each seeks to map the solo layer against a stable reference layer. Interpretation (3), which implies that we hear the solo as phase-shifted, escapes a traditional music-theoretical account since there is no firm, primary structure to pin down. The variety of possible interpretations of this passage shows that rhythmic tolerance is not simply an empirical feature of musical events in temporal relations with one another, but emerges in the meeting of listener and sound—that is, there is no such thing as a “correct” interpretation or threshold of tolerance. The extent to which a particular interpretation gains prominence is contingent on a variety of contextual factors, such as genre affiliation and associated aspects of musical training and musical preferences.
Analytical takeaways
The analytical discussions above illustrate how ambiguities in rhythmic interpretation and meaning cannot be resolved simply by measuring the distance between rhythmic events and applying a standardized metric framework to explain the observations. Instead, experiential features of timing, accentuation, and general prominence (foreground and background features) are contextually determined along several dimensions. In fact, each of the analytic perspectives represents facets of a coherent musical experience, and, taken together, they shed light on a number of microrhythmic processes at work in “Monk’s Dream.”
First, by superimposing our analysis of the beat bins of “Monk’s Dream” onto a beat span analysis, we suggest how the wide beat bins established in the head engender a certain flexibility in terms of where and how rhythmic events occur in the ensuing performance. Our suggestion here is that the wide beat bins at the beginning of the performance are needed to “warp” rhythmic events toward the beat cycles by which beat spans are established. The resulting inconclusive beat positions facilitate the characteristic fluctuations between the 12-cycle and 16-cycle subdivisions that are explored by the soloists later in the tune. We can also speculate that the superimposition of dual reference structures, enacted by co-occurring beat-span forces during the solo, raises the threshold for when temporal fluctuations are detectable, thus increasing the rhythmic tolerance in the rest of the performance. All of this, in turn, is in line with a central idea underlying the beat bin theory, namely that the shape and extension of the perceived beat may vary over time and may be manipulated for aesthetic purposes.
Second, how this stream of musical events will be heard is contingent on a variety of perceptual and contextual factors, such as genre affiliation, musical training, and even musical preference. The extended beats characterizing the performance might be heard as beat bins with one kind of shape and extension for one listener and a different shape and extension for another. The metrical cycles underlying beat spans might be activated in a listener accustomed to such microrhythmic dynamics, but not in a listener unfamiliar with them. Under certain conditions, including those engendered by both musical aspects and listener background, temporal fluctuations may be embedded into the musical flow such that they go virtually unnoticed, while other conditions may make such fluctuations stand out and invite active engagement. As we emphasized in the Introduction above, musical experience is co-constituted through a continuous process whereby virtual reference structures and actual sounds inflect and transform one another. Perceptual and contextual factors, such as the shape of beat bins, the presence or absence of beat-span generating metric cycles, and different forms of rhythmic tolerance at play when listening to music, should thus be included in one’s analysis, as all can be part of the phenomenon at hand.
Concluding remarks
Flexible, dynamic rhythmic features are defining traits of many groove-based traditions. Each of the three concepts presented in this article—beat bins, beat spans, and rhythmic tolerance—originated in attempts to bring analytical attention to fundamental aspects of a specific constellation of musical traditions: neo-soul and hip-hop, a range of African and Afrodiasporic traditions, and Scandinavian fiddle music, respectively. Each concept sheds light on a particular aspect of microrhythmic process, as we demonstrated in our analysis of “Monk’s Dream.” All three concepts focus on how the virtual reference structure is always in flux. Beat span reveals the multiple metric forces pulling on one another to create spans within which many different event locations are possible. This resembles the first dimension of rhythmic tolerance, which concerns how the durational flexibility of beats and measures is a product of the underlying generative mechanisms that produce what are often thought of as “irregular” durational patterns. Accordingly, both beat span and rhythmic tolerance are performative phenomena: they come into shape via the unfolding of musical events. The beat bin concept and the second dimension of rhythmic tolerance focus more on the listener’s perceptual tolerance for varying or “imprecise” locations of rhythmic events. Wide beat bins also increase the tolerance for how discrete sounds can be perceived as forming a singular rhythmic event, which parallels the third dimension of rhythmic tolerance.
Through our analysis, we hope to have shown that the three concepts can also be brought into dialogue as analytical tools, enriching each concept’s analytical valence by shedding light on different, albeit related, aspects of musical microrhythm. By clarifying the relationships between bins, spans, and tolerance, we have taken a step toward developing a more comprehensive multi-perspective framework for theorizing and analyzing musical microrhythm.
Lastly, we hope we have brought attention to the plasticity of rhythmic events at the microlevel of musical production and perception, as well as to the structure and the expressive potential of these foundational microrhythmic processes. In the case of “Monk’s Dream,” there is no virtual, unmarked original that precedes and grounds this particular performance; there is no prototype to refer back to. Micro-temporal details are generated as the music unfolds, and they are always already in constant flux. This fundamental fluidity calls for a reconsideration of the relationship between what are typically thought of as structural and expressive properties of musical rhythm. An important motivation for undertaking this comparative methodological project was what we feel is a need for a critical revision of prevailing theoretical models of rhythm and meter, which have tended to treat microrhythm as an expressive “addendum” to rhythmic structure. The naturalness of microrhythmic complexity, flexibility, and ambiguity is key to the pleasure of enculturated listeners and dancers. We hope, on the one hand, that we have demonstrated that such aspects are as fundamental to constituting the identity of the “work” as those which have been regarded as “structural” in more traditional music-theoretical approaches, while on the other hand showing that this identity is neither stable nor defined ahead of time. Describing the virtual temporal substrates that underlie meaningful rhythmic experiences is itself, in this way, an active, enacted, and creative process.
This work was partially supported by the Research Council of Norway through its Centers of Excellence scheme, project number 262762, and the TIME project, grant number 249817.
Footnotes
Lydon and Mandell (1974, 65); in Keil (1994, 106).
This is an unorthodox definition, of course, but is intended to be maximally inclusive without limiting meter to arbitrary constraints like “well-formedness” or hierarchically nested strata that may be true in some cultural contexts but are not generalizable across diverse global musicking practices.
London ([2004] 2012, 179), paraphrasing Repp (1998); see also Bengtsson (1987).
This process is especially true for orally transmitted music traditions (Kvifte 2007). See Deleuze and Guattari (1987, 257 and 336) for more on ethological taxonomy.
Over the past thirty years, studies of microtiming patterns in both classical and groove-based musics have grown in number and scope. See, for example, Alén (1995); Benadon (2006); Bengtsson and Gabrielsson (1983); Bilmes (1993); Butterfield (2010); Clarke (1985, 1989); Danielsen (2006, 2012); Desain and Honing (1989); Friberg and Sundström (2002); Iyer (2002); Johansson (2010, 2017a, 2017b); Keil (1995); Kvifte (2004, 2007); Ohriner (2019); Polak (2010); Polak and London (2014); Prögler (1995); Stover (2009).
See Danielsen (2006, 46–50).
A more precise way to say this is that two complementary processes are at work: the virtualization of the actual and the actualization of the virtual (Deleuze 1994, 209). The value of this double movement for understanding musical microtiming is that actual, materially sounded phenomena and virtual, structuring forces are continuously working to refigure one another: this is part of the process that gives any singular musical utterance its (emergent) identity.
For example, those found in Clarke (1989).
See Johansson (2022).
In this context, the term “beat onset” is shorthand for the perceived start of the beat, which is to be distinguished from other uses of the term onset in this article. More precisely, beat onset is an experiential category rather than a measurable property, meaning that its temporal location is rarely the same as the physical onset of the rhythmic event with which it is associated (see also the explanation of the beat bin concept below).
This hypothesis is key for Danielsen (2010).
A similar change in actual motion was observed in an experiment testing motion responses to the pulse shape of the different sections of the song (Danielsen, Haugen, and Jensenius 2015).
Brøvig-Hanssen and Danielsen (2016, 101–15).
Danielsen (2010, 29–32).
A similar dynamic is described in Anne Danielsen’s (2015) analysis of Destiny’s Child’s “Nasty Girl.” During the song’s choruses, the 12 and 16 cycles are present as layers of programmed percussion and synth-pad patterns, whereas Beyoncé occupies the liminal position, floating from side to side in the timing “corridor” indicated by these programmed layers.
One closely related n-cycle pair is the 9- and 12-cycles that underlie jazz waltzes (the 9-cycle as a 2:1 beat-upbeat ratio [BUR] expression of swing eighth notes; the 12-cycle as “straight” sixteenth-note renderings). Other possibilities are easy to imagine, as are compositional applications where more complexly related cycle pairs (say, coextensive 7- and 8-count cycles) are put to work to create beat spans.
See, in particular, Benadon (2006) and Butterfield (2011). Butterfield’s study includes a summary of earlier analyses of swing ratios; see especially pp. 5–9.
The complete track (“Monk’s Dream,” take 8) is available on many digital platforms, including Spotify and YouTube.
While the underlying cycles might be best expressed using two different time signatures, we have chosen here and below to represent the 12-cycle with triplets, following conventional jazz notation practice, including that of all of the existing sheet music versions of “Monk’s Dream” we could find.
We have chosen to represent BURs so they sum to 100 for ease of comparison between event pairs.
Rouse “ghosts” this beat such that it is nearly impossible to tell even how many onsets he plays: three or four? This is a very common challenge when it comes to transcribing jazz saxophone solos—see Rusch, Salley, and Stover (2016).
This “double tresillo” is significant as it signals a potential diasporic connection to African timeline music practices. Christopher Washburne (1997) has suggested that at least some jazz is “in clave,” and uses examples from Thelonious Monk’s repertoire to illustrate his thesis.
See Butterfield (2010) and Prögler (1995).
See Stover (2009).
An even more extreme example of this type of temporal displacement is analyzed by Rusch, Salley, and Stover (2016). See Example 6, mm. 75–88. See also Benadon’s (2009) analysis of “time warps” in early jazz.
See, for example, Clarke (1989) and Tekman (2002).
This would be consistent with the beat span concept and its prediction that rhythmic events are being pulled out of alignment by complementary metric strata.
This is also reminiscent of Chris Stover’s (Rusch, Salley, and Stover 2016 [4.1–4.10]) analysis of a Sonny Rollins solo, in which he uses two non-aligned barlines to represent rhythmic strata that have moved out of phase with the common rhythmic reference structure.
Richard Ashley (2002) makes a similar argument in his analysis of jazz ballad performances. Ashley’s analysis focuses on the soloist’s flexible melodic rhythm over a steady underlying beat, a delay-accelerate strategy that is modulated in a number of ways but with a strong tendency for the melody to align with the accompaniment at cadential locations.
For example, see Benadon (2009).