Timing analysis of jazz solos

We begin with an in-depth analysis of onset timing in a large set of jazz recordings. As outlined above, our main goal is to show that downbeat delays have a positive effect on swing, but we first want to clarify whether, and to what extent, soloists tend to delay their downbeats with respect to the rhythm section. We evaluated data from the Weimar Jazz Database29, which contains accurately labeled transcriptions of 456 jazz solos by various artists and gives access to several quantities such as note positions or rhythmic values. We want to stress that our general analysis, which does not consider individual differences and different playing styles, can only have limited accuracy and shows a large scatter in the data. Nevertheless, it is able to reveal general trends, which is the goal of this section.

For each piece in the database29, we isolated every downbeat-offbeat pair of the solo and computed the downbeat delay and swing ratio (averaged over each solo) as a function of tempo, using the downbeats of the drums as a reference. The results presented in Fig. 1 show the existence of non-zero downbeat delays in most cases (with the exception of a few negative and a few very small delays). The data show some variation, probably reflecting individual preferences, but there is a clear trend toward decreasing delays with increasing tempo (Fig. 1a). The trend becomes nearly linear if the downbeat delays are measured in ticks (Fig. 1b). Ticks represent fractions of quarter notes (which are subdivided into 960 ticks) and are not an absolute measurement of time (see Eq. (1)). The figure demonstrates that many soloists use systematic MTDs, i.e., positive downbeat delays, which typically are of the order of 30 ms or 85 ticks for intermediate tempi of about 150 bpm. (This value in ticks corresponds to delays of about 9% of a quarter note.) While this is true for a majority of jazz soloists, it should be mentioned that a few soloists use only small downbeat delays or none at all.
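The relation between ticks and milliseconds can be illustrated with a minimal sketch. It assumes that Eq. (1), which is not reproduced in this section, has the usual form in which a quarter note lasts 60,000/tempo milliseconds and is divided into 960 ticks; the function names are hypothetical and chosen only for this illustration.

```python
TICKS_PER_QUARTER = 960  # subdivision of a quarter note stated in the text


def ticks_to_ms(ticks: float, tempo_bpm: float) -> float:
    """Convert a delay in ticks to milliseconds at a given tempo.

    Assumed relation: one quarter note lasts 60000 / tempo_bpm milliseconds.
    """
    quarter_ms = 60_000.0 / tempo_bpm
    return ticks / TICKS_PER_QUARTER * quarter_ms


def ms_to_ticks(ms: float, tempo_bpm: float) -> float:
    """Inverse conversion: milliseconds to ticks at a given tempo."""
    quarter_ms = 60_000.0 / tempo_bpm
    return ms / quarter_ms * TICKS_PER_QUARTER


if __name__ == "__main__":
    # Delay of 85 ticks at 150 bpm, expressed in milliseconds
    print(round(ticks_to_ms(85, 150), 1))
    # The same delay as a percentage of a quarter note
    print(round(100 * 85 / TICKS_PER_QUARTER, 1))
```

Under this assumption, 85 ticks at 150 bpm correspond to roughly 35 ms, or about 9% of a quarter note, which is of the same order as the values quoted above. Because a tick is a fraction of the beat, a fixed delay in ticks shrinks in milliseconds as the tempo increases.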

Fig. 1: Average downbeat delays of soloists as a function of tempo. Each point in the scatter plots corresponds to a piece of the Weimar Jazz Database29. To ease readability, the corresponding standard deviations are shown in Supplementary Fig. 1. a Average downbeat delays in milliseconds as a function of tempo in beats per minute. The red line delimits the tempo range of pieces used in our experiment (see “Methods” section) and corresponds to the fit in (b). The scattered data exhibit mostly positive delays with a generally nonlinear trend, which is nearly linear in this restricted tempo range. b Average downbeat delays expressed in ticks as a function of tempo. The red line shows a linear fit to the data.

This trend did not change when we considered jazz sub-genres (“bebop”, “swing” or “hardbop”) separately (see Supplementary Fig. 6). Of course, the magnitude of downbeat delays may vary within a solo or a whole piece, and it makes sense to also look at individual delays in their musical context. Here, however, we want to detect general trends and therefore study average quantities. It is important to note that the standard deviations of the downbeat delays are noticeably smaller than their average values (except for high tempi above 200 bpm), which means that typical downbeat delays are almost always positive (see Supplementary Fig. 1).

The swing ratio is another important parameter. Although it is not the focus of the present paper, we determine it here, as it is also relevant for swing. In particular, we realized in the course of our experimental study that it was crucial to choose a suitable swing ratio before applying systematic timing manipulations (for details see Supplementary Results 3: Serenade to a cuckoo, second experiment testing different swing ratios). The swing ratio is a measure of non-isochronous metrical subdivisions. Non-isochronous rhythmical patterns are prominent in jazz music, but are also found in some other musical cultures, e.g., in Malian jembe drumming and Uruguayan candombe drumming30,31,32. While the swing ratio has been extensively studied for drummers3,16,17,33, the swing ratio of soloists, and in particular how its optimal value varies with tempo, is still not unambiguously established34.

We determined the mean swing ratio of the soloists using the definition of Eq. (2) for each of the 456 pieces of the Weimar Jazz Database, as described in the “Methods” section. The results are shown as a function of tempo in Fig. 2. Note that the mean swing ratios of the soloists are much smaller than generally believed and also smaller than reported in early observational studies14,34 that used episodic excerpts. Assuming synchronized offbeats, such small swing ratios arise as a result of downbeat delays. In particular, the figure also demonstrates that the noted triplet feel (or ternary feel, i.e., a swing ratio of 2:1) is rather a myth as far as soloists are concerned. Most of them use swing ratios below 1.5. For fast tempi (above 160 bpm), the soloists’ swing ratio decreases with increasing tempo, similar to the trend reported for the swing ratio of drummers16,35. On the other hand, the trend is reversed for medium to slow tempi (below 160 bpm), where the soloists’ swing ratio tends to decrease with decreasing tempo. This means that drummers and soloists follow two opposing trends regarding the swing ratio in this tempo range and that the swing ratio of soloists tends to be smaller than that of the rhythm section. We also analyzed other characteristics of the recordings, such as the position of individual triplets as a function of tempo. These additional findings are included in Supplementary Results 1. After submission of our manuscript, we became aware of recent work by Corcoran and Frieler, who also analyzed the swing ratios of the solos contained in the Weimar Jazz Database. They used a different method to determine the swing ratio and obtained qualitatively similar results, apart from the increasing trend we found below 160 bpm36.
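The statement that small swing ratios follow from downbeat delays when offbeats are synchronized can be checked with a short calculation. The sketch below assumes the common definition of the swing ratio as the duration of the downbeat eighth divided by the duration of the offbeat eighth (Eq. (2) is not reproduced in this section), and it assumes that every soloist downbeat is delayed by the same number of ticks. The function names and the example values are hypothetical.

```python
TICKS_PER_QUARTER = 960  # subdivision of a quarter note stated in the text


def offbeat_position(swing_ratio: float) -> float:
    """Offbeat position in ticks after the beat for a given swing ratio."""
    return TICKS_PER_QUARTER * swing_ratio / (1.0 + swing_ratio)


def soloist_swing_ratio(rhythm_ratio: float, downbeat_delay: float) -> float:
    """Soloist swing ratio when offbeats coincide with the rhythm section.

    Assumptions: the soloist's offbeat sits exactly on the rhythm section's
    offbeat, and each soloist downbeat is late by `downbeat_delay` ticks.
    """
    offbeat = offbeat_position(rhythm_ratio)
    long_part = offbeat - downbeat_delay  # delayed downbeat -> offbeat
    short_part = TICKS_PER_QUARTER + downbeat_delay - offbeat  # offbeat -> next delayed downbeat
    return long_part / short_part


if __name__ == "__main__":
    # Hypothetical example: rhythm section in a triplet feel (2:1),
    # soloist downbeats delayed by 85 ticks.
    print(round(soloist_swing_ratio(2.0, 85), 2))
```

In this hypothetical case the soloist's swing ratio drops from 2 to about 1.37, that is, below 1.5, although the offbeats of soloist and rhythm section still coincide.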

Fig. 2: Mean swing ratios of soloists as a function of tempo. Each point corresponds to a piece of the Weimar Jazz Database29 and represents the soloist's averaged swing ratio as a function of tempo. A quadratic fit to the data (gray line) as an indicator of preferential swing ratios reveals an increasing trend as a function of tempo up to 160 bpm and a decreasing trend above 160 bpm. The swing ratio of most soloists lies below 1.5, thus is much smaller than generally believed, and does not correspond to a triplet feel (i.e., swing ratio 2:1).

Experiment investigating swing

The above empirical observations indicate that a large fraction of jazz musicians play jazz solos with downbeats slightly delayed with respect to the rhythm section. Nevertheless, the question remains whether these delays are an essential component of swing, as not all jazz musicians use them. To address this question, we adopted an operational definition of swing: the performance of a piece swings if it is judged as swinging by expert listeners. Professional and semi-professional jazz musicians can be considered expert listeners, as they are trained and experienced in creating and evaluating the swing of a performance. For the study, we used an experimental approach that we developed for a previous microtiming study on swing24. Manipulating the onset timing in MIDI recordings of piano jazz performances and letting expert jazz musicians rate the swing of the different manipulations allows us to clarify whether different kinds of microtiming have a positive effect on swing. In that previous study, we investigated the impact of random MTDs by amplifying them, deleting them, and inverting them. We showed that random MTDs, which are present in every human musical performance, did not enhance swing, which entails that these MTDs can be detrimental to swing. In the present work, we focus on the effect of systematic MTDs.

Moreover, the analysis presented in the preceding section did not show whether soloists are also delaying their offbeats. The Weimar Jazz Database only reports downbeats of drums as a reference, but does not give access to their offbeats, which precludes determining the offbeat MTDs of soloists with respect to the drums. With our experimental approach, however, we are able to clarify the role of offbeat timing by studying how different versions with and without offbeat delays affect swing.

We prepared audio extracts presenting different kinds of systematic MTDs in jazz piano performances (“soloist”) with respect to a quantized rhythm section (“rhythm section”). The manipulations we carried out on real performances are explained in detail in the Methods section and sketched in Fig. 3. We based all manipulations on a quantized original version, which aligns the notes to a grid with an optimized swing ratio. We needed to take such a step for the sake of providing well-controlled distinguishable conditions. We think that this is justified as a minor intervention; we previously showed for instance that random microtiming fluctuations do not play a positive role for swing24. For the present experiment, we hypothesized that a positive effect on swing might result from (i) a both delayed manipulation, where all notes of the soloist are uniformly delayed with respect to the rhythm section, and/or (ii) a downbeat delayed manipulation, where the soloist notes are delayed apart from the offbeats (which are synchronized with the rhythm section).

Fig. 3: Timing manipulations. Schematic representation of the timing manipulations we used in the experiment to probe the effects of microtiming deviations on the swing feel. Importantly, all manipulations were done so as to keep the same swing ratio for the soloist (i.e., piano). Full lines represent exact quarter note positions (metronome beats). The dashed line shows the position of the offbeats corresponding to a chosen “optimal” swing ratio, referred to as r_opt in the upper-left frame. Black notes and gray notes denote timing positions of soloist and rhythm section, respectively, in the different manipulations. In the “quantized original” version (green background) underlying all further manipulations, the microtiming deviations of the soloist's original performance are suppressed and the notes are aligned with the grid. In the “both delayed” version (red background), all notes of the soloist are delayed by 85 ticks. Finally, in the “downbeat delayed” version (brown background), additionally, the offbeats of the rhythm section are synchronized with the offbeats of the soloist. This procedure creates downbeat delays of 85 ticks for the soloist without changing the soloist's swing ratio, but increases the swing ratio of the rhythm section.
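The three manipulations sketched in Fig. 3 can be expressed as simple onset shifts on a swung grid. The following is a minimal sketch under stated assumptions, not the procedure used to prepare the stimuli: notes are reduced to hypothetical (beat index, is offbeat) pairs, the rhythm section is assumed to play at the same grid positions as the soloist, and the value r_opt = 1.5 in the example is purely illustrative.

```python
TICKS_PER_QUARTER = 960  # subdivision of a quarter note stated in the text
DELAY_TICKS = 85         # soloist delay given in the Fig. 3 caption


def grid(notes, swing_ratio):
    """Onsets in ticks for (beat_index, is_offbeat) pairs on a swung grid."""
    offbeat = TICKS_PER_QUARTER * swing_ratio / (1.0 + swing_ratio)
    return [beat * TICKS_PER_QUARTER + (offbeat if is_off else 0.0)
            for beat, is_off in notes]


def make_versions(notes, r_opt, delay=DELAY_TICKS):
    """Return soloist and rhythm-section onsets for the three versions."""
    base = grid(notes, r_opt)
    delayed = [t + delay for t in base]  # every soloist note shifted
    # "Downbeat delayed": rhythm-section downbeats stay on the grid, while
    # its offbeats are moved onto the (delayed) soloist offbeats.
    rhythm_sync = [t + delay if is_off else t
                   for t, (_, is_off) in zip(base, notes)]
    return {
        "quantized original": {"soloist": base, "rhythm": base},
        "both delayed": {"soloist": delayed, "rhythm": base},
        "downbeat delayed": {"soloist": delayed, "rhythm": rhythm_sync},
    }


if __name__ == "__main__":
    # Two beats, each with a downbeat and an offbeat (hypothetical input)
    notes = [(0, False), (0, True), (1, False), (1, True)]
    for name, onsets in make_versions(notes, r_opt=1.5).items():
        print(name, onsets)
```

With the illustrative value r_opt = 1.5, the soloist's offbeat lies 576 ticks after the soloist's downbeat in all three versions, so the soloist's swing ratio is unchanged. In the downbeat delayed version the rhythm section's offbeat moves to 661 ticks after its beat, which raises its swing ratio to 661/299, or about 2.2.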

We presented the manipulated audio extracts of four different pieces (“The smudge”, “Texas blues”, “Jordu”, “Serenade to a cuckoo”) to professional and semiprofessional jazz musicians in an online experiment. Participants were asked to compare all three manipulations with each other and to respond to the questions “Did it swing?” and “Did it groove?” for each piece separately. Answers were given on a scale from 1 (“not at all”) to 4 (“very much”). The responses to one of the pieces (“Serenade to a cuckoo”) were not included in the analyses due to an ill-chosen swing ratio for the rhythm section (see “Methods” section). We therefore conducted a second experiment on this piece, testing the influence of different swing ratios. The results of the second experiment were in line with the results for the other three pieces presented in the following paragraphs (see Supplementary Results 3: Serenade to a cuckoo, second experiment testing different swing ratios).

The results show that professional and semiprofessional jazz musicians gave the highest swing ratings to versions with delayed downbeats and synchronized offbeats (i.e., the downbeat delayed version). This is apparent in the average distribution of swing ratings across three pieces shown in Fig. 4 as well as in Supplementary Fig. 9. In Fig. 4 one can see that the downbeat delayed version obtained a large proportion of high ratings (3 and 4, blue colors) while the quantized original or both delayed versions received considerably smaller fractions of high ratings. The results on the groove ratings show a similar pattern with considerably smaller effect sizes of the manipulations (see Supplementary Results 2: Groove ratings). It is worthwhile pointing out that professional musicians gave overall lower ratings than semiprofessionals, which is noticeable in particular for the highest rating in the downbeat delayed version (6.5% vs 31.4% for professionals and semiprofessionals, respectively; see Supplementary Fig. 9). We made a similar observation in our earlier study24. This finding probably reflects the higher standards and expectations of professional musicians. An ordinal logistic regression of the swing ratings upon manipulation, musician category, and their interaction statistically confirmed the results described above (cf. Table 1). The downbeat delayed versions received significantly higher swing ratings than quantized original versions not having any delays (p < 0.001). No significant difference was observed comparing the swing ratings of both delayed versions to those of the quantized original versions (p = 0.440). Professional jazz musicians gave significantly lower ratings than semiprofessional musicians (p = 0.019). In addition, the effect of the downbeat delayed versions (vs. quantized original) was larger for semiprofessionals than for professionals (p = 0.022).

Fig. 4: Distribution of swing ratings given by professional and semiprofessional jazz musicians to different manipulated versions. The three stacked histograms display the proportions of different possible ratings from 1 (“not at all”) to 4 (“very much”) averaged over three pieces. The downbeat delayed manipulation in the center elicits a much larger portion of high ratings (3 and 4 in blue colors) than the two other manipulations.

Table 1: Results of ordinal logistic regression for swing ratings.

The odds ratios as well as their associated confidence intervals for the different conditions are summarized in Table 1. The odds ratio of the downbeat delayed versions as compared to the quantized original versions was 7.48. In other words, delaying the soloist’s downbeats while synchronizing the offbeats increases the odds that jazz musicians give a higher swing rating by a factor of more than seven relative to the quantized original. To further validate this effect, we performed three additional checks to analyze the statistical power and to test for potential effects of outliers and sample size (see Supplementary Results 2: Statistical power and robustness). These checks yield a very high statistical power together with a high robustness of the effects. Separately, we also analyzed participants’ ratings for only the very first piece they listened to, in order to ensure that the results were not affected by repeating the task or by being asked whether one perceived differences between versions (see Supplementary Results 2: Additional analyses on swing ratings).

To elucidate the discriminability between the different manipulations, we determined receiver operating characteristic (ROC) curves for each piece (Fig. 5). These ROC curves compare the cumulative proportions of the four ratings for two conditions mapped along the horizontal and the vertical axis, i.e., two of the stacked histograms of Fig. 4 are plotted against each other, one along each axis. A deviation from the diagonal to either side indicates higher swing ratings for one of the conditions and shows that listeners discriminate between the versions and perceive one of them as more swinging. The area under the curve (AUC) quantifies the deviation from the diagonal (AUC = 0.5 means no discrimination) and is an effect size that can be tested for significance. The effect is statistically significant if 0.5 is not within the AUC confidence interval (CI). Comparing the downbeat delayed to the quantized original manipulations (blue curves in Fig. 5) shows higher swing ratings for the downbeat delayed versions with significant AUC values for all three pieces: AUC = 0.71 ± 0.13 for “The smudge”, AUC = 0.70 ± 0.12 for “Texas blues”, and AUC = 0.69 ± 0.13 for “Jordu”. Comparing the downbeat delayed and both delayed manipulations (black curves in Fig. 5) also shows higher swing ratings for the downbeat delayed versions with significant AUC values. By contrast, the yellow curves and their AUC values display no significant difference between the both delayed and quantized original versions. Taken together, these findings imply that delaying the soloist’s downbeats while synchronizing offbeats has a significant positive impact on swing, whereas uniformly delaying all soloist notes does not.
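The construction of an ROC curve from two rating distributions can be illustrated as follows. This is a minimal sketch with hypothetical rating counts, not the data or the analysis code of the study: cumulative proportions are accumulated from the highest rating downward, the two conditions are plotted against each other, and the area is obtained with the trapezoidal rule. Confidence intervals are not computed here.

```python
def roc_points(counts_a, counts_b):
    """ROC points from rating counts of two conditions.

    counts_a, counts_b: counts for ratings 1..4 (lowest rating first).
    Condition B is mapped to the horizontal axis, condition A to the
    vertical axis; proportions are accumulated from rating 4 down to 1.
    """
    total_a, total_b = sum(counts_a), sum(counts_b)
    points = [(0.0, 0.0)]
    cum_a = cum_b = 0
    for a, b in zip(reversed(counts_a), reversed(counts_b)):
        cum_a += a
        cum_b += b
        points.append((cum_b / total_b, cum_a / total_a))
    return points


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Hypothetical counts of ratings 1, 2, 3, 4 for two versions
    version_a = [1, 2, 3, 4]  # mostly high ratings
    version_b = [4, 3, 2, 1]  # mostly low ratings
    print(auc(roc_points(version_a, version_b)))
```

With identical rating distributions all points lie on the diagonal and the area is 0.5. In this construction, values above 0.5 indicate that the condition on the vertical axis tends to receive higher ratings; the area equals the probability that a randomly drawn rating from that condition exceeds one from the other condition, with ties counted as one half.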