The STTR window length would be the delay time plus the xfade length, and the window overlap would be the xfade length. That said, this will result in oddly shaped windows. If you set the delay to 0 ms, the total window length will be 2*xfade (because 2*xfade is actually the shortest delay possible), and the window will form a Bartlett (triangular) window.
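To make the window shapes concrete, here is a rough sketch of the gain of a single window over time. It assumes the crossfades are linear ramps (which is what gives a Bartlett window at the minimum length) and that the window length is clamped to 2*xfade when the delay is short. The function name and the clamp are my own illustration, not taken from the plugin.

```python
def window_gain(t_ms, delay_ms, xfade_ms):
    """Gain of one window at t_ms after the window starts.

    Assumptions (not confirmed by the plugin): linear fade in/out of
    xfade_ms each, and a minimum window length of 2 * xfade_ms.
    """
    length_ms = max(delay_ms + xfade_ms, 2.0 * xfade_ms)
    if xfade_ms <= 0:
        return 1.0 if 0 <= t_ms <= length_ms else 0.0
    if t_ms < 0 or t_ms > length_ms:
        return 0.0
    fade_in = t_ms / xfade_ms                  # ramps 0 -> 1 over the first xfade
    fade_out = (length_ms - t_ms) / xfade_ms   # ramps 1 -> 0 over the last xfade
    return min(1.0, fade_in, fade_out)         # flat top at 1.0 in between


# delay = 0 ms, xfade = 30 ms: a 60 ms triangle peaking at 30 ms
triangle = [window_gain(t, 0, 30) for t in (0, 15, 30, 45, 60)]

# delay = 100 ms, xfade = 30 ms: a 130 ms trapezoid with a flat top
trapezoid = [window_gain(t, 100, 30) for t in (0, 15, 30, 65, 100, 115, 130)]
```

With a longer delay the window is a trapezoid rather than a triangle, which is the "odd shape" mentioned above.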
Modulation will affect the delay time, which will in turn modulate the window length (when delay > xfade) and shape. I'm not entirely sure what this will do to the perceived effect.
Sidebands are always created when you amplitude modulate a signal (which you are doing here with the window function), but it's not coincidental that you start hearing the sidebands around an xfade length of 30 ms (which corresponds to a window length of 60 ms). This is because 1/60 ms is about 16.7 Hz, which is close to the lower limit of human hearing (roughly 20 Hz). As you shorten the window length you essentially raise the frequency of the triangle wave formed by the repeating window, which moves the sidebands further apart and makes them sound more distinct.
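The arithmetic can be sketched as follows. This treats the repeating window as a triangle wave with a period of 2*xfade, as described above, and uses the fact that a triangle wave contains only odd harmonics, so a sine input at some carrier frequency picks up sidebands at carrier ± k times the window rate for odd k. The function names and the 440 Hz input are just illustrative assumptions.

```python
def window_rate_hz(xfade_ms):
    """Repetition rate of the triangle window, assuming window length = 2 * xfade."""
    return 1000.0 / (2.0 * xfade_ms)


def sideband_pairs(carrier_hz, xfade_ms, odd_harmonics=(1, 3, 5)):
    """(lower, upper) sideband frequencies for the first few odd harmonics."""
    rate = window_rate_hz(xfade_ms)
    return [(carrier_hz - k * rate, carrier_hz + k * rate) for k in odd_harmonics]


rate_30ms = window_rate_hz(30)       # about 16.7 Hz, just below audibility
pairs_25ms = sideband_pairs(440, 25) # 20 Hz window rate around a 440 Hz tone
pairs_5ms = sideband_pairs(440, 5)   # 100 Hz window rate: sidebands much further out
```

At 25 ms the first pair sits only 20 Hz either side of the tone, while at 5 ms it is 100 Hz away, which fits the sidebands sounding more distinct as the xfade gets shorter.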
This STTR effect is obviously not the intent of the reverse delay algorithm, but it's an interesting curiosity and it's fun to think through the ramifications of the effect. My gut says that most of the effect comes simply from multiplying the input signal by the fast triangle wave, which is essentially a ring modulator, but as the paper shows, reversing the audio inside each window does create its own effect.
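One way to poke at that hunch is a toy overlap-add model: chop the input into Bartlett-windowed blocks of 2*xfade samples, spaced xfade samples apart, and sum them back with or without reversing each block. This is only a sketch under my own assumptions (an idealised offline model with hypothetical function names, not the plugin's delay line), so treat any conclusion from it as tentative.

```python
import math


def bartlett(n):
    """Triangle window of even length n; copies spaced n/2 apart sum to 1."""
    half = n // 2
    return [(i + 0.5) / half if i < half else (n - i - 0.5) / half
            for i in range(n)]


def windowed_overlap_add(signal, xfade, reverse):
    """Sum Bartlett-windowed blocks of 2*xfade samples, hop = xfade samples."""
    n = 2 * xfade
    win = bartlett(n)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - n + 1, xfade):
        block = signal[start:start + n]
        if reverse:
            block = block[::-1]          # time-reverse the audio inside the window
        for i in range(n):
            out[start + i] += win[i] * block[i]
    return out


# 0.1 s of a 440 Hz sine at 48 kHz, with a 5 ms crossfade (240 samples)
sample_rate = 48000
sig = [math.sin(2 * math.pi * 440 * i / sample_rate) for i in range(4800)]
plain = windowed_overlap_add(sig, 240, reverse=False)
flipped = windowed_overlap_add(sig, 240, reverse=True)
```

In this simplified model the unreversed version adds back up to the original signal wherever two windows overlap, because the crossfaded triangles sum to 1, while the reversed version does not. That suggests the window modulation only becomes audible once the overlapping windows hold different material, though a real delay line may well behave differently.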