CBT with Crosstalk Cancellation?

I've been looking for an excuse to build a CBT for nearly a decade now. The thing that always stops me from "pulling the trigger" is that the high frequency performance of arrays is not great. That's a really difficult problem to solve, because it's not as simple as adding a tweeter to an array. If you're going to add tweeters, it will likely take dozens or even hundreds of them to get the beamwidth of the high frequency array to match the beamwidth of the midrange array.
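To put a rough number on "dozens or even hundreds", here's a back-of-the-envelope sketch. It assumes a rule of thumb that is not from any CBT paper: the drivers in a line need to sit about one wavelength apart (center to center) at the highest frequency you care about. The function name, the 343 m/s speed of sound and the one-wavelength spacing are all my assumptions. Real tweeters can't physically be packed that tightly, so treat the result as an illustration of how fast the count grows, not as a design procedure.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 C


def drivers_needed(array_length_m: float, top_freq_hz: float,
                   spacing_in_wavelengths: float = 1.0) -> int:
    """Estimate how many drivers fill a line array of the given length.

    Assumes center-to-center spacing of `spacing_in_wavelengths` wavelengths
    at `top_freq_hz`. This is a rule of thumb, not a CBT design method.
    """
    wavelength_m = SPEED_OF_SOUND_M_S / top_freq_hz
    spacing_m = spacing_in_wavelengths * wavelength_m
    return math.ceil(array_length_m / spacing_m)


if __name__ == "__main__":
    # Hypothetical 1 m tall array, three different upper frequency limits.
    for top_hz in (5_000, 10_000, 20_000):
        print(top_hz, "Hz:", drivers_needed(1.0, top_hz), "drivers")
```

Under those assumptions a one meter array already needs well over 50 drivers to reach 20 kHz, and a half-wavelength spacing doubles that.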

Another technology that I've had good results from is Crosstalk Cancellation. I've tried a bunch of different recipes for it, everything from Polk SDA style arrays from the 80s to Crosstalk Cancellation via DSP, using MiniAmbio and various Windows plugins. Opsodis is another oddball Crosstalk Cancellation scheme that never really went anywhere.

I was pondering these two topics, and it occurred to me that the two may be compatible. IE, CBT arrays have some issues, and Crosstalk Cancellation ALSO has a ton of issues. But the combination of the two may be synergistic.
 
Opsodis is a peculiar technology. I personally stumbled across it after reading a VERY obscure graduate thesis, related to car audio, about fifteen years back.

At the bottom of this post, I'll include links to the Opsodis papers. But first, let me take a stab at describing why Opsodis works:

Our perception of where sounds are coming from is based on amplitude and phase. At high frequencies, amplitude is the most important. At low frequencies, phase is the most important. At midrange frequencies, both phase AND amplitude are important.

One way of assessing this, is when you fiddle with the "balance" control for two speakers. What you'll notice is that the "balance" control does a fairly good job of 'steering' sound to one side of the room but only at high frequencies. This is because the "balance" control changes the amplitude, but it doesn't impact phase at all.

Another way of assessing this phenomenon, is if you have a DSP. Anyone who's messed around with DSP in their car can attest to this; manipulating the delay in a set of stereo speakers in your car allows you to 'move' the center of the soundstage dramatically. Maybe I'm just sensitive to phase, but I find that a delay of even a fraction of a millisecond makes an audible difference in the soundstage.

In a nutshell, if you have a DSP in your car, you can move the center of the soundstage at will. Manipulating the delay 'shifts' the entire stage to the right or to the left (depending on whether you're adding or subtracting delay.)
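To get a feel for why a fraction of a millisecond matters, it helps to convert DSP delay into the equivalent speaker distance. This is just a sketch; the function names and the 343 m/s speed of sound are my assumptions.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 C


def delay_to_path_cm(delay_ms: float) -> float:
    """Distance sound travels during `delay_ms`, in centimeters."""
    return delay_ms / 1000.0 * SPEED_OF_SOUND_M_S * 100.0


def path_to_delay_ms(path_cm: float) -> float:
    """Delay equivalent to moving a speaker by `path_cm`, in milliseconds."""
    return path_cm / 100.0 / SPEED_OF_SOUND_M_S * 1000.0


if __name__ == "__main__":
    for delay_ms in (0.1, 0.25, 0.5, 1.0):
        print(f"{delay_ms} ms is about {delay_to_path_cm(delay_ms):.1f} cm")
```

So a 0.5 ms delay tweak is roughly the same as sliding one speaker 17 cm further away, which is a big move in a car.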

to be continued...
 
OK, so at this point in the discussion, I've proposed that:

1) our perception of high frequency location is determined by amplitude
2) our perception of low frequency location is determined by phase
3) in the midrange, it's both

By the way, if anyone's curious, there's an actual anatomical reason for this. Basically, the median diameter of a human head is 14.4cm, and a wavelength of 14.4cm corresponds to roughly 2.4kHz (assuming sound travels at 343 m/s). Because the wavelengths of all frequencies above that are shorter than the diameter of our heads, we can't perceive their phase easily. Because the wavelengths of all frequencies below it are longer than the diameter of our heads, we can perceive their phase easily.

This means that at high frequencies, amplitude differences are everything. At 5 kHz, a 3dB difference between the left and right speaker will be incredibly apparent.

This means that at low frequencies, phase differences are everything. At 500 Hz, a 90 degree difference (17cm / 6.7") between the left and right speaker will be somewhat apparent.

This means that at high frequencies, phase differences are (mostly) unimportant. At 5 kHz, a 90 degree difference between the left and right speaker will be unnoticeable; that amounts to moving the speaker 1.7 cm. Nobody will notice that.

This means that at low frequencies, amplitude differences are unimportant. At 500 Hz, a 3dB difference between the left and right speaker will be barely apparent.
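The distances quoted above all come from the same arithmetic: wavelength is the speed of sound divided by frequency, and a phase shift of N degrees is N/360 of a wavelength. Here's a small sketch so you can check other frequencies. The function names are mine, and 343 m/s is an assumption (the exact figures shift a little with temperature).

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 C


def wavelength_cm(freq_hz: float) -> float:
    """Wavelength in centimeters at `freq_hz`."""
    return SPEED_OF_SOUND_M_S / freq_hz * 100.0


def phase_to_path_cm(freq_hz: float, phase_deg: float) -> float:
    """Path length difference that produces `phase_deg` at `freq_hz`."""
    return wavelength_cm(freq_hz) * phase_deg / 360.0


def freq_for_wavelength_hz(length_cm: float) -> float:
    """Frequency whose wavelength equals `length_cm`."""
    return SPEED_OF_SOUND_M_S / (length_cm / 100.0)


if __name__ == "__main__":
    print(f"90 deg at 500 Hz: {phase_to_path_cm(500, 90):.1f} cm")
    print(f"90 deg at 5 kHz:  {phase_to_path_cm(5000, 90):.2f} cm")
    print(f"14.4 cm head:     {freq_for_wavelength_hz(14.4):.0f} Hz")
```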
 
As promised, here's one of the Opsodis papers. No offense to the Opsodis folks, but all of their papers seem to omit the psychoacoustics behind how the tech works. Their papers seem laser focused on technology designed to improve the dynamics of speakers using crosstalk technology.

But their papers don't actually get into the crosstalk cancellation technology itself, and why it works even with the drivers spaced at ridiculous gaps (as in Opsodis.)

https://www.researchgate.net/profil...2-channel-and-3-channel-OPSODIS-soundbars.pdf
 
OK, so let's segue into "how does Crosstalk Cancellation work?"

First off, the entire point of Crosstalk Cancellation is to emulate headphone listening. With headphones, all we hear in our left ear is sound from the left speaker. All we hear in our right ear is sound from the right speaker. Crosstalk cancellation aims to mimic that, by canceling the sound that arrives at your LEFT ear from the RIGHT speaker and vice versa. If it worked perfectly (and that's a big "if") the crosstalk eliminator would eliminate 100% of the sound from the right channel that arrives at your left ear without messing up the sound from the left channel.
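For anyone who wants to see the "perfect" case on paper, here's a minimal sketch of the textbook idea at a single frequency. It treats the two speakers and two ears as a 2x2 system and inverts it. Everything in the model is an assumption for illustration: the same-side path is taken as 1, and the opposite-side path is just an attenuated, delayed copy (the 0.7 gain and 0.25 ms extra delay are hypothetical values, not measurements, and this is not how any particular product does it).

```python
import cmath


def contralateral_path(freq_hz: float, gain: float = 0.7,
                       extra_delay_s: float = 0.00025) -> complex:
    """Toy speaker-to-opposite-ear response: attenuated and delayed."""
    return gain * cmath.exp(-2j * cmath.pi * freq_hz * extra_delay_s)


def canceller(freq_hz: float, gain: float = 0.7,
              extra_delay_s: float = 0.00025) -> list:
    """Inverse of the 2x2 matrix [[1, hc], [hc, 1]] at one frequency."""
    hc = contralateral_path(freq_hz, gain, extra_delay_s)
    det = 1 - hc * hc
    return [[1 / det, -hc / det], [-hc / det, 1 / det]]


def ear_signals(freq_hz: float, left_in: complex, right_in: complex,
                gain: float = 0.7, extra_delay_s: float = 0.00025) -> tuple:
    """Run the inputs through the canceller, then through the toy room."""
    hc = contralateral_path(freq_hz, gain, extra_delay_s)
    c = canceller(freq_hz, gain, extra_delay_s)
    spk_l = c[0][0] * left_in + c[0][1] * right_in
    spk_r = c[1][0] * left_in + c[1][1] * right_in
    ear_l = spk_l + hc * spk_r  # direct path plus crosstalk from the right
    ear_r = hc * spk_l + spk_r  # crosstalk from the left plus direct path
    return ear_l, ear_r


if __name__ == "__main__":
    # Left-only input: ideally the right ear hears nothing.
    print(ear_signals(1000.0, 1.0, 0.0))
```

In this idealized model the right ear ends up with essentially zero for a left-only input. The catch is that the real paths depend on your head and where you sit, which is the big "if".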

ambiophonics-on-a-desk.jpg

Taken to the extreme, you could eliminate all crosstalk by erecting a barrier between your left and your right ear.

Yes, it looks ridiculous. It also works really well.

I've done this a bunch of times, and what I've noticed is that it largely replicates the experience of listening to headphones. If the recording is in mono, the sound is right there in the center. If the recording is done well, you have a big soundstage. Perhaps most importantly, the "barrier method" of crosstalk cancellation isn't perturbed by overprocessed recordings. IE, if you listen to a lot of modern / pop / EDM music, a lot of it has soundstage effects "baked" into the mix. So if you use DSP processing to make the soundstage bigger (via crosstalk cancellation) it will often be 'tripped up' by newer recordings which already leverage DSP tricks to expand the soundstage.

Polk generally has the "most famous" crosstalk cancellation scheme. Their "SDA" technology basically takes the output from the left speaker, and reproduces it out of phase from the right speaker. And vice versa.

Audioholics wrote this review:

"I could go on with many more examples of how the L800s separate themselves from typical speakers, so I will just sum up their overall character from my experience. Firstly, they have the best and most realistic imaging I have ever heard, with the right music. The effect was strongest with binaural recordings, but any recording with a sophisticated use of stereo soundstaging stands to benefit. The SDA effect added noticeable timbre coloration, but wasn’t obvious on all music and often wasn’t offensive. The sound was alluring, causing me to revisit my music library for hours on end. These are very enjoyable speakers to listen to, even with their imperfections, and I wanted to listen to them more than other speakers. While they played plenty loud, they sometimes felt dynamically constrained. They seemed to need a powerful amp to sound their best. I should say at this point that I like to listen to music and movies at louder levels than most people. The L800s have more than enough dynamic range for the vast majority of people - with enough amplification. They were capable of loud deep bass, which sounded very good. However, they need big amps to achieve their full potential. They sounded noticeably constrained in bass with small receivers."

If you read between the lines in the Audioholics review, he's illustrating a bunch of things I learned myself with Ambio and its various crosstalk-canceling siblings:

1) With certain tracks, the effect is freaky. You can legitimately create a soundstage that's close to a full 180 degrees. Even stranger is that you can create that with speakers that are right in front of you. The craziest part of crosstalk cancellation is hearing a soundstage that envelops you, even though the speakers are right there in front of you. Until you hear it for yourself, it's hard to appreciate how pinpoint a soundstage can be when you get rid of the crosstalk.

2) With crosstalk cancellation, some tracks actually sound quite close to mono. Which is because they're quite close to mono. Basically, we're so accustomed to "the stereo triangle" that we're often fooled into thinking mono tracks have "space" in them, when a lot of that "space" is simply defined by where our (conventional) loudspeakers are located. I noticed the same thing, 18 years ago, when I built one of my earliest Unity horns. Basically, I hadn't realized how many recordings are crummy. Unity horns make great recordings image great, but they're also an x-ray on bad recordings.

3) The crosstalk cancellation in Polk SDA kills the dynamics. The Opsodis paper gets into how this works; in a nutshell, the cancellation process can reduce output by 10-20dB and it's frequency dependent. To me, this is the Achilles Heel of the Polk SDA method, and a significant part of the reason that "the barrier method" works so effectively. YMMV, but what I've noticed with "The Barrier Method" is that well made recordings have a HUGE stage, and poorly recorded ones sound mono. The dynamics of "The Barrier Method" are great, because we're keeping the left and right channels separate via a barrier. We're not doing any "steering" and we're not processing the signal to do the crosstalk cancellation. I've tried methods like the Polk SDA, and I personally found that it reduced the dynamics. At the same time though, poor recordings didn't sound as "mono" as the barrier method. This is because in "The Barrier Method" the speakers are nearly touching, while in the "SDA" method they're a few feet apart.
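To see why electronic cancellation can eat output in a frequency dependent way, here's a toy model. It is NOT Polk's circuit and not the analysis in the Opsodis paper. It just assumes a signal arriving together with an inverted, slightly attenuated, slightly delayed copy of itself, which is roughly what a centered sound runs into when each channel's inverted copy is fed to the other side. The 0.9 gain and 0.25 ms delay are hypothetical.

```python
import cmath
import math


def summed_level_db(freq_hz: float, cancel_gain: float = 0.9,
                    delay_s: float = 0.00025) -> float:
    """Level of (signal minus delayed copy) relative to the signal alone."""
    total = 1 - cancel_gain * cmath.exp(-2j * cmath.pi * freq_hz * delay_s)
    return 20 * math.log10(abs(total))


if __name__ == "__main__":
    for freq_hz in (50, 100, 200, 500, 1000, 2000):
        print(f"{freq_hz:>5} Hz: {summed_level_db(freq_hz):+.1f} dB")
```

With these made-up values the low frequencies drop by more than 10 dB, while around 2 kHz the two signals actually add. That lines up with the general picture of a loss that's large and varies with frequency, but the exact numbers depend entirely on the assumed gain and delay.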

Another option is to do the processing via DSP. If you go this route, you can get crosstalk cancellation without resorting to the use of additional drivers, like in the Polk SDA. I've personally tried various software methods to do crosstalk cancellation, and none of them sounded good to me TBH. The output sounded noticeably "phasey" when using convolution filters.

Due to this, I'm going to largely focus on the Opsodis route...
 
The paper that set me off on this journey, came to a fairly extreme conclusion: one may be able to have a stereo soundstage with a mono tweeter.

This sounds like a ludicrous claim on its surface. Because something I've noticed that people do in car audio, is that they tend to put their tweeters as wide as possible.

IE, if we could get a stereo soundstage with just one tweeter, then why would people be naturally inclined to locate their tweeters as wide as possible?

So let's look into this:

10 kHz is 3.4cm long (1.35")

That's well out of the range of sounds where phase is important. If you move a tweeter by a distance of one centimeter, nobody is going to perceive a difference in location. But if you lower the level by 3dB, they will definitely hear that.

This next part is super important, and accentuates why a mono tweeter(!) may be acceptable or even superior:

While you will not be able to perceive a difference in location if you move your left speaker by 1.35", there WILL be a measurable difference in the combined amplitude of the stereo speakers. For instance, if you get your left and right speakers perfectly in phase and then move a speaker (or your head!) by 1.35", there will be some left-to-right cancellation at high frequencies. This is because the wavelengths are so short.

It's a catch-22:

1) In order to perceive location at high frequencies, we depend on amplitude differences

2) But a path difference of as little as 1.35" will wreak havoc on the overall high frequency response!

Anyone who's measured a set of speakers playing in stereo knows exactly what I'm describing; the interference between the two creates a myriad of comb filters, and especially at high frequencies, the two speakers tend to cancel each other out because of this.
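The comb filtering is easy to sketch. Two speakers playing the same signal, with a path difference d between them, cancel wherever d is an odd number of half wavelengths, so the first null sits at c / (2d). The code below assumes equal levels from both speakers, no reflections and 343 m/s; the function names are mine.

```python
import cmath
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 C


def first_null_hz(path_diff_cm: float) -> float:
    """Lowest frequency at which the two arrivals are 180 degrees apart."""
    return SPEED_OF_SOUND_M_S / (2 * path_diff_cm / 100.0)


def two_speaker_level_db(freq_hz: float, path_diff_cm: float) -> float:
    """Summed level of two equal sources, relative to one source alone."""
    delay_s = path_diff_cm / 100.0 / SPEED_OF_SOUND_M_S
    total = 1 + cmath.exp(-2j * cmath.pi * freq_hz * delay_s)
    # Floor the magnitude so a perfect null does not hit log10(0).
    return 20 * math.log10(max(abs(total), 1e-12))


if __name__ == "__main__":
    diff_cm = 3.43  # about 1.35 inches
    print(f"first null: {first_null_hz(diff_cm):.0f} Hz")
    for freq_hz in (2500, 5000, 7500, 10000, 15000):
        print(f"{freq_hz:>6} Hz: {two_speaker_level_db(freq_hz, diff_cm):+.1f} dB")
```

With a 1.35" path difference, this simple model puts the nulls at about 5 kHz and 15 kHz, with the two speakers back in phase at 10 kHz. Move your head a little and the whole pattern slides, which is why the measured treble looks so ragged.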

Looking at everything holistically, it becomes apparent that:

1) a narrow spacing between tweeters is not only acceptable, it's likely preferable. And taken to the extreme, one might play very high frequencies in mono with a single tweeter placed in the center.

2) As frequencies get lower and lower, phase becomes the most important. IMHO, phase is much harder to "fake" than amplitude. With amplitude differences, we can fairly easily compensate by adjusting that amplitude. Phase is way trickier; while we can manipulate phase to change the location of sound BETWEEN two stereo speakers, it's very very hard to make it sound like the sound extends BEYOND those two speaker locations.

IE, if I have two speakers placed two meters apart, I can manipulate the phase to make the sound appear to be radiating from almost any point between the two speakers. But I can't easily or reliably make it sound like it's coming from beyond the confines of the two speakers. This fact encourages me to place the speakers as widely as possible, to achieve a wide stage. But the wider you place the speakers, the more you get a "hole" in the middle of the soundstage.

Although it's (strangely) ignored in the patents and the Opsodis docs, the Opsodis locations largely solve this issue with stereo playback. At low frequencies, the drivers are placed extremely wide. That establishes the width of the stage (huge.) As we get higher in frequency, the playback devices span a much more limited width. It's not in any of the Opsodis patents, but an obvious possibility would be to make the very high frequencies play back in mono.
 

Attachments

  • opsodis.jpg
Referencing my own posts is cringe, but there's some good info on this topic from 13 years ago: https://www.diymobileaudio.com/threads/anyone-tried-using-one-tweeter.72891/

I managed to find the (very) obscure link to the paper that got this all rolling, for me at least:

https://idsc.miami.edu/thesis/robert_hartman/rhartman_web/INDEX.HTM

Unfortunately, the link is dead, and the document doesn't exist in the Wayback Machine. The thesis is titled "Spatially relocated frequencies and their effect on the localization of a stereo image." Mr Hartman and Earl Geddes worked at the same company, so it's kinda neat seeing that car companies are investing the time and effort to do car stereo right. I have the Dynaudio system in my VW, and I gotta say it sounds about as good as the average $3000 stereo that you'd have installed by an aftermarket place, but it's 100% integrated. My only real gripe about the system is that it needs moar subwoofer.
 
Here's my basic idea:

1) The fundamental challenge with CBTs is that even the best implementations have terrible response at high frequencies (see attached)

2) The obvious solution is to add an array of tweeters. But that gets messy and expensive and time consuming FAST because you may need as many as 50+ tweeters per side. You can't use a single tweeter because the vertical beamwidth is all wrong.

opsodis.jpg

3) But the OPSODIS concept basically takes a three way speaker and separates the elements horizontally. By doing so, it might be possible to get a convincing/coherent soundstage, even if the various frequency ranges are on different horizontal planes. Pathlength is the key here; though they're not located on the same vertical line, their wavelengths arrive in-phase because they form an arc around the listener.

Or at least that's the idea.
 
I spent way too much time doing sims in VituixCad, and came to realize that the entire Opsodis concept probably requires the listener to be quite close to the loudspeakers, much closer than in a conventional two way setup.

A great deal of my interest in Opsodis came from messing with similar concepts in car audio.

I tried doing Dolby Prologic II in my car, even to the point where I had a home theater receiver in the trunk of my Accord, powered off a 120V inverter. While Dolby PLII sounded "OK" in my car, I preferred Opsodis style setups because:

1) I found that the center channel in Dolby PLII tends to narrow the stage. This is probably less of a problem in a theater, where the speakers are meters apart, but in a car, Dolby PLII has a noticeable tendency to make things sound mono, because the center channel is louder than the left and the right channel, and the three front speakers are "only" a meter apart or so.

2) Dolby PLII setups sound like the stage is in front of you. Opsodis setups sound like the stage surrounds you. Because the speakers literally surround you.

Attached are some pics of Opsodis setups from the Opsodis folks themselves. In my next post I'll elaborate on why you probably want to be much closer to the speakers in an Opsodis setup, compared to a conventional setup.
 

Attachments

  • 288902_188917284507856_7243308_o.jpg
  • 428063_458898607509721_1422807118_n.jpg
  • 537353_458906840842231_231501711_n.jpg
  • 544003_458892137510368_760617876_n.jpg
  • 544025_458889914177257_1067062709_n.jpg
  • 13346883_1107943592605216_5663095776173642059_n.jpg
  • 55750495_2237388189660745_2054513441586020352_n.jpg
Why couldn't we just use two tweeters set up like this and cross them at 3000 Hz? Use two 8 ohm tweeters, bridged off the crossover to two 4 ohm midranges in the kicks, and aim each tweeter at both seats. Won't this work, with the driver and the passenger each receiving a mono signal (combined L+R) and the mids in the kicks? Or is it wishful thinking?
 

Attachments

  • Screenshot_20231024_145723_Chrome.jpg
In post eleven, on this page, I noted that Opsodis seems to be best when listened at close range.

Here's some sims to illustrate what I mean.

Basically I set the 'ring' of speakers up, Opsodis style, in VituixCad. What I found:

1) Even with active filters, I had to tweak the distance from the listener to the speaker, on a driver by driver basis

2) When you move to the left or to the right, even an inch or two, the response goes to hell.

What's happening here, obviously, is that the large distance from driver to driver is making the array extremely sensitive to small shifts in the listener's location.
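Here's a rough geometric sketch of that sensitivity. It is separate from the VituixCad sims and uses made-up dimensions. It puts a left and a right driver on a circle around the listener, slides the listener sideways, and works out how much the left and right path lengths drift apart. The function names, the 1 m radius and the 5 cm shift are all assumptions.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 C


def path_difference_cm(radius_m: float, angle_deg: float,
                       shift_cm: float) -> float:
    """Left path minus right path after the listener moves right.

    The drivers sit at +/- `angle_deg` from straight ahead, `radius_m`
    from the original listening position.
    """
    angle = math.radians(angle_deg)
    spk_x = radius_m * math.sin(angle)
    spk_y = radius_m * math.cos(angle)
    listener_x = shift_cm / 100.0
    left_m = math.hypot(-spk_x - listener_x, spk_y)
    right_m = math.hypot(spk_x - listener_x, spk_y)
    return (left_m - right_m) * 100.0


def phase_error_deg(freq_hz: float, path_diff_cm: float) -> float:
    """Phase difference that a path difference causes at `freq_hz`."""
    return 360.0 * freq_hz * (path_diff_cm / 100.0) / SPEED_OF_SOUND_M_S


if __name__ == "__main__":
    for angle_deg in (10, 30, 60, 90):
        diff_cm = path_difference_cm(1.0, angle_deg, 5.0)
        print(f"+/-{angle_deg:>2} deg: {diff_cm:.1f} cm, "
              f"{phase_error_deg(2000, diff_cm):.0f} deg at 2 kHz")
```

In this model a 5 cm head shift costs about 10 cm of path difference for drivers out at +/-90 degrees, versus under 2 cm for a pair at +/-10 degrees. So the widest drivers in the ring are the ones that fall out of alignment first.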

When I'd messed with Opsodis in the past, I'd done so in a car, and I also did something a bit like it on a desktop setup I had about seven years ago. But in that setup, the left and the right speakers were less than a meter from me. In my current office, that distance is more than double.

I'm not ready to give up on this, because I've tried it before, and having the speakers in a semi-circle sounds great. What I'm thinking right now is that it might be better to have the left and right speaker in their 'normal' locations (stereo triangle) and then add some midbasses to widen the stage. This would be a bit like what Geddes does with subwoofers, but instead of distributing the subs in the room, I would be overlapping a pair of midbasses on each side. This accomplishes a few things:

1) It should widen the stage. The fundamental challenge with midbass is that you can't really 'fake' where it's coming from, so if you want to widen the stage, you need to literally widen the speakers. But you CAN fake where high frequencies come from, to an extent. (See first post in thread.)

2) It should smooth out the in-room response, since the midbass is distributed. 340Hz is a meter long.

3) If I want to get really fancy, I could put the mids in an end fire or a gradient array, to reduce radiation hitting the back wall.
 

Attachments

  • Screenshot 2023-11-13 234321.png
  • Screenshot 2023-11-13 234429.png
  • Screenshot 2023-11-13 234550.png
  • Screenshot 2023-11-13 234628.png
Just to elaborate on what I wrote in post fifteen:

1) in the first attached photo, you can see that if you're even ten degrees off axis, the response just goes nuts. It's nearly unlistenable if you move your head a few inches.

2) But the vertical response is still pretty good. This is because this is basically a conventional array that's been flipped 90 degrees. So instead of being insensitive to where you are horizontally, it's insensitive to where you are vertically.
 
I created an end-fire array, using passive components.

In a conventional end-fire array, the delays are achieved using DSP delay: https://www.prosoundweb.com/the-end-fire-cardioid-subwoofer-array-made-visible/

But you can do an end fire array without DSP delay, by using increasingly higher order low pass filters. JBL does this with their CBT arrays: https://www.diyaudio.com/community/threads/passive-loudspeaker-delay.280664/post-4473934

Attached is the response of three woofers, with the bottom two low passed. The low pass filter achieves the delay required of an end fire array. The response is shown from the RIGHT of the speaker, like this:

End-Fire-Array-Diagram.jpg


The second set of attached images are the same sim, but measured from the FRONT of the same array. IE, how you would listen to it in the real world. The first is in front of the array (no need for a second image, because the response is symmetrical) and the second is the VituixCad model.
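For anyone wanting to try the passive route, here's a sketch of the two numbers you'd be trying to match: the delay each driver needs in an end-fire array (driver spacing divided by the speed of sound), and the delay a low pass filter gives you well below its corner frequency. For a Butterworth low pass of order n, that low-frequency group delay is 1 / (w0 * sin(pi / (2n))). The spacing and corner frequency below are hypothetical, not the values from my sim, and the filter delay only stays roughly constant well below the corner.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 C


def end_fire_delays_ms(n_drivers: int, spacing_m: float) -> list:
    """Delay for each driver, counting forward from the rear-most one."""
    step_ms = spacing_m / SPEED_OF_SOUND_M_S * 1000.0
    return [k * step_ms for k in range(n_drivers)]


def butterworth_lf_group_delay_ms(order: int, cutoff_hz: float) -> float:
    """Group delay of a Butterworth low pass, far below its cutoff."""
    w0 = 2 * math.pi * cutoff_hz
    return 1000.0 / (w0 * math.sin(math.pi / (2 * order)))


if __name__ == "__main__":
    # Hypothetical: three woofers 20 cm apart, filters cornered at 400 Hz.
    print("needed:", [round(d, 3) for d in end_fire_delays_ms(3, 0.20)])
    for order in (1, 2, 3, 4):
        delay_ms = butterworth_lf_group_delay_ms(order, 400.0)
        print(f"order {order}: {delay_ms:.3f} ms")
```

Keep in mind that the filters also roll off the top end of the delayed drivers, so this only behaves like a pure delay over part of the band.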
 

Attachments

  • Screenshot 2023-11-19 124312.png
  • Screenshot 2023-11-19 124411.png
  • Screenshot 2023-11-19 124319.png
  • Screenshot 2023-11-19 221702.png
basically takes a three way speaker and separates the elements horizontally
Yes, it is theoretically best for crosstalk cancellation. It is called OSD (optimal source distribution) in some papers. But widely spaced speakers are the second best option, because head shadowing at high frequencies naturally provides quite a lot of cancellation. DSP crosstalk filters should include a generic head shadowing filter to work well, especially for wider angles. I don't know of any plugin with this. I tried some of them and agree that they sound wrong.