What is the ideal directivity pattern for stereo speakers?

I was involved in the Snell RCS1000, which begat the Tact, which begat the Lyngdorf, and they all had the same issue. I sat through several Audyssey demos where they let the computer create the perfect EQ curve, had a listen, didn't like it, massaged the "house curve" and let the computer have another go at it. After 5 go-rounds of that you think "perfect automatic EQ?"
I think the fact that a "house curve" is needed at all is a tacit admission that the system is measuring the wrong thing (more or less the power response of the speakers, modified slightly by frequency-dependent room absorption coefficients) and attempting to fudge it based on preconceived notions of the likely directivity index vs frequency of a "typical" speaker, together with likely room characteristics.

If your speakers and room are "typical" (whatever that is) you might get a result that isn't horrible, but if you have speakers whose DI vs frequency is significantly different from "normal", and/or room treatment that is a lot different from normal, such assumptions fall apart.

Any time making a measurement look better sounds worse, it's a fair bet that you're measuring the wrong thing, in this case steady state response vs direct field.

Absolutely. For automatic EQ to work, the first step, measurement, must be done in a way that mimics the way hearing works. Without that the process can never be reliable. It may generally be an improvement over no EQ, but it won't always work, because we are measuring the wrong thing.
A sliding window at least gives us a glimmer of hope. It's surprising how accurate a treble measurement from roughly 3 kHz upwards can be, even as far back as near the listening position, when measured with a suitably short gate time. Since the treble region is the most problematic to measure steady-state and is where most of the "house curve" applies, that section of the spectrum could at least be measured gated.
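The gating step itself is easy enough to do in software. A minimal sketch (Python; the sample rate, direct-arrival time and 5 ms gate are made-up numbers, and `ir` merely stands in for a real measured impulse response):

```python
import numpy as np

fs = 48000                        # sample rate of the measurement, Hz
ir = np.random.randn(fs)          # stand-in for a measured impulse response (1 s long)
t_direct = 0.0029                 # arrival time of the direct sound, s (illustrative)
t_gate = 0.005                    # gate length after the direct arrival, s (illustrative)

start = int(t_direct * fs)
stop = start + int(t_gate * fs)
n = stop - start

gated = np.zeros_like(ir)
# Keep the direct sound at full level and fade to zero before the first reflection
gated[start:stop] = ir[start:stop] * np.hanning(2 * n)[n:]

spectrum = np.fft.rfft(gated)
freqs = np.fft.rfftfreq(len(gated), 1.0 / fs)
mag_db = 20 * np.log10(np.abs(spectrum) + 1e-12)

# A 5 ms gate only resolves detail down to roughly 1 / 0.005 = 200 Hz,
# so the result is only trusted from a few hundred Hz upwards.
```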

Quite how you blend in longer gate times for lower frequencies without allowing the room influence to confuse the situation I'm not sure. Perhaps sliding window measurements taken on axis at 1, 2, and 3 metres (separately per speaker) and then analysed together would allow an algorithm to have enough smarts to figure out what is room effect and what is the actual speaker's response, above the Schroeder frequency.

The closer measurements would give a longer usable reflection-free time, providing a reasonable idea of the direct signal in the upper half of the spectrum (but would be unusable at low frequencies due to floor bounce), while the further away measurements would give a better idea of the low-frequency response and allow corrections below the Schroeder frequency.
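As a very rough sketch of how the two data sets might then be blended (purely illustrative; the `close_db`/`far_db` arrays, the 300 Hz hand-over point and the one-octave crossfade are all assumptions, not a worked-out method):

```python
import numpy as np

freqs = np.linspace(10, 20000, 2000)      # shared frequency grid, Hz
close_db = np.zeros_like(freqs)           # placeholder: short-gated 1 m response
far_db = np.zeros_like(freqs)             # placeholder: longer-gated listening-position response

f_splice = 300.0                          # hand-over frequency between the two data sets, Hz
octave_width = 1.0                        # crossfade over roughly one octave

# Raised-cosine crossfade on a log-frequency axis centred on f_splice
x = np.log2(freqs / f_splice) / octave_width
w_hi = np.clip(x + 0.5, 0.0, 1.0)         # 0 below the blend region, 1 above it
w_hi = 0.5 - 0.5 * np.cos(np.pi * w_hi)   # smooth the hand-over
blended_db = (1.0 - w_hi) * far_db + w_hi * close_db
```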

Still extremely challenging for an automated system though; you're essentially trying to measure the speaker's true response (at least at mid/treble frequencies) in-situ, which is difficult (sometimes impossible) even when done manually.

The cinema analogy is the inconsistency of sound in theaters all EQed to the same X curve.
I've heard quite a few cinemas in two different countries now and no two theatres sound alike, even within the same cinema complex. I'm not talking about small differences either, but gross differences in frequency response that are obvious to the ear.

I've heard some where it's very obvious to me that there is at least a 10 dB up-tilt in response from ~300 Hz to ~10 kHz, which is very bright, harsh, and plain uncomfortable to listen to (as well as too loud). So much so that I take my iPod earphones with me just in case, so I have something to put in my ears to attenuate (but not block) the sound and bring it down below the pain threshold, especially in the presence region. :rolleyes:

Whether it's shortcomings of the X curve measurement technique, or whether they just plain don't bother to calibrate the sound (or screw it up), I don't know.

The best cinema sound I've heard is the Glasgow IMAX, which is outstanding in every way. Great low frequency extension, very even and smooth frequency response, loud, effortless and dynamic, but not uncomfortably loud, and no trace of upper midrange harshness. Sound quality that is good enough for critical music listening, let alone a movie.

Comparatively speaking, the sound of most of the non-IMAX cinemas I've heard ranges from OK to mediocre at best. Bass that is exaggerated but lacking in extension, tilted-up high frequencies, and a painful harshness in the presence region that sounds to me like strong cone breakup are not uncommon - puzzling, when I assume they're using horns of some kind, which shouldn't have that problem. Vocals that have a strong cupped-hands forwardness to them. Quite noticeable cabinet colouration in the lower midrange.

We criticize stereo recordings for not having any "standards" like movie soundtracks do, but whilst I suspect the movies themselves are made to a standard, a lot of cinemas just don't seem to be set up right. Do IMAX cinemas adhere to a different standard, or do they just take more trouble to set them up right and install good speakers ?

From what I've seen most of these products are designed by hotshot DSP guys with, unfortunately, very little knowledge of psychoacoustics.
I think the same could be extended to speaker design as well. It's become pretty clear to me that you can't design a really excellent speaker unless you have a pretty solid understanding of room acoustics, the human hearing perceptual system, and how the two interact.

Studying room acoustics and perceptual research of the sort referenced in threads like this should be required reading for any budding speaker designer. Without that understanding, how can you wisely choose the trade-offs in the design process ?

It's too easy to treat it purely as an engineering exercise to optimize certain performance parameters, without knowing which parameters really matter and which don't.

For example, if you didn't understand that we don't perceive the power response in the room as measured by a microphone, you might follow a design path that attempts to achieve flat power response, or you might be led to believe that as soon as a speaker with a non-perfect power response is placed in a room, it "must" be equalized before it will sound right. Both obviously wrong.

Or the common dilemma in midrange/tweeter crossovers - maintaining uniform polar response through the crossover region (small midrange driver and/or a low crossover frequency) versus achieving low distortion (especially tweeter IM) and better dynamic performance. (larger midrange driver and/or higher crossover frequency)

Having the pattern narrow somewhat below the crossover frequency (and a dip in power response) looks bad on paper, so it must be bad, right ? Yet you keep pointing out research that shows small dips in the power response like that are barely if at all noticeable in most circumstances.

So by using a small driver and/or low crossover frequency are we sacrificing distortion and dynamic performance for a parameter that doesn't matter that much ? Maybe...

Or how about transient-perfect, time-aligned designs. They reproduce a square wave well, so they must sound better, right ? If you could achieve it without any other sacrifices, maybe very slightly, but there is very little perceptual research to back that up. If you have to greatly sacrifice distortion and power handling, as well as accept lobing, by using 1st-order roll-offs, is it really worth it ? Hmm...

The list goes on, and affects both the microscopic design choices like higher vs lower crossover frequency, and the macroscopic choices like 2 way vs 3 way, cone and dome vs constant directivity vs full range vs line arrays and so on.

You can never optimize everything at once so you better have a really good idea of what really matters and what is less important...
 
I sat through several Audyssey demos where they let the computer create the perfect EQ curve, had a listen, didn't like it, massaged the "house curve" and let the computer have another go at it. After 5 go-rounds of that you think "perfect automatic EQ?"

I have Audyssey in my HT pre-processor. Tried to set it up a couple of times and ended up leaving it off. I didn't like what it did.

Rob:)
 
I think the fact that a "house curve" is needed at all is a tacit admission that the system is measuring the wrong thing......

Agreed, agreed, agreed...

Quite how you blend in longer gate times for lower frequencies without allowing the room influence to confuse the situation I'm not sure.

The idea is that the room should influence the perceived response measurement, at least to the degree that the room response occurs within the window length appropriate for that frequency. For low frequencies, where the effective window is long, the steady-state response is essentially used (with critical-band smoothing). There was an interesting paper by Glyn Adams some years back where he showed the build-up to the steady-state response of a typical speaker in a typical room. The general trend of the final LF curve was arrived at essentially once the first few reflections had arrived (the seven reflections of a trihedral corner); further reflections added "fuzz" to the curve but not a lot of change to its essential shape (some very heavy standing waves excepted, and the essence of the standing waves is also picked up with a longer window).
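A crude image-source sketch of that build-up (not the Adams paper itself - rigid walls, no absorption, and the room dimensions, positions and reflection orders below are arbitrary):

```python
import numpy as np

room = np.array([5.0, 4.0, 2.8])       # room dimensions, m (arbitrary)
src = np.array([1.0, 1.2, 1.0])        # source position, m
mic = np.array([3.2, 2.4, 1.2])        # listening position, m
c = 343.0                              # speed of sound, m/s
freqs = np.linspace(20, 300, 400)      # low-frequency range of interest, Hz

def response_db(max_order):
    """Sum the rigid-wall image sources up to roughly max_order reflections."""
    p = np.zeros_like(freqs, dtype=complex)
    rng = range(-max_order, max_order + 1)
    for nx in rng:
        for ny in rng:
            for nz in rng:
                n = np.array([nx, ny, nz])
                if np.sum(np.abs(n)) > max_order:
                    continue
                # Image position: even indices translate the source, odd ones mirror it
                img = np.where(n % 2 == 0, src + n * room, (n + 1) * room - src)
                r = np.linalg.norm(img - mic)
                p += np.exp(-2j * np.pi * freqs * r / c) / r
    return 20 * np.log10(np.abs(p))

early = response_db(2)    # direct sound plus the first handful of reflections: sets the trend
later = response_db(6)    # many more reflections: same broad trend, more "fuzz"
```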

David S.
 
.....at least to the degree that the room response occurs within the window length appropriate for that frequency.

ARTA does a Burst Decay measurement that shows not time, but periods at each frequency. Kinda the ultimate sliding window. Not sure it matches with what we hear in a room, but it's an interesting way to look at the response.
Would it be of any use here?
 
The idea is that the room should influence the perceived response measurement, at least to the degree that the room response occurs within the window length appropriate for that frequency. For low frequencies, where the effective window is long, the steady-state response is essentially used (with critical-band smoothing).
Yes, that's pretty much what I was thinking. I'm just a bit worried about the effects of ripple in the derived frequency response as the window length is expanded to include the first and then subsequent reflections. Some type of smoothing would definitely be needed.

I'm also wondering how high in frequency such a sliding window system would tend to try to equalize room effects in practice. Do we really want it using anything other than the direct, reflection-free response above say 500 Hz ?

Critical-band smoothing is important too. One thing I did notice with naive auto-EQ systems is that if you let them have their way equalizing each narrow (typically 1/3-octave) band as much as they like to achieve a "flat" response, the adjacent bands tend to get adjusted in a steep zig-zag pattern, which sounds really bad - presumably because you have a whole bunch of narrow-band minimum-phase peaking and notch filters trying to correct what's essentially a time-domain problem (constructive and destructive time-delayed interference).

On the other hand, if you set a maximum limit on the allowable difference between adjacent bands (say 1 dB, which you can do on the DEQ2496), i.e. the maximum slope, it prevents the EQ from trying to filter out every tiny comb-filtering-induced fluctuation and makes it settle on the broad overall trend instead. It still doesn't sound great, as it's measuring steady-state response, but it sounds much better than giving it full freedom to do what it wants.
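A sketch of what that slope limit amounts to (illustrative only; the DEQ2496's actual algorithm isn't published, and the band values below are invented):

```python
import numpy as np

def slope_limited(correction_db, max_step=1.0):
    """Limit the dB difference between adjacent bands to +/- max_step."""
    out = np.array(correction_db, dtype=float)
    for i in range(1, len(out)):                 # forward pass
        out[i] = np.clip(out[i], out[i - 1] - max_step, out[i - 1] + max_step)
    for i in range(len(out) - 2, -1, -1):        # backward pass for symmetry
        out[i] = np.clip(out[i], out[i + 1] - max_step, out[i + 1] + max_step)
    return out

raw = np.array([0.0, 6.0, -5.0, 4.0, -3.0, 1.0])   # zig-zag "flatten everything" corrections
print(slope_limited(raw))                           # the wild +/- swings shrink to 1 dB steps
```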

I think the lesson from this is that room EQ should not attempt to correct every small lump and bump, even if you can measure them more accurately, and that you should instead attempt some sort of curve fitting using the smallest number of parametric EQ filters possible, rather than a full-freedom, narrow-band graphic EQ approach. "Less is more".

There was an interesting paper by Glyn Adams some years back where he showed the build-up to the steady-state response of a typical speaker in a typical room. The general trend of the final LF curve was arrived at essentially once the first few reflections had arrived (the seven reflections of a trihedral corner); further reflections added "fuzz" to the curve but not a lot of change to its essential shape (some very heavy standing waves excepted, and the essence of the standing waves is also picked up with a longer window).
Interesting. I think the FRD consortium room response calculator spreadsheet allows you to set the maximum number of reflections for each ray, so it would be interesting to play around with that, and essentially see a simulation of the above research.
 
Or the common dilemma in midrange/tweeter crossovers - maintaining uniform polar response through the crossover region (small midrange driver and/or a low crossover frequency) versus achieving low distortion (especially tweeter IM) and better dynamic performance. (larger midrange driver and/or higher crossover frequency)

Having the pattern narrow somewhat below the crossover frequency (and a dip in power response) looks bad on paper, so it must be bad, right ? Yet you keep pointing out research that shows small dips in the power response like that are barely if at all noticeable in most circumstances.

So by using a small driver and/or low crossover frequency are we sacrificing distortion and dynamic performance for a parameter that doesn't matter that much ? Maybe...

Or how about transient-perfect, time-aligned designs. They reproduce a square wave well, so they must sound better, right ? If you could achieve it without any other sacrifices, maybe very slightly, but there is very little perceptual research to back that up. If you have to greatly sacrifice distortion and power handling, as well as accept lobing, by using 1st-order roll-offs, is it really worth it ? Hmm...

The list goes on, and affects both the microscopic design choices like higher vs lower crossover frequency, and the macroscopic choices like 2 way vs 3 way, cone and dome vs constant directivity vs full range vs line arrays and so on.

You can never optimize everything at once so you better have a really good idea of what really matters and what is less important...


I'm interested in exploring the effects of driver lobing and polar response in the crossover region. I've created a separate thread and I would like to invite anyone interested to bring their opinion there.

Large vs Small midrange
 
I'm also wondering how high in frequency such a sliding window system would tend to try to equalize room effects in practice. Do we really want it using anything other than the direct, reflection-free response above say 500 Hz ?

I was just going to mention that since the ear changes the way it processes signals above 500 Hz (or so), there should be a change in the processing of the signals as well. Basically I would lean towards an ever-widening window as the frequency falls, down to steady state (a very wide window) in the modal region. But above about 1 kHz I would think that the window should stay pretty much the same, since everything above 1 kHz is processed in the ear in the same way, with essentially the same time constants, etc. - that is, up until things begin to fall apart above about 8 kHz.

I found the sliding windows in Holm to be somewhat invalid because of the linear change in the window size with frequency. I certainly don't see that as correct. I think that it was just easy to implement, so he did it. I don't think that any advanced concepts in hearing were used in doing this, however.
 

ARTA does a Burst Decay measurement that shows not time, but periods at each frequency. Kinda the ultimate sliding window. Not sure it matches with what we hear in a room, but it's an interesting way to look at the response.
Would it be of any use here?

Burst decay is just another way to look at the CSD. Instead of time, it uses periods, which allows better resolution of the lower frequencies. Not sure if it will help here.
Page 5:
http://www.audioxpress.com/magsdirx/ax/addenda/media/dappolito2959.pdf
 
A few months ago, I began taking listening-position measurements starting with a 2 ms window to look at 10 kHz and above, then doubling the window as I looked lower and lower in frequency, ending around 500 ms in subwoofer territory (9-10 measurements), and mentally integrating the curves from top to bottom. Certainly not scientific, but I used the process to produce an audibly smooth, monotonic rolloff from bottom to top. The most pleasing result was fairly close to Dan's curve from a few pages back.
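Something along these lines, perhaps (the window lengths and band edges below are illustrative guesses at that process, and `ir` stands in for a real listening-position impulse response):

```python
import numpy as np

fs = 48000
ir = np.random.randn(fs)          # stand-in for a listening-position impulse response

windows_ms =      [2,     4,    8,    16,   32,  64,  128, 256, 500]
band_bottoms_hz = [10000, 5000, 2500, 1250, 600, 300, 150, 80,  20]   # band each window serves

freqs = np.fft.rfftfreq(len(ir), 1.0 / fs)
composite = np.full_like(freqs, np.nan)   # final "mentally integrated" curve

upper = np.inf
for win_ms, bottom in zip(windows_ms, band_bottoms_hz):
    n = int(win_ms / 1000 * fs)
    spec = np.fft.rfft(ir[:n], n=len(ir))             # truncate to the window, zero-pad back
    mag_db = 20 * np.log10(np.abs(spec) + 1e-12)
    band = (freqs >= bottom) & (freqs < upper)
    composite[band] = mag_db[band]                    # this window supplies this band only
    upper = bottom
```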
 
ARTA does a Burst Decay measurement that shows not time, but periods at each frequency. Kinda the ultimate sliding window. Not sure it matches with what we hear in a room, but it's an interesting way to look at the response.
Would it be of any use here?
That sounds promising to me. I haven't taken the time to get into ARTA (gotta fix my laptop first), but I've been meaning to for years... I'm interested in looking at the start-up and decay time of bursts in a Gaussian or "Blackman" envelope. I expect to find that decay causes any sound to seem louder, relative to what a calibrated mic and pink noise would claim, but start-up time seems important too. I believe that each will have its own psychoacoustic effects, which may well vary with frequency. It seems that we can calibrate frequency response for transients, or for long-term (steady-state) behaviour, but not necessarily both. With the right data, we could design for an educated compromise.
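Generating such bursts is straightforward; a small sketch (the envelope type, cycle count and test frequency below are arbitrary choices):

```python
import numpy as np

def shaped_burst(f0, cycles=8, fs=48000, envelope="blackman"):
    """Tone burst at f0 Hz with a Blackman or (truncated) Gaussian envelope."""
    n = int(cycles / f0 * fs)
    t = np.arange(n) / fs
    tone = np.sin(2 * np.pi * f0 * t)
    if envelope == "blackman":
        env = np.blackman(n)
    else:
        env = np.exp(-0.5 * ((np.arange(n) - n / 2) / (n / 6)) ** 2)  # +/- 3 sigma Gaussian
    return tone * env

burst = shaped_burst(1000.0, cycles=8)    # 8-cycle 1 kHz burst, ready to play back or analyse
```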
 
There are several psychoacoustic perception models available in the literature. Usually they employ a cochlear model, equal-loudness contours, forward and backward masking in the time domain, upward masking in the frequency domain, and temporal integration. I have implemented one in Octave.
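The Octave code isn't shown here, but as a very rough idea of the skeleton such a model starts from - ERB-spaced Butterworth bands standing in for a proper cochlear filterbank, half-wave rectification plus a leaky low-pass for temporal integration, and no loudness contours or masking at all:

```python
import numpy as np
from scipy.signal import butter, sosfilt

fs = 48000

def erb_centres(f_lo=50.0, f_hi=15000.0, n_bands=30):
    """Centre frequencies spaced evenly on the ERB-rate scale (Glasberg & Moore)."""
    erb = lambda f: 21.4 * np.log10(4.37e-3 * f + 1.0)
    inv = lambda e: (10.0 ** (e / 21.4) - 1.0) / 4.37e-3
    return inv(np.linspace(erb(f_lo), erb(f_hi), n_bands))

def excitation(signal, fs=fs):
    """Per-band envelopes after rectification and ~10 ms temporal integration."""
    integrator = butter(1, 16.0, fs=fs, output="sos")   # ~16 Hz envelope low-pass
    bands = []
    for fc in erb_centres():
        bw = 24.7 * (4.37e-3 * fc + 1.0)                # ERB bandwidth at fc
        lo, hi = max(fc - bw / 2, 1.0), min(fc + bw / 2, fs / 2 - 1.0)
        sos = butter(2, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfilt(sos, signal)
        env = sosfilt(integrator, np.maximum(band, 0.0))  # half-wave rectify, then integrate
        bands.append(env)
    return np.array(bands)                              # shape: (n_bands, n_samples)
```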

I wonder why nobody is using these kinds of models in loudspeaker design?

Of course the best starting point is to measure binaural impulse responses at the listening position, which captures everything relevant in the loudspeaker and room interaction.


- Elias
 
I expect to find that decay causes any sound to seem louder, relative to what a calibrated mic and pink noise would claim,
That's been my experience. E.g. a panel resonance at 225 or 450 Hz doesn't stick out on the FR, but does on the burst or waterfall. And you'll hear it. It might sound heavy in that region, even though the FR says it's not.
 
That's been my experience. E.g. a panel resonance at 225 or 450 Hz doesn't stick out on the FR, but does on the burst or waterfall. And you'll hear it. It might sound heavy in that region, even though the FR says it's not.

I've been saying it all along !

frequency response should not be used because it does not illustrate how we hear !


avoid using frequency response plots from now on because it does not take into account important aspects of our hearing mechanism

- Elias
 
Speakerdave said:
"There was an interesting paper by Glyn Adams some years back where he showed the build up to the steady state response of a typical speaker in a typical room. The general trend of the final LF curve was arrived at essentially when the first few reflections had arrived (the 7 reflections of a tri-hedral corner) further reflections added "fuzz" to the curve but not a lot of change to its essential shape (some very heavy standing waves excepted, the essence of the standing waves is allso picked up with a longer window). "

Interesting. I think the FRD consortium room response calculator spreadsheet allows you to set the maximum number of reflections for each ray, so it would be interesting to play around with that, and essentially see a simulation of the above research.


He was using an image model and adding up the images from the source out to a variety of radii, essentially what you are talking about.

It all makes sense if you think about it. Every subsequent image is more delayed in time and will add some amount of comb-filtered addition. The later the reflection, the higher its "quefrency". Broad trends have to come from early reflections; later reflections can only add fine detail, or "fuzz", to the curve.
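To put numbers on that (an illustration only): a single reflection delayed by tau produces comb-filter ripple with notches spaced 1/tau apart, so a 2 ms reflection shapes the curve in 500 Hz-wide swings while a 20 ms reflection only adds 50 Hz-spaced "fuzz".

```python
import numpy as np

freqs = np.linspace(20, 2000, 4000)

def one_reflection_db(tau_ms, gain=0.5):
    """Magnitude response of the direct sound plus one delayed, attenuated copy."""
    tau = tau_ms / 1000.0
    h = 1.0 + gain * np.exp(-2j * np.pi * freqs * tau)
    return 20 * np.log10(np.abs(h))

early = one_reflection_db(2.0)    # 2 ms delay: notches every 500 Hz (broad shaping)
late = one_reflection_db(20.0)    # 20 ms delay: notches every 50 Hz (fine "fuzz")
```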

David S.
 
There are several psychoacoustic perception models available in the literature. Usually they employ a cochlear model, equal-loudness contours, forward and backward masking in the time domain, upward masking in the frequency domain, and temporal integration. I have implemented one in Octave.

I wonder why nobody is using these kinds of models in loudspeaker design?

Of course the best starting point is to measure binaural impulse responses at the listening position, which captures everything relevant in the loudspeaker and room interaction.
- Elias
I'd bet it's because the various test methods aren't explained well enough for people who aren't very good at higher math and maybe aren't familiar with the acronyms, or with acoustics in general.

The trouble with using a binaural mic is that there's no brain to combine the two signals, and our ideas of how to do that may not be the same as how the brain does it.
 
I do think the boys at Harman know a thing or two about psychoacoustics. In fact they seem to have written the book. I would rather just have an auto EQ for below a couple of hundred hertz, but some folks with jacked-up speakers might be better off with the current software. With well-designed speakers it may not be doing too much anyway above 500 Hz.

The first time I ran this without much room treatment (just HF foam on the front wall behind the TV) it looked like this:
hkezeqoff.jpg

Afterwards, with EZEQ:
hkeqavg.jpg


Treble always varies a lot with slight changes in mic position, and I wouldn't be surprised if those changes were largely that. Maybe I'm wrong, but I've often had this sort of variation in the treble with these sorts of averaged measurements, though not always.

You've seen the graph with all the 14 inch foam added to the front wall, the foam coffee table and more optimal speaker positioning, but I'll show it again.

Ran the EZEQ again after adding foam:
hkezeqfinal.jpg

Then removed the foam coffee table w/o running the EQ again:
htwofoam.jpg


I wish I had several mics so I could measure this all at once. That would eliminate a lot of guessing.

Looks like acoustic treatment is still very effective in combo.

Dan
 
Basically I would lean towards an ever-widening window as the frequency falls, down to steady state (a very wide window) in the modal region. But above about 1 kHz I would think that the window should stay pretty much the same, since everything above 1 kHz is processed in the ear in the same way, with essentially the same time constants, etc. - that is, up until things begin to fall apart above about 8 kHz.
What falls apart above 8 kHz that couldn't be adequately measured with a reflection-free windowed measurement ? Even at the sitting position in most rooms there is enough delay to the first reflection to get an accurate measurement of the high treble.
I found the sliding windows in Holm to be somewhat invalid because of the linear change in the window size with frequency. I certainly don't see that as correct. I think that it was just easy to implement, so he did it. I don't think that any advanced concepts in hearing were used in doing this, however.
I agree completely; that's basically what I said earlier.

I think the approach of a sliding window where lower frequencies are derived from a longer window time and higher frequencies from a shorter time is the right approach, but that the ideal relationship between window time and frequency range is definitely not linear, and would be more in line with what you said at the top.

We know the curve will probably be monotonic, will include several hundred milliseconds at low bass frequencies, and will be on the order of just a couple of milliseconds at treble frequencies, but exactly what happens in between is, I think, yet to be determined, as is whether the window time reaches a "flat" section above a certain frequency, where it no longer varies.
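As a purely speculative example of a curve that meets those constraints - a fixed number of periods per window, clamped at both ends; the numbers are guesses, not research:

```python
import numpy as np

def window_ms(f_hz, cycles=8.0, t_min_ms=2.0, t_max_ms=400.0):
    """Window length proportional to the period at f_hz, clamped at both ends."""
    t = 1000.0 * cycles / np.asarray(f_hz, dtype=float)
    return np.clip(t, t_min_ms, t_max_ms)

for f in (30, 100, 300, 1000, 4000, 10000):
    print(f, "Hz ->", round(float(window_ms(f)), 1), "ms")
# 30 Hz -> 266.7 ms, 1 kHz -> 8 ms, clamped at the 2 ms floor from 4 kHz upward
```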

Deriving the curve that describes the ideal window time for each frequency band is something that is going to have to be painstakingly done by acoustic researchers, if it hasn't already been done yet. The references Dave has quoted before may have the data already, although whether it is of sufficient detail and accuracy to directly define a curve that could then be applied to a DSP algorithm I don't know.

Finding the right curve to most accurately mimic our perception is key to success or failure of this approach.
 
I do think the boys at Harman know a thing or two about psychoacoustics. In fact they seem to have written the book. I would rather just have an auto EQ for below a couple of hundred hertz, but some folks with jacked-up speakers might be better off with the current software. With well-designed speakers it may not be doing too much anyway above 500 Hz.
Yet your before and after measurements below show large changes in the midrange and treble balance ?
The first time I ran this without much room treatment (just HF foam on the front wall behind the TV) it looked like this:
hkezeqoff.jpg

Afterwards, with EZEQ:
hkeqavg.jpg


Treble always varies a lot with slight changes in mic position, and I wouldn't be surprised if those changes were largely that. Maybe I'm wrong, but I've often had this sort of variation in the treble with these sorts of averaged measurements, though not always.
Thanks for making my point for me :)

This is another reason why steady-state room measurements are so bogus at high frequencies, especially treble. Not only do they not directly map to the on-axis response of an arbitrary speaker, they change dramatically with just a few inches of microphone movement, one reason why Auto-EQ measurements tend to be so hit and miss - put the microphone in a slightly different place, get a different EQ result!

To get repeatable results you need to measure in a way that is repeatable, and apart from the whole on-axis vs power response issue, steady-state results never are repeatable within a room because the tiniest microphone or room object movement changes the results unpredictably.

Windowed measurements, though, are extremely reliable. I've taken 1 metre measurements of my speakers in-situ before (only accurate down to about 1 kHz due to side-wall reflections), then come back days later using only a measuring tape to get the distance from the microphone to the floor and front panel the same, judging only by eye that the microphone was square on to the speaker, and got measurement results that track from 1 kHz to over 15 kHz so closely that the lines perfectly overlay in overlay mode (within a tiny fraction of a dB).
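For what it's worth, that ~1 kHz limit falls straight out of the geometry: the usable gate ends when the first reflection arrives, and the lowest trustworthy frequency is roughly the reciprocal of the gate length. A sketch with example distances (not the actual room):

```python
import numpy as np

c = 343.0          # speed of sound, m/s
d_direct = 1.0     # microphone distance, m
d_side = 0.4       # distance from the speaker/mic line to the nearest side wall, m (example)

# Mirror the speaker in the side wall to get the reflected path length
reflected_path = np.hypot(d_direct, 2.0 * d_side)
gate_s = (reflected_path - d_direct) / c           # usable reflection-free time
print(round(gate_s * 1000, 2), "ms gate ->", round(1.0 / gate_s), "Hz lower limit")
# With the wall 0.4 m away the gate is only ~0.8 ms, giving a lower limit around 1.2 kHz,
# consistent with the ~1 kHz figure above; a more distant wall lengthens the gate.
```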

The reality is I could be out by a couple of inches or more and it would still be within a fraction of a dB, and any change would be an actual change in the response of the speaker at that point in space rather than a room effect.

If I were to do that with a steady-state response (even at that close distance, let alone further back) I know that I'd never get the same measurement result twice once the microphone had been moved, even if I attempted to measure its position. (Been there, done that...)

It's hard to judge on a steady-state measurement, but the above change seems to have made the bass response better and everything else worse. How does it sound ?

What I'd really rather see is a 1 metre, on-axis, reflection-free windowed measurement taken before and after the Auto-EQ system has done its thing, to see whether it's making the on-axis response better or worse in the midrange and treble. My guess looking at the above is that it is making it worse.
You've seen the graph with all the 14 inch foam added to the front wall, the foam coffee table and more optimal speaker positioning, but I'll show it again.

Ran the EZEQ again after adding foam:
hkezeqfinal.jpg

Then removed the foam coffee table w/o running the EQ again:
htwofoam.jpg
But is that change audible ? Or is it an example of the steady-state response being altered by room object changes at frequencies where the direct response dominates our perception ? How did it sound ?
I wish I had several mics so I could measure this all at once. That would eliminate a lot of guessing.
Multiple microphone stands ? ;) Or just don't put much stock in unreliable steady-state room responses above ~300 Hz.
 
...
I found the sliding windows in Holm to be somewhat invalid because of the linear change in the window size with frequency. I certainly don't see that as correct. I think that it was just easy to implement, so he did it. I don't think that any advanced concepts in hearing were used in doing this, however.


Hello all,

could anyone point me to some data concerning tone duration vs. loudness impression ?

And how would this relate to "aural integration time" ?

All I have now is this:

Physikalische und psychoakustische ... - Juan G. Roederer - Google Bücher

(see Figure 3.16) which is a bit sparse ...


Furthermore, when moving towards steady state at low frequencies, wouldn't we have to talk about reverberation time vs. frequency in our acoustically small rooms as well ?

I mean: what is it commonly, and what should it be ?

As mentioned earlier in this thread, the directivity pattern to be preferred can hardly be discussed without taking the reverberation characteristics of the room into account. But now I would like to focus - for a moment - on preferred RT vs. frequency in a room suitable for "enjoyable listening", and maybe for "critical listening" also ...
 