A convolution based alternative to electrical loudspeaker correction networks

Unfortunately I don't have a calibration file for the mic. I have some cal files lying around (it's an ECM8000), but I figured that, given production variance, it was better not to use any of them. I plan to buy a Dayton, which comes already calibrated, sometime in the future. I suppose that even without calibration the result isn't that bad, because the difference before and after correction is like night and day and obviously better, and the FR doesn't sound skewed. But obviously a calibrated mic will make it perfect!

Getting a calibrated mic is quite important, as anything affordable tends to differ from 5 kHz upwards by enough to make a calibrated mic worthwhile.

This is the one I have, and an individual calibration can be downloaded from their website. Very good, cheap, and available from Thomann in Europe.

Sonarworks XREF 20 Mic – Thomann UK

Preferring a different slope at high frequencies could well be due to the mic calibration as well as personal taste. Calibration is an easy way to remove that as a variable :)
 

Thanks! Oh, I know Sonarworks; I used to use their software for headphones. The mic seems very similar to the ECM8000 though, maybe it's the same mic rebranded? Anyway, I think I will order one sooner rather than later, if only to eliminate one variable from the equation!
 
Mindscapes,

1) The Bark resolution settings will lead to less precision than ERB, but I like how it's nearly impossible to overly boost the low freq range with smaller speakers even without the peak limiting stage. My goal was a configuration that would work in a variety of situations without too much fuss. Also, because modal range dips are almost completely ignored, more amp/speaker headroom will be preserved.

2) 19000/19 samples at 44.1kHz defines a 4.3 cycle (1/6 oct) FDW. The neat thing about the custom configuration is that you can adjust the prefilter window size and EQ resolution independently.

3) The MP FDW stage of the 2nd (EQ) stage determines the effectiveness of the impulse response inversion stage. Since the point of this configuration is to *only* use the PT stage, MP FDW resolution is set low enough to practically remove the effect of the inversion stage. The PT stage is fed the "corrected" (post inversion stage) unwindowed response. By starting out with a pre filtered response, and "bypassing" the inversion stage, we allow the PT stage to do nearly all the work.

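4) 1 cycle at 20 Hz / 20 kHz lasts 50 ms / 0.05 ms
1 ms at a 44.1 kHz sample rate is 44.1 samples
50 * 4.3 * 44.1 * 2 = 18,963 samples (lower window)
0.05 * 4.3 * 44.1 * 2 = 18.963, i.e. 19 samples (upper window)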

The 2X multiplication is because we're defining a symmetrical window.
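For anyone who wants to redo the arithmetic above for a different cycle count or sample rate, here is a minimal Python sketch. The function name and its defaults are made up for illustration; they are not part of DRC.

```python
def fdw_window_samples(freq_hz, cycles, sample_rate_hz=44100, symmetric=True):
    """Window length in samples that spans `cycles` periods of `freq_hz`.

    Hypothetical helper, not a DRC function. With symmetric=True the
    length is doubled, matching the 2X factor for a symmetrical window.
    """
    period_ms = 1000.0 / freq_hz              # 50 ms at 20 Hz, 0.05 ms at 20 kHz
    samples_per_ms = sample_rate_hz / 1000.0  # 44.1 samples per ms at 44.1 kHz
    length = period_ms * cycles * samples_per_ms
    if symmetric:
        length *= 2
    return length


lower = fdw_window_samples(20, 4.3)      # low-frequency end of the window
upper = fdw_window_samples(20000, 4.3)   # high-frequency end of the window
print(round(lower), round(upper))        # prints: 18963 19
```

Changing `cycles` or `sample_rate_hz` gives the matching pair of window lengths for other resolutions or sample rates.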

5) The character of artifacts varies by cause. With my custom configuration there will be no pre-echo, so the thing to look out for would be low freq resonance or "booming" away from the measurement location when stronger EQ is used. Speakers positioned close to room boundaries could push this resonance up towards the midrange. With Bark resolution filtering and speakers positioned ~1 meter from walls, there should be very little chance of problematic artifacts.


fluid makes a good point about the importance of mic calibration and how the lack thereof might contribute to a wider range of preferred targets. I got a Dayton mic from Cross*Spectrum Labs with 0/45/90 deg files. Also, as wesayso suggested, fine tuning of the target response can provide the "finishing touch". You may find that changing filter resolution can affect target choice...
 
Thanks for the explanation! Things start to get clearer now.


I do, in fact, have my speakers positioned against the front wall, and my listening position is 50 cm from the back wall. The speakers also fire along the short wall, and the whole system sits on the left side of the room rather than centered, with a window on the left wall lol. Oh, and it isn't even at ear height but higher.

Not much I can do about it though, it's a bedroom. I was almost tempted to move everything, and I tried the speakers firing along the long walls, which also gave a more symmetric configuration with respect to the side walls, but it sounded worse. I'm not sure if that was because of the window behind the speakers in that placement.

I have an absorber behind the speakers, one behind me, another non-permanent one on the window, a cloud on the ceiling between me and the speakers, and another two big panels in the corners.

With the treatment and DRC I fixed the skewed stereo image, which was the thing that bothered me the most. Now I get a perceived 5-6 meter wide soundstage with depth. If I close my eyes I don't feel the space of the room but that of the recording. So I'm pretty happy with the results I obtained. It doesn't show in REW plots, but I have heard speakers at 3 times the cost sounding worse.

This is due to DSP crosstalk reduction. I don't use the classical stereo triangle. Changing from stereo reproduction to ambiophonics was the cheapest and best sound upgrade I have ever had. Give it a try, it's all explained here:



Understanding and Installing an Ambiophonic System .
Part 1: Problems with Stereo Reproduction and How to Fix Them


Anyway, all this to say that, due to placement and my unusual setup, I think the best settings for my system will be pretty different from the norm. Your starting point, however, is a lot better than the filters that come with the program. After some more testing of the standard psycho filter, I think it's better than the 19000 version: at the sweet spot it's a tie, but off axis your standard one is much better!
 
At the moment I have made the following modifications to the psycho config file:
# MP = Minimum phase frequency dependent windowing (Prefilter)
MPPrefilterType = S
MPPrefilterFctn = B
MPWindowGap = 19
MPLowerWindow = 18963
MPUpperWindow = 19

MPStartFreq = 20
MPEndFreq = 20000
MPFilterLen = 65536
MPFSharpness = 1.0
MPBandSplit = 3
MPWindowExponent = 0.9
MPHDRecover = Y
MPEPPreserve = Y
MPHDMultExponent = 3
MPPFFinalWindow = 65536
MPPFNormFactor = 0.0
MPPFNormType = E
# MPPFOutFile = rmppf.pcm
MPPFOutFileType = F

# PT = Psychoacoustic target stage (second filter)
PTType = M
PTReferenceWindow = 37926
PTDLType = M
PTDLMinGain = 0.1 # -20.0 dB Min
PTDLStartFreq = 20
PTDLEndFreq = 20000
PTDLStart = 0.75
PTDLMultExponent = 3
PTBandWidth = 0.125
PTPeakDetectionStrength = 15

PTMultExponent = 0
PTFilterLen = 65536
PTFilterFile = rptf.pcm
PTFilterFileType = F
PTNormFactor = 0.0
PTNormType = E
PTOutFile = rpt.pcm
PTOutFileType = F
PTOutWindow = 0

This should give more or less 5 cycles at the ends of the window range and around 3 at mid frequencies. I've also played with the MPFSharpness parameter and found that the more I decrease it, the more I lose that holographic quality of the sound. The same happens if I increase the MP windows without lowering the exponent. So I guess I have to leave the mids alone as much as possible.
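The cycle counts mentioned above can be estimated with a short Python sketch. It assumes the window length follows W(F) = 1 / (A * (F + Q)^WE), with A and Q chosen so the curve passes through the lower and upper window settings; that is how I recall the DRC documentation describing it, so treat the formula as an assumption and check it against the docs. The function names are invented for this example.

```python
def assumed_window_law(lower_window, upper_window, start_freq, end_freq, exponent):
    """Return a function mapping frequency (Hz) to window length (samples).

    ASSUMPTION: W(F) = 1 / (A * (F + Q) ** WE), fitted so that
    W(start_freq) = lower_window and W(end_freq) = upper_window.
    Frequencies are kept in Hz here; normalizing them would only
    rescale A and Q.
    """
    r = (lower_window / upper_window) ** (1.0 / exponent)
    q = (end_freq - r * start_freq) / (r - 1.0)
    a = 1.0 / (lower_window * (start_freq + q) ** exponent)

    def window(freq_hz):
        return 1.0 / (a * (freq_hz + q) ** exponent)

    return window


def cycles(window_samples, freq_hz, sample_rate_hz=44100):
    """Cycles of freq_hz covered by one half of a symmetric window."""
    return window_samples * freq_hz / (2.0 * sample_rate_hz)


# Values taken from the config above: 18963/19 samples, 20 Hz to 20 kHz.
for exponent in (1.0, 0.95, 0.9):
    law = assumed_window_law(18963, 19, 20, 20000, exponent)
    print(exponent, [round(cycles(law(f), f), 2) for f in (20, 1000, 20000)])
```

Under this assumed law, an exponent of 1.0 keeps the cycle count roughly constant across the band, while 0.9 shortens the midrange window to a little over 3 cycles at 1 kHz with the same endpoints.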

However, I don't know if I really understand what the sharpness does. Intuitively, lower values should make the window's transition in time more gradual, but isn't this taken care of by the exponent? What's the difference between the two parameters?

I'm also pretty puzzled by the measurement results. I've looked at the predictions, and both the step response and the phase linearity in REW (looking at 1 ms, 3 ms, and FDW windowing) look way worse with the psycho config. While any of the standard files does a good job of aligning the tweeter and mids, the psycho config does nothing for that, even if I increase all the windowing parameters to very high values. I don't understand why. It sounds much better in any case, and this made me question the utility of a good step response graph.
 
Windowing Exponent allows you to increase or decrease the length of the window in the midrange without affecting the frequency extremes, while sharpness controls the shape of the window (think Blackman, Hann etc...).

RE your edits: PTBandWidth = 0.125 means 1/8 oct EQ resolution, which is very high (low numeric value). There's probably no benefit in going beyond the prefilter resolution, which in this case means no lower than 0.17 (1/6 oct)...

The Psycho config is only correcting the minimum phase portion of the response; it's not going to help with driver time alignment. For a more meaningful before/after picture, convolve the prefiltered response with the correction filter, and compare that to the uncorrected prefiltered response; you will see the step response improvement.
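The before/after comparison described above can be illustrated with a toy Python sketch: convolve a response with a correction filter, then integrate both versions to get their step responses. The data is synthetic (a simple decaying response and its exact inverse) and the helper names are made up; with real measurements you would load the prefiltered impulse and the DRC filter instead, or do the same operation in REW.

```python
def convolve(a, b):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def step_response(impulse):
    """Running sum of an impulse response, i.e. its step response."""
    total = 0.0
    step = []
    for sample in impulse:
        total += sample
        step.append(total)
    return step


# Synthetic stand-in for a prefiltered response: y[n] = x[n] + 0.5 * y[n-1],
# truncated to 16 samples.
prefiltered = [0.5 ** n for n in range(16)]
# Synthetic stand-in for the correction filter: the exact FIR inverse.
correction = [1.0, -0.5]

corrected = convolve(prefiltered, correction)

print("uncorrected step:", [round(v, 3) for v in step_response(prefiltered)[:6]])
print("corrected step:  ", [round(v, 3) for v in step_response(corrected)[:6]])
```

The uncorrected step creeps up towards 2, while the corrected one jumps to 1 and stays there, which is the kind of improvement that only shows up when the filter is applied to the same response it was computed from.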
 

Thanks! I'm talking about convolved results indeed! Take a look. It's not the best step in any of the three cases, but you can see that the minimal filter does correct the mid and high alignment, while the psycho remains pretty similar to the uncorrected impulse:

[Attached images: JfHxaI.jpg, JfHqCv.jpg, JfHBGR.jpg, JfHC4p.jpg, JfHo3N.jpg, JfHmts.jpg, JfHbNn.jpg]


Here I show the filter impulses and their phase correction, and after that the step and impulse of the test convolutions.
As you can see, the minimal filter corrects 360 degrees more rotation than the psycho filter, and I suspect this is what aligns the step. If I apply a 1 ms Blackman window to the uncorrected impulse and the two corrected ones, the minimal filter gives a straight line, while the psycho one shows 580 degrees of rotation from 0 at 20000 Hz. So it seems it doesn't fully linearize the direct sound. Is there a way to increase the phase flattening in the config, specifically for higher frequencies? In the low end the phase correction seems pretty similar. I've tried longer windows and it didn't help. Could this be because of some other parameter I have to play with?


Or, worded differently: what does the standard config do that the psycho doesn't to align the drivers? And more importantly, does it matter, and should I even bother, given that it sounds better anyway?
 
Mindscapes,

Here is your prefiltered step response before and after convolution with the (1/3 oct) correction filter.


Oh, I missed the part about the prefiltered impulse. I was convolving it with the original impulse, my bad!

So basically what I'm seeing in my graph is the room wreaking havoc, and DRC with the standard config making the graph prettier is in reality overcorrecting everything. That's why it sounded worse! All clear now :cool::cool:
 
I think that is a bit of an oversimplification. gmad's pre-filtering does a good job of removing the room contribution. The correction will work on the measurement it is fed. If this is not right, the correction won't be an improvement. Getting the right measurement to feed in does take some trial and error.

To make another comparison, try the prefilter with the minimal to see what the differences are.

The mid frequency correction window length is very important and the window exponent parameter is very sensitive; there is a big difference between 0.9 and 0.95, even if the value does not seem to have changed very much. By putting a pause at the end of the batch script you can scroll back through the command output to see what window lengths were used based on the input parameters.
 

You misunderstood my comment: the graphs I posted were obtained by convolving the filter with my basic response.
I understand that the prefilter uses frequency dependent windowing to remove the room and correct basically only the speaker.
My comment referred to the fact that it hadn't occurred to me that, to see the real effects, all I had to do was convolve the filter with the prefiltered impulse.
From the difference between the step I posted (obtained in REW as basic response*final filter) and the graph gmad posted (the same, but with prefiltered response*final filter), I understood that the room has a big impact; otherwise the graphs obtained with the basic response would be more similar to the ones obtained with the prefiltered one. The better your speakers are positioned and the better the room is treated, the more similar the graphs should be, even if in practice they will never be exactly the same unless you are in an anechoic room.
It was pretty obvious in hindsight, given that I had already opened the prefiltered response in REW several times to see how it changed when I modified something in the file, but I didn't think of it before gmad suggested it.
I had also already tried what you suggest, convolving the minimal with the prefiltered, and it confirms this: the minimal corrects more for the room, and applied to the prefiltered response it overcorrects the FR and gives worse results in the step. (It's not a completely fair comparison, because the minimal doesn't start from a prefiltered response, so it's guaranteed to overcorrect the windowed one. Still, I think that to compare more meaningfully you should compare prefiltered*final filter with basic impulse*minimal filter, the latter windowed in REW with an FDW of similar resolution to the one used in the prefilter.)
 
I understood your comment; I think you have completely missed mine though.

I was responding to this: "So basically what i'm seeing in my graph is the room wrecking havock, and drc with the standard config making the graph prettier is in reality overcorrecting everything, that's why it sounded worse! All clear now"

And to the oversimplification that, because a stronger correction did not sound better to you, it must have been overcorrected. The point I was trying to make is that you did not feed DRC the same input when processing, and convolving the prefiltered with the minimal was not what I was suggesting at all.

I was suggesting that you use the prefiltered response as the input impulse to the minimal configuration so that the stronger correction was working on an input that was closer to speaker than room.

Comparing these will give you a better insight as to whether you actually like the psycho processing more than the minimal when both process the same measurement.
 
Yeah, it seems it was I who misunderstood what you suggested. I will try it for fun, but I think the original config files are meant to work on the raw measured impulse. Their windows and configuration are overkill on an already filtered impulse.

It's not that DRC overcorrects. In fact, with the minimal I think it's doing what it is supposed to do: correct the room, as the name of the software implies, or at least this is what I understand was the original intention of its creator. It overcorrects in the sense that such a specific correction has a very narrow sweet spot, but it works there for sure. It's not that it gave bad results compared to listening without any correction, but it seems to me it's a different philosophy entirely compared to what gmad is doing with this procedure. So it doesn't make a lot of sense to feed the prefiltered impulse to settings that are optimized for another thing, correcting the room instead of the speakers.

It makes more sense to me to use the minimal settings as they are meant to be used and then window the results in REW, saving the impulse with the FDW applied and reimporting it. Then you can run a wavelet and see what the two different philosophies are doing to the direct sound and compare them objectively.

But who knows, maybe it will sound good. I will give it a try!

EDIT: Here's the comparison between uncorrected, minimal, and psycho that I suggested. Uncorrected and minimal are windowed to 1/8 octave instead, to account for the lower resolution of REW compared to DRC.



 
This is my point: DRC works on the measurement that is fed into it. If measurements taken from different parts of your room vary significantly, then any correction will be quite position dependent.

The more you make the correction about the speaker and the less about the room, the larger the area over which the sound is improved.

Which approach is better depends on how you want to listen to your system. If you always sit in the same chair, it could be argued that a single point correction might be preferred. If you want to cover a whole couch or more with good sound, then an averaged measurement or a filtered measurement, as gmad has done, will be a good start.
 
Sorry if this has been covered before, I did not read the whole 60+ pages.

Is it possible to modify the scripts/DRC files to apply them to 48 kHz measurements?

I already have empty room measurements done at 48 kHz, and no way to repeat them at 44.1 kHz, as this is our living room with some heavy furniture in it now.
 
The fact that there is heavy furniture in there now means that you actually do need to retake the measurements for them to be valid to use with DRC. A bare room response is not what you want.

A basic way to make the process work is to change BCSampleRate to 48000 in the drc file associated with your script. As long as the impulse response centre is set to A (Auto) it will work, but the window lengths will not be the same due to the sample rate change.

So gmad is right, resampling is a simpler way to get it exactly the same, but the above is a quick and dirty way too.
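To put a number on the window-length point, here is a rough Python sketch of what happens to the prefilter windows from the psycho config posted earlier (18963 and 19 samples) when only the sample rate changes. It is plain arithmetic with made-up helper names; I have not checked which other parameters in a DRC file are expressed in samples, so take it as an illustration rather than a conversion recipe.

```python
def window_ms(samples, rate_hz):
    """Duration in milliseconds of a window given in samples."""
    return 1000.0 * samples / rate_hz


def rescale_window(samples, old_rate_hz=44100, new_rate_hz=48000):
    """Sample count that keeps the same duration at the new rate."""
    return round(samples * new_rate_hz / old_rate_hz)


for name, samples in (("MPLowerWindow", 18963), ("MPUpperWindow", 19)):
    print(
        name,
        "at 44.1 kHz:", round(window_ms(samples, 44100), 3), "ms;",
        "unchanged at 48 kHz:", round(window_ms(samples, 48000), 3), "ms;",
        "rescaled:", rescale_window(samples), "samples",
    )
```

Left unchanged, the 18963-sample window shrinks from 430 ms to about 395 ms at 48 kHz; keeping the original duration would need 20640 samples for the lower window and 21 for the upper one.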
 
Thanks for the reply. If I remember correctly some other DRC tools recommend minimising the furniture reflections as much as possible. I think Dirac tells you so.

Resampling sounds reasonable.