@mdsimon2 : Lately I have been stress-testing the weak RK3308 and its capabilities, using CDSP as a bridge + optionally async resampling between the UAC2 gadget and the SoC I2S interface(s). That SoC has a driver which supports merging two I2S interfaces to produce 16ch out + 10ch in, or any combination up to 26 channels.
Since the I2S hardware has fixed maximum buffers (512kB per I2S interface, IIRC), at 16ch out at 384kHz the maximum buffer size in frames ends up quite small. I used CDSP 2 with 4 chunks per buffer and 8 periods per buffer. The target level was set at 75% of the buffer size, i.e. 3 chunksizes. I increased the base adjustment step tenfold to 1e-4
https://github.com/HEnquist/camilla...45df8bfba6e22cc1/src/alsadevice_utils.rs#L233 , and dropped the adjustment period down to 2 seconds, to make sure the regulation kicks in faster and is steeper.
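For illustration, the buffer arithmetic works out roughly like this (a hypothetical sketch assuming 4-byte S32_LE samples; the actual driver constraints may differ):

```rust
// Rough buffer-size arithmetic for the setup above, assuming 32-bit
// (4-byte) samples per channel; actual RK3308 driver limits may differ.
fn main() {
    let fifo_bytes = 512 * 1024;     // 512 kB per I2S interface (IIRC)
    let channels = 16;
    let bytes_per_sample = 4;        // assumed S32_LE
    let rate = 384_000;

    let buffer_frames = fifo_bytes / (channels * bytes_per_sample);
    let buffer_ms = buffer_frames as f64 * 1000.0 / rate as f64;
    let chunksize = buffer_frames / 4;   // 4 chunks per buffer
    let target_level = 3 * chunksize;    // target at 75% = 3 chunksizes

    println!("buffer: {} frames (~{:.1} ms)", buffer_frames, buffer_ms);
    println!("chunksize: {} frames, target level: {} frames",
             chunksize, target_level);
}
```

So at 384kHz the whole 512kB buffer holds only about 21 ms of 16ch audio - that is why the buffer in frames ends up so small.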
With this setup the system was able to cleanly play back and capture, e.g. (IIRC) USB host -> gadget -> CDSP1 compiled with float32 format, sync-resampling 16ch from 48kHz to 384kHz -> 16ch I2S -> 10ch I2S -> CDSP2 with rate adjust to playback, using 5 channels only (isochronous packet limit) -> gadget -> host. No xruns, no issues on a sox spectrogram for 20 minutes. Amazingly stable performance from CDSP (and that SoC too). All four cores were quite loaded, yet no timing issues. CDSP1, doing the resampling, was reniced to -19. I did not play with any RT priorities or core assignments.
Of course the buffer level deviates, but I never got any wild fluctuations. Also, if your processing thread takes time and your CPU load varies (other processing running in the background), the processed chunks get delivered to the playback with varying delay, which makes the buffer level fluctuate more.
What were your buffer size and target level? Deviations of hundreds of samples seem safe if your buffer is several thousand and the target level is above half of it. Using the whole buffer is no problem; in fact you want the buffer as full as possible - larger safety margin. There can be no buffer overflow on playback, writing the samples will simply wait. Underflow is the problem on playback, unlike on capture where overflow is the problem.
This is only anecdotal, but it usually seems like what happens is that the buffer level exceeds my target and the rate adjust continues adjusting down (usually only going down to 0.9997), but the buffer level keeps increasing. Eventually there will be a sequence of large drops in buffer level until I am hundreds of samples below my target level. It is almost like the feedback is delayed.
Look at the algorithm, it's quite simple
https://github.com/HEnquist/camilla...45df8bfba6e22cc1/src/alsadevice_utils.rs#L221 . At start it does not know the samplerate ratio (the capture_speed). If the delay difference (in samples) is outside of some "equality" range, the capture speed is increased (resp. decreased) by 3 speed_delta steps. If the delay difference is inside this equality range, the adjustment drops to one speed_delta step, in the resp. direction.
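A minimal sketch of that stepping logic (not the actual CamillaDSP code; the function name and sign convention here are my assumption, following the linked source only loosely):

```rust
// Hypothetical sketch of the stepped rate adjustment described above.
// `diff` is the current buffer level minus the target level, in samples.
fn adjust_capture_speed(capture_speed: f64,
                        diff: f64,
                        equality_range: f64,
                        speed_delta: f64) -> f64 {
    if diff.abs() > equality_range {
        // far from target: take 3 steps towards it
        capture_speed - 3.0 * speed_delta * diff.signum()
    } else {
        // inside the "equality" range: fine-tune with a single step
        capture_speed - speed_delta * diff.signum()
    }
}

fn main() {
    // buffer 300 samples above target, outside a 256-sample range:
    // capture is too fast, so the speed takes 3 steps down
    let s = adjust_capture_speed(1.0, 300.0, 256.0, 1e-4);
    println!("new capture speed: {}", s);
}
```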
The buffer level keeps increasing if the capture speed is still too fast. But the capture speed is lowered every cycle and eventually becomes too slow, causing adjustment in the other direction.
It takes some cycles to reach and settle around the real capture speed, therefore it may be useful to decrease the cycle period (adjustment time). The exact target level will never be reached stably - too many timing factors in the chain; look at that linked chart to see how complicated the timing is. But as long as the buffer level does not drop below some reasonable threshold (e.g. I would not be OK with some 20% of the buffer - hence my target at 75%), IMO no need to worry.
EDIT: corrected async -> sync resampling in my setup description (no need for async).