Today's post is a bit of a continuation from last time's look at the different types of upsampling antialiasing playback filters using my Raspberry Pi 3 "Touch", piCorePlayer and SoX. As you can see from that discussion, across the audiophile equipment spectrum, manufacturers utilize all kinds of digital filter settings in their gear. Each company ends up choosing compromises between how much frequency roll-off, how much aliasing, how much temporal/phasic anomaly each would accept. And of course no matter what a company chooses, there are ways of advertising the decision as "good"; whether it be on the basis of frequency spectral accuracy, temporal accuracy, or just claims from pure subjectivity - "it just sounds better"!
The end goal of audiophilia is a bit like the modern interpretation of Goldilocks (and the Three Bears)... We're all trying to figure out for ourselves what is "just right" as we wade through the commercial and mainstream audiophile literature, unofficial blogs and forums, mix-and-match speakers with amplifiers, try out different accessories perhaps, and the like. So too it seems with digital filters and all the variants attached to the DACs we buy.
Remember that the only reason we're even talking about this is because of that 44.1kHz (and to a lesser degree 48kHz) samplerate such that the Nyquist frequency is at 22.05kHz; relatively close to the usual 20kHz upper limit of hearing acuity that the younger ones among us might be able to perceive. This is literally the only reason for all the hand wringing and millions of spilt keystrokes over the years around filtering by audiophiles (the few who still obsess over this...)
These days, we essentially have 2 major options for filter "types" among the DACs out there... Linear phase (the default for most mainstream DACs, Chord) or Minimum phase (Apple, MQA, Pono) - pick your "poison" :-). Of course within each phasic variety we have different levels of steepness and allowance for aliasing. We intuitively know that due to the biological phenomenon of auditory masking, maximum phase (where the group delay is pushed forward so "pre-ringing" is accentuated) is not desirable. But is there another choice?
Yes, there is of course... We can try to figure out a "just right" state with intermediate phase settings. Accepting that maybe there's some value to ensuring that pre-ringing isn't an issue even with some of the worst audio recordings out there, while maintaining awesome frequency and temporal accuracy - let me show you my choice for the filter that I listen to daily with the Pi 3 streamer...
As I mentioned previously, I sold the excellent Oppo Sonica DAC. As a result, I'm back to my "tried and true" TEAC UD-501 DAC in use since 2013. This is just fine because the TEAC can upsample to 352.8/384kHz which will defeat the built-in filter of the dual internal TI PCM1795 DAC chips.
Archimago's suggested "Goldilocks" filter settings...

After some listening, tweaking, and analysis, here's the upsampling setting I've been using over the last couple of months for "reference" listening:
Max sample rate:
As you can see, I have no fear of steep filtering. I want technically accurate, flat frequency response all the way to 20kHz at least. This is why in the piCorePlayer setting above, I've set "passband_end" to 95% (20.95kHz with 44.1kHz sampling) and "stopband_start" to 105% (23.15kHz with 44.1kHz sampling). A little bit of aliasing to about 23kHz isn't a problem; it's attenuated to a certain degree with the sharp filter slope by 22.05kHz as we'll see later (not to mention further attenuated by one's tweeters and ears).
As usual, I've put in a -4dB attenuation, and 28-bits precision is more than enough. Notice however that the "phase_response" is 45; an "intermediate" setting between 0 minimum and 50 linear.
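For anyone who prefers to see the settings as text, here is a minimal sketch that assembles them into a colon-separated resampling option and converts the two percentages into frequencies. The field order follows my understanding of squeezelite's `-R` option (recipe, flags, attenuation, precision, passband_end, stopband_start, phase_response); the "v" recipe and the empty flags field are assumptions rather than values stated in this post, so check them against your own piCorePlayer page.

```python
# Sketch only: field order assumed to match squeezelite's -R option,
# <recipe>:<flags>:<attenuation>:<precision>:<passband_end>:<stopband_start>:<phase_response>
SAMPLE_RATE_HZ = 44_100

settings = {
    "recipe": "v",          # ASSUMPTION: very-high-quality recipe, not stated in the post
    "flags": "",            # ASSUMPTION: left empty to take the defaults
    "attenuation": 4,       # dB of headroom
    "precision": 28,        # bits
    "passband_end": 95,     # percent of Nyquist
    "stopband_start": 105,  # percent of Nyquist
    "phase_response": 45,   # 0 = minimum phase, 50 = linear phase
}

FIELD_ORDER = ["recipe", "flags", "attenuation", "precision",
               "passband_end", "stopband_start", "phase_response"]


def resample_option(values):
    """Join the settings into one colon-separated option string."""
    return ":".join(str(values[name]) for name in FIELD_ORDER)


def percent_of_nyquist_to_hz(percent, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert a percentage of the Nyquist frequency into Hz."""
    return percent / 100 * sample_rate_hz / 2


option = resample_option(settings)
passband_hz = percent_of_nyquist_to_hz(settings["passband_end"])
stopband_hz = percent_of_nyquist_to_hz(settings["stopband_start"])

print(option)
print(f"passband ends at {passband_hz / 1000:.2f} kHz")
print(f"stopband starts at {stopband_hz / 1000:.2f} kHz")
```

The two frequencies come out at the 20.95kHz and 23.15kHz quoted above for 44.1kHz material.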
Here is the Digital Filter Composite (DFC) graph with these settings recorded directly from the TEAC UD-501:
Nice, right? A relatively steep filter extending beyond 20kHz. Flat looking frequency response up to 21kHz. No significant imaging/aliasing (a few noise peaks and minor irregularities in the ultrasonic noise floor using "real life" hardware). Remember, for my DFC graphs I include wideband noise of 0dBFS that will trigger any evidence of intersample overload; much more demanding than the typical -4dBFS wideband measurement you see in reviews like in Stereophile.
Let's zoom into the transition band with the 0dBFS signal:
As you can see, with these settings, we have a very flat frequency response effectively to 21.5kHz. By Nyquist at 22.05kHz, there's ~6dB attenuation. The "aliasing region" is small and well contained - close to -30dB suppression by 22.5kHz and the reflected aliasing region is already close to -35dB by 21.5kHz. I doubt my dog/cat/bat is going to mind. My rationale for allowing a little bit of aliasing is that it reduces the filter length (shorter impulse ring duration) and the computational power needed, while effectively extending the frequency response beyond 20kHz. In other words, if you are human, "golden ears" or otherwise, this filter is guaranteed to be transparent within the frequencies of interest and extracts essentially everything a 44.1kHz signal has to offer.
Bottom line, humans will not hear the aliasing artifact. Heck, even my worst DR0-2 recording - Iggy Pop's 1997 remaster of Raw Power - doesn't contain higher than -60dB content from 20kHz up...
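To put a rough number on the "shorter filter" rationale, here is a sketch using Kaiser's rule of thumb for FIR length, N ≈ (A - 7.95) / (2.285 · Δω). This is not how SoX designs its filters internally; the 150dB stopband figure and the idea that the filter runs at the 352.8kHz output rate are assumptions picked purely for illustration. The point is only the ratio between the two cases.

```python
import math


def kaiser_taps(stopband_atten_db, transition_hz, filter_rate_hz):
    """Estimate FIR length with Kaiser's rule: N ~ (A - 7.95) / (2.285 * d_omega)."""
    d_omega = 2 * math.pi * transition_hz / filter_rate_hz  # radians per sample
    return math.ceil((stopband_atten_db - 7.95) / (2.285 * d_omega)) + 1


NYQUIST_HZ = 22_050
FILTER_RATE_HZ = 352_800  # ASSUMPTION: filter evaluated at the 8x output rate
ATTEN_DB = 150            # ASSUMPTION: illustrative stopband rejection

passband_hz = 0.95 * NYQUIST_HZ

# Transition band 95% -> 105% of Nyquist (a little aliasing allowed)
taps_goldilocks = kaiser_taps(ATTEN_DB, 1.05 * NYQUIST_HZ - passband_hz, FILTER_RATE_HZ)

# Transition band 95% -> 100% of Nyquist (no aliasing allowed)
taps_no_alias = kaiser_taps(ATTEN_DB, 1.00 * NYQUIST_HZ - passband_hz, FILTER_RATE_HZ)

print(f"95-105%: about {taps_goldilocks} taps")
print(f"95-100%: about {taps_no_alias} taps")
print(f"ratio: {taps_no_alias / taps_goldilocks:.2f}")
```

Under these assumptions, halving the transition band roughly doubles the estimated filter length, which is the trade being made by letting the stopband start at 105%.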
So, what does the impulse response look like?
This is what happens when we go for an intermediate phase setting, even with a relatively small change in the "phase response" parameter from 50 to 45 in SoX. The impulse becomes asymmetrical, with pre-ringing still present but significantly reduced compared to an equivalent linear phase setting. The flip side of this change is that the post-ringing is slightly stronger and longer in duration than with linear phase... But remember, post-ringing, if present, isn't typically an issue due to auditory masking.
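If you'd like to put a number on "how much pre-ring versus post-ring", one hypothetical approach is to split an impulse response's energy into what arrives before the main peak and what arrives after it. The sketch below does this for a plain Hann-windowed sinc, which is only a stand-in for a linear phase filter and not SoX's actual design; `ringing_split` is a helper name made up for this example. A symmetric impulse should come out 50/50, and an intermediate phase impulse captured from a DAC would be expected to show a smaller pre-peak share.

```python
import math


def windowed_sinc(num_taps, cutoff):
    """Symmetric (linear-phase) low-pass; cutoff is a fraction of the sample rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        hann = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * hann)
    return taps


def ringing_split(impulse):
    """Return (pre, post) shares of the energy on either side of the main peak."""
    peak = max(range(len(impulse)), key=lambda i: abs(impulse[i]))
    pre = sum(v * v for v in impulse[:peak])
    post = sum(v * v for v in impulse[peak + 1:])
    total = pre + post
    return pre / total, post / total


# Cutoff of 0.0625 corresponds to 22.05kHz at a 352.8kHz rate (illustrative choice)
pre_share, post_share = ringing_split(windowed_sinc(255, 0.0625))
print(f"pre-ring share:  {pre_share:.3f}")
print(f"post-ring share: {post_share:.3f}")
```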
Remember that a strongly anti-imaging/aliasing minimum phase reconstruction filter introduces phase shifts in the audio spectrum; something we typically do not want to see because that is indicative of a temporal anomaly (even though humans generally are not sensitive to this as discussed last time). So, how does an intermediate phase setting biased towards linear phase as suggested here look in terms of phase shift?
As you can see, not bad at all! We're basically dancing around linear phase 0° and relatively little change all the way to 20kHz compared to minimum phase of the same filter steepness.
For completeness, I can demonstrate that there are no issues at all with typical DAC measurements of frequency response, noise, distortion...
And of course these days with asynchronous USB transmission, we need not worry about jitter even when upsampling using an inexpensive Raspberry Pi 3 connected to the TEAC UD-501 DAC, transmitting 24-bit/352.8kHz audio data over a standard generic 6' USB cable.
In case you're wondering, I'm even using "CRAAP" undervolt and underclock settings with the Pi 3 "Touch" for these RightMark 6.4.2 Pro and jitter measurements :-). So even if the speed/voltage tweak doesn't affect sound, at least you know you're saving energy and the Pi is producing less heat.
Speaking of underclocking, notice that it still doesn't take much processing power:
That's 10.5% of the Pi 3's processing resources, which includes running the touchscreen GUI (jivelite). It's pretty much a peak number; most of the time it's streaming the playback with 7-10% processor utilization.
Summary...

There you go, my take on selecting preferred settings, taking into account some of the parameters that we can play with in piCorePlayer/SoX. This is what I'm listening with these days as I aim to be a rational audiophile wading through the literature out there and the claims in audiophile-land.
What I've shown here is my take on "Goldilocks" digital filtering. Settings which will:
1. Not compromise on frequency response - flat to 20kHz and a little beyond (with 44.1kHz sampling rate). IMO, it's better for the digital filter to be accurate and not have it roll-off early as some kind of "tone control" which we see in some very short impulse response filters (like PonoPlayer). I would rather be deliberate and either use an EQ for this purpose or DSP room correction with a target curve that rolls off the highs if desired.
2. Achieve excellent antialiasing properties. What's the point of using an antialiasing filter if it's leaky and spills all kinds of ultrasonic noise which can in turn result in audible intermodulation distortions? I refuse to compromise on this (no thanks, MQA - as per the example from Beyoncé).
3. Allow plenty of overhead to prevent intersample overloading during playback (simply -4dB attenuation in SoX does the job - this is not a significant loss of resolution for a high resolution DAC these days). If there is one factor I would love to see in future generations of DAC chips, it would be this being accounted for by default. (Over the years, Juergen Reis has commented about this as well.)
4. Minimize pre-ringing potential whenever we run into poorly engineered albums containing "illegal", poorly bandwidth limited samples. For example, "modern", synthetic, highly dynamically compressed and clipped music containing square waves. Intermediate phase setting achieves this.
5. In achieving (4), the filter will not cause significant timing/phase shifts. This is achieved by biasing the intermediate phase setting towards linear phase.
Want to "see" this filter in "action" if one ran into some poorly engineered albums containing nasty 750Hz square waves?
"Goldilocks" to my eyes (and ears) :-). Notice the precise leading edges ("attack") with the intermediate phase and linear phase settings. Notice the more rounded, less acute slope of minimum phase upsampling. All the while, notice a clear reduction of pre-ringing by going intermediate phase and pushing a little more of the energy into the post-ring. But by not completely pushing all the energy into the post-ring timeframe as with minimum phase, we also see that the ringing settles down quickly through the square wave "plateau".
In a similar way, we can check out the filter's handling of something a little more challenging - some rogue 3kHz square waves:
This time, notice that I underlaid an image of the very simple, essentially non-ringing "NOS-like" cubic interpolation filter from SoX (see last week for more information on this "ultralax" NOS-like filter). Remember that the cubic interpolator is non-bandwidth limited and, in this example, can be used as a comparator to examine the time-domain accuracy of the waveforms. We can clearly see the temporal effect of minimum phase in this example. Notice how the leading edges of the square waves are shifted forward in time! Again, my intermediate phase suggestion takes the middle ground - less pre-ringing while staying much more like the temporally accurate linear phase, with good, clean rising/falling edges, although there is a very small temporal shift forward.
Something I've noticed when playing with filters is that for minimum phase settings, because there's no pre-ring "release" of energy, the post-ringing amplitude tends to be more intense. In the 3kHz waveforms above, I only used -2dB attenuation with SoX and noticed that there were a few clipped samples with the minimum phase filter setting, whereas this was not the case with linear and intermediate phase. I think this is a good reminder that DACs that use minimum phase filtering need to be a bit more careful to provide overhead for intersample overload (like MQA where the filters indeed overload!).
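For a feel of why overhead matters at all, here is a small sketch of the textbook intersample-over case: a sine at a quarter of the sample rate whose samples land 45 degrees away from the crests. It is a generic illustration, not a measurement of any DAC or filter discussed here, and the second half just converts the -4dB attenuation into a linear gain and an approximate cost in bits.

```python
import math


def to_db(ratio):
    """Convert an amplitude ratio to decibels."""
    return 20 * math.log10(ratio)


# fs/4 sine sampled 45 degrees off its crests: every sample is +/-0.7071 of the true peak
samples = [math.sin(math.pi / 2 * n + math.pi / 4) for n in range(8)]
sample_peak = max(abs(s) for s in samples)

# If those samples are normalised to 0dBFS, the reconstructed waveform peaks above full scale
true_peak_db = to_db(1.0 / sample_peak)

# What -4dB of digital attenuation costs
attenuation_db = 4
gain = 10 ** (-attenuation_db / 20)
bits_lost = attenuation_db / to_db(2)  # about 6.02dB per bit

print(f"reconstructed peak: +{true_peak_db:.2f} dBFS")
print(f"-4dB as linear gain: {gain:.3f}")
print(f"resolution given up: {bits_lost:.2f} bits")
```

This worst-case sine reconstructs to about +3dBFS, so 4dB of attenuation covers it at a cost of well under one bit of resolution.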
Finally, here's what 10kHz "square waves" look like with a 44.1kHz samplerate (-3dB overhead attenuation):
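As a rough cross-check on these square wave plots, the sketch below counts how many odd harmonics of an ideal square wave fit under the 22.05kHz Nyquist limit for each test frequency, then rebuilds the wave from only those harmonics to see how far the peaks rise above the plateau. It assumes an ideal, perfectly band-limited ±1 square wave. The "illegal" test signals used here are naively sampled and also carry aliased components, so real captures will not match these numbers exactly.

```python
import math

NYQUIST_HZ = 22_050  # half of the 44.1kHz sample rate


def surviving_harmonics(fundamental_hz):
    """Odd harmonics of a square wave that sit at or below Nyquist."""
    highest = NYQUIST_HZ // fundamental_hz
    return [k * fundamental_hz for k in range(1, highest + 1, 2)]


def bandlimited_peak(fundamental_hz, points=20_000):
    """Peak of a +/-1 square wave rebuilt from its surviving harmonics only."""
    orders = [f // fundamental_hz for f in surviving_harmonics(fundamental_hz)]

    def value(phase):
        return 4 / math.pi * sum(math.sin(k * phase) / k for k in orders)

    return max(value(2 * math.pi * i / points) for i in range(points))


results = {}
for freq in (750, 3_000, 10_000):
    peak = bandlimited_peak(freq)
    results[freq] = (len(surviving_harmonics(freq)), peak, 20 * math.log10(peak))
    count, peak_level, peak_db = results[freq]
    print(f"{freq:>6} Hz: {count:>2} harmonics, peak {peak_level:.3f} (+{peak_db:.2f} dB)")
```

At 10kHz only the fundamental survives, so the "square wave" is really a sine peaking at 4/π of the plateau, a little over 2dB, which fits inside the 3dB of overhead used for that plot.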
Well guys and gals, give this filter suggestion a try and tell me what you hear/think! Like I said, this is my attempt at a filter setting that is "just right" - suppressing the "detestable" pre-ringing with bad recordings, while maintaining excellent frequency and temporal domain accuracy with good recordings.
If you have a favourite setting, feel free to share your thoughts and settings.
Next time, let's look back at some audiophile history, look at some "real" music samples and think a little more about digital filters and contextualize the "detestable" ringing.
Hope you're all enjoying the music. Wishing you all a joyful, healthy, prosperous, euphonic and wisdom-filled 2018...
A friend's son is doing a grade 9 school project looking at perceptions around "healthy foods". If you have 5 minutes, he'd love to have your submission for a survey to be used in his science fair project. I'm sure he'll appreciate input from around the world! Always good to promote critical thinking, analysis, and scientific inquiry for the next generation: