Feedback and Q&A
As you can imagine, I get a lot of email. I always appreciate it when readers post their questions and comments on the RealHD-Audio.com site, but that hasn’t been convenient for the nearly 3000 recipients of the daily emails. So today, I’ve collected a few of the comments and questions.
Let’s start with an email that I received from my friend John Siau, the principal designer and all-around digital guru at Benchmark Media. He was responding to my post on “Perfect Sound Forever”, that much-maligned marketing slogan that Sony and Philips coined back with the introduction of the CDP-101, the first compact disc machine (it’s not a coincidence that it was launched on October 1…10/1!).
John wrote to endorse the potential quality that a CD can deliver AND has always been able to deliver. It’s obviously not, as TAS editor Robert Harley wrote, a “quaint notion that there’s no need for improvement over the CD”. Although I support high-resolution audio, CDs are not going to be obsolete anytime soon.
“I fully agree that the CD format is an excellent distribution format. Noise-shaped dither can extend the apparent dynamic range at least 12 dB, so we are not even limited to 96 dB. Anyone who doubts the effectiveness of noise-shaping should look at DSD. The 1-bit DSD format proves that noise shaping works!
Obviously the 16-bit format falls short in production applications. Longer word lengths (24-bit, 32-bit, or 64-bit) are now used in the studio to allow many cascaded and summed DSP operations without loss of SNR. Early DAW systems were limited to 16-bit storage, and were notorious for producing poor results. All newer DAWs are high-resolution systems and it is now possible to produce outstanding CDs if the conversion to 44.1/16 is the last step in the mastering process. Few recordings fully exploit the capabilities of the CD. Obviously there are some high-resolution recordings (such as those produced by AIX) that exceed the capabilities of the CD.
Nevertheless, it is still difficult to configure a playback system that is significantly better than CD-quality. Benchmark is addressing this issue with the new AHB2 high-resolution power amplifier. The AHB2 has a bandwidth that exceeds 200 kHz, and a SNR that approaches 130 dB.”
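For readers who want to check the arithmetic behind the 96 dB figure in John’s note, here is a minimal Python sketch. It assumes fixed-point (integer) samples and uses the common rule of thumb of 20·log10(2^bits) for the ratio of full scale to one quantization step. The function name is mine, and the 12 dB bonus is simply the figure John quotes, not something the code derives.

```python
import math

def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in dB (fixed-point assumption)."""
    return 20.0 * math.log10(2 ** bits)

NOISE_SHAPING_BONUS_DB = 12.0  # the "at least 12 dB" figure quoted in the email

cd_db = dynamic_range_db(16)
cd_shaped_db = cd_db + NOISE_SHAPING_BONUS_DB
studio_db = dynamic_range_db(24)

print(f"16-bit:                     {cd_db:.1f} dB")
print(f"16-bit with noise shaping:  {cd_shaped_db:.1f} dB (apparent)")
print(f"24-bit:                     {studio_db:.1f} dB")
```

The 32-bit and 64-bit formats used inside DAWs are usually floating point, so this simple formula does not apply to them directly.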
Another email from reader Paul S. tries to get at some related issues of PCM encoding:
“The dots in my scenario turn out to be XY coordinates of the digital samples of the music, with X being the moment in time and Y being amplitude. The dots must be connected to create the AC flow that we call music. I get that low pass filters and the mass of the drivers smooth off the digital edges caused by quantization. High sampling isn’t important in terms of frequencies that are too high to hear. It’s vital, if we want our highs we can hear to sound like the original in terms of amplitude and phase. Music doesn’t start and stop to give our sampling an easy time. Even 192k only starts to accommodate the accurate reconstruction of our audible highs no matter what phase they are in when sampled. With too few samples, like CD, there is not enough info to reconstruct the highs accurately. If you only have two samples of a 20khz sine wave, you better pray it is sampled at 90 degrees in. Otherwise, it will have lower amplitude and the peak will happen out of phase with the original wave. Your offer of 96k samples with 48khz frequencies suffers the same fate. Only the 90 degree sine wave will be acceptable.”
There are a number of misunderstandings in his email. We’ve been going back and forth a little. First, the discrete levels in the “battleship” grid of samples and amplitudes are not smoothed back into the continuously variable AC voltage that is sent to the amp and speakers by the “mass of the drivers”. It’s true that the instantaneous amplitude changes that occur when moving from one discrete amplitude level to another (up or down) produce partials or overtones in the output. But according to the natural overtone series, the closest they can come to the fundamental frequency is a factor of two…because an octave has a numerical ratio of 1:2. Then a high-quality LPF removes the objectionable HF partials and we’re back to the original waveform. If the highest frequency is 20 kHz, then a sample rate of 44.1 kHz is more than enough to capture and reproduce it.
High frequency sampling is important for more than just “frequencies” that we can’t hear (although, as I argued, I think they do matter). The higher the sampling rate, the easier it is to build a great filter AND the higher the frequency response. And the waveforms don’t have to be at 90 degrees at the sample times to ensure accurate capture and playback.
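To put a rough number on the filter point, here is a small Python sketch that measures how much room a low pass filter has between the top of the audio band and half the sample rate. The 20 kHz band edge and the function name are assumptions made for illustration.

```python
import math

AUDIO_BAND_EDGE_HZ = 20_000.0  # assumed top of the audible band

def transition_octaves(sample_rate_hz):
    """Octaves between the audio band edge and half the sample rate."""
    nyquist_hz = sample_rate_hz / 2.0
    return math.log2(nyquist_hz / AUDIO_BAND_EDGE_HZ)

for rate in (44_100, 96_000, 192_000):
    print(f"{rate} Hz: {transition_octaves(rate):.2f} octaves of transition band")
```

At 44.1 kHz the filter has to go from passing everything to blocking everything in about a seventh of an octave. At 96 kHz it has more than a full octave to work with, which is a much gentler design problem.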
The Nyquist Theorem works whether the waveform’s peaks are “aligned” with the sample points or not. It assures us that an analog continuous audio waveform can be recreated accurately and in phase when the sampling rate is twice the highest frequency component. I used to understand this to mean “at least twice as high”…but the reality is that you only need two times that frequency for it to work “perfectly”.
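Here is a minimal Python sketch of the idea, for anyone who wants to try it. It samples a 20 kHz sine at 44.1 kHz with an arbitrary starting phase, then rebuilds the waveform between the samples with sinc (Whittaker-Shannon) interpolation, which is the textbook form of an ideal reconstruction filter. The phase value, the variable names, and the truncation to a few thousand samples are my assumptions. A real DAC uses a practical filter, not this brute-force sum, and the tone here sits strictly below half the sample rate.

```python
import math

FS = 44_100.0   # CD sample rate in Hz
F0 = 20_000.0   # test tone in Hz, strictly below FS / 2
PHASE = 0.3     # arbitrary starting phase in radians (assumption for illustration)
N = 2000        # samples kept on each side of t = 0 (truncates the ideal infinite sum)

def tone(u):
    """Value of the original sine at time u, measured in sample periods."""
    return math.sin(2.0 * math.pi * F0 / FS * u + PHASE)

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with the x = 0 limit handled."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

# The stored samples: the tone evaluated only at integer sample times.
samples = {n: tone(n) for n in range(-N, N + 1)}

def reconstruct(u):
    """Sinc interpolation of the stored samples at time u (truncated sum)."""
    return sum(x * sinc(u - n) for n, x in samples.items())

# Look between samples 0 and 2, where the stored values happen to be small.
grid = [i * 0.02 for i in range(101)]
rebuilt = [reconstruct(u) for u in grid]

largest_sample = max(abs(samples[n]) for n in (0, 1, 2))
rebuilt_peak = max(abs(y) for y in rebuilt)
worst_error = max(abs(y - tone(u)) for u, y in zip(grid, rebuilt))

print(f"largest stored sample near t=0: {largest_sample:.3f}")
print(f"peak of reconstructed waveform: {rebuilt_peak:.3f}")
print(f"worst reconstruction error:     {worst_error:.4f}")
```

With this phase, none of the three stored samples gets anywhere near the peak of the wave, yet the rebuilt waveform still reaches essentially full amplitude and tracks the original closely. The small leftover error comes from cutting the sum off at a finite number of samples.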
This probably deserves another detailed post…but I thought I would pass my response to Paul back to the entire readership.
Paul McGowan of PS Audio is excited by Pono. I share your observation that these will be uncompressed transfers of standard definition classic albums put into excessively large bit buckets. The sound quality will be hit or miss depending on the original production values…but there’s not going to be any real high-resolution tracks available on his blog. I believe you are right: if the content is not HD, it does not matter what container it is delivered in.
This will be more of the HDtracks model of pushing the standard resolution audio files that come from the labels to a “new” and uninformed audience. Maybe I should get into the business of making hardware.
Pono
Yes, I can understand why so many artists jumped onto that bandwagon: they see it as a way of promoting and re-promoting music already sold to us 5 or more times. Digital spoiler systems aside, taking on the Apple iPod is never going to be easy, and most appear to be content with MP3 (yuk). If the software was built into a hi-tech phone they could probably only charge another 20 to 50 dollars for it, but as a separate player… Large bit buckets indeed, admin; no real HD is the probable result, with large numbers like 24-bit etc. How do you spell boll-icks?
I’m okay with the idea of levels of quality based on sensible definitions. But recognition that an older recording at 192 kHz will sound different from a new recording at 192 kHz is still a missing component of the discussion.
Your reader Paul S. seems to agree with what I’ve been saying for months – that two samples per waveform is too “hit or miss” to draw an accurate picture of a waveform approaching the Nyquist limit. If it doesn’t take a sample when the waveform is at or near its peak, it will be rendered as too quiet.
That’s why I like your 96khz sampling (and like 192khz sampling even better).
Whether or not there’s much above 20khz that is humanly perceptible (even if not “audible”), which I believe there is, even the frequencies in the band commonly-accepted as audible are drawn with more detail and with a greater chance of the right amplitude by using higher sample rates.
That, I believe, is why the 192khz transfer sounds audibly better than the 96khz transfer of the old analog tape field recording of Bill Evans’ Waltz for Debby at HDtracks. I paid for one track in both formats so I could A/B them and even my wife – who is *not* an audiophile – could hear the difference in the clarity of the individual notes from the piano.
No, an analog tape – particularly one made with portable equipment at a night club – is not going to have really high frequency information – but a well-done transfer can give us a clearer image of the octave from 10khz-20khz than a mass-produced, rushed to market version can.
Not all of their transfers are that good – Eric Clapton’s “Layla” sounds better on the original CD than on what they offer, which they explained was what they were given by the label. I tend to only buy things from them that they (meaning Chesky) recorded themselves (since Chesky records digitally), or which they describe how they were made, which isn’t very often.
Phil…it’s actually not true that you have a “hit or miss” for the two samples to correctly register the highest frequencies. The Nyquist Theorem is quite specific…and correct…in this regard. There are some valid reasons for higher sample rates…but very little justification for 192 kHz. I won’t disagree that they may sound different…but the 192 kHz sample rate is not the reason.