Audio digital theory gives me a headache but I am wondering if this is interesting or not.
Although I read the ND 555 white paper when it was published I did not take in the remarks on the ND DAC’s oversampling until it was mentioned to me today with respect to the Chord Hugo M-Scaler.
The ND 555 white paper states -
Although the PCM1704 is described on its datasheet as a 96kHz DAC, it utilises 8× oversampling so the DAC stage can actually operate at up to 768kHz
(8 × 96kHz). In the ND 555, oversampling and bespoke digital filtering is performed in an Analogue Devices SHARC ADSP21489 digital signal processing (DSP) chip located on the digital card. This DSP chip has the advantage of low power requirement.
Digital signals at multiples of 44.1kHz sampling rate (44.1kHz, 88.2kHz, etc) are oversampled to (16 × 44.1 =) 705.6kHz, and those at multiples of 48kHz sampling rate (48kHz, 96kHz, etc) to (16 × 48 =) 768kHz.
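For anyone who wants to check the arithmetic in that quote, here is a minimal sketch. It only assumes what the white paper states (everything ends up at 16 times the base rate of its family); the function name and constants are my own and not anything from Naim.

```python
# Hypothetical helper: reproduces the rate arithmetic quoted from the white paper.
BASE_RATES_HZ = (44_100, 48_000)  # the two sample-rate families named in the quote
MAX_FACTOR = 16                   # 16 x base rate, per the quoted figures


def oversampled_rate(input_rate_hz):
    """Return (target_rate_hz, oversampling_factor) for a given input rate."""
    for base in BASE_RATES_HZ:
        if input_rate_hz % base == 0:
            target = MAX_FACTOR * base
            if target % input_rate_hz != 0:
                raise ValueError(f"{input_rate_hz} Hz does not divide {target} Hz")
            return target, target // input_rate_hz
    raise ValueError(f"{input_rate_hz} Hz is not a multiple of 44.1 kHz or 48 kHz")


for rate in (44_100, 88_200, 48_000, 96_000):
    target, factor = oversampled_rate(rate)
    print(f"{rate} Hz x {factor} = {target} Hz")
```

So a 44.1kHz stream needs a factor of 16 to reach 705.6kHz, while an 88.2kHz stream only needs a factor of 8.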
So are these two processes, ND 555 DAC oversampling and HMS upscaling, the same sort of thing? Advise me in my ignorance.
Yes, as I understand it, oversampling and upscaling are just different words for interpolation in the frequency domain, though I might be wrong. Interpolation can, of course, be done in very different ways. In the HMS, they use a stencil of up to 1000000 points/values, which is a rather exotic and Chord-specific approach. I think the SHARC applies more conventional interpolation schemes, but I have no detailed understanding of either approach.
Upscaling is not really a DSP engineering term I am familiar with in audio. It is typically used to describe rescaling video frame size and resolution… so in that sense it doesn’t relate or compare to audio oversampling, which is something quite different… but it may be a consumer electronics marketing term used to describe various audio interpolations… I don’t know… I would need to see it used in context, perhaps in an advert or user manual.
Oversampling, however, is a term used to describe changing the sampling frequency of a sampled stream of data. Typically, oversampling describes the case where the sampling frequency is changed by an even integer multiple, like 44.1 to 88.2, and Upsampling the case where the change in sampling frequency can be any non-integer multiple… like changing 44.1 to 48.
These differences are far from subtle.
With oversampling no interpolation is required. The sample rate is increased by inserting zero value samples in the sample stream. Zero multiplied by any value is always zero so no interpolation arithmetic errors are produced.
With Upsampling, non-zero sample value interpolation is required, and so arithmetic rounding errors will most likely be introduced into the sample stream.
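To make the contrast concrete, here is a rough sketch of the two operations as described above. It is illustrative only: the function names are made up, linear interpolation is just the simplest stand-in for "non-zero interpolation", and nothing here is taken from any real DAC firmware. Note that the zero-stuffed stream would still have to pass through the digital low-pass reconstruction filter discussed elsewhere in this thread.

```python
def zero_stuff(samples, factor):
    """Raise the sample rate by an integer factor by inserting zero-valued samples.

    The original sample values are copied through untouched.
    """
    out = []
    for s in samples:
        out.append(s)
        out.extend([0.0] * (factor - 1))
    return out


def linear_upsample(samples, factor):
    """Contrast case: compute non-zero in-between values by linear interpolation.

    Every inserted value is the result of arithmetic, so rounding can enter.
    """
    if not samples:
        return []
    out = []
    for a, b in zip(samples, samples[1:]):
        for k in range(factor):
            out.append(a + (b - a) * k / factor)
    out.append(samples[-1])
    return out


x = [1.0, 3.0, 2.0]
print(zero_stuff(x, 4))       # originals kept, three zeros after each one
print(linear_upsample(x, 2))  # one computed value between each pair of originals
```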
So why do we oversample? There are several reasons; the main ones are:
Increasing the sample rate relative to the baseband audio, that is, raising it above twice the maximum sampled frequency, allows a less steep and less resonant filter to be used both as the anti-aliasing low-pass output filter and as the digital low-pass reconstruction filter. Less steep and resonant filters give a smoother response and fewer filter artefacts/perturbations in the audio pass band. Aliasing is what happens when one encodes or reconstructs frequencies that are greater than half the sample rate. It will happen, and cause noticeably unpleasant distortions, unless measures are taken to prevent it.
By some interesting aspects of discrete series mathematics as used in digital sampling… by increasing the sample rate compared to the sample rate that originally encoded the audio, any timing errors (jitter) introduced in the encoding process are proportionally reduced… as the errors are now averaged out over the whole oversampled frequency domain, and as we typically will be interested in a smaller part of that frequency domain, the errors are proportionally reduced in the part we are interested in, the audio base band.
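The first reason above (aliasing and filter steepness) can be put into numbers with a small sketch. The 20kHz audio band edge is my own assumption for illustration, as are the function names; the folding formula itself is the standard one for a sampled tone.

```python
def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a pure tone shows up after sampling at sample_rate_hz."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


def transition_band_hz(sample_rate_hz, audio_band_hz=20_000):
    """Room the low-pass filter has between the audio band edge and half the sample rate.

    audio_band_hz = 20 kHz is an assumed figure, not something stated in the thread.
    """
    return sample_rate_hz / 2 - audio_band_hz


# A 30 kHz tone cannot be represented at 44.1 kHz and folds down into the audio band.
print(alias_frequency(30_000, 44_100))   # 14100
# At 705.6 kHz the same tone is below half the sample rate and stays where it is.
print(alias_frequency(30_000, 705_600))  # 30000

# The filter has far more room to roll off gently at the oversampled rate.
print(transition_band_hz(44_100))        # 2050.0
print(transition_band_hz(705_600))       # 332800.0
```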
Finally, regarding the PCM1704K. It is specified up to a 96kHz sample rate, but Naim appear to significantly overclock it… which would appear not sensible… well, it’s not quite like that. If using the internal digital filter in the PCM1704K then indeed 96kHz is your limit; however, you can offload the filter function and use the raw DAC bandwidth. This is what Naim do… they use an Analog Devices programmable filter (SHARC processor) as the oversampling function and digital low-pass reconstruction filter, and then feed the output of this filter to the PCM1704K natively at up to 768kHz, which is within its native performance envelope.
Thank you nbpf and Simon,
I will take my time and read through this information, see if I can gain an understanding.
Several decades ago, in another lifetime, I dabbled in physics and engineering but then moved to the art world. My technical reasoning powers have decayed no end during recent times and understanding digital is a struggle for me. But my ears are having fun comparing the output of the ND 555 DAC against the Chord HMS/TT2 …
If you do not pretend to understand how interpolations are actually done in different devices and for different original sampling rates (this is, up to a certain extent, impossible because companies typically do not disclose the details of the interpolation schemes that they deploy), the basic idea is actually very simple:
A 16bits/44.1kHz file contains 44100 samples (words) of 16 bits length (each representing one of 2^16 possible values for the amplitude or volume) for every second of music. Oversampling, upsampling, upscaling just means inserting new words between the original samples, thus increasing the number of words over a fixed time interval. It is also possible to increase the length (number of bits) of the original samples, of course. But I do not think that this process is usually referred to as oversampling or upscaling.
As usual, things are a bit more complicated in reality because the computation of a new word might involve the (re-)construction of a local function and interpolations in different domains. But that’s the basic idea.
I thought it should be much more than 1.4 seconds: 44100 * 1.4 * 2 = 123480 and the M Scaler is supposed to use 1000000 taps. That would mean about 11 seconds of music. Perhaps taps are something different from interpolation points?
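Just to lay that arithmetic out, here is a tiny sketch. It takes the figures in this post at face value, including the factor of 2 (whose meaning is not stated here), and says nothing about how the M Scaler actually counts its taps.

```python
SAMPLE_RATE_HZ = 44_100
MULTIPLIER = 2  # the factor of 2 used in the post; what it stands for is not stated


def seconds_to_taps(seconds):
    """Number of points covered by a given stretch of music, per the post's formula."""
    return seconds * SAMPLE_RATE_HZ * MULTIPLIER


def taps_to_seconds(taps):
    """Stretch of music covered by a given number of points, per the same formula."""
    return taps / (SAMPLE_RATE_HZ * MULTIPLIER)


print(round(seconds_to_taps(1.4)))           # 123480
print(round(taps_to_seconds(1_000_000), 1))  # 11.3
```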
Ah, I knew I would be asked. Well, a few days ago I would have given one answer which favoured the Chord flow [using digital out of ND as source] but, this morning, the ND 555 DAC rendered the most beautiful musical experience I have heard in a while. I am only using one sample recording these few days, András Schiff’s recent Schubert pianoforte recording on ECM. With the HMS/TT2 as converter I was thinking the precision and clarity of notes and lines was winning out; the ND DAC seemed to give a softer, less incisive version of the music. But today, although I favoured the Chords at first, on a second listen from the ND 555 DAC I could hear new beauty; a delicacy and sonority rose from the multiple registers of the pianoforte that had not been in evidence to me before - the individual lines in the playing were singing out. Back with the Chord now and I am not picking up on that beauty. This is a bit of a surprise to me. I will continue. I still have to listen to a USB source direct to HMS, no ND 555 involved. Onward ears.
Yes one shouldn’t confuse ‘taps’ or filter kernel size with interpolation necessarily. Rob Watts uses an FIR filter function, and that functions by multiplying a filter kernel of size x samples (these kernel samples are also referred to as ‘taps’) over a function window of the sample bit stream. That window covers previous and future samples in the sample stream… and these samples are processed together for a point sample in time.
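As a generic illustration of what an FIR filter does with its taps, here is a minimal sketch. This is a textbook direct-form FIR with the kernel centred on the output sample, so it covers both earlier and later inputs; it is not Rob Watts’ filter, and the function name is made up.

```python
def fir_filter(samples, taps):
    """Apply an FIR kernel: each output is a weighted sum of the samples under the window.

    The window is centred on the output position, so it reaches both backwards
    and forwards in the stream. Positions outside the stream count as zero.
    """
    half = len(taps) // 2
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, tap in enumerate(taps):
            idx = n + half - k
            if 0 <= idx < len(samples):
                acc += tap * samples[idx]
        out.append(acc)
    return out


# A 3-tap moving average smears a single spike over its neighbours.
print(fir_filter([0.0, 0.0, 3.0, 0.0, 0.0], [1 / 3, 1 / 3, 1 / 3]))
# A kernel with a single 1 in the middle passes the stream through unchanged.
print(fir_filter([1.0, 2.0, 3.0], [0.0, 1.0, 0.0]))
```

A kernel with a million taps works the same way in principle; the window is just far wider, which is why it needs a long stretch of the stream either side of each output point.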
When it comes to oversampling, as I say, using non-zero interpolation samples to adjust the sample frequency is a bad idea and is best avoided if you can… hence most oversampling interpolation uses even integer multiples of the sample rate, so that zero sample values can be inserted and the creation of error artefacts avoided.
One can also increase the sample word size, and there are different algorithms for achieving this, but this doesn’t necessarily affect the sample rate.
The issue with FIR filter kernels is that the accuracy for a given kernel or tap size reduces with sample frequency… so ideally the kernel or tap to sample rate ratio needs to be fixed, such that a higher sample rate uses more taps than a lower sample rate… there is also a relationship between sample resolution and kernel size (number of taps), although a Blackman Window function will lessen this sensitivity so smaller kernels or numbers of taps can be used (Watts uses his own WTA algorithm instead, but to me they achieve essentially the same thing). I remember discussing the kernel size to sample rate ratio issue with Rob Watts as well… he smiled wryly… he is quite aware of that issue.
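For the curious, here is a rough sketch of a Blackman-windowed sinc low-pass kernel, plus a helper that keeps the taps-to-sample-rate ratio fixed. It is a textbook construction with names and figures I made up for illustration, and it is certainly not the WTA filter.

```python
import math


def blackman_window(num_taps):
    """Standard Blackman window coefficients for a kernel of num_taps points."""
    if num_taps == 1:
        return [1.0]
    m = num_taps - 1
    return [0.42 - 0.5 * math.cos(2 * math.pi * n / m)
            + 0.08 * math.cos(4 * math.pi * n / m) for n in range(num_taps)]


def windowed_sinc_lowpass(num_taps, cutoff_hz, sample_rate_hz):
    """Low-pass FIR kernel: ideal sinc response shaped by a Blackman window."""
    fc = cutoff_hz / sample_rate_hz      # cutoff in cycles per sample
    mid = (num_taps - 1) / 2             # centre of the kernel
    window = blackman_window(num_taps)
    kernel = []
    for n in range(num_taps):
        t = n - mid
        ideal = 2 * fc if t == 0 else math.sin(2 * math.pi * fc * t) / (math.pi * t)
        kernel.append(ideal * window[n])
    total = sum(kernel)
    return [h / total for h in kernel]   # normalise so the gain at 0 Hz is 1


def taps_for_duration(duration_s, sample_rate_hz):
    """Keep the kernel's time span fixed, so the tap count grows with sample rate."""
    return int(round(duration_s * sample_rate_hz)) | 1  # force an odd length


# The same 1 ms window needs about 16 times as many taps at the 16x rate.
print(taps_for_duration(0.001, 44_100))   # 45
print(taps_for_duration(0.001, 705_600))  # 707
kernel = windowed_sinc_lowpass(taps_for_duration(0.001, 44_100), 20_000, 44_100)
print(len(kernel), round(sum(kernel), 6))
```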
One further thing I should add: last week, my trusty audio dealer swapped the Chord Electronics-provided dual BNC cables that came with the HMS for a pair of Chord Company Epics, and the difference was immediate. The Epics opened up what we heard with crispness and clarity. We were listening to David Sylvian at the time, and when we took the Epics out and replaced the Chord Elect cables DS’s voice became very obviously muddier. Epics were ordered on the spot. So a bit of a rider on what I am now hearing out of the HMS: maybe I am picking up on something from the cables and not the HMS/TT2 directly. Best to review again once the Epics are in place.