It seems we are re-creating (albeit in a more civilised manner) the
Great Debate in audio between Objectivists and Subjectivists.
I believe tests have been performed where users are allowed to take the
ABX box to their home system and switch as often or as little as they
want, and take as long or as short as they want (weeks, for example).
The results were actually less accurate than in standard "short" ABX
tests (perhaps because of the well-documented effect of one getting
"used" to the sound of certain equipment).
I believe my Marantz sounds better than my Yamaha amplifier, but lately
I have begun to suspect that the way the Marantz looks and feels, as
well as my own beliefs, etc., affect my perception. The act of
listening is a combination of the ear and the brain, but the auditory
part of the brain is not isolated from other stimuli, so it seems
logical to think that they may affect how I hear. So in a way, it does
sound different --in a very real way. But take away the visual stimulus,
and it ceases to sound different.
As professionals, however, we have to guard against subjectivity when our
goal is accurate transfers. So my own ideal strategy when choosing a piece
of equipment would be:
1 - If there are metric differences (e.g. lower measured harmonic
distortion), then I will use the equipment with "best specs" I can
afford, even if I cannot tell the difference; else,
2 - If there are peer-reviewed, well-designed scientific studies (of any
sort, not just ABX) that show preferences (or differences) perceived
among a large or important enough group, I will use the perceived
"better" piece; else,
3 - I will try to determine any perceived differences in my own setup
via blind testing.
This is my ideal algorithm, but of course the three steps all affect
each other (e.g. Am I willing to spend $3000 extra on an amplifier that
shows a minimally better THD number, even if I know no one can ever
consistently tell the difference?). It seems pretty rational.
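For what it's worth, the three steps above can be sketched as a simple decision procedure. This is a toy illustration only: the candidate fields (`thd`, `affordable`, `study_preferred`, `blind_test_winner`) are hypothetical placeholders I have made up, not real measurements:

```python
def choose_equipment(a: dict, b: dict) -> dict:
    """Toy sketch of the three-step selection heuristic.

    Each candidate is a dict with hypothetical keys:
      thd               -- measured total harmonic distortion (lower is better)
      affordable        -- whether I can afford it
      study_preferred   -- preferred in peer-reviewed perceptual studies
      blind_test_winner -- preferred in my own blind tests
    """
    # Step 1: a metric difference decides it, if the better unit is affordable.
    if a["thd"] != b["thd"]:
        best = a if a["thd"] < b["thd"] else b
        if best["affordable"]:
            return best
    # Step 2: fall back to peer-reviewed perceptual studies.
    for candidate in (a, b):
        if candidate["study_preferred"]:
            return candidate
    # Step 3: fall back to my own blind testing.
    for candidate in (a, b):
        if candidate["blind_test_winner"]:
            return candidate
    # No detectable difference at any step: either will do.
    return a
```

Of course, as noted, real purchases blur the steps together (the $3000 question), so this is a statement of priorities rather than something one would literally run.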
This has been an interesting discussion. Thank you.
Malcolm Davidson wrote:
> For many people, the brain's ability to perceive subtle differences is
> severely limited when the samples are played sequentially. It's difficult
> to remember the "reference" track and the one being listened to in the
> moment. For large differences it is much easier, so a sequential test does
> have some validity when comparing, say, 96kHz/24-bit, CD, and MP3.
> When we did all the SDMI (Secure Digital Music Initiative) watermark
> evaluations, not only were we evaluating the watermarking technologies, but
> we were also (unintentionally) evaluating the abilities of the participants.
> This varied widely amongst all the so-called experts. With a certain amount
> of training people can improve their listening skills. We observed
> individuals who were perceived as "golden ears" who could not pick out a
> watermark, whereas there were some participants who could pick out certain
> watermarks with ease, yet did not have a reputation as an expert listener.
> Any type of comparative listening test is highly subjective, for the brain
> can pick out subtleties that we are not skilled at measuring. For example,
> bit-for-bit identical streams (separate files on a hard disk) have been
> consistently judged to be different by expert listeners, due in part to the
> buffering and speed matching of the data as it is read off the disk. That
> changes the small amount of jitter in the digital stream, which subsequently
> alters the noise floor of the D/A and subtly influences the spatial imaging
> of the stereo field. However, many people might describe it differently, and
> it might not necessarily be a bad thing. Given the complexity of the "supply
> chain" to the end user, how closely does the final product replicate the
> actual recording experience, do people care, and are they willing to pay
> for it?
> Malcolm Davidson
> ----- Original Message -----
> From: "Matthew Barton" <[log in to unmask]>
> To: <[log in to unmask]>
> Sent: Monday, January 28, 2008 11:30 AM
> Subject: Re: [ARSCLIST] A/B testing: another approach
>> It wasn't a scientific experiment--just an engineer having a bit of
>> fun--though he seems to have had a point to make, if not a detailed
>> hypothesis to test. I just thought the structure was interesting, and
>> worth considering if our goal is to develop a listening test or tests.
>> Perhaps the thing that I find most interesting here is that it involved
>> a real-time, uninterrupted listening experience of A, B, C, and D.
>> Perhaps the brain does respond differently to such a listening experience.
>> Matthew Barton
>> The Library of Congress
>> 101 Independence Ave., SE
>> Washington, DC 20540-4696
>> email: [log in to unmask]
>>>>> Marcos Sueiro Bal <[log in to unmask]> 1/28/2008 10:40:58 AM >>>
>> This is an interesting link, but as a scientific experiment it does not
>> seem very useful: What is the hypothesis? How are we quantifying it? A
>> statement such as "my fellow listeners appeared to be equally
>> uncomfortable" does not seem conducive to analysis.
>> If the highs were perceived not to be "as silky smooth" --in other
>> words, if the differentiating factor has been identified after just one
>> listen of a short passage--, should not the same listener be able to
>> correctly identify such a difference in a blind test? Logic seems to
>> indicate that he should, but perhaps the brain works in mysterious ways.
>> Incidentally, it seems that not all ABX tests have concluded that
>> listeners are less sensitive than we thought. I was told in school that
>> most average Joes can hear at most a difference of 1 dB, but a group of
>> 5 listeners in an ABX test perceived differences of 0.4 dB 93% of the
>> time (note: this is not a peer-reviewed paper, and this is from the ABX
>> web page, so it is not conclusive evidence).
>> Matthew Barton wrote:
>>> Here's a link to an article from the October issue of Stereophile, in
>>> which an interesting approach to blind testing is described:
>>> This is not an analog vs. digital article, and I'm not endorsing the
>>> test or its results, or any conclusions in the article, but I think
>>> the approach is interesting. Instead of an A:B comparison, in which
>>> listeners first heard A and then B and were asked for opinions, the
>>> engineer created a composite patchwork of different formats using a
>>> repetitive passage from a recent recording of Handel's Messiah. He
>>> didn't tell his audience that this is what they would be hearing:
>>> "It turned out that we'd been unwittingly involved in a blind
>>> test. The DVD-A was a ringer. Philip had chosen a Handel chorus in
>>> the same music is heard four times. He had prepared four versions of
>>> chorus—the original 24-bit/88.2kHz data transcoded straight from
>>> DSD master; a version sample-rate–converted and decimated to
>>> CD data; an MP3 version at 320kbps; and, finally, an MP3 version at
>>> 192kbps—and spliced them together in that order. The last three
>>> versions had been subsequently upsampled back to 24/88.2 so that the
>>> DAC's performance would not be a variable. The peak and average
>>> were the same for all four versions; the only difference we would
>>> would be the reductions in bandwidth and resolution. "-- from
>>> the Detectives," by John Atkinson, Stereophile, October, 2007.
>>> We can all argue about the specs here, but the most interesting thing
>>> to me is that the changes in the audio unfolded over four iterations of
>>> the same passage of music in the same recording. Listeners were not
>>> asked to use their memory of recording A to appraise recording B, or
>>> vice versa. They heard (or did not hear) the changes as part of one
>>> continuous listening experience.
>>> Matthew Barton