DAC Evaluation Results
Comparing DACs
First, we admit up front that we were not exactly comparing DACs, per se. Whenever we bring it up, someone points out that our comparison covers much more than DAC Number One vs. DAC Number Two: there are additional amp stages involved, sometimes a headphone amp, and other control circuitry. The sum total of what makes up each of these entities we call a “DAC” is a whole conglomeration of circuitry that can be used as a two-channel system DAC. In most cases, the unit was a DAC/headphone amp. That is what we mean by DAC here: a System DAC.
Day One: Sighted Pairings
For the final evaluation, we settled on one pair of DACs that had distinctive and audible differences. This was determined on Day One through sighted tests of a number of DAC pairings with open discussion about what we were hearing.
PS Audio vs. Fiio D3: It was embarrassingly difficult to tell these two DACs apart. There were differences, but they were subtle and could not be identified with consistency by any panel member. The PS Audio also had trouble locking onto the TOSLINK source consistently, so it was only used briefly in this one sighted pairing.
Oppo HA-1 vs. Fiio D3: This pairing was also very difficult to tell apart consistently, although differences could be heard.
Oppo HA-1 vs. Fiio E7: With this pairing it was fairly easy to hear a difference, and all panel members felt they could identify these DACs consistently in an A-B test.
Oppo HA-1 vs. Audioengine D1: This pairing was, for me, very difficult to tell apart. I had compared them extensively in my own preparatory work, and when Dennis Young (Tesseract) later came to my laboratory to add his data to that from our weekend in Alabama, he also worked briefly with this pairing. His evaluation was brief, as it came at the end of other listening tests and fatigue was setting in. Dennis also had a difficult time telling the two units apart. My own assessment is that the D1 and HA-1 were virtually indistinguishable.
Oppo HA-1 vs. Headroom Desktop Headphone DAC/Amp: We did not have time to work with this pairing in Alabama. I did my own A-B comparison back home and was able to tell them apart consistently, using the final A-B test method employed during our weekend.
How does one listen? And what does one listen for?
These turned out to be very important questions. When we did our High-End Amplifier Evaluations, we used a style of ABX testing: having no idea which amps we were even listening to, the test subject would first listen to one of them, X (randomly chosen). Then, after a gap of 20 seconds or so for some randomization switching, the subject would take the remote, toggle back and forth between A and B, and try to determine which of them was the X amp he had first heard. That 20-second gap was too much for my own auditory memory. It does not sound that hard, but when you break down the mental processes at work, it can get more than a little complex. The process of trying to determine which of the two was the same as the original X unit was just too much for me to sort through reliably, although others did much better.
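For readers who like to see the shape of a procedure spelled out, here is a minimal sketch of that kind of ABX trial in Python. The unit names, prompt wording, and the get_answer callback are hypothetical stand-ins, not anything we actually ran; only the structure of the trial (a randomly chosen X, followed by an A-or-B identification) comes from the description above.

```python
import random

def abx_trial(unit_a="Amp 1", unit_b="Amp 2", get_answer=input):
    """One ABX-style trial: X is secretly one of the two known units.
    The listener hears X first, then toggles between A and B and must
    say which of the two was the X unit."""
    x = random.choice([unit_a, unit_b])              # hidden identity of X
    print("Playing X for the reference listen...")   # ~20-second gap in our amp tests
    answer = get_answer("Was X the A unit or the B unit? (A/B) ").strip().upper()
    guess = unit_a if answer == "A" else unit_b
    return guess == x                                # True means a correct call
```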
For our DAC evaluation weekend, we had agreed that we would use a simpler approach, but it was not until we were in the middle of blind evaluation that we fully understood what that meant. With a test pairing chosen for the blind A-B test on Day Two, we went about A-B testing to see how well we could identify one or the other.
Leonard went first. I shuffled which DAC was which (A or B) and handed Leonard the A-B comparison remote, so all he could do was toggle back and forth until he had made his determination, then stop and identify the DAC he was listening to as his favored DAC. But it turned out we were not quite in sync about exactly what we were doing. Leonard would start his test track, toggle back and forth, and finally stop and say, “That's the good one.” I would tell him which DAC he had stopped on, and he entered that into his data set. Then we would do it again. Leonard did a number of those tests using a number of tracks, and then it was Louie’s turn. I did the same with him for a few minutes, and at some point both Sonnie and I realized that we had a communication problem. The testing stopped for about 10 minutes, during which there was quite an enthusiastic “discussion” about what we were doing. Sonnie and I had thought they were trying to identify one or the other of the two DACs, but that was not what they were doing at all.
They had identified a listening quality for a given track, and would pick their favorite DAC for that listening quality, with the goal of picking the same one over and over again. Which DAC was chosen was only an afterthought. As long as they picked the same one every time, for that listening quality on that track, they were being successful. I had missed that altogether. At first I thought we would have to completely start over with our testing, but at the end of our discussion we realized that both Leonard and Louie had used the same method for their evaluation and that Leonard’s data reflected it properly, and we were able to continue our testing.
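To make that method concrete, here is a minimal sketch in Python of how such a consistency run could be scored. The DAC names, the prompt text, and the get_pick callback are hypothetical placeholders; the only parts taken from our procedure are that the A/B assignment is re-shuffled for each trial and that the score is how often the favored unit turned out to be the same physical DAC.

```python
import random
from collections import Counter

def consistency_trials(dacs=("DAC 1", "DAC 2"), n_trials=10, get_pick=input):
    """Each trial: shuffle which unit is A and which is B, let the listener
    toggle and stop on the one preferred for the chosen listening quality,
    then log which physical DAC that actually was."""
    picks = []
    for t in range(n_trials):
        a, b = random.sample(dacs, 2)                      # hidden A/B assignment
        choice = get_pick(f"Trial {t + 1}: favored unit (A/B)? ").strip().upper()
        picks.append(a if choice == "A" else b)
    favored, count = Counter(picks).most_common(1)[0]      # consistency score
    print(f"Favored {favored} on {count} of {n_trials} trials")
    return picks
```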
Both Leonard and Louie had been doing quite well using the method they were following. When my turn came, I felt my approach would be a little different: I thought I would be able to identify directly which DAC was which, and I started out my testing that way. My results were not good at all.
Trying to identify a DAC directly, I was scoring a little worse than 50%, so I decided to try the other method, the one Leonard and Louie had used, and at that point my results really turned around. Their method ended up being much simpler mentally.
With a track and a listening quality selected, it was much simpler to switch back and forth and decide which one I liked better for that listening quality. For instance, the vocals on a Mindy Smith track were recorded very clean, and I would listen for the cleaner sounding of the two DACs on that track. Choosing the one I liked because it sounded cleaner, I was able to select the same DAC numerous times in a row and had a high success rate, as Leonard’s data will show.
There was still one little wrinkle for me. Sometimes, after I had selected the favored DAC, I would decide to do a few more toggles just to be sure. That almost always messed me up, and I would end up with the wrong answer. I did much better going with my first selection every time.
Using several different tracks and several different listening criteria, I completed my evaluation as Leonard and Louie had. Sonnie, our host, was not convinced he could hear any difference consistently and passed on this part of the testing.
When Dennis came to my house to add his own evaluation data, we used the same method for his work. His results tracked extremely well with those of the rest of us for the pair of DACs used for the body of our blind testing.
I have to say that I was quite surprised to find that my own original listening and evaluation method was so flawed, and delighted to find that a very simple approach was highly accurate. I have found ways to use that same method in my own private testing and evaluation work numerous times since then, always with great success.
I understand that this is not ABX testing as it is commonly defined. Our method was not double-blind, and the listeners knew exactly which DACs were involved throughout the blind testing. There are many ways to set up such tests or evaluations; some are extremely difficult from the listener’s perspective and some are easier on the listener. We purposely chose one that was fairly easy on the listener while still producing data that could lead to statistically significant and meaningful results.
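As a rough illustration of what “statistically significant” means for a run like this, here is a small Python sketch of a one-sided binomial test. The 9-of-10 figure is a made-up example, not our actual data; it simply shows how unlikely a long consistent streak is if a listener were purely guessing.

```python
from math import comb

def binomial_p_value(successes, trials, p_chance=0.5):
    """Probability of at least `successes` consistent picks out of `trials`
    if the listener were purely guessing (one-sided binomial test)."""
    return sum(comb(trials, k) * p_chance**k * (1 - p_chance)**(trials - k)
               for k in range(successes, trials + 1))

# Hypothetical example: the same DAC favored on 9 of 10 trials
print(round(binomial_p_value(9, 10), 4))   # 0.0107, well under the usual 0.05 cutoff
```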
Day Two: Evaluation Results
Leonard will add the data here.