
Home Theater Shack 2015 High-End Amplifier Evaluation Event Reporting and Discussion Thread








This thread is a continuation of the High-End Amplifier Evaluation Event Preparations Thread previously under way.



The event has begun. Coming to you from southern Alabama, the Home Theater Shack Evaluation Team has assembled at Sonnie Parker's Cedar Creek Cinema for the 2015 High-End Amplifier Evaluation Event. We have amps, we have speakers, we have tunes, we have great eats, what more could one ask for?

Be reminded of the first law of audio evaluation events: they never go exactly as planned. Not everything gets there, not everything works, but you endeavor to persevere and get things done.

We have dealt with speakers unable to reach us in time, with cabling issues, with equipment not interfacing properly, with a laptop crash, with hums and buzzes and clicks and pops, with procedural questions - - - yet we forge ahead, adapt, evolve, redirect, and forge ahead some more - - - and the task of evaluating amplifiers is underway.

Speakers: We were unable to get the Chane A5rx-c and the Acoustic Zen Crescendo Mk II speaker pairs. We are running the Spatial Hologram M1 Turbo v2 and the Martin Logan ESL. Both are very revealing speakers, laying bare a lot of inner detail in our recordings. They will serve us well. The A5rx-c will be reviewed for HTS when available.

At the moment, the Holograms are serving as our primary evaluation tool. I will post setup details and interesting discoveries a little later. They are giving us a monstrous soundstage, the kind that eats small animals for breakfast, with extremely sharp imaging and very good depth acuity. They are extremely clear, getting into the realm of rivaling electrostatic transparency. Their in-room response is very good, with some expected peaks and dips, but still very listenable. The high frequency response is extended and smooth. The bass gives you that "Are you sure the subs are not on?" feeling on deeper tracks.

We decided to start with sighted comparisons and open discussion today, and blind tests tomorrow. The Audyssey XT32 / Dirac Live comparison has not been completed yet.

Have we heard differences? Yes, some explainable and some not. One amp pairing yielded differences that several evaluators are convinced they could pick in a blind AB test.

One thing I have learned for sure: The perfect complement to good southern barbeque is a proper peach cobbler. Add great company and you have a perfect get-together.

The Event
  • Date: Thursday evening, March 12th through Saturday evening, March 14th.
  • Place: Cedar Creek Cinema, Alabama, hosted by Sonnie, Angie, and Gracie Parker.
  • Evaluation Panel: Joe Alexander (ALMFamily), Leonard Caillouet (lcaillo), Dennis Young (Tesseract), Sonnie Parker (Sonnie), Wayne Myers (AudiocRaver).

The Amplifiers
  • Behringer EP2500
  • Denon X5200 AVR
  • Emotiva XPA-2
  • Exposure 2010S
  • Krell Duo 175
  • Mark Levinson 532H
  • Parasound HALO A31
  • Pass Labs X250.5
  • Sunfire TGA-7401
  • Van Alstine Fet Valve 400R
  • Wyred 4 Sound ST-500 MK II
The Speakers
  • Spatial Hologram M1 Turbo v2, courtesy Clayton Shaw, Spatial Audio
  • Martin Logan ESL
Other key equipment arranged specially for the event:
  • Van Alstine ABX Switch Box, recently updated version (February 2015)
  • miniDSP nanoAVR DL, courtesy Tony Rouget, miniDSP
  • OPPO BDP-105

As mentioned, our deepest appreciation goes to Sonnie, Angie, and Gracie Parker, our hosts, for welcoming us into their home. Look up Southern Hospitality in your dictionary, and they are (or should be) listed as prime examples.

This first posting will be updated with more info and results, so check back from time to time.




Amplifier Observations
These are observations from our notes about what we heard, supported by consistency between sighted and blind testing and across reviewers. While we failed to identify the amps in ABX testing, the raw observations from the blind comparisons did in some cases correlate with the sighted observations and with those of other reviewers. Take these reports for what they are: very subjective assessments and impressions which may or may not be accurate.


Denon X5200 AVR

Compared to other amps, several observations were consistent. The Denon had somewhat higher sibilance, was a bit brighter, and while it had plenty of bass it was noted several times to lack definition found in other amps. At high levels, it did seem to strain a bit more than the other amps, which is expected for an AVR compared to some of the much larger amps. Several times it was noted by multiple reviewers that it had very good detail and presence, as well as revealing ambiance in the recordings.

We actually listened to the Denon more than any other amp, as it was in four of the blind comparisons. It was not reliably identified in general, so one could argue that it held its own quite well compared to even the most expensive amps. The observations from the blind comparisons that shared common elements, either between blind and sighted comparisons or between observers, are below. The extra presence and slight lack of bass definition seem to be consistent observations of the Denon AVR, but everyone agreed that no difference amounted to a definitive advantage for any one amp that would lead us not to want to own or listen to another, so I think we can conclude that the Denon was a worthy amp to consider.

Compared to Behringer
- bass on Denon had more impact than Behr, vocals sounded muted on Behr
- vocals sounded muted on ML compared to Denon
- Denon: crisp highs preferred compared to Behringer which is silky.
- Denon is more present, forward in mids and highs than Behringer.

Compared to Mark Levinson
- Denon seemed to lack low end punch compared to ML.
- Denon is smooth, a certain PUSH in the bass notes, cellos & violins sounded distant, hi-hat stood out, distant vocal echo stood out, compared to ML.
- Denon bass seemed muddy compared to ML which is tighter.
- ML more distant strings than Denon.
- Denon is slightly mushy and fat in bass. String bass more defined on ML.
- ML seems recessed compared to Denon.

Compared to Pass
- vocals sounded muffled on Pass compared to Denon
- crisp bass on Denon compared to Pass
- Denon & Pass both even, accurate, transparent, natural, no difference, like both
- Pass seems soft on vocals but very close.
- Denon has a bit more punch on bottom, maybe not as much very deep bass, more mid bass.

Compared to Van Alstine
- bass on Chant track was crisp for VA while Denon was slightly sloppy
- sibilance not as pronounced on VA as it was on Denon
- VA super clarity & precision, detailed, space around strings, around everything compared to Denon which is not as clear, liked VA better.
- sibilance on Denon, VA has less “air” but more listenable, both very good
- Very deep bass more defined on VA, overall more bass on Denon.


Wyred 4 Sound ST-500 MK II

In the sighted listening we compared the ST-500 MK II to the Van Alstine Fet Valve 400R. The assessments varied but were generally closer to no difference. The Van Alstine drew comments of being fatter on the bottom. The Wyred 4 Sound was noted to have slightly better bass definition but apparently less impact there, and slightly less detail in the extreme highs. Most comments noted little, if any, difference in the midrange. An interesting observation came from Wayne, who noted that he did not think he would be able to tell the difference in a blind comparison. Considering that the ST-500 MK II is an ICE design and the Fet Valve 400R is a hybrid, we expected this to be one of the comparisons that would yield differences, if there were any. As I am always concerned about expectation bias, this was a pairing I was particularly wary of: Van Alstine is a personal favorite for a couple of us, so I expected a clear preference for it to show up in the sighted comparison. I felt that the Wyred 4 Sound amp held its own against the much more expensive, and likely to be favored, VA.

In the blind comparisons, we compared the ST-500 MK II to the Emotiva XPA-2 and the Sunfire TGA-7401 in two separate sessions. Of course, in these sessions we had no idea what we were listening to until after all the listening was done. In the comparison to the Emotiva, some notes revealed not much difference and that these were two of the best sounding amps yet. The ST-500 MK II was noted to have the best midrange yet, along with the Emotiva. It was described as having less sibilance than both the Emotiva and the Sunfire. Both the Emotiva and the ST-500 MK II were described as unstrained in terms of dynamics. In comparison to the Emotiva it was noted to have solid highs, lively dynamics, rich string tones, and punch in the bass. The overall preference in comparison to the Emo ranged from no difference to preferring the W4S.

In comparison to the Sunfire, comments ranged from preference for the W4S to not much difference to preference for the Sunfire. The Sunfire was described as having more presence in the midrange, while the Wyred was noted to be shrill, lifeless, and hollow by comparison.

These comments varied a lot, but the points of convergence were generally the similarities among three amps that, if any amps were going to sound different, would have been the most likely candidates. The objective result is that we failed to identify the amp in ABX comparisons against two other much more expensive amplifiers. Based on these results, I would have to conclude that the ST-500 MK II represents one of the best values out there and certainly should satisfy most listeners.





Audyssey XT32 vs. Dirac Live Listening Comparison

Last year HTS published a review of the miniDSP DDRC-22D, a two-channel Dirac Live Digital Room Correction (DRC) product. The review included a comparison to Audyssey XT. A number of readers requested a comparison of Dirac Live with Audyssey XT32. That comparison was recently completed during the Home Theater Shack High-End Amplifier Evaluation Event at Sonnie Parker's Cedar Creek Cinema in rural Alabama. This report provides the results of that comparison.

Go to the Audyssey XT32 vs. Dirac Live Listening Comparison Report and Discussion Thread.


Spatial Hologram M1 Turbo Speakers

I was very pleased with the Spatial Hologram M1 speakers we used for the amplifier evaluation, and felt that they more than fulfilled our needs. They did not become "gotta have them" items for any of the evaluators, although I had thoughts in that direction once or twice. But they were speakers we could easily ignore through the weekend, and I mean this as a high compliment. Never did an evaluator complain that the M1 speakers were "in the way" or "holding us back," and we were able to focus on the task at hand unhindered. That alone means a lot, and may say more about them than the rest of the review just completed.

Here is what they did for us:
  • Because of their high efficiency, amplifiers were not straining to deliver the volumes we called for. We could be confident that the amps were operating in their linear ranges and that if we heard a difference it was not due to an amp being overdriven.
  • The stretched-out soundstage opened up a lot of useful detail for us to consider in our evaluations. In discussing the soundstage at one point, there was a consensus that it might be stretched a little too far and might be "coming apart at the seams," showing some gaps, although this did not hinder our progress. My final assessment is that this was not the case, all due respect to the fine ears of the other evaluators. I elaborate on this point in the M1 Review.
  • They served well as a full-range all-passive speaker, able to reach deep and deliver 40 Hz frequencies with lots of clean "oomph," all without the need for DSP boosting and without subwoofer support.
I thoroughly enjoyed spending time with them, and wish to again thank Clayton Shaw of Spatial Audio for loaning them to us. A complete review of the M1 speakers has been posted.

Go to the Spatial Hologram M1 Turbo Version 2 Speaker Review.


A Soundstage Enhancement Experience

Sonnie's MartinLogan ESL hybrid electrostatics were set up very nicely when we arrived, so we avoided moving them through the weekend. There were some improvements made to the soundstage and imaging by way of treatments, and some interesting twists and turns along the way which turned out to be very informative.

I have documented the exercise in a separate post.

Go to the Soundstage Enhancement Experience thread.
Hi All --

New member here.

Recently at the Axiom boards we were having a discussion on amplifiers that ended up linking over to your blind high-end amplifier shootout. Reading through this thread inspired me to set up an account here.

It sounds like you guys did a great job, and the conclusions seem reasonable. My personal takeaway would be that if two solid state amps have sufficient power to avoid clipping (which is not always a given) then they will probably sound much more similar than dissimilar. Not everyone agrees, and definitely not everyone wants that to be true, but it is really good news for most of us. It implies that you can get into the regime of hair-raising (and hair-splitting) performance on a reasonable budget.

This is my favorite line of the whole deal:

A second wise man said ..... "There's something about those trademark McIntosh analog meters that turns many a mere mortal into Pavlov's dog ...drool... but I'll never be able to afford one."
That is pretty much the only amp I want to own and just for the record it would sound better than my Pioneer even if it sounds exactly the same....
I get it, and don't disagree that some of the enjoyment of owning a boutique product transcends the question of how it sounds.

I was caught by another comment in the thread about the statistics of incorrect blind associations and the question of how unlikely the same result would be in an unbiased "coin flip" type experiment.

The sample size was 28 (4 observers x 7 comparisons). Out of those 28 trials we got 11 correct (most of those were thanks to Dennis, BTW). The probability of that, if there were a .5 probability on each test, would be about 11%. That is certainly not low enough to conclude with a high degree of certainty that the test was biased, but it is still pretty unlikely for a fair test at .5 probability per trial.
I'm likewise a bit of a math junkie, and did a quick independent calculation of the expectations for this experiment. However, my results come out somewhat differently from what was quoted, and suggest that the outcome is only about a one-in-three anomaly, say 34:66. It may boil down to a difference in assumptions, but let me briefly describe how I would approach the problem from first principles:

The probability of getting N "heads" out of M coin-flip trials, where each trial is a 50-50 shot, should be P(N,M) = (0.5^M) * M! / [ N! * (M-N)! ].

A digression for the curious: the exclamation point there means "factorial", i.e. multiply together all whole numbers from yourself down to one. For example, 4! = 4*3*2*1 = 24. This is the number of distinct ways to rearrange the order of a number of objects: there are four ways to choose what goes in spot one, then three ways (for each of the four prior) to choose what goes in spot two, and so on.

What we need to count is how many distinct ways there are to get N heads out of M flips -- more possible ways means that outcome is more likely. This is the total number of ways to order the M flips divided by the ways to reorder just the N indistinguishable "heads" and just the (M-N) indistinguishable "tails". For example: if there are M=3 flips and N=2 heads, this can happen three ways; specifically, heads can occur on flips 1&2, 1&3, or 2&3. Correspondingly, this is 3! / [ 2! * 1! ] = 6 / [ 2 * 1 ] = 3. To turn this count of "ways to do it" into a probability, multiply by the probability of a single flip (0.5) raised to the power of the total flips (M). If the coin were biased, with a probability p other than 0.5 of being heads on one flip (and 1-p of being tails), the factor would be (p^N) * (1-p)^(M-N) instead. One can check that the sum of P(N,M) from N=0 up to N=M is indeed one.
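
If anyone wants to poke at this, here is a minimal Python sketch of the formula as just described. This is my own illustration under the same fair-coin assumptions, not something from the event itself:

Code:
from math import factorial

def p_heads(n, m, p=0.5):
    # distinct arrangements of n heads among m flips, times the
    # probability of any single arrangement
    ways = factorial(m) // (factorial(n) * factorial(m - n))
    return ways * p**n * (1 - p)**(m - n)

print(p_heads(2, 3))                           # 0.375, the 3-ways-out-of-8 example
print(sum(p_heads(n, 28) for n in range(29)))  # ~1.0, the probabilities sum to one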

The expected value is the average over possible outcomes, specifically N*P(N,M) summed over all N from 0 to M. This is 14 for M=28 (not too surprising). The relevant question then is how far a given experiment was from the expectation, and how likely it is to be at least that far from the expected result.

Let's stipulate in this particular case that to have a "more normal" outcome, the observation would have been in the range of, say, 12 to 16. It is important to notice that I am choosing to group together outcomes that are "more likely" or "more central" whether they have fewer correct answers or more correct answers than 14 out of 28. This "two-sided" probability computation is likely a key reason for the difference in my conclusion. The sum of the probabilities over this central region of the distribution is 65.5%. With these stipulations, for about 1 trial out of 3, the results (on a coin flip experiment) should be at least as far from the central value as was observed in the blind tests. I wouldn't classify them as a statistical anomaly at all. In other words, they would appear to me to be very consistent with the assumption of no reliably discernible differences, although it should be noted that the sample is still smallish.

Note: I wouldn't be too surprised to hear that the different conclusions are traceable to different assumptions: perhaps a one-sided distribution, or a pretabulated statistical table that does something more sophisticated to deal with small sample sizes, etc. If you use the approach I described to ask "how likely is it to get 11 or fewer heads" (leaving out 17 or more), then the answers already converge quite a bit.
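
And here is the same machinery applied to the experiment at hand, a self-contained check of the figures above (again my own sketch, assuming 28 independent trials at 0.5 each):

Code:
from math import comb

M = 28                                  # 4 observers x 7 comparisons
pmf = lambda n: comb(M, n) * 0.5**M     # P(exactly n correct) for a fair coin

print(sum(pmf(n) for n in range(12, 17)))  # ~0.655, the central 12..16 region
print(sum(pmf(n) for n in range(12)))      # ~0.172, one-sided P(11 or fewer)
print(pmf(11))                             # ~0.080, P(exactly 11 of 28)

The first number reproduces the 65.5% central-region figure; its complement is the roughly one-in-three tail discussed above.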

Hope that is interesting / Cheers
See less See more
I am not trying to be argumentative either

I used to be dead set certain that blind listening tests were irrefutable, until I saw a discussion (it could have been on another forum, and sorry, I cannot find it now) about music tracks being altered so that one had additional sounds; after the listeners had heard both tracks a few times, they started hearing the additional sounds in the track where they did not exist.
I am aware that the psychoacoustic abilities of the human system are very powerful and I am at least open to the possibility that blind A/B testing could be fundamentally flawed.

This is what I have seen as the most often cited amplifier testing protocol.
http://tom-morrow-land.com/tests/ampchall/
It seems straightforward and reasonable enough.

I am willing to bet a Coke that if you sat a $300 AVR alongside some electronic project boxes that would presumably switch the source and speakers between the AVR and someone's awesome HiFi rig, and the owner/listener had a clicker that turned on a green or red LED on one of the boxes to indicate whether they were listening to the AVR or the awesome HiFi rig, they would hear a difference when the AVR LED was lit even though no change had actually occurred.
Repeat the test unsighted, and A/B would become indistinguishable to the listener.
Good Morning Charlie, this is Jack and I am inviting you to my home to listen to 3 high quality amplifiers on my present reference systems and with a bit of training on what to listen for, you will be able to tell the difference closer to 100% of the time than you think. And I will buy the Coke (soft Drink) for your pleasure once you grasp how easy it can be. No challenge here, just an offer should you wish to travel just a few hours. :smile:
Your offer is very kind and generous, it is greatly appreciated.
If I ever make it to your part of the country it would be an honor to visit with you.
Good Morning Charlie, this is Jack and I am inviting you to my home to listen to 3 high quality amplifiers on my present reference systems and with a bit of training on what to listen for, you will be able to tell the difference closer to 100% of the time than you think. And I will buy the Coke (soft Drink) for your pleasure once you grasp how easy it can be. No challenge here, just an offer should you wish to travel just a few hours. :smile:
What model do you own? :smile:

Kelvin
Your offer is very kind and generous, it is greatly appreciated.
If I ever make it to your part of the country it would be an honor to visit with you.
The honor would be mine good sir.
I still contend that the "blind" in these evaluations should be literal. The amps should be boxed/concealed such that no one knows what the amp is until the event is over. Switch as you would, listen, record your impressions & only after all is done, unbox & see what is what. Of course, the person doing the hook ups would know, but they, perhaps, should/would not participate.
Double-blind supposedly prevents the test administrator from giving subtle clues through body language, facial expressions, etc. as to which is which.


Double-blind supposedly prevents the test administrator from giving subtle clues through body language, facial expressions, etc. as to which is which.


Like wearing a bag over my head ??
I still contend that the "blind" in these evaluations should be literal. The amps should be boxed/concealed such that no one knows what the amp is until the event is over. Switch as you would, listen, record your impressions & only after all is done, unbox & see what is what. Of course, the person doing the hook ups would know, but they, perhaps, should/would not participate.
Of course that is the best way to do it. It potentially makes procedures far more complex and possibly unmanageable/prohibitive. Blind works for me, generally, if carefully executed and thoroughly reported.

Edit: For me personally, this is partly because I enjoy experimenting but not all of the tedious detail involved in thorough scientific double-blind methods.
Sonnie the prankster showed up tonight with his Oppo remote app on his iPhone, messing with the Oppo track selection while the listener tried to select and listen to tunes.