
2018 Yale Invitational - Judges' Choice



    Yale Tab Director here. We ran a small experiment at this year's tournament, and I wanted to share the results and the raw data with everybody. We asked every scoring judge to answer this question on their blue ballot: Which team put on the better performance? (This is one of the tiebreakers used for a National Championship Trial.)

    We're interested in the relationship between scoring more points on a ballot and being picked as the "judges' choice." I've attached to this post the raw data we collected, as well as a table summarizing a few different "cuts" of the data. We'll also be including these in the final tab summary sent to and published by AMTA.

  • #2
    Super interesting findings! Thank you so much for experimenting with this and sharing the results. This seems to present a pretty troubling picture of defense bias. If you all have the blue ballots and the judge information cards, it would be interesting to see whether any judge characteristics made a disjunction between opinion and score more likely (maybe judges without mock experience, or judges with a small range of scores).


    • #3
      What I find particularly interesting is that the judges' choice reversed the case imbalance for this tournament. The P/D split was 52.97% P to 47.03% D for judges' choice, but on ballots it was 45.91% P to 54.09% D (counting ties as half a ballot for each side). This means that, on the whole, judges watched the trial and tended to think P was better, but scored as though D was better. My guess is that this has to do with what AMTA has pointed out about judges tending to score higher as the trial goes on.
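For anyone checking the arithmetic, here's how the "ties count as half a ballot" convention produces side-split percentages. This is a minimal sketch with made-up ballot counts, not the actual Yale tab data:

```python
# Hypothetical ballot counts -- illustrative only, NOT the Yale tab data.
p_wins, d_wins, ties = 45, 53, 2
total = p_wins + d_wins + ties  # 100 ballots in this example

# A tied ballot contributes half a ballot to each side.
p_share = (p_wins + 0.5 * ties) / total
d_share = (d_wins + 0.5 * ties) / total

print(f"P: {p_share:.2%}  D: {d_share:.2%}")  # P: 46.00%  D: 54.00%
```

By construction the two shares always sum to 100%, which is why the tournament splits above (45.91% + 54.09%) do as well.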

      Right now AMTA tends to write its cases to give P the factual advantage, because cases tend to be D-biased. This is, in part, because judges score higher as the trial goes on (it's unclear why), which gives D an advantage since D has most of its scores in the latter half of the trial. My guess is that the factual advantage AMTA gives P is biasing judges in favor of P on the judges' choice question, while the higher-scores-in-the-second-half effect is giving D the ballots on points.

      It does raise questions, though, about the proposals in recent years to use some sort of tiebreaker question like this to break tied ballots. My concern, based on the data, is that the tiebreaker would tend to give tied ballots to the side of the case with the stronger written-in facts. Right now it's the case writers' job to balance the facts to counteract all the odd effects like the upward score trend; if, in doing that, they also have to make sure they don't skew the tiebreaker question, it's going to make their jobs a lot harder.
      Last edited by Gadfly; December 9th, 2018, 03:35 PM.


      • #4
        I'll say for the record what my team's experience was with this experiment. We went 1-3 on the plaintiff side; obviously not ideal, but 2 of the 3 ballots we lost actually said we had put on the better performance. When we reviewed those ballots, we saw that our attorneys on that side scored better than the other teams' attorneys (on cross-examinations and statements), but our witnesses performed significantly worse (6s and 7s on direct and cross in rounds where the other team's witnesses were getting 8s and 9s), and most of the comments on the witnesses said that although we got out all the facts we needed, the witnesses just weren't entertaining/believable enough.

        Ultimately, this made a whole lot of sense to us. When judges think at the end of the round about who "put on a better case" (which is how a couple of our judges thought the question was worded), it makes sense that they'd consider which side was more convincing, i.e., which side would actually win the trial (especially judges who are real-world lawyers rather than former AMTA competitors). That leads them to give more weight to the performance of the attorneys in the round, particularly since (as has been mentioned) the plaintiff's attorneys have better facts to hammer on crosses and in statements. But, as we all know, that's not what's most important in mock trial. A judge might very well see a plaintiff McClellan, for instance, who gets out great facts but gives a flat performance, followed by a defense Thornhill who can't get out as many good facts but is really entertaining. After the round, that same judge might say the plaintiff put on the better performance because, well, they were more convincing, giving outsized weight to the ten points' worth of closing they just saw rather than the 60-90 points' worth (depending on how you count) of witnesses they saw and loved earlier in the trial, so that their ballot still has the defense winning the round. Obviously this is far more likely to make a difference in a close round, which is exactly what we see happening.

        Of course I don't want to extrapolate our team's results to all the other teams who lost on the blues but won on the tiebreaker question, but I would bet that if we were to look at all of those ballots, we'd see a similar trend for a lot of other teams. Just my two cents.
        Last edited by Scooter Schwinn; December 15th, 2018, 11:23 AM.


        • #5
          Originally posted by Scooter Schwinn
          When judges are thinking at the end of the round about who they thought "put on a better case" (which is the way a couple of the judges we had thought the question was worded), it seems to make sense that they'd consider which side was more convincing/which side would actually win the trial (especially for judges who are real-world lawyers and not former AMTA competitors)
          I'll add that a number of judges wrote, at the bottom of their blues, things like '[side] won the trial' or '[side] had better case.' So, it's certainly plausible that some judges interpreted the question in a way that puts more weight (perhaps unduly) on the merits of the cases presented. In the judges' presentations, I instructed judges to interpret the question as they saw fit. On the flip side, I did show them the question three times and asked them to write it down, so I would hope all the judges were starting from the same, correct question.

          And a little more food for thought: one judge, when I discussed this during the judges' presentation, remarked that he scores each part of the trial based on form (for example, how closely a direct examination mirrors one he would see in real life). But his interpretation of the judges' choice question would have him putting more weight on the content of both cases. I imagine every judge had their own interpretation of the question, which could explain some of the mismatch between who won on points and who was picked as the judges' choice.