This article is in response to the critique of our work by Jeff Hwang in the October 2019 issue of Global Gaming Business Magazine.
Given the number of Hwang’s inaccuracies and mischaracterizations of our findings (and his own), we will not be able to address all of them in a single rebuttal. Before we respond to the details of his take on our work, we need to clarify a few key points in order to help Hwang and others better understand the results from the research in this domain.
For starters, the context, the specific aims, and the research questions of the cited studies must be clearly understood. For example, aggregated results such as game-level data comprise many outcomes produced by many individual players. So the mean (i.e., average) coin-in will not provide a meaningful measure of the average time on device, especially for the population of players who rely on that measure.
Why? Because a disproportionate share of the coin-in is produced by winning players (i.e., outliers). These lucky players have the capacity to contribute more coin-in by recycling jackpot money. Therefore, they should not be included in any attempt to understand average time on device, because winning players are not likely to invoke this measure of gaming value. Losing players are the ones who reference value measures such as play time.
When you simply divide a buy-in amount by par (e.g., $100 divided by 5 percent), and then conclude that the result (i.e., $2,000) reflects a player’s expected coin-in on that game, you have just assumed there will be no winning players. You have assumed that all players will lose their entire buy-in, no matter how many jackpots they hit. That is the fatal flaw in that shortcut math, with respect to the time-on-device question.
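To see the problem concretely, consider a minimal simulation. Everything below is an illustrative assumption of ours (the pay table, the $1 bet, the 2,400-spin session cap, i.e., roughly four hours at 600 spins per hour); it is not data or a game design from any cited study. It plays a 5 percent par game from a $100 buy-in and separates sessions that bust from sessions that end with credits.

```python
import random

# Hypothetical $1-per-spin pay table with a 5% par (95% RTP):
# expected return = 0.25*2 + 0.04*10 + 0.001*50 = 0.95.
# Purely illustrative; not from any cited study.
PAYS = [(0.25, 2), (0.04, 10), (0.001, 50)]  # (probability, payout in $)

def spin():
    r = random.random()
    cum = 0.0
    for p, pay in PAYS:
        cum += p
        if r < cum:
            return pay
    return 0

def session(bankroll=100, max_spins=2400):
    """Play $1 spins until broke or the session ends; return (coin-in, final bankroll)."""
    coin_in = 0
    for _ in range(max_spins):
        if bankroll < 1:
            break
        bankroll -= 1
        coin_in += 1
        bankroll += spin()
    return coin_in, bankroll

random.seed(1)
results = [session() for _ in range(10_000)]
mean_all = sum(c for c, _ in results) / len(results)
busted = [c for c, b in results if b < 1]  # the losing players
print("shortcut estimate: $100 / 0.05 = $2,000 of coin-in per player")
print(f"simulated mean coin-in, all players:    ${mean_all:,.0f}")
print(f"simulated mean coin-in, losing players: ${sum(busted) / len(busted):,.0f}")
```

The shortcut figure describes a world in which every player churns the entire buy-in, and every jackpot, back through the game until it is gone. The moment some sessions end with credits on the meter, the realized coin-in of losing players (the figure relevant to play time) diverges from the shortcut estimate.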
Hwang committed a similar error in his critique of Lucas and Brandmeir (2005), when he divided the mean theoretical win by the par and concluded that the resulting coin-in estimate reflected the play time on the games. Again, this math assumes no winners walk with any credits. To accomplish his objective, he would need to isolate and compare the coin-in produced by losing players on both games. This type of overly simplistic calculation is common, and it is likely responsible for the widespread misunderstanding of the relationship between par and time on device.
Additionally, it’s important to note that the aim of Lucas and Brandmeir (2005) was to understand how changes in reel pars affected game revenues, as opposed to coin-in or play time. Specifically, the casino operator wanted to know which par would produce the most revenue at the game level. You cannot take the results of a study designed to answer one question and shoehorn them into an argument for another.
For the record, 17 years ago, when we began conducting this research, we very well may have expected to find a negative correlation between a game’s par and its coin-in, even in reels. After all, that’s how it was initially explained to us. But as we completed research in the area of performance-potential modeling, we began to grow suspicious of this explanation.
That’s science. New information leads to an improved understanding of how the world works.
Performance-Potential Models
Our performance-potential models were used to predict the coin-in produced by each game, given its specific game characteristics and, most importantly, its location on the floor. Of course, we used a variable representing the par of the games, as we then considered it to be an important game characteristic. But the par variables generally failed to produce the expected results. That is, early results were not providing definitive evidence that increases in par were producing decreases in game-level coin-in on reel slots.
Specifically, the par variable failed to produce a statistically significant effect on game-level coin-in in Lucas et al. (2004). That is, knowing the value of the game’s par told us nothing about its coin-in. In Lucas and Dunn (2005) the par variable failed to make the final model. Finally, Lucas and Roehl (2002) did produce a significant and negative effect on game-level coin-in, but that sample only included video poker games, which effectively identify price via the pay table.
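For readers unfamiliar with the approach, the sketch below shows the general form of such a model: a regression of game-level coin-in on game characteristics, including par. The variable names, coefficients, and data are all fabricated for illustration; the published models included far richer predictor sets (detailed floor-location variables, denomination, game type, and so on).

```python
import numpy as np
import statsmodels.api as sm

# Fabricated game-level data for 300 reel slots. Note that coin_in is
# generated WITHOUT any dependence on par, mimicking the published
# finding that knowing a game's par told us nothing about its coin-in.
rng = np.random.default_rng(42)
n = 300
par = rng.uniform(0.04, 0.12, n)           # house advantage
near_entrance = rng.integers(0, 2, n)      # crude location dummy
denom = rng.choice([0.01, 0.05, 0.25], n)  # denomination in $
coin_in = 5000 + 3000 * near_entrance + 20000 * denom + rng.normal(0, 1500, n)

X = sm.add_constant(np.column_stack([par, near_entrance, denom]))
fit = sm.OLS(coin_in, X).fit()
print(fit.summary(xname=["const", "par", "near_entrance", "denom"]))
# Expect a non-significant coefficient (high p-value) on par: on this
# fabricated floor, a game's par tells you nothing about its coin-in.
```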
Hwang neglected to mention that the only result from our work that supported his argument had nothing to do with the issue of par detection on reel slots. We felt it necessary to set the record straight, hence this explanation of our results. It’s always a good idea to read the original studies; this is why academic editors bristle at citations of secondhand interpretations of research.
Simulation Research
As for Hwang’s take on Lucas and Singh’s (2011) simulation of reel slot play, we could not disagree more with his assessment that the study was suspect. In fact, it demonstrated that, with respect to the outcomes of play, there was no difference to detect between games with different pars.
Specifically, if you plotted the outcomes from play on the games with different pars, then you would see that they fell almost entirely on top of each other (i.e., imagine two bell curves resting nearly on top of one another). If this “picture” doesn’t demonstrate that detecting differences in pars is highly unlikely, then we don’t know what does. If there is no session-level difference in results to detect, how can you expect players to detect differences over more sessions?
Moreover, Lucas and Singh (2011) included a scenario in which a gambler played each of two games on 10,000 visits (i.e., every night for more than 27 years), consistently producing a no-difference result.
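For intuition, here is a minimal sketch of that “two bell curves” picture. The single 10-for-1 pay table is an assumption of ours, not the game design from Lucas and Singh (2011); it simply lets each game hit its target par.

```python
import random
import statistics

def session_result(par, spins=500, bet=1):
    """Net result of one session on a game paying 10x with probability (1 - par) / 10."""
    p_win = (1 - par) / 10  # RTP = 10 * p_win = 1 - par
    net = 0
    for _ in range(spins):
        net -= bet
        if random.random() < p_win:
            net += 10 * bet
    return net

random.seed(7)
for par in (0.05, 0.10):
    results = [session_result(par) for _ in range(10_000)]
    print(f"par {par:.0%}: mean session result {statistics.mean(results):+6.1f}, "
          f"std dev {statistics.stdev(results):5.1f}")
```

Under these assumptions, the expected difference between the two games over a 500-spin session is only $25, while each game’s session standard deviation is roughly $65. Plot the two histograms and they land almost entirely on top of one another, which is exactly the picture described above.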
Further, in the real world, players do not wager the same amount on every spin, and they do not play each of two games for the same number of spins. These realities only bolster the conclusions described in Lucas and Singh (2011). That is, varying the bet and failing to isolate play on the experimental games would only make the detection of a difference in the pars all the more difficult.
Hwang also objected to the use of virtual players in the simulations from Lucas and Singh (2011), but the point of that paper was to demonstrate that players would not have the means to detect a difference in the pars of the paired games based solely on the results of their play. Therefore, it would make no difference whether the subjects were virtual players, AI bots, or actual humans. It would seem as though his interpretation of that study failed to consider its primary aim.
Again, Lucas and Singh (2011) demonstrated that the results from play would not be sufficient to detect a difference in the pars at the session level, or over time (i.e., 27+ years of daily play). The long-term result is an aggregation of short-term experiences. Play must occur one session at a time.
If the argument becomes that reel slot players are somehow able to detect slight differences in the frequency of very infrequent jackpots, then we give up. Such a belief is anchored in a heroic assumption, and to the best of our knowledge, there are simply no data or research to support this ability/assumption.
Further, we know from the work of Nobel laureate Daniel Kahneman and his longtime collaborator Amos Tversky that the capacity of human beings to successfully perform such a feat is simply not supported by our behavior. Still, the frequent claims of these superhuman abilities to detect such differences in jackpot frequencies led us to our next stream of research.
Field Studies
This body of work was conducted in the field over long periods of time, to give frequently visiting players ample opportunity to detect differences in the pars of otherwise identical games. These studies analyzed the results of multiple two-game pairings featuring identical visible pay tables, but different pars.
Hwang’s interpretation of these findings was off target as well, as he mischaracterized our outcomes to support his position. For example, he noted that five of six pairings in Lucas and Spilde (2019a) produced a greater mean coin-in for the low-par game, citing this as evidence of greater play time. First, what did we say about using mean coin-in at the game level as a proxy for play time? Why would you include the exaggerated contributions of winners in an attempt to understand play time? It’s the losing players who care about that.
Second, his report of our results was materially mischaracterized: only five of the 11 pairings across the two studies he cited produced statistically greater coin-in levels for the low-par games (with outliers omitted). Again, it doesn’t really matter, as these results are not valid measures of play time. But it is interesting.
Third, and most importantly, he failed to include our finding of no significant change in the coin-in levels of the paired games over time. That is, if players (en masse) could detect a difference in the pars, then why was there no increase in the coin-in on the low-par games, over such long sample periods, in heavy repeater markets? Alternatively, why no decrease in the coin-in levels on the high-par games? These outcomes simply do not support a sensitivity to pars.
If players are so perceptive to changes in pars, then we would certainly expect to see a clear migration of play (i.e., coin-in) to the low par games, especially given the egregious differences in the pars of the paired games (in all three studies). The pay tables of the paired games were identical, i.e., players couldn’t win any more money on the low-par games.
Hwang repeatedly insists that this relationship is simple economics, citing the law of demand as the governing process. The fatal flaw in that argument is that the law of demand assumes the price is known to the buyer. A reel slot’s par is not posted, and our results indicate that players cannot infer it from the outcomes of their play.
The purpose of Lucas and Spilde (2019a) and (2019b) was to help operators make better decisions about which pars to select for the games they place on their floors. Initially, we wanted to see which game would produce more revenue, i.e., theoretical win. We discovered that the high-par games generated significantly more theoretical win in every instance. This was an important finding, as nearly all operators are attempting to optimize revenues, not coin-in.
But the problem was a little more complicated. If a clientele of frequent players could eventually detect the increased pars, then an exodus of play and/or brand damage could occur. Of chief concern was that the high-par game might outperform the low-par game for a few months, only to be eventually discovered by the clientele. If discovered, then we would expect to see a change in the daily theoretical win and coin-in differences over the sample periods.
The test of detection was the change in the daily difference in coin-in between the two games, not the difference in the amount of coin-in on each game. The amount of coin-in at the single-game level told us nothing of importance. For instance, it did not matter that coin-in on Game A was greater than it was on Game B. What mattered was whether the daily difference in coin-in on Games A and B changed over time. Such a change would indicate detection of a difference in the pars of the paired games. After all, why would anyone play a 10 percent game over a 5 percent game if both offered the same visible pay table? Despite this expectation, we saw no statistically significant change in the difference of the daily coin-in over the sample periods, in any of the two-game pairings, in any of the three studies (i.e., Lucas & Spilde, 2019a, 2019b; Lucas, 2019).
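In code, that detection test reduces to asking whether the daily coin-in difference trends over the sample period. The sketch below runs the test on fabricated daily series purely to show its form; the published studies used real slot-system data and, naturally, more careful models.

```python
import numpy as np
from scipy import stats

# Fabricated daily coin-in for a paired low-par game (A) and high-par
# game (B) over a 274-day sample. No trend is built into the data.
rng = np.random.default_rng(0)
days = np.arange(274)
coin_in_A = 8000 + rng.normal(0, 1200, days.size)
coin_in_B = 7000 + rng.normal(0, 1200, days.size)

# Detection test: regress the daily difference on time. A significant
# positive slope would indicate play migrating toward the low-par game
# (i.e., evidence of par detection); a flat slope would not.
diff = coin_in_A - coin_in_B
fit = stats.linregress(days, diff)
print(f"slope: {fit.slope:+.2f} per day, p-value: {fit.pvalue:.3f}")
```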
Hwang attempted to claim that our results supported his argument by incorrectly interpreting the meaning of game-level coin-in. Specifically, he advanced an overly simplistic and incorrect conclusion that more coin-in resulted in more play time. You simply cannot conclude this from our data. To effectively make that argument, you would need the coin-in generated by losing players on each game (and those results might surprise you). Academic research is carefully designed to test specific hypotheses. The results from one experiment cannot necessarily be hijacked as support for a different research question, i.e., one which they were not designed to answer.
Omitted Studies
There was a third field study that escaped Hwang’s critique. It involved a sample of two-game pairings, with daily game-level results collected over a nine-month period. The findings indicated significantly elevated theoretical win levels for the high-par games and no evidence of significant play migration to the low-par games, over a 274-day period.
These results fully supported those from Lucas and Spilde (2019a, 2019b). Additionally, we suggest taking a look at Lucas and Singh’s (2007) work, which examined the pulls-per-losing-player outcomes on five different games. The game with the highest par produced the greatest number of pulls, while the game with the lowest par produced the fewest. This result highlighted the considerable limitations of the popular shortcut thinking behind the notion that lower pars create more time on device.
All we had to do to break this heuristic was make minor adjustments to the standard deviations of the games. These results should give you pause if you believe that lower pars necessarily produce greater play time.
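Here is a minimal sketch of that mechanism, with invented pay tables rather than the actual five games from Lucas and Singh (2007). It pairs a higher-par, low-volatility game against a lower-par, high-volatility game and counts pulls per losing player, where “losing player” is taken to mean a session that went broke within the cap, an assumption of ours.

```python
import random
import statistics

def pulls_until_bust(pay, p_win, bankroll=100, cap=1200):
    """Count $1 spins until the bankroll is gone or the session cap is hit."""
    pulls = 0
    while bankroll >= 1 and pulls < cap:
        bankroll -= 1
        pulls += 1
        if random.random() < p_win:
            bankroll += pay
    return pulls, bankroll

# Two hypothetical $1 games (illustrative only):
#   X: 10% par, LOW volatility  -- pays 1.8 with probability 0.50
#   Y:  5% par, HIGH volatility -- pays 19  with probability 0.05
games = {"X (10% par, low s.d.) ": (1.8, 0.50),
         "Y ( 5% par, high s.d.)": (19, 0.05)}

random.seed(3)
for name, (pay, p_win) in games.items():
    sessions = [pulls_until_bust(pay, p_win) for _ in range(10_000)]
    loser_pulls = [n for n, b in sessions if b < 1]  # busted sessions only
    print(f"{name}: mean pulls per losing player = {statistics.mean(loser_pulls):,.0f}")
```

Under these assumed pay tables, the low-volatility game grinds bankrolls down slowly and predictably, so its losing players accumulate more pulls despite the higher par. Direction, not magnitude, is the point: volatility, not par alone, governs time on device.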
Additional Inaccuracies
We take issue with other interpretations and conclusions Hwang has drawn from our work. Contrary to Hwang’s review, we have made no claims about the effects of pars outside of the ranges we have examined. More specifically, we have made no claims about what would happen beyond a difference in pars of 9 percentage points. We have never advocated or suggested raising pars in perpetuity, or that players will increase their gaming spend in perpetuity. We have suggested that increases in casino win would likely stem from paying out the greater jackpots a little less frequently. We hold that these subtle decreases in jackpot frequency would not be noticeable within the par ranges that we examined, and the body of our results most certainly supports that conclusion. We have also noted the macro-level advantages of increased pars with respect to optimizing revenues from time-constrained players.
Hwang suggests that the ability of players to perceive differences in pars takes a backseat to their observed behavior, insinuating that our work is somehow not relevant. First, it is not our argument that players can perceive a difference—that comes from the industry. Second, perception does matter. In that regard, we share the industry’s concern. If players could perceive a difference in pars it would almost certainly affect their behavior. Third, their behavior strongly suggests that they cannot perceive a difference. If you read the research, you will see that the findings speak for themselves.
Problems with the Counter-argument
Regarding Hwang’s presentation of data from Iowa’s casinos and the AGEM-sponsored report he references, we’ll start by noting that such cross-tabulations do not establish causal relationships. In most cases, you cannot conclude much of anything from surveys and cross-tabulations alone. Given the complexity of the par-performance relationship, these datasets would at best represent starting points for more meaningful inquiries. The AGEM report included limiting language to this effect, which is laudable.
To prove cause and effect, you need to establish three things: (1) the cause must precede the effect in time; (2) all other competing/potential sources of influence must be eliminated; and (3) you must establish perfect correlation between the cause and the effect, i.e., when the cause is present, the effect must always be present, and when the cause is absent, the effect must always be absent.
Hwang’s cross-tabulation of Iowa data does not establish any of these conditions. In his defense, it’s a difficult standard. Nearly all social science studies fall short of proving cause and effect, including ours. But you must come much closer than Hwang’s table data to make such bold causal claims. Carefully considered experimental design is critical in understanding causal relationships. Hwang insists that the effect of rising pars is “black and white” and that advanced statistical methods and careful experimental designs are not needed to understand the relationships between par, revenue, and coin-in. His matter-of-fact comments about his findings and his insistence that this question is one of “simple economics” simply do not hold up within the established requirements for proving cause and effect. The basis for his conclusions is not credible.
Further, Hwang asks us to assume that Iowa is a relatively quiet market unaffected by external forces. We know of no economist who would buy that argument. There were material events in the border states of South Dakota, Illinois, and Missouri which surely impacted gaming volumes in Iowa, during the period in question. Most notably, Illinois introduced VGTs in 2012 at truck stops, bars, restaurants, and other locations. Could this have impacted the decline in slot volume in Iowa beginning in 2012? Hwang felt this was not worth mentioning in his argument that rising pars in Iowa caused annual slot win to “hit a wall” in 2012. See the AGEM (2015) report that Hwang references, for a list of other events in Iowa’s border states with potentially negative impacts on its slot revenues.
Additionally, the free-play explosion in the post-recession period, along with technological advances in the button deck and reel-spin speed, all likely impacted the slot player’s experience. Those two technological advances have often been cited as contributors to abbreviated play time, as have increases in forced minimum wagers. Increases in both (1) the number of games requiring elevated forced minimum wagers and (2) the dollar value of the forced minimums would have decreased play time during the period in question. And all of this is to say nothing of possible changes in consumer sentiment regarding gambling expenditures on the heels of the Great Recession.
To make causal claims about the impact of rising hold percentages in Iowa is simply unfounded. What is even more bewildering is that the AGEM report cited by Hwang paints Iowa as a strong performer (if not a counterexample to his own argument). Specifically, its indexed annual slot win dipped very slightly below its indexed hold percentage only once between 2008 and 2014. In most years within this period, its indexed slot win exceeded its indexed hold percentage, in spite of rising hold percentages.
Concluding Remarks
At the end of the day, each person must weigh the evidence presented by both sides of this argument and decide for themselves. On one side, there is consistent evidence produced by nine academic studies that have been carefully designed, peer-reviewed, and accepted for publication in top-tier journals. On the other side, there are overly simplistic and anecdotal arguments, based in casual observation and table data, with a heavy dose of “Trust me.” And all of that is backed up with flawed math, a misinterpretation of research results, and a misunderstanding of how the games actually produce outcomes.
One argument is empirically supported by results from performance-potential models, computer simulations, and experimental field studies, while the other side presents no results from rigorous, objective, and scientific studies. We are not trying to change anybody’s mind, we have no political agenda, and we respect everyone’s right to their opinion. But in any attempt to understand a process, we must all consider how we come to know things. This path to knowledge often comes down to choosing between a political or wishful version of the truth and an objectively derived one (i.e., a scientific truth).
Lastly, we appreciate the interest in this research as well as the challenges to its findings. Such challenges are an important part of the scientific process. Interested readers are welcome to contact us for a complete list of the studies referenced in this article.