Slots are known as games of pure luck, for which optimal strategies (minimizing loss and maximizing profit) are not available. Moreover, the inner characteristics of slot games and their statistical indicators are kept secret by their producers and operators, which prevents players from making informed decisions or using this information in their strategies, including strategies for choosing which machines to play.
The traditional view in casino gambling research is that gamblers are not able to identify slot machines with better odds in order to optimize their play. Two researchers challenged this view with their study in 2024, which revealed that seasoned players tend to play slot machines with better odds, while more experienced players exhibit higher consistency in selecting slot machines with better odds over time. These findings are consistent with the theory of tradeoff between exploitation and exploration in reinforcement learning.
The theory of tradeoff between exploitation and exploration in reinforcement learning as shown in Hugging Face Deep RL Course
Slot machines from non-transparency to empirical approach
The statistical indicators of slot machines (such as the return to player percentage and volatility index) and the probability of winning each prize in their payout schedule are kept secret by their producers and operators. They can only be derived from the parametric configuration of a slots game (number of unique symbols, number of stops on each reel, and the weighting of the reels), which is not made public, or by hard statistical work based on tracking results over the long run. Occasionally, the return to player percentage (RTP) is displayed in the menu of the machine, where the jurisdiction requires it.
With this lack of information, the player faces a “how to gamble if you must” situation, requiring choices under ambiguity. The player’s goal is to determine which machines to play, how much to wager and for how long, in order to maximize the expected gain and minimize the expected loss. This decision making problem is a tradeoff dilemma for the player and involves exploration and exploitation, namely trying out new machines to learn about their odds and playing those machines with better-known odds.
Although there are mathematical models that optimize choices and playing behavior in a given multi-slot casino setup, deriving solutions from them relies on data retrieved from long-term statistical observation. Besides, the complexity of the equations in such models makes them inapplicable in practice for the average player. These facts also raise doubt about whether human intuitions, even when based on unlimited trial opportunities, are sufficient for solving large-scale multi-slot problems.
The alternative to the inapplicable theoretical approach is obviously the empirical approach, concerning both players’ behavior and the gambling research.
In their study, Ye Hu (University of Houston) and Stowe Shoemaker (University of Nevada) analyzed field data on actual bets placed by slot players, in order to investigate players’ decisions when confronted with a highly complex multi-slot problem in a casino with thousands of slot machines.
The slots problem and wrong choices
The outcomes of a slot machine can be neither predicted nor assigned a probability by the player, because information about the inner design of a slot game is lacking. From a decision-making perspective, players tackle the slots optimization problem by using their intuition and making probabilistic choices under ambiguity.
A simplistic formulation of the slots problem is the abstract Bernoulli slot game, where each spin’s outcome is binary – either a win (1) or loss (0). Theoretical solutions to the multi-slots problem typically focus on the efficiency of the sampling process.
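To make the Bernoulli formulation concrete, the following is a minimal sketch of an epsilon-greedy player, one simple textbook rule for balancing exploration and exploitation. It is not the method of any study discussed here; the function name, the win probabilities, and the parameter values are all hypothetical and chosen only for illustration.

```python
import random


def epsilon_greedy(win_probs, n_spins, epsilon=0.1, seed=0):
    """Play a Bernoulli multi-slot problem with an epsilon-greedy rule.

    win_probs: hypothetical per-machine win probabilities (unknown to the player).
    Returns (number of spins on each machine, total number of wins).
    """
    rng = random.Random(seed)
    n = len(win_probs)
    pulls = [0] * n  # spins taken on each machine
    wins = [0] * n   # wins observed on each machine
    total_wins = 0
    for _ in range(n_spins):
        if rng.random() < epsilon or 0 in pulls:
            # Explore: with probability epsilon, or while some machine
            # is still untried, pick a machine at random.
            arm = rng.randrange(n)
        else:
            # Exploit: pick the machine with the best observed win rate.
            arm = max(range(n), key=lambda i: wins[i] / pulls[i])
        outcome = 1 if rng.random() < win_probs[arm] else 0  # win (1) or loss (0)
        pulls[arm] += 1
        wins[arm] += outcome
        total_wins += outcome
    return pulls, total_wins


# Three hypothetical machines; the player never sees these probabilities.
pulls, total_wins = epsilon_greedy([0.2, 0.5, 0.8], n_spins=2000)
print(pulls, total_wins)
```

A larger epsilon means more switching between machines (exploration), while a smaller one means staying longer on the machine that currently looks best (exploitation).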
A study conducted in 1995 assessed how well people imitate such theoretical solutions and found that even in the most basic form of the slots problem, players tend to make choices inconsistent with optimal theory. In particular, players tend to under-sample or over-sample the unfamiliar option, switch between options too often, focus too much on immediate winning or losing feedback, and exhibit excessive aversion to ambiguity by persistently choosing the known machine even if its win rate is low. Although their choices were not optimal, the subjects in this experiment did demonstrate an intuitive understanding of the direction of choices prescribed by the theory.
This study, and subsequent ones, suggested that people may be able to make relatively optimal choices in the slots problem based on intuition and experience.
Illustrative Slot Machine Zones in the Casino. Notes: Each zone contains multiple slot machines. The size and locations of the zones are illustrative (not to scale). Source: Research Gate
The virtues and aims of Hu and Shoemaker’s study
Past empirical research on sequential choices under ambiguity and reinforcement learning has mainly relied on evidence obtained from laboratory settings. The study of Hu and Shoemaker uses a large-scale setting based on actual bets made by casino players with years of experience. Its main goal is to determine whether, with great experience, players make choices that are directionally optimal in minimizing expected losses. A second aim is to identify indirect evidence of learning that may have contributed to these choices. Overall, it aims to support the hypothesis that gamblers, despite their traditional irrationality in making their choices in gambling, demonstrate the ability to rationalize and learn from their experience.
The slot games provide a large number of opportunities to learn, due to their brevity. Unlike table games (like blackjack or roulette), the outcome of a spin in slots shows up in seconds or less, allowing players to accumulate a decent number of empirical observations in a relatively short time.
Beyond the academic context of decision-making, the results of the study are intended to contribute to the practice of customer relationship management in the casino business. Indeed, from the casino’s perspective, it is better for players to know that they have the ability or skill to choose better-odds machines, which would enrich their casino experience.
The study aimed to answer three specific questions:
- Given the complexity of the slots problem in a casino with thousands of slot machines, do more experienced players choose slot machines with better odds?
- If yes, is there evidence that more experienced players deliberately choose machines to play?
- Is there any evidence supporting the occurrence of reinforcement learning among more experienced players?
The Number of Slot Players and Total Coin-In by Hour. Source: Research Gate
Setup and primary statistics
The study was conducted using data provided by a large US casino that operates a club membership program. Each member receives a card containing information that uniquely identifies the player. When playing a slot machine, players insert their card into the machine and earn points based on their wagers. The casino uses a customer relationship management (CRM) system to store and manage members’ information, including their betting history. This CRM system defines five membership tiers, Tier 1 to Tier 5, based on the total amount each player has wagered within a one-year period, in ascending order of tier points (Tier 1 members having wagered the least). The authors of the study used these tiers as a proxy measure for a player’s level of experience with slot machines.
The collected data covers activity during a single 24-hour period in 2008, and includes: Player Id, Slot number (identifying each machine), Game type, Zone, Denomination, Session start time, Duration (of the session), Games played (rounds played on a machine), Coin-in (the total amount wagered during the recorded session), and House advantage (fixed for each machine).
The average house advantage of the 4,381 machines observed is 0.080. The demographic data includes a total of 4,677 slot players with an average age of 64.109 years and average membership duration of 14.192 years at that casino. Female players represent 59.1% and male players 40.9%. A player played on average on 11.247 slot machines.
The following patterns and associations have been recognized in the statistical analysis:
Club tier and slot house advantage
The higher the club tier, the lower the average house advantage (0.085 for Tier 1 to 0.076 for Tier 5). This model-free evidence suggests that players with more gambling experience (proxied by club tier, that is, by the amount wagered over a year) play, on average, slots with lower house advantages (higher RTP).
Comparing the mean house advantages between each pair of club tiers, the authors found the differences statistically significant, which strongly suggests that player experience may be systematically related to the house advantages of the slot machines each tier tends to play. It is fair to infer that the evidence indicates statistically significant associations between the optimality of slot choices and players’ experience.
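As a rough illustration of what these averages mean in money terms, the sketch below converts a house advantage into RTP (RTP = 1 minus house advantage) and into the expected loss on a given coin-in. The coin-in of 1,000 units and the function names are assumptions made for the example; only the two house advantages come from the study.

```python
def rtp(house_advantage):
    """Return to player as a fraction: RTP = 1 - house advantage."""
    return 1.0 - house_advantage


def expected_loss(coin_in, house_advantage):
    """Long-run expected loss on a total amount wagered (coin-in)."""
    return coin_in * house_advantage


# Average house advantages reported for the lowest and highest tiers.
for tier, ha in (("Tier 1", 0.085), ("Tier 5", 0.076)):
    # 1000 is a hypothetical coin-in, used only to compare the two tiers.
    print(tier, round(rtp(ha) * 100, 1), round(expected_loss(1000, ha), 2))
```

On the same hypothetical coin-in, the gap between the two averages is 0.009 of every unit wagered, which is small per spin but accumulates over a large number of spins.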
Distribution of House Advantage across the Slot Machines at the Casino. Source: Research Gate
Club tier and number of slot machines
Learning is not directly observed in the setup of this study, but the mean number of machines played per minute indirectly informs us about how machine choices differ across club tiers.
This mean decreases monotonically as the club tier increases (0.172 for Tier 1 to 0.096 for Tier 5), suggesting that players in higher club tiers switched machines less frequently. Along with the house advantage statistic, these results indicate that more experienced slot players played on machines with better odds and were more likely to stick with the same machine. On the other hand, less experienced slot players played on machines with worse odds and switched machines more frequently. In behavioral terms, less experienced players, having more ambiguous information about slot odds, “explored” more by trying out machines, while experienced players, having better information, “exploited” the machines with better odds.
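A switching statistic of this kind could be computed from session records along the following lines. This is only a sketch: the exact definition used by the authors is not given here, so it assumes "machines per minute" means the number of distinct machines a player used divided by that player's total session minutes. The record layout and sample values are hypothetical.

```python
from collections import defaultdict


def machines_per_minute(sessions):
    """Distinct machines played per minute of play, for each player.

    sessions: iterable of (player_id, slot_number, duration_minutes) tuples.
    Assumed definition: distinct machines / total minutes across sessions.
    """
    machines = defaultdict(set)
    minutes = defaultdict(float)
    for player_id, slot_number, duration_minutes in sessions:
        machines[player_id].add(slot_number)
        minutes[player_id] += duration_minutes
    # Skip players with no recorded playing time to avoid dividing by zero.
    return {p: len(machines[p]) / minutes[p] for p in machines if minutes[p] > 0}


# Hypothetical records: player "A" moves between machines, player "B" stays put.
sample = [
    ("A", 101, 10), ("A", 205, 5), ("A", 330, 5),
    ("B", 412, 30), ("B", 412, 20),
]
print(machines_per_minute(sample))
```

Averaging the per-player values within each club tier would then give one figure per tier that can be compared across tiers.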
Empirical analysis by mathematical modeling provided the following results:
Choice of slot machines
To investigate whether more experienced players are more likely to choose slots with lower house advantages, the authors aggregated the session-level data and modeled the probability that a certain player chooses a certain slot machine, using a binary logit regression.
The results indicated that each higher club tier is less likely to choose a slot machine with a higher house advantage and confirmed that the moderating effect of player club tiers on house advantage becomes more negative as the club tier increases.
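The role of the interaction term in a binary logit model can be sketched as follows. The specification and the coefficient values are hypothetical and are not the estimates reported in the study; they only show how a negative interaction between house advantage and club tier makes higher tiers more sensitive to worse odds.

```python
import math


def choice_probability(house_adv, tier, b0, b_ha, b_tier, b_inter):
    """Binary logit probability that a player chooses a given machine.

    The linear index includes an interaction between house advantage and
    club tier; all coefficients passed in are illustrative assumptions.
    """
    index = b0 + b_ha * house_adv + b_tier * tier + b_inter * house_adv * tier
    return 1.0 / (1.0 + math.exp(-index))


# Hypothetical coefficients: no main effect of house advantage, and a
# negative interaction with tier.
coef = dict(b0=-3.0, b_ha=0.0, b_tier=0.0, b_inter=-2.0)

for tier in (1, 5):
    p_low = choice_probability(0.05, tier, **coef)   # machine with better odds
    p_high = choice_probability(0.12, tier, **coef)  # machine with worse odds
    print(tier, round(p_low, 4), round(p_high, 4))
```

With these made-up coefficients, moving from the better-odds machine to the worse-odds machine lowers the choice probability for both tiers, and the drop is larger for Tier 5 than for Tier 1.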
The study found a positive association between a player’s experience and the optimality of their choices, suggesting that players with more casino experience may possess genuine information about the odds of the slot machines (i.e., house advantage or RTP).
Interestingly, unlike club tiers, the interaction between house advantage and membership duration is not significant, suggesting that the most relevant experience for slot machine choice is not the time being a member of a casino, but the actual gambling activity at that casino.
Coin-in at chosen slot machines
To analyze how much money players bet on a chosen slot machine, the authors used a similar modeling method. Their results suggested that although house advantage on average has no significant effect on machine choices, it does have a negative effect on the amount players bet on the machine: the higher the house advantage, the lower the coin-in on that machine. Higher-tier players tend to place a larger share of their total wager on a single slot machine, and each higher club tier tends to bet less money on a slot machine with a higher house advantage. The least experienced players tend to bet more money on the machines with higher house advantages.
Overall, the results suggest that compared to “how much to bet”, “which slot machine to choose” is the more prominent factor reflecting the optimality resulting from accumulated gambling experience at the casino.
Box Plot of Machine House Advantage by Club Tier. Source: Research Gate
Session level analyses and house advantages
Turning to the question of whether there is evidence supporting learning among more experienced players, the authors of the study modeled the house advantage of a slot machine that a certain player played on during a certain session as a recursive function in terms of the previous session, along with other control variables.
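The idea of relating one session's house advantage to the previous session's can be illustrated with a simple lag-1 regression slope. This is a stripped-down sketch with no control variables, not the authors' model; the function name and the two sequences of house advantages are hypothetical.

```python
def lag1_slope(values):
    """Least-squares slope of each value on the one before it.

    values: house advantages of the machines a player used, in session order.
    """
    if len(values) < 3:
        raise ValueError("need at least three sessions")
    x, y = values[:-1], values[1:]  # previous session, current session
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("previous-session values do not vary")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


# Hypothetical player who drifts steadily toward lower house advantages.
settling = [0.16, 0.12, 0.10, 0.09, 0.085]
# Hypothetical player who bounces between very different machines.
bouncing = [0.05, 0.12, 0.05, 0.12, 0.05]
print(lag1_slope(settling), lag1_slope(bouncing))
```

A positive slope means a session's house advantage tends to track the previous one, while a negative or near-zero slope points to choices that do not build on the previous session.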
The results of this regression analysis suggested that players in higher club tiers tend to choose slot machines whose house advantages are more interdependent from one session to the next than those chosen by players in lower club tiers. It is fair to hypothesize that learning must pass a threshold of experience before it produces visible effects in the outcomes of play. These results and suggestions align with the findings of research on reinforcement learning in the slots context.
More experienced players, acquiring better information about slot machines odds, exhibit higher exploitation by focusing on slot machines with known better odds. Less experienced players exhibit higher exploration by randomly playing on slot machines with higher variation in house advantages.
Hourly Median House Advantage of Slots Played by Each Club Tier. Source: Research Gate
Temporal optimality
In a crowded casino, players can also consider factors such as traffic and machine availability when making their choices. The authors investigated whether optimality of the machines played by players from different club tiers varies over time, with the ceiling of optimality depending on casino traffic levels, approximated on an hourly basis.
The differences in house advantages across club tiers are consistent with the earlier findings, indicating that higher club tiers are associated with more favorable odds. In regard to temporal changes, club Tier 5 was found to move closer to the theoretical median house advantage between 3 a.m. and 8 a.m., the period of lowest casino traffic. This shift supports the idea that the majority of the most experienced players possess intimate knowledge about the most favorable machines to play, and that they take advantage of the low traffic in the early hours of the day to play on machines with lower house advantages.
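An hourly comparison of this kind could be tabulated as in the sketch below, which groups sessions by club tier and start hour and takes the median house advantage of each group. The record layout and the sample values are hypothetical and do not reproduce the study's data.

```python
from collections import defaultdict
from statistics import median


def hourly_median_house_advantage(records):
    """Median house advantage for each (club tier, start hour) pair.

    records: iterable of (tier, start_hour, house_advantage) tuples.
    """
    groups = defaultdict(list)
    for tier, start_hour, house_adv in records:
        groups[(tier, start_hour)].append(house_adv)
    return {key: median(values) for key, values in groups.items()}


# Hypothetical sessions that started during the 4 a.m. hour.
sample = [
    (5, 4, 0.06), (5, 4, 0.07), (5, 4, 0.08),
    (1, 4, 0.09), (1, 4, 0.10),
]
print(hourly_median_house_advantage(sample))
```

The median is used here rather than the mean so that a few sessions on machines with unusually high or low house advantages do not dominate an hour with little traffic.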
Conclusion
The study of Hu and Shoemaker found evidence consistent with experience-related learning, where more experienced slot players tend to choose slot machines with better odds. Faced with thousands of slot machines with different house advantages, more experienced players seem to make less random choices than less experienced players. It is suggested that players acquire information about slot machines through three main channels: reinforcement learning, observational learning, and word-of-mouth.
Slots are recognized by problem gambling experts as the most addictive games of chance and as a representative environment in which the classical gambling cognitive distortions (such as illusion of control, overestimation of chances, and various fallacies and myths) manifest.
Evidence for associations between gambling experience and rational or optimal play opens new streams of challenging research on gambling cognitive distortions, including research on theoretical (systematic) learning versus empirical learning in the context of playing information.
References:
Camerer, C., Weber, M. (1992). Recent developments in modeling preferences: Uncertainty and ambiguity. Journal of risk and uncertainty, Vol. 5, 325-370.
Dubins, L. E., & Savage, L. J. (1965). How to gamble if you must: Inequalities for stochastic processes. McGraw-Hill, New York, NY.
Gittins, J., Glazebrook, K., Weber, R. (2011). Multi-armed bandit allocation indices. John Wiley & Sons.
Haw, J. (2008). The relationship between reinforcement and gaming machine choice. Journal of Gambling Studies, 24(1), 55-61.
Hu, Y., & Shoemaker, S. (2024). Do More Experienced Gamblers Choose Slot Machines with Better Odds? A Large-Scale Multi-Armed Bandit Problem at a Casino. Customer Needs and Solutions, 11(1), 9.
Lucas, A. F., Singh, A. K. (2011). Estimating the ability of gamblers to detect differences in the payback percentages of reel slot machines: A closer look at the slot player experience. UNLV Gaming Research & Review Journal, 15(1), 2.
Meyer, R. J., & Shi, Y. (1995). Sequential choice under ambiguity: Intuitive solutions to the armed-bandit problem. Management science, 41(5), 817-834.
Sutton, R. S., & Barto, A. G. (1998). The reinforcement learning problem. Reinforcement learning: An introduction, 51-85. MIT Press, Cambridge, MA.
Written by:
Prime Casino Editorial Team