I have an excel file with x rows with random numbers (from 1 till 25). The total number of columns (numbers) can vary (15, 16, 17, 18, 19 or 20):
I need python to tell me my probability of winning a lottery with the current set of rows (may be 30k rows with 15,16,17,18,19 or 20 numbers). The lottery is a 15 number draw from 25 numbers pool (1 to 25).
This is what I have so far:
import pandas as pd
from math import comb
# Load the Excel file
file_path = 'combinations.xlsx'
df = pd.read_excel(file_path)
# Define the total number of possible combinations for a 15-number draw from a pool of 25 numbers
total_combinations = comb(25, 15)
# Function to calculate the number of matches
def count_matches(row, drawn_numbers):
return sum(1 for number in row if number in drawn_numbers)
# Since we don't have the drawn numbers, we'll assume a hypothetical draw
# For example, let's assume the drawn numbers are the first 15 numbers in the pool
drawn_numbers = list(range(1, 16))
# Calculate the total probability for the entire file
total_probability = 0
for index, row in df.iterrows():
matches = count_matches(row.dropna().tolist(), drawn_numbers)
probability = matches / total_combinations
total_probability += probability
# Calculate the inverse probability
inverse_probability = 1 / total_probability if total_probability != 0 else float('inf')
# Print the results
print(f"Total probability for the entire file: {total_probability}")
print(f"The probability of winning is 1 in {int(inverse_probability):,}")
It runs without errors, however, the result is not correct. looking at the lottery official website and calculating probability manually, it is known that:
- For a 15 numbers ticket (in my file, it would be 1 row with 15
numbers) there is 1 in 3.268.760 chance of winning - For a 16 numbers ticket (in my file, it would be 1 row with 16
numbers) there is 1 in 204.298 chance of winning - For a 17 numbers ticket (in my file, it would be 1 row with 17
numbers) there is 1 in 24.035 chance of winning - For a 18 numbers ticket (in my file, it would be 1 row with 18
numbers) there is 1 in 4.006 chance of winning - For a 19 numbers ticket (in my file, it would be 1 row with 19
numbers) there is 1 in 843 chance of winning - For a 20 numbers ticket (in my file, it would be 1 row with 20
numbers) there is 1 in 211 chance of winning
This is the result my code gives for a file with a simple row of 15 numbers:
My file may contain different rows of 15, 16, 17,18,19 or 20 numbers, hence the need to calculate the probability of different combinations.
Any python magician available to help?