I’m working on a project where I need to map stars onto a virtual sphere. The stars must be evenly spaced from each other and from the center of the sphere (0,0,0), with a specific distribution of brightness categories across the sphere’s surface. Each segment of the sphere should have a proportional mix of stars based on their brightness, ensuring that the entire sphere is covered uniformly.
Approach and Problem:
Normalization: I started by normalizing the position of each star to form a sphere by scaling their position vectors to a fixed radius.
Even Point Distribution: I generated 1000 evenly spaced points on the sphere using Fibonacci sphere sampling.
Initial Matching: Matching each point to the nearest star worked well without considering star brightness.
Category Distribution: The challenge arises when I try to include a desired distribution of star brightness. The categories need specific ratios across the sphere’s surface, but I struggle to maintain an even distribution when categorizing by brightness.
Current Method: My latest attempt randomly shuffles the generated points and then assigns consecutive blocks of them to stars by brightness category. For example, 19% of the points go to stars of brightness category ‘8’. However, this method failed to produce the even distribution required, even though category ‘8’ stars, while rare, are numerous enough to meet the distribution requirements.
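To make the quota arithmetic concrete, here is a minimal sketch of how the desired ratios translate into point counts for 1000 points. The helper name `quota_counts` is hypothetical (it is not part of my script below), and it uses largest-remainder rounding so the counts always add up to the total, which plain `int()` truncation does not guarantee for every set of ratios.

```python
import math

def quota_counts(distribution, total):
    """Turn category ratios into integer counts that sum to `total`.

    Hypothetical helper: floors every share, then hands the leftover
    points to the categories with the largest fractional remainders.
    """
    raw = {k: v * total for k, v in distribution.items()}
    counts = {k: math.floor(r) for k, r in raw.items()}
    leftover = total - sum(counts.values())
    by_remainder = sorted(raw, key=lambda k: raw[k] - counts[k], reverse=True)
    for k in by_remainder[:leftover]:
        counts[k] += 1
    return counts

desired = {'1': 0.27, '3': 0.27, '5': 0.27, '8': 0.19, '9': 0, '10': 0}
counts = quota_counts(desired, 1000)
print(counts)
```

With these ratios, category ‘8’ should end up with 190 of the 1000 points, so the question is how to pick those 190 so that they are themselves spread evenly.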
import pandas as pd
import numpy as np
from sklearn.neighbors import NearestNeighbors

def classify_stars(vmag):
    # Map visual magnitude to a brightness category (lower magnitude = brighter star)
    if vmag >= 10:
        return '1'
    elif 7 <= vmag < 10:
        return '3'
    elif 6 <= vmag < 7:
        return '5'
    elif 3 <= vmag < 6:
        return '8'
    elif 1 <= vmag < 3:
        return '9'
    else:
        return '10'

def generate_sphere_points(samples=1000, radius=50):
    points = []
    dphi = np.pi * (3. - np.sqrt(5.))  # golden angle in radians
    for i in range(samples):
        y = 1 - (i / float(samples - 1)) * 2  # y goes from 1 to -1
        ring_radius = np.sqrt(1 - y * y)  # radius of the circle at height y (unit sphere)
        theta = dphi * i  # golden angle increment
        x = np.cos(theta) * ring_radius
        z = np.sin(theta) * ring_radius
        # Scale the unit-sphere point to the requested sphere radius
        points.append((x * radius, y * radius, z * radius))
    return np.array(points)

df = pd.read_csv('hygdata_v3.csv', usecols=['hip', 'x', 'y', 'z', 'mag'])
df.dropna(subset=['hip', 'x', 'y', 'z', 'mag'], inplace=True)
df['hip'] = df['hip'].astype(int)
df['norm'] = np.sqrt(df['x']**2 + df['y']**2 + df['z']**2)
df['x'] = 50 * df['x'] / df['norm']
df['y'] = 50 * df['y'] / df['norm']
df['z'] = 50 * df['z'] / df['norm']
df.drop(columns='norm', inplace=True)
df['class'] = df['mag'].apply(classify_stars)
points = generate_sphere_points(samples=1000)
desired_distribution = {'1': 0.27, '3': 0.27, '5': 0.27, '8': 0.19, '9': 0, '10': 0}
total_points = len(points)
# Calculate points per category based on desired distribution
category_points = {k: int(v * total_points) for k, v in desired_distribution.items()}
# Randomly shuffle points to avoid spatial clustering in assignment
np.random.shuffle(points)
sampled_df = pd.DataFrame()
offset = 0
for category, count in category_points.items():
    if count > 0:
        category_stars = df[df['class'] == category]
        nbrs = NearestNeighbors(n_neighbors=1).fit(category_stars[['x', 'y', 'z']])
        if offset + count > len(points):
            count = len(points) - offset  # Adjust count if it exceeds the number of points
        _, indices = nbrs.kneighbors(points[offset:offset + count])
        # Several points can share the same nearest star, so after
        # deduplication this may yield fewer than `count` stars
        unique_indices = np.unique(indices.flatten())
        assigned_stars = category_stars.iloc[unique_indices[:count]]
        sampled_df = pd.concat([sampled_df, assigned_stars], ignore_index=True)
        offset += count
# Output the hip values of stars with category 8 to see if they are evenly distributed
sampled_df = sampled_df.drop_duplicates(subset='hip')
category_8_stars = sampled_df[sampled_df['class'] == '8']
hip_values_category_8 = category_8_stars['hip'].astype(str).tolist()
print(hip_values_category_8)
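Printing HIP numbers does not really tell me whether the stars are evenly spread, so here is a rough sketch of how evenness could be measured instead. It assumes that "even" can be judged by the angle from each star to its nearest neighbour within the same category; the function names `nearest_neighbor_angles`, `evenness_score` and `fibonacci_sphere` are my own illustrative helpers, and the comparison against random directions is only a sanity check, not a result from the real data.

```python
import numpy as np

def nearest_neighbor_angles(xyz):
    """Angle in degrees from each point to its nearest other point on the sphere."""
    xyz = np.asarray(xyz, dtype=float)
    unit = xyz / np.linalg.norm(xyz, axis=1, keepdims=True)
    cos = np.clip(unit @ unit.T, -1.0, 1.0)
    np.fill_diagonal(cos, -1.0)  # exclude each point's match with itself
    return np.degrees(np.arccos(cos.max(axis=1)))

def evenness_score(xyz):
    """Coefficient of variation of the nearest-neighbour angles (lower = more even)."""
    angles = nearest_neighbor_angles(xyz)
    return angles.std() / angles.mean()

def fibonacci_sphere(n, radius=50.0):
    """Vectorised golden-angle lattice, same construction as generate_sphere_points."""
    i = np.arange(n)
    y = 1 - 2 * i / (n - 1)
    ring = np.sqrt(np.maximum(0.0, 1 - y * y))
    theta = np.pi * (3 - np.sqrt(5)) * i
    return radius * np.column_stack((np.cos(theta) * ring, y, np.sin(theta) * ring))

# Sanity check: an ideal lattice should score lower than random directions
rng = np.random.default_rng(0)
random_xyz = rng.normal(size=(190, 3))
even_score = evenness_score(fibonacci_sphere(190))
random_score = evenness_score(random_xyz)
print(f"lattice: {even_score:.2f}, random: {random_score:.2f}")
```

On my data this would be called as `evenness_score(category_8_stars[['x', 'y', 'z']].to_numpy())`, and a value close to the random baseline would confirm that the category ‘8’ stars are clumped rather than evenly spaced.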
The dataset can be downloaded here: https://raw.githubusercontent.com/EnguerranVidal/HYG-STAR-MAP/main/hygdatav3.csv
As I said, it is not really working the way I imagined.
After two complete days of trying to solve this riddle, I came here for expert advice.
Any idea how I can approach this problem?