Thiết kế website giá rẻ

Question

I have sleep data from Apple HealthKit that looks similar to this:

       type                                   startDate                  endDate                    creationDate               value                                   unit  sourceName               sourceVersion
32195  HKCategoryTypeIdentifierSleepAnalysis  2024-06-03 22:40:16+02:00  2024-06-03 22:48:16+02:00  2024-06-04 05:00:42+02:00  HKCategoryValueSleepAnalysisAsleepCore  None  Christian’s Apple Watch  10.5
32197  HKCategoryTypeIdentifierSleepAnalysis  2024-06-03 22:48:16+02:00  2024-06-03 22:49:46+02:00  2024-06-04 05:00:42+02:00  HKCategoryValueSleepAnalysisAwake       None  Christian’s Apple Watch  10.5
32199  HKCategoryTypeIdentifierSleepAnalysis  2024-06-03 22:49:46+02:00  2024-06-03 22:56:16+02:00  2024-06-04 05:00:42+02:00  HKCategoryValueSleepAnalysisInBed       None  Christian’s Apple Watch  10.5
32198  HKCategoryTypeIdentifierSleepAnalysis  2024-06-03 22:49:46+02:00  2024-06-03 22:56:16+02:00  2024-06-04 05:00:42+02:00  HKCategoryValueSleepAnalysisAsleepCore  None  Christian’s Apple Watch  10.5
32200  HKCategoryTypeIdentifierSleepAnalysis  2024-06-03 22:56:16+02:00  2024-06-03 22:57:16+02:00  2024-06-04 05:00:42+02:00  HKCategoryValueSleepAnalysisAwake       None  Christian’s Apple Watch  10.5

I would like to get a list of sleep sessions (start, end) with all the stages in between. The example above is a subset of one sleep session; it starts at 22:40 and ends at 22:57.

The problem I have is how to find the start and the end of each session without specifying a time range, e.g. bedtime between 9 pm and 12 am. I’d rather find clusters of sleep stages and group them. Usually there is a gap of … 12+ hours between sleeps?
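
For illustration, the clustering idea can be sketched with pandas as a gaps-and-islands pass: sort by start time, track the latest end seen so far, and open a new session whenever the next stage starts more than some threshold after that. The function name `label_sessions`, the 4-hour default, and the rows marked hypothetical are assumptions made for this sketch, not part of the original data. The threshold only needs to sit between the longest expected wake-up during a night and the length of a waking day (roughly 16 hours); a value near or above the latter would merge consecutive nights.

```python
import pandas as pd


def label_sessions(df, max_gap=pd.Timedelta(hours=4)):
    """Add a session_id column: a new session starts whenever a stage begins
    more than `max_gap` after every earlier stage has ended."""
    df = df.sort_values("startDate").reset_index(drop=True)
    # Latest end time among all earlier rows (NaT for the first row).
    prev_end = df["endDate"].cummax().shift()
    # NaT comparisons are False, so the first row stays in session 0.
    starts_new = (df["startDate"] - prev_end) > max_gap
    df["session_id"] = starts_new.cumsum()
    return df


# First three rows come from the question; the rest are hypothetical.
# utc=True normalizes the +02:00 offsets to a single UTC dtype.
stages = pd.DataFrame({
    "startDate": pd.to_datetime([
        "2024-06-03 22:40:16+02:00",
        "2024-06-03 22:48:16+02:00",
        "2024-06-03 22:49:46+02:00",
        "2024-06-04 01:00:00+02:00",  # hypothetical
        "2024-06-04 05:00:00+02:00",  # hypothetical, after a 2-hour wake-up
        "2024-06-04 22:30:00+02:00",  # hypothetical next night
    ], utc=True),
    "endDate": pd.to_datetime([
        "2024-06-03 22:48:16+02:00",
        "2024-06-03 22:49:46+02:00",
        "2024-06-03 22:56:16+02:00",
        "2024-06-04 03:00:00+02:00",
        "2024-06-04 06:30:00+02:00",
        "2024-06-04 23:50:00+02:00",
    ], utc=True),
})

labeled = label_sessions(stages)
# One row per session with its overall start and end.
sessions = labeled.groupby("session_id").agg(
    start=("startDate", "min"),
    end=("endDate", "max"),
)
```

Using the running maximum of `endDate` rather than the previous row's end keeps overlapping rows (such as an InBed record spanning several stages) from producing spurious gaps.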

The second issue is that there are gaps between sleep stages. An example would be waking up at 3 am and falling back asleep at 5 am. This 2-hour gap should be treated as HKCategoryValueSleepAnalysisAwake, but within the same sleep session.
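
A minimal sketch of that gap filling, assuming the rows of a single session have already been isolated: any stretch not covered by an earlier stage becomes a synthetic Awake row. The helper name `fill_gaps_as_awake` and the two sample rows are hypothetical and chosen to mirror the 3 am to 5 am example.

```python
import pandas as pd

AWAKE = "HKCategoryValueSleepAnalysisAwake"


def fill_gaps_as_awake(session):
    """Return the session's stages plus synthetic Awake rows covering any
    time between stages that no earlier stage already covers."""
    session = session.sort_values("startDate").reset_index(drop=True)
    # Latest end time among all earlier rows (NaT for the first row).
    covered_until = session["endDate"].cummax().shift()
    has_gap = session["startDate"] > covered_until
    if not has_gap.any():
        return session
    fillers = pd.DataFrame({
        "startDate": covered_until[has_gap],
        "endDate": session.loc[has_gap, "startDate"],
        "value": AWAKE,
    })
    return (
        pd.concat([session, fillers], ignore_index=True)
        .sort_values(["startDate", "endDate"])
        .reset_index(drop=True)
    )


# Hypothetical session: asleep until 03:00, asleep again from 05:00.
one_session = pd.DataFrame({
    "startDate": pd.to_datetime(
        ["2024-06-03 22:40:16+02:00", "2024-06-04 05:00:00+02:00"], utc=True
    ),
    "endDate": pd.to_datetime(
        ["2024-06-04 03:00:00+02:00", "2024-06-04 06:30:00+02:00"], utc=True
    ),
    "value": ["HKCategoryValueSleepAnalysisAsleepCore"] * 2,
})

filled = fill_gaps_as_awake(one_session)
```

Any extra columns on the input (for example `sourceName`) would be left empty on the synthetic rows in this sketch.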

I did try this before, but the output was wrong: at times it produces too many sleep sessions with very short sleep times. I am sure there are more efficient ways to find each sleep session. I am thinking of this as a gaps-and-islands problem.

from datetime import timedelta


def group_sleep_sessions(sleep_df, max_gap=timedelta(hours=20), min_session_duration=timedelta(hours=2)):
    sleep_df = sleep_df.sort_values('startDate')
    sessions = []
    current_session = None

    def merge_sessions(s1, s2):
        return {
            'startDate': min(s1['startDate'], s2['startDate']),
            'endDate': max(s1['endDate'], s2['endDate']),
            'sourceName': s1['sourceName'],
            'stages': sorted(s1['stages'] + s2['stages'], key=lambda x: x['start'])
        }

    for _, row in sleep_df.iterrows():
        new_segment = {
            'startDate': row['startDate'],
            'endDate': row['endDate'],
            'sourceName': row['sourceName'],
            'stages': [{'stage': row['value'], 'start': row['startDate'], 'end': row['endDate']}]
        }

        if current_session is None:
            current_session = new_segment
        else:
            time_diff = row['startDate'] - current_session['endDate']
            same_day_or_next = (row['startDate'].date() - current_session['endDate'].date()) <= timedelta(days=1)
            
            if time_diff <= max_gap and same_day_or_next:
                # Merge if within max gap and on the same or next day
                if row['startDate'] > current_session['endDate']:
                    current_session['stages'].append({
                        'stage': 'HKCategoryValueSleepAnalysisAwake',
                        'start': current_session['endDate'],
                        'end': row['startDate']
                    })
                current_session = merge_sessions(current_session, new_segment)
            else:
                # Finalize current session and start a new one
                if (current_session['endDate'] - current_session['startDate']) >= min_session_duration:
                    sessions.append(current_session)
                current_session = new_segment

    # Add the last session if it meets the minimum duration
    if current_session and (current_session['endDate'] - current_session['startDate']) >= min_session_duration:
        sessions.append(current_session)

    return sessions

To simplify debugging, here is a dict version of the test data:

data = [{'type': 'HKCategoryTypeIdentifierSleepAnalysis',
  'startDate': Timestamp('2024-06-03 22:40:16+0200', tz='pytz.FixedOffset(120)'),
  'endDate': Timestamp('2024-06-03 22:48:16+0200', tz='pytz.FixedOffset(120)'),
  'creationDate': Timestamp('2024-06-04 05:00:42+0200', tz='pytz.FixedOffset(120)'),
  'value': 'HKCategoryValueSleepAnalysisAsleepCore',
  'unit': None,
  'sourceName': 'Christian’s Apple\xa0Watch',
  'sourceVersion': '10.5'},
 {'type': 'HKCategoryTypeIdentifierSleepAnalysis',
  'startDate': Timestamp('2024-06-03 22:48:16+0200', tz='pytz.FixedOffset(120)'),
  'endDate': Timestamp('2024-06-03 22:49:46+0200', tz='pytz.FixedOffset(120)'),
  'creationDate': Timestamp('2024-06-04 05:00:42+0200', tz='pytz.FixedOffset(120)'),
  'value': 'HKCategoryValueSleepAnalysisAwake',
  'unit': None,
  'sourceName': 'Christian’s Apple\xa0Watch',
  'sourceVersion': '10.5'},
 {'type': 'HKCategoryTypeIdentifierSleepAnalysis',
  'startDate': Timestamp('2024-06-03 22:49:46+0200', tz='pytz.FixedOffset(120)'),
  'endDate': Timestamp('2024-06-03 22:56:16+0200', tz='pytz.FixedOffset(120)'),
  'creationDate': Timestamp('2024-06-04 05:00:42+0200', tz='pytz.FixedOffset(120)'),
  'value': 'HKCategoryValueSleepAnalysisInBed',
  'unit': None,
  'sourceName': 'Christian’s Apple\xa0Watch',
  'sourceVersion': '10.5'},
 {'type': 'HKCategoryTypeIdentifierSleepAnalysis',
...
  'creationDate': Timestamp('2024-06-04 05:00:42+0200', tz='pytz.FixedOffset(120)'),
  'value': 'HKCategoryValueSleepAnalysisAwake',
  'unit': None,
  'sourceName': 'Christian’s Apple\xa0Watch',
  'sourceVersion': '10.5'}]

Find sleep sessions out of sleep stage data with pandas