The question is:
What is the average completion rate per content_type?
The columns with dates/times are these titles and I’ve given the first two rows of data for an example:
event_id | users_id | Date(timestamp) | episode_id | content_type | play_time(seconds) | Episode_length(seconds) |
---|---|---|---|---|---|---|
iqeq4x84r6x2yxl | xw8EF | 2016-08-25T08:15:30-05:00 | woaqa01 | vod | 180 | 200 |
iqeujzl76pff8 | IfTI1 | 2016-08-25T08:16:31-05:00 | eodjdm | vod | 260 | 300 |
I use SQL Server. My logic here is, the completion rate would be more relevant to the date(timestamp)
column as it indicates 15:30-05:00. The rest of the data in that column is all the same date ‘2016-08-25′ but have differing start times such as 16:31 / 15:35 / 15:39 however they all have the same end date ’05:00’. How on earth do you write the query to find the average between those times in that column. I’ve not queried timestamp before, or am I completely on the wrong side of things here? Is the calculation just between the play_time and Episode_length.
And is the answer something less complex like:
SELECT
content_type,
AVG(CAST(play_time AS FLOAT) / episode_length) AS avg_completion_rate
FROM
Audience
GROUP BY
content_type;
Really appreciate the feedback. Thanks
YorksherMike is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
3
As @C3roe says, the timestamp and timezone are not relevant to the rate.
For the completion rate:
- Calculate the play time as a percentage of the runtime (episode length), and average that across the
content_type
. - (optional): Multiply that by 100 to get the percentage if required.
- (optional):
ROUND
to N decimals places if required.
SELECT
content_type,
AVG(SUM([play_time(seconds)]) / SUM([Episode_length(seconds)]) * 100 AS avg_completion_rate
FROM
Audience
GROUP BY
content_type;
4