I have a JSON list of timestamps:
[
"2024-03-27 00:30:30.321000",
"2024-03-27 00:34:58.695000",
"2024-03-27 00:37:38.352000",
"2024-03-27 00:37:40.419000",
"2024-03-27 00:43:54.536000",
"2024-03-27 00:49:39.231000",
"2024-03-27 01:03:39.637000",
"2024-03-27 01:05:24.370000",
"2024-03-27 01:17:43.586000",
"2024-03-27 01:17:47.447000",
"2024-03-27 01:17:59.913000",
"2024-03-27 01:18:34.872000",
"2024-03-27 01:18:36.922000",
"2024-03-27 01:18:44.626000",
"2024-03-27 01:19:11.057000",
"2024-03-27 01:19:12.307000",
"2024-03-27 01:21:11.322000",
"2024-03-27 01:26:54.640000",
"2024-03-27 01:26:55.055000",
...
I want to plot how often they occur, for example per hour or per day. I got this working with pandas, but only by adding a dummy column to every entry:
[
{
"foo": 1,
"ts": "2024-03-27 00:24:13.132000"
},
{
"foo": 1,
"ts": "2024-03-27 00:30:30.321000"
},
{
"foo": 1,
"ts": "2024-03-27 00:34:58.695000"
},
{
"foo": 1,
"ts": "2024-03-27 00:36:04.166000"
},
{
"foo": 1,
"ts": "2024-03-27 00:37:38.352000"
},
{
"foo": 1,
"ts": "2024-03-27 00:37:40.419000"
},
{
"foo": 1,
"ts": "2024-03-27 00:43:54.536000"
},
....
]
The dummy column is there so that I can use sum():
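import sys
import pandas as pd

freq = '1d'  # bin width: one day
df = pd.read_json(sys.stdin)
df['ts'] = pd.to_datetime(df['ts'])
# Summing the dummy column (always 1) per bin gives the number of timestamps in that bin
overview = df.resample(freq, on='ts').foo.sum()
print(overview)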
This gave what I am looking for:
2024-03-27 674
2024-03-28 405
2024-03-29 366
2024-03-30 352
2024-03-31 541
2024-04-01 657
2024-04-02 398
2024-04-03 523
2024-04-04 466
2024-04-05 498
2024-04-06 468
2024-04-07 312
2024-04-08 453
2024-04-09 625
2024-04-10 654
2024-04-11 696
2024-04-12 624
2024-04-13 377
2024-04-14 304
2024-04-15 493
2024-04-16 544
2024-04-17 526
Can I do this without the dummy column, using just the plain list of timestamps as input?
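To make the goal concrete, here is a minimal sketch of the direction I have in mind. It assumes that counting the rows per bin with size() is equivalent to summing a column of ones. The `raw` string is a made-up three-entry sample standing in for the real input on stdin, and I have not checked this against the full data set:

```python
import json
import pandas as pd

# Hypothetical sample standing in for the real JSON list read from stdin.
raw = '''[
  "2024-03-27 00:30:30.321000",
  "2024-03-27 00:34:58.695000",
  "2024-03-28 01:03:39.637000"
]'''

# Parse the flat list straight into a DatetimeIndex.
idx = pd.to_datetime(json.loads(raw))

# to_series() yields a Series indexed by the timestamps themselves,
# so resample() has a datetime index to bin on without any extra column.
# size() counts the entries in each bin instead of summing a dummy value.
overview = idx.to_series().resample('1D').size()
print(overview)
```

For this sample the counts are 2 for 2024-03-27 and 1 for 2024-03-28. Bins without any timestamps should show up as 0, which is what I would want for plotting.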