I am designing a machine learning model to predict the resource load that HTTP requests impose on the server that handles them. I'm struggling to find a dataset that has both the request details and the resource measurements.
Does anyone know of a dataset that has information about each HTTP request, such as the context of the request, method (POST, GET, …), body size, header size, time of day, day of the week, and any other parameters, and also either the impact the request has on the server's resources, or a server CPU, RAM, and network I/O dataset covering the same timestamps as the HTTP dataset?
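To make the "same timestamp" idea concrete, here is a minimal sketch of how a request log could be labelled from a separate resource-metrics log. It assumes both logs carry a `ts` field in seconds and that the metrics are sorted by time. The field names (`body_bytes`, `cpu_pct`, `ram_mb`), the function `label_requests`, and all the values are made up for illustration only.

```python
from bisect import bisect_right


def label_requests(requests, samples):
    """Attach to each request the latest resource sample taken at or before it.

    requests: list of dicts with a "ts" key (seconds).
    samples:  list of dicts with a "ts" key, sorted by ascending "ts".
    """
    sample_ts = [s["ts"] for s in samples]
    labelled = []
    for req in requests:
        i = bisect_right(sample_ts, req["ts"]) - 1
        if i < 0:
            continue  # no sample precedes this request, so it cannot be labelled
        row = dict(req)
        # Copy every metric except the sample's own timestamp.
        row.update({k: v for k, v in samples[i].items() if k != "ts"})
        labelled.append(row)
    return labelled


# Made-up values, only to show the shape of the data.
requests = [
    {"ts": 10.2, "method": "GET", "body_bytes": 0, "header_bytes": 310},
    {"ts": 11.7, "method": "POST", "body_bytes": 2048, "header_bytes": 420},
]
samples = [
    {"ts": 10.0, "cpu_pct": 12.0, "ram_mb": 512.0},
    {"ts": 11.0, "cpu_pct": 35.0, "ram_mb": 530.0},
    {"ts": 12.0, "cpu_pct": 20.0, "ram_mb": 525.0},
]
rows = label_requests(requests, samples)
```

A join like this only attributes server-wide load to a request. If several requests overlap in time, the label mixes their effects, which is part of why per-request measurements would be preferable.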
I thought of generating synthetic data, but I can't think of how to make it realistic: if I generate a lot of HTTP call data and then estimate the resource usage from some of its characteristics, the labels will be too biased, since they would only reflect the formula I chose. I'm thinking of generating data by configuring a web server and hitting it with requests while monitoring it, but that is going to take a long time, and there are not going to be many different services…
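As a rough illustration of the "configure a server and monitor it" approach, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for the example: `ToyHandler` is a hypothetical service whose cost grows with body size, and `collect` is a hypothetical helper. Because the server runs in the same process as the client, the `cpu_s` label also includes client-side work. A real setup would run the server separately and monitor it from outside.

```python
import hashlib
import random
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class ToyHandler(BaseHTTPRequestHandler):
    """Hypothetical service: CPU cost grows with the request body size."""

    def _reply(self, body):
        digest = b""
        for _ in range(50):  # artificial CPU work over the whole body
            digest = hashlib.sha256(body + digest).digest()
        payload = digest.hex().encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_GET(self):
        self._reply(b"")

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self._reply(self.rfile.read(length))

    def log_message(self, *args):
        pass  # keep the console quiet


def collect(n_requests=20, seed=0):
    """Send random requests to a local server and record features plus labels."""
    rng = random.Random(seed)
    server = ThreadingHTTPServer(("127.0.0.1", 0), ToyHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_address[1]}/"
    # Bypass any system proxy so the request really goes to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    rows = []
    try:
        for _ in range(n_requests):
            method = rng.choice(["GET", "POST"])
            body = b"x" * rng.randint(1, 200_000) if method == "POST" else None
            req = urllib.request.Request(url, data=body, method=method)
            cpu0, t0 = time.process_time(), time.perf_counter()
            with opener.open(req, timeout=10) as resp:
                resp.read()
            rows.append({
                "ts": time.time(),
                "method": method,
                "body_bytes": len(body) if body else 0,
                # Whole-process CPU time: server plus client in this sketch.
                "cpu_s": time.process_time() - cpu0,
                "latency_s": time.perf_counter() - t0,
            })
    finally:
        server.shutdown()
        server.server_close()
    return rows


if __name__ == "__main__":
    for row in collect(n_requests=5):
        print(row)
```

This measures rather than invents the labels, which avoids the bias problem, but it does not solve the variety problem: the data is only as varied as the services behind the handler.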
If someone knows of an existing dataset, or has a way to generate realistic and varied synthetic data, it would be a great help.
Thanks.