I have been trying to integrate OTel into my Lambdas for a week now. Instrumenting the code was no problem, but actually sending the traces and metrics to a backend of our choice is proving quite hard. I first tried the ADOT Collector from AWS, but it was quite slow and, for some reason, didn’t work as expected. I then configured the upstream OTel Collector, which was faster, but I couldn’t get it to send data to Datadog. It turns out that the Collector layer for Lambda is a stripped-down build of the standard, non-layer Collector, and it does not include the Datadog exporter. Next, I tried the vendor-specific Datadog Extension for Lambda (a risky move, I know, because it is Datadog-specific). It claims to be fully OTel compliant, but it refuses to send traces produced by OTel in the Lambdas. It only sends traces produced by the, again, vendor-specific Datadog tracing library, which increases the risk of vendor lock-in.
So far my experience with OTel in Lambdas has not been great. The Datadog Extension also takes about 1 s to start, so it noticeably hurts cold starts and high-load scenarios because of how much work it does after collecting the traces.
My question is: is there a good way to run the OTel Collector so that Lambdas can send telemetry data to it without issues or slowdowns? I know about hosting it on ECS or EC2, but then I would have to make the endpoint reachable from the Lambdas somehow, because the ECS tasks or EC2 instances run in a VPC and I don’t want to put the Lambdas in a VPC. Hosting it that way is also quite costly. Maybe there is a trick I haven’t found or don’t know about. I hope to find a solution soon!
Summary of what I have tried so far:
ADOT Collector Layer – slow, can’t export data to the desired backend (Datadog)
OTel Collector Layer – faster, but also can’t export data to the desired backend (Datadog)
Datadog Extension for Lambda Layer – the slowest of them all, can’t collect OTel traces and metrics, only Datadog-specific data produced by their library
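To make the hosted-Collector option concrete, here is a minimal sketch of what a Lambda would send to a Collector's OTLP/HTTP receiver, using only the Python standard library. It is an illustration under assumptions, not a recommended setup: `COLLECTOR_URL`, the `my-lambda` service name, the `manual-sketch` scope name, and the `handler` function are all hypothetical, and in practice the OTel SDK's OTLP exporter would build and send this request. Port 4318 is the default OTLP/HTTP port; the host is a placeholder.

```python
import json
import os
import secrets
import time
import urllib.request

# Hypothetical endpoint: 4318 is the default OTLP/HTTP port, the host is a placeholder.
COLLECTOR_URL = os.environ.get("COLLECTOR_URL", "http://localhost:4318/v1/traces")


def build_trace_payload(service_name, span_name, start_ns, end_ns):
    """Build an OTLP/JSON trace export request containing a single span."""
    return {
        "resourceSpans": [{
            "resource": {
                "attributes": [
                    {"key": "service.name", "value": {"stringValue": service_name}}
                ]
            },
            "scopeSpans": [{
                "scope": {"name": "manual-sketch"},  # hypothetical scope name
                "spans": [{
                    "traceId": secrets.token_hex(16),  # 32 hex characters
                    "spanId": secrets.token_hex(8),    # 16 hex characters
                    "name": span_name,
                    "kind": 2,  # SPAN_KIND_SERVER
                    # OTLP/JSON encodes 64-bit integers as decimal strings.
                    "startTimeUnixNano": str(start_ns),
                    "endTimeUnixNano": str(end_ns),
                }],
            }],
        }]
    }


def send_payload(payload, url=COLLECTOR_URL, timeout_s=1.0):
    """POST the payload to the Collector; the short timeout bounds the added latency."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.status


def handler(event, context):
    """Hypothetical Lambda handler: do the work, then export one span."""
    start_ns = time.time_ns()
    result = {"ok": True}  # placeholder for the real work
    payload = build_trace_payload("my-lambda", "handler", start_ns, time.time_ns())
    try:
        send_payload(payload)
    except OSError:
        pass  # never fail the invocation because telemetry could not be delivered
    return result
```

The export here is synchronous, so its round trip is added to every invocation. The timeout only caps that cost and does not remove it, which is the slowdown concern raised in the question.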