I built a pipeline that uses the code snippet below. It works fine until I run it on Google Cloud Dataflow. I still use the requirements file to build the image, because I deploy the pipeline as a Template. But Google Cloud Dataflow keeps saying 'beam not found', and this only happens in the 'Add timestamps' step. I copied the example from the official Apache Beam site.
import apache_beam as beam
from datetime import datetime

windowed_data = (
    data
    | 'Add timestamps' >> beam.Map(lambda row: beam.window.TimestampedValue(
        (row[0], row[1], row[3]),
        datetime.strptime(row[0], '%Y-%m-%d %H:%M:%S %Z').timestamp()))
    | 'Window into fixed 15min windows' >> beam.WindowInto(beam.window.FixedWindows(window_size * 60))
)
I have tried changing the requirements file, and I have tried moving this line of code into a separate function.
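For reference, here is a minimal sketch of what the "separate function" variant could look like, with the Beam import placed inside the function body so the step does not depend on a module-level `beam` name being available on the worker. The names `parse_event_time` and `add_timestamp` are hypothetical and not part of my real pipeline. Treating the timestamps as UTC is also an assumption: with `%Z`, `strptime` returns a naive datetime, which `.timestamp()` would otherwise interpret in the machine's local time zone.

```python
from datetime import datetime, timezone


def parse_event_time(raw):
    """Convert a string such as '2023-01-01 00:00:00 UTC' to Unix seconds.

    Assumption: the input is always in UTC, so the naive datetime that
    strptime returns is explicitly tagged as UTC before conversion.
    """
    parsed = datetime.strptime(raw, '%Y-%m-%d %H:%M:%S %Z')
    return parsed.replace(tzinfo=timezone.utc).timestamp()


def add_timestamp(row):
    """Wrap selected fields of a row in a TimestampedValue.

    The import is local to the function, so the name is resolved when
    the function runs rather than taken from the main module's globals.
    """
    from apache_beam.transforms import window

    return window.TimestampedValue(
        (row[0], row[1], row[3]),
        parse_event_time(row[0]),
    )
```

In the pipeline, the step would then read `'Add timestamps' >> beam.Map(add_timestamp)` instead of using the lambda.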