I’m facing a problem in Chapter 3 of Data Engineering with AWS, with the Lambda function that converts CSV to Parquet.
The function writes the Parquet output, but:
- The table is not created in the Glue Data Catalog. Also, in which part of the code are we adding a “table” to the catalog? We only do it for the database, in these lines (the handler sketch after this list shows where they sit):
if db_name not in current_databases.values:
    print(f'- Database {db_name} does not exist ... creating')
    wr.catalog.create_database(db_name)
else:
    print(f'- Database {db_name} already exists')
- For one of the six CSV files I uploaded, the Lambda function did not trigger at all.
- Even with the timeout raised to two minutes, the “RESULT” lines are never printed; I think the function gets stuck writing the Parquet file:
result = wr.s3.to_parquet(
    df=input_df,
    path=output_path,
    dataset=True,
    database=db_name,
    table=table_name,
    mode="append")
print("RESULT: ")   # THIS IS NOT GETTING PRINTED.
print(f'{result}')  # NOR THIS.
2024-05-01T16:41:02.026Z 7bf8244b-ff47-49e6-b296-09c65da7b43c Task timed out after 122.07 seconds
- The logs only say that the task timed out. Where can I see the full logs for this particular invocation? My guess at how to look them up is in the log-lookup sketch below.
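
For reference, here is a simplified sketch of the flow my handler follows. The bucket name, database name, and the way the table name is derived from the object key are placeholders rather than my exact values; the real code is the Chapter 3 listing from the book.

import urllib.parse
import awswrangler as wr

output_bucket = 'my-target-bucket'   # placeholder
db_name = 'my_database'              # placeholder

def lambda_handler(event, context):
    # One record arrives per ObjectCreated notification; if an upload never
    # shows up here, the S3 trigger (or its prefix/suffix filter) did not fire.
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = urllib.parse.unquote_plus(record['s3']['object']['key'])

        # Simplified: derive the table name and output path from the object key
        table_name = key.split('/')[-1].replace('.csv', '')
        output_path = f's3://{output_bucket}/{table_name}/'

        input_df = wr.s3.read_csv(f's3://{bucket}/{key}')

        # Create the Glue database if it does not exist yet
        current_databases = wr.catalog.databases()
        if db_name not in current_databases.values:
            print(f'- Database {db_name} does not exist ... creating')
            wr.catalog.create_database(db_name)
        else:
            print(f'- Database {db_name} already exists')

        # Write the Parquet dataset and (as I understand it) register the table
        result = wr.s3.to_parquet(
            df=input_df,
            path=output_path,
            dataset=True,
            database=db_name,
            table=table_name,
            mode='append')

        print('RESULT: ')
        print(f'{result}')

As far as I can tell, the table name only appears in the to_parquet call, so I assume that call is what is supposed to register the table in the catalog.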
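
Regarding that last point: I assume Lambda writes its output to a CloudWatch log group named /aws/lambda/<function-name>, and that something like the snippet below would pull the lines for that specific request ID (the function name is a placeholder); please correct me if that assumption is wrong.

import boto3

logs = boto3.client('logs')

# Search the function's log group for the request ID from the timeout line
response = logs.filter_log_events(
    logGroupName='/aws/lambda/my-csv-to-parquet-function',  # placeholder name
    filterPattern='"7bf8244b-ff47-49e6-b296-09c65da7b43c"',
)
for event in response['events']:
    print(event['message'])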
I’m using AWS Wrangler 3.73 via a Lambda layer, on the Python 3.9 runtime.
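
To rule out a mismatch between what I think the layer provides and what actually runs, this is a minimal check I plan to add at the top of the handler (assuming awswrangler exposes __version__, which I believe it does):

import sys
import awswrangler as wr

# Print the runtime and library versions actually available in the Lambda environment
print(f'Python: {sys.version}')
print(f'awswrangler: {wr.__version__}')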
Thanks in advance.