I have connected a dynamoDB table to Athena using the DynamoDB Connector. However, when I check the table schema in Athena Query Editor, I find it is inferred incorrectly. In the table, I have a large nested map attribute as the value with a UUID as the key for that map attribute. As the UUIDs are unique to each row in the table, since Athena has inferred the table schema from a subset of the table, the schema now has the UUID in it, which means I cannot query for that attribute at ALL, as only the row with that UUID will be returned correctly even in a simple SELECT * FROM TABLE_a statement. I understand using a Glue Crawler set up would solve this issue, but I want to keep costs down and wanted to try and work with the DynamoDB connector if possible.
Rough eg of the table and schema:
{
"ID" : {...},
"Email" : {...},
"Products" : {"407ab488-24sd80-a2eds-fwfd2a5" : {...}}
}
Email -varchar
ID -varchar
Products -
struct<407ab488-24sd80-a2eds-fwfd2a5:struct<Products:struct<Product Status:boolean>,<Profile Picture:varchar>,<Location Info:struct<District:varchar,City:varchar,Address2:varchar,Address1:varchar,Country:varchar,Pincode:varchar,Location:varchar>
As you can see, the attribute has a UUID taken from one of the rows during schema inference.
On trying to edit the schema, I found I was not able to do that. It seems only a glue crawler set up allows for schema editing? Reading documentation and comments by AWS Athena Devs seems to imply that the “limited inference” of the DynamoDB Connector is a known issue, but I could not find any solution for this (my mistake if the solution exists, but was not found by me in searching around).
Simply put, is there any known method to edit the dynamodb table schema to allow for unique key values in nested map attributes whilst using the DynamoDB connector? I was trying to create an “ideal sample” row for the schema to be inferred from, but how can I inclue all the possible UUIDs in that sample?