My current code just copies the entire folder from my source bucket to my target bucket, keeping the same keys. It's a Lambda with an S3 trigger that copies each object as soon as it is created:
import urllib.parse

import boto3

TARGET_BUCKET = 'final-destination-bucket-name'


def lambda_handler(event, context):
    # Get incoming bucket and key
    source_bucket = event['Records'][0]['s3']['bucket']['name']
    source_key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])

    # Copy object to different bucket, keeping the same key
    s3_resource = boto3.resource('s3')
    copy_source = {
        'Bucket': source_bucket,
        'Key': source_key
    }
    s3_resource.Bucket(TARGET_BUCKET).Object(source_key).copy(copy_source)

    # Delete the source object (uncomment to turn the copy into a move)
    # s3_resource.Bucket(source_bucket).Object(source_key).delete()
The folder called '2024052111' has another folder inside it called '1722' that contains 20 parquet files. I want each parquet file to end up in its own folder in the target bucket, so that when the Lambda runs it puts one parquet file in one folder. For example, if the source has "table_name_1.parquet", "table_name_2.parquet" and "table_name_3.parquet" all in the folder '1722', I want my Lambda to move those 3 parquet files into 3 separate folders in the target bucket. I have already created 20 folders in the target S3 bucket, each named after one of the parquet files. How do I move each parquet file into its respective folder?
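To make the desired mapping concrete, here is a minimal sketch of one way it might look. It assumes the target folders sit at the top level of the target bucket and are named after the file without the ".parquet" extension (so "table_name_1.parquet" goes to "table_name_1/table_name_1.parquet"); the question does not confirm either detail. `build_target_key` is a hypothetical helper name, not part of boto3.

```python
import posixpath
import urllib.parse

TARGET_BUCKET = 'final-destination-bucket-name'


def build_target_key(source_key):
    """Hypothetical helper: derive the destination key from the source key.

    Assumption: the target folder is the file name without its extension,
    e.g. '2024052111/1722/table_name_1.parquet'
      -> 'table_name_1/table_name_1.parquet'.
    """
    filename = posixpath.basename(source_key)
    folder, _extension = posixpath.splitext(filename)
    return f'{folder}/{filename}'


def lambda_handler(event, context):
    # Imported here so build_target_key can be exercised without boto3 installed.
    import boto3

    s3_resource = boto3.resource('s3')

    # One S3 notification can carry several records, so handle each of them.
    for record in event['Records']:
        source_bucket = record['s3']['bucket']['name']
        source_key = urllib.parse.unquote_plus(record['s3']['object']['key'])

        # Skip anything that is not a parquet file (e.g. folder placeholder objects).
        if not source_key.endswith('.parquet'):
            continue

        copy_source = {'Bucket': source_bucket, 'Key': source_key}
        target_key = build_target_key(source_key)
        s3_resource.Bucket(TARGET_BUCKET).Object(target_key).copy(copy_source)

        # Uncomment to delete the source object and turn the copy into a move.
        # s3_resource.Bucket(source_bucket).Object(source_key).delete()
```

Since S3 has no real folders, only key prefixes, the only thing that changes compared to my current code is the key passed to `Object(...)`. If the folders are actually named with the extension included, or live under a prefix, `build_target_key` would need to be adjusted accordingly.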