Hello, and thank you for taking the time to read this.
My end goal is to run a Python script on Google Cloud so I can automate a data pipeline; this is my first time setting up something like this.
I seem to be having an issue with the requirement.txt file, which prevents me from running my script successfully.
Could you please help?
On the same subject, since I am running the script on Google Cloud itself, do I still need to reference the credential.json file that is supposed to identify me on the platform?
Much appreciated.
I started by scheduling a simple ‘hello world’ test script, which ran without issue. I then followed up with a more complex script that imports several modules. That is when the error messages started to pile up.
Here is the content of my requirement.txt file:
google-cloud-bigquery
pandas
time
os
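A side note on the list above: time and os are part of Python's standard library, so they may not belong in a file that is meant to list pip-installable packages. The sketch below is only a quick local check of that idea, not part of the pipeline. It assumes Python 3.10 or newer, where sys.stdlib_module_names is available, and the variable names (listed, stdlib_entries, third_party_entries) are made up for illustration.

```python
import sys

# Entries exactly as they appear in the requirements file above.
listed = ["google-cloud-bigquery", "pandas", "time", "os"]

# sys.stdlib_module_names was added in Python 3.10; on older versions
# this falls back to an empty set and nothing is flagged.
stdlib_names = getattr(sys, "stdlib_module_names", frozenset())

# Entries that ship with Python itself and need no installation.
stdlib_entries = [name for name in listed if name in stdlib_names]

# Entries that pip would actually have to download.
third_party_entries = [name for name in listed if name not in stdlib_names]

print("standard library:", stdlib_entries)
print("third-party:", third_party_entries)
```

If the check flags time and os, a requirements file containing only google-cloud-bigquery and pandas would be the variant to try next.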
And here is the code for my main.py file:
import pandas as pd
import os
from google.cloud import bigquery
import time
# Get a dataframe going
dic = {'Name':['Peter', 'Paul', 'Jack'], 'Sales':[1000,2000,3000]}
df = pd.DataFrame.from_dict(dic)
# Setup the BigQuery connection
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '.key.json'
client = bigquery.Client()
dataset = client.dataset('Test_Dataset')
table = dataset.table('Test_Table')
job_config = bigquery.LoadJobConfig(autodetect=True)
# Load the dataframe to BigQuery
job = client.load_table_from_dataframe(df, table, job_config=job_config)
# Wait for the job to finish
while job.state != 'DONE':
    time.sleep(1)
    job.reload()
    print(job.state.title())
print(f'Job result: {job.result()}')
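On the credentials question, the Google client libraries use Application Default Credentials: when GOOGLE_APPLICATION_CREDENTIALS is not set, they fall back to the identity of the environment the code runs in, which on Google Cloud is normally the attached service account. Under that assumption, a minimal sketch is to set the variable only when a key file is actually present, for example on a local machine. The helper name use_key_file_if_present and the default path key.json are hypothetical and not taken from my script.

```python
import os


def use_key_file_if_present(key_path="key.json"):
    """Set GOOGLE_APPLICATION_CREDENTIALS only if key_path exists.

    Returns True when the variable was set, False when the key file is
    missing and the environment's default credentials are left in charge.
    """
    if os.path.isfile(key_path):
        os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = key_path
        return True
    return False


if __name__ == "__main__":
    if use_key_file_if_present():
        print("Using the local key file")
    else:
        print("No key file found, relying on default credentials")
```

Calling this helper before bigquery.Client() would let the same main.py run locally with a key file and on Google Cloud without one, provided the runtime's service account has access to the dataset.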