I am setting up a framework that uses Trino from Python. I have set up the connection as follows:
<code>import trino
connection = trino.dbapi.connect(host = '[email protected]', port = 'fgert', catalog = 'hive',
http_scheme = 'https', username = 'abcd', password = 'tigergfrt')
cursor = connection.cursor()  # call the method to get a cursor object
</code>
I want to set the Trino properties that will help me with the following:
- Running large queries with heavy joins
- Running 1K simple select queries successively without overloading the cluster
- Letting long-running queries run for at least 2 hours before giving up their resources
- Scanning up to 1500 partitions in a single select
The existing framework is in Hive, with the following properties:
<code>hive.strict.checks.large.query = false
hive.mapred.mode = nonstrict
hive.execution.engine = tez
hive.vectorized.execution.enabled = false
hive.limit.query.max.table.partition = -1
</code>
I would like the community's help in choosing both standard properties and properties specific to my requirements for the trino-presto connection. Although Trino can also work with Hive, the aim of the framework is to replace Hive.
Please also let me know how to apply the properties within a session. The approach I have found so far is:
<code>property_dict = {'optimize_hash_generation': 'true', 'query_max_execution_time': "'1h'"}
# Issue one SET SESSION statement per property; varchar values such as durations need SQL quotes
for key, value in property_dict.items():
    cursor.execute(f"set session {key} = {value}")
    cursor.fetchall()
</code>
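One way to make the approach above less error-prone is to render each value as a SQL literal before building the statement, since varchar-typed properties such as durations have to be quoted while booleans and numbers do not. The following is a minimal sketch, not part of the Trino client: `to_sql_literal`, `build_set_session_statements` and `apply_session_properties` are hypothetical helper names, the cursor is assumed to be any DB-API cursor with `execute` and `fetchall`, and the property values are examples only.

```python
def to_sql_literal(value):
    """Render a Python value as a SQL literal for a SET SESSION statement."""
    # bool must be checked before int, because bool is a subclass of int
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    # Everything else is treated as a string; escape embedded single quotes
    return "'" + str(value).replace("'", "''") + "'"


def build_set_session_statements(properties):
    """Return one SET SESSION statement per property, in dict order."""
    return [
        f"SET SESSION {name} = {to_sql_literal(value)}"
        for name, value in properties.items()
    ]


def apply_session_properties(cursor, properties):
    """Run each statement on a DB-API cursor (hypothetical helper)."""
    for statement in build_set_session_statements(properties):
        cursor.execute(statement)
        cursor.fetchall()  # consume the result so the statement completes


# Example values only; '2h' mirrors the two-hour requirement in the question
statements = build_set_session_statements(
    {"query_max_execution_time": "2h", "optimize_hash_generation": True}
)
```

With a live connection, this would be called as `apply_session_properties(cursor, {...})` right after creating the cursor. The Python client's `trino.dbapi.connect` also accepts a `session_properties` dictionary, which may remove the need to issue the statements by hand.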