I am using the Trino Docker image (trinodb/trino:400) with the following changes in the Dockerfile:

    USER root
    RUN chmod -R 777 /data
    RUN chmod -R 777 /etc/trino
    USER trino

I am also using the Hive Docker image (Apache Hive 4.0.1), and I added the following JARs in its Dockerfile because it was complaining without them:

    COPY postgresql-42.7.4.jar /opt/hive/lib/postgresql.jar
    ADD hadoop-aws-3.2.0.jar /opt/hive/lib/hadoop-aws-3.2.0.jar
    ADD aws-java-sdk-bundle-1.11.271.jar /opt/hive/lib/aws-java-sdk-bundle-1.11.271.jar

I have connected PostgreSQL to Hive as the metastore database. I am also running the MinIO image (latest: RELEASE.2024-11-07T00-52-20), where I created an empty bucket called tiny.

I have created a Hive catalog in Trino to read files from MinIO:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: {{ .Values.applicationName }}-minio-catalog
      labels:
        app: {{ .Values.applicationName }}
    data:
      hive.properties: |
        connector.name=hive
        hive.metastore.uri=thrift://{{ .Values.hive.metastore.host}}:{{ .Values.hive.metastore.port }}
        hive.s3.endpoint=http://{{ .Values.minio.host}}:{{ .Values.minio.port }}
        hive.s3.aws-access-key={{ .Values.minio.accessKey }}
        hive.s3.aws-secret-key={{ .Values.minio.secretKey }}
        hive.s3.path-style-access=true
        hive.s3.ssl.enabled=false
        hive.non-managed-table-writes-enabled=true
        hive.storage-format=ORC

I also have a ConfigMap for Hive:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: hive-config
    data:
      hive-site.xml: |
        <configuration>
          <property>
            <name>metastore.thrift.uris</name>
            <value>thrift://hive-metastore.hive.svc.cluster.local:9083</value>
            <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
          </property>
          <property>
            <name>javax.jdo.option.ConnectionDriverName</name>
            <value>org.postgresql.Driver</value>
          </property>
          <property>
            <name>javax.jdo.option.ConnectionURL</name>
            <value>jdbc:postgresql://{{ .Values.postgresql.host }}:{{ .Values.postgresql.port }}/{{ .Values.postgresql.database }}</value>
          </property>
          <property>
            <name>javax.jdo.option.ConnectionUserName</name>
            <value>{{ .Values.postgresql.user }}</value>
          </property>
          <property>
            <name>javax.jdo.option.ConnectionPassword</name>
            <value>{{ .Values.postgresql.password }}</value>
          </property>
          <property>
            <name>hive.metastore.schema.verification</name>
            <value>false</value>
          </property>
          <property>
            <name>hive.metastore.db.type</name>
            <value>postgres</value>
          </property>
          <property>
            <name>fs.s3a.access.key</name>
            <value>minio</value>
          </property>
          <property>
            <name>fs.s3a.secret.key</name>
            <value>minio123</value>
          </property>
          <property>
            <name>fs.s3a.endpoint</name>
            <value>http://minio-service.minio-1.svc.cluster.local:9000</value>
          </property>
          <property>
            <name>fs.s3a.path.style.access</name>
            <value>true</value>
          </property>
          <property>
            <name>hive.metastore.pre.event.listeners</name>
            <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
          </property>
          <property>
            <name>hive.security.metastore.authorization.manager</name>
            <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
          </property>
          <property>
            <name>hive.metastore.authorization.storage.checks</name>
            <value>false</value>
          </property>
        </configuration>
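
Since the error further down names the option fs.s3a.aws.credentials.provider, it may help to confirm which fs.s3a.* properties the rendered hive-site.xml actually sets. Below is a minimal sketch in Python, assuming the rendered file can be read as a string; the function name `s3a_properties` and the shortened `SAMPLE` document are illustrative and not part of the deployment.

```python
import xml.etree.ElementTree as ET


def s3a_properties(hive_site_xml: str) -> dict:
    """Return every fs.s3a.* property defined in a hive-site.xml document."""
    root = ET.fromstring(hive_site_xml)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name.startswith("fs.s3a."):
            props[name] = value
    return props


# Illustrative, shortened sample mirroring part of the ConfigMap above.
SAMPLE = """
<configuration>
  <property>
    <name>fs.s3a.endpoint</name>
    <value>http://minio-service.minio-1.svc.cluster.local:9000</value>
  </property>
  <property>
    <name>fs.s3a.path.style.access</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.metastore.db.type</name>
    <value>postgres</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    found = s3a_properties(SAMPLE)
    for key, val in sorted(found.items()):
        print(f"{key} = {val}")
    # Report whether the option named in the error is set explicitly.
    print("credentials provider set:", "fs.s3a.aws.credentials.provider" in found)
```

Run against the full hive-site.xml above, this would list the access key, secret key, endpoint, and path-style settings only. fs.s3a.aws.credentials.provider is not set explicitly there, so whatever default the Hadoop libraries on the metastore classpath define would apply.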

I tried the following commands. In the Trino CLI, I can see the Hive catalog:

    trino> show schemas in hive;
           Schema
    --------------------
     default
     information_schema
    (2 rows)

    Query 20241202_122641_00010_zzij5, FINISHED, 1 node

However, creating a schema in the Hive catalog fails with the following error:

    trino> CREATE SCHEMA hive.tiny WITH (location = 's3a://tiny/');
    Query 20241202_122950_00011_zzij5 failed: java.io.IOException: From option fs.s3a.aws.credentials.provider java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider not found

I expect to be able to create schemas in the Hive catalog and use them to read files from MinIO.
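
The ClassNotFoundException means that org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider is not on the classpath of whichever process handled the request. One way to narrow this down is to check whether a given JAR, such as the hadoop-aws JAR added to the Hive image, contains that class. Below is a minimal sketch in Python, relying on the fact that a JAR is a ZIP archive; the helper name `jar_has_class` is hypothetical, and the JAR path is an assumption taken from the Dockerfile lines above.

```python
import os
import zipfile


def jar_has_class(jar_path: str, class_name: str) -> bool:
    """Return True if the JAR at jar_path contains the fully qualified class."""
    # Classes are stored as path entries, e.g. org/example/Foo.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


if __name__ == "__main__":
    # Assumed location, taken from the ADD line in the Hive Dockerfile.
    path = "/opt/hive/lib/hadoop-aws-3.2.0.jar"
    missing = "org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider"
    if os.path.isfile(path):
        print(path, "contains class:", jar_has_class(path, missing))
    else:
        print("no such JAR:", path)
```

If this prints False for the hadoop-aws JAR on the metastore's classpath, a version mismatch between the added hadoop-aws JAR and the Hadoop libraries bundled with the image would be one possible explanation.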