Issue Summary:
Experiencing problems when transferring data from Oracle Cloud Infrastructure (OCI) to Databricks Delta Lake tables using Qlik Replicate.
Source: Oracle Cloud Infrastructure (OCI)
Destination: Databricks Delta Lake tables
Problem Statement:
Data duplication
Missing data
Details:
Logs from Qlik Replicate point to issues in Databricks.
Logs from Databricks, checked via “Query History,” also show issues.
Unable to diagnose the exact errors causing data duplication and missing data.
Some network-related errors have been identified.
Uncertain about the next steps to mitigate the problem.
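One possible first step is to quantify the two symptoms before chasing individual errors. The following is a minimal Python sketch, not a definitive procedure. It assumes that (ITEM, LOC) is the business key, which is only a guess based on the MERGE predicate in the Databricks error, and that key lists have already been exported from the Oracle source and the Delta target. The function name `compare_keys` and the sample values are hypothetical.

```python
from collections import Counter


def compare_keys(source_keys, target_keys):
    """Return (duplicated, missing) for two iterables of key tuples.

    duplicated: keys that occur more than once in the target, with their count
    missing:    keys present in the source but absent from the target
    """
    target_counts = Counter(target_keys)
    duplicated = {key: n for key, n in target_counts.items() if n > 1}
    missing = sorted(set(source_keys) - set(target_counts))
    return duplicated, missing


# Hypothetical sample data: (ITEM, LOC) pairs exported from each side.
source = [("A1", "L1"), ("A2", "L1"), ("A3", "L2")]
target = [("A1", "L1"), ("A1", "L1"), ("A3", "L2")]

duplicated, missing = compare_keys(source, target)
print("duplicated in target:", duplicated)
print("missing from target:", missing)
```

Running the same comparison per table and per load window may show whether the duplicates and gaps line up with the timestamps of the errors listed here.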
Sample error at Databricks side:
Error 1:
Query could not be scheduled: HTTP Response code: 500
Error 2:
[DELTA_CONCURRENT_APPEND] ConcurrentAppendException: Files were added to the root of the table by a concurrent update. Please try the operation again.
Conflicting commit: {"timestamp":1722250928325,"userId":"5954939804701401","userName":"[email protected]","operation":"MERGE","operationParameters":{"predicate":["((ITEM#22092922 = seg1_repcol#22092878) AND (LOC#22092925 = seg2_repcol#22092879))"],"matchedPredicates":[{"actionType":"update"}],"statsOnLoad":false,"notMatchedBySourcePredicates":[],"notMatchedPredicates":[]},"readVersion":2917,"isolationLevel":"WriteSerializable","isBlindAppend":false,"operationMetrics":{"numTargetRowsCopied":"0","numTargetRowsDeleted":"0","numTargetFilesAdded":"1","numTargetBytesAdded":"1282377","numTargetBytesRemoved":"1005294","numTargetDeletionVectorsAdded":"158","numTargetRowsMatchedUpdated":"28916","executionTimeMs":"209330","numTargetRowsInserted":"0","numTargetRowsMatchedDeleted":"0","numTargetDeletionVectorsUpdated":"158","scanTimeMs":"83197","numTargetRowsUpdated":"28916","numOutputRows":"28916","numTargetDeletionVectorsRemoved":"158","numTargetRowsNotMatchedBySourceUpdated":"0","numTargetChangeFilesAdded":"0","numSourceRows":"29161","numTargetFilesRemoved":"1","numTargetRowsNotMatchedBySourceDeleted":"0","rewriteTimeMs":"84773"},"tags":{"noRowsCopied":"true","delta.rowTracking.preserved":"false","restoresDeletedRows":"false"},"engineInfo":"Databricks-Runtime/15.2.x-photon-scala2.12","txnId":"eb912069-75fd-4f33-9829-a9cb191b4b7e"}
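The conflicting commit is JSON, so the fields that matter for diagnosis can be pulled out programmatically instead of read by eye. The sketch below is a minimal Python example using only the standard library. The helper names `normalize_quotes` and `summarize_commit` are hypothetical, the sample string is a shortened copy of the commit above, and the quote normalization assumes the text may have been pasted through an editor that converted straight quotes to typographic ones.

```python
import json
from datetime import datetime, timedelta, timezone

# Map typographic quote characters to plain double quotes (assumed paste damage).
_QUOTE_TABLE = {0x201C: '"', 0x201D: '"', 0x2033: '"'}
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def normalize_quotes(text):
    """Replace curly quotes and double primes so the text parses as JSON."""
    return text.translate(_QUOTE_TABLE)


def summarize_commit(commit_json):
    """Extract a few fields from a Delta 'Conflicting commit' JSON string."""
    commit = json.loads(normalize_quotes(commit_json))
    metrics = commit.get("operationMetrics", {})
    committed_at = _EPOCH + timedelta(milliseconds=commit["timestamp"])
    return {
        "operation": commit.get("operation"),
        "committed_at_utc": committed_at.isoformat(),
        "read_version": commit.get("readVersion"),
        "isolation_level": commit.get("isolationLevel"),
        "is_blind_append": commit.get("isBlindAppend"),
        "source_rows": int(metrics.get("numSourceRows", 0)),
        "rows_updated": int(metrics.get("numTargetRowsUpdated", 0)),
        "rows_inserted": int(metrics.get("numTargetRowsInserted", 0)),
    }


# Shortened copy of the commit reported in the error message.
sample = (
    '{"timestamp":1722250928325,"operation":"MERGE","readVersion":2917,'
    '"isolationLevel":"WriteSerializable","isBlindAppend":false,'
    '"operationMetrics":{"numSourceRows":"29161",'
    '"numTargetRowsUpdated":"28916","numTargetRowsInserted":"0"}}'
)

summary = summarize_commit(sample)
for field, value in summary.items():
    print(f"{field}: {value}")
```

Collecting these summaries for every conflict in Query History may show whether the conflicting writer is always another MERGE from the same replication task or a separate job touching the same table.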
Sample error at Qlik side pointing to Databricks:
Error 1:
02728323: 2024-07-11T00:50:57 [TARGET_LOAD ]E: RetCode: SQL_ERROR SqlState: 08S01 NativeError: 124 Message: [Simba][Hardy] (124) A 503 response was returned but no Retry-After header was provided. Original error: Unknown [1022502] (ar_odbc_stmt.c:5090)
Error 2:
02602439: 2024-07-28T18:12:49 [TARGET_APPLY ]T: RetCode: SQL_ERROR SqlState: 08S01 NativeError: 115 Message: [Simba][Hardy] (115) Connection failed with error: SSL_read: Connection reset by peer [1022502] (ar_odbc_stmt.c:4737)
02602439: 2024-07-28T18:12:49 [TARGET_APPLY ]T: Network error encountered (ar_odbc_util.c:1242)
Error 3:
04171826: 2024-07-29T14:43:45 [TARGET_APPLY ]T: Failed (retcode -1) to execute statement
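To see which of these errors dominate over time, the Qlik Replicate task log could be grouped by component and ODBC error code. The following is a rough Python sketch under stated assumptions: the regular expressions are inferred only from the four log lines above and may not match every line Replicate writes, the meaning of the leading numeric field is not assumed, and the names `parse_line`, `LINE_RE`, and `ODBC_RE` are hypothetical.

```python
import re
from collections import Counter

# Layout inferred from the sample lines: "<number>: <timestamp> [<COMPONENT> ]<L>: <message>"
LINE_RE = re.compile(
    r"^(?P<prefix>\d+):\s+(?P<timestamp>\S+)\s+"
    r"\[(?P<component>[A-Z_]+)\s*\](?P<level>[A-Z]):\s+(?P<message>.*)$"
)
# ODBC details appear only on some lines.
ODBC_RE = re.compile(r"SqlState:\s*(?P<sqlstate>\S+)\s+NativeError:\s*(?P<native>\d+)")


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    odbc = ODBC_RE.search(record["message"])
    record["sqlstate"] = odbc.group("sqlstate") if odbc else None
    record["native_error"] = int(odbc.group("native")) if odbc else None
    return record


log_lines = [
    "02728323: 2024-07-11T00:50:57 [TARGET_LOAD ]E: RetCode: SQL_ERROR SqlState: 08S01 "
    "NativeError: 124 Message: [Simba][Hardy] (124) A 503 response was returned but no "
    "Retry-After header was provided. Original error: Unknown [1022502] (ar_odbc_stmt.c:5090)",
    "02602439: 2024-07-28T18:12:49 [TARGET_APPLY ]T: RetCode: SQL_ERROR SqlState: 08S01 "
    "NativeError: 115 Message: [Simba][Hardy] (115) Connection failed with error: "
    "SSL_read: Connection reset by peer [1022502] (ar_odbc_stmt.c:4737)",
    "02602439: 2024-07-28T18:12:49 [TARGET_APPLY ]T: Network error encountered (ar_odbc_util.c:1242)",
    "04171826: 2024-07-29T14:43:45 [TARGET_APPLY ]T: Failed (retcode -1) to execute statement",
]

records = [r for r in map(parse_line, log_lines) if r is not None]
counts = Counter((r["component"], r["sqlstate"], r["native_error"]) for r in records)
for key, n in counts.most_common():
    print(key, n)
```

Run against the full task log, a count like this may show whether the 503 responses and connection resets cluster around particular times, which would help correlate them with the Databricks-side errors.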