Why is my Flink checkpoint always in the in-progress state?

I am using Flink to read from Kafka and write to Kafka; it's a simple job, but I am facing some problems.

1. The Flink checkpoint is always in the in-progress state.

2. The job logs don't contain any useful messages.

  • Job running state

  • JM log

    2024-06-03 19:35:01[flink-akka.actor.default-dispatcher-6][org.apache.flink.runtime.executiongraph.ExecutionGraph][INFO] - Job [默认机构]-[mengh-test]-[xh_kafka2kafka]-[2] (50a99ba35e78fc2dada1595a64c1095d) switched from state CREATED to RUNNING.
    2024-06-03 19:35:01[flink-akka.actor.default-dispatcher-6][org.apache.flink.runtime.executiongraph.ExecutionGraph][INFO] - Source: Kafka采集_1 (1/1) (67303c1e5204a0e0ca67159cd535d445_32327f23cd3a7e9293b6f6e514156c67_0_0) switched from CREATED to SCHEDULED.
    2024-06-03 19:35:01[flink-akka.actor.default-dispatcher-6][org.apache.flink.runtime.executiongraph.ExecutionGraph][INFO] - Sink: Kafka输出_2 (1/1) (67303c1e5204a0e0ca67159cd535d445_2534dea4e3192b38956d113863687316_0_0) switched from CREATED to SCHEDULED.
    2024-06-03 19:35:01[flink-akka.actor.default-dispatcher-6][org.apache.flink.runtime.jobmaster.JobMaster][INFO] - Connecting to ResourceManager akka.tcp://flink@xh2:36541/user/rpc/resourcemanager_*(00000000000000000000000000000000)
    2024-06-03 19:35:02[flink-akka.actor.default-dispatcher-19][org.apache.hadoop.hdfs.DFSClient][INFO] - Created HDFS_DELEGATION_TOKEN token 71 for paUser on ha-hdfs:nameservice1
    2024-06-03 19:35:02[flink-akka.actor.default-dispatcher-19][org.apache.hadoop.hdfs.DFSClient][INFO] - Created HDFS_DELEGATION_TOKEN token 72 for paUser on ha-hdfs:nameservice1
    2024-06-03 19:35:02[flink-akka.actor.default-dispatcher-19][org.apache.flink.runtime.security.token.KerberosDelegationTokenManager][INFO] - Tokens update task started with 64799860 ms delay
    2024-06-03 19:35:02[flink-akka.actor.default-dispatcher-19][org.apache.hadoop.ipc.Client][WARN] - Failed to connect to server: xh3/10.100.1.167:8030: retries get failed due to exceeded maximum allowed retries number: 0
    java.net.ConnectException: Connection refused
      at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_25]
      at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716) ~[?:1.8.0_25]
      at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) ~[flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531) ~[flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:685) [flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:788) [flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:410) [flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at org.apache.hadoop.ipc.Client.getConnection(Client.java:1550) [flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at org.apache.hadoop.ipc.Client.call(Client.java:1381) [flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at org.apache.hadoop.ipc.Client.call(Client.java:1345) [flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227) [flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116) [flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at com.sun.proxy.$Proxy40.registerApplicationMaster(Unknown Source) [?:?]
      at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:106) [flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_25]
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_25]
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_25]
      at java.lang.reflect.Method.invoke(Method.java:483) ~[?:1.8.0_25]
      at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409) [flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163) [flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155) [flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) [flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346) [flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at com.sun.proxy.$Proxy41.registerApplicationMaster(Unknown Source) [?:?]
      at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:236) [flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:228) [flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl.registerApplicationMaster(AMRMClientAsyncImpl.java:168) [flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
      at org.apache.flink.yarn.YarnResourceManagerDriver.registerApplicationMaster(YarnResourceManagerDriver.java:525) [flink-dist-1.16.1.jar:1.16.1]
      at org.apache.flink.yarn.YarnResourceManagerDriver.initializeInternal(YarnResourceManagerDriver.java:186) [flink-dist-1.16.1.jar:1.16.1]
      at org.apache.flink.runtime.resourcemanager.active.AbstractResourceManagerDriver.initialize(AbstractResourceManagerDriver.java:92) [flink-dist-1.16.1.jar:1.16.1]
      at org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager.initialize(ActiveResourceManager.java:171) [flink-dist-1.16.1.jar:1.16.1]
      at org.apache.flink.runtime.resourcemanager.ResourceManager.startResourceManagerServices(ResourceManager.java:269) [flink-dist-1.16.1.jar:1.16.1]
      at org.apache.flink.runtime.resourcemanager.ResourceManager.onStart(ResourceManager.java:241) [flink-dist-1.16.1.jar:1.16.1]
      at org.apache.flink.runtime.rpc.RpcEndpoint.internalCallOnStart(RpcEndpoint.java:198) [flink-dist-1.16.1.jar:1.16.1]
      at org.apache.flink.runtime.rpc.akka.AkkaRpcActor$StoppedState.lambda$start$0(AkkaRpcActor.java:622) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at org.apache.flink.runtime.rpc.akka.AkkaRpcActor$StoppedState$$Lambda$448/1738199324.run(Unknown Source) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at org.apache.flink.runtime.rpc.akka.AkkaRpcActor$StoppedState.start(AkkaRpcActor.java:621) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleControlMessage(AkkaRpcActor.java:190) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at org.apache.flink.runtime.rpc.akka.AkkaRpcActor$$Lambda$445/1346710180.apply(Unknown Source) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at akka.actor.Actor.aroundReceive(Actor.scala:537) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at akka.actor.Actor.aroundReceive$(Actor.scala:535) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at akka.actor.ActorCell.invoke(ActorCell.scala:548) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at akka.dispatch.Mailbox.run(Mailbox.scala:231) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at akka.dispatch.Mailbox.exec(Mailbox.scala:243) [flink-rpc-akka_b38f95be-66ba-4627-bef8-8e33e48bf6cf.jar:1.16.1]
      at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) [?:1.8.0_25]
      at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:902) [?:1.8.0_25]
      at java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1689) [?:1.8.0_25]
      at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1644) [?:1.8.0_25]
      at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157) [?:1.8.0_25]
    2024-06-03 19:35:02[flink-akka.actor.default-dispatcher-19][org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider][INFO] - Failing over to rm2
    2024-06-03 19:35:02[flink-akka.actor.default-dispatcher-19][org.apache.flink.yarn.YarnResourceManagerDriver][INFO] - Recovered 0 containers from previous attempts ([]).
    2024-06-03 19:35:02[flink-akka.actor.default-dispatcher-19][org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager][INFO] - Recovered 0 workers from previous attempt.
    2024-06-03 19:35:02[flink-akka.actor.default-dispatcher-19][org.apache.flink.runtime.externalresource.ExternalResourceUtils][INFO] - Enabled external resources: []
    2024-06-03 19:35:02[flink-akka.actor.default-dispatcher-19][org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl][INFO] - Upper bound of the thread pool size is 500
    2024-06-03 19:35:02[flink-akka.actor.default-dispatcher-19][org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy][INFO] - yarn.client.max-cached-nodemanagers-proxies : 0
    2024-06-03 19:35:02[flink-akka.actor.default-dispatcher-6][org.apache.flink.runtime.jobmaster.JobMaster][INFO] - Resolved ResourceManager address, beginning registration
    2024-06-03 19:35:02[flink-akka.actor.default-dispatcher-19][org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager][INFO] - Registering job manager [email protected]://flink@xh2:36541/user/rpc/jobmanager_1 for job 50a99ba35e78fc2dada1595a64c1095d.
    2024-06-03 19:35:02[flink-akka.actor.default-dispatcher-19][org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager][INFO] - Registered job manager [email protected]://flink@xh2:36541/user/rpc/jobmanager_1 for job 50a99ba35e78fc2dada1595a64c1095d.
    2024-06-03 19:35:02[flink-akka.actor.default-dispatcher-6][org.apache.flink.runtime.jobmaster.JobMaster][INFO] - JobManager successfully registered at ResourceManager, leader id: 00000000000000000000000000000000.
    2024-06-03 19:35:02[flink-akka.actor.default-dispatcher-6][org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager][INFO] - Received resource requirements from job 50a99ba35e78fc2dada1595a64c1095d: [ResourceRequirement{resourceProfile=ResourceProfile{UNKNOWN}, numberOfRequiredSlots=1}]
    2024-06-03 19:35:02[flink-akka.actor.default-dispatcher-19][org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager][INFO] - Requesting new worker with resource spec WorkerResourceSpec {cpuCores=1.0, taskHeapSize=198.400mb (208037478 bytes), taskOffHeapSize=0 bytes, networkMemSize=64.000mb (67108864 bytes), managedMemSize=57.600mb (60397978 bytes), numSlots=1}, current pending count: 1.
    2024-06-03 19:35:02[flink-akka.actor.default-dispatcher-19][org.apache.flink.yarn.YarnResourceManagerDriver][INFO] - Requesting new TaskExecutor container with resource TaskExecutorProcessSpec {cpuCores=1.0, frameworkHeapSize=128.000mb (134217728 bytes), frameworkOffHeapSize=128.000mb (134217728 bytes), taskHeapSize=198.400mb (208037478 bytes), taskOffHeapSize=0 bytes, networkMemSize=64.000mb (67108864 bytes), managedMemorySize=57.600mb (60397978 bytes), jvmMetaspaceSize=256.000mb (268435456 bytes), jvmOverheadSize=192.000mb (201326592 bytes), numSlots=1}, priority 1.
    2024-06-03 19:35:07[Checkpoint Timer][org.apache.flink.runtime.checkpoint.CheckpointFailureManager][INFO] - Failed to trigger checkpoint for job 50a99ba35e78fc2dada1595a64c1095d since Checkpoint triggering task Source: Kafka采集_1 (1/1) of job 50a99ba35e78fc2dada1595a64c1095d is not being executed at the moment. Aborting checkpoint. Failure reason: Not all required tasks are currently running..
    2024-06-03 19:35:07[AMRM Heartbeater thread][org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl][INFO] - Received new token for : xh3:36051
    2024-06-03 19:35:07[flink-akka.actor.default-dispatcher-19][org.apache.flink.yarn.YarnResourceManagerDriver][INFO] - Received 1 containers.
    2024-06-03 19:35:07[flink-akka.actor.default-dispatcher-19][org.apache.flink.yarn.YarnResourceManagerDriver][INFO] - Received 1 containers with priority 1, 1 pending container requests.
    2024-06-03 19:35:07[flink-akka.actor.default-dispatcher-19][org.apache.flink.yarn.YarnResourceManagerDriver][INFO] - Removing container request Capability[<memory:1024, vCores:1>]Priority[1].
    2024-06-03 19:35:07[flink-akka.actor.default-dispatcher-19][org.apache.flink.yarn.YarnResourceManagerDriver][INFO] - Accepted 1 requested containers, returned 0 excess containers, 0 pending container requests of resource <memory:1024, vCores:1>.
    2024-06-03 19:35:07[cluster-io-thread-4][org.apache.flink.yarn.YarnResourceManagerDriver][INFO] - TaskExecutor container_e02_1716186778189_0029_01_000002(xh3:36051) will be started on xh3 with TaskExecutorProcessSpec {cpuCores=1.0, frameworkHeapSize=128.000mb (134217728 bytes), frameworkOffHeapSize=128.000mb (134217728 bytes), taskHeapSize=198.400mb (208037478 bytes), taskOffHeapSize=0 bytes, networkMemSize=64.000mb (67108864 bytes), managedMemorySize=57.600mb (60397978 bytes), jvmMetaspaceSize=256.000mb (268435456 bytes), jvmOverheadSize=192.000mb (201326592 bytes), numSlots=1}.
    2024-06-03 19:35:07[cluster-io-thread-4][org.apache.flink.yarn.YarnResourceManagerDriver][INFO] - TM:Adding keytab hdfs://nameservice1/user/paUser/.flink/application_1716186778189_0029/paUser.keytab to the container local resource bucket
    2024-06-03 19:35:08[cluster-io-thread-4][org.apache.flink.yarn.YarnResourceManagerDriver][INFO] - Creating container launch context for TaskManagers
    2024-06-03 19:35:08[cluster-io-thread-4][org.apache.flink.yarn.YarnResourceManagerDriver][INFO] - Starting TaskManagers
    2024-06-03 19:35:08[flink-akka.actor.default-dispatcher-19][org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager][INFO] - Requested worker container_e02_1716186778189_0029_01_000002(xh3:36051) with resource spec WorkerResourceSpec {cpuCores=1.0, taskHeapSize=198.400mb (208037478 bytes), taskOffHeapSize=0 bytes, networkMemSize=64.000mb (67108864 bytes), managedMemSize=57.600mb (60397978 bytes), numSlots=1}.
    2024-06-03 19:35:08[org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl #0][org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl][INFO] - Processing Event EventType: START_CONTAINER for Container container_e02_1716186778189_0029_01_000002
    2024-06-03 19:35:21[flink-akka.actor.default-dispatcher-24][org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager][INFO] - Registering TaskManager with ResourceID container_e02_1716186778189_0029_01_000002(xh3:36051) (akka.tcp://flink@xh3:45749/user/rpc/taskmanager_0) at ResourceManager
    2024-06-03 19:35:21[flink-akka.actor.default-dispatcher-6][org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager][INFO] - Worker container_e02_1716186778189_0029_01_000002(xh3:36051) is registered.
    2024-06-03 19:35:21[flink-akka.actor.default-dispatcher-6][org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager][INFO] - Worker container_e02_1716186778189_0029_01_000002(xh3:36051) with resource spec WorkerResourceSpec {cpuCores=1.0, taskHeapSize=198.400mb (208037478 bytes), taskOffHeapSize=0 bytes, networkMemSize=64.000mb (67108864 bytes), managedMemSize=57.600mb (60397978 bytes), numSlots=1} was requested in current attempt. Current pending count after registering: 0.
    2024-06-03 19:35:24[flink-akka.actor.default-dispatcher-22][org.apache.flink.runtime.executiongraph.ExecutionGraph][INFO] - Source: Kafka采集_1 (1/1) (67303c1e5204a0e0ca67159cd535d445_32327f23cd3a7e9293b6f6e514156c67_0_0) switched from SCHEDULED to DEPLOYING.
    2024-06-03 19:35:24[flink-akka.actor.default-dispatcher-22][org.apache.flink.runtime.executiongraph.ExecutionGraph][INFO] - Deploying Source: Kafka采集_1 (1/1) (attempt #0) with attempt id 67303c1e5204a0e0ca67159cd535d445_32327f23cd3a7e9293b6f6e514156c67_0_0 and vertex id 32327f23cd3a7e9293b6f6e514156c67_0 to container_e02_1716186778189_0029_01_000002 @ xh3 (dataPort=45797) with allocation id 5de59394618459594cea507b0b706534
    2024-06-03 19:35:24[flink-akka.actor.default-dispatcher-22][org.apache.flink.runtime.executiongraph.ExecutionGraph][INFO] - Sink: Kafka输出_2 (1/1) (67303c1e5204a0e0ca67159cd535d445_2534dea4e3192b38956d113863687316_0_0) switched from SCHEDULED to DEPLOYING.
    2024-06-03 19:35:24[flink-akka.actor.default-dispatcher-22][org.apache.flink.runtime.executiongraph.ExecutionGraph][INFO] - Deploying Sink: Kafka输出_2 (1/1) (attempt #0) with attempt id 67303c1e5204a0e0ca67159cd535d445_2534dea4e3192b38956d113863687316_0_0 and vertex id 2534dea4e3192b38956d113863687316_0 to container_e02_1716186778189_0029_01_000002 @ xh3 (dataPort=45797) with allocation id 5de59394618459594cea507b0b706534
    2024-06-03 19:35:25[flink-akka.actor.default-dispatcher-22][org.apache.flink.runtime.executiongraph.ExecutionGraph][INFO] - Sink: Kafka输出_2 (1/1) (67303c1e5204a0e0ca67159cd535d445_2534dea4e3192b38956d113863687316_0_0) switched from DEPLOYING to INITIALIZING.
    2024-06-03 19:35:25[flink-akka.actor.default-dispatcher-22][org.apache.flink.runtime.executiongraph.ExecutionGraph][INFO] - Source: Kafka采集_1 (1/1) (67303c1e5204a0e0ca67159cd535d445_32327f23cd3a7e9293b6f6e514156c67_0_0) switched from DEPLOYING to INITIALIZING.
    2024-06-03 19:35:46[flink-akka.actor.default-dispatcher-21][org.apache.flink.runtime.executiongraph.ExecutionGraph][INFO] - Sink: Kafka输出_2 (1/1) (67303c1e5204a0e0ca67159cd535d445_2534dea4e3192b38956d113863687316_0_0) switched from INITIALIZING to RUNNING.
    2024-06-03 19:35:47[flink-akka.actor.default-dispatcher-21][org.apache.flink.runtime.executiongraph.ExecutionGraph][INFO] - Source: Kafka采集_1 (1/1) (67303c1e5204a0e0ca67159cd535d445_32327f23cd3a7e9293b6f6e514156c67_0_0) switched from INITIALIZING to RUNNING.
    2024-06-03 19:36:07[Checkpoint Timer][org.apache.flink.runtime.checkpoint.CheckpointCoordinator][INFO] - Triggering checkpoint 1 (type=CheckpointType{name='Checkpoint', sharingFilesStrategy=FORWARD_BACKWARD}) @ 1717414567372 for job 50a99ba35e78fc2dada1595a64c1095d.
    
  • TM log

    
    2024-06-03 19:35:24[Sink: Kafka输出_2 (1/1)#0][org.apache.flink.runtime.taskmanager.Task][INFO] - Loading JAR files for task Sink: Kafka输出_2 (1/1)#0 (67303c1e5204a0e0ca67159cd535d445_2534dea4e3192b38956d113863687316_0_0) [DEPLOYING].
    2024-06-03 19:35:24[Source: Kafka采集_1 (1/1)#0][org.apache.flink.streaming.runtime.tasks.StreamTask][INFO] - Using job/cluster config to configure application-defined state backend: org.apache.flink.runtime.state.hashmap.HashMapStateBackend@164a5a6a
    2024-06-03 19:35:24[Sink: Kafka输出_2 (1/1)#0][org.apache.flink.streaming.runtime.tasks.StreamTask][INFO] - Using job/cluster config to configure application-defined state backend: org.apache.flink.runtime.state.hashmap.HashMapStateBackend@4baf67bb
    2024-06-03 19:35:24[Source: Kafka采集_1 (1/1)#0][org.apache.flink.streaming.runtime.tasks.StreamTask][INFO] - Using application-defined state backend: org.apache.flink.runtime.state.hashmap.HashMapStateBackend@32c0868a
    2024-06-03 19:35:24[Source: Kafka采集_1 (1/1)#0][org.apache.flink.runtime.state.StateBackendLoader][INFO] - State backend loader loads the state backend as HashMapStateBackend
    2024-06-03 19:35:24[Sink: Kafka输出_2 (1/1)#0][org.apache.flink.streaming.runtime.tasks.StreamTask][INFO] - Using application-defined state backend: org.apache.flink.runtime.state.hashmap.HashMapStateBackend@798815c7
    2024-06-03 19:35:24[Sink: Kafka输出_2 (1/1)#0][org.apache.flink.runtime.state.StateBackendLoader][INFO] - State backend loader loads the state backend as HashMapStateBackend
    2024-06-03 19:35:24[Source: Kafka采集_1 (1/1)#0][org.apache.flink.streaming.runtime.tasks.StreamTask][INFO] - Using job/cluster config to configure application-defined checkpoint storage: JobManagerCheckpointStorage (checkpoints to JobManager) ( maxStateSize: 5242880)
    2024-06-03 19:35:24[Sink: Kafka输出_2 (1/1)#0][org.apache.flink.streaming.runtime.tasks.StreamTask][INFO] - Using job/cluster config to configure application-defined checkpoint storage: JobManagerCheckpointStorage (checkpoints to JobManager) ( maxStateSize: 5242880)
    2024-06-03 19:35:25[Source: Kafka采集_1 (1/1)#0][org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory][WARN] - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
    2024-06-03 19:35:25[Sink: Kafka输出_2 (1/1)#0][org.apache.flink.runtime.taskmanager.Task][INFO] - Sink: Kafka输出_2 (1/1)#0 (67303c1e5204a0e0ca67159cd535d445_2534dea4e3192b38956d113863687316_0_0) switched from DEPLOYING to INITIALIZING.
    2024-06-03 19:35:25[Source: Kafka采集_1 (1/1)#0][org.apache.flink.runtime.taskmanager.Task][INFO] - Source: Kafka采集_1 (1/1)#0 (67303c1e5204a0e0ca67159cd535d445_32327f23cd3a7e9293b6f6e514156c67_0_0) switched from DEPLOYING to INITIALIZING.
    2024-06-03 19:35:25[Source: Kafka采集_1 (1/1)#0][org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase][INFO] - Consumer subtask 0 has no restore state.
    2024-06-03 19:35:25[Source: Kafka采集_1 (1/1)#0][cn.com.bsfit.pipeace.component.flink.kafka.KafkaSourceComponent][INFO] - consumer krb5Path is /vdir/opt/hadoop/yarn/local/usercache/paUser/appcache/application_1716186778189_0029/container_e02_1716186778189_0029_01_000002/tempFile/8bef0b9c2306b9d7d40fb95e940d0afckrb5.conf
    2024-06-03 19:35:46[Sink: Kafka输出_2 (1/1)#0][cn.com.bsfit.org.apache.flink.streaming.connectors.kafka.KafkaSinkComponent][INFO] - producer krb5Path is /vdir/opt/hadoop/yarn/local/usercache/paUser/appcache/application_1716186778189_0029/container_e02_1716186778189_0029_01_000002/tempFile/8bef0b9c2306b9d7d40fb95e940d0afckrb5.conf
    2024-06-03 19:35:46[Sink: Kafka输出_2 (1/1)#0][cn.com.bsfit.org.apache.flink.streaming.connectors.kafka.KafkaSinkComponent][INFO] - producer jaasStr is com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    keyTab="/vdir/opt/hadoop/yarn/local/usercache/paUser/appcache/application_1716186778189_0029/container_e02_1716186778189_0029_01_000002/tempFile/8bef0b9c2306b9d7d40fb95e940d0afckadm5.keytab"
    principal="paUser@TDH"
    useTicketCache=false
    storeKey=true
    refreshKrb5Config=true
    debug=true;
    
    2024-06-03 19:35:46[Sink: Kafka输出_2 (1/1)#0][org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction][INFO] - KafkaSinkComponent 1/1 - no state to restore
    2024-06-03 19:35:46[Sink: Kafka输出_2 (1/1)#0][cn.com.bsfit.org.apache.flink.streaming.connectors.kafka.PaFlinkKafkaProducer][INFO] - Starting FlinkKafkaInternalProducer (1/1) to produce into default topic pa-first-topic
    2024-06-03 19:35:46[Sink: Kafka输出_2 (1/1)#0][org.apache.flink.runtime.taskmanager.Task][INFO] - Sink: Kafka输出_2 (1/1)#0 (67303c1e5204a0e0ca67159cd535d445_2534dea4e3192b38956d113863687316_0_0) switched from INITIALIZING to RUNNING.
    2024-06-03 19:35:47[Source: Kafka采集_1 (1/1)#0][org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase][INFO] - Consumer subtask 0 will start reading the following 3 partitions from the committed group offsets in Kafka: [KafkaTopicPartition{topic='mengh-test', partition=2}, KafkaTopicPartition{topic='mengh-test', partition=0}, KafkaTopicPartition{topic='mengh-test', partition=1}]
    2024-06-03 19:35:47[Source: Kafka采集_1 (1/1)#0][org.apache.flink.runtime.taskmanager.Task][INFO] - Source: Kafka采集_1 (1/1)#0 (67303c1e5204a0e0ca67159cd535d445_32327f23cd3a7e9293b6f6e514156c67_0_0) switched from INITIALIZING to RUNNING.
    2024-06-03 19:35:47[Legacy Source Thread - Source: Kafka采集_1 (1/1)#0][org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase][INFO] - Consumer subtask 0 creating fetcher with offsets {KafkaTopicPartition{topic='mengh-test', partition=2}=-915623761773, KafkaTopicPartition{topic='mengh-test', partition=0}=-915623761773, KafkaTopicPartition{topic='mengh-test', partition=1}=-915623761773}.
    
    

I read the job logs carefully but didn't find anything useful; they show that everything is OK, except for one connection-refused WARN (see the JM log above for details).

Please tell me why my job's checkpoints are always stuck in the in-progress state, and how to make the job checkpoint normally.
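For context, checkpointing in the job is enabled roughly along the following lines (a simplified sketch, not my exact code; the interval and timeout values here are placeholders, and the Kafka source/sink wiring is omitted):

```java
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointConfigSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Trigger a checkpoint every 60 seconds (placeholder value).
        env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);

        // Abort a checkpoint that does not complete within 10 minutes (the
        // Flink default); until this timeout fires, a stuck checkpoint is
        // shown as "in progress" in the web UI.
        env.getCheckpointConfig().setCheckpointTimeout(600_000);

        // The actual pipeline (Kafka source -> Kafka sink) is omitted here.
        env.execute("kafka2kafka");
    }
}
```

As the comment notes, a checkpoint that cannot complete (for example, because a barrier never reaches the sink) stays in the in-progress state until the checkpoint timeout expires, which matches what I see in the UI.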
