I packaged my Amazon Lookout for Vision model into an AWS IoT Greengrass component and am deploying it to my Jetson AGX Xavier. I have deployed this model to this same machine successfully in the past, but IT recently wiped the machine and I am now running into the following issue. Note that IT did set up a bunch of security tooling, so maybe something is interfering with ggc_user/ggc_group?
From the AWS documentation, I borrowed this function.
<code>import time

# pb2 is assumed to be the generated protobuf module from the AWS example client.


def start_model_if_needed(stub, model_name):
    # Starting model if needed.
    while True:
        model_description_response = stub.DescribeModel(pb2.DescribeModelRequest(model_component=model_name))
        print(f"DescribeModel() returned {model_description_response}")
        if model_description_response.model_description.status == pb2.RUNNING:
            print("Model is already running.")
            break
        elif model_description_response.model_description.status == pb2.STOPPED:
            print("Starting the model.")
            stub.StartModel(pb2.StartModelRequest(model_component=model_name))
            continue
        elif model_description_response.model_description.status == pb2.FAILED:
            raise Exception(f"model {model_name} failed to start")
        print("Waiting for model to start.")
        # Any status other than STARTING means there is nothing left to wait for.
        if model_description_response.model_description.status != pb2.STARTING:
            break
        time.sleep(1.0)
</code>
However, I am seeing this output when I run it:
<code>DescribeModel() returned model_description {
  model_component: "[ModelName]"
  lookout_vision_model_arn: "..."
  status: FAILED
  status_message: "Model failed with unknown reason."
}
</code>
Here is my /greengrass/v2/logs/aws.iot.lookoutvision.EdgeAgent.log:
<code>2024-09-21T08:31:35.939Z [WARN] (Copier) aws.iot.lookoutvision.EdgeAgent: stderr. File "/greengrass/v2/work/aws.iot.lookoutvision.EdgeAgent/env/lib/python3.8/site-packages/dlr/api.py", line 88, in __init__. {scriptName=services.aws.iot.lookoutvision.EdgeAgent.lifecycle.run.script, serviceName=aws.iot.lookoutvision.EdgeAgent, currentState=RUNNING}
2024-09-21T08:31:35.939Z [WARN] (Copier) aws.iot.lookoutvision.EdgeAgent: stderr. raise ex. {scriptName=services.aws.iot.lookoutvision.EdgeAgent.lifecycle.run.script, serviceName=aws.iot.lookoutvision.EdgeAgent, currentState=RUNNING}
2024-09-21T08:31:35.940Z [WARN] (Copier) aws.iot.lookoutvision.EdgeAgent: stderr. File "/greengrass/v2/work/aws.iot.lookoutvision.EdgeAgent/env/lib/python3.8/site-packages/dlr/api.py", line 85, in __init__. {scriptName=services.aws.iot.lookoutvision.EdgeAgent.lifecycle.run.script, serviceName=aws.iot.lookoutvision.EdgeAgent, currentState=RUNNING}
2024-09-21T08:31:35.940Z [WARN] (Copier) aws.iot.lookoutvision.EdgeAgent: stderr. self._impl = DLRModelImpl(model_path, dev_type, dev_id, error_log_file, use_default_dlr). {scriptName=services.aws.iot.lookoutvision.EdgeAgent.lifecycle.run.script, serviceName=aws.iot.lookoutvision.EdgeAgent, currentState=RUNNING}
2024-09-21T08:31:35.940Z [WARN] (Copier) aws.iot.lookoutvision.EdgeAgent: stderr. File "/greengrass/v2/work/aws.iot.lookoutvision.EdgeAgent/env/lib/python3.8/site-packages/dlr/dlr_model.py", line 79, in __init__. {scriptName=services.aws.iot.lookoutvision.EdgeAgent.lifecycle.run.script, serviceName=aws.iot.lookoutvision.EdgeAgent, currentState=RUNNING}
2024-09-21T08:31:35.940Z [WARN] (Copier) aws.iot.lookoutvision.EdgeAgent: stderr. self._check_call(self._lib.CreateDLRModel(byref(self.handle),. {scriptName=services.aws.iot.lookoutvision.EdgeAgent.lifecycle.run.script, serviceName=aws.iot.lookoutvision.EdgeAgent, currentState=RUNNING}
2024-09-21T08:31:35.941Z [WARN] (Copier) aws.iot.lookoutvision.EdgeAgent: stderr. File "/greengrass/v2/work/aws.iot.lookoutvision.EdgeAgent/env/lib/python3.8/site-packages/dlr/dlr_model.py", line 160, in _check_call. {scriptName=services.aws.iot.lookoutvision.EdgeAgent.lifecycle.run.script, serviceName=aws.iot.lookoutvision.EdgeAgent, currentState=RUNNING}
2024-09-21T08:31:35.941Z [WARN] (Copier) aws.iot.lookoutvision.EdgeAgent: stderr. raise DLRError(self._lib.DLRGetLastError().decode('ascii')). {scriptName=services.aws.iot.lookoutvision.EdgeAgent.lifecycle.run.script, serviceName=aws.iot.lookoutvision.EdgeAgent, currentState=RUNNING}
2024-09-21T08:31:35.941Z [WARN] (Copier) aws.iot.lookoutvision.EdgeAgent: stderr. dlr.dlr_model.DLRError. {scriptName=services.aws.iot.lookoutvision.EdgeAgent.lifecycle.run.script, serviceName=aws.iot.lookoutvision.EdgeAgent, currentState=RUNNING}
2024-09-21T08:31:35.941Z [WARN] (Copier) aws.iot.lookoutvision.EdgeAgent: stderr. {scriptName=services.aws.iot.lookoutvision.EdgeAgent.lifecycle.run.script, serviceName=aws.iot.lookoutvision.EdgeAgent, currentState=RUNNING}
</code>
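The traceback in that log is hard to read because every line is wrapped in Greengrass metadata. Below is a minimal sketch for stripping the wrapper, assuming the line layout seen in these excerpts (timestamp, level, logger, component name, an optional `stderr.`/`stdout.` marker, the message followed by a period, then a `{key=value}` block). The names `LINE_RE`, `parse_line` and `stderr_text` are my own and not part of any Greengrass tooling.

```python
import re

# Layout assumed from the excerpts in this post:
# "<timestamp> [LEVEL] (Logger) <component>: [stderr. ]<message>. {key=value, ...}"
LINE_RE = re.compile(
    r"^(?P<ts>\S+) \[(?P<level>\w+)\] \((?P<logger>[^)]*)\) "
    r"(?P<component>.+?): (?P<body>.*?) ?\{(?P<context>[^{}]*)\}$"
)


def parse_line(line):
    """Split one wrapped log line into its parts, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    body = match.group("body")
    stream = None
    for name in ("stdout", "stderr"):
        if body.startswith(name + "."):
            stream = name
            body = body[len(name) + 1:].lstrip()
            break
    # The excerpts show one extra "." appended after each message.
    if body.endswith("."):
        body = body[:-1]
    return {
        "timestamp": match.group("ts"),
        "level": match.group("level"),
        "component": match.group("component"),
        "stream": stream,
        "message": body,
    }


def stderr_text(lines):
    """Join the stderr messages of the given log lines into plain text."""
    records = (parse_line(line) for line in lines)
    return "\n".join(r["message"] for r in records if r and r["stream"] == "stderr")
```

Passing the lines of the log file to `stderr_text` should give back the Python traceback without the `{scriptName=...}` suffixes, which makes it easier to see that the `DLRError` is raised with an empty message.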
And here is my /greengrass/v2/logs/[ModelName].log:
<code>2024-09-21T08:14:14.518Z [WARN] (Copier) [ModelName]: stderr. raise _InactiveRpcError(state). {scriptName=services.[ModelName].lifecycle.Startup.Script, serviceName=[ModelName], currentState=STARTING}
2024-09-21T08:14:14.518Z [WARN] (Copier) [ModelName]: stderr. grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:. {scriptName=services.[ModelName].lifecycle.Startup.Script, serviceName=[ModelName], currentState=STARTING}
2024-09-21T08:14:14.519Z [WARN] (Copier) [ModelName]: stderr. status = StatusCode.UNAVAILABLE. {scriptName=services.[ModelName].lifecycle.Startup.Script, serviceName=[ModelName], currentState=STARTING}
2024-09-21T08:14:14.519Z [WARN] (Copier) [ModelName]: stderr. details = "failed to connect to all addresses". {scriptName=services.[ModelName].lifecycle.Startup.Script, serviceName=[ModelName], currentState=STARTING}
2024-09-21T08:14:14.519Z [WARN] (Copier) [ModelName]: stderr. debug_error_string = "{"created":"@1726906454.511841809","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3134,"referenced_errors":[{"created":"@1726906454.511839313","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}". {scriptName=services.[ModelName].lifecycle.Startup.Script, serviceName=[ModelName], currentState=STARTING}
2024-09-21T08:14:14.519Z [WARN] (Copier) [ModelName]: stderr. >. {scriptName=services.[ModelName].lifecycle.Startup.Script, serviceName=[ModelName], currentState=STARTING}
2024-09-21T08:14:17.143Z [WARN] (Copier) [ModelName]: stderr. pid-21123 thread-548325888016 - [INFO]: Creating model info.. {scriptName=services.[ModelName].lifecycle.Startup.Script, serviceName=[ModelName], currentState=STARTING}
2024-09-21T08:14:17.145Z [WARN] (Copier) [ModelName]: stderr. pid-21123 thread-548325888016 - [INFO]: Registering model LyraModel(model_component_name='[ModelName]', lyra_model_arn='arn...', model_asset_path='/greengrass/v2/packages/artifacts/[ModelName]/2.0.1/greengrass_model_component.zip').. {scriptName=services.[ModelName].lifecycle.Startup.Script, serviceName=[ModelName], currentState=STARTING}
2024-09-21T08:14:17.149Z [WARN] (Copier) [ModelName]: stderr. pid-21123 thread-548325888016 - [INFO]: Model registered.. {scriptName=services.[ModelName].lifecycle.Startup.Script, serviceName=[ModelName], currentState=STARTING}
2024-09-21T08:14:17.215Z [INFO] (Copier) [ModelName]: Startup script exited. {exitCode=0, serviceName=[ModelName], currentState=STARTING}
</code>
I’ve tried removing and recreating the Greengrass core component and reinstalling the Greengrass nucleus software. The only other thing I can think of is that the Lookout for Vision model is in us-east-2 while the Greengrass component and core device are in us-east-1. Though it seems like I would have to redo all my annotations in order to get the model into us-east-1 as well…
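On the user/group suspicion from the top of the post: here is a minimal sketch of the check I have in mind, assuming the components run as the default Greengrass accounts `ggc_user` and `ggc_group` (I have not confirmed that IT left those defaults in place). The helpers `account_status` and `path_owner_and_mode` are hypothetical names of mine. They only report whether the accounts exist and who owns a given path, so they cannot by themselves prove that permissions are the cause.

```python
import grp
import os
import pwd
import stat


def account_status(user="ggc_user", group="ggc_group"):
    """Return (user_exists, group_exists) for the given account names."""
    try:
        pwd.getpwnam(user)
        user_exists = True
    except KeyError:
        user_exists = False
    try:
        grp.getgrnam(group)
        group_exists = True
    except KeyError:
        group_exists = False
    return user_exists, group_exists


def path_owner_and_mode(path):
    """Return (uid, gid, permission bits as an octal string) for a path."""
    info = os.stat(path)
    return info.st_uid, info.st_gid, oct(stat.S_IMODE(info.st_mode))
```

The idea would be to call `account_status()` on the Jetson and then `path_owner_and_mode()` on the model artifact path that appears in the [ModelName] log, and compare the result against what the machine looked like before the wipe, if anyone still has that.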
My manager and I thank you for the help!!