I am using OpenCV to determine the landing location of an object, viewed by a camera from a top-down position. I have already gotten the code for the object tracking to work, so that is not a problem. The issue I am having is with returning the location where the object lands.
Because this is a work-related question I have made the background of the captured image white.
The target is a large-diameter ring that is not very thick. What I am looking to do is figure out whether the object landed on the inner portion, center portion, or outer portion of that ring.
The total landing area is a large ring with 3 zones (A, B, C) every five degrees, for a total of 216 possible landing locations (image 1). The only thing the camera sees is the ring. The sections/zones and their sizes are defined in the code. By tracking the center point of the object contour I know the [x, y] coordinates (in the frame) where the object lands. In my example the object lands in the 35-degree section, zone C. The problem I am having is that I don’t know the best way to calculate and store all possible coordinates for each section/zone combination in such a way that when the object lands the program will output “The object landed in Section __ , zone __”.
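To make the layout concrete, here is a minimal sketch of how I picture the sections and zones being indexed; the names and the center-angle labelling below are placeholders for illustration, not my actual code:

SECTION_WIDTH_DEG = 5
NUM_SECTIONS = 360 // SECTION_WIDTH_DEG       # 72 angular sections
ZONES = ('A', 'B', 'C')                       # inner, center, outer zone of the ring
NUM_LOCATIONS = NUM_SECTIONS * len(ZONES)     # 216 possible landing locations

# Each section is labelled by its center angle, e.g. the 35-degree section
# spans 32.5 to 37.5 degrees.
section_bounds = {
    i * SECTION_WIDTH_DEG: ((i * SECTION_WIDTH_DEG) - SECTION_WIDTH_DEG / 2,
                            (i * SECTION_WIDTH_DEG) + SECTION_WIDTH_DEG / 2)
    for i in range(NUM_SECTIONS)
}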
Each of the 216 locations is made up of some number of pixels, each with its own [x, y] coordinates. My thought was to iterate over all 216 locations and store the coordinates that make up each one, so that when I get the final position of the object I can compare it against the stored coordinates and return the location it landed in. For this I was considering a dictionary of dictionaries that stores all 72 sections, the three zones within each section, and all possible coordinate pairs within each zone. The reason I thought to use a dictionary is that I can easily return the keys (section and zone name) at the end.
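As a rough illustration of the structure I have in mind (the key names and the lookup helper below are placeholders, not working code from my project):

# Outer keys are the 72 sections (by center angle), inner keys are the three
# zones, and the values hold the [x, y] pixel coordinates belonging to that zone.
zone_pixels = {section_deg: {'A': set(), 'B': set(), 'C': set()}
               for section_deg in range(0, 360, 5)}

# Once the sets are filled, the lookup at landing time would be something like:
def find_location(x, y):
    for section_deg, zones in zone_pixels.items():
        for zone_name, pixels in zones.items():
            if (x, y) in pixels:
                return section_deg, zone_name
    return None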
To test whether I could accurately store all the pixels, I have written code to color in a section using the equations for all points along the arc. It has not worked well: when I color those pixels I end up both repeating many pixels and missing others (image 2). I tried revising the code to use Bresenham’s algorithm to fix this, but that did not work either (image 3).
A few notes that may be worthwhile:
- This program is not a consumer-facing project; it will be used for in-house testing, so it does not need to be especially performant or efficient, although that would still be nice. What matters is how accurately it returns the landing location.
- The target and camera will always be fixed.
- Once the object lands, all tracking stops and then the landing location is determined.
My questions are as follows:
- Is what I described the correct approach to returning where the object lands (i.e. storing the pixel coordinates of the defined zones in a dictionary of dictionaries to be compared against later)?
- Is there a better way to do this? Is there a way to easily define 216 regions of interest that are irregular in shape (they are really just sections of an annulus)? Are there any libraries I can import that would make this easier?
- How can I accurately, and without repeats, plot or draw these sections pixel by pixel so that I can be sure I am not missing any points?
Here is a sample of the code for plotting the pixels (using the non-Bresenham method):
import math
import numpy as np
import cv2

# radius_inner, ring_thickness, r_out, center_loc, overlay, alpha and
# TargetImg_Resized are defined earlier in the full program.
step_size = 0.1
angle_start = 32.5
angle_end = angle_start + 398.2 + step_size
r_in_end = radius_inner + ring_thickness / 2
r_out_in = r_out - ring_thickness
print(f'Inner radius (float): {r_out_in} | Inner radius (int): {int(r_out_in)}')
r_out_end = r_out
print(f'Outer radius (float): {r_out_end} | Outer radius (int): {int(r_out_end)}')
angle_value = np.arange(angle_start, angle_end, 0.2)
# print(angle_value)
# print(f'angle_value: {angle_value}')
# print(f'angle_value size: {len(angle_value)}')
section_length = r_out_end - r_out_in + 2
# Calculate the size of the outer loop (height) and the inner loop (width),
# then initialize an empty array of that size (height x width) to fill.
width = int(r_out_end + 1.5 - r_out_in - .5)
height = len(angle_value)
pixel_locations = np.zeros((width * height, 2), dtype=np.int32)
# Initialize variables for storing points in the array and checking for repeats
index = 0
px_match = 0
prev_pt = (0, 0)
# Nest the loops to color a pixel at every point along the arc,
# for every step from the inner section radius to the outer section radius.
for r in range(int(r_out_in + .5), int(r_out_end + 1.5)):
    for angle in angle_value:
        # Convert the angle to radians and rename it theta for readability,
        # then calculate the x and y position of the pixel along the arc.
        theta = math.radians(angle)
        x = int(center_loc[0] + math.sin(theta) * r)
        y = int(center_loc[1] + math.cos(theta) * r)
        # Store the x and y coordinates in the array and increment the index
        pixel_locations[index] = [x, y]
        index += 1
        # Error checking: we don't want a location to repeat
        if (x, y) == prev_pt:
            # print(f'The current position is: {x, y} | The previous position was: {prev_pt}')
            px_match += 1
        # Draw the new point on the image as a visual check that this is working
        # cv2.circle(overlay, (x, y), radius=1, color=(255, 0, 0), thickness=-1)
        overlay[y, x] = (255, 0, 0)
        # Store the current location as the previous point for the next iteration
        prev_pt = (x, y)
# Print out the number of repeats
print(f'The number of overlapping pixels is: {px_match}')
Target_transparent = cv2.addWeighted(overlay, alpha, TargetImg_Resized, 1 - alpha, 0)