I have a rasterized image that contains two colors: white and black. The black portion is entirely connected and looks like a big blob. Is there a good way to estimate the length of the boundary between the regions?
One method I have considered is to simply count how many “edges” of the rasterized grid have a different color on each side. However, for e.g. a purely diagonal line this will overestimate the length of the boundary by a factor of sqrt(2). Is there a better method for estimating the boundary length which does not have this problem?
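For reference, here is a rough sketch (Python/NumPy, my own names) of the edge-counting estimate I have in mind, applied to a rasterized diagonal boundary; it shows the sqrt(2) overestimate:

```python
import numpy as np

n = 200
rows, cols = np.indices((n, n))
img = (rows > cols).astype(np.uint8)   # 1 = black, 0 = white; boundary is the diagonal

# Count grid edges whose two adjacent pixels have different colors
# (vertical neighbours plus horizontal neighbours).
edge_count = np.count_nonzero(img[1:, :] != img[:-1, :]) \
           + np.count_nonzero(img[:, 1:] != img[:, :-1])

true_length = n * np.sqrt(2)            # length of the diagonal boundary
print(edge_count / true_length)         # ~1.41, i.e. an overestimate by sqrt(2)
```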
The blog post “Measuring boundary length” by Cris Luengo discusses precisely this problem and describes several solutions of increasing sophistication. In case the blog goes down, here is a permanent reference to the best one discussed in the post:
Vossepoel and Smeulders, “Vector code probability and metrication error in the representation of straight lines of finite length” (Computer Graphics and Image Processing 20(4):347-364, 1982).
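To give a flavour of the kind of estimator discussed there: the boundary is encoded as an 8-connected Freeman chain code, and the length is estimated from the counts of even (axis-parallel) steps, odd (diagonal) steps, and corners. A minimal sketch follows; the coefficients are the ones commonly quoted for Vossepoel–Smeulders, so verify them against the paper or the blog post before relying on them.

```python
import numpy as np

def vossepoel_smeulders_length(chain_code):
    """Estimate boundary length from an 8-connected Freeman chain code (closed contour).

    chain_code: sequence of integers 0..7, one per boundary step.
    Coefficients are those commonly quoted for Vossepoel & Smeulders (1982);
    check them against the paper before relying on them.
    """
    chain = np.asarray(chain_code)
    n_even = np.count_nonzero(chain % 2 == 0)                 # axis-parallel steps
    n_odd = np.count_nonzero(chain % 2 == 1)                  # diagonal steps
    n_corner = np.count_nonzero(chain != np.roll(chain, 1))   # direction changes
    return 0.980 * n_even + 1.406 * n_odd - 0.091 * n_corner
```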
The issue with length overestimation for diagonal lines stems entirely from the approach you are taking, namely to count “edges of the pixel grid which have a different color on each side” and declare the sum a contour length. The two are not equivalent in the general case.
Essentially you are restricting yourself to calculating distances “on the grid” instead of Euclidean (“direct”) distances, or distances smoothed over multiple points, between actual edge points of the contour (not “edges” of the grid). The reason your approach is inaccurate for diagonal lines is that your accuracy criterion involves sub-pixel precision, while your algorithm operates at pixel precision (“edges” of the rasterized grid).
Why not first apply edge detection and get a list of edge points with sub-pixel coordinates? (You can apply any general-purpose sub-pixel edge detection algorithm.) The output of this step is a list of pairs of doubles. It transforms your problem into the sub-pixel domain, where you can calculate distances and meet your accuracy criterion.
After drilling down to edge points at the sub-pixel level, we have two main options:
- to calculate the sum of Euclidean distances between successive edge points (see the sketch after this list). This gives an acceptably accurate contour length, provided we have picked a decent edge-detection algorithm;
- to connect the edge points in a smooth way (e.g. with B-splines) and calculate the sum of arc lengths between successive points according to the interpolating function you have picked.
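A minimal sketch of the first option, assuming you already have the sub-pixel boundary points as an ordered list of (x, y) pairs (the ordering along the contour matters; the function name is my own):

```python
import numpy as np

def polyline_length(points, closed=True):
    """Sum of Euclidean distances between successive sub-pixel boundary points.

    points: (N, 2) array of (x, y) coordinates, ordered along the contour.
    """
    pts = np.asarray(points, dtype=float)
    if closed:
        pts = np.vstack([pts, pts[:1]])   # close the loop
    return np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1))
```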
UPDATE
A simplified approach for step one would be to pick the sub-pixel middle point of each of your eligible “edges” of the rasterized grid and declare it the center of that edge. From there you continue with calculating distances; even with Euclidean distances between center points you will get a much more accurate result than by counting eligible “edges” of the rasterized grid. At least your “overestimation by a factor of sqrt(2)” issue for diagonal lines will be resolved. Hope it helps.
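For completeness, here is a sketch of that simplified variant using scikit-image’s marching-squares routine, which on a binary 0/1 image placed at level 0.5 effectively performs exactly this midpoint construction (this library choice is my own, not part of the approach above):

```python
import numpy as np
from skimage import measure

# img: 2-D array with 1 = black, 0 = white.
img = np.zeros((100, 100), dtype=float)
img[30:70, 20:80] = 1.0   # placeholder blob for illustration

# find_contours at level 0.5 places vertices on the grid edges between
# differing pixels, at sub-pixel (interpolated) positions.
contours = measure.find_contours(img, 0.5)

# Perimeter estimate: sum the polyline lengths of all contour loops
# (a single blob without holes gives one loop).
perimeter = sum(
    np.sum(np.linalg.norm(np.diff(c, axis=0), axis=1))
    for c in contours
)
print(perimeter)
```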
As rwong wrote, you can use findContours to get the contours, if you don’t already have them. Then you can use arcLength to compute the length of the contour.
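A minimal sketch of that pipeline with the OpenCV Python bindings (assuming a 0/255 binary image where the blob is white; the return signature of findContours differs slightly between OpenCV 3.x and 4.x, this is the 4.x form):

```python
import cv2

# Load the binary image; the blob must be nonzero (white) on a black background.
img = cv2.imread("blob.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)

# Retrieve the outer contour; CHAIN_APPROX_NONE keeps every boundary pixel.
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
blob = max(contours, key=cv2.contourArea)   # largest contour = the blob

perimeter = cv2.arcLength(blob, closed=True)
print(perimeter)
```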