I’m currently writing a Sprite Sheet Unpacker such as Alferds Spritesheet Unpacker. Now, before this is sent to gamedev, this isn’t necessarily about games. I would like to know how to detect a sprite within a spriitesheet, or more abstactly, a shape inside of an image.
Given this sprite sheet:
I want to detect and extract all individual sprites. I’ve followed the algorithm detailed in Alferd’s Blog Post which goes like:
- Determine predominant color and dub it the BackgroundColor
- Iterate over each pixel and check ColorAtXY == BackgroundColor
- If false, we’ve found a sprite. Keep going right until we find a BackgroundColor again, backtrack one, go down and repeat until a BackgroundColor is reached. Create a box from location to ending location.
- Repeat this until all sprites are boxed up.
- Combined overlapping boxes (or within a very short distance)
- The resulting non-overlapping boxes should contain the sprite.
This implementation is fine, especially for small sprite sheets. However, I find the performance too poor for larger sprite sheets and I would like to know what algorithms or techniques can be leveraged to increase the finding of sprites.
A second implementation I considered, but have not tested yet, is to find the first pixel, then use a backtracking algorithm to find every connected pixel. This should find a contiguous sprite (breaks down if the sprite is something like an explosion where particles are no longer part of the main sprite). The cool thing is that I can immediately remove a detected sprite from the sprite sheet.
Any other suggestions?
5
After a little browsing, I created my own solution a few months back but failed to post it here in the interest of anybody that would like to do the same. Therefore, I will quickly describe the method before showing the code. It’s in kotlin, but if you’re familiar with Java and/or AS3 you should be able to understand the syntax.
The Method
- Scan the entire image and store each color in a table.
- Determine which color shows up most often- this is the background color
- Start at the top left and scan each line.
- If a pixel does not match the background color, then it is a sprite.
- Use a backtracking algorithm to “fill out” the entire sprite.
- Create a rectangle out of this plot.
- Store it and cut the sprite out of the image.
- Repeat 4-7 until finished.
The Code
The source is available on github, and here is the code w/o documentation or imports.
fun unpack(spriteSheet: Path): List<Image> {
Preconditions.checkArgument(Files.exists(spriteSheet),
"The file ${spriteSheet.getFileName()} does not exist")
logger.debug("Loading sprite sheet.")
// TODO: Convert to png so we have an alpha layer to work with
val spriteSheetImage = readImage(spriteSheet).toBufferedImage()
logger.debug("Determining most probable background color.")
val backgroundColor = determineProbableBackgroundColor(spriteSheetImage)
logger.debug("The most probable background color is $backgroundColor")
return findSprites(spriteSheetImage, backgroundColor) map(::copySubImage.bindFirst(spriteSheetImage))
}
private fun findSprites(image: BufferedImage,
backgroundColor: Color): List<Rectangle> {
val workingImage = copy(image)
val spriteRectangles = ArrayList<Rectangle>()
for (pixel in workingImage) {
val (point, color) = pixel
if (color != backgroundColor) {
logger.debug("Found a sprite starting at (${point.x}, ${point.y})")
val spritePlot = findContiguous(workingImage, point) { it != backgroundColor }
val spriteRectangle = Rectangle(spritePlot, image)
logger.debug("The identified sprite has an area of ${spriteRectangle.width}x${spriteRectangle.height}")
spriteRectangles.add(spriteRectangle)
eraseSprite(workingImage, backgroundColor, spritePlot)
}
}
logger.info("Found ${spriteRectangles.size()} sprites.")
return spriteRectangles
}
private fun findContiguous(image: BufferedImage, point: Point, predicate: (Color) -> Boolean): List<Point> {
val unvisited = LinkedList<Point>()
val visited = HashSet<Point>()
unvisited.addAll(neighbors(point, image) filter { predicate(Color(image.getRGB(it.x, it.y))) })
while (unvisited.isNotEmpty()) {
val currentPoint = unvisited.pop()
val currentColor = Color(image.getRGB(currentPoint.x, currentPoint.y))
if (predicate(currentColor)) {
unvisited.addAll(neighbors(currentPoint, image) filter {
!visited.contains(it) && !unvisited.contains(it) &&
predicate(Color(image.getRGB(it.x, it.y)))
})
visited.add(currentPoint)
}
}
return visited.toList()
}
private fun neighbors(point: Point, image: Image): List<Point> {
val points = ArrayList<Point>()
if (point.x > 0)
points.add(Point(point.x - 1, point.y))
if (point.x < image.getWidth(null) - 1)
points.add(Point(point.x + 1, point.y))
if (point.y > 0)
points.add(Point(point.x, point.y - 1))
if (point.y < image.getHeight(null) - 1)
points.add(Point(point.x, point.y + 1))
return points
}
However, the code is not yet complete- packing and creating the metafile is missing- and has been languishing in my Github since March, but I’ll hopefully complete it one day.
I implemented a very decent and fast sprite detection algorithm but different to above and… superior in speed and accuracy (10 seconds on I7 vs ~ instant on a bitmap 2048 * 2048).
because of the way the standard VB rectangle structure worked I had to roll my own. reason being for theirs Right = Left + Width. I needed it so Left could be equal to Right. In my case for a single pixel at (0, 0) { Left = 0, Right = 0, Top = 0, Bottom = 0, Width = 1, Height = 1 }. However for the standard one this would have deemed the rectangle as “empty”.
Firstly if extracting off a JPG then the background has a bit of noise so need to use a color tolerance algorithm when determining if a pixel is background or other.
This algorithm assume sprites are randomly over the sheet and have background around them such that a rectangle can enclose single sprites without overlapping other sprites. It however WON’T work for example if the sheet has sprites that are 32*32 and are packed 32*32 with no gaps.
1) Let user click on a pixel that is deemed the background (In testing I just assumed pixel at (0, 0) was the background). Tightly packed sprite sheets can have less background pixels than other colors and again JPG noise will skew if looking for most frequent color as advised above.
2) Once a background color is decided and a tolerance chosen (say 32) take a clone of the sprite sheet and iterate over every pixel ONCE. Set the pixel values to 0 if they meet the background color tolerance otherwise … 1. Now you end up with a clone with 0’s and 1’s which will be a LOT easier (and faster) to deal with below.
3) Now iterate over each pixel in the modified clone once. Top to bottom, left to right, order not really important now.
4) When you hit a non-background pixel (ie 1) use a flood-fill algorithm to erase the pixels to 0. I used a custom one that was very fast and offered me exact control. The custom flood-fill also (importantly) allowed me to return the bounding rectangle of the fill operation.
5) The bounding rectangle of every flood-fill is added to a list of rectangles.
6) At the end of scanning every pixel you should have a list of rectangles. On a test .PNG file with 63 sprites the original algorithm had > 10,000 rectangles that then needed to be merged. The flood-fill one had… “drum roll” 88.
7) The list of rectangles will be close but may have “orphans”. Iterate through all like a sort algorithm so you compare each element to all the others. If they intersect merge them (union) so 2 rectangles becomes 1. (to merge I take the bottom rectangle on the list and put it in place of the 2nd one I compared. The first one is modified and the list count is decremented by 1. Now… if they are touching they are again merged ONLY if they meet the maximum area condition.
1) The color tolerance
2) The rectangle merge maximum size. When 2 rectangles are intersecting they’re always merged (union) however when 2 rectangles are next to each other with no pixel gap then they are only merged if their internal area is < some cutoff. This way small orphans get joined to a larger rectangles but larger rectangles don’t get merged.
3) Flood fill, has an option. When it iterates looking for the next pixel it looks up, right, down, left. However if desired it can also look on the diagonals as well.
NOTE … To check if 2 rectangles are next to each other use the code below with expand=1.
Public Function IntersectsWith(rectb As strRect, expand As Integer) As Boolean
If isEmpty Or rectb.isEmpty Then Return False
If Not (m_top - expand <= rectb.m_bottom + expand) Then Return False
If Not (m_bottom + expand >= rectb.m_top - expand) Then Return False
If Not (m_left - expand <= rectb.m_right + expand) Then Return False
If Not (m_right + expand >= rectb.m_left - expand) Then Return False
Return True
End Function
Bit of code that sets the clone pixels to either 1 or 0 :-
m_bmp = DirectCast(p_bmp.Clone(), Bitmap)
Dim lock As BitmapData = BMPLock(m_bmp)
Dim dword_bgcolor As Integer = bgcolor.ToArgb
' first smash the copied bitmap into either BG or non bg so can search exact
For i As Integer = 0 To lock.Width * lock.Height - 1
If BMPColorsAreSimilar(BMPGetPixelFastDWORD(lock, i), dword_bgcolor, tol) > 0 Then ' bg
BMPSetPixelFast(lock, i, 0)
Else
BMPSetPixelFast(lock, i, 1)
End If
Next
Main loop that finds pixels to flood fill :-
Dim rect As strRect
Dim temp_stack As New Stack ' fill algo uses this... passing it means not allocated every call
For y As Integer = 0 To lock.Height - 1
For x As Integer = 0 To lock.Width - 1
If BMPGetPixelFastDWORD(lock, x, y) = 1 Then ' hit a sprite so erase it with flood fill to get the bounds
BMPFloodFill(lock, temp_stack, filldiag, x, y, 0)
rect.SetBounds(BMPFloodFillLastMinX(), BMPFloodFillLastMaxX(), BMPFloodFillLastMinY, BMPFloodFillLastMaxY)
m_bounds.Add(rect) ' add to our list of rectangles
End If
Next
Next
Hope this helps anyone. Very fun little algorithm.
My code for the flood fill is below… It’s in managed C++ and called via VB.NET. It uses a stack of coordinates to backtrack when looking for new paths of pixels rather than being recursive. No idea if other people make flood fills like this… but it’s fast and accurate and unlike the boxed in ones works on a 32 bit locked surface and offers full control.
int fill_dx[] = { 0, -1, 0, 1, -1, -1, 1, 1 };
int fill_dy[] = { -1, 0, 1, 0, -1, 1, -1, 1 };
int fill_minx;
int fill_miny;
int fill_maxx;
int fill_maxy;
BYTE *fill_track = 0;
DWORD fill_track_size = 0;
Int32 MCUMakerSupport::MCUBitmap::BMPFloodFill(BitmapData ^%bmp_lock, Stack ^%mCods, Int32 diag, Int32 x, Int32 y, Int32 col)
{
diag = diag ? 8 : 4;
if (!bmp_lock)
{
return 0;
}
int w = bmp_lock->Width;
int h = bmp_lock->Height;
if (x < 0 || x >= w || y < 0 || y >= h) return 0;
refill:
if (!fill_track)
{
fill_track = (BYTE *)malloc(w * h);
fill_track_size = w * h;
}
else
{
if (fill_track_size < w * h)
{
free(fill_track);
fill_track = 0;
fill_track_size = 0;
goto refill;
}
}
if (!fill_track)
{
fill_track_size = 0;
return 0;
}
memset(fill_track, 0, w * h);
//BitmapData ^bmp_lock = bmp->LockBits(Rectangle(0, 0, w, h), ImageLockMode::ReadWrite, bmp->PixelFormat);
DWORD *fill_addr = (DWORD *)bmp_lock->Scan0.ToPointer();
DWORD fill_color = col;
DWORD target_color = fill_addr[y * w + x];
//Stack ^mCods = gcnew Stack();
Point pt;
pt.X = x;
pt.Y = y;
mCods->Clear();
int filled = 1;
fill_minx = fill_maxx = x;
fill_miny = fill_maxy = y;
fill_addr[pt.Y * w + pt.X] = fill_color;
fill_track[pt.Y * w + pt.X] = 255;
mCods->Push(pt);
while (mCods->Count > 0)
{
recheck:
for (int i = 0; i < diag; i++)
{
int cx = pt.X + fill_dx[i];
if (cx < 0 || cx >= w) continue;
int cy = pt.Y + fill_dy[i];
if (cy < 0 || cy >= h) continue;
bool res = false;
if (!fill_track[cy * w + cx])
{
fill_track[cy * w + cx]++;
DWORD c = fill_addr[cy * w + cx];
if (c == target_color)
{
res = true;
fill_addr[cx + cy * w] = fill_color;
if (cx < fill_minx) { fill_minx = cx; }
if (cy < fill_miny) { fill_miny = cy; }
if (cx > fill_maxx) { fill_maxx = cx; }
if (cy > fill_maxy) { fill_maxy = cy; }
}
}
if (res) // fill?
{
mCods->Push(pt);
filled++;
pt.Y = cy;
pt.X = cx;
goto recheck;
}
}
pt.X = ((Point ^)mCods->Peek())->X;
pt.Y = ((Point ^)mCods->Peek())->Y;
mCods->Pop();
}
mCods->Clear();
return filled; // number of pixels filled
}