Holes in ggplot polygons

Posted on July 13, 2016

As part of my efforts to construct an R package connecting ggplot and Spatial objects, I came across an issue with ggplot involving holes in polygons. According to the collective knowledge of Stack Overflow, it’s possible to make holes render correctly by putting the coordinates in the correct clockwise/counterclockwise order, extending geom_polygon() somehow, or doing other really complicated things. But for my simple example none of this worked, and it appears not to work for many examples where spatial data is involved.

devtools::install_github("paleolimbot/ggspatial")
library(ggplot2) # for ggplot(), geom_polygon() and fortify()
library(ggspatial)
data("longlake_waterdf")

spdf <- longlake_waterdf[is.na(longlake_waterdf$label),]
ggplot(spdf) + geom_polygon()

it’s all wrong!

Of course, ggplot has to turn the SpatialPolygonsDataFrame into a data.frame somehow, and it turns out that this happens using the fortify() function.

df <- fortify(spdf)
head(df)
      long     lat order  hole piece id group
1 412583.6 5086360     1 FALSE     1  2   2.1
2 412585.6 5086360     2 FALSE     1  2   2.1
3 412588.5 5086356     3 FALSE     1  2   2.1
4 412588.4 5086350     4 FALSE     1  2   2.1
5 412585.2 5086343     5 FALSE     1  2   2.1
6 412583.1 5086335     6 FALSE     1  2   2.1
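To make the column layout concrete without the lake data, here is a hand-built data frame in the same shape as the fortify() output: a square with a square hole. The coordinates and the name toy are made up for illustration, and building group by pasting id and piece together is my assumption about how the rings are labelled, based on the output above.

```r
# A made-up feature: a 4 x 4 square (piece 1) with a 2 x 2 hole (piece 2).
outer <- data.frame(long = c(0, 0, 4, 4, 0), lat = c(0, 4, 4, 0, 0),
                    hole = FALSE, piece = 1)
inner <- data.frame(long = c(1, 3, 3, 1, 1), lat = c(1, 1, 3, 3, 1),
                    hole = TRUE, piece = 2)
toy <- rbind(outer, inner)
toy$order <- seq_len(nrow(toy))
toy$id <- "1"
# Assumption: group is "<id>.<piece>", matching values like 2.1 above.
toy$group <- paste(toy$id, toy$piece, sep = ".")
toy <- toy[, c("long", "lat", "order", "hole", "piece", "id", "group")]

# One id for the whole feature, one group per ring.
unique(toy$id)
unique(toy$group)
```

The point to notice is that id identifies the feature as a whole, while group identifies each ring (outline or hole) within it.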

So more specifically, our call to ggplot is really more like this:

ggplot(df, aes(x=long, y=lat)) + 
    geom_polygon(aes(group=id), fill="lightblue") + 
    geom_path(aes(group=group))

In the resulting plot the fill is wrong, but the outline drawn by geom_path() is correct, because it uses group=group as its aesthetic and so draws each ring as its own path.

It turns out that the secret is actually a bit of a workaround, suggested by this answer on Stack Overflow. The poster didn’t get credit for the answer, but got plenty of upvotes. Basically, if you come back to the same point after every hole, you fix the fill problem. So the solution is to insert the first point of the feature before each subsequent ring of the polygon. Finding the first row of every ring is easy if it’s just one feature (which our example is), so we’ll start there.

ringstarts <- which(!duplicated(df$group))
df[ringstarts, ]
        long     lat order  hole piece id group
1   412583.6 5086360     1 FALSE     1  2   2.1
320 412375.3 5086133   320  TRUE     2  2   2.2
431 412076.0 5085796   431  TRUE     3  2   2.3
502 412477.7 5086260   502  TRUE     4  2   2.4
581 412317.0 5085906   581  TRUE     5  2   2.5
676 412310.1 5086123   676  TRUE     6  2   2.6
714 412234.5 5085986   714  TRUE     7  2   2.7

Now if we manually insert the first row in front of each of the rings, we can see that our fill plots properly.

df2 <- df[c(1:319, 
          1, 320:430, 1, 431:501, 1, 502:580, 
          1, 581:675, 1, 676:713, 1, 714:nrow(df)),]
ggplot(df2, aes(x=long, y=lat)) + 
    geom_polygon(aes(group=id), fill="lightblue") + 
    geom_path(aes(group=group))

Programmatically coming up with this vector of row numbers took quite a bit of experimentation, but with a combination of c(), lapply(), and do.call() it looks like this does the trick:

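fixfeature <- function(df) {
  # first row of each ring (one ring per value of group)
  ringstarts <- which(!duplicated(df$group))
  if(length(ringstarts) < 2) {
    # a single ring has nothing to fix
    return(df)
  } else {
    # append nrow(df) so the last ring has an end marker
    ringstarts <- c(ringstarts, nrow(df))
    # first ring as-is, then row 1 followed by each later ring;
    # the last row is added separately because each range stops one short of the next marker
    indices <- c(1:(ringstarts[2]-1), do.call(c, lapply(2:(length(ringstarts)-1), function(x) {
        c(1, ringstarts[x]:(ringstarts[x+1]-1))
    })), nrow(df))
    return(df[indices,])
  }
}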
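
For comparison, the same index vector can be built with split() rather than do.call(). This is only a sketch of an alternative, not the function used in this post: the name fixfeature_split and the toy data are hypothetical, and it assumes the rows of each ring are contiguous, as they appear to be in the fortify() output.

```r
# Hypothetical alternative to fixfeature(): group row numbers by ring,
# then put row 1 in front of every ring except the first.
fixfeature_split <- function(df) {
  ring_levels <- unique(df$group)  # keep rings in order of appearance
  rings <- split(seq_len(nrow(df)), factor(df$group, levels = ring_levels))
  if (length(rings) < 2) {
    return(df)
  }
  rest <- lapply(rings[-1], function(rows) c(1L, rows))
  df[c(rings[[1]], unlist(rest, use.names = FALSE)), ]
}

# Made-up single feature: a square (group "1.1") with a hole (group "1.2").
toy <- data.frame(
  long = c(0, 0, 4, 4, 0, 1, 3, 3, 1, 1),
  lat = c(0, 4, 4, 0, 0, 1, 1, 3, 3, 1),
  group = rep(c("1.1", "1.2"), each = 5)
)
fixed <- fixfeature_split(toy)
nrow(fixed)  # one more row than toy: a copy of row 1 before the hole
fixed[6, ]   # same coordinates as row 1
```

On a small example like this it is easy to check by eye that the extra row sits exactly between the outer ring and the hole.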

Because this only works with a single feature, we need to invoke dplyr to split our original data.frame up by feature and apply the fixfeature() function to each piece.

library(dplyr)
custom_fortify <- function(x, ...) {
  df <- fortify(x, ...)
  df %>% group_by(id) %>% do(fixfeature(.))
}
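
If you would rather not depend on dplyr, the same split-and-recombine step can be sketched in base R. The name apply_per_feature, the stand-in fix function, and the toy data are all hypothetical; in practice you would pass fixfeature as the second argument. Note that split() orders the pieces by sorted id, which may differ from the original row order.

```r
# Hypothetical base R version of group_by(id) %>% do(...):
# split by feature id, fix each feature, and stack the results.
apply_per_feature <- function(df, fix) {
  pieces <- lapply(split(df, df$id), fix)
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL  # drop the "id.row" names that rbind() creates
  out
}

# Stand-in for fixfeature(): repeat the feature's first row at the end.
repeat_first <- function(feature) feature[c(seq_len(nrow(feature)), 1), ]

# Made-up data: two features with three points each.
toy <- data.frame(long = 1:6, lat = 6:1, id = rep(c("a", "b"), each = 3))
result <- apply_per_feature(toy, repeat_first)
table(result$id)  # each feature gains one row
```

Passing the fix as an argument keeps the sketch independent of any particular repair function.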

This appears to work on any multi-part geometry, whether it involves a hole or not. The classic wrld_simpl dataset (from the maptools package) looks particularly bad when plotted without any type of conversion.

library(maptools)
data(wrld_simpl)
wrld_df <- fortify(wrld_simpl)
ggplot(wrld_df, aes(x=long, y=lat)) + geom_polygon()

But if we use our custom_fortify() function, things look much prettier.

wrld_df <- custom_fortify(wrld_simpl)
ggplot(wrld_df, aes(x=long, y=lat, group=id)) + geom_polygon()

If we want to add outlines, geom_path() needs group=group rather than group=id, but it still works:

ggplot(wrld_df, aes(x=long, y=lat, group=id)) + 
    geom_polygon() + geom_path(aes(group=group))

And of course, the whole point of this was to roll it into the ggspatial package, so the easy way to go about this would be using geom_spatial() (but that would be cheating…).

library(ggspatial) # devtools::install_github("paleolimbot/ggspatial") if you don't have it
data(longlake_waterdf)
ggplot() + geom_spatial(longlake_waterdf, aes(fill=label, col=area)) + 
    coord_fixed()

There you go! A generic solution (hopefully!) to all your holes-in-polygons needs.
