Sunday, September 8, 2013

Exploring RPCs

TL;DR: Generating stable RPCs is hard, and I'm not sure how to do it!

RPCs in the context of the geospatial world are equations for describing how a geometrically raw sensor image can be related to a location in the real world.  RPC can variously expand to Rational Polynomial Coefficient or Rapid Positioning Coefficients.  They consist of 80 coefficents for a ratio of two third degree polynomials (for x and y) plus 10 more values for normalizing image and geographic coordinates.

A couple background papers include one by Vincent Tao and others on Understanding the Rational Function Model (pdf), and one by Gene Dial and J. Grodecki titled RPC Replacement Camera Models (pdf).  These date from the mid-2000s, but they reference some earlier work by Gene Dial and J. Grodecki who seem to have played the seminal role in promoting the use of RPCs as generic replacements for rigerous camera models.  I believe the major satellite image vendors have now been delivering RPC models (example) for raw scenes for on the order of a decade. 

I, and others, have added support to GDAL for reading and representing RPCs as metadata over the a number of years, as well as support for evaluating RPCs in gdalwarp as a way of georeferencing images.  At it's very simplest a command like the following can be used to roughly rectify a raw image that has RPCs.

  gdalwarp -rpc raw.tif rectified.tif

In fact, at FOSS4G NA this year, a command like this was the big reveal in Paul Morin's discussion of how the Polar Geospatial Center was able to efficiently process huge numbers of satellite scenes of the polar regions.  I was rather surprised since I wasn't aware of this capability being in significant use.  However the above use example will usually do a relatively poor job.  RPCs encapsulate the perspective view of the satellite and as such you need to know the elevation of an early location in order to figure out exactly where it falls on the raw satellite image.  If you had a relatively flat image at a known elevation you can specify a fixed elevation like this:

  gdalwarp -rpc -to RPC_HEIGHT=257 raw.tif rectified.tif

However, in an area that is not so flat, you really need a reasonably could DEM (digital elevation model) of the region so that distortion effects from terrain can be taken into account.  This is particularly important if the view point is relatively low (ie. for an airphoto) or if the image is oblique rather than nadir (straight down).  You should be able to do this using a command referencing a DEM file for the area like this, though to be honest I haven't tried it recently (ever?) and I'm not sure how well it works:

  gdalwarp -rpc -to RPC_DEM=srtm.tif raw.tif rectified.tif

The above is essentially review from my point of view.  But with my recent move from Google to Planet Labs I have also become interested in how to generate RPCs given knowledge of a camera model, and the perspective view from a satellite.   This would give us the opportunity to deliver raw satellite scenes with a generic description of the geometry that can already be exploited in a variety of software packages, including GDAL, OSSIM, Orfeo Toolbox and proprietary packages like ArcGIS, Imagine and PCI Geomatica.  However, I had no idea how this would be accomplished and while I earned an Honours BMath, I have let the modest amount of math I mastered in the late 80's rot ever since.

So, using what I am good at, I started digging around the web for open source implementations for generating RPCs.  Unfortunately, RPC is a very generic term much more frequently used to mean "Remote Procedure Call" in the software development world.  Nevertheless, after digging I learned that the Orfeo Toolbox (OTB) has a program for generating RPCs from a set of GCPs (ground control points).  Ground control points are just a list of locations on an image where the real world location is known.  That is a list of information like (pixel x, pixel y, longitude, latitude, height).  That was exciting, and a bit of digging uncovered the fact that in fact Orfeo Toolbox's RPC capability is really just a wrapper for services used from OSSIM.  In particular core RPC "solver" is found in the source files ossimRpcSolver.cc, and ossimRpcSolver.h.  They appear to have been authored by the very productive Garrett Potts.

Based on the example of the Orfeo Toolbox code, I wrote a gcps2rpc C++ program that called out to libossim to transform GCPs read from a text file into RPCs written to a GDAL VRT file.

For my first attempt I created 49 GCPs in a 7x7 pattern.  The elevations were all zero (ie. they were on the WGS84 ellipsoids) and they were generated using custom code mapping from satellite image pixels to ground locations on the ellipsoid.  I included simple code in gcps2rpc to do a forward and reverse evaluation of the ground control points to see how well they fit the RPC.  In the case of this initial experiment the forward transformation, (long, lat, height) to (pixel, line) worked well with errors in the order of 1/1000th of a pixel.  Errors in the inverse transformation, (pixel, line, height) to (long, lat) was less good with worst case errors on the order of 1/10th of a pixel.

At first I wasn't sure what to make of the errors in the inverse direction.  1/10th of a pixel wasn't terrible from my perspective, but it was still a fairly dramatic difference from the tiny errors in the forward direction.  I tried comparing the results from evaluating the RPCs with OSSIM and with GDAL.  To test with GDAL I just ran gdaltransform something like this.

  gdaltransform -rpc -i simulated_rgb.vrt
  -85.0495014028 39.7243133317          (entered in terminal)
  2016.00011619623 1792.00003975369 0   (output in terminal)


and:
  gdaltransform -rpc simulated_rgb.vrt
  2016 1792                             (entered in terminal)
  -85.0495014329239 39.7243132735733 0  (output in terminal)


I found that GDAL and OSSIM agreed well in the forward transform, but while they both had roughly similar sized errors in the inverse they were still fairly distinct results.  Reading through the code the reasoning became fairly obvious.  They both use iterative solutions for the inverse transform and start with very different initial guesses, and they have fairly weak requirements for covergence.  In the case of GDAL the convergence threshold was 1/10th of a pixel.  OK, so that leaves me thinking there may be room to improve the inverse functions but accepting that the errors I'm seeing are really all about the iterative convergence. 

Next I tried using the RPCs with gdalwarp to actually "rectify" the image.  Initially I just left a default elevation of zero for gdalwarp, so I used a command like:

  gdalwarp -rpc simulated_rgb.vrt ortho_0.tif

The results seemed plausible but I didn't have a particularly good way to validate it.  My test scene is in Indiana, so I used a lat/long to height web service to determine that the region of my scene had a rough elevation of 254m above sea level (we will treat that as the same as the ellipsoid for now).  So I tried:

  gdalwarp -rpc -to RPC_HEIGHT=254 simulated_rgb.vrt ortho_254.tif

On review of the resulting image, it exactly overlayed the original.  I had been hoping the result would have been slightly different based on the elevation.  After a moments reflection and a review of the RPC which had many zero values in the terms related to elevation I realized I had't provided sufficient inputs to actually model the perspective transform.  I had't given any GCPs at different heights.  Also, the example code made it clear that an elevation sensitive RPC could not be generated with less than 80 GCPs (since the equations have 80 unknowns).

My next step was to produce more than 80 GCPs and to ensure that there was more than one elevation represented.  It seemed like two layers of 49 GCPs - one at elevation zero, and one at an elevation of 300m should be enough to do the trick, so I updated my GCP generating script to produce an extra set with an ellipsoid that was 300m "bigger".  I then ran that through gcps2rpc.

I was pleased to discover that I still got average and maximum errors at the GCPs very similar to the all-zero case.  I also confirmed that the GCPs at 300m were actually at a non-trivially different location.  So it seemed the RPCs were modelling the perspective view well, at least at the defining positions.  I thought I was home free!  So I ran a gcpwarp at a height of 0m, 300m and 254m.  The results at 0m and 300m looked fine, but somehow the results at 254 meters was radically wrong - greatly inflated!  A quick test with gdaltransform gave a sense of what was going wrong:

gdaltransform -rpc  simulated_rgb.vrt
0 0 0
-85.036383862713 39.6431112403042 0
0 0 300
-85.0363998100256 39.6431575414329 300
0 0 254
-84.9815303476089 39.2011322215184 254


Essentially this showed that pixel/line location 0,0 was transforming reasonably at the elevations 0 and 300, but at 254 the results were far from the region expected.  Clearly the RPCs were not generally modelling the perspective view well.  This is the point I am at now.  Is it not clear to me what I will need to do to produce stable RPCs for my perspective view from orbit.  Perhaps there is a more suitable pattern or number of GCPs to produce.  Perhaps the OSSIM RPC generation code needs to be revisited.  Perhaps I need to manually construct RPC coefficients that model the perspective view instead of just depending on a least squares best fit to produce reasonable RPCs from a set of GCPs. 

Anyways, I hope that this post will provide a bit of background on RPCs to others walking this road, and I hope to have an actually solution to post about in the future.