Elijah Robison | GIS Blog

A scrapbook of GIS tricks, with emphasis on FOSS4G.

Archive for the ‘PostGIS’ tag

Create UTFGrid Tiles from PostGIS Tables

without comments

I assume you’re where I was about a week ago. That is, you’ve heard of UTFGrid, and now you want to render your own UTFGrid tiles. Perhaps you found yourself looking at this thread over at GIS.SE, but you don’t want to jump into TileStache just now. Anyway, if you’ve got a working installation of GDAL/OGR and Mapnik 2+, complete with Python bindings, I’ll show you what worked for me.

Because this is merely an adaptation of Matthew Perry’s original solution, I highly recommend considering his original blog entry on the topic, complete with discussion points and caveats, before proceeding further!

Once you’re ready, head over to GitHub and download some code, specifically globalmaptiles.py, create_tile_shp.py, and create_utfgrids.py. [1] You don’t actually need create_tile_shp.py, but I snagged all three. To keep things simple put these files in the same folder.

Next, in that same directory, create a new Python file. I called mine createUtfgridsFromPG.py.

Where create_utfgrids.py works entirely on shapefiles, createUtfgridsFromPG.py accepts an OGR PostgreSQL connection string in place of a shapefile path. Fortunately the original OGR code didn’t change, but to use the Mapnik PostGIS driver I had to iterate over the PostgreSQL connection string and store the connection parameters so I could supply them to Mapnik as individual keyword arguments.
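If it helps to see the connection-string handling on its own, here is a minimal sketch of the same idea. The function names parse_pg_connection and mapnik_postgis_kwargs are my own inventions for illustration, and the sketch assumes simple space-separated key=value pairs with no quoted or space-containing values.

```python
def parse_pg_connection(pgconn):
    """Split an OGR-style "PG:key=value key=value" string into a dict.

    Assumes plain space-separated pairs; quoted values are not handled.
    """
    if pgconn.upper().startswith("PG:"):
        pgconn = pgconn[3:]
    params = {}
    for pair in pgconn.split():
        key, sep, value = pair.partition("=")
        if sep:  # skip anything that is not a key=value pair
            params[key] = value
    return params


def mapnik_postgis_kwargs(pgconn):
    """Rename OGR's "tables" parameter to the "table" keyword the script passes to Mapnik."""
    params = parse_pg_connection(pgconn)
    if "tables" in params:
        params["table"] = params.pop("tables")
    return params


if __name__ == "__main__":
    conn = "PG:host=127.0.0.1 port=5432 user=postgres dbname=gis password=passw0rd tables=parcels_pmerc"
    print(mapnik_postgis_kwargs(conn))
```

A dict like this also makes it obvious when a parameter such as port was left out of the connection string, since the key simply won't be present.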

Finally, copy the following code and paste it into your new file. It’s basically the same as create_utfgrids.py, but I changed shppath to pgconn and added some code to use a mapnik.PostGIS() datasource in place of a mapnik.Shapefile() datasource.

If you’re curious, the comment # ELR 2014.9.26: flags the few places I changed the original code.

createUtfgridsFromPG.py

#!/usr/bin/env python
# -*- coding: utf-8  -*-
"""
create_utfgrids.py
Author: Matthew Perry
License: BSD

Creates utfgrid .json tiles for the given polygon shapefile

Thx to Dane Springmeyer for the utfgrid spec and mapnik rendering code
and to  Klokan Petr Přidal for his MapTiler code
(http://www.maptiler.org/google-maps-coordinates-tile-bounds-projection/)

"""
import globalmaptiles
import mapnik
import ogr
import os
from optparse import OptionParser, OptionError
try:
    import simplejson as json
except ImportError:
    import json

def create_utfgrids(pgconn, minzoom, maxzoom, outdir, fields=None, layernum=0):

    # ELR 2014.9.26:
    # Original implementation pushed in a shapefile path.
    #ds = ogr.Open(shppath)
    ds = ogr.Open(pgconn)

    # ELR 2014.9.26:
    # Iterate over the PostgreSQL connection string and pull out values we need
    # to use Mapnik's PostGIS datasource constructor.
    pgConnARR = pgconn[3:].split(' ')
    for kvPair in pgConnARR:
        if kvPair.split('=')[0] == "host":
            nikHost = kvPair.split('=')[1]
        if kvPair.split('=')[0] == "port":
            nikPort = kvPair.split('=')[1]
        if kvPair.split('=')[0] == "user":
            nikUser = kvPair.split('=')[1]
        if kvPair.split('=')[0] == "password":
            nikPass = kvPair.split('=')[1]
        if kvPair.split('=')[0] == "dbname":
            nikDB = kvPair.split('=')[1]
        if kvPair.split('=')[0] == "tables":
            nikTable = kvPair.split('=')[1]

    print
    print "WARNING:"
    print " This script assumes a polygon shapefile in spherical mercator projection."
    print " If any of these assumptions are not true, don't count on the results!"
    # TODO confirm polygons
    # TODO confirm mercator
    # TODO get layernum from command line
    layer = ds.GetLayer(layernum)
    bbox = layer.GetExtent()
    print ""
    print str(bbox)

    mercator = globalmaptiles.GlobalMercator()

    m = mapnik.Map(256,256)

    # Since grids are `rendered` they need a style
    s = mapnik.Style()
    r = mapnik.Rule()
    polygon_symbolizer = mapnik.PolygonSymbolizer(mapnik.Color('#f2eff9'))
    r.symbols.append(polygon_symbolizer)
    line_symbolizer = mapnik.LineSymbolizer(mapnik.Color('rgb(50%,50%,50%)'),0.1)
    r.symbols.append(line_symbolizer)
    s.rules.append(r)
    m.append_style('My Style',s)

    print ""
    # ELR 2014.9.26:
    # Original implementation using shapefile..
    #ds = mapnik.Shapefile(file=shppath)

    # ELR 2014.9.26:
    # Parameterized PostGIS implementation..
    ds = mapnik.PostGIS(host=nikHost,port=nikPort,user=nikUser,password=nikPass,dbname=nikDB,table=nikTable)

    mlayer = mapnik.Layer('poly')
    mlayer.datasource = ds
    mlayer.styles.append('My Style')
    m.layers.append(mlayer)

    print ""
    if fields is None:
        fields = mlayer.datasource.fields()
        print "Fields were NONE. Using.."
        print fields
    else:
        print "Fields are USER PROVIDED. Using.."
        print fields
    print ""

    for tz in range(minzoom, maxzoom+1):
        print " * Processing Zoom Level %s" % tz
        tminx, tminy = mercator.MetersToTile( bbox[0], bbox[2], tz)
        tmaxx, tmaxy = mercator.MetersToTile( bbox[1], bbox[3], tz)
        for ty in range(tminy, tmaxy+1):
            for tx in range(tminx, tmaxx+1):
                output = os.path.join(outdir, str(tz), str(tx))
                if not os.path.exists(output):
                    os.makedirs(output)

                # Use top origin tile scheme (like OSM or GMaps)
                # TODO support option for TMS bottom origin scheme (ie opt to not invert)
                ymax = 1 << tz;
                invert_ty = ymax - ty - 1;

                tilefilename = os.path.join(output, "%s.json" % invert_ty) # ty for TMS bottom origin
                tilebounds = mercator.TileBounds( tx, ty, tz)
                #print tilefilename, tilebounds

                box = mapnik.Box2d(*tilebounds)
                m.zoom_to_box(box)
                grid = mapnik.Grid(m.width,m.height)
                mapnik.render_layer(m,grid,layer=0,fields=fields)
                utfgrid = grid.encode('utf',resolution=4)
                with open(tilefilename, 'w') as file:
                    file.write(json.dumps(utfgrid))

if __name__ == "__main__":
    usage = "usage: %prog [options] pg_connection_string minzoom maxzoom output_directory"
    parser = OptionParser(usage)
    parser.add_option("-f", '--fields', dest="fields", help="Comma-separated list of fields; default is all")
    (options, args) = parser.parse_args()

    if len(args) != 4:
        parser.error("Incorrect number of arguments")

    pgconn = args[0]
    minzoom, maxzoom = int(args[1]), int(args[2])
    outdir = args[3]

    if os.path.exists(outdir):
        parser.error("output directory exists already")

    if options.fields:
        fields = options.fields.split(",")
    else:
        fields = None

    create_utfgrids(pgconn, minzoom, maxzoom, outdir, fields)

Usage..

Once you’ve prepared createUtfgridsFromPG.py, you can call it from the command line like this..

C:\xDev\utfgrids\createUtfgridsFromPG.py "PG:host=127.0.0.1 port=5432 user=postgres dbname=gis password=passw0rd tables=parcels_pmerc" 12 16 "C:/xGIS/tiles/utf" -f tms,owner_name

  • Hopefully the PostgreSQL connection string ("PG:host=..") makes sense.
  • 12 and 16 represent the minimum and maximum zoom levels to be rendered, respectively.
  • The directory “C:/xGIS/tiles/utf” is where your UTFGrid tiles will be saved.
  • And -f tms,owner_name represents a comma-separated list of data fields you want in your UTFGrid.
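Once a tile has been written, it can be handy to check that the requested fields really made it in. The following is only a sketch of how a client might look up the feature under a pixel, based on my reading of the UTFGrid spec (grid characters map to indexes in the keys list, skipping the two characters that JSON has to escape). The function names and the tiny hand-made tile are hypothetical; resolution=4 matches the value used in the script.

```python
def decode_utfgrid_id(char):
    """Turn one grid character into an index into the tile's "keys" list."""
    code = ord(char)
    if code >= 93:  # skip over the backslash (92)
        code -= 1
    if code >= 35:  # skip over the double quote (34)
        code -= 1
    return code - 32


def utfgrid_lookup(tile, px, py, resolution=4):
    """Return the attribute dict under pixel (px, py), or None if empty."""
    row = tile["grid"][py // resolution]
    key = tile["keys"][decode_utfgrid_id(row[px // resolution])]
    return tile.get("data", {}).get(key)


if __name__ == "__main__":
    # Hypothetical 2x2 grid (an 8x8 pixel tile at resolution 4).
    tile = {
        "grid": [" !", "!!"],
        "keys": ["", "101"],
        "data": {"101": {"tms": "R00000-00-00", "owner_name": "Example Owner"}},
    }
    print(utfgrid_lookup(tile, 0, 0))
    print(utfgrid_lookup(tile, 7, 0))
```

With a real tile you would load the .json file with json.load() first and pass the resulting dict in as tile.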

Caveats..

  • Both create_utfgrids.py and createUtfgridsFromPG.py require your geodata table to be in a Web Mercator projection (EPSG:3857)!
  • The script assumes a top-origin tile scheme, like OSM and others.
  • The script will only work with polygons.
  • While the OGR PostgreSQL connection string has a tables parameter, this implementation will only accept one table.
  • The script will create your target directory, in the example case, utf, and it will throw an error if you create this directory in advance.
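Regarding the top-origin caveat, the row flip the script performs is small enough to sketch separately. This mirrors the ymax/invert_ty arithmetic in the script; the helper names flip_tile_row and tile_path are mine and purely illustrative.

```python
import posixpath


def flip_tile_row(ty, zoom):
    """Convert a tile row between bottom-origin (TMS) and top-origin (OSM/Google) numbering.

    The same formula works in both directions.
    """
    return (1 << zoom) - 1 - ty


def tile_path(outdir, zoom, tx, ty_tms):
    """Build the zoom/x/y.json path for a top-origin scheme from a TMS row number."""
    return posixpath.join(outdir, str(zoom), str(tx), "%d.json" % flip_tile_row(ty_tms, zoom))


if __name__ == "__main__":
    print(tile_path("utf", 12, 1100, 2500))
```

If you wanted TMS-style output instead, you would write the tile under ty_tms directly and skip the flip.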

[1] Many thanks to Matthew Perry, Klokan Petr Přidal, and Dane Springmeyer for their collective efforts and for sharing their work.

Written by elrobis

September 26th, 2014 at 1:26 pm

OGR VRT: Connect to PostGIS DataSource

without comments

I needed an OGR VRT for something and didn’t find a clear example on the web all in one place, so here goes.

Somewhere on your system, create a new file with a .ovf extension. Inside that file, add some XML like the following to define your PostgreSQL connection:

The name="WKTGrid" attribute is arbitrary. I have been experimenting with including WKT geometry data in UTFGrid tiles, and the name is a leftover from those experiments. You can provide almost any value for name. However, do note that the layer name is referenced in the ogrinfo command.

<OGRVRTDataSource>
  <OGRVRTLayer name="WKTGrid">
    <SrcDataSource>PG:host=127.0.0.1 user=postgres dbname=gis password=l00per</SrcDataSource>
    <SrcLayer>parcels_cama_20140829_pmerc</SrcLayer>
    <SrcSQL>SELECT tms, owner_name, the_wkt FROM parcels_cama_20140829_pmerc</SrcSQL>
  </OGRVRTLayer>
</OGRVRTDataSource>

  • OGRVRTLayer The layer name attribute you assign can be anything you want.
  • SrcDataSource The data source value defines your PostgreSQL connection parameters.
  • SrcLayer The source layer identifies the target table in your PostgreSQL instance.
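  • SrcSQL [Optional] A SQL statement can be used to target specific fields, define field aliases, and even refine the data set using WHERE clauses, etc. As far as I can tell, with a PostgreSQL source the statement is handed to the database itself, so PostGIS functions are available in it.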

After you make a VRT, it’s smart to test it in ogrinfo before you use it for anything serious. It’s easy to test a VRT in ogrinfo, and if ogrinfo makes sense of it, then you know you’ve got a good VRT.

A command like this uses ogrinfo and OGR_SQL to open the VRT and isolate one feature, showing you its attributes.

ogrinfo C:\xGIS\Vector\parcels\parcels_cama_20140829_pmerc.ovf -sql " SELECT tms, owner_name, the_wkt FROM WKTGrid WHERE tms = 'R39200-02-31' "

In some cases, OGR may have trouble identifying your geometry field, or you may have multiple geometry fields and want to specify one field in particular. If so, note the following changes, specifically the modification to the SrcSQL node and the added GeometryField node.

<OGRVRTDataSource>
  <OGRVRTLayer name="WKTGrid">
    <SrcDataSource>PG:host=127.0.0.1 user=postgres dbname=gis password=l00per</SrcDataSource>
    <SrcLayer>parcels_cama_20140829_pmerc</SrcLayer>
    <SrcSQL>SELECT ST_AsBinary(wkb_geometry) as geomm, tms, owner_name FROM parcels_cama_20140829_pmerc</SrcSQL>
    <GeometryField encoding="WKB" field="geomm"></GeometryField>
  </OGRVRTLayer>
</OGRVRTDataSource>

And this is just scratching the surface. Make sure to check out the OGR VRT driver page for a complete list of options available to you.
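If you end up needing several of these files, here is a rough sketch of generating one with Python's standard library. The function name build_pg_vrt and its parameters are hypothetical, the password is a placeholder, and the element names are simply the ones used in the examples above.

```python
import xml.etree.ElementTree as ET


def build_pg_vrt(layer_name, pg_connection, src_layer, src_sql, geometry_field=None):
    """Return OGR VRT XML (as a string) describing one PostgreSQL-backed layer."""
    root = ET.Element("OGRVRTDataSource")
    layer = ET.SubElement(root, "OGRVRTLayer", name=layer_name)
    ET.SubElement(layer, "SrcDataSource").text = pg_connection
    ET.SubElement(layer, "SrcLayer").text = src_layer
    ET.SubElement(layer, "SrcSQL").text = src_sql
    if geometry_field:
        # Only needed when OGR has to be told which column holds the geometry.
        ET.SubElement(layer, "GeometryField", encoding="WKB", field=geometry_field)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_pg_vrt(
        "WKTGrid",
        "PG:host=127.0.0.1 user=postgres dbname=gis password=yourpass",
        "parcels_cama_20140829_pmerc",
        "SELECT tms, owner_name, the_wkt FROM parcels_cama_20140829_pmerc",
    ))
```

ElementTree takes care of escaping characters such as < or & if they ever show up in the SQL, which is easy to forget when writing the XML by hand.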

Written by elrobis

September 24th, 2014 at 4:45 pm

PostgreSQL/PostGIS Implementation of Google’s Encoded Polyline Algorithm

with 8 comments

[Edit 30 Jan, 2014]

I added an additional PostgreSQL method to perform Polygon encoding by concatenating polygon geometries (delimiter: †) and their inner rings (delimiter: ‡) together into one massive encoded block of ring features. I also provided an example JavaScript method demonstrating how to bring the amalgamated polygon feature encodings into your Google Map.

By some uncanny twist of the fates, I’ve elected to use, had to use, and/or been asked to develop applications that use Google Maps ASCII Encoded Polyline expressions. In previous encounters, I’ve used a PHP class to handle the encoding task, and most recently I wrote a Python method to decode these expressions so that I could return a 3rd-party’s encoded geometries to WKT and import them into a spatially aware database.

So far so good.

However, one thing has always bugged me about using the PHP solution–I don’t like using a piece of middleware to handle what I consider to be a responsibility of the data layer. Mark McClure’s page, which is basically the seminal authority on this topic, provides external links to implementations for Perl, Ruby, PHP (note: I prefer the PHP class linked above), Java, and Mathematica. Also, by searching Stack Overflow, you can find implementations of the algorithm in both C# and VB.Net. But for all my efforts searching, I could never dredge up an implementation for either MySQL or PostgreSQL/PostGIS. Bummer.

Looking ahead, it seems version 2.2 of PostGIS might include some built-in Google encoding conversion methods. While this is cool enough for a hat tip, it’s too inconvenient to wait that long, and even then, there’s no guarantee the implementation will work the way I expect with complex Polygon geometries; for instance, maybe it will encode only the exterior ring of Polygons, ignore MultiPolygons completely, etc. For that matter, it’s equally possible there could be some bugs. So with this said, and even though the previously-mentioned PHP implementation does the job, my boss was cool enough to let me take a crack at implementing the algorithm as a PostgreSQL/PostGIS function, and then share the results with the world. Since some initial testing confirms my PostGIS implementation works, I’ll just post the code parts and hope others find it useful.

For what it’s worth, if anyone finds a bug or has recommendations for improvements, please don’t hesitate to drop me a line.

 

Sample query calling the first encoding function on the EXTERIOR RING of Polygon geometries:
(Also works on single-part LINESTRING features.)

/************************************************************************
 * Note that the encoding method can accept a LINESTRING only, which
 * is the geometry type used to represent the ring parts of a Polygon.
 * To help understand this, and why, please see the trailing discussion
 * section, which elaborates on this situation further.
 ************************************************************************/
SELECT
  GoogleEncodeLine(ST_ExteriorRing(wkb_geometry)) as Google
  FROM polygons_wgs84
  WHERE ST_GeometryType(wkb_geometry) = 'ST_Polygon'
  LIMIT 10 ;

 

[Added 30 Jan, 2014]

Sample query calling the second encoding function on Polygon and MultiPolygon geometries:
(Preserves multi-part polygons and their inner-ring parts, a.k.a. “holes”.)

/************************************************************************
 * This encoding method will accept Polygon and MultiPolygon geom types.
 * The output returned is an amalgamation of Polyline encodings, where
 * individual geometries and their interior rings are concatenated
 * together using string delimiters, †, and ‡, respectively.
 ************************************************************************/
SELECT
  GoogleEncodePolygon(wkb_geometry) as GooglePolygon
  FROM polygons_wgs84
  LIMIT 10 ;

 

Implementation functions to execute/save in your PostgreSQL instance:

[Added 30 Jan, 2014]

/*************************************************************
 * Pass in either a Polygon or MultiPolygon geometry. Returns
 * one TEXT value holding the ASCII-encoded Polygon feature
 * parts, including multi-part geometries and their interior
 * rings, joined by the delimiters chr(8224) and chr(8225).
 ************************************************************/
CREATE OR REPLACE FUNCTION GoogleEncodePolygon
(
  g1 GEOMETRY
)
RETURNS TEXT AS $$
DECLARE
 ng INT;        -- Store number of Geometries in the Polygon.
 g INT;         -- Counter for the current geometry number during outer loop.
 g2 GEOMETRY;   -- Current geometry feature isolated by the outer loop.
 nr INT;        -- Store number of internal ring parts in the Polygon.
 r INT;         -- Counter for the current inner-ring part.
 r1 GEOMETRY;   -- Exterior ring part isolated BEFORE the inner loop.
 r2 GEOMETRY;   -- Inner-ring part isolated within the inner loop.
 gEncoded TEXT; -- Completed Google Encoding.
BEGIN
 gEncoded = '';
 ng = ST_NumGeometries(g1);
 g = 1;
 FOR g IN 1..ng BY 1 LOOP
     g2 = ST_GeometryN(g1, g);
     if g > 1 then gEncoded = gEncoded || chr(8224); END IF;
     -- Get ExteriorRing now; if there are any holes, get them later in the loop..
     r1 = ST_ExteriorRing(g2);
     gEncoded = gEncoded || GoogleEncodeLine(r1);
     nr = ST_NRings(g2);
     if nr > 1 then
       -- One (1) is because interior rings is one-based.
       -- And nr-1 is because ring count includes the boundary.
       FOR r IN 1..(nr-1) BY 1 LOOP
         r2 = ST_InteriorRingN(g2, r);
         gEncoded = gEncoded || chr(8225) || GoogleEncodeLine(r2);
       END LOOP;
     END IF;
 END LOOP;
 RETURN gEncoded;
End
$$ LANGUAGE plpgsql;

 

/*************************************************************
 * First of two methods. Pass in a geometry (LINESTRING only).
 * Returns ASCII-encoded point array for use in Google Maps.
 ************************************************************/
CREATE OR REPLACE FUNCTION GoogleEncodeLine
(
  g GEOMETRY
)
RETURNS TEXT AS $$
DECLARE
  pt1 GEOMETRY;
  pt2 GEOMETRY;
  p INT; np INT;
  deltaX INT;
  deltaY INT;
  enX VARCHAR(255);
  enY VARCHAR(255);
  gEncoded TEXT;
BEGIN
  gEncoded = '';
  np = ST_NPoints(g);

  IF np > 3 THEN
    g = ST_SimplifyPreserveTopology(g, 0.00001);
    np = ST_NPoints(g);
  END IF;

  pt1 = ST_SetSRID(ST_MakePoint(0, 0),4326);

  FOR p IN 1..np BY 1 LOOP
    pt2 = ST_PointN(g, p);
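    -- Google's published algorithm rounds the scaled value; floor() can be
    -- off by one unit in the fifth decimal place for negative coordinates.
    deltaX = (round(ST_X(pt2)*1e5)-round(ST_X(pt1)*1e5))::INT;
    deltaY = (round(ST_Y(pt2)*1e5)-round(ST_Y(pt1)*1e5))::INT;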
    enX = GoogleEncodeSignedInteger(deltaX);
    enY = GoogleEncodeSignedInteger(deltaY);
    gEncoded = gEncoded || enY || enX;

    pt1 = ST_SetSRID(ST_MakePoint(ST_X(pt2), ST_Y(pt2)),4326);
  END LOOP;
RETURN gEncoded;
End
$$ LANGUAGE plpgsql;

 

/**************************************************************
 * Second of two methods. Accepts a signed integer (LON or LAT
 * by 1e5) and returns an ASCII-encoded coordinate expression.
 *************************************************************/
CREATE OR REPLACE FUNCTION GoogleEncodeSignedInteger(c INT)
RETURNS VARCHAR(255) AS $$
DECLARE
  e VARCHAR(255);
  s BIT(32);
  b BIT(6);
  n INT;
BEGIN
 e = '';
 s = (c::BIT(32))<<1;

 IF s::INT < 0 THEN
   s = ~s;
   END IF;

 WHILE s::INT >= B'100000'::INT LOOP
   b = B'100000' | (('0'||substring(s, 28, 5))::BIT(6));
   n = b::INT + 63;
   e = e || chr(n);
   s = s >> 5;
 END LOOP;
 e = e || chr(s::INT+63);

RETURN e;
End
$$ LANGUAGE plpgsql;
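For cross-checking what the database returns, it can help to have a small reference encoder outside of PostgreSQL. The following Python sketch follows Google's published description of the algorithm (scale by 1e5, round to the nearest integer, left-shift, invert negatives, emit 5-bit chunks offset by 63). The function names are my own, and note that it takes (lat, lon) pairs, whereas the SQL above reads X as longitude and Y as latitude.

```python
def encode_value(value):
    """Encode one signed integer (a coordinate delta scaled by 1e5)."""
    # Left-shift by one bit; negative values are additionally inverted.
    value = ~(value << 1) if value < 0 else value << 1
    chunks = []
    while value >= 0x20:
        # Take the lowest 5 bits, flag that more chunks follow, offset by 63.
        chunks.append(chr((0x20 | (value & 0x1F)) + 63))
        value >>= 5
    chunks.append(chr(value + 63))
    return "".join(chunks)


def encode_polyline(points):
    """Encode a sequence of (lat, lon) pairs as one encoded polyline string."""
    encoded = []
    prev_lat = prev_lon = 0
    for lat, lon in points:
        lat_e5 = int(round(lat * 1e5))
        lon_e5 = int(round(lon * 1e5))
        # Each point is stored as the offset from the previous point, latitude first.
        encoded.append(encode_value(lat_e5 - prev_lat))
        encoded.append(encode_value(lon_e5 - prev_lon))
        prev_lat, prev_lon = lat_e5, lon_e5
    return "".join(encoded)


if __name__ == "__main__":
    # The sample points from Google's documentation of the format.
    print(encode_polyline([(38.5, -120.2), (40.7, -120.95), (43.252, -126.453)]))
```

Running the same ring through this and through GoogleEncodeLine() should give a quick sanity check, bearing in mind that the SQL version also simplifies the geometry first when it has more than three points.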

 

[Added 30 Jan, 2014]

JavaScript method demonstrating how to add Polygon encodings to a Google Map object:
(This client implementation works for either the single or the multi-part polygons.)

/*************************************************************
 * JavaScript! Pass-in an encoded text block created by either
 * PostGIS method, GoogleEncodePolygon() or GoogleEncodeLine(),
 * and render it in your Google Map object. If you don't want
 * the map to zoom to each rendering, just remove the "bounds"
 * variable and any references to it.
 ************************************************************/
function renderEncoded(encoded_path)
{
   var bounds = new google.maps.LatLngBounds();
   var $encodedGeoms = encoded_path.split("†");
   for (var i=0; i<$encodedGeoms.length; i++)
   {
       var encodedGeom = $encodedGeoms[i];
       var $encodedRings = encodedGeom.split("‡");
       var polyPaths = [];
       for (var j=0; j<$encodedRings.length; j++)
       {
           var ptarray = google.maps.geometry.encoding.decodePath($encodedRings[j]);
           polyPaths.push(ptarray);
       }
       var polygonObject = new google.maps.Polygon(
       {
         paths: polyPaths,
         strokeColor: '#890000',
         strokeOpacity: 1.0,
         strokeWeight: 2
       });
       polygonObject.setMap(map);
       polygonObject.getPath().forEach(function(e)
       {
           bounds.extend(e);
       });
   }
   map.fitBounds(bounds);
}

 

And some additional discussion..

There are two “gotchas” when it comes to implementing the encoding algorithm with respect to Polygons:

1) Polygons, as geometries, can be composed of many rings. The outer ring is considered to be the boundary, and various inner rings are often called “holes”. So this is a specified, understood, and accepted built-in many-to-one relationship between polygons and their internal ring geometries.

And 2) It’s not rare to find polygon tables containing both Polygon and MultiPolygon data types. I think this happens because ESRI allows it, and so in an effort to play well with others, other GIS systems have accommodated it. At least, I know this is true for MySQL and PostGIS.

Here’s why this makes trouble–Google’s encoding algorithm is only intended to represent individual point arrays as a singular geometry. Basically, as long as your first point equals your last point, it’s considered to be a closed geometry, and you can add it and render it in a Google Map as a polygon. The algorithm itself isn’t designed to represent nested arrays, which would be necessary to render either a Polygon with “holes” or a MultiPolygon, which could potentially define many polygons with holes of their own! As such, I suspect there could be considerable disagreement as to how a Polygon-to-Google-Encoded method should actually handle Polygons..

The only solutions I can imagine for this problem would require “faking” a one-to-many relationship, perhaps by joining several encodings with delimiters to account for MultiPolygons and/or single-feature Polygons with interior rings. But this starts to get weird. So to keep things somewhat simple for the sake of the post, I chose to stay true to the algorithm’s intent and return a single encoded geometry expression from my method. And the sample query demonstrates this by calling the method against the outermost ring (i.e. the boundary) of a Polygon geometry type, which PostgreSQL/PostGIS regards as a LineString anyway.

[Added 30 Jan, 2014]

Since I wanted to handle the more complex geometries, I wrote the wrapper method GoogleEncodePolygon() to first iterate over ST_NumGeometries() and gain access to any multi-part features, then second, iterate over ST_NRings() using ST_InteriorRingN()–you could also do this using ST_DumpRings()–and gain access to any interior rings of the Polygon geometries themselves. Then, for each ring part, I call GoogleEncodeLine() and concatenate all those expressions into one massive block of “compound” expressions. I chose to delimit each geometry encoding using an extra-special character that would never be used by Google’s algorithm; for example chr(8224), which corresponds to “†”. I then further delimit the internal ring parts using another special character, chr(8225), which corresponds to “‡”, and return all these concatenated together as a compound encoding expression. Then, on the client-side (a JavaScript example is provided above), I merely split the compound expression against my delimiters, loop over the various expressions, and add them to the map individually. Note that if you are attaching attributes to your features, you’ll need to remember to attach them explicitly to each unique Polygon added to your map.
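The JavaScript above leans on Google's decodePath(), so for anyone consuming the compound expression outside a browser, here is a sketch of the same splitting logic in Python with a hand-written decoder. The names decode_polyline and decode_compound are hypothetical; the delimiters are the two characters described above.

```python
GEOM_DELIM = "\u2020"  # chr(8224), the dagger separating polygon parts
RING_DELIM = "\u2021"  # chr(8225), the double dagger separating rings


def decode_polyline(encoded):
    """Decode one encoded polyline string into a list of (lat, lon) tuples."""
    coords = []
    index = lat = lon = 0
    while index < len(encoded):
        deltas = []
        for _ in range(2):  # latitude delta first, then longitude delta
            shift = result = 0
            while True:
                chunk = ord(encoded[index]) - 63
                index += 1
                result |= (chunk & 0x1F) << shift
                shift += 5
                if chunk < 0x20:  # no continuation flag, value is complete
                    break
            # The lowest bit marks a negative value that was inverted on encoding.
            deltas.append(~(result >> 1) if result & 1 else result >> 1)
        lat += deltas[0]
        lon += deltas[1]
        coords.append((lat / 1e5, lon / 1e5))
    return coords


def decode_compound(compound):
    """Split a compound expression into polygons, each a list of decoded rings."""
    return [
        [decode_polyline(ring) for ring in geom.split(RING_DELIM)]
        for geom in compound.split(GEOM_DELIM)
    ]
```

The result is a list of polygons, where the first ring of each polygon is the boundary and any remaining rings are holes, which is the same nesting the JavaScript builds in polyPaths.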

Written by elrobis

January 27th, 2014 at 12:20 pm

PostGIS: query all multipolygon parcels with at least one hole

without comments

I was writing some code to iterate over Well Known Text expressions for polygon features, and I decided I needed to test the most complex edge-case I could think of–multipolygon geometries where at least one of the bound polygons has a hole (i.e. an interior ring).

I ended up with the following query. This seems like the kind of thing I’ll want to reuse later, so I’m noting it here. For good measure, I also use a rudimentary technique to sort the output with the most complicated geometries in the table at the top of the list. Basically, the more “text” it takes to describe the geometry using Well Known Text, the larger and more complex I figure it must be!

SELECT
  SomePrimaryId,   /* your primary key, e.g. ogc_fid, etc. */
  SomeUniqueId,    /* your descriptive id, e.g. a parcel number */
  ST_NumGeometries(wkb_geometry) AS num_geoms,
  ST_NRings(wkb_geometry) AS num_rings,
  ST_AsText(ST_Centroid(wkb_geometry)) AS center,
  Char_Length(ST_AsText(wkb_geometry)) AS len,
  ST_AsText(wkb_geometry) AS wkt
FROM SomePolygonTable
WHERE
  ST_NumGeometries(wkb_geometry) > 1
  AND
  ST_NRings(wkb_geometry) > ST_NumGeometries(wkb_geometry)
ORDER BY Char_Length(ST_AsText(wkb_geometry)) DESC ;

 

Just for the sake of promoting caution, I’m not certain this is a definitive approach for identifying the largest geometry in a table, as the length of the binary representation and the length of the readable text representation do not correspond one-to-one. Moreover, a feature could have more vertices that require less precision to express (fewer decimal positions) than a geometry with fewer vertices that need more precision, and then you have to ask which is bigger: fewer vertices and more text, or more vertices that coincidentally did not require as much text? My conclusion is that the “most complicated geometry” is probably relative to the one asking the question. However, for my purposes, this was close enough to put the most complicated stuff at the top of the list.
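The WHERE clause in the query boils down to "more than one part, and more rings than parts", since every part contributes exactly one boundary ring. Here is a rough Python sketch of the same test applied directly to a MULTIPOLYGON WKT string by counting parenthesis depth. The function names are mine, and it assumes well-formed MULTIPOLYGON text only.

```python
def count_parts_and_rings(wkt):
    """Count polygon parts and rings in a MULTIPOLYGON WKT string.

    Depth 1 is the multipolygon, depth 2 a polygon part, depth 3 a ring.
    """
    depth = parts = rings = 0
    for ch in wkt:
        if ch == "(":
            depth += 1
            if depth == 2:
                parts += 1
            elif depth == 3:
                rings += 1
        elif ch == ")":
            depth -= 1
    return parts, rings


def is_multipart_with_hole(wkt):
    """Mirror the query: more than one part and at least one interior ring."""
    parts, rings = count_parts_and_rings(wkt)
    return parts > 1 and rings > parts


if __name__ == "__main__":
    sample = ("MULTIPOLYGON (((0 0, 10 0, 10 10, 0 10, 0 0), (2 2, 4 2, 4 4, 2 4, 2 2)), "
              "((20 20, 30 20, 30 30, 20 30, 20 20)))")
    print(count_parts_and_rings(sample), is_multipart_with_hole(sample))
```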

Written by elrobis

November 8th, 2013 at 10:26 am

PostGIS: count all features of each GeometryType in a spatial table

with one comment

Sometimes, just when you think you’ve got something figured out –you get reminded that you really don’t.  :/

As you may know, ESRI allows for single and multi-part geometries to live in the same FeatureClass. So, if I have a shapefile of roads, there might be both LINESTRING and MULTILINESTRING features in that dataset. I live just loose-enough not to care about that. But I do need to be aware of it when I’m cobbling data in PostGIS.

In this case, I was getting a PostGIS error trying to do a Dissolve-By-SQL, so I thought why not get a quick count of each GeometryType in the dataset? Maybe I was running into issues because single and multi-part geometries were blurred together. It took me an embarrassing chunk of time to get this right, so I figured I’d post the recipe in case I needed a reminder later.

SELECT  
  GeometryType( wkb_geometry ) as geomType,  
  COUNT( wkb_geometry ) as featureCount
FROM anyGeoDataTable
WHERE wkb_geometry IS NOT NULL
GROUP BY GeometryType( wkb_geometry );
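If the data hasn't made it into the database yet, the same tally can be roughed out client-side. This Python sketch assumes you already have the geometries as WKT strings (for example from a CSV export); the function names are hypothetical, and it simply treats everything before the first parenthesis as the type name.

```python
from collections import Counter


def geometry_type(wkt):
    """Return the leading type keyword of a WKT string, e.g. MULTIPOLYGON."""
    return wkt.strip().split("(", 1)[0].strip().upper()


def count_geometry_types(wkts):
    """Tally geometry types, skipping missing values like the IS NOT NULL filter does."""
    return Counter(geometry_type(w) for w in wkts if w)


if __name__ == "__main__":
    rows = [
        "MULTILINESTRING ((0 0, 1 1), (2 2, 3 3))",
        "LINESTRING (0 0, 5 5)",
        None,
        "LINESTRING (1 0, 1 9)",
    ]
    for geom_type, feature_count in count_geometry_types(rows).items():
        print(geom_type, feature_count)
```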

 

[Update 4.16.2012]
I wanted to do the same thing in MySQL the other day. It’s essentially the same query, but note the need to use the CONVERT() function to make the output render properly in MySQL Workbench:

SELECT
  CONVERT( GeometryType( shape ) USING utf8 ) as geomType,
  COUNT( shape ) as featureCount
FROM anyGeoDataTable
WHERE shape IS NOT NULL
GROUP BY GeometryType( shape );

 

Without applying the CONVERT() function, MySQL Workbench just shows “blob” in the return set.
 

Anyway.. back to cracking some nut..  :]

Written by elrobis

January 21st, 2012 at 9:17 pm

Posted in PostGIS


Prepare a Shapefile for OpenScales using ogr2ogr and PostgreSQL

with one comment

This post explains how to import GIS data (a shapefile, in this case) into a database (PostgreSQL) so it can be consumed by most any mapping API. I have OpenScales in mind, but this approach will support any mapping app with functions for rendering feature overlays using geodetic coordinates (i.e. Longitude and Latitude). In many cases, you’ll need to translate your feature data out of a projected/cartesian system and into a geodetic/spherical system, so I’ll include a demonstration of that.

Quick and Dirty Summary

This approach has two parts. First, we’ll use GDAL’s ogr2ogr utility to import a shapefile into our database. Second, we’ll use a few SQL commands to translate our data from a projected to a geodetic system, as well as optimize the table for fast query speeds.

Prerequisites

The following prerequisites will need to be met in order to follow along:

1) GDAL is installed. If you need to install GDAL, check out my earlier post titled Install GDAL on Windows. Alternatively, you could install FWTools, which is admittedly easier, but that package is no longer maintained and it’s becoming out-of-date as GDAL/OGR continues to evolve.

2) PostgreSQL is installed, and the PostGIS extension is enabled. If you need to install PostgreSQL and PostGIS, check out the tutorial at Boston GIS demonstrating how to acquire and install PostgreSQL with PostGIS. It won’t hurt to review their entire tutorial, but I deviate from their approach once installation is complete (look for their sub-heading, Loading GIS Data Into the Database).

3) You have some geodata. I think the typical reader will have their own shapefile, personal geodatabase, or otherwise, but if you need something to follow along, here’s a US States shapefile projected to NAD83 Albers Equal Area Conic:

http://www.cartometric.com/blog/wp-content/uploads/data/usstates_nad83_aeac.zip

Loading Geodata into PostgreSQL / PostGIS

To push shapefile data into your geodatabase, you can run an ogr2ogr script like this:

ogr2ogr -f "PostgreSQL" PG:"host=127.0.0.1 user=youruser dbname=yourdb password=yourpass" "E:\4_GIS\01_tutorials\usstates_nad83_aeac\usstates_albers.shp" -nln usstates -nlt geometry

For deeper reading on ogr2ogr utility flags (like -nln and -nlt), check out the usage notes for ogr2ogr. Also, it may be worth your while to peruse the OGR PostgreSQL driver page, as well as the Advanced Driver Information page. In the meantime, here are a few quick notes regarding my script:

-f "PostgreSQL" PG:"host=127.0.0.1 user=youruser dbname=yourdb password=yourpass"  This tells OGR you’re exporting to PostgreSQL with the following connection string. Notice that my connection string is wrapped in double-quotes (").

"E:\4_GIS\01_tutorials\usstates_nad83_aeac\usstates_albers.shp"  This is the path to my shapefile input data. Once again, I wrapped this value in double-quotes ("). I do this so the console passes the value to ogr2ogr as a single argument, even if it contains spaces.

-nln usstates  The -nln flag means “rename the table on export”. In other words, my PostgreSQL db will get a new table named usstates, and not one named usstates_albers.

-nlt geometry  This one’s particularly important for polygon data. It tells OGR “accept any geometry you encounter and store it in the feature’s geometry column”. Oftentimes, a polygon dataset will have polygons and multipolygons in the same table. For example here’s a narrow column of Well Known Text (WKT) geometries from the albers shapefile so you can see what I mean:

WKT;STATE_NAME;STATE_FIPS;STATE_ABBR…..
MULTIPOLYGON (((-1827806.2165497246 1227…..
POLYGON ((-1148108.0134327484 649421.311…..
MULTIPOLYGON (((1949555.0881544715 75264…..
POLYGON ((-199237.01920416657 704421.540…..
POLYGON ((-519870.38897088548 372373.616…..

If you run the ogr2ogr script noted above without -nlt geometry, you’ll get an error like this:

ERROR 1: Terminating translation prematurely after failed
translation of layer usstates_albers (use -skipfailures to skip errors)

By default, OGR refuses to mix geometry types in a table, so -nlt geometry allows you to duck that requirement and store both Polygon and Multipolygon features in the same table. You could optionally instruct OGR to “explode” Multipolygons into individual Polygons using the -explodecollections flag, as depicted in the following screenshot, but I don’t recommend that solution for the intended use case. For example, if a map user clicks on Michigan’s Upper Peninsula, I want the whole state to be selected, not just the UP. I’m not saying you can’t make that happen after exploding multifeatures; rather, it’s just not the approach I favor.

Without -nlt geometry, ogr2ogr will throw an error if it attempts to export polygons and multipolygons into the same table. Alternatively, you can use the flag -explodecollections (not recommended in this case) to translate your multipolygons into several polygons.
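If you would rather script the import than type it at the console, here is a sketch of assembling the same ogr2ogr call from Python. The function names and parameters are hypothetical; the flags are just the ones discussed above. Passing the arguments as a list means each value reaches ogr2ogr as one argument without any manual quoting.

```python
import subprocess


def build_ogr2ogr_command(shapefile, host, user, dbname, password, table_name):
    """Return the ogr2ogr argument list for loading a shapefile into PostgreSQL."""
    pg_connection = "PG:host=%s user=%s dbname=%s password=%s" % (host, user, dbname, password)
    return [
        "ogr2ogr",
        "-f", "PostgreSQL",
        pg_connection,
        shapefile,
        "-nln", table_name,   # name of the table to create
        "-nlt", "geometry",   # accept mixed Polygon/MultiPolygon features
    ]


def import_shapefile(shapefile, host, user, dbname, password, table_name):
    """Run the import; raises CalledProcessError if ogr2ogr exits with an error."""
    subprocess.check_call(
        build_ogr2ogr_command(shapefile, host, user, dbname, password, table_name)
    )


if __name__ == "__main__":
    print(build_ogr2ogr_command(
        r"E:\4_GIS\01_tutorials\usstates_nad83_aeac\usstates_albers.shp",
        "127.0.0.1", "youruser", "yourdb", "yourpass", "usstates",
    ))
```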

Assuming you used a script like the one I initially provided, you should be able to open pgAdmin III (the PostgreSQL admin GUI that installs with the database) and see your new usstates table:

Post-Processing your Geodata with SQL Instructions

With pgAdmin III open, expand the Tools menu and launch the Query tool. You’ll use the Query tool and the following SQL instructions to prep your data for production. I’ll start by listing all the queries together, then I’ll provide some deeper explanation in the text that follows.

SELECT SRID(wkb_geometry) FROM usstates;
SELECT * FROM spatial_ref_sys WHERE srid = 900925;
SELECT st_asText(wkb_geometry) FROM usstates;
ALTER TABLE usstates ADD COLUMN wgs84geom GEOMETRY;
UPDATE usstates SET wgs84geom = st_Transform(wkb_geometry, 4326); 
SELECT SRID(wgs84geom) FROM usstates;
SELECT st_asText(wgs84geom) FROM usstates;
VACUUM usstates;
CREATE INDEX usstates_wgs84_idx ON usstates USING GIST(wgs84geom);

Basically, steps 4, 5, 8, and 9 do the actual work, while the remaining steps are more for sanity checks. Steps 4 and 5 perform the geometry transformation, and the last two steps do some house-cleaning and table optimization. Now I’ll provide a one-by-one discussion of each step.
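
If you’d rather run these from a script than from the Query tool, the split between “work” and “check” steps can be sketched in Python like this. It’s only a sketch: run_steps and RecordingCursor are hypothetical names, and the stand-in cursor records the SQL instead of talking to a database. With a real driver such as psycopg2 you would pass a real cursor, and you would need autocommit switched on, because VACUUM cannot run inside a transaction block.

```python
# Each entry is (kind, SQL). "work" steps change the table; "check" steps only read.
STEPS = [
    ("check", "SELECT SRID(wkb_geometry) FROM usstates;"),
    ("check", "SELECT * FROM spatial_ref_sys WHERE srid = 900925;"),
    ("check", "SELECT st_asText(wkb_geometry) FROM usstates;"),
    ("work", "ALTER TABLE usstates ADD COLUMN wgs84geom GEOMETRY;"),
    ("work", "UPDATE usstates SET wgs84geom = st_Transform(wkb_geometry, 4326);"),
    ("check", "SELECT SRID(wgs84geom) FROM usstates;"),
    ("check", "SELECT st_asText(wgs84geom) FROM usstates;"),
    ("work", "VACUUM usstates;"),
    ("work", "CREATE INDEX usstates_wgs84_idx ON usstates USING GIST(wgs84geom);"),
]


def run_steps(cursor, steps, include_checks=True):
    """Execute the steps in order and return the step numbers that were run."""
    executed = []
    for number, (kind, sql) in enumerate(steps, start=1):
        if kind == "check" and not include_checks:
            continue
        cursor.execute(sql)
        executed.append(number)
    return executed


class RecordingCursor:
    """Stand-in cursor that records SQL instead of sending it anywhere."""

    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)


cursor = RecordingCursor()
print(run_steps(cursor, STEPS, include_checks=False))  # [4, 5, 8, 9]
```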

1) SELECT SRID(wkb_geometry) FROM usstates;

Here we’re getting the SRID for the features in this layer (which is 900925 on my system). By default, OGR will store feature geometries in a field called wkb_geometry. Also, your PostGIS installation includes a table named spatial_ref_sys that stores the coordinate system definitions the database needs to remain “spatially aware” of your new table as well as the other spatial datasets the system is managing. Consider this: if you want to select points from one layer that fall inside polygons from another layer, PostGIS needs to understand the coordinate systems for both datasets so that it can align their features for analysis. So when we run the SRID() function on the table’s geometry field, wkb_geometry, PostgreSQL will return the unique identifier for the coordinate system used to define the features in our table. (On PostGIS 2.0 and later the function is spelled ST_SRID(); the old SRID() name was removed.)

2) SELECT * FROM spatial_ref_sys WHERE srid = 900925;

In this step we answer the question, “Does the SRS established at import make sense for the data?” This statement queries the PostGIS spatial_ref_sys table for the coordinate system whose ID was returned in the previous step. Check the srtext field for a readable version of the coordinate system. Mine begins with “PROJCS[“North_America_Albers_Equal_Area_Coni..”  That’s what I expected, and that’s a good thing.
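
If you want to pull the readable name out of an srtext value programmatically, a small regular expression is enough. This is a sketch under assumptions: srs_name is a hypothetical helper, and the sample strings are shortened, illustrative definitions rather than rows copied from my spatial_ref_sys table.

```python
import re


def srs_name(srtext):
    """Return (keyword, name) from the start of a WKT definition, or None."""
    match = re.match(r'\s*(\w+)\["([^"]+)"', srtext)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Shortened, illustrative definitions (not copied from a real table).
geographic = 'GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563]]]'
projected = 'PROJCS["Example_Albers",GEOGCS["Example_Geographic"]]'

print(srs_name(geographic))
print(srs_name(projected))
```

A keyword of PROJCS tells you the system is projected, while GEOGCS means geographic coordinates.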

3) SELECT st_asText(wkb_geometry) FROM usstates;

Now I like to do a quick query to see the WKT for some of my features. The geometry field wkb_geometry was created by ogr2ogr when it imported the shapefile into PostgreSQL. If you don’t like this name, you can use the creation option -lco GEOMETRY_NAME=geom in your ogr2ogr import script to set the name of the geometry field at import time. As shown in the image, the WKT for my features looks like I would expect.

Querying feature geometries using the ST_asText() function on the geometry column, wkb_geometry.

4) ALTER TABLE usstates ADD COLUMN wgs84geom GEOMETRY;

This instruction adds a new column to the table, which I’ll use to store the feature geometries for my US States in geodetic coordinates. The column will be named wgs84geom and will expect data of type GEOMETRY. In other words, this field will store a permanent “cast” of our feature geometries in the WGS84 coordinate system, which is very popular due to its use by the famous Global Positioning System (GPS).

Note: GIS coordinate systems are complex beasts, and it’s easy to get lost in their particulars. Nevertheless, one distinction is very important, and that’s the difference between projected systems and geodetic systems. Projected systems are two-dimensional; these are the X/Y grids you used for trigonometry exercises in high school. On the other hand, geodetic systems define coordinate geometries within a three-dimensional, spherical space.

Both systems are roses by many names. For instance, projected systems may be called “Cartesian” or “Geometric”, and geodetic systems may be called “Sexagesimal” or “Geographic”. The PostGIS community more often refers to features as being geometry or geography data, meaning projected vs. geodetic coordinates, respectively. So if you run across language like this, realize people intend for geometry to imply X and Y coordinates in a Cartesian space, and for geography to mean familiar longitude and latitude coordinates.

In this image, state boundaries are drawn in Albers Equal Area Conic, which is a projected coordinate system.

 

Here the state boundaries are "represented" in NAD83 geographic coordinates. NAD83 geographic (EPSG 4269) is similar, if not nearly identical, to WGS84. As you can see, geodetic systems are difficult to represent in 2D space; as such, unprojected maps of large areas tend to look "suspect".
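
One way to put a number on that “suspect” look: when longitude and latitude are plotted as plain X and Y, a degree of longitude is drawn the same width everywhere, yet on the ground it shrinks with the cosine of the latitude. Here’s a rough sketch that assumes a perfectly spherical Earth with a mean radius of 6371 km; km_per_degree_longitude is a hypothetical helper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical approximation


def km_per_degree_longitude(latitude_deg):
    """Approximate ground width of one degree of longitude at a latitude."""
    return math.radians(1.0) * EARTH_RADIUS_KM * math.cos(math.radians(latitude_deg))


# Roughly the southern and northern extent of the lower 48, plus the equator.
for lat in (0, 25, 49):
    print(lat, round(km_per_degree_longitude(lat), 1))
```

The farther north you go, the more an unprojected map stretches features from east to west.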

5) UPDATE usstates SET wgs84geom = st_Transform(wkb_geometry, 4326); 

With the new column ready to go, you can now wield an UPDATE statement and the st_Transform() function to translate feature geometries from their projected coordinates to their geographic WGS84 coordinates. The st_Transform() function expects two arguments: the source geometry field to transform, and the EPSG code for the output coordinate system. WGS84 is an extremely popular coordinate system, and its EPSG code of 4326 is easy to find. If you do not know the EPSG code for your preferred coordinate system, head over to http://spatialreference.org/ and do some quick research.

Note: We could optionally perform any coordinate system transformations in our queries by calling st_Transform on the geometry field right in the query. However, by casting our feature geometries in advance, we remove calculation overhead and get a subtle efficiency gain. This can particularly improve response times for spatial queries.
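
To give a feel for the per-vertex math a transformation like this involves, here’s a rough sketch of an Albers Equal Area Conic forward and inverse calculation in plain Python. It makes loud assumptions: it uses the simpler spherical formulas, whereas PostGIS works with the ellipsoid, and the standard parallels, origin, and central meridian are values I picked for illustration, not ones read from SRID 900925. Treat it as a demonstration, not a substitute for st_Transform().

```python
import math

R = 6371000.0  # sphere radius in metres (approximation; NAD83 uses an ellipsoid)

# Assumed projection parameters in degrees; NOT read from SRID 900925.
LAT_1, LAT_2 = 29.5, 45.5   # standard parallels
LAT_0, LON_0 = 23.0, -96.0  # latitude of origin, central meridian

_n = (math.sin(math.radians(LAT_1)) + math.sin(math.radians(LAT_2))) / 2.0
_c = math.cos(math.radians(LAT_1)) ** 2 + 2.0 * _n * math.sin(math.radians(LAT_1))


def _rho(lat_rad):
    """Distance from the cone's apex to a parallel, in metres."""
    return R * math.sqrt(_c - 2.0 * _n * math.sin(lat_rad)) / _n


_rho0 = _rho(math.radians(LAT_0))


def albers_forward(lon, lat):
    """Longitude/latitude in degrees -> projected x/y in metres."""
    theta = _n * math.radians(lon - LON_0)
    rho = _rho(math.radians(lat))
    return rho * math.sin(theta), _rho0 - rho * math.cos(theta)


def albers_inverse(x, y):
    """Projected x/y in metres -> longitude/latitude in degrees."""
    rho = math.hypot(x, _rho0 - y)
    theta = math.atan2(x, _rho0 - y)
    lat = math.asin((_c - (rho * _n / R) ** 2) / (2.0 * _n))
    lon = LON_0 + math.degrees(theta / _n)
    return lon, math.degrees(lat)


x, y = albers_forward(-85.0, 44.5)
print(x, y)
print(albers_inverse(x, y))
```

Every vertex of every polygon goes through a calculation like albers_inverse, which is why doing it once and storing the result beats repeating it in each query.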

6) SELECT SRID(wgs84geom) FROM usstates;

Like the first step, this query returns an SRID, this time for the new wgs84geom column. It’s only intended to confirm that the SRID established in the previous step makes sense for the data. It should return 4326.

7) SELECT st_asText(wgs84geom) FROM usstates;

Also like the third step, here I’m querying the feature geometries as WKT to make sure they’re defined by longitude/latitude coordinate pairs.

Well Known Text (WKT) for features transformed into the WGS84 coordinate system.
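
If eyeballing the WKT gets tedious, the same sanity check can be roughed out in Python: pull the coordinate pairs out of a WKT string and confirm they fall within valid longitude and latitude ranges. This is a heuristic sketch only. looks_like_lonlat is a hypothetical helper, the sample geometries are made up, it doesn’t handle scientific notation, and projected coordinates that happen to sit near the origin would slip through.

```python
import re

NUMBER = r"-?\d+(?:\.\d+)?"
PAIR = re.compile(rf"({NUMBER})\s+({NUMBER})")


def looks_like_lonlat(wkt):
    """True if every X/Y pair in the WKT fits longitude/latitude ranges."""
    pairs = [(float(x), float(y)) for x, y in PAIR.findall(wkt)]
    return bool(pairs) and all(
        -180 <= x <= 180 and -90 <= y <= 90 for x, y in pairs
    )


# Made-up sample geometries for illustration.
geographic = "POLYGON ((-83.5 44.2, -83.1 44.9, -82.9 44.0, -83.5 44.2))"
projected = "POLYGON ((-1148108.01 649421.31, -1148000.0 649500.0, -1148108.01 649421.31))"

print(looks_like_lonlat(geographic))
print(looks_like_lonlat(projected))
```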

8) VACUUM usstates;

Once you’re finished with the geometry transformation, call a VACUUM instruction for the usstates table. PostgreSQL likes to have a deep knowledge of its feature tables so that it can optimize queries. To this end, the VACUUM command cleans up the dead row versions left behind by the UPDATE, and its VACUUM ANALYZE form also instructs PostgreSQL to “gather fresh intel” on your table so that it can make better decisions. This step is particularly necessary for tables with a large number of features as well as tables experiencing a lot of maintenance (i.e. frequent feature INSERT and UPDATE activity).

9) CREATE INDEX usstates_wgs84_idx ON usstates USING GIST(wgs84geom);

Finally, if you intend to perform queries on this table, particularly spatial intersection queries on the new geometry column, you’ll want to create a spatial index referencing that column. Here, usstates_wgs84_idx is just a naming convention that implies TableName_FieldName_ThisIsAnIndex. To create the index, add a USING GIST() clause to the CREATE INDEX statement and pass in the table column you intend to search on, for instance wgs84geom.
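
Why does a spatial index help? A GiST index on a geometry column works with each feature’s bounding box, so most features can be ruled out with a few cheap comparisons before any exact geometry test runs. The sketch below shows only that bounding-box test, using made-up outlines and hypothetical helper names. A real index also arranges the boxes in a tree so it doesn’t have to visit every row, which this linear scan does not attempt.

```python
def bbox(ring):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) points."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def bboxes_overlap(a, b):
    """True if two (min_x, min_y, max_x, max_y) boxes touch or overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Made-up rectangular outlines in longitude/latitude.
states = {
    "A": [(-90, 40), (-85, 40), (-85, 45), (-90, 45), (-90, 40)],
    "B": [(-100, 30), (-95, 30), (-95, 35), (-100, 35), (-100, 30)],
}

query_box = (-88, 41, -87, 42)

# Only features whose box overlaps the query box need an exact geometry test.
candidates = [
    name for name, ring in states.items()
    if bboxes_overlap(bbox(ring), query_box)
]
print(candidates)  # ['A']
```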

After following along with this post, you should have learned how to 1) use ogr2ogr to populate a PostgreSQL database with shapefile data, 2) leverage PostGIS functions to perform a coordinate system transformation in the database, and 3) apply PostgreSQL optimization functions to optimize the table for production use.

I hope you found this beneficial. Thanks for reading.

/Elijah

Written by elrobis

November 19th, 2011 at 5:55 pm

Posted in GDAL/OGR
