Elijah Robison | GIS Blog

A scrapbook of GIS tricks, with emphasis on FOSS4G.

Archive for the ‘GDAL/OGR’ Category

Terminal one-liner to create shapefile from WKT using ogr2ogr


Fair warning, this is a Linux-themed solution. But I expect it could be ported to Windows without too much trouble.

Today I needed to create a shapefile containing a single point geometry to use as input for a random GIS utility executable, which required an actual shapefile rather than an array of coordinate pairs, a single coordinate pair, or even separate x=/y= input parameters for one coordinate pair. I already had the point coordinate I wanted to use, and I didn’t want to go to the trouble of making this shapefile by hand! So I got to wondering if I could use ogr2ogr to render a simple shapefile from the terminal.

Fortunately, with the help of some utilities built into my Ubuntu shell, I was able to come up with the following solution. This compound command 1) creates an empty file (dataset.csv), 2) pushes two features into it, and 3) uses ogr2ogr to translate the CSV into a shapefile with its coordinate system assigned accordingly (in this case, EPSG:4326).

Here’s the full command. Note that double ampersands (&&) are used to daisy-chain the individual steps together into a single instruction:

touch dataset.csv && printf "gid,WKT,some_value\n1,POINT(-82.048051 33.567181),Alpha\n2,POINT(-92.7774 35.9829),Bravo\n" > dataset.csv && ogr2ogr -f "ESRI Shapefile" dataset.shp -dialect sqlite -sql "SELECT gid, GeomFromText(WKT), some_value FROM dataset" dataset.csv -a_srs EPSG:4326

Here’s a breakdown of what happens..

touch dataset.csv
  • Create a file (in the current directory) named “dataset.csv”

printf "gid,WKT,some_value\n1,POINT(-82.048051 33.567181),Alpha\n2,POINT(-92.7774 35.9829),Bravo\n" > dataset.csv
  • Push a string through standard output (STDOUT) and INTO (>) the new file “dataset.csv”
  • The string imposes line breaks using the “\n” sequence, producing CSV content of three lines
  • The three lines of content include: a column header, feature 1, and feature 2

ogr2ogr -f "ESRI Shapefile" dataset.shp -dialect sqlite -sql "SELECT gid, GeomFromText(WKT), some_value FROM dataset" dataset.csv -a_srs EPSG:4326
  • The ogr2ogr command exports a shapefile, “dataset.shp”
  • The input file, “dataset.csv”, is included toward the end, along with the -a_srs declaration that assigns the source data’s CRS (EPSG:4326/WGS84) to the output
  • ogr2ogr is instructed to use the SQLite dialect of SQL to pull data from the target dataset
  • The SQL command makes sure ogr2ogr knows to identify the WKT column as the source data’s geometry field
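
If you want a quick sanity check on the result, ogrinfo (installed alongside ogr2ogr) can summarize the new shapefile. This is optional, but it’s a good habit:

ogrinfo -al -geom=SUMMARY dataset.shp

That should report both features, their attributes, and a summarized POINT geometry for each.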

It runs pretty quickly and does exactly what I wanted: crank out a shapefile whenever all I have is a point or a polygon or something already in WKT format. I’m sure I’ll be using this again, so I wanted to document it here for easy reference later. Hopefully someone else finds it useful!

Written by elrobis

July 16th, 2020 at 5:04 pm

Create UTFGrid Tiles from PostGIS Tables


I assume you’re where I was about a week ago. That is, you’ve heard of UTFGrid, and now you want to render your own UTFGrid tiles. Perhaps you found yourself looking at this thread over at GIS.SE, but you don’t want to jump into TileStache just now. Anyway, if you’ve got a working installation of GDAL/OGR and Mapnik 2+, complete with Python bindings, I’ll show you what worked for me..

Because this is merely an adaptation of Matthew Perry’s original solution, I highly recommend considering his original blog entry on the topic, complete with discussion points and caveats, before proceeding further!

Once you’re ready, head over to GitHub and download some code, specifically globalmaptiles.py, create_tile_shp.py, and create_utfgrids.py. [1] You don’t actually need create_tile_shp.py, but I snagged all three. To keep things simple, put these files in the same folder.

Next, in that same directory, create a new Python file; I called mine createUtfgridsFromPG.py.

Where create_utfgrids.py works entirely on shapefiles, createUtfgridsFromPG.py accepts an OGR PostgreSQL connection string in place of a shapefile path. Fortunately the original OGR code didn’t change, but to use the Mapnik PostGIS driver I had to iterate over the PostgreSQL connection string and store the connection parameters so I could supply them differently to Mapnik.

Finally, copy the following code and paste it into your new file. It’s basically the same as create_utfgrids.py, but I changed shppath to pgconn and added some code to use a mapnik.PostGIS() datasource in place of a mapnik.Shapefile() datasource.

If you’re curious, the comment # ELR 2014.9.26: flags the few places I changed the original code.

createUtfgridsFromPG.py

#!/usr/bin/env python
# -*- coding: utf-8  -*-
"""
createUtfgridsFromPG.py
Adapted from create_utfgrids.py by Matthew Perry
License: BSD

Creates utfgrid .json tiles for the given polygon table in PostGIS

Thx to Dane Springmeyer for the utfgrid spec and mapnik rendering code
and to  Klokan Petr Přidal for his MapTiler code
(http://www.maptiler.org/google-maps-coordinates-tile-bounds-projection/)

"""
import globalmaptiles
import mapnik
import ogr
import os
from optparse import OptionParser, OptionError
try:
    import simplejson as json
except ImportError:
    import json

def create_utfgrids(pgconn, minzoom, maxzoom, outdir, fields=None, layernum=0):

    # ELR 2014.9.26:
    # Original implementation pushed in a shapefile path.
    #ds = ogr.Open(shppath)
    ds = ogr.Open(pgconn)

    # ELR 2014.9.26:
    # Iterate over the PostgreSQL connection string and pull out values we need
    # to use Mapnik's PostGIS datasource constructor.
    pgConnARR = pgconn[3:].split(' ')
    for kvPair in pgConnARR:
        if kvPair.split('=')[0] == "host":
            nikHost = kvPair.split('=')[1]
        if kvPair.split('=')[0] == "port":
            nikPort = kvPair.split('=')[1]
        if kvPair.split('=')[0] == "user":
            nikUser = kvPair.split('=')[1]
        if kvPair.split('=')[0] == "password":
            nikPass = kvPair.split('=')[1]
        if kvPair.split('=')[0] == "dbname":
            nikDB = kvPair.split('=')[1]
        if kvPair.split('=')[0] == "tables":
            nikTable = kvPair.split('=')[1]

    print
    print "WARNING:"
    print " This script assumes a polygon shapefile in spherical mercator projection."
    print " If any of these assumptions are not true, don't count on the results!"
    # TODO confirm polygons
    # TODO confirm mercator
    # TODO get layernum from command line
    layer = ds.GetLayer(layernum)
    bbox = layer.GetExtent()
    print ""
    print str(bbox)

    mercator = globalmaptiles.GlobalMercator()

    m = mapnik.Map(256,256)

    # Since grids are `rendered` they need a style
    s = mapnik.Style()
    r = mapnik.Rule()
    polygon_symbolizer = mapnik.PolygonSymbolizer(mapnik.Color('#f2eff9'))
    r.symbols.append(polygon_symbolizer)
    line_symbolizer = mapnik.LineSymbolizer(mapnik.Color('rgb(50%,50%,50%)'),0.1)
    r.symbols.append(line_symbolizer)
    s.rules.append(r)
    m.append_style('My Style',s)

    print ""
    # ELR 2014.9.26:
    # Original implementation using shapefile..
    #ds = mapnik.Shapefile(file=shppath)

    # ELR 2014.9.26:
    # Parameterized PostGIS implementation..
    ds = mapnik.PostGIS(host=nikHost,port=nikPort,user=nikUser,password=nikPass,dbname=nikDB,table=nikTable)

    mlayer = mapnik.Layer('poly')
    mlayer.datasource = ds
    mlayer.styles.append('My Style')
    m.layers.append(mlayer)

    print ""
    if fields is None:
        fields = mlayer.datasource.fields()
        print "Fields were NONE. Using.."
        print fields
    else:
        print "Fields are USER PROVIDED. Using.."
        print fields
    print ""

    for tz in range(minzoom, maxzoom+1):
        print " * Processing Zoom Level %s" % tz
        tminx, tminy = mercator.MetersToTile( bbox[0], bbox[2], tz)
        tmaxx, tmaxy = mercator.MetersToTile( bbox[1], bbox[3], tz)
        for ty in range(tminy, tmaxy+1):
            for tx in range(tminx, tmaxx+1):
                output = os.path.join(outdir, str(tz), str(tx))
                if not os.path.exists(output):
                    os.makedirs(output)

                # Use top origin tile scheme (like OSM or GMaps)
                # TODO support option for TMS bottom origin scheme (ie opt to not invert)
                ymax = 1 << tz;
                invert_ty = ymax - ty - 1;

                tilefilename = os.path.join(output, "%s.json" % invert_ty) # ty for TMS bottom origin
                tilebounds = mercator.TileBounds( tx, ty, tz)
                #print tilefilename, tilebounds

                box = mapnik.Box2d(*tilebounds)
                m.zoom_to_box(box)
                grid = mapnik.Grid(m.width,m.height)
                mapnik.render_layer(m,grid,layer=0,fields=fields)
                utfgrid = grid.encode('utf',resolution=4)
                with open(tilefilename, 'w') as file:
                    file.write(json.dumps(utfgrid))

if __name__ == "__main__":
    usage = "usage: %prog [options] shapefile minzoom maxzoom output_directory"
    parser = OptionParser(usage)
    parser.add_option("-f", '--fields', dest="fields", help="Comma-seperated list of fields; default is all")
    (options, args) = parser.parse_args()

    if len(args) != 4:
        parser.error("Incorrect number of arguments")

    pgconn = args[0]
    minzoom, maxzoom = int(args[1]), int(args[2])
    outdir = args[3]

    if os.path.exists(outdir):
        parser.error("output directory exists already")

    if options.fields:
        fields = options.fields.split(",")
    else:
        fields = None

    create_utfgrids(pgconn, minzoom, maxzoom, outdir, fields)

Usage..

Once you’ve prepared createUtfgridsFromPG.py, you can call it from the command line like this..

C:\xDev\utfgrids\createUtfgridsFromPG.py "PG:host=127.0.0.1 port=5432 user=postgres dbname=gis password=passw0rd tables=parcels_pmerc" 12 16 "C:/xGIS/tiles/utf" -f tms,owner_name

  • Hopefully the PostgreSQL connection string ("PG:host=..") makes sense.
  • 12 and 16 represent the minimum and maximum zoom levels to be rendered, respectively.
  • The directory “C:/xGIS/tiles/utf” is where your UTFGrid tiles will be saved.
  • And -f tms,owner_name represents a comma-separated list of data fields you want in your UTFGrid.
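
One note on invoking the script: depending on your file associations, Windows may not execute the .py directly. If so, front the same arguments with the Python interpreter, like this:

python createUtfgridsFromPG.py "PG:host=127.0.0.1 port=5432 user=postgres dbname=gis password=passw0rd tables=parcels_pmerc" 12 16 "C:/xGIS/tiles/utf" -f tms,owner_name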

Caveats..

  • Both create_utfgrids.py and createUtfgridsFromPG.py require your geodata table to be in a Web Mercator projection (EPSG:3857)!
  • The script assumes a top-origin tile scheme, like OSM and others.
  • The script will only work with polygons.
  • While the OGR PostgreSQL connection string has a tables parameter, this implementation will only accept one table.
  • The script will create your target directory, in the example case, utf, and it will throw an error if you create this directory in advance.

[1] Many thanks to Matthew Perry, Klokan Petr Přidal, and Dane Springmeyer for their collective efforts and for sharing their work.

Written by elrobis

September 26th, 2014 at 1:26 pm

OGR VRT: Connect to PostGIS DataSource


I needed an OGR VRT for something and didn’t find a clear example on the web all in one place, so here goes.

Somewhere on your system, create a new file with a .ovf extension. Inside that file, add some XML like the following to define your PostgreSQL connection:

The name=”WKTGrid” value has no special meaning here; I had been experimenting with including WKT geometry data in UTFGrid tiles, and the name is a relic of those experiments. You can provide most any value for name, but do note that the layer name is referenced in the ogrinfo command further down.

<OGRVRTDataSource>
  <OGRVRTLayer name="WKTGrid">
    <SrcDataSource>PG:host=127.0.0.1 user=postgres dbname=gis password=l00per</SrcDataSource>
    <SrcLayer>parcels_cama_20140829_pmerc</SrcLayer>
    <SrcSQL>SELECT tms, owner_name, the_wkt FROM parcels_cama_20140829_pmerc</SrcSQL>
  </OGRVRTLayer>
</OGRVRTDataSource>

  • OGRVRTLayer The layer name attribute you assign can be anything you want.
  • SrcDataSource The data source value defines your PostgreSQL connection parameters.
  • SrcLayer The source layer identifies the target table in your PostgreSQL instance.
  • SrcSQL [Optional] OGR SQL can be used to target specific fields, define field aliases, and even refine the data set using WHERE clauses, etc.

After you make a VRT, it’s smart to test it in ogrinfo before you use it for anything serious; if ogrinfo makes sense of it, then you know you’ve got a good VRT.

A command like this uses ogrinfo and OGR SQL to open the VRT and isolate one feature, showing you its attributes.

ogrinfo C:\xGIS\Vector\parcels\parcels_cama_20140829_pmerc.ovf -sql " SELECT tms, owner_name, the_wkt FROM WKTGrid WHERE tms = 'R39200-02-31' "
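
Once ogrinfo blesses the VRT, you can hand it to ogr2ogr like any other datasource. For example, this sketch (the output filename is hypothetical) materializes the layer as a shapefile:

ogr2ogr -f "ESRI Shapefile" parcels_wkt.shp C:\xGIS\Vector\parcels\parcels_cama_20140829_pmerc.ovf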

In some cases, OGR may have trouble identifying your geometry field, or you may have multiple geometry fields and want to specify one field in particular. If so, note the following changes, specifically the modification to the SrcSQL node and the added GeometryField node.

<OGRVRTDataSource>
  <OGRVRTLayer name="WKTGrid">
    <SrcDataSource>PG:host=127.0.0.1 user=postgres dbname=gis password=l00per</SrcDataSource>
    <SrcLayer>parcels_cama_20140829_pmerc</SrcLayer>
    <SrcSQL>SELECT ST_AsBinary(wkb_geometry) as geomm, tms, owner_name FROM parcels_cama_20140829_pmerc</SrcSQL>
    <GeometryField encoding="WKB" field="geomm"></GeometryField>
  </OGRVRTLayer>
</OGRVRTDataSource>
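
The same ogrinfo smoke test works for this variant, too; just remember the layer now exposes geomm, tms, and owner_name, so a quick check might look like:

ogrinfo C:\xGIS\Vector\parcels\parcels_cama_20140829_pmerc.ovf -sql " SELECT tms, owner_name FROM WKTGrid WHERE tms = 'R39200-02-31' "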

And this is just scratching the surface. Make sure to check out the OGR VRT driver page for a complete list of options available to you.

Written by elrobis

September 24th, 2014 at 4:45 pm

Prepare a Shapefile for OpenScales using ogr2ogr and PostgreSQL


This post explains how to import GIS data (a shapefile, in this case) into a database (PostgreSQL) so it can be consumed by most any mapping API. I have OpenScales in mind, but this approach will support any mapping app with functions for rendering feature overlays using geodetic coordinates (i.e. longitude and latitude). In many cases, you’ll need to translate your feature data out of a projected/cartesian system and into a geodetic/spherical system, so I’ll include a demonstration of that.

Quick and Dirty Summary

This approach has two parts. First, we’ll use GDAL’s ogr2ogr utility to import a shapefile into our database. Second, we’ll use a few SQL commands to translate our data from a projected to a geodetic system, as well as optimize the table for fast query speeds.

Prerequisites

The following prerequisites will need to be met in order to follow along:

1) GDAL is installed. If you need to install GDAL, check out my earlier post titled Install GDAL on Windows. Alternatively, you could install FWTools, which is admittedly easier, but that package is no longer maintained and it’s becoming out-of-date as GDAL/OGR continues to evolve.

2) PostgreSQL is installed, and the PostGIS extension is enabled. If you need to install PostgreSQL and PostGIS, check out the tutorial at Boston GIS demonstrating how to acquire and install PostgreSQL with PostGIS. It won’t hurt to review their entire tutorial, but I deviate from their approach once installation is complete (look for their sub-heading, Loading GIS Data Into the Database).

3) You have some geodata. I think the typical reader will have their own shapefile, personal geodatabase, or otherwise, but if you need something to follow along, here’s a US States shapefile projected to NAD83 Albers Equal Area Conic:

http://www.cartometric.com/blog/wp-content/uploads/data/usstates_nad83_aeac.zip

Loading Geodata into PostgreSQL / PostGIS

To push shapefile data into your geodatabase, you can run an ogr2ogr script like this:

ogr2ogr -f "PostgreSQL" PG:"host=127.0.0.1 user=youruser dbname=yourdb password=yourpass" "E:\4_GIS\01_tutorials\usstates_nad83_aeac\usstates_albers.shp" -nln usstates -nlt geometry

For deeper reading on ogr2ogr utility flags (like -nln and -nlt), check out the usage notes for ogr2ogr. Also, it may be worth your while to peruse the OGR PostgreSQL driver page, as well as the Advanced Driver Information page. In the meantime, here are a few quick notes regarding my script:

-f "PostgreSQL" PG:"host=127.0.0.1 user=youruser dbname=yourdb password=yourpass"  This tells OGR you’re exporting to PostgreSQL with the accompanying connection string. Notice that my connection string is wrapped in double-quotes (").

"E:\4_GIS\01_tutorials\usstates_nad83_aeac\usstates_albers.shp"   This is the path to my shapefile input data. Once again, I wrapped this value in double-quotes (") to prevent the console from introducing linebreaks into the argument value and confusing the parser.

-nln usstates  The -nln flag means “rename the table on export”. In other words, my PostgreSQL db will get a new table named usstates, and not one named usstates_albers.

-nlt geometry  This one’s particularly important for polygon data. It tells OGR to “accept any geometry you encounter and store it in the feature’s geometry column”. Oftentimes, a polygon dataset will have polygons and multipolygons in the same table. For example, here’s a narrow column of Well Known Text (WKT) geometries from the Albers shapefile so you can see what I mean:

WKT;STATE_NAME;STATE_FIPS;STATE_ABBR…..
MULTIPOLYGON (((-1827806.2165497246 1227…..
POLYGON ((-1148108.0134327484 649421.311…..
MULTIPOLYGON (((1949555.0881544715 75264…..
POLYGON ((-199237.01920416657 704421.540…..
POLYGON ((-519870.38897088548 372373.616…..

If you run the ogr2ogr script noted above without -nlt geometry, you’ll get an error like this:

ERROR 1: Terminating translation prematurely after failed
translation of layer usstates_albers (use -skipfailures to skip errors)

By default, OGR refuses to mix geometry types in a table, so -nlt geometry allows you to duck that requirement and store both Polygon and Multipolygon features in the same table. You could optionally instruct OGR to “explode” Multipolygons into individual Polygons using the -explodecollections flag, as depicted in the following screenshot, but I don’t recommend that solution for the intended use case. For example, if a map user clicks on Michigan’s Upper Peninsula, I want the whole state to be selected, not just the UP. I’m not saying you can’t make that happen after exploding multifeatures; rather, it’s just not the approach I favor.

Without -nlt geometry, ogr2ogr will throw an error if it attempts to export polygons and multipolygons into the same table. Alternatively, you can use the flag -explodecollections (not recommended in this case) to translate your multipolygons into several polygons.

Assuming you used a script like the one I initially provided, you should be able to open pgAdmin III (the PostgreSQL admin GUI that installs with the database) and see your new usstates table.
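
If you’d rather stay in the terminal than launch pgAdmin III, a quick psql check can confirm the new table exists. Here’s a sketch, assuming the same connection parameters as the import script:

psql -h 127.0.0.1 -U youruser -d yourdb -c "\dt usstates"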

Post-Processing your Geodata with SQL Instructions

With pgAdmin III open, expand the Tools menu and launch the Query tool. You’ll use the Query tool and the following SQL instructions to prep your data for production. I’ll start by listing all the queries together, then I’ll provide some deeper explanation in the text that follows.

SELECT SRID(wkb_geometry) FROM usstates;
SELECT * FROM spatial_ref_sys WHERE srid = 900925;
SELECT st_asText(wkb_geometry) FROM usstates;
ALTER TABLE usstates ADD COLUMN wgs84geom GEOMETRY;
UPDATE usstates SET wgs84geom = st_Transform(wkb_geometry, 4326); 
SELECT SRID(wgs84geom) FROM usstates;
SELECT st_asText(wgs84geom) FROM usstates;
VACUUM usstates;
CREATE INDEX usstates_wgs84_idx ON usstates USING GIST(wgs84geom);

Basically, steps 4 and 5 do the actual work of the geometry transformation, the last two steps perform some house-cleaning and table optimization, and the remaining steps are more for sanity checks. Now I’ll provide a one-by-one discussion of each step.

1) SELECT SRID(wkb_geometry) FROM usstates;

Here we’re getting the SRID for the features in this layer (which is 900925 on my system). By default, OGR will store feature geometries in a field called wkb_geometry. Also, your PostGIS installation includes a table named spatial_ref_sys that stores the coordinate system definitions necessary for the database to remain “spatially aware” of your new table, as well as the other spatial datasets the system is managing. Consider this: if you want to select points from one layer that fall inside polygons from another layer, PostGIS needs to understand the coordinate systems for both datasets so that it can align their features for analysis. So when we run the SRID() function on the table’s geometry field, wkb_geometry, PostgreSQL will return the unique identifier for the coordinate system used to define the features in our table.

2) SELECT * FROM spatial_ref_sys WHERE srid = 900925;

In this step we answer the question, “Does the SRS established at import make sense for the data?” This statement queries the PostGIS spatial_ref_sys table for the coordinate system whose ID is referenced in the previous step. Check the srtext field for a readable version of the coordinate system. Mine begins with “PROJCS[“North_America_Albers_Equal_Area_Coni..”  That’s what I expected, and that’s a good thing.

3) SELECT st_asText(wkb_geometry) FROM usstates;

Now I like to do a quick query to see the WKT for some of my features. The geometry field wkb_geometry was created by ogr2ogr when it imported the shapefile into PostgreSQL. If you don’t like this name, you can use the creation option -lco GEOMETRY_NAME=geom in your ogr2ogr import script to set the name of the geometry field at import time. As shown in the image, the WKT for my features looks like I would expect.

Querying feature geometries using the ST_asText() function on the geometry column, wkb_geometry.

4) ALTER TABLE usstates ADD COLUMN wgs84geom GEOMETRY;

This instruction adds a new column to the table, which I’ll use to store the feature geometries for my US States in geodetic coordinates. The column will be named wgs84geom and will expect data of type GEOMETRY. In other words, this field will store a permanent “cast” of our feature geometries in the WGS84 coordinate system, which is very popular due to its use by the famous Global Positioning System (GPS).

Note: GIS coordinate systems are complex beasts, and it’s easy to get lost in their particulars. Nevertheless, one distinction is very important, and that’s the difference between projected systems and geodetic systems. Projected systems are two-dimensional; these are the X/Y grids you used for trigonometry exercises in high school. On the other hand, geodetic systems define coordinate geometries within a three-dimensional, spherical space.

Both systems are roses by many names. For instance, projected systems may be called “Cartesian” or “geometric”, and geodetic systems may be called “sexagesimal” or “geographic”. The PostGIS community will often refer to features as geometry or geography data, meaning projected vs. geodetic coordinates, respectively. So if you run across language like this, realize people intend geometry to imply X and Y coordinates in a Cartesian space, and geography to mean familiar longitude and latitude coordinates.

In this image, state boundaries are drawn in Albers Equal Area Conic, which is a projected coordinate system.

 

Here the state boundaries are "represented" in NAD83 geographic coordinates. NAD83 geographic (EPSG 4269), is similar, if not nearly identical, to WGS84. As you can see, geodetic systems are difficult to represent in 2D space; as such, unprojected maps of large areas tend to look "suspect".

5) UPDATE usstates SET wgs84geom = st_Transform(wkb_geometry, 4326); 

With the new column ready to go, you can now wield an UPDATE statement and the st_Transform() function to translate feature geometries from their projected coordinates to their geographic WGS84 coordinates. The st_Transform() function expects two arguments: the source geometry field to transform, and the EPSG code for the output coordinate system. WGS84 is a fundamentally popular coordinate system, and its EPSG code of 4326 is easy to find. If you do not know the EPSG code for your preferred coordinate system, head over to http://spatialreference.org/ and do some quick research.

Note: We could optionally perform any coordinate system transformations in our queries by calling st_Transform on the geometry field right in the query. However, by casting our feature geometries in advance, we remove calculation overhead and get a subtle efficiency gain. This can particularly improve response times for spatial queries.
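
For comparison, the on-the-fly approach would look something like this sketch, run through psql with the same connection parameters as before:

psql -h 127.0.0.1 -U youruser -d yourdb -c "SELECT st_asText(st_Transform(wkb_geometry, 4326)) FROM usstates LIMIT 1;"

It returns the same WGS84 WKT, but PostGIS has to reproject the geometry every time the query runs.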

6) SELECT SRID(wgs84geom) FROM usstates;

Like the second step, this query is only intended to confirm whether the SRID established in the previous step makes sense for the data. It should return 4326.

7) SELECT st_asText(wgs84geom) FROM usstates;

Also like the third step, here I’m querying the feature geometries as WKT to make sure they’re defined by longitude/latitude coordinate pairs.

Well Known Text (WKT) for features transformed into the WGS84 coordinate system.

8) VACUUM usstates;

Once you’re finished with the geometry transformation, call a VACUUM instruction for the usstates table. PostgreSQL likes to have a deep knowledge of its feature tables so that it can optimize queries. To this end, the VACUUM command instructs PostgreSQL to “gather fresh intel” on your table so that it can make better decisions (strictly speaking, the VACUUM ANALYZE variant is what refreshes the planner’s statistics). This step is particularly necessary for tables with a large number of features, as well as tables experiencing a lot of maintenance (i.e. frequent feature INSERT and UPDATE activity).

9) CREATE INDEX usstates_wgs84_idx ON usstates USING GIST(wgs84geom);

Finally, if you intend to perform queries on this table, particularly spatial intersection queries on the new geometry column, you’ll want to create a spatial index referencing that column. Here, usstates_wgs84_idx is just a naming convention that implies TableName_FieldName_ThisIsAnIndex. To create the index, declare USING GIST() on the table and pass in the column you intend to search on, for instance, wgs84geom.
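
To see the new column and index earn their keep, try a point-in-polygon query like this sketch (the coordinates are hypothetical, so swap in a point of your own):

psql -h 127.0.0.1 -U youruser -d yourdb -c "SELECT state_name FROM usstates WHERE ST_Intersects(wgs84geom, ST_SetSRID(ST_MakePoint(-92.33, 38.95), 4326));"

Because the WHERE clause targets wgs84geom, PostgreSQL can lean on usstates_wgs84_idx rather than scanning every state polygon.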

After following along with this post, you should have learned how to 1) use ogr2ogr to populate a PostgreSQL database with shapefile data, 2) leverage PostGIS functions to perform a coordinate system transformation in the database, and 3) apply PostgreSQL maintenance commands to optimize the table for production use.

I hope you found this beneficial. Thanks for reading.

/Elijah

Written by elrobis

November 19th, 2011 at 5:55 pm

Posted in GDAL/OGR


ogr2ogr: Export Well Known Text (WKT) for one feature to a CSV file


Perhaps you’re looking for this?

ogr2ogr -f "CSV" "E:\4_GIS\NorthArkCartoData\UnitedStates\MO_wkt" "E:\4_GIS\NorthArkCartoData\UnitedStates\USStates.shp" -sql " SELECT * FROM usstates WHERE STATE_NAME = 'Missouri' " -lco "GEOMETRY=AS_WKT" -lco "LINEFORMAT=CRLF" -lco "SEPARATOR=SEMICOLON"

My buddy at work needed a way to get the WKT geometry definition for a single feature in a shapefile. I thought, “surely this is something we can do with OGR?” Lo and behold, yes it was. 🙂

The script above uses OGR SQL to interrogate a shapefile for one lone feature and, when it’s found, exports the record to a comma-separated values (CSV) file (or in this case, a semicolon-delimited file). Here’s a quick breakdown:

-f "CSV" "E:\4_GIS\NorthArkCartoData\UnitedStates\MO_wkt"  This means CREATE the directory MO_wkt and export the derived output to this directory. I wrapped the file path in double-quotes (") so linebreaks could not be introduced by the terminal and confuse the argument parser.

GOTCHA! DO NOT create the MO_wkt directory before running the script. Let OGR create that directory. However, relative to my example, E:\4_GIS\NorthArkCartoData\UnitedStates\ could be on your file system already.

"E:\4_GIS\NorthArkCartoData\UnitedStates\USStates.shp"  This is the shapefile you want to interrogate. Yet again, the file path is wrapped in double-quotes (").

-sql " SELECT * FROM usstates WHERE STATE_NAME = 'Missouri' "   This is the OGR SQL expression I used to grab the entire record (*) for features where the state_name field was 'Missouri'. Notice how the search term ('Missouri') is wrapped in single-quotes ('). If the field you’re searching on is a VARCHAR or a TEXT field (obviously a name requires non-numeric characters), you’ll need to wrap your search term in single-quotes; if you were searching on a numeric field, you would not wrap your search term in quotes. Also, just like the file paths shown above, I recommend always wrapping long values and expressions in double-quotes to avoid confusing the argument parser. Next, and this is a matter of user preference, notice how I added extra spaces between the double-quotes and my SQL expression (i.e. " SELECT.. "). This is totally legit and it may help to distinguish the SQL query from the rest of the OGR script.

I could have exported a subset of the fields using an expression like:

" SELECT STATE_ABBR as abbr, STATE_NAME as name FROM usstates WHERE STATE_NAME = 'Missouri' "

This expression selects only the STATE_NAME and STATE_ABBR fields and renames them name and abbr, respectively, by applying the AS operator. The fields in the subset are comma-separated, but the last field is not followed by a comma (this is basic SQL syntax, but I’m explaining it in case the readership includes some first-timers). If you don’t want to rename the fields in your selection, just omit the “AS rename” from your expression. Note that when exporting to CSV, the geometry will be exported automatically according to the geometry flag used, which we set next; unfortunately, this means there’s nothing you can do to omit the geometry from your selection.
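
Plugged into the full script, that field-subset expression would look like this (same paths and creation options as before):

ogr2ogr -f "CSV" "E:\4_GIS\NorthArkCartoData\UnitedStates\MO_wkt" "E:\4_GIS\NorthArkCartoData\UnitedStates\USStates.shp" -sql " SELECT STATE_ABBR as abbr, STATE_NAME as name FROM usstates WHERE STATE_NAME = 'Missouri' " -lco "GEOMETRY=AS_WKT" -lco "LINEFORMAT=CRLF" -lco "SEPARATOR=SEMICOLON"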

These last few parameters use -lco flags to signal various “creation options” used by the CSV driver. The creation options are unique to each format OGR can export, and in this case I’m demonstrating a particular recipe. If you want to know more about the CSV creation options, check out the OGR CSV driver page. Alternatively, if you want to export to a different file format, visit the OGR formats page, select the appropriate driver, and read up on the various creation options.

-lco "GEOMETRY=AS_WKT"  This creation option means “export the geometry value as Well Known Text”. Other options that might interest you include GEOMETRY=AS_XYZ, GEOMETRY=AS_XY, or GEOMETRY=AS_YX.

-lco "LINEFORMAT=CRLF"  This creation option means “use a Windows-formatted linebreak”. If you’re running Linux you should use "LINEFORMAT=LF". Honestly, I can’t remember if we really needed this flag, so you may want to try skipping it.

-lco "SEPARATOR=SEMICOLON"  This flag specifies the delimiter used to separate fields and values from one another in the CSV. We chose SEMICOLON because Polygon WKT already has a bunch of commas in it, so a semicolon delimiter helped us visually identify the WKT we were really looking for. Your other options include SEPARATOR=COMMA and SEPARATOR=TAB.

Alternatively, do this with ogrinfo..

You could also use ogrinfo to dump a feature’s WKT right into the console. This approach requires using one of the so-called special fields recognized by OGR SQL. To learn more about OGR SQL, its particulars, and these special fields, visit the OGR SQL help page. Without further ado, here’s my ogrinfo script:

ogrinfo "E:\4_GIS\NorthArkCartoData\UnitedStates\USStates.shp" -geom=yes -sql " SELECT OGR_GEOM_WKT FROM usstates WHERE STATE_NAME = 'Missouri' "

In this case, the OGR SQL expression requests only the “OGR_GEOM_WKT” special field. Because we’re using ogrinfo, our output stays in the console. However, this was ultimately undesirable for our task, as “real world” Polygon WKT typically spills beyond the console buffer. Moreover, copying the WKT geometry out of the console introduced hundreds of unwanted linebreaks into the WKT. So, for these reasons, we preferred to export our WKT results to a CSV, which allowed us to easily see, select, and copy the desired WKT data without any fuss using good ol’ Notepad with word wrap.

I hope you like. 🙂

/Elijah

P.S. If you don’t have OGR and want (correction, you NEED) to install it, check out my earlier post describing how to install GDAL/OGR on a Windows system.

Written by elrobis

November 18th, 2011 at 5:59 pm

Posted in GDAL/OGR


Install GDAL on Windows


Later posts on this blog will assume a working install of GDAL/OGR, so it’s prudent that I first demonstrate how to get a fresh cut of GDAL/OGR up-and-running on a Windows system.

Windows users have a few options for installing GDAL (see this question at gis.stackexchange, which will point you to Christoph Gohlke’s binaries (scroll down to GDAL) and the OSGeo4W Installer, among other approaches). However, I’ll guide you through what works for me, which can be summarized as follows:

1) Install Python
2) Install the GDAL binaries published by Tamas Szekeres
3) Append your environment Path variable
4) Add the GDAL_DATA environment variable
5) Finally, perform a quick test to make sure everything worked.

First, Install Python

(For reference purposes, I used version 2.7 for Win32):

· Start at the Python homepage
· At left, find “Quick Links (2.7.2)” and select “Windows Installer”
· Complete the download and start the installation
· Accept the default options suggested by the installer

Once Python is installed, launch IDLE (the built-in Python IDE) to get specific details about your Python installation:

Follow Windows (Start)→Programs→Python 2.7→IDLE (Python GUI)

This figure emphasizes where to look:

Python Version Details Output in IDLE

In this case, I’m noting the following values:
· “MSC v.1500”
· “on win32”

Your values may be different than mine. For instance, you may have v.1400? Or maybe you installed 64-bit Python on a 64-bit system? Whatever you did, you’ll need to be consistent from here on out. Since these values will be important in the next step, you may want to keep the IDLE window open for a few minutes.

Next Install the GDAL Binaries

From the GDAL homepage, you’d navigate to Downloads→Windows, and open Tamas Szekeres’ Windows binaries. When Tamas’ page loads, you’ll need to identify the proper download package for your version of Python, system bit-depth, and VC dependencies.

Note: VC2005, VC2008, VC2010 refer to various dependencies upon C++ redistributable libraries. To get your very own, just Google “c++ redistributable” and pick up the most current release. Alternatively, if you want to play it uber-safe, download and install Visual C++ Express 2010, and it will automagically qualify you for MSVC2010 builds (which equates to the compiler used to build GDAL).

At Tamas’ site, do the following:
· Scroll down to “GDAL and MapServer latest release versions”

· Find a release corresponding to both your installed C++ redistributable library AND your Python install. For instance, I used “release-1500-gdal-1-8-0-mapserver-6-0-0”, which corresponds to MSVC2008 (Win32) in the left column (note the figure).

Finding the right GDAL binaries at gisinternals.com

Selecting a release will take you to a new screen promoting a handful of EXE and MSI files; to use ogr2ogr, you’ll need two of these.

· Download and install gdal-18-xxxx-core.msi (where xxxx is 1310, 1400, 1500, or 1600, matching your Python setup). I recommend keeping the defaults while installing.

· Next download and install the GDAL binaries for your version of Python (i.e. 2.6, 2.7, 3.1). Your choice should match the bit-depth and version of Python you installed in the first place. In keeping with my running example, I used GDAL-1.8.0.win32-py2.7.msi. Once again, run the installer and accept the defaults.

Note: The GDAL installer should detect the Python instance installed in the previous section (see the following figure). If not, first review the steps in the previous section to see if you omitted something (this may indicate the GDAL binaries do not match your installation of Python); otherwise select “Python from another location” and proceed cautiously.

The GDAL Python bindings installer should detect your Python installation.

Set the Environment Variables

We’re nearly finished installing GDAL. In this section, we’ll configure the Windows environment variables (Path and GDAL_DATA) so the GDAL/OGR utilities will be easily accessible from the command line.

Append the Path Variable

First, append the GDAL installation directory onto the existing Path variable. Do the following to edit the Path variable:

· Open the Control Panel (Start→Control Panel), open System and select “Advanced System Settings”.

Note: Alternatively, you can open the Windows (Start) menu, right-click “Computer”, select Properties, then select “Advanced System Settings”, which I find easier.

· Select “Environment Variables” (near the bottom).

· Find “Path” among the list options and select it.

· With “Path” selected, click the “Edit” button.

· When the Edit System Variable dialog appears, scroll to the end of the series, add a semi-colon (;) to separate the prior path from the new one you’re about to enter, then add your GDAL installation path. I added exactly this:  ;C:\Program Files (x86)\GDAL

· Keep the Environment Variables dialog open for the next task.

Warning: Be sure to confirm your GDAL installation directory. I installed GDAL on Windows XP, but if you run a newer version of Windows, your installation path may be different from mine.

Appending the GDAL installation directory to the Path variable. (THIS IMAGE IS FROM A WIN XP INSTALL! A WIN 7 install will likely require "Program Files (x86)".

Add the GDAL_DATA Variable

Next we’ll add a new System Variable, GDAL_DATA, which helps the various GDAL utilities find the gdal-data folder. Without access to the files in the gdal-data folder, GDAL may have trouble recognizing some projections and coordinate systems, among other things.

In Windows, create the GDAL_DATA variable as follows:

· In the Environment Variables dialog, click “New” (near the bottom).

· In the New Variable dialog, enter GDAL_DATA for the name value.

· For value, enter the path to your gdal-data folder; it should be one level below your GDAL directory, which you can confirm by scouting the GDAL folder. I used:  C:\Program Files (x86)\GDAL\gdal-data

Once again, confirm the gdal-data directory makes sense relative to your GDAL installation folder.

· At this point, commit your changes by selecting “OK” on each dialog.

Adding the GDAL_DATA variable. (Note! On Win 7 you'll probably need to use "Program Files (x86)". I stole this image from my older WinXP tutorial.)

[Update 1 Oct, 2013]

Add the GDAL_DRIVER_PATH Variable

In the same way you added the GDAL_DATA variable, add a GDAL_DRIVER_PATH variable. This variable is necessary for the Python environment to locate various GDAL driver plugins. Like the gdal-data directory, the gdalplugins directory should be one level below your GDAL installation. Of course, confirm the path on your system. (I’m on Win 7 at this point–note this is inconsistent with the screenshot, above).

On my system, the gdalplugins directory is: C:\Program Files (x86)\GDAL\gdalplugins

For those who already have GDAL installed along with the Python bindings–If you’re trying to write custom scripts and you’re getting errors like ERROR 4: ..does not exist ..is not recognized as a supported dataset name, you may simply need to add this additional environment variable. This is especially true if you followed my tutorial for setting up GDAL on your system in the first place! In case of the latter, my apologies.
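
Incidentally, if you prefer the terminal to the Environment Variables dialog, the setx command can create both variables. Here’s a sketch, assuming the default install paths shown above (note that setx only affects new terminals, so open a fresh Command Prompt afterward):

setx GDAL_DATA "C:\Program Files (x86)\GDAL\gdal-data"
setx GDAL_DRIVER_PATH "C:\Program Files (x86)\GDAL\gdalplugins"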

 

Confirming a Successful GDAL Installation

To confirm the availability of ogr2ogr, just open a new command terminal and attempt to call ogr2ogr. If your installation is successful, ogr2ogr will start running, fail to find any valid runtime arguments, and close, leaving some helpful usage hints in its wake.

To test ogr2ogr in a Windows Command Prompt, do the following:

· Open the Windows (Start) menu, and at the bottom (“Search programs and files”), type cmd and hit Enter; this will open the Command Prompt.

· With the terminal open, type ogr2ogr and hit Enter (see the image).

Because we added the GDAL installation directory to the Path variable, Windows will attempt to find a valid ogr2ogr executable in the GDAL folder. If all goes well your command prompt should resemble the following figure (the ogr2ogr command is highlighted).

Confirming a successful install by calling ogr2ogr.

————————

[Update 27 Jan, 2014]

Change in testing/confirming a successful install

My initial recommendation to just type “ogr2ogr”, which resulted in an error (which I warned you’d get), was confusing for several people. In hindsight, I agree it was a poor choice on my part. So I like this better:

At the command prompt, type this:

gdalinfo --version

And GDAL should respond with something like this:

C:\Users\elrobis>gdalinfo --version
GDAL 1.9.2, released 2012/10/08

Assuming you get a version number and a release date, and not an error, you should have a clean install of GDAL/OGR to work with. Be warned, though: getting the version output doesn’t confirm that your environment is set up properly. If that stuff isn’t right, you’ll likely get errors when you actually apply GDAL or OGR to some task.
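
And if you installed the Python bindings, a one-liner like this (a hedged check of my own, not part of the original recipe) will confirm that Python can import them:

python -c "from osgeo import ogr; print(ogr.GetDriverCount())"

A bare number means the bindings loaded; an ImportError means the bindings or the environment variables need another look.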

————————

And voila! Assuming you see the Usage/Help output, you should be good to go with a fresh GDAL/OGR installation.

Cheers, Elijah

[Edit 06.22.2012]

If you plan to create your own Python utility scripts with the OGR Python bindings, I further recommend the following Python libraries:

shapely, numpy, Python Image Library (pil), PostgreSQL Adapter (psycopg), MySQL Adaptor (mysql-python), Python Shapefile Library (pyshp), Dbase/Xbase Adaptor Library (pydbf), matplotlib

Written by elrobis

October 17th, 2011 at 9:33 pm

Posted in GDAL/OGR, Python
