postgis/doc/using_postgis_app.xml

<?xml version="1.0" encoding="UTF-8"?>
<chapter>
  <title id="using_postgis_applications">Using PostGIS: Building Applications</title>
  <sect1 id="Using_Mapserver">
    <title>Using Mapserver</title>

    <para>The Minnesota Mapserver is an internet web-mapping server which
    conforms to the OpenGIS Web Mapping Server specification.</para>

    <itemizedlist>
      <listitem>
        <para>The Mapserver homepage is at <ulink
        url="http://mapserver.gis.umn.edu">http://mapserver.gis.umn.edu</ulink>.</para>
      </listitem>

      <listitem>
        <para>The OpenGIS Web Map Specification is at <ulink
        url="http://www.opengeospatial.org/standards">http://www.opengeospatial.org/standards</ulink>.</para>
      </listitem>
    </itemizedlist>

    <sect2>
      <title>Basic Usage</title>

      <para>To use PostGIS with Mapserver, you will need to know about how to
      configure Mapserver, which is beyond the scope of this documentation.
      This section will cover specific PostGIS issues and configuration
      details.</para>

      <para>To use PostGIS with Mapserver, you will need:</para>

      <itemizedlist>
        <listitem>
          <para>Version 0.6 or newer of PostGIS.</para>
        </listitem>

        <listitem>
          <para>Version 3.5 or newer of Mapserver.</para>
        </listitem>
      </itemizedlist>

      <para>Mapserver accesses PostGIS/PostgreSQL data like any other
      PostgreSQL client -- using <filename>libpq</filename>. This means that
      Mapserver can be installed on any machine with network access to the
      PostGIS server, as long as the system has the <filename>libpq</filename>
      PostgreSQL client libraries.</para>

      <orderedlist>
        <listitem>
          <para>Compile and install Mapserver, with whatever options you
          desire, including the "--with-postgis" configuration option.</para>
        </listitem>

        <listitem>
          <para>In your Mapserver map file, add a PostGIS layer. For
          example:</para>

          <programlisting>LAYER
  CONNECTIONTYPE postgis
  NAME "widehighways"
  # Connect to a remote spatial database
  CONNECTION "user=dbuser dbname=gisdatabase host=bigserver"
  PROCESSING "CLOSE_CONNECTION=DEFER"
  # Get the lines from the 'geom' column of the 'roads' table
  DATA "geom from roads"
  STATUS ON
  TYPE LINE
  # Of the lines in the extents, only render the wide highways
  FILTER "type = 'highway' and numlanes &gt;= 4"
  CLASS
    # Make the superhighways brighter and 2 pixels wide
    EXPRESSION ([numlanes] &gt;= 6)
    STYLE
      COLOR 255 22 22
      WIDTH 2
    END
  END
  CLASS
    # All the rest are darker and only 1 pixel wide
    EXPRESSION ([numlanes] &lt; 6)
    STYLE
      COLOR 205 92 82
    END
  END
END</programlisting>

          <para>In the example above, the PostGIS-specific directives are as
          follows:</para>

          <variablelist>
            <varlistentry>
              <term>CONNECTIONTYPE</term>

              <listitem>
                <para>For PostGIS layers, this is always "postgis".</para>
              </listitem>
            </varlistentry>

            <varlistentry>
              <term>CONNECTION</term>

              <listitem>
                <para>The database connection is governed by the a 'connection
                string' which is a standard set of keys and values like this
                (with the default values in &lt;&gt;):</para>

                <para>user=&lt;username&gt; password=&lt;password&gt;
                dbname=&lt;username&gt; hostname=&lt;server&gt;
                port=&lt;5432&gt;</para>

                <para>An empty connection string is still valid, and any of
                the key/value pairs can be omitted. At a minimum you will
                generally supply the database name and username to connect
                with.</para>
              </listitem>
            </varlistentry>

            <varlistentry>
              <term>DATA</term>

              <listitem>
                <para>The form of this parameter is "&lt;column&gt; from
                &lt;tablename&gt;" where the column is the spatial column to
                be rendered to the map.</para>
              </listitem>
            </varlistentry>

			<varlistentry>
              <term>PROCESSING</term>

              <listitem>
                <para>Putting in a CLOSE_CONNECTION=DEFER if you have multiple layers reuses existing connections instead of closing them.  This improves
					speed.  Refer to for <ulink url="http://blog.cleverelephant.ca/2008/10/mapserverpostgis-performance-tips.html">Paul's Mapserver PostGIS Performance Tips</ulink> for more detailed explanation.  </para>
              </listitem>
            </varlistentry>

            <varlistentry>
              <term>FILTER</term>

              <listitem>
                <para>The filter must be a valid SQL string corresponding to
                the logic normally following the "WHERE" keyword in a SQL
                query. So, for example, to render only roads with 6 or more
                lanes, use a filter of "num_lanes &gt;= 6".</para>
              </listitem>
            </varlistentry>
          </variablelist>
        </listitem>

        <listitem>
          <para>In your spatial database, ensure you have spatial (GiST)
          indexes built for any the layers you will be drawing.</para>

          <programlisting>CREATE INDEX [indexname] ON [tablename] USING GIST ( [geometrycolumn] );</programlisting>
        </listitem>

        <listitem>
          <para>If you will be querying your layers using Mapserver you will
          also need an "oid index".</para>

          <para>Mapserver requires unique identifiers for each spatial record
          when doing queries, and the PostGIS module of Mapserver uses the
          PostgreSQL <varname>oid</varname> value to provide these unique
          identifiers. A side-effect of this is that in order to do fast
          random access of records during queries, an index on the
          <varname>oid</varname> is needed.</para>

          <para>To build an "oid index", use the following SQL:</para>

          <programlisting>CREATE INDEX [indexname] ON [tablename] ( oid );</programlisting>
        </listitem>
      </orderedlist>
    </sect2>

    <sect2>
      <title>Frequently Asked Questions</title>

      <qandaset>
        <qandaentry>
          <question>
            <para>When I use an <varname>EXPRESSION</varname> in my map file,
            the condition never returns as true, even though I know the values
            exist in my table.</para>
          </question>

          <answer>
            <para>Unlike shape files, PostGIS field names have to be
            referenced in EXPRESSIONS using <emphasis>lower
            case</emphasis>.</para>

            <programlisting>EXPRESSION ([numlanes] &gt;= 6)</programlisting>
          </answer>
        </qandaentry>

        <qandaentry>
          <question>
            <para>The FILTER I use for my Shape files is not working for my
            PostGIS table of the same data.</para>
          </question>

          <answer>
            <para>Unlike shape files, filters for PostGIS layers use SQL
            syntax (they are appended to the SQL statement the PostGIS
            connector generates for drawing layers in Mapserver).</para>

            <programlisting>FILTER "type = 'highway' and numlanes &gt;= 4"</programlisting>
          </answer>
        </qandaentry>

        <qandaentry>
          <question>
            <para>My PostGIS layer draws much slower than my Shape file layer,
            is this normal?</para>
          </question>

          <answer>
            <para>In general, expect PostGIS layers to be 10% slower than
            equivalent Shape files layers, due to the extra overhead involved
            in database connections, data transformations and data transit
            between the database and Mapserver.</para>

            <para>If you are finding substantial draw performance problems, it
            is likely that you have not build a spatial index on your
            table.</para>

            <programlisting>postgis# CREATE INDEX geotable_gix ON geotable USING GIST ( geocolumn );
postgis# SELECT update_geometry_stats(); -- For PGSQL &lt; 8.0
postgis# VACUUM ANALYZE; -- For PGSQL &gt;= 8.0</programlisting>
          </answer>
        </qandaentry>

        <qandaentry>
          <question>
            <para>My PostGIS layer draws fine, but queries are really slow.
            What is wrong?</para>
          </question>

          <answer>
            <para>For queries to be fast, you must have a unique key for your
            spatial table and you must have an index on that unique
            key.</para>

            <para>You can specify what unique key for mapserver to use with
            the <varname>USING UNIQUE</varname> clause in your
            <varname>DATA</varname> line:</para>

            <programlisting>DATA "the_geom FROM geotable USING UNIQUE gid"</programlisting>

            <para>If your table does not have an explicit unique column, you
            can "fake" a unique column by using the PostgreSQL row "oid" for
            your unique column. "oid" is the default unique column if you do
            not declare one, so enhancing your query speed is a matter of
            building an index on your spatial table oid value.</para>

            <programlisting>postgis# CREATE INDEX geotable_oid_idx ON geotable (oid);</programlisting>
          </answer>
        </qandaentry>
      </qandaset>
    </sect2>

    <sect2>
      <title>Advanced Usage</title>

      <para>The <varname>USING</varname> pseudo-SQL clause is used to add some
      information to help mapserver understand the results of more complex
      queries. More specifically, when either a view or a subselect is used as
      the source table (the thing to the right of "FROM" in a
      <varname>DATA</varname> definition) it is more difficult for mapserver
      to automatically determine a unique identifier for each row and also the
      SRID for the table. The <varname>USING</varname> clause can provide
      mapserver with these two pieces of information as follows:</para>

      <programlisting>DATA "the_geom FROM (
  SELECT
    table1.the_geom AS the_geom,
    table1.oid AS oid,
    table2.data AS data
  FROM table1
  LEFT JOIN table2
  ON table1.id = table2.id
) AS new_table USING UNIQUE oid USING SRID=-1"</programlisting>

      <variablelist>
        <varlistentry>
          <term>USING UNIQUE &lt;uniqueid&gt;</term>

          <listitem>
            <para>Mapserver requires a unique id for each row in order to
            identify the row when doing map queries. Normally, it would use
            the oid as the unique identifier, but views and subselects don't
            automatically have an oid column. If you want to use Mapserver's
            query functionality, you need to add a unique column to your view
            or subselect, and declare it with <varname>USING UNIQUE</varname>.
            For example, you could explicitly select one of the table's oid
            values for this purpose, or any other column which is guaranteed
            to be unique for the result set.</para>

            <para>The <varname>USING</varname> statement can also be useful
            even for simple <varname>DATA</varname> statements, if you are
            doing map queries. It was previously recommended to add an index
            on the oid column of tables used in query-able layers, in order to
            speed up the performance of map queries. However, with the
            <varname>USING</varname> clause, it is possible to tell mapserver
            to use your table's primary key as the identifier for map queries,
            and then it is no longer necessary to have an additional
            index.</para>

            <note>
              <para>"Querying a Map" is the action of clicking on a map to ask
              for information about the map features in that location. Don't
              confuse "map queries" with the SQL query in a
              <varname>DATA</varname> definition.</para>
            </note>
          </listitem>
        </varlistentry>

        <varlistentry>
          <term>USING SRID=&lt;srid&gt;</term>

          <listitem>
            <para>PostGIS needs to know which spatial referencing system is
            being used by the geometries in order to return the correct data
            back to mapserver. Normally it is possible to find this
            information in the "geometry_columns" table in the PostGIS
            database, however, this is not possible for tables which are
            created on the fly such as subselects and views. So the
            <varname>USING SRID=</varname> option allows the correct SRID to
            be specified in the <varname>DATA</varname> definition.</para>
          </listitem>
        </varlistentry>
      </variablelist>

      <warning>
        <para>The parser for Mapserver PostGIS layers is fairly primitive, and
        is case sensitive in a few areas. Be careful to ensure that all SQL
        keywords and all your <varname>USING</varname> clauses are in upper
        case, and that your <varname>USING UNIQUE</varname> clause precedes
        your <varname>USING SRID</varname> clause.</para>
      </warning>
    </sect2>

    <sect2>
      <title>Examples</title>

      <para>Lets start with a simple example and work our way up. Consider the
      following Mapserver layer definition:</para>

      <programlisting>LAYER
  CONNECTIONTYPE postgis
  NAME "roads"
  CONNECTION "user=theuser password=thepass dbname=thedb host=theserver"
  DATA "the_geom FROM roads"
  STATUS ON
  TYPE LINE
  CLASS
    STYLE
      COLOR 0 0 0
    END
  END
END</programlisting>

      <para>This layer will display all the road geometries in the roads table
      as black lines.</para>

      <para>Now lets say we want to show only the highways until we get zoomed
      in to at least a 1:100000 scale - the next two layers will achieve this
      effect:</para>

      <programlisting>LAYER
  CONNECTIONTYPE postgis
  CONNECTION "user=theuser password=thepass dbname=thedb host=theserver"
  PROCESSING "CLOSE_CONNECTION=DEFER"
  DATA "the_geom FROM roads"
  MINSCALE 100000
  STATUS ON
  TYPE LINE
  FILTER "road_type = 'highway'"
  CLASS
    COLOR 0 0 0
  END
END
LAYER
  CONNECTIONTYPE postgis
  CONNECTION "user=theuser password=thepass dbname=thedb host=theserver"
  PROCESSING "CLOSE_CONNECTION=DEFER"
  DATA "the_geom FROM roads"
  MAXSCALE 100000
  STATUS ON
  TYPE LINE
  CLASSITEM road_type
  CLASS
    EXPRESSION "highway"
    STYLE
      WIDTH 2
      COLOR 255 0 0
    END
  END
  CLASS
    STYLE
      COLOR 0 0 0
    END
  END
END</programlisting>

      <para>The first layer is used when the scale is greater than 1:100000,
      and displays only the roads of type "highway" as black lines. The
      <varname>FILTER</varname> option causes only roads of type "highway" to
      be displayed.</para>

      <para>The second layer is used when the scale is less than 1:100000, and
      will display highways as double-thick red lines, and other roads as
      regular black lines.</para>

      <para>So, we have done a couple of interesting things using only
      mapserver functionality, but our <varname>DATA</varname> SQL statement
      has remained simple. Suppose that the name of the road is stored in
      another table (for whatever reason) and we need to do a join to get it
      and label our roads.</para>

      <programlisting>LAYER
  CONNECTIONTYPE postgis
  CONNECTION "user=theuser password=thepass dbname=thedb host=theserver"
  DATA "the_geom FROM (SELECT roads.oid AS oid, roads.the_geom AS the_geom,
        road_names.name as name FROM roads LEFT JOIN road_names ON
        roads.road_name_id = road_names.road_name_id)
        AS named_roads USING UNIQUE oid USING SRID=-1"
  MAXSCALE 20000
  STATUS ON
  TYPE ANNOTATION
  LABELITEM name
  CLASS
    LABEL
      ANGLE auto
      SIZE 8
      COLOR 0 192 0
      TYPE truetype
      FONT arial
    END
  END
END</programlisting>

      <para>This annotation layer adds green labels to all the roads when the
      scale gets down to 1:20000 or less. It also demonstrates how to use an
      SQL join in a <varname>DATA</varname> definition.</para>
    </sect2>
  </sect1>

  <sect1>
    <title>Java Clients (JDBC)</title>

    <para>Java clients can access PostGIS "geometry" objects in the PostgreSQL
    database either directly as text representations or using the JDBC
    extension objects bundled with PostGIS. In order to use the extension
    objects, the "postgis.jar" file must be in your CLASSPATH along with the
    "postgresql.jar" JDBC driver package.</para>

    <programlisting>import java.sql.*;
import java.util.*;
import java.lang.*;
import org.postgis.*;

public class JavaGIS {

public static void main(String[] args) {

  java.sql.Connection conn;

  try {
    /*
    * Load the JDBC driver and establish a connection.
    */
    Class.forName("org.postgresql.Driver");
    String url = "jdbc:postgresql://localhost:5432/database";
    conn = DriverManager.getConnection(url, "postgres", "");
    /*
    * Add the geometry types to the connection. Note that you
    * must cast the connection to the pgsql-specific connection
    * implementation before calling the addDataType() method.
    */
    ((org.postgresql.Connection)conn).addDataType("geometry","org.postgis.PGgeometry")
;
    ((org.postgresql.Connection)conn).addDataType("box3d","org.postgis.PGbox3d");
    /*
    * Create a statement and execute a select query.
    */
    Statement s = conn.createStatement();
    ResultSet r = s.executeQuery("select ST_AsText(geom) as geom,id from geomtable");
    while( r.next() ) {
      /*
      * Retrieve the geometry as an object then cast it to the geometry type.
      * Print things out.
      */
      PGgeometry geom = (PGgeometry)r.getObject(1);
      int id = r.getInt(2);
      System.out.println("Row " + id + ":");
      System.out.println(geom.toString());
    }
    s.close();
    conn.close();
  }
catch( Exception e ) {
  e.printStackTrace();
  }
}
}</programlisting>

    <para>The "PGgeometry" object is a wrapper object which contains a
    specific topological geometry object (subclasses of the abstract class
    "Geometry") depending on the type: Point, LineString, Polygon, MultiPoint,
    MultiLineString, MultiPolygon.</para>

    <programlisting>PGgeometry geom = (PGgeometry)r.getObject(1);
if( geom.getType() = Geometry.POLYGON ) {
  Polygon pl = (Polygon)geom.getGeometry();
  for( int r = 0; r &lt; pl.numRings(); r++) {
    LinearRing rng = pl.getRing(r);
    System.out.println("Ring: " + r);
    for( int p = 0; p &lt; rng.numPoints(); p++ ) {
      Point pt = rng.getPoint(p);
      System.out.println("Point: " + p);
      System.out.println(pt.toString());
    }
  }
}</programlisting>

    <para>The JavaDoc for the extension objects provides a reference for the
    various data accessor functions in the geometric objects.</para>
  </sect1>

  <sect1>
    <title>C Clients (libpq)</title>

    <para>...</para>

    <sect2>
      <title>Text Cursors</title>

      <para>...</para>
    </sect2>

    <sect2>
      <title>Binary Cursors</title>

      <para>...</para>
    </sect2>
  </sect1>
</chapter>