[ TOC ] [ Previous ] [ Next ]

15. ADMINISTERING POSTGRES


In this section, we will discuss aspects of POSTGRES that are of interest to those who make extensive use of POSTGRES, or who are the site administrator for a group of POSTGRES users.

15.1. Frequent Tasks

Here we will briefly discuss some procedures that you should be familiar with in managing any POSTGRES installation.

15.1.1. Starting the Postmaster

If you did not install POSTGRES exactly as described in the installation instructions, you may have to perform some additional steps before starting the postmaster process.

15.1.2. Shutting Down the Postmaster

If you need to halt the postmaster process, you can use the UNIX kill(1) command. Some people habitually use the -9 or -KILL option; this should never be necessary and we do not recommend that you do this, as the postmaster will be unable to free its various shared resources, its child processes will be unable to exit gracefully, etc.

15.1.3. Adding and Removing Users

The createuser and destroyuser commands enable and disable access to POSTGRES by specific users on the host system.

15.1.4. Periodic Upkeep

The vacuum command should be run on each database periodically. This command processes deleted instances[9] and, more importantly, updates the system statistics concerning the size of each class. If these statistics are permitted to become out-of-date and inaccurate, the POSTGRES query optimizer may make extremely poor decisions with respect to query evaluation strategies. Therefore, we recommend running vacuum every night or so (perhaps in a script that is executed by the UNIX cron(1) or at(1) commands). Do frequent backups. That is, you should either back up your database directories using the POSTGRES copy command and/or the UNIX dump(1) or tar(1) commands. You may think, "Why am I backing up my database? What about crash recovery?" One side effect of the POSTGRES "no overwrite" storage manager is that it is also a "no log" storage manager. That is, the database log stores only abort/commit data, and this is not enough information to recover the database if the storage medium (disk) or the database files are corrupted! In other words, if a disk block goes bad or POSTGRES happens to corrupt a database file, you cannot recover that file. This can be disastrous if the file is one of the shared catalogs, such as pg_database.

15.1.5. Tuning

Once your users start to load a significant amount of data, you will typically run into performance problems. POSTGRES is not the fastest DBMS in the world, but many of the worst problems encountered by users are due to their lack of experience with any DBMS. Some general tips include:
  1. Define indices over attributes that are commonly used for qualifications. For example, if you often execute queries of the form
                    SELECT * from EMP where salary < 5000
    
    then a B-tree index on the salary attribute will probably be useful. If scans involving equality are more common, as in
                    SELECT * from EMP where salary = 5000
    
    then you should consider defining a hash index on salary. You can define both, though it will use more disk space and may slow down updates a bit. Scans using indices are much faster than sequential scans of the entire class.

  2. Run the vacuum command a lot. This command updates the statistics that the query optimizer uses to make intelligent decisions; if the statistics are inaccurate, the system will make inordinately stupid decisions with respect to the way it joins and scans classes.

  3. When specifying query qualfications (i.e., the where part of the query), try to ensure that a clause involving a constant can be turned into one of the form range_variable operator constant, e.g.,
                    EMP.salary = 5000
    
    The POSTGRES query optimizer will only use an index with a constant qualification of this form. It doesn't hurt to write the clause as
                    5000 = EMP.salary
    
    if the operator (in this case, =) has a commutator operator defined so that POSTGRES can rewrite the query into the desired form. However, if such an operator does not exist, POSTGRES will never consider the use of an index.

  4. When joining several classes together in one query, try to write the join clauses in a "chained" form, e.g.,
                    where A.a = B.b and B.b = C.c and ...
    
    Notice that relatively few clauses refer to a given class and attribute; the clauses form a linear sequence connecting the attributes, like links in a chain. This is preferable to a query written in a "star" form, such as
                    where A.a = B.b and A.a = C.c and ...
    
    Here, many clauses refer to the same class and attribute (in this case, A.a). When presented with a query of this form, the POSTGRES query optimizer will tend to consider far more choices than it should and may run out of memory.

  5. If you are really desperate to see what query plans look like, you can run the postmaster with the -d option and then run monitor with the -t option. The format in which query plans will be printed is hard to read but you should be able to tell whether any index scans are being performed.

15.2. Infrequent Tasks

At some time or another, every POSTGRES site administrator has to perform all of the following actions. 15.2.1. Cleaning Up After Crashes The postgres server and the postmaster run as two different processes. They may crash separately or together. The housekeeping procedures required to fix one kind of crash are different from those required to fix the other. The message you will usually see when the backend server crashes is:
         FATAL: no response from backend: detected in ...
This generally means one of two things: there is a bug in the POSTGRES server, or there is a bug in some user code that has been dynamically loaded into POSTGRES. You should be able to restart your application and resume processing, but there are some considerations:
  1. POSTGRES usually dumps a core file (a snapshot of process memory used for debugging) in the database directory
                    /usr/local/postgres95/data/base/<database>/core
    
    on the server machine. If you don't want to try to debug the problem or produce a stack trace to report the bug to someone else, you can delete this file (which is probably around 10MB).

  2. When one backend crashes in an uncontrolled way (i.e., without calling its built-in cleanup routines), the postmaster will detect this situation, kill all running servers and reinitialize the state shared among all backends (e.g., the shared buffer pool and locks). If your server crashed, you will get the "no response" message shown above. If your server was killed because someone else's server crashed, you will see the following message:
                    I have been signalled by the postmaster.
                    Some backend process has died unexpectedly and possibly
                    corrupted shared memory.  The current transaction was
                    aborted, and I am going to exit.  Please resend the
                    last query. -- The postgres backend
    

  3. Sometimes shared state is not completely cleaned up. Frontend applications may see errors of the form:
                    WARN: cannot write block 34 of myclass [mydb] blind
    
    In this case, you should kill the postmaster and restart it.

  4. When the system crashes while updating the system catalogs (e.g., when you are creating a class, defining an index, retrieving into a class, etc.) the B-tree indices defined on the catalogs are sometimes corrupted. The general (and non-unique) symptom is that all queries stop working. If you have tried all of the above steps and nothing else seems to work, try using the reindexdb command. If reindexdb succeeds but things still don't work, you have another problem; if it fails, the system catalogs themselves were almost certainly corrupted and you will have to go back to your backups.

The postmaster does not usually crash (it doesn't do very much except start servers) but it does happen on occasion. In addition, there are a few cases where it encounters problems during the reinitialization of shared resources. Specifically, there are race conditions where the operating system lets the postmaster free shared resources but then will not permit it to reallocate the same amount of shared resources (even when there is no contention). You will typically have to run the ipcclean command if system errors cause the postmaster to crash. If this happens, you may find (using the UNIX ipcs(1) command) that the "postgres" user has shared memory and/or semaphores allocated even though no postmaster process is running. In this case, you should run ipcclean as the "postgres" user in order to deallocate these resources. Be warned that all such resources owned by the "postgres" user will be deallocated. If you have multiple postmaster processes running on the same machine, you should kill all of them before running ipcclean (otherwise, they will crash on their own when their shared resources are suddenly deallocated).

15.2.2. Moving Database Directories

By default, all POSTGRES databases are stored in separate subdirectories under /usr/local/postgres95/data/base.[10] At some point, you may find that you wish to move one or more databases to another location (e.g., to a filesystem with more free space). If you wish to move all of your databases to the new location, you can simply: To install a single database in an alternate directory while leaving all other databases in place, do the following:

15.2.3. Updating Databases

POSTGRES is a research system. In general, POSTGRES may not retain the same binary format for the storage of databases from release to release. Therefore, when you update your POSTGRES software, you will probably have to modify your databases as well. This is a common occurrence with commercial database systems as well; unfortunately, unlike commercial systems, POSTGRES does not come with user-friendly utilities to make your life easier when these updates occur. In general, you must do the following to update your databases to a new software release: You should give any new release a "trial period"; in particular, do not delete the old database until you are satisfied that there are no compatibility problems with the new software. For example, you do not want to discover that a bug in a type's "input" (conversion from ASCII) and "output" (conversion to ASCII) routines prevents you from reloading your data after you have destroyed your old databases! (This should be standard procedure when updating any software package, but some people try to economize on disk space without applying enough foresight.)

15.3. Database Security

Most sites that use POSTGRES are educational or research institutions and do not pay much attention to security in their POSTGRES installations. If desired, one can install POSTGRES with additional security features. Naturally, such features come with additional administrative overhead that must be dealt with.

15.3.1. Kerberos

POSTGRES can be configured to use the MIT Kerberos network authentication system. This prevents outside users from connecting to your databases over the network without the correct authentication information.

15.4. Querying the System Catalogs

As an administrator (or sometimes as a plain user), you want to find out what extensions have been added to a given database. The queries listed below are "canned" queries that you can run on any database to get simple answers. Before executing any of the queries below, be sure to execute the POSTGRES vacuum command. (The queries will run much more quickly that way.) Also, note that these queries are also listed in
         /usr/local/postgres95/tutorial/syscat.sql
so use cut-and-paste (or the \i command) instead of doing a lot of typing. This query prints the names of all database adminstrators and the name of their database(s).
         SELECT usename, datname
             FROM pg_user, pg_database
             WHERE usesysid = int2in(int4out(datdba))
             ORDER BY usename, datname;
This query lists all user-defined classes in the database.
         SELECT relname
             FROM pg_class
             WHERE relkind = 'r'           -- not indices
               and relname !~ '^pg_'       -- not catalogs
               and relname !~ '^Inv'       -- not large objects
             ORDER BY relname;
This query lists all simple indices (i.e., those that are not defined over a function of several attributes).
         SELECT bc.relname AS class_name,
                  ic.relname AS index_name,
                  a.attname
             FROM pg_class bc,             -- base class
                  pg_class ic,             -- index class
                  pg_index i,
                  pg_attribute a           -- att in base
             WHERE i.indrelid = bc.oid
                and i.indexrelid = ic.oid
                and i.indkey[0] = a.attnum
                and a.attrelid = bc.oid
                and i.indproc = '0'::oid   -- no functional indices
             ORDER BY class_name, index_name, attname;
This query prints a report of the user-defined attributes and their types for all user-defined classes in the database.
         SELECT c.relname, a.attname, t.typname
             FROM pg_class c, pg_attribute a, pg_type t
             WHERE c.relkind = 'r'     -- no indices
               and c.relname !~ '^pg_' -- no catalogs
               and c.relname !~ '^Inv' -- no large objects
               and a.attnum > 0       -- no system att's
               and a.attrelid = c.oid
               and a.atttypid = t.oid
             ORDER BY relname, attname;
This query lists all user-defined base types (not including array types).
         SELECT u.usename, t.typname
             FROM pg_type t, pg_user u
             WHERE u.usesysid = int2in(int4out(t.typowner))
               and t.typrelid = '0'::oid   -- no complex types
               and t.typelem = '0'::oid    -- no arrays
               and u.usename <> 'postgres'
             ORDER BY usename, typname;
This query lists all left-unary (post-fix) operators.
         SELECT o.oprname AS left_unary,
                  right.typname AS operand,
                  result.typname AS return_type
             FROM pg_operator o, pg_type right, pg_type result
             WHERE o.oprkind = 'l'           -- left unary
               and o.oprright = right.oid
               and o.oprresult = result.oid
             ORDER BY operand;
This query lists all right-unary (pre-fix) operators.
         SELECT o.oprname AS right_unary,
                  left.typname AS operand,
                  result.typname AS return_type
             FROM pg_operator o, pg_type left, pg_type result
             WHERE o.oprkind = 'r'          -- right unary
               and o.oprleft = left.oid
               and o.oprresult = result.oid
             ORDER BY operand;
This query lists all binary operators.
         SELECT o.oprname AS binary_op,
                  left.typname AS left_opr,
                  right.typname AS right_opr,
                  result.typname AS return_type
             FROM pg_operator o, pg_type left, pg_type right, pg_type result
             WHERE o.oprkind = 'b'         -- binary
               and o.oprleft = left.oid
               and o.oprright = right.oid
               and o.oprresult = result.oid
             ORDER BY left_opr, right_opr;
This query returns the name, number of arguments (parameters) and return type of all user-defined C functions. The same query can be used to find all built-in C functions if you change the "C" to "internal", or all SQL functions if you change the "C" to "sql".
         SELECT p.proname, p.pronargs, t.typname
             FROM pg_proc p, pg_language l, pg_type t
             WHERE p.prolang = l.oid
               and p.prorettype = t.oid
               and l.lanname = 'c'
             ORDER BY proname;
This query lists all of the aggregate functions that have been installed and the types to which they can be applied. count is not included because it can take any type as its argument.
         SELECT a.aggname, t.typname
             FROM pg_aggregate a, pg_type t
             WHERE a.aggbasetype = t.oid
             ORDER BY aggname, typname;
This query lists all of the operator classes that can be used with each access method as well as the operators that can be used with the respective operator classes.
         SELECT am.amname, opc.opcname, opr.oprname
             FROM pg_am am, pg_amop amop, pg_opclass opc, pg_operator opr
             WHERE amop.amopid = am.oid
               and amop.amopclaid = opc.oid
               and amop.amopopr = opr.oid
             ORDER BY amname, opcname, oprname;


9. This may mean different things depending on the archive mode with which each class has been created. However, the current implementation of the vacuum command does not perform any compaction or clustering of data. Therefore, the UNIX files which store each POSTGRES class never shrink and the space "reclaimed" by vacuum is never actually reused.
10. Data for certain classes may stored elsewhere if a non-standard storage manager was specified when they were created. Use of non-standard storage managers is an experimental feature that is not supported outside of Berkeley.
[ TOC ] [ Previous ] [ Next ]