[ TOC ]
[ Previous ]
[ Next ]
15. ADMINISTERING POSTGRES
In this section, we will discuss aspects of POSTGRES
that are of interest to those who make extensive use of
POSTGRES, or who are the site administrator for a group
of POSTGRES users.
15.1. Frequent Tasks
Here we will briefly discuss some procedures that you
should be familiar with in managing any POSTGRES
installation.
15.1.1. Starting the Postmaster
If you did not install POSTGRES exactly as described in
the installation instructions, you may have to perform
some additional steps before starting the postmaster
process.
- Even if you were not the person who installed POSTGRES,
you should understand the installation
instructions. The installation instructions explain
some important issues with respect to where POSTGRES
places some important files, proper settings for
environment variables, etc. that may vary from one
version of POSTGRES to another.
- You must start the postmaster process with the userid
that owns the installed database files. In most
cases, if you have followed the installation
instructions, this will be the user "postgres". If
you do not start the postmaster with the right userid,
the backend servers that are started by the
postmaster will not be able to read the data.
- Make sure that
/usr/local/postgres95/bin
is in your
shell command path, because the postmaster will use
your PATH to locate POSTGRES commands.
- Remember to set the environment variable PGDATA to
the directory where the POSTGRES databases are
installed. (This variable is more fully explained
in the POSTGRES installation instructions.)
- If you do start the postmaster using non-standard
options, such as a different TCP port number, remember
to tell all users so that they can set their
PGPORT environment variable correctly.
15.1.2. Shutting Down the Postmaster
If you need to halt the postmaster process, you can use
the UNIX kill(1) command. Some people habitually use
the -9 or -KILL option; this should never be necessary
and we do not recommend that you do this, as the postmaster
will be unable to free its various shared
resources, its child processes will be unable to exit
gracefully, etc.
15.1.3. Adding and Removing Users
The createuser and destroyuser commands enable and disable
access to POSTGRES by specific users on the host
system.
15.1.4. Periodic Upkeep
The vacuum command should be run on each database periodically.
This command processes deleted instances[9]
and, more importantly, updates the system statistics
concerning the size of each class. If these statistics
are permitted to become out-of-date and inaccurate, the
POSTGRES query optimizer may make extremely poor decisions
with respect to query evaluation strategies.
Therefore, we recommend running vacuum every night or
so (perhaps in a script that is executed by the UNIX
cron(1) or at(1) commands).
Do frequent backups. That is, you should either back
up your database directories using the POSTGRES copy
command and/or the UNIX dump(1) or tar(1) commands.
You may think, "Why am I backing up my database? What
about crash recovery?" One side effect of the POSTGRES
"no overwrite" storage manager is that it is also a "no
log" storage manager. That is, the database log stores
only abort/commit data, and this is not enough information
to recover the database if the storage medium
(disk) or the database files are corrupted! In other
words, if a disk block goes bad or POSTGRES happens to
corrupt a database file, you cannot recover that file.
This can be disastrous if the file is one of the shared
catalogs, such as pg_database.
15.1.5. Tuning
Once your users start to load a significant amount of
data, you will typically run into performance problems.
POSTGRES is not the fastest DBMS in the world, but many
of the worst problems encountered by users are due to
their lack of experience with any DBMS. Some general
tips include:
- Define indices over attributes that are commonly
used for qualifications. For example, if you
often execute queries of the form
SELECT * from EMP where salary < 5000
then a B-tree index on the salary attribute will
probably be useful. If scans involving equality
are more common, as in
SELECT * from EMP where salary = 5000
then you should consider defining a hash index
on salary. You can define both, though it will
use more disk space and may slow down updates a
bit. Scans using indices are much faster than
sequential scans of the entire class.
- Run the vacuum command a lot. This command
updates the statistics that the query optimizer
uses to make intelligent decisions; if the
statistics are inaccurate, the system will make
inordinately stupid decisions with respect to
the way it joins and scans classes.
- When specifying query qualfications (i.e., the
where part of the query), try to ensure that a
clause involving a constant can be turned into
one of the form range_variable operator constant, e.g.,
EMP.salary = 5000
The POSTGRES query optimizer will only use an
index with a constant qualification of this
form. It doesn't hurt to write the clause as
5000 = EMP.salary
if the operator (in this case, =) has a commutator
operator defined so that POSTGRES can
rewrite the query into the desired form. However,
if such an operator does not exist, POSTGRES
will never consider the use of an index.
- When joining several classes together in one
query, try to write the join clauses in a
"chained" form, e.g.,
where A.a = B.b and B.b = C.c and ...
Notice that relatively few clauses refer to a
given class and attribute; the clauses form a
linear sequence connecting the attributes, like
links in a chain. This is preferable to a query
written in a "star" form, such as
where A.a = B.b and A.a = C.c and ...
Here, many clauses refer to the same class and
attribute (in this case, A.a). When presented
with a query of this form, the POSTGRES query
optimizer will tend to consider far more choices
than it should and may run out of memory.
- If you are really desperate to see what query
plans look like, you can run the postmaster with
the -d option and then run monitor with the -t
option. The format in which query plans will be
printed is hard to read but you should be able
to tell whether any index scans are being performed.
15.2. Infrequent Tasks
At some time or another, every POSTGRES site
administrator has to perform all of the following actions.
15.2.1. Cleaning Up After Crashes
The postgres server and the postmaster run as two
different processes. They may crash separately or
together. The housekeeping procedures required to fix
one kind of crash are different from those required to
fix the other.
The message you will usually see when the backend
server crashes is:
FATAL: no response from backend: detected in ...
This generally means one of two things: there is a bug
in the POSTGRES server, or there is a bug in some user
code that has been dynamically loaded into POSTGRES.
You should be able to restart your application and
resume processing, but there are some considerations:
- POSTGRES usually dumps a core file (a snapshot
of process memory used for debugging) in the
database directory
/usr/local/postgres95/data/base/<database>/core
on the server machine. If you don't want to try
to debug the problem or produce a stack trace to
report the bug to someone else, you can delete
this file (which is probably around 10MB).
- When one backend crashes in an uncontrolled way
(i.e., without calling its built-in cleanup
routines), the postmaster will detect this situation,
kill all running servers and reinitialize
the state shared among all backends (e.g., the
shared buffer pool and locks). If your server
crashed, you will get the "no response" message
shown above. If your server was killed because
someone else's server crashed, you will see the
following message:
I have been signalled by the postmaster.
Some backend process has died unexpectedly and possibly
corrupted shared memory. The current transaction was
aborted, and I am going to exit. Please resend the
last query. -- The postgres backend
- Sometimes shared state is not completely cleaned
up. Frontend applications may see errors of the
form:
WARN: cannot write block 34 of myclass [mydb] blind
In this case, you should kill the postmaster and
restart it.
- When the system crashes while updating the system
catalogs (e.g., when you are creating a
class, defining an index, retrieving into a
class, etc.) the B-tree indices defined on the
catalogs are sometimes corrupted. The general
(and non-unique) symptom is that all queries
stop working. If you have tried all of the
above steps and nothing else seems to work, try
using the reindexdb command. If reindexdb succeeds
but things still don't work, you have
another problem; if it fails, the system catalogs
themselves were almost certainly corrupted
and you will have to go back to your backups.
The postmaster does not usually crash (it doesn't do
very much except start servers) but it does happen on
occasion. In addition, there are a few cases where it
encounters problems during the reinitialization of
shared resources. Specifically, there are race conditions
where the operating system lets the postmaster
free shared resources but then will not permit it to
reallocate the same amount of shared resources (even
when there is no contention).
You will typically have to run the ipcclean command if
system errors cause the postmaster to crash. If this
happens, you may find (using the UNIX ipcs(1) command)
that the "postgres" user has shared memory and/or
semaphores allocated even though no postmaster process
is running. In this case, you should run ipcclean as
the "postgres" user in order to deallocate these
resources. Be warned that all such resources owned by
the "postgres" user will be deallocated. If you have
multiple postmaster processes running on the same
machine, you should kill all of them before running
ipcclean (otherwise, they will crash on their own when
their shared resources are suddenly deallocated).
15.2.2. Moving Database Directories
By default, all POSTGRES databases are stored in
separate subdirectories under
/usr/local/postgres95/data/base
.[10] At some point, you
may find that you wish to move one or more databases to
another location (e.g., to a filesystem with more free
space).
If you wish to move all of your databases to the new
location, you can simply:
- Kill the postmaster.
- Copy the entire data directory to the new location
(making sure that the new files are owned by user
"postgres").
% cp -rp /usr/local/postgres95/data /new/place/data
- Reset your PGDATA environment variable (as described
earlier in this manual and in the installation
instructions).
# using csh or tcsh...
% setenv PGDATA /new/place/data
# using sh, ksh or bash...
% PGDATA=/new/place/data; export PGDATA
- Restart the postmaster.
% postmaster &
- After you run some queries and are sure that the
newly-moved database works, you can remove the old
data directory.
% rm -rf /usr/local/postgres95/data
To install a single database in an alternate directory
while leaving all other databases in place, do the following:
- Create the database (if it doesn't already exist)
using the createdb command. In the following steps
we will assume the database is named foo.
- Kill the postmaster.
- Copy the directory
/usr/local/postgres95/data/base/foo
and its contents
to its ultimate destination. It should still be
owned by the "postgres" user.
% cp -rp /usr/local/postgres95/data/base/foo /new/place/foo
- Remove the directory
/usr/local/postgres95/data/base/foo
:
% rm -rf /usr/local/postgres95/data/base/foo
- Make a symbolic link from
/usr/local/postgres95/data/base
to the new directory:
% ln -s /new/place/foo /usr/local/postgres95/data/base/foo
- Restart the postmaster.
15.2.3. Updating Databases
POSTGRES is a research system. In general, POSTGRES
may not retain the same binary format for the storage
of databases from release to release. Therefore, when
you update your POSTGRES software, you will probably
have to modify your databases as well. This is a common
occurrence with commercial database systems as
well; unfortunately, unlike commercial systems, POSTGRES
does not come with user-friendly utilities to make
your life easier when these updates occur.
In general, you must do the following to update your
databases to a new software release:
- Extensions (such as user-defined types, functions,
aggregates, etc.) must be reloaded by re-executing
the SQL CREATE commands. See Appendix A for more
details.
- Data must be dumped from the old classes into ASCII
files (using the COPY command), the new classes created
in the new database (using the CREATE TABLE
command), and the data reloaded from the ASCII files.
- Rules and views must also be reloaded by
reexecuting the various CREATE commands.
You should give any new release a "trial period"; in
particular, do not delete the old database until you
are satisfied that there are no compatibility problems
with the new software. For example, you do not want to
discover that a bug in a type's "input" (conversion
from ASCII) and "output" (conversion to ASCII) routines
prevents you from reloading your data after you have
destroyed your old databases! (This should be standard
procedure when updating any software package, but some
people try to economize on disk space without applying
enough foresight.)
15.3. Database Security
Most sites that use POSTGRES are educational or
research institutions and do not pay much attention to
security in their POSTGRES installations. If desired,
one can install POSTGRES with additional security
features. Naturally, such features come with additional
administrative overhead that must be dealt with.
15.3.1. Kerberos
POSTGRES can be configured to use the MIT Kerberos network
authentication system. This prevents outside
users from connecting to your databases over the network
without the correct authentication information.
15.4. Querying the System Catalogs
As an administrator (or sometimes as a plain user), you
want to find out what extensions have been added to a
given database. The queries listed below are "canned"
queries that you can run on any database to get simple
answers. Before executing any of the queries below, be
sure to execute the POSTGRES vacuum command. (The
queries will run much more quickly that way.) Also,
note that these queries are also listed in
/usr/local/postgres95/tutorial/syscat.sql
so use cut-and-paste (or the \i command) instead of
doing a lot of typing.
This query prints the names of all database adminstrators
and the name of their database(s).
SELECT usename, datname
FROM pg_user, pg_database
WHERE usesysid = int2in(int4out(datdba))
ORDER BY usename, datname;
This query lists all user-defined classes in the
database.
SELECT relname
FROM pg_class
WHERE relkind = 'r' -- not indices
and relname !~ '^pg_' -- not catalogs
and relname !~ '^Inv' -- not large objects
ORDER BY relname;
This query lists all simple indices (i.e., those that
are not defined over a function of several attributes).
SELECT bc.relname AS class_name,
ic.relname AS index_name,
a.attname
FROM pg_class bc, -- base class
pg_class ic, -- index class
pg_index i,
pg_attribute a -- att in base
WHERE i.indrelid = bc.oid
and i.indexrelid = ic.oid
and i.indkey[0] = a.attnum
and a.attrelid = bc.oid
and i.indproc = '0'::oid -- no functional indices
ORDER BY class_name, index_name, attname;
This query prints a report of the user-defined
attributes and their types for all user-defined classes
in the database.
SELECT c.relname, a.attname, t.typname
FROM pg_class c, pg_attribute a, pg_type t
WHERE c.relkind = 'r' -- no indices
and c.relname !~ '^pg_' -- no catalogs
and c.relname !~ '^Inv' -- no large objects
and a.attnum > 0 -- no system att's
and a.attrelid = c.oid
and a.atttypid = t.oid
ORDER BY relname, attname;
This query lists all user-defined base types (not
including array types).
SELECT u.usename, t.typname
FROM pg_type t, pg_user u
WHERE u.usesysid = int2in(int4out(t.typowner))
and t.typrelid = '0'::oid -- no complex types
and t.typelem = '0'::oid -- no arrays
and u.usename <> 'postgres'
ORDER BY usename, typname;
This query lists all left-unary (post-fix) operators.
SELECT o.oprname AS left_unary,
right.typname AS operand,
result.typname AS return_type
FROM pg_operator o, pg_type right, pg_type result
WHERE o.oprkind = 'l' -- left unary
and o.oprright = right.oid
and o.oprresult = result.oid
ORDER BY operand;
This query lists all right-unary (pre-fix) operators.
SELECT o.oprname AS right_unary,
left.typname AS operand,
result.typname AS return_type
FROM pg_operator o, pg_type left, pg_type result
WHERE o.oprkind = 'r' -- right unary
and o.oprleft = left.oid
and o.oprresult = result.oid
ORDER BY operand;
This query lists all binary operators.
SELECT o.oprname AS binary_op,
left.typname AS left_opr,
right.typname AS right_opr,
result.typname AS return_type
FROM pg_operator o, pg_type left, pg_type right, pg_type result
WHERE o.oprkind = 'b' -- binary
and o.oprleft = left.oid
and o.oprright = right.oid
and o.oprresult = result.oid
ORDER BY left_opr, right_opr;
This query returns the name, number of arguments
(parameters) and return type of all user-defined C
functions. The same query can be used to find all
built-in C functions if you change the "C" to "internal",
or all SQL functions if you change the "C" to
"sql".
SELECT p.proname, p.pronargs, t.typname
FROM pg_proc p, pg_language l, pg_type t
WHERE p.prolang = l.oid
and p.prorettype = t.oid
and l.lanname = 'c'
ORDER BY proname;
This query lists all of the aggregate functions that
have been installed and the types to which they can be
applied. count is not included because it can take any
type as its argument.
SELECT a.aggname, t.typname
FROM pg_aggregate a, pg_type t
WHERE a.aggbasetype = t.oid
ORDER BY aggname, typname;
This query lists all of the operator classes that can
be used with each access method as well as the operators
that can be used with the respective operator
classes.
SELECT am.amname, opc.opcname, opr.oprname
FROM pg_am am, pg_amop amop, pg_opclass opc, pg_operator opr
WHERE amop.amopid = am.oid
and amop.amopclaid = opc.oid
and amop.amopopr = opr.oid
ORDER BY amname, opcname, oprname;
9.
This may mean different things depending on the archive
mode with which each class has been created. However, the
current implementation of the vacuum command does not perform any compaction or clustering of data. Therefore, the
UNIX files which store each POSTGRES class never shrink and
the space "reclaimed" by vacuum is never actually reused.
10.
Data for certain classes may stored elsewhere if a
non-standard storage manager was specified when they were
created. Use of non-standard storage managers is an experimental feature that is not supported outside of Berkeley.
[ TOC ]
[ Previous ]
[ Next ]