PostgreSQL on the Command Line
Working efficiently with psql
I’ve been using PostgreSQL on a daily basis for the last four years. While I initially resorted to GUI applications to interact with databases, I now exclusively use the built-in command line tools, allowing me to work more efficiently while sticking to my preferred tools.
This post is split into several sections, each covering specific tasks that can be accomplished mainly through the command line tool psql. These are the topics I’ll be talking about:
- Configuring psql
- Getting Help
- Your Playground Database
- Exploring a Database
- Building a Query
- Comparing Outputs
- Cloning a Database
- Extracting Data
Configuring psql
psql has a decent configuration out of the box. Thus, I won’t go into too many details about the available options, except for two that will make everyday use of psql more enjoyable.
The first one ensures that output still displays nicely when the data doesn’t fit the terminal horizontally. By default, psql outputs the data as a nicely formatted table.
db=# SELECT 'hello' AS foo, bar FROM generate_series(1, 2) AS bar;
  foo  | bar
-------+-----
 hello |   1
 hello |   2
(2 rows)
When the width of the rendered table no longer fits the terminal width, your terminal is going to wrap the lines, making it incredibly hard to visually parse the data.
db=# SELECT 'really long string messing with the output' AS foo, bar,
'another long string making things worse' AS baz FROM generate_series(1,
2) AS bar;
foo | bar | ba
z
--------------------------------------------+-----+---------------------
--------------------
really long string messing with the output | 1 | another long string
making things worse
really long string messing with the output | 2 | another long string
making things worse
(2 rows)
Fortunately there is a command to activate the so-called expanded display mode.
db=# \x
Expanded display is on.
Repeating the query will no longer render the data as a table, but instead show every record as a key-value pair, greatly improving readability.
db=# SELECT 'really long string messing with the output' AS foo, bar,
'another long string making things worse' AS baz FROM generate_series(1,
2) AS bar;
-[ RECORD 1 ]-----------------------------------
foo | really long string messing with the output
bar | 1
baz | another long string making things worse
-[ RECORD 2 ]-----------------------------------
foo | really long string messing with the output
bar | 2
baz | another long string making things worse
The best part: by running \x auto, psql will use the table layout only if there’s enough space and otherwise resort to the key-value layout.
The second crucial setting configures how NULL is rendered. By default, it isn’t rendered at all, making it indistinguishable from an empty string.
db=# SELECT '', NULL;
?column? | ?column?
----------+----------
          |
(1 row)
By running \pset null ¤ we’re telling psql to render NULL as ¤ (or whatever string you prefer).
db=# SELECT '', NULL;
?column? | ?column?
----------+----------
          | ¤
(1 row)
Now, whenever you see an empty cell, you know it’s an empty string and not a NULL.
There’s one more (though less crucial) configuration I’d like to introduce. If you, like me, prefer to have your SQL keywords in ALL-CAPS, then this is for you. psql has basic tab-completion. By setting \set COMP_KEYWORD_CASE upper you can start writing keywords in lowercase and when pressing tab they’ll be completed and transformed to uppercase. For instance, typing sel<tab> will result in SELECT<space>.
Of course, having to type these commands to set up your environment every time you open a psql shell would be tedious. Luckily, psql evaluates the file ~/.psqlrc if it exists. Thus, placing the introduced commands in that file means every psql session will be tailored to your liking.
In summary, this is my ~/.psqlrc, which makes working with psql more enjoyable for me.
\set COMP_KEYWORD_CASE upper
\x auto
\pset null ¤
There are plenty of posts out there talking more in-depth about configuring psql, such as thoughtbot’s. psql’s documentation is also a great resource for finding further commands that might be useful in a psqlrc file.
Getting Help
Every once in a while you’ll find yourself having to run a SQL command that you only use infrequently and thus don’t remember its exact name or invocation. This is where \h comes in handy.
Without arguments, \h will show you a list of available help topics. That in itself is already a handy reference, as the help topics are the names of SQL commands such as SELECT or ALTER TABLE.
Let’s say you would like to drop the NOT NULL constraint of a column but you don’t remember exactly how. You do remember, though, that things usually get changed with ALTER. Thus, you run \h alter to show all related help topics. This will output quite a bit of information.
Because psql uses a pager to render long outputs, we can make use of the pager’s search functionality. By typing / followed by a search term and <enter>, it will highlight the first match and jump to it. With n and N we can jump to the next and previous result, respectively.
By searching for NULL we quickly find ALTER [ COLUMN ] column_name { SET | DROP } NOT NULL. Scrolling up a bit, we’ll see that this is run in the context of an ALTER TABLE. Great, we now have all the pieces to perform our desired operation without having had to consult Google.
ALTER TABLE table_name ALTER column_name DROP NOT NULL;
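Trying it out on a throwaway table (the table and column here are made up for illustration):
db=# CREATE TABLE demo (name text NOT NULL);
CREATE TABLE
db=# ALTER TABLE demo ALTER name DROP NOT NULL;
ALTER TABLE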
As you’ve seen, the \h command helps you when you get stuck with SQL commands. It does not offer any help for psql commands (also known as meta-commands or backslash-commands). This is what \? is for. It will list all backslash commands along with their descriptions. We will be encountering some of these helpful commands in the next sections.
Your Playground Database
psql is smart when connecting to a database. It assumes default values for omitted connection settings, such as localhost, standard port, etc. Thus, when I open a psql session to a local database, all I have to type is psql db_name.
Now the cool thing is that you can even omit the database name and psql will assume that you want to connect to the database named after the current user. Thus, having a database named after your computer’s user account allows you to quickly drop into psql.
$ psql
psql: FATAL: database "phil" does not exist
$ createdb `whoami`
$ psql
phil=#
This database is perfect for looking up the SQL syntax (\h) or for quickly verifying an assumption. I just type psql, test something, and I’m out. No need to think about which database to use or about creating a database in the first place.
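For instance, a typical throwaway session of mine looks something like this (a made-up example checking that concatenating with NULL yields NULL, rendered as ¤ thanks to the earlier \pset null setting):
$ psql
phil=# SELECT 'a' || NULL;
 ?column?
----------
 ¤
(1 row)
phil=# \q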
Exploring a Database
psql has a bunch of backslash commands that make exploring a database very easy. To list tables, type \dt, to list indexes \di, to list views \dv, etc. One great thing about these commands is that they accept a pattern, similar to file globbing on the command line. If you want to list all tables starting with user, simply run \dt user*.
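For example, given two hypothetical tables user_accounts and user_roles, the pattern matches both:
db=# \dt user*
           List of relations
 Schema |     Name      | Type  | Owner
--------+---------------+-------+-------
 public | user_accounts | table | phil
 public | user_roles    | table | phil
(2 rows)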
Where I work, we make extensive use of PostgreSQL schemas. In short, a schema is a kind of namespace for tables. If you haven’t heard about schemas, here’s a secret: you’re already using them, albeit only the default one called public. When you create a table foo, you’re creating it in the public schema, and you can refer to that table using public.foo. In my company’s case, we use one schema per customer.
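Here’s a minimal sketch of that namespacing (schema and table names made up):
CREATE SCHEMA google;
-- Explicitly qualified: creates the table inside the google schema.
CREATE TABLE google.users (id serial PRIMARY KEY, name text);
-- Unqualified: ends up in the default schema, i.e. public.users.
CREATE TABLE users (id serial PRIMARY KEY, name text);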
Sometimes I need to check which customers have a certain table. Assuming the table is named users, I run \dt *.users to get a list of all tables with that name, regardless of their schema.
db=# \dt *.users
List of relations
  Schema  | Name  | Type  | Owner
----------+-------+-------+-------
 apple    | users | table | phil
 google   | users | table | phil
 facebook | users | table | phil
(3 rows)
Whenever you want to get the details of a single object such as a table or a view, type \d followed by the name of that object. This will reveal all the details such as:
- the columns along with their types, whether they’re NOT NULL, and their default values
- the indexes
- check constraints
- foreign key constraints
- tables that reference this table through foreign keys
- triggers
db=# \d users
Table "public.users"
  Column  |  Type   |                     Modifiers
----------+---------+----------------------------------------------------
 id       | integer | not null default nextval('users_id_seq'::regclass)
 name     | text    | not null
 email    | text    | not null
 group_id | integer | not null
Indexes:
"users_pkey" PRIMARY KEY, btree (id)
"users_email_key" UNIQUE CONSTRAINT, btree (email)
Check constraints:
"users_email_check" CHECK (email ~ '.@.'::text)
"users_name_check" CHECK (name <> ''::text)
Foreign-key constraints:
"users_group_id_fkey" FOREIGN KEY (group_id) REFERENCES groups(id)
Referenced by:
TABLE "posts" CONSTRAINT "posts_user_id_fkey" FOREIGN KEY (user_id) REFERENCES users(id)
Backslash-commands are also very useful when working with functions. So what’s the command to list functions? I’m sure you’ve guessed it: \df, of course. However, \df without an argument doesn’t list all the functions, unlike the other listing commands above. There are just too many functions, I guess. Sometimes you’re looking for a function that you’ve used before but don’t recall its exact name. Let’s assume you’re looking for a regular expression function to replace a substring. How about searching for a function containing regexp?
db=# \df *regexp*
List of functions
   Schema   |         Name          | Result data type |  Argument data types   |  Type
------------+-----------------------+------------------+------------------------+--------
 pg_catalog | regexp_matches        | SETOF text[]     | text, text             | normal
 pg_catalog | regexp_matches        | SETOF text[]     | text, text, text       | normal
 pg_catalog | regexp_replace        | text             | text, text, text       | normal
 pg_catalog | regexp_replace        | text             | text, text, text, text | normal
 pg_catalog | regexp_split_to_array | text[]           | text, text             | normal
 pg_catalog | regexp_split_to_array | text[]           | text, text, text       | normal
 pg_catalog | regexp_split_to_table | SETOF text       | text, text             | normal
 pg_catalog | regexp_split_to_table | SETOF text       | text, text, text       | normal
(8 rows)
There it is, regexp_replace. And I even learned about some fancy functions I didn’t know existed.
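A quick sanity check in the prompt confirms it does what I want (the 'g' flag makes the replacement global):
db=# SELECT regexp_replace('hello world', 'o', '0', 'g');
 regexp_replace
----------------
 hell0 w0rld
(1 row)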
When working with your own functions, you might want to take a look at their definition and possibly even edit it. This is where \ef comes in, which stands for edit function. It takes the name of the function and, if there are multiple functions with the same name, the types of the arguments. This will open a file in your $EDITOR with that function’s definition neatly wrapped in a CREATE OR REPLACE FUNCTION, such that saving and closing the file results in its contents being evaluated, effectively allowing you to modify a function in place.
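For illustration, running \ef on a hypothetical SQL function add_numbers might open a buffer along these lines:
CREATE OR REPLACE FUNCTION public.add_numbers(a integer, b integer)
 RETURNS integer
 LANGUAGE sql
AS $function$
  SELECT a + b;
$function$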
When you only want to inspect a function’s body, it might be a good idea to exit your editor with a non-zero exit code, just to be sure that the function won’t be modified when you didn’t intend to. psql respects the exit code of your editor and will only proceed with evaluation if it is 0. In Vim you can exit with an error code using :cq.
Building a Query
The psql prompt is great for quickly writing short queries. Once the query starts to grow and your SQL command starts wrapping in the prompt, it’s time for an upgrade. This is where \e comes in handy. \e will open the last query in your editor, where you’ll get all the benefits of your favorite editor, like syntax highlighting and more. Once you save and close the file, it gets evaluated. Thus, \e is perfect for iteratively building a query.
Occasionally I want to save the query I’ve been working on to a file. I do that by typing \e to open the query in my editor, and from there I’ll save it to my desired location.
psql also works well with queries stored in files. By running \i filename, psql will evaluate the contents of that file. This is also great for queries you run periodically.
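A minimal sketch of that workflow, using a made-up query.sql:
$ echo 'SELECT count(*) FROM users;' > query.sql
$ psql db
db=# \i query.sql
 count
-------
     3
(1 row)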
\e also works great with existing files. By passing the filename as an argument, it will open the file in your editor and, same as \e without arguments, evaluate the file when saving and closing it. Unfortunately, it’s not possible to specify the name of a nonexistent file, e.g. when you’re about to start working on a new query. One workaround is to start typing the first version of the query directly in the prompt and then use \e to open the query in your editor and save it from there. Another workaround is to touch the file, either from the command line or from within psql using \!, which allows you to run any shell command, e.g. \! touch foo.sql.
The combination of \i, \e, and, especially, your own editor makes iterating on a query fast and convenient. Personally, I like to have two tmux splits side-by-side, one with Vim for editing the file containing the query and one with psql where I simply run \i query.sql. I then just jump between the two splits. Gone are the days of copy-pasting and clunky text fields in Java apps with unfamiliar keyboard shortcuts.
Comparing Outputs
When you run a command in psql, the result gets printed to the terminal. The \o command allows you to specify a file to send the output to instead of the terminal. By running \o foo, every output resulting from a command is appended to the file foo. To revert back to printing to the terminal, run \o without arguments.
You might be asking yourself: what’s this good for? Well, you could export data this way, but please don’t. There’s a proper way to export data that I’ll explain later. One specific use case I’ve found for \o is comparing multiple outputs.
Let’s assume I want to compare the query plan of two queries in my editor or diff tool. Here’s how I would do this:
db=# \o a.txt
db=# EXPLAIN SELECT * FROM users WHERE id IN (SELECT user_id FROM groups WHERE name = 'admins');
db=# \o b.txt
db=# EXPLAIN SELECT users.* FROM users LEFT JOIN groups ON groups.user_id = users.id WHERE groups.name = 'admins';
db=# \! vimdiff a.txt b.txt
I can now see both query plans side by side. This is also a great way to find out whether the results of two similar queries are the same or not.
Cloning a Database
From time to time I have to create a migration for a database. The process of developing a migration is an iterative one of trial and error, meaning that I run the migration up and down several times. This is done in order to optimize for speed and ensure the reverse migration actually reverses the changes.
The local database on which I test the migration is usually filled with data that I created manually or data from a production dump. Because I run the migration many times, it can happen that the data or schema ends up in a corrupt state. That would require me to set up the database from scratch. Recreating or reloading the data would cost me quite some time.
Thankfully, creating a copy of an existing database is easy as pie. This allows me to create a backup before running the migration for the first time. In case something goes wrong, I just delete the database, restore the backup and try again.
In order to copy a database, we create a new database and specify an existing one as the template. The resulting database is an exact copy of the template.
$ createdb -T app_db app_db_backup
To restore the backup, I simply delete the database and create a new one, using the backup as the template.
$ dropdb app_db
$ createdb -T app_db_backup app_db
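If you prefer to stay inside psql, the SQL equivalent of createdb -T is:
CREATE DATABASE app_db_backup TEMPLATE app_db;
-- Note: this fails if other sessions are connected to app_db,
-- so disconnect other clients before taking the backup.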
Extracting Data
Every now and then there’s the need to share data with your coworkers, management, or clients. Most likely you want to send them a CSV or an Excel file.
You already have your query ready, grabbing the data from all possible tables and joining them into this neat result set. Now, how do you actually get it out of the database?
Googling psql csv export reveals that there are at least two ways of accomplishing this. First, the naïve approach.
Remember, we want to output CSV. Since CSV files are simply values separated by commas, one might be tempted to change how psql prints data to screen. By turning off the pretty alignment and setting the separator to a comma, we actually do get something that resembles CSV. There’s one big problem though. If your data contains a comma, it won’t be escaped. Your output will have rows with a varying number of fields, rendering your CSV file useless.
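For completeness, here’s roughly what that naïve approach looks like and how it breaks down, sketched with made-up data (note the commas in the last value):
db=# \pset format unaligned
Output format is unaligned.
db=# \pset fieldsep ','
Field separator is ",".
db=# SELECT name, bio FROM profiles;
name,bio
Alice,Writes short bios
Bob,Enjoys hiking, cooking, and reading
(2 rows)
Bob’s row now appears to have four fields instead of two.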
The proper way of extracting CSV is by using the COPY SQL command or the \copy meta-command. Let’s assume I want to export the result of a query as CSV. Here’s how that would work:
COPY (SELECT ...)
TO '/absolute/path/export.csv'
WITH (FORMAT csv, HEADER true);
COPY has a few restrictions. First, it requires you to specify an absolute path. Second, and more annoyingly, you can only write to the file system local to the database server, i.e. if you’re opening a session to a remote database, you won’t be able to write to your computer.
This is where \copy comes in. It’s basically a wrapper around COPY that circumvents the above restrictions. The above query is rewritten as follows:
\copy (SELECT ...) TO export.csv WITH (FORMAT csv, HEADER true)
Notice how we can specify a relative path and even omit the quotes. The only limitation is that the whole command has to go on one line when entering it directly in the prompt. The reason is that backslash-commands are triggered by a newline and not by a semicolon. This does not apply, though, when using \e to enter your query through your editor.
Under the hood, \copy issues a COPY command with the same query and options, but writes to STDOUT instead of a file. psql then redirects the output into the local file, thus bypassing the aforementioned restriction.
If you’re planning on using the data in Excel, make sure to adjust the output encoding. The default encoding is UTF-8, which Excel doesn’t play well with (c’mon, we’re in 2015). Instead, go with latin1:
\copy (SELECT ...) TO export.csv WITH (FORMAT csv, HEADER true, ENCODING 'latin1')
If you have a query stored in a file, there’s no way to use the contents of that file as your copy-statement’s data source. Unfortunately, you have to copy-paste the query into a \copy statement. And don’t forget to remove the semicolon at the end of the query.
I’ve found myself repeating this too many times. Therefore, I’ve come up with a little tool called psql2csv. It allows you to run a query stored in a file (or as a string argument) and output the result as CSV to STDOUT. Saving the output is as easy as redirecting to a file.
$ psql2csv database 'SELECT ...' > export.csv
If you have a query stored in a file, redirect the file to STDIN.
$ psql2csv database < query.sql > export.csv
The tool’s interface is exactly the same as psql’s. You can simply replace your psql command with psql2csv, pass the query string as the last argument or pass it in through STDIN, and it will output the result as valid CSV. To save it, simply redirect the output to a file. But why stop there? How about piping the data to another tool that does something interesting with it, e.g. plotting it? psql2csv also accepts a --no-header argument and an --encoding=ENCODING argument, for when you need Excel compatibility.
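For instance, combining the two for an Excel-friendly export:
$ psql2csv --no-header --encoding=latin1 database < query.sql > export.csv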
Interacting with PostgreSQL solely from the command line has been great for me. I can do everything that I used to do with a graphical tool (such as pgAdmin) and much more, all in less time and with the tools I prefer. If you have so far abstained from psql for whatever reason, I hope this article has convinced you of psql’s power.
If you have any special tricks or workflows involving psql not covered here, I’d love to hear them. Please reach out!