PostgreSQL on the Command Line
Working efficiently with psql
I’ve been using PostgreSQL on a daily basis for the last four years. While I initially resorted to GUI applications to interact with databases, I now exclusively use the built-in command line tools, allowing me to work more efficiently while sticking to my preferred tools.
This post is split into several sections, each covering specific tasks that can be accomplished mainly through the command line tool psql. These are the topics I’ll be talking about:
- Configuring psql
- Getting Help
- Your Playground Database
- Exploring a Database
- Building a Query
- Comparing Outputs
- Cloning a Database
- Extracting Data
Configuring psql
psql has a decent configuration out of the box. Thus, I won’t go into too many details about the available options, except for two that will make everyday use of psql more enjoyable.
The first one ensures that output still displays nicely when the data doesn’t fit the terminal horizontally. By default, psql outputs the data as a nicely formatted table.
db=# SELECT 'hello' AS foo, bar FROM generate_series(1, 2) AS bar;
  foo  | bar
-------+-----
 hello |   1
 hello |   2
(2 rows)
When the width of the rendered table no longer fits the terminal width, your terminal is going to wrap the lines, making it incredibly hard to visually parse the data.
db=# SELECT 'really long string messing with the output' AS foo, bar,
'another long string making things worse' AS baz FROM generate_series(1,
2) AS bar;
foo | bar | ba
z
--------------------------------------------+-----+---------------------
--------------------
really long string messing with the output | 1 | another long string
making things worse
really long string messing with the output | 2 | another long string
making things worse
(2 rows)
Fortunately there is a command to activate the so-called expanded display mode.
db=# \x
Expanded display is on.
Repeating the query will no longer render the data as a table, but instead show every record as a key-value pair, greatly improving readability.
db=# SELECT 'really long string messing with the output' AS foo, bar,
'another long string making things worse' AS baz FROM generate_series(1,
2) AS bar;
-[ RECORD 1 ]-----------------------------------
foo | really long string messing with the output
bar | 1
baz | another long string making things worse
-[ RECORD 2 ]-----------------------------------
foo | really long string messing with the output
bar | 2
baz | another long string making things worse
The best part: by running \x auto, psql will use the table layout only if there’s enough space and otherwise resort to the key-value layout.
The second crucial setting configures how NULL is rendered. By default, it isn’t rendered at all, making it indistinguishable from an empty string.
db=# SELECT '', NULL;
?column? | ?column?
----------+----------
          |
(1 row)
By running \pset null ¤ we’re telling psql to render NULL as ¤ (or whatever string you prefer).
db=# SELECT '', NULL;
?column? | ?column?
----------+----------
          | ¤
(1 row)
Now, whenever you see an empty cell, you know it’s an empty string and not a NULL.
There’s one more (though less crucial) configuration I’d like to introduce. If you, like me, prefer to have your SQL keywords in ALL-CAPS, then this is for you. psql has basic tab-completion. By setting \set COMP_KEYWORD_CASE upper you can start writing keywords in lowercase and when pressing tab they’ll be completed and transformed to uppercase. For instance, typing sel<tab> will result in SELECT<space>.
Of course, having to type these commands to set up your environment every time you open a psql shell would be tedious. Luckily, psql evaluates the file ~/.psqlrc if it exists. Thus, placing the introduced commands in that file means every psql session will be tailored to your liking.
In summary, this is my ~/.psqlrc, which makes working with psql more enjoyable for me.
\set COMP_KEYWORD_CASE upper
\x auto
\pset null ¤
There are plenty of posts out there talking more in-depth about configuring psql, such as thoughtbot’s. psql’s documentation is also a great resource for finding further commands that might be useful in a psqlrc file.
Getting Help
Every once in a while you’ll find yourself having to run a SQL command that you only use infrequently and thus don’t remember its exact name or invocation. This is where \h comes in handy.
Without arguments, \h will show you a list of available help topics. That in itself is already a handy reference, as the help topics are the names of SQL commands such as SELECT or ALTER TABLE.
Let’s say you would like to drop the NOT NULL constraint of a column but you don’t remember exactly how. You do remember, though, that things usually get changed with ALTER. Thus, you run \h alter to show all related help topics. This will output quite a bit of information.
Because psql uses a pager to render long outputs, we can make use of the pager’s search functionality. By typing / followed by a search term and <enter>, it will highlight the first match and jump to it. With n and N we can jump to the next and previous result, respectively.
By searching for NULL we quickly find ALTER [ COLUMN ] column_name { SET | DROP } NOT NULL. Scrolling up a bit, we’ll see that this is run in the context of an ALTER TABLE. Great, we now have all the pieces to perform our desired operation without having had to consult Google.
ALTER TABLE table_name ALTER column_name DROP NOT NULL;
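Trying it out on a throwaway table (the table and column here are made up for illustration):
db=# CREATE TABLE demo (name text NOT NULL);
CREATE TABLE
db=# ALTER TABLE demo ALTER name DROP NOT NULL;
ALTER TABLE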
As you’ve seen, the \h command helps you when you get stuck with SQL commands. It does not offer any help for psql commands (also known as meta-commands or backslash-commands). This is what \? is for. It will list all backslash commands along with their descriptions. We will be encountering some of these helpful commands in the next sections.
Your Playground Database
psql is smart when connecting to a database. It assumes default values for omitted connection settings, such as localhost, standard port, etc. Thus, when I open a psql session to a local database, all I have to type is psql db_name.
Now the cool thing is that you can even omit the database name and psql will assume that you want to connect to the database named after the current user. Thus, having a database named after your computer’s user account allows you to quickly drop into psql.
$ psql
psql: FATAL: database "phil" does not exist
$ createdb `whoami`
$ psql
phil=#
This database is perfect for looking up the SQL syntax (\h) or for quickly verifying an assumption. I just type psql, test something, and I’m out. No need to think about which database to use or about creating a database in the first place.
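For instance, a typical throwaway session of mine looks something like this (a made-up example checking that concatenating with NULL yields NULL, rendered as ¤ thanks to the earlier \pset null setting):
$ psql
phil=# SELECT 'a' || NULL;
 ?column?
----------
 ¤
(1 row)
phil=# \q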
Exploring a Database
psql has a bunch of backslash commands that make exploring a database very easy. To list tables, type \dt, to list indexes \di, to list views \dv, etc. One great thing about these commands is that they accept a pattern, similar to file globbing on the command line. If you want to list all tables starting with user, simply run \dt user*.
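For example, given two hypothetical tables user_accounts and user_roles, the pattern matches both:
db=# \dt user*
           List of relations
 Schema |     Name      | Type  | Owner
--------+---------------+-------+-------
 public | user_accounts | table | phil
 public | user_roles    | table | phil
(2 rows)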
Where I work, we make extensive use of PostgreSQL schemas. In short, a schema is a kind of namespace for tables. If you haven’t heard about schemas, here’s a secret: you’re already using them, albeit only the default one called public. When you create a table foo, you’re creating it in the public schema, and you can refer to that table using public.foo. In my company’s case, we use one schema per customer.
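Here’s a minimal sketch of that namespacing (schema and table names made up):
CREATE SCHEMA google;
-- Explicitly qualified: creates the table inside the google schema.
CREATE TABLE google.users (id serial PRIMARY KEY, name text);
-- Unqualified: ends up in the default schema, i.e. public.users.
CREATE TABLE users (id serial PRIMARY KEY, name text);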
Sometimes I need to check which customers have a certain table. Assuming the table is named users, I run \dt *.users to get a list of all tables with that name, regardless of their schema.
db=# \dt *.users
List of relations
  Schema  | Name  | Type  | Owner
----------+-------+-------+-------
 apple    | users | table | phil
 google   | users | table | phil
 facebook | users | table | phil
(3 rows)
Whenever you want to get the details of a single object such as a table or a view, type \d followed by the name of that object. This will reveal all the details such as:
- the columns along with their types, whether they’re NOT NULL, and their default values
- the indexes
- check constraints
- foreign key constraints
- tables that reference this table through foreign keys
- triggers
db=# \d users
Table "public.users"
  Column  |  Type   |                     Modifiers
----------+---------+----------------------------------------------------
 id       | integer | not null default nextval('users_id_seq'::regclass)
 name     | text    | not null
 email    | text    | not null
 group_id | integer | not null
Indexes:
"users_pkey" PRIMARY KEY, btree (id)
"users_email_key" UNIQUE CONSTRAINT, btree (email)
Check constraints:
"users_email_check" CHECK (email ~ '.@.'::text)
"users_name_check" CHECK (name <> ''::text)
Foreign-key constraints:
"users_group_id_fkey" FOREIGN KEY (group_id) REFERENCES groups(id)
Referenced by:
TABLE "posts" CONSTRAINT "posts_user_id_fkey" FOREIGN KEY (user_id) REFERENCES users(id)
Backslash-commands are also very useful when working with functions. So what’s the command to list functions? I’m sure you’ve guessed it: \df, of course. However, \df without an argument doesn’t list all the functions, unlike the other listing commands above. There are just too many functions, I guess. Sometimes you’re looking for a function that you’ve used before but don’t recall its exact name. Let’s assume you’re looking for a regular expression function to replace a substring. How about searching for a function containing regexp?
db=# \df *regexp*
List of functions
   Schema   |         Name          | Result data type |  Argument data types   |  Type
------------+-----------------------+------------------+------------------------+--------
 pg_catalog | regexp_matches        | SETOF text[]     | text, text             | normal
 pg_catalog | regexp_matches        | SETOF text[]     | text, text, text       | normal
 pg_catalog | regexp_replace        | text             | text, text, text       | normal
 pg_catalog | regexp_replace        | text             | text, text, text, text | normal
 pg_catalog | regexp_split_to_array | text[]           | text, text             | normal
 pg_catalog | regexp_split_to_array | text[]           | text, text, text       | normal
 pg_catalog | regexp_split_to_table | SETOF text       | text, text             | normal
 pg_catalog | regexp_split_to_table | SETOF text       | text, text, text       | normal
(8 rows)
There it is, regexp_replace. And I even learned about some fancy functions I didn’t know existed.
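A quick sanity check in the prompt confirms it does what I want (the 'g' flag makes the replacement global):
db=# SELECT regexp_replace('hello world', 'o', '0', 'g');
 regexp_replace
----------------
 hell0 w0rld
(1 row)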
When working with your own functions, you might want to take a look at their definition and possibly even edit it. This is where \ef comes in, which stands for edit function. It takes the name of the function and, if there are multiple functions with the same name, the types of the arguments. This will open a file in your $EDITOR with that function’s definition neatly wrapped in a CREATE OR REPLACE FUNCTION, such that saving and closing the file results in its contents being evaluated, effectively allowing you to modify a function in place.
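For illustration, running \ef on a hypothetical SQL function add_numbers might open a buffer along these lines:
CREATE OR REPLACE FUNCTION public.add_numbers(a integer, b integer)
 RETURNS integer
 LANGUAGE sql
AS $function$
  SELECT a + b;
$function$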
When you only want to inspect a function’s body, it might be a good idea to exit your editor with a non-zero exit code, just to be sure that the function won’t be modified when you didn’t intend to. psql respects the exit code of your editor and will only proceed with evaluation if it is 0. In Vim you can exit with an error code using :cq.
Building a Query
The psql prompt is great for quickly writing short queries. Once the query starts to grow and your SQL command starts wrapping in the prompt, it’s time for an upgrade. This is where \e comes in handy. \e will open the last query in your editor, where you’ll get all the benefits of your favorite editor, like syntax highlighting and more. Once you save and close the file, it gets evaluated. Thus, \e is perfect for iteratively building a query.
Occasionally I want to save the query I’ve been working on to a file. I do that by typing \e to open the query in my editor, and from there I’ll save it to my desired location.
psql also works well with queries stored in files. By running \i filename, psql will evaluate the contents of that file. This is also great for queries you run periodically.
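A minimal sketch of that workflow, using a made-up query.sql:
$ echo 'SELECT count(*) FROM users;' > query.sql
$ psql db
db=# \i query.sql
 count
-------
     3
(1 row)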
\e also works great with existing files. By passing the filename as an argument, it will open the file in your editor and, same as \e without arguments, evaluate the file when saving and closing it. Unfortunately, it’s not possible to specify the name of a nonexistent file, e.g. when you’re about to start working on a new query. One workaround is to start typing the first version of the query directly in the prompt and then use \e to open the query in your editor and save it from there. Another workaround is to touch the file, either from the command line or from within psql using \!, which allows you to run any shell command, e.g. \! touch foo.sql.
The combination of \i, \e, and, especially, your own editor makes iterating on a query fast and convenient. Personally, I like to have two tmux splits side-by-side, one with Vim for editing the file containing the query and one with psql where I simply run \i query.sql. I then just jump between the two splits. Gone are the days of copy-pasting and clunky text fields in Java apps with unfamiliar keyboard shortcuts.
Comparing Outputs
When you run a command in psql, the result gets printed to the terminal. The \o command allows you to specify a file to send the output to instead of the terminal. By running \o foo, every output resulting from a command is appended to the file foo. To revert back to printing to the terminal, run \o without arguments.
You might be asking yourself: what’s this good for? Well, you could export data this way, but please don’t. There’s a proper way to export data that I’ll explain later. One specific use case I’ve found for \o is comparing multiple outputs.
Let’s assume I want to compare the query plan of two queries in my editor or diff tool. Here’s how I would do this:
db=# \o a.txt
db=# EXPLAIN SELECT * FROM users WHERE id IN (SELECT user_id FROM groups WHERE name = 'admins');
db=# \o b.txt
db=# EXPLAIN SELECT users.* FROM users LEFT JOIN groups ON groups.user_id = users.id WHERE groups.name = 'admins';
db=# \! vimdiff a.txt b.txt
I can now see both query plans side by side. This is also a great way to find out whether the results of two similar queries are the same or not.
Cloning a Database
From time to time I have to create a migration for a database. The process of developing a migration is an iterative one of trial and error, meaning that I run the migration up and down several times. This is done in order to optimize for speed and ensure the reverse migration actually reverses the changes.
The local database on which I test the migration is usually filled with data that I created manually or data from a production dump. Because I run the migration many times, it can happen that the data or schema ends up in a corrupt state. That would require me to set up the database from scratch. Recreating or reloading the data would cost me quite some time.
Thankfully, creating a copy of an existing database is easy as pie. This allows me to create a backup before running the migration for the first time. In case something goes wrong, I just delete the database, restore the backup and try again.
In order to copy a database, we create a new database and specify an existing one as the template. The resulting database is an exact copy of the template.
$ createdb -T app_db app_db_backup
To restore the backup, I simply delete the database and create a new one, using the backup as the template.
$ dropdb app_db
$ createdb -T app_db_backup app_db
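If you prefer to stay inside psql, the SQL equivalent of createdb -T is:
CREATE DATABASE app_db_backup TEMPLATE app_db;
-- Note: this fails if other sessions are connected to app_db,
-- so disconnect other clients before taking the backup.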
Extracting Data
Every now and then there’s the need to share data with your coworkers, management, or clients. Most likely you want to send them a CSV or an Excel file.
You already have your query ready, grabbing the data from all possible tables and joining them into this neat result set. Now, how do you actually get it out of the database?
Googling psql csv export reveals that there are at least two ways of accomplishing this. First, the naïve approach.
Remember, we want to output CSV. Since CSV files are simply values separated by commas, one might be tempted to change how psql prints data to screen. By turning off the pretty alignment and setting the separator to a comma, we actually do get something that resembles CSV. There’s one big problem though. If your data contains a comma, it won’t be escaped. Your output will have rows with a varying number of fields, rendering your CSV file useless.
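For completeness, here’s roughly what that naïve approach looks like and how it breaks down, sketched with made-up data (note the commas in the last value):
db=# \pset format unaligned
Output format is unaligned.
db=# \pset fieldsep ','
Field separator is ",".
db=# SELECT name, bio FROM profiles;
name,bio
Alice,Writes short bios
Bob,Enjoys hiking, cooking, and reading
(2 rows)
Bob’s row now appears to have four fields instead of two.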
The proper way of extracting CSV is by using the COPY SQL command or the \copy meta-command. Let’s assume I want to export the result of a query as CSV. Here’s how that would work:
COPY (SELECT ...)
TO '/absolute/path/export.csv'
WITH (FORMAT csv, HEADER true);
COPY has a few restrictions. First, it requires you to specify an absolute path. Second, and more annoyingly, you can only write to the file system local to the database server, i.e. if you’re opening a session to a remote database, you won’t be able to write to your computer.
This is where \copy comes in. It’s basically a wrapper around COPY that circumvents the above restrictions. The above query is rewritten as follows:
\copy (SELECT ...) TO export.csv WITH (FORMAT csv, HEADER true)
Notice how we can specify a relative path and even omit the quotes. The only limitation is that the whole command has to go on one line when entering it directly in the prompt. The reason is that backslash-commands are triggered by a newline and not by a semicolon. This does not apply, though, when using \e to enter your query through your editor.
Under the hood, \copy issues a COPY command with the same query and options, but writes to STDOUT instead of a file. psql then redirects the output into the local file, thus bypassing the aforementioned restriction.
If you’re planning on using the data in Excel, make sure to adjust the output encoding. The default encoding is UTF-8, which Excel doesn’t play well with (c’mon, we’re in 2015). Instead, go with latin1:
\copy (SELECT ...) TO export.csv WITH (FORMAT csv, HEADER true, ENCODING 'latin1')
If you have a query stored in a file, there’s no way to use the contents of that file as your copy-statement’s data source. Unfortunately, you have to copy-paste the query into a \copy statement. And don’t forget to remove the semicolon at the end of the query.
I’ve found myself repeating this too many times. Therefore, I’ve come up with a little tool called psql2csv. It allows you to run a query stored in a file (or as a string argument) and output the result as CSV to STDOUT. Saving the output is as easy as redirecting to a file.
$ psql2csv database 'SELECT ...' > export.csv
If you have a query stored in a file, redirect the file to STDIN.
$ psql2csv database < query.sql > export.csv
The tool’s interface is exactly the same as psql’s. You can simply replace your psql command with psql2csv, pass the query string as the last argument or pass it in through STDIN, and it will output the result as valid CSV. To save it, simply redirect the output to a file. But why stop there? How about piping the data to another tool that does something interesting with it, e.g. plotting it? psql2csv also accepts a --no-header argument and an --encoding=ENCODING argument, for when you need Excel compatibility.
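For instance, combining the two for an Excel-friendly export:
$ psql2csv --no-header --encoding=latin1 database < query.sql > export.csv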
Interacting with PostgreSQL solely from the command line has been great for me. I can do everything that I used to do with a graphical tool (such as pgAdmin) and much more, all in less time and with the tools I prefer. If you have so far abstained from psql for whatever reason, I hope this article has convinced you of psql’s power.
If you have any special tricks or workflows involving psql not covered here, I’d love to hear them. Please reach out!