PostgreSQL 9.0.1 released, includes security fix & maintenance releases for 6 other versions

The PostgreSQL Global Development Group released new maintenance versions today: 9.0.1, 8.4.5, 8.3.12, 8.2.18, 8.1.22, 8.0.26 and 7.4.30. This is the final update for PostgreSQL versions 7.4 and 8.0. There’s a security issue in there involving procedural languages, and a detailed description of the vulnerability is on our wiki. A key thing to remember is that the issue primarily affects people who use SECURITY DEFINER along with a procedural language function. PL/pgSQL is not affected, but any other procedural language with a “trusted” mode is. This includes PL/Perl, PL/Tcl, PL/Python (7.4 or earlier) and others. The new versions fix issues in PL/Perl and PL/Tcl. A patch for PL/PHP is currently in the works.

Most developers feel that the security issue is relatively obscure. If you aren’t using a procedural language with some mechanism for altering privileges (SET ROLE or SECURITY DEFINER, for example), you aren’t vulnerable and can upgrade Postgres during your next regularly scheduled downtime. If you *are* vulnerable, we recommend reviewing the functions that may be affected, and preventing their exploitation by upgrading as soon as you can.
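
If you want a quick way to see which of your functions might be in scope, here’s a rough starting point (just a sketch I’d run out of curiosity, not an official audit query): list SECURITY DEFINER functions written in a procedural language other than PL/pgSQL. It won’t catch code that uses SET ROLE internally, so treat it as a starting point only.

SELECT n.nspname, p.proname, l.lanname
  FROM pg_proc p
  JOIN pg_language l ON l.oid = p.prolang
  JOIN pg_namespace n ON n.oid = p.pronamespace
 WHERE p.prosecdef
   AND l.lanispl
   AND l.lanname <> 'plpgsql'
 ORDER BY 1, 2;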

From the FAQ:

What is the level of risk associated with this exploit?

Low. It requires all of the following:

  • An attacker must have an authenticated connection to the database server.
  • The attacker must be able to execute arbitrary statements over that connection.
  • The attacker must have a strong knowledge of PostgreSQL.
  • Your application must include procedures or functions in an external procedural language.
  • These functions and procedures must be executed by users with greater privileges than the attacker, using SECURITY DEFINER or SET ROLE, and using the same connection as the attacker.

This was also the first release for which I generated release notes! 😀

Here was my list of interesting changes for the announcement:

  • Prevent show_session_authorization() from crashing within autovacuum processes, backpatched to all supported versions;
  • Fix connection leak after duplicate connection name errors, fix handling of connection names longer than 62 bytes and improve contrib/dblink’s handling of tables containing dropped columns, backpatched to all supported versions;
  • Defend against functions returning setof record where not all the returned rows are actually of the same rowtype, backpatched to 8.0;
  • Fix possible duplicate scans of UNION ALL member relations, backpatched to 8.2;
  • Reduce PANIC to ERROR on infrequent btree failure cases, backpatched to 8.2;
  • Add hstore(text, text) function to contrib/hstore, to support migration away from the => operator, which was deprecated in 9.0 (see the short example after this list). Function support backpatched to 8.2;
  • Treat exit code 128 as non-fatal on Win32, backpatched to 8.2;
  • Fix failure to mark cached plans as transient, causing CREATE INDEX CONCURRENTLY to not be used right away, backpatched to 8.3;
  • Fix evaluation of outer joins whose inner side is a sub-select with non-strict expressions in its output list, backpatched to 8.4;
  • Allow full SSL certificate verification to succeed in the case where both host and hostaddr are specified, backpatched to 8.4;
  • Improve parallel restore’s ability to cope with selective restore (-L option), backpatched to 8.4 with caveats;
  • Fix failure of “ALTER TABLE t ADD COLUMN c serial” when done by non-owner, 9.0 only.
  • Several bugfixes for join removal, 9.0 only.
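
As a quick illustration of the hstore item above (a minimal sketch, assuming contrib/hstore is installed in your database), you can now build a single-pair hstore without the deprecated => operator:

SELECT hstore('a', 'b');   -- produces "a"=>"b"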

If you have a look at a new tool called git_changelog that Robert Haas and Tom Lane committed to the repo, you can use it to find the commit IDs for the various features (you need the whole source tree to run it :)).
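
For example (a sketch; this assumes a full checkout of the PostgreSQL git repository and that the script lives under src/tools), you can run it from the top of the tree and page through the consolidated per-branch history:

$ cd postgresql                      # your clone of the PostgreSQL repo
$ ./src/tools/git_changelog | less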

You’ll find that there are a lot of commits in these sets. We haven’t had a minor release since May 2010, so they kind of added up.

Any other changes in there you think we should have mentioned in the announcement? Let me know in the comments.

Download new versions now:

Custom aggregates: a couple tips and ORDER BY in 9.0

A friend asked about a way to report the first three semesters that a group of students were documented as being present, and report those values each in a column.

The tricky thing is that the semesters students attend are rarely the same. I started out with a very naive query (and sorry for the bad formatting that follows… I need to find some good SQL formatting markup) just to get some initial results:


select student,
  (SELECT semester as sem1 FROM assoc a2
     WHERE a2.student IN (a1.student)
     ORDER BY sem1 LIMIT 1) as sem1,
  (SELECT semester as sem1 FROM assoc a2
     WHERE a2.student IN (a1.student)
     ORDER BY sem1 LIMIT 1 OFFSET 1) as sem2,
  (SELECT semester as sem1 FROM assoc a2
     WHERE a2.student IN (a1.student)
     ORDER BY sem1 LIMIT 1 OFFSET 2) as sem3
FROM assoc a1
WHERE student IN (SELECT student FROM assoc GROUP BY student HAVING count(*) > 2)
GROUP BY student;

That query pretty much sucks, requiring five sequential scans of ‘assoc’:

                                     QUERY PLAN                                     
 HashAggregate  (cost=3913.13..315256.94 rows=78 width=2)
   ->  Hash Semi Join  (cost=1519.18..3718.08 rows=78017 width=2)
         Hash Cond: (a1.student = assoc.student)
         ->  Seq Scan on assoc a1  (cost=0.00..1126.17 rows=78017 width=2)
         ->  Hash  (cost=1518.20..1518.20 rows=78 width=32)
               ->  HashAggregate  (cost=1516.26..1517.42 rows=78 width=2)
                     Filter: (count(*) > 2)
                     ->  Seq Scan on assoc  (cost=0.00..1126.17 rows=78017 width=2)
   SubPlan 1
     ->  Limit  (cost=1326.21..1326.22 rows=1 width=3)
           ->  Sort  (cost=1326.21..1328.71 rows=1000 width=3)
                 Sort Key: a2.semester
                 ->  Seq Scan on assoc a2  (cost=0.00..1321.21 rows=1000 width=3)
                       Filter: (student = a1.student)
   SubPlan 2
     ->  Limit  (cost=1331.22..1331.22 rows=1 width=3)
           ->  Sort  (cost=1331.21..1333.71 rows=1000 width=3)
                 Sort Key: a2.semester
                 ->  Seq Scan on assoc a2  (cost=0.00..1321.21 rows=1000 width=3)
                       Filter: (student = a1.student)
   SubPlan 3
     ->  Limit  (cost=1334.14..1334.14 rows=1 width=3)
           ->  Sort  (cost=1334.14..1336.64 rows=1000 width=3)
                 Sort Key: a2.semester
                 ->  Seq Scan on assoc a2  (cost=0.00..1321.21 rows=1000 width=3)
                       Filter: (student = a1.student)

So, he reminded me about custom aggregates! I did a little searching and found an example function, to which I added an extra CASE branch that stops the aggregate from adding more than three items to the returned array:


CREATE FUNCTION array_append_not_null(anyarray, anyelement)
RETURNS anyarray
AS '
SELECT CASE
         WHEN $2 IS NULL THEN $1
         WHEN array_upper($1, 1) > 2 THEN $1
         ELSE array_append($1, $2)
       END
'
LANGUAGE sql IMMUTABLE RETURNS NULL ON NULL INPUT;

And finally, I declared an aggregate:


CREATE AGGREGATE three_semesters_not_null (
sfunc = array_append_not_null,
basetype = anyelement,
stype = anyarray,
initcond = '{}'
);

One problem though – we want the array returned to contain only the first three semesters, rather than any three semesters a student has a record for. Meaning, we need to sort the information passed to the aggregate function. We could do this inside the aggregate itself (bubble sort, anyone?) or we can presort the input! I chose presorting, to avoid writing a really ugly CASE statement.

My query (compatible with 8.3 or higher):


SELECT sorted.student, three_semesters_not_null(sorted.semester)
FROM (SELECT student, semester from assoc order by semester ) as sorted
WHERE
sorted.student IN (select a.student from assoc a group by a.student HAVING count(*) > 2)
GROUP BY sorted.student;

Which yields the much nicer query plan, requiring just two sequential scans:

                                      QUERY PLAN                                      
 HashAggregate  (cost=11722.96..11725.46 rows=200 width=64)
   ->  Hash Semi Join  (cost=10052.32..11570.82 rows=30427 width=64)
         Hash Cond: (assoc.student = a.student)
         ->  Sort  (cost=8533.14..8728.18 rows=78017 width=5)
               Sort Key: assoc.semester
               ->  Seq Scan on assoc  (cost=0.00..1126.17 rows=78017 width=5)
         ->  Hash  (cost=1518.20..1518.20 rows=78 width=32)
               ->  HashAggregate  (cost=1516.26..1517.42 rows=78 width=2)
                     Filter: (count(*) > 2)
                     ->  Seq Scan on assoc a  (cost=0.00..1126.17 rows=78017 width=2)

I ran my queries by Magnus, and he reminded me that what I really needed was ORDER BY in my aggregate! Fortunately, 9.0 has exactly this feature:


SELECT student,
three_semesters_not_null(semester order by semester asc ) as first_three_semesters
FROM assoc
WHERE student IN (select student from assoc group by student HAVING count(*) > 2)
GROUP BY student;

Which results in the following plan:

                                        QUERY PLAN                                        
 GroupAggregate  (cost=11125.05..11711.15 rows=78 width=5)
   ->  Sort  (cost=11125.05..11320.09 rows=78017 width=5)
         Sort Key: public.assoc.student
         ->  Hash Semi Join  (cost=1519.18..3718.08 rows=78017 width=5)
               Hash Cond: (public.assoc.student = public.assoc.student)
               ->  Seq Scan on assoc  (cost=0.00..1126.17 rows=78017 width=5)
               ->  Hash  (cost=1518.20..1518.20 rows=78 width=32)
                     ->  HashAggregate  (cost=1516.26..1517.42 rows=78 width=2)
                           Filter: (count(*) > 2)
                           ->  Seq Scan on assoc  (cost=0.00..1126.17 rows=78017 width=2)

A final alternative would be to transform the IN query into a JOIN:


SELECT a.student,
three_semesters_not_null(a.semester order by a.semester asc ) as first_three_semesters
FROM assoc a
JOIN (select student from assoc group by student HAVING count(*) > 2) as b ON b.student = a.student
GROUP BY a.student;

And the plan isn’t much different:

                                        QUERY PLAN                                        
 GroupAggregate  (cost=11125.05..11711.15 rows=78 width=5)
   ->  Sort  (cost=11125.05..11320.09 rows=78017 width=5)
         Sort Key: a.student
         ->  Hash Join  (cost=1519.18..3718.08 rows=78017 width=5)
               Hash Cond: (a.student = assoc.student)
               ->  Seq Scan on assoc a  (cost=0.00..1126.17 rows=78017 width=5)
               ->  Hash  (cost=1518.20..1518.20 rows=78 width=32)
                     ->  HashAggregate  (cost=1516.26..1517.42 rows=78 width=2)
                           Filter: (count(*) > 2)
                           ->  Seq Scan on assoc  (cost=0.00..1126.17 rows=78017 width=2)

Any other suggestions for this type of query?

I’ve attached the file I was using to test this out.
custom_aggregates.sql

PostgreSQL 9.0: contributions!

New releases are opportunities for reflection!

As PostgreSQL grows, I would like to know how many people are contributing at any time. This is difficult to measure, given how many people contribute in ways not visible to the internet – advocating for PostgreSQL at work, sharing information about PostgreSQL offline in any way, or developing code related to PostgreSQL that isn’t shared directly back.

PostgreSQL developers have a habit of mentioning the people involved in the development of features in the commit logs. This includes people who discuss topics on the mailing list, who report bugs, provide test cases or send in patches. I spent a bit of time digging through the commit logs and pulling out unique names that are mentioned. This is a lossy process, as the log files are long, names are not always easy to spot, and I only spent 6 hours going through it.

I’ve made it through the 9.0 logs (16163 lines) and the 8.4 logs (21257 lines) so far.

Here’s some basic information about them:

  • 8.4 logs mention about 230 unique people (11 committers)
  • 9.0 logs mention about 275 unique people (16 committers)
  • 8.4 development contained 2293 commits, with commits per author broken down below;
  • 9.0 development contained 1703 commits, with commits per author broken down below.

I’m working on graphs of the number of lines inserted or deleted by each author, but I need more time to work out how to present the information. Some interesting trends emerge about the role each committer plays – particularly that a couple of people seem to be “gardeners” of the code, removing a lot of lines, sometimes more than they add. With a project as old as ours (first commit in 1996!), this maintenance work is critical.

I also did some grepping for key words in the commit messages:

word          times mentioned   times mentioned   times mentioned
              in 8.3            in 8.4            in 9.0
review        24                14                49
cute          29                26                25
tom lane      904               901               635
gripe         37                48                26
hot standby   0                 5                 48
replication   18                4                 52

You’re welcome to explore our git repo at git.postgresql.org. Thanks to all the folks who worked on the git migration over the past few months, and finally made our transition from CVS to git complete last night!

PostgreSQL 9.0 media coverage roundup

As you probably noticed, we released PostgreSQL 9.0 today!

There were quite a few stories, and here’s what I’ve got so far:

CouchCamp 2010: yay!


Max in a tree! Talking about GeoCouch

I was at CouchCamp last week out at the Walker Creek Ranch – a bit disconnected (no cell service, and spotty internet), but fully immersed in the CouchDB community.

I was there to give a talk on MVCC in PostgreSQL. I forgot to mention it during my talk, but it was a fitting topic given that I first talked with JChris after a talk he gave in Portland, where I basically trolled him about compaction and MVCC in CouchDB. My goal was to show people the benefits of CouchDB’s built-in MVCC, to point out some places where core developers can learn from PostgreSQL and avoid some of the traps we’ve fallen into over the years. I’ve got more to say about the talk some other day, but I wanted to just reflect on CouchCamp for a moment.

One comment a friend made was, “Wow, these people are just so nice!” And it’s true. Every hacker meetup I attend is full of people who are overwhelmingly kind and thoughtful, and CouchCamp was more of the same.

CouchDB is at a critical point in its development – 1.0 is out the door, and developers are already building cool apps on top of it. CouchApps + Evently are an interesting and fun way to get started building things on top of a couch. And replication parties – seriously awesome. Ward Cunningham is rumored to be considering a CouchDB wiki to drive the patterns repository wiki (And here it is! Thanks, Max!), and CouchCamp was overflowing with ideas and implementations (distributed social, a replacement for email, UbuntuOne).

So what did I learn at CouchCamp? I learned how to hack on a CouchApp (Thanks for the help, JChris!). I learned about what Max Ogden is up to, and am so excited for him and the lucky folks that get to work with him. (and he’s running a hack/project night next weekend you should TOTALLY GO TO!)

I heard about the successes and tribulations of running CouchDB on the desktop, and the launch of UbuntuOne from Stuart Langridge. During his talk, Stuart brought up the idea of a general replication API – something that I also believe is important to the growth of open source databases and critical to enabling data freedom. I met a real, live Pick user/admin/developer, and we talked about the difficulty of moving off that system, but also the possibility of interfacing something like CouchDB to it. I got to chat with Rebecca Murphey about Javascript, MVCC and quality booze. I saw bunnies, foxes, deer, raccoons, and tons of bright stars late at night. And, I saw Damien Katz perform a brief interpretive dance.

I also was pointed to a retrospective on Couch 1.0 development by Ted Leung. I don’t know Noah Slater, but wow, what a testimonial. Noah’s comments about why he continues to contribute to CouchDB mirror a recent thread about PostgreSQL contribution — we work on these open source projects because of the incredible community that develops around them.

Thanks, Mikael, JChris, Jan and Damien, and all the CouchDB folks for creating a community that so many people want to contribute and become a part of. I certainly want to be a part of it, and look forward to finding ways of contributing more.

And thanks for bringing us all together in person. From the squirt guns in the welcome bag, to the campfire and sing-alongs, to the very late night Android libc storytelling by Aaron… These are the moments that glue us all together, and make all that work we do to connect up with one another through software completely worth it.

Background reading: Locking (1970)

I’ve been reading some old papers about locking and MVCC in preparation for writing about MVCC in PostgreSQL, and for giving a talk at CouchCamp next week!

I just finished “Locking“, by Jim Gray. He discusses semaphores, and makes the argument for implementing a locking scheduler to handle errors and deadlocks (which he calls interlocks, or a “deadly embrace” – a term I’m sad we’ve stopped using).

An example from the start of the paper illustrates the power of MVCC:

The classic example is an accounting file. Processes reading the file may share it concurrently. However, a process requesting write access to the file blocks until all processes currently reading have released the file.

A lovely thing about Postgres’ MVCC is that readers (SELECT) don’t require this type of lock, and most writers don’t block readers. For a SELECT, the only statements that will block it are those that physically move all of a table’s rows around (VACUUM FULL, CLUSTER, REINDEX, TRUNCATE), or that change the table’s structure (ALTER TABLE, DROP TABLE).
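
Here’s a tiny illustration (a sketch with a hypothetical accounts table and two concurrent psql sessions, at the default READ COMMITTED isolation level):

-- session 1: update a row, but don't commit yet
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

-- session 2: this SELECT is not blocked by the open UPDATE;
-- it simply sees the old version of the row until session 1 commits
SELECT balance FROM accounts WHERE id = 1;

-- session 1: finish up
COMMIT;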

Have a look at the explicit locking docs for more detail on the lock modes automatically used by PostgreSQL.

Explaining MVCC in Postgres: system defined columns

I’m playing around with some diagrams for explaining MVCC that I’ll be posting here over the next few days. Not sure if I’ll end up giving up on slides and just use a whiteboard for the talk. I made an illustrated shared buffers deck to go along with Greg Smith’s excellent talk on shared buffers a while back. This is the beginning of a talk that I hope will emulate that.

Here are my first few slides, showing the system-defined columns. The next few slides will describe optimizations PostgreSQL has for managing the side effects of our pessimistic rollback strategy, and reducing IO during vacuuming and index updates.
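
If you want to poke at the system-defined columns yourself, they’re hidden from SELECT * but you can ask for them by name (a quick sketch against a hypothetical table t):

-- xmin/xmax are the inserting and deleting transaction IDs,
-- cmin/cmax are command IDs within those transactions,
-- and ctid is the row version's physical location (block, item)
SELECT xmin, xmax, cmin, cmax, ctid, * FROM t;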

Variable substitution with psql

Updated: Thanks @johto for s/:bar/:foo/. 🙂

A coworker asked about variable substitution with psql using \set, and so I looked into it a bit further.

You definitely can do things like this:

16:55 sdeckelmann@[local]:5432|postgres=> \set test 'select * from :foo limit 10;'
16:56 sdeckelmann@[local]:5432|postgres=> \set foo 'test'
16:56 sdeckelmann@[local]:5432|postgres=> :test
myint
-------
1
2
3
4
5
6
7
8
9
10
(10 rows)

But, what about something like this:


=> \set test 'select * from :var limit 10;'
=> :test mytable

Unfortunately, this isn’t supported.

The best you could do is something pathological like:

=> \set s 'select * from '
=> \set pr ' limit 10;'
=> :s mytable :pr
=> :s test :pr
myint
-------
1
2
3
4
5
6
7
8
9
10
(10 rows)
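
One workaround that does let you hand the table name in from outside is to set the variable on the psql command line with -v (a sketch, using a hypothetical script file named query.sql):

$ cat query.sql
select * from :foo limit 10;
$ psql -v foo=test -f query.sql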

Using logger with pg_standby

Piping logs to syslog is pretty useful for automating log rotation and forwarding lots of different logs to a central log server.

To that end, the command-line utility ‘logger’ is nice for piping output from utilities like pg_standby without having to add syslogging code to the utility itself. Another plus is that logger ships by default with modern syslog packages.

Here’s an easy way to implement this:


restore_command = 'pg_standby -d -s 2 -t /pgdata/trigger /shared/wal_archive/ %f %p %r 2>&1 | logger -p local3.info -t pgstandby'
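
On the syslog side (a sketch – the exact config file, daemon and log path vary by system), a rule like this routes those messages to their own file:

# in /etc/syslog.conf or /etc/rsyslog.conf
local3.info    /var/log/pg_standby.log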

Online aggregation paper from 1997 and PSU’s database reading group

A couple weeks ago, Mark Wong and I took a field trip over to the Database Reading Group at Portland State University. It’s a group of students and professors that meet weekly throughout the school year to go over research papers. The papers are picked by the participants, and vary in topic from obscure to very practical.

This week’s paper reading was led by Professor Len Shapiro, and titled “Online Aggregation“. The paper is considered a foundational paper about SQL aggregates (like COUNT() or AVG()), and was published in 1997 by researchers from UC Berkeley and IBM. It’s also a precursor to research into query parallelization and streaming databases. It was awarded the SIGMOD “Test of Time” award in 2007, and is cited by over 170 other papers in the ACM archive.

The basic idea behind the paper is improving the user experience of reporting results for aggregate queries – asking how to solve three key problems when computing aggregates: blocking, fairness and control (all from a user’s perspective). Roughly: blocking is what happens when some part of the system waits and, because of that waiting, doesn’t return results to the user. Fairness concerns whether certain types of operations prevent certain groups of data from being processed (the example given had to do with GROUP BY and groups being processed one at a time). Control concerns whether or not a user can exert control over the speed of computation applied to a group (the example given being a lever that “speeds up” processing of a set).

One insight from the paper is how online aggregates should be treated differently than traditional query processing – which might favor expensive plans involving sorts so that the output is ordered. When you’re dealing with online aggregates, you prefer unordered, or ideally random order, because your intermediate results will be more representative of the ultimate result. I guess that’s probably obvious once you think about it, but the paper provided some concrete examples.

Another interesting thought experiment involving the planner is how you pick plans that favor non-blocking, fairness and user control. None of those properties is narrowly defined, and each changes based on individual user expectations. Professor Kristen Tufte mentioned that she’d be interested in how the ideas presented in this paper would be applied today, and Professor David Maier brought up that today we might be most interested in applications that involve managing Hadoop.

During a discussion about optimizing JOIN algorithms for online aggregates, Prof Maier also brought up an interesting paper on alternating nested loop joins. Another cool thing about the paper is that it involved modifications to Postgres! Granted, it was Postgres95, which doesn’t resemble the modern PostgreSQL 9.0 very much, but it was nice to revisit research built on Postgres that’s still relevant today.