automatic character set conversion in postgresql

Today, I encountered a few goofy characters in the data I am migrating from one ERP system to another. For example, “¢” isn’t encoded the same way in UTF-8 as it is in LATIN1. In UTF-8, “¢” is the two bytes c2 a2 (in hex), but in LATIN1 it is the single byte a2.
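If you want to see the difference for yourself, here is a minimal sketch. It uses Python purely for illustration (it is not part of my migration scripts), and the variable names are made up for the example.

```python
# Encode the cent sign (U+00A2) in both character sets and compare the bytes.
cent = "\u00a2"  # the "¢" character

utf8_bytes = cent.encode("utf-8")      # two bytes in UTF-8
latin1_bytes = cent.encode("latin-1")  # one byte in LATIN1 (ISO 8859-1)

print(utf8_bytes.hex(" "))    # c2 a2
print(latin1_bytes.hex(" "))  # a2
```

Same character, different bytes, which is why the data has to be converted rather than copied over as-is.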

I started looking for an easy Perl way to translate everything into UTF-8 on the client side, when I discovered that PostgreSQL offers automatic client-to-server character set conversions. All I have to do is specify what my client character set is.

Here’s how you can do it with an SQL command:

SET CLIENT_ENCODING TO 'LATIN1';

Substitute your character set for “LATIN1”.

Lucky for me, my database is set to UTF8, and in that case, all supported encodings on my clients will be automatically converted to UTF-8 — as long as I specify which encoding I’m using.
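To make the "as long as I specify which encoding" part concrete, here is a rough sketch in Python of what the conversion amounts to for a LATIN1 client talking to a UTF8 database. This is only an illustration of the idea, not PostgreSQL's actual conversion code; the function name latin1_to_utf8 and the sample value are hypothetical.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Reinterpret LATIN1 bytes as text, then re-encode that text as UTF-8."""
    return raw.decode("latin-1").encode("utf-8")

# Bytes as a LATIN1 client would send them: "50¢" is 35 30 a2 in hex.
client_bytes = "50\u00a2".encode("latin-1")

# What ends up stored once the conversion has been applied.
stored = latin1_to_utf8(client_bytes)
print(stored.hex(" "))  # 35 30 c2 a2

# If the client encoding is mislabelled as UTF-8, the same bytes are not
# valid input: a2 on its own is not a legal UTF-8 sequence.
try:
    client_bytes.decode("utf-8")
    mislabelled_ok = True
except UnicodeDecodeError:
    mislabelled_ok = False

print(mislabelled_ok)  # False
```

The second half is the reason the client encoding has to be declared correctly: the same bytes are fine as LATIN1 but invalid as UTF-8.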

PostgreSQL has supported UTF-8 (formerly called UNICODE in the docs) since version 7.1 (released in 2001), and in version 8.1 the conversion support for UTF-8 was expanded to all known character sets.

conference audio is up!

PostgreSQL Conference Fall 2007 audio is now available! Check it out.

I didn’t edit much, other than to eliminate break-time chatter. My apologies to Neil Conway — I missed about 10 minutes of his talk. Thank goodness for redundancy! Once I rip the video, I will update the audio and publish the whole thing.

I’m leaving early Sunday morning for a week. I’m taking a break from the interweb while I’m away. So, I look forward to catching up with everyone when I return.

conference aftermath: tired, happy

The conference was insanely great. We had incredible speakers, plentiful coffee, good food and amazing volunteers. I met so many new people and heard about a number of interesting projects that I’ll be following up on and writing about soon. Thank you to Joshua Drake and Josh Berkus for helping organize all of the big and small details.

Thank you to the sponsors:

  • Command Prompt, Inc. – and Joshua Drake for his talk on PL/Proxy
  • Continuent – and Robert Hodges in particular for giving his talk about uni/cluster
  • EnterpriseDB
  • Greenplum
  • Hyperic – nice to meet you John Walker!
  • OmniTi – and Robert Treat in particular for his talk on partitioning
  • Open Technology Group
  • Sun Microsystems, Inc. – and to Josh Berkus for his keynote on what’s new in version 8.3
  • Truviso – and Neil Conway for his talk about Query Execution, which many people wished could have continued
  • The Linux Fund – who also brought Kristine to help manage the registration desk

And special thanks to:

  • Stonehenge, Inc. – who sponsored afternoon snacks
  • Green Frog Consulting – Allen Bernstein recorded video all day
  • Portland State University and the Graduate Student Council – thanks for hosting us and special thanks to Rafael Fernández-Moctezuma for fixing the last-minute A/V problems, suggesting a coffee run in the afternoon, and just being MVP all day!
  • Daniel Browning – he took some fantastic pictures

Thank you everyone for making it happen! There were a few people that I started conversations with but inevitably got interrupted – please get in touch.

I’ve got a week of recovery (well, except for my presentation at Ignite on Thursday!), before I head off to New Orleans. I hope to have the audio from the conference available before I leave Sunday.

PostgreSQL Conference Fall 2007 – only two more days

We’re taking care of all the last-minute details – making sure we have enough coffee Saturday morning, getting nametags printed, stuffing folders and practicing (or in some cases finishing) talks.

I created a special page for my conference posts. I’ve included links to public transportation, all the maps to find your way to and from the conference location and the party, cab info, and links to all my other posts which have more detailed information.

There is free public wireless access inside the PSU engineering building. I’ll have information on how to connect when you arrive. Please send any questions you have to the attendees mailing list (here are the archives).

We are making video and audio recordings of the conference. I’ll announce here when they’re available and make them all available from the conference page.

PostgreSQL Conference Fall 2007 – Friday meetup

A few people on the mailing lists and IRC are organizing a meetup on the Friday before the PostgreSQL conference. Check out the wiki page set up to select a location and say whether or not you’ll be there! If you’re a local, vote for the location you’d like to meet at.

Here are the restaurants and bars currently on the list:

  • NW Lucky Lab – 1945 NW Quimby (pretty central, good space, outdoor seating)
  • Side Door – 425 SE Washington St (great food)
  • Paddy’s – 65 SW Yamhill Street (right off the max! excellent scotch selection)

If you’ll be there, and haven’t already subscribed to the attendees mailing list, go subscribe now so you’ll get the latest updates on events.

PostgreSQL Conference Fall 2007 – Where to eat lunch

Thanks to many suggestions from PDXPUG’s mailing list and Gabrielle Roth’s patience with Google Maps, we have a great map of the area around the conference. She was kind enough to put the nearest parking garage ($9/day) on the map as well. If you’re looking for parking that’s a bit more affordable but a longer walk away, try Smart Park. You could also park on the east side of the river, and walk across one of our beautiful bridges or hop on a bus or the Max (you’ll need to transfer from the Max to a bus or the Streetcar to get to the PSU campus). To figure out where and when to ride public transit, you can call for live help at 503-238-RIDE, or use the Trip Planner.

Portland has great food. If you’re going to be in town for a couple of days, there are great restaurants within walking distance of the downtown hotels (Higgins, Typhoon!, Saucebox, many others), on the waterfront (Three Degrees, McCormick and Schmick’s), and over on NW 23rd/21st (Paley’s Place, Lucy’s Table, Serratto, Muu Muu’s). You can get directly to NW on the Streetcar, which has a stop one block from the conference. And of course there are many food carts throughout the city. If you want even more suggestions, check out the Willamette Week’s top 100 restaurants in Portland.

Those of you who have come to Portland for OSCON the last couple of years probably know about a few of the great places on the East side of the river. We’ll be on the west side, and there’s a ton of great places to explore.

PostgreSQL Conference Fall 2007 – talk descriptions are up!

I just finished updating the Talks page for PostgreSQL Conference Fall 2007. There are so many great folks giving talks — Neil Conway, Josh Berkus, Robert Treat, David Fetter and Robert Hodges are all flying in. It’ll be great to see Josh, Robert and David again! I’m excited to meet Neil and Robert, both of whom I’ve heard great things about.

PDXPUG will be represented by Mark Wong and David Wheeler. Mark will be talking about performance, building on a talk he gave last year to PDXPUG on performance and TPC benchmarks. This conference talk will focus on practical tools one can use with PostgreSQL. David’s talk will be about his recent work with Ruby on Rails and PostgreSQL. David was kind enough to give PDXPUG’s very first talk, about PL/pgSQL.

We also have Webb Sprague from Eugene, OR coming out to talk about PostGIS. We’re hoping to get him out for user group meetings some time. Eugene is about two hours away from Portland, so we occasionally have visitors (hi Andrew!) to PDXPUG and PerlMongers from there. And of course Joshua Drake will be there.

Another great thing about having this conference at PSU is that members of the database reading group will be sure to attend. And our favorite relational algebra teachers will certainly be there.

As of today, we’ve almost filled up our event space! So if you haven’t registered yet, register now.

relational algebra talk last night was awesome

Quickly: James Terwilliger and Rafael J. Fernández-Moctezuma gave a fantastic talk about relational algebra operators (there are 9, arguably 8). They started with the set theory origins, dropped Codd’s name a few times and captivated the room. There was also mention of proof by intimidation, brandishing of duct tape, fabulous drinks mixed by Gabrielle Roth, and twenty-one people in attendance.

Before the talk, we had an impromptu “What is HOT?” talk (HOT stands for Heap-Only Tuple) from Jeff Davis. More details on that later. I think we need more 15-minute, what-is-this-new-feature-and-why-is-it-awesome talks.