Recent talks: How to get a job like mine, Command-line essentials, Restore FTW

Here are a few of the talks I’ve given recently here in Portland. I’m trying to give more talks locally, and I’m happy to speak at your Portland User Group. Just drop me an email.

  • How to get a job like mine. This talk was given to PSU students as an encouragement for them to get involved in free and open source software. Toward the end, we did a brainstorming session on the reasons why they didn’t contribute, and tried to come up with projects each person in the audience might be interested in learning more about.
  • Command-line II. I’m writing up my notes from this talk, hopefully to turn it into a real tutorial that others could copy. My goal this year is to give a tutorial every other week, and I’m hoping to have at least 10 lessons come out of that work. It seems like I need to give each lesson twice to really get the hang of it. Which means I ought to get out of this experience with 26 lessons… but I’m trying to stay realistic about my time.
  • Restores FTW at PDXPUG. This talk is about backups for PostgreSQL and how to get your teams to come up with restore plans that exercise databases as part of normal operations. I’m trying to switch talks about Backups to being talks about Restores. The next time I give this, I think I’ll change the order of the “restore patterns” to be at the start of the talk, and the discussion about planning for backups/restores to the end. I plan to do a Mozilla brownbag that covers these topics and also goes through a live demo of backing up, restoring and testing PostgreSQL with the new 9.2 tools.

WebTools workweek, start of a symbols database, Kasturba Gandhi

I came across a comment from Sumana saying that she’d like to hear more about the day-to-day life of our fellow FLOSS women. So here’s a run-down of my past week:

Mozilla WebTools team workweek

Mozilla teams hold work weeks from time to time – to get the team together, to experiment with new ideas and in our case, to meet up with a couple other teams (Marketplace and AMO, plus a couple extra folks we work a lot online with, but don’t see very often). I did my normal nerd-out things like making a spreadsheet of all the names and silly intro comments people made on the first day, and I set up and deployed backup scripts to a new 5TB backup server that’s just for crash-stats.mozilla.com’s PostgreSQL database.

There were a few projects on the table to deep-dive into: support for JSON datatypes, creating a symbols database-backed system to replace our filesystem-based one, and work a bit on replacing the SQL-file migration system in Socorro with a SQLAlchemy one.

Symbols database and Range Types

I ended up focusing on the symbols database because Ted, one of our breakpad experts, was around and very generously walked me through what we needed. I have a rough schema in place, and a plan for setting up a few systems to house what will likely be a 1TB database.

In working on this, I spent some time learning more about how to apply range types. The queries for finding symbols are mostly “show me the functions that contain the memory address I have”. Functions all have start addresses and a size, so running “contains” queries makes a lot of sense. In my initial tests, queries using the range types were about 60% faster than queries using plain integer types.
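To make the “contains” idea concrete, here is a minimal Python sketch. It is not the actual Socorro schema or query: the function names and addresses are made up, and it assumes each function covers the half-open range from its start address up to (but not including) start plus size, with no overlaps. In PostgreSQL, the equivalent test would be written with the range containment operator (@>) against a range column.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class Function(NamedTuple):
    name: str   # hypothetical symbol name
    start: int  # start address
    size: int   # size in bytes


def find_function(functions: List[Function], address: int) -> Optional[Function]:
    """Return the function whose range [start, start + size) contains address.

    `functions` must be sorted by start address and must not overlap.
    """
    starts = [f.start for f in functions]
    # Index of the last function that starts at or before the address.
    i = bisect_right(starts, address) - 1
    if i < 0:
        return None
    candidate = functions[i]
    if address < candidate.start + candidate.size:
        return candidate
    return None


# Made-up data for illustration only.
SYMBOLS = [
    Function("init", 0x1000, 0x40),
    Function("parse", 0x1040, 0x100),
    Function("render", 0x2000, 0x80),
]
```

Calling find_function(SYMBOLS, 0x1050) returns the "parse" entry, while an address that falls in the gap between two functions returns None.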

Once we’ve imported a larger data set to work with, I will post some detailed numbers about the in-database comparisons, as well as any performance improvements we’ll get from querying a database instead of loading the plain-text symbols files.

Getting JSON files to describe builds and releases

A small project I’ve been working on is getting JSON files produced to describe our builds. Before I go on — please know that this is pretty obscure. The people who are concerned about this information are mostly people who identify crashes and track down which releases are affected by particular bugs. What we keep are things like what platform (Linux, Mac OS X, Windows), what day a release occurred on, whether the release was a beta or not and a few other things.

The way that we got this information in the past was by deriving it from filenames and directory names in our release FTP server. The code to pull this information out is kind of a pain, and if anyone changes a directory name (for a good reason, or by accident), this code breaks.

It would be much better if we had a way of getting this information in a standardized format. I recently talked to B2G about putting this information into a JSON file (they already were publishing release information via the manifest directory on our FTP server in XML, so it wasn’t too big of a leap). I thought it would be nice to spread this practice to our other software releases.

As luck would have it, a person familiar with Firefox builds is in Mountain View and was giving Ted a ride to the airport! So, just as they were about to leave, we chatted about the problem, created a bug and now I’m going to get build and release information from a JSON file. 🙂

It’s a tiny change, and hopefully won’t take very long to make, but is going to make getting this information much more pleasant and reliable.
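As a rough illustration of why a JSON file is more pleasant than parsing directory names, here is a small Python sketch. The field names (product, platform, release_date, beta) and the sample values are hypothetical; the real file may be laid out quite differently.

```python
import json
from datetime import date

# Hypothetical example; the real file's field names may differ.
EXAMPLE = """
{
  "product": "Firefox",
  "platform": "Linux",
  "release_date": "2013-02-01",
  "beta": false
}
"""


def load_build_info(text: str) -> dict:
    """Parse a build-description JSON document into a dict with typed values."""
    raw = json.loads(text)
    return {
        "product": raw["product"],
        "platform": raw["platform"],
        # ISO dates parse directly, with no filename guessing involved.
        "release_date": date.fromisoformat(raw["release_date"]),
        "beta": bool(raw["beta"]),
    }
```

A missing field raises a KeyError right away, which is easier to notice than a silently misparsed directory name.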

Reading about Kasturba Gandhi

I decided to read a real paper book on my flights last week, and picked up a copy of “The Forgotten Woman”, a biography of Kasturba Gandhi, wife of Mahatma Gandhi. Arun Gandhi visited the University of Oregon in the 90s, and my husband had picked up a signed copy.

I’m having a hard time summing up the book. There were a number of things that surprised me. I hadn’t realized that illiteracy for women was so prevalent at the turn of the 20th century in India. I also wasn’t aware of the focus Mahatma Gandhi had on women’s role in political transformation, or how much he had attributed the origin of Satyagraha to Kasturba. Also, this biography attributed Gandhi’s vow of celibacy to Kasturba’s near death after the birth of her fifth child. Kasturba also led an important self-reliance movement, urging women in India to learn to spin and weave their own cloth, rather than buying foreign goods. She also led an effort to teach hygiene to Indigo farming families.

I had a look at the Wikipedia page for her, which had no citations and was not very well written. I’ve started some work on it, but need to think a bit more about how it should be structured.

Setting up HBase for Socorro

Setting up HBase for use with Socorro is a bit of a bear! The default Vagrant config sets up a VM with filesystem-only storage. For those that want to try out the HBase support, or are on a path toward setting up a production instance, these instructions might help you along the way.

You may also be interested in Lars’ recent blog posts about Socorro:

Here’s how I got it all working on an Ubuntu Precise (12.04) system, along with some scripts for launching important processes and putting test crashes into the system so you can tell that it is working. Ultimately, my goal is to incorporate all of this into some setup scripts to help new users out.

Set up HBase and Thrift

Socorro uses the Thrift API to insert new crashes and retrieve them through the middleware layer. These Quickstart instructions are pretty helpful for getting HBase installed.

Then, you need to edit

/etc/hosts

and remove the ‘127.0.1.1’ entry, and add your hostname to the localhost ‘127.0.0.1’ line. Also, it’s helpful for the defaults to add ‘crash-stats‘ and ‘crash-reports‘ as host aliases. Your final config line for localhost would look like:

127.0.0.1       localhost wuzetian crash-reports crash-stats

(where wuzetian is your hostname)
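If you want to double-check the edit, a small Python sketch like the following can confirm that your hostname and the two aliases all point at 127.0.0.1. The helper names here are my own, not part of Socorro, and the parsing only handles the simple whitespace-separated format shown above.

```python
def parse_hosts(text: str) -> dict:
    """Map each hostname or alias in /etc/hosts-style text to its IP address."""
    names = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *hostnames = line.split()
        for name in hostnames:
            # Keep the first entry seen for each name.
            names.setdefault(name, ip)
    return names


def missing_aliases(text: str, required, ip: str = "127.0.0.1") -> list:
    """Return the required names that do not map to the given IP."""
    names = parse_hosts(text)
    return [name for name in required if names.get(name) != ip]


SAMPLE = "127.0.0.1       localhost wuzetian crash-reports crash-stats\n"
```

To check the real file, pass the contents of /etc/hosts as the first argument and a list containing your own hostname, crash-reports and crash-stats as the second. An empty list back means everything resolves to localhost.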

You also need to add configuration for HBase. Here’s an example:


<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///var/tmp/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/var/tmp/zookeeper</value>
  </property>
</configuration>

That sets the locations for your HBase files and for ZooKeeper. This setup is for testing, so I put the directories in a location I can easily clear out.

Then, to start HBase and Thrift up:

/etc/init.d/hadoop-hbase-master start
/etc/init.d/hadoop-hbase-thrift start
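Before moving on, it can help to confirm that Thrift is actually accepting connections. Here is a minimal Python sketch that only tests whether a TCP port is open; it doesn’t speak the Thrift protocol. I’m assuming Thrift is listening on localhost port 9090, so adjust that if your setup differs.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False
```

For example, port_is_open("localhost", 9090) should return True once the Thrift service has finished starting, under the port assumption above.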

Setting up processor tools

The processor that looks at raw crashes runs two tools by default: minidump_stackwalk and exploitable.

You can build these from the socorro source tree with:

make minidump_stackwalk

Then make install should put these files into a useful location.

You can also just copy the binaries: minidump_stackwalk is in the stackwalk/bin directory, and the other is exploitable/exploitable.

The paths for these are configured in config/processor.ini: exploitability_tool_pathname and minidump_stackwalk_pathname

There’s also a symbols resolver configured, but I am not setting this up in my test.

Disable LZO compression for HBase (unless you have it configured)

Our HBase schema is configured to use LZO compression by default. Change that to ‘NONE’ and load the schema into HBase:

/bin/cat /home/socorro/dev/socorro/analysis/hbase_schema | sed 's/LZO/NONE/g' | /usr/bin/hbase shell

Set up crashmover

Update two lines in scripts/config/collectorconfig.py:

localFS.default = '/home/socorro/primaryCrashStore'
fallbackFS.default = '/home/socorro/fallback'

Set those to directories where you can store crash dumps.

Configure processor and monitor to use HBase

You need to set the processor up to use HBase instead of local crash storage.

The easiest way to do this is as follows:

PYTHONPATH=. python socorro/processor/processor_app.py --admin.conf=./config/processor.ini --source.crashstorage_class=socorro.external.hbase.crashstorage.HBaseCrashStorage --admin.dump_conf=config/processor2.ini
PYTHONPATH=. python socorro/processor/monitor_app.py --admin.conf=./config/monitor.ini --source.crashstorage_class=socorro.external.hbase.crashstorage.HBaseCrashStorage --admin.dump_conf=config/monitor2.ini

Then edit both files to reflect your HBase configuration.

Starting up

The docs suggest starting up four daemons in screen sessions. I mocked up a shell script and a screenrc to get you started.

And that’s it! You should now have a working system, with crashes being submitted and stashed into HBase, and the monitor and processor picking up crashes as they arrive and running the stackwalk and exploitable tools against the crashes.

Please let me know if these instructions work, or don’t work, for you.

Updates on my Lenovo X230 situation: Skype, screencap work; Vidyo not so much

Here was my wish list from before:

  • Camera working: Done! The trick was ‘uvcvideo‘, which I eventually built as a kernel module.
  • A Skitch replacement: Mostly done. I was given Shutter Project as a recommendation. I haven’t had a look at it yet. PrtSc actually takes pics of my visible desktop and I added a Firefox Addon called “Awesome Screenshot”. That solves my problems for now.
  • Vidyo working: Not working. I can now get video, and audio OUT, but I can’t hear other people. I need to dig into and troubleshoot this more. Skype, however, does work well. It does tend to flake out (slow video, loss of audio) far more on the Lenovo than on the Mac.
  • A package for my .bash_profile, .ssh and .gpg directories that I can install in any new system: Not done.
  • A better driver for the touchpad that doesn’t let my mouse jump around while I’m typing: Not done.
  • Change configuration to have the mouse behave like the latest OS X (reverse scrolling): Not done.

Overall, I feel much more comfortable on my Linux laptop now than my Mac. The mousing in particular is frustrating without buttons on the Mac.

I still switch back and forth because of Vidyo. I’m hoping in the next week or so to figure out what’s wrong with my audio and get it solved for good.

The nicest productivity improvements have been around test servers like HBase and Thrift, and being able to recompile my kernel at a moment’s notice for new features.

Abstract for PSU Tech Talk, Feb 1, 4pm

I’m doing a tech talk at PSU about open source community:

TOPIC
Collaborative chaos: what it means to write code, manage projects and work with people in open source communities

SUMMARY
Working in software and with computers means wildly different things depending on who you talk to. In open source, the work spans every aspect of software development, from marketing and documentation to troubleshooting end-user systems.

The “community manager” or “organizer” role in open source communities is probably the least well-defined in our industry, but is seen as a crucial part of open source software development.

Selena will talk about her work as a serial user group starter, open source conference circuit speaker, conference organizer and contributor to PostgreSQL — all roles considered part of community management. She’ll also talk about other kinds of community management roles available at small and large companies, or as a volunteer in an open source project. 

SHORT BIO
Selena is a major contributor to PostgreSQL. She founded and runs the Postgres Open conference and keeps chickens. Selena has been working with open source software for over 15 years. She’s keynoted at SCALE, DjangoCon and LISA, and regularly gives technical talks about Postgres, open source and trolling. She is currently a data architect at Mozilla, makers of the Firefox browser.

Current status: little victories

I’ve got a lot going on right now.

Nothing feels momentous about any particular thing. I’m trying a lot of new ideas and work, struggling, failing and trying again. The transition from the last couple of years of insane travel and starting a business to development work and staying closer to home has been a very good one.

Work

For those that have asked about my work status recently:

Mozilla is great. You can see a lot of the work I do in the Socorro commit feed. Or, in my bug feed. I hang out in #breakpad, #db and a few other channels on irc.mozilla.org. And I’m going to give a talk about Postgres and Backups on February 6th, based on the research I’ve been doing into open source solutions for binary backups.

It’s wonderful to be working in public. I love how much time I have to write software and think about database architecture. I’ve been digging out of a backlog of application and DBA-related work and just coming up to speed on Socorro for a couple months, and that’s starting to pay off.

It’s also wonderful to have coworkers, working on the same things. Most of my work life has been solitary, both in physical proximity and the work itself. Now, all my code is reviewed and I work closely with developers and engineers, daily, on everything.

PyLadies

I’ve been organizing PyLadies meetups with Flora Worley and a few others. We now have more than 60 people who have joined the Meetup, and over 20 women show up to every workshop and hackathon. It feels quite unreal to have 20 women I didn’t know a month ago showing up, forking repos and sending me commits every day. I ask newcomers to send me a commit that links them to our github landing page.

Travel/Speaking in 2013

I’m giving a talk at Portland State University on Feb 1. I’ll be in Mountain View Feb 4-8.

I’m confirmed to be speaking at PyCon March 16 about K-12 teachers and what we in the open source community can do to help them.

I’ll be speaking at a conference in Taiwan in April, and another in the US in May.

Recent talks

My most recent talk was a plenary session at LISA 2012, a USENIX conference in San Diego. It was about the false dichotomy of Education vs Training, and what we can do to improve education of sysadmins. Specifically, I gave shout outs to opsschool.org!

And…

So many other little things are going on. I restarted my sourdough and I’m reorganizing my house, one room at a time. We’re remodeling bits of the basement. We replaced a terrible light fixture in the house, and got an ESPN subscription with cable (which I love and hate at the same time). I’m reading and re-reading some lovely science fiction, at a pace of about 2 books a week. I’m walking more, catching up with family and planning things all the way into 2014.

I’m saying “no” a lot recently to doing more things, volunteering for conferences, and travel. Which is hard.

Of all the stuff I’m working on right now, PyLadies is the hardest and the most rewarding. So, I’m making space in my life for that, for the little bits of teaching I get to do, and for connecting more women with each other and the open source communities that I love.

Code review for the new PyLadies in your life

This goes out to all the geeky spouses, partners and friends of brand new programmers:

Code review is a cultural practice.

When you sit down to read the work of another, you bring with you all the experience you’ve had up to that point, the code reviews you’ve received, the mistakes you see yourself making and the bits of hard-won knowledge embedded in your coding personality.

Basically, you bring your coding baggage into your review.

When a brand new programmer shares their code with you, they are fundamentally vulnerable. They’re sharing something creative, and like any new creative endeavor, the product is a newborn taking its first few, shaky steps.

They are asking for your help and very likely, they’re asking for an indication that they’ve accomplished something. That all the time they just invested in learning something new — paid off.

And, in the case of PyLadies, women are all stepping out on a limb. Some are taking a Coursera class or maybe a workshop, but mostly working alone. We have each other to learn with and we’re all learning something new. Many people are spending 2 nights a week with a group, and another 15-20 hours a week struggling through the very first programs they’ve ever written.

Here’s the very best thing you can say when a PyLady shares her code with you:

“Thanks for sharing this!”

And then, after you’ve had a look:

“I’ve had a look and you’re doing a great job. Tell me about what you’ve written.”

Seriously. That’s about it.

If the PyLady asks specific questions, give your answers. Keep it short and sweet, and encouraging.

I’m laying this out because lots of the women who are trying this stuff out for the first time have loving, geeky spouses and friends who are very excited that the women in their lives are learning to speak their languages. And some of that enthusiasm comes in the form of detailed critique of style, formatting and design.

I’m here to let you off the hook. Just be encouraging, and ask a few open ended questions. That is all you need to do.

Because the reality is: the PyLady is her own worst critic. And, when she comes to a meetup, she can get the detailed help she needs from the other women who are struggling right along with her.

The people in the group have earned the right to share and receive feedback by struggling together. That’s the value of a cohort and one reason why PyLadies, and groups like them, are so important.

If you’re lucky, you’ll get your chance to share some code back, and maybe even write something together. But you build that coding relationship one encouraging step at a time.

If you’re interested in joining PyLadies-PDX, we’re meeting weekly through December, and then starting monthly meetings in January.

And, if you want to read more about code review in general, here are some additional blog posts I found useful:

A rosetta stone for Mac OS X installers for PostgreSQL

I’m no longer using Mac OS X for my primary desktop, but many of my coworkers and friends do. Particularly developers writing applications that use PostgreSQL (aka Postgres) for their data storage.

I’ve spent a lot of time over the last few years troubleshooting people’s Postgres installs in the following, very common, situations:

  • A developer installed Postgres on their Mac laptop >1 year ago
  • Now they need to upgrade their Postgres to help me, or support a new application that needs new features
  • They have an old database they’d like to migrate to the new version
  • They have no idea which particular Mac OS X installer they used last time

For this exact situation, I have documented some features of the Mac OS X Installers for Postgres.

And, I felt so good to see this right after I posted the wiki page earlier today:

@zacduncan: “@selenamarie This is helpful to me at this very moment. Thank you. ”

\o/

A mostly working Lenovo x230 running Ubuntu and Gnome3: Two weeks later

I’ve been planning to switch to a Linux laptop for a while, either for work or as my own laptop aged out. So, joining Mozilla was the perfect opportunity to switch over. And, I’m happy to report that I’m fully converted, enduring a few bugs that need some help, and seriously considering Gentoo to handle all the weird driver issues I’ve got.

Overall, I’m liking the new setup. It’s easier to install all the developer stuff I need like new versions of Python or PostgreSQL. Having real package management instead of an ad hoc, messy MESS of installers is an incredible relief.

I’m using Firefox for my primary browser instead of Chrome, which has made me realize how broken lots of websites I look at regularly are for most people. Also, I am exploring more plugins as a result.

My favorite feature in the Gnome window manager (and lots of window managers support this) is the ability to automatically snap windows to 1/2 or full size with the ‘window’ and arrow keys. It saves an incredible amount of time vs using a mouse to resize.

Unfortunately, I lost the epic rundown of all the problems I encountered on installation, as I encountered them. I can sum up with: the experience of desktop linux has significantly degraded in the seven or so years since I last tried to have a linux laptop as my primary workstation. Talking with friends about this has caused several to remark that Apple got it right with tightly controlling vendors and having full control over the hardware used with its operating system. Without a real commitment from a vendor toward supporting drivers, the situation seems unlikely to improve. I think the strongest hope for this is ZaReason, but they weren’t an option for my corporate laptop.

Here’s a few tidbits that might be helpful to a future x230 owner, wanting to run Ubuntu:

I’m running 12.04, Precise Pangolin.

Installed from an Ubuntu netinstall image created with: http://unetbootin.sourceforge.net/.

Here are a bunch of ppas I used, from my /etc/apt/sources.list.d directory:

deb http://ppa.launchpad.net/andreas-diesner/lightdm-fix-temporary/ubuntu precise main
deb-src http://ppa.launchpad.net/andreas-diesner/lightdm-fix-temporary/ubuntu precise main
deb http://linux.dropbox.com/ubuntu precise main
deb http://ppa.launchpad.net/fkrull/deadsnakes/ubuntu precise main
deb-src http://ppa.launchpad.net/fkrull/deadsnakes/ubuntu precise main
### THIS FILE IS AUTOMATICALLY CONFIGURED ###
# You may comment out this entry, but any other modifications may be lost.
deb http://dl.google.com/linux/musicmanager/deb/ stable main
deb http://ppa.launchpad.net/hannes-janetzek/enlightenment-svn/ubuntu precise main
deb-src http://ppa.launchpad.net/hannes-janetzek/enlightenment-svn/ubuntu precise main
deb http://ppa.launchpad.net/pitti/postgresql/ubuntu precise main
deb-src http://ppa.launchpad.net/pitti/postgresql/ubuntu precise main
deb http://ppa.launchpad.net/upubuntu-com/chat/ubuntu precise main
deb-src http://ppa.launchpad.net/upubuntu-com/chat/ubuntu precise main

There’s a painful lightdm problem fixed by a package from the first source in the above list.

I also compiled a new kernel for myself to try to fix a bad video flickering problem I’m having with my external monitor. Jury’s out on that – the flickering hasn’t entirely gone away, and it doesn’t happen to my coworker who’s got an x220 and is running Gentoo with a different kernel.

Also, my video camera doesn’t work, and I actually need it. Skype seems to work ok for voice, but not video. Vidyo, however, doesn’t work at all.

Wish list for the future:

  • Camera working
  • A Skitch replacement
  • Vidyo working
  • A package for my .bash_profile, .ssh and .gpg directories that I can install in any new system
  • A better driver for the touchpad that doesn’t let my mouse jump around while I’m typing (Yes, I have already enabled the feature, and it doesn’t work so great. Friends suggested it might be a hardware limitation.)
  • Change configuration to have the mouse behave like the latest OS X (reverse scrolling)

Here’s a few other sites that helped me out:

And, I don’t recommend trying out Enlightenment as your only window manager on your first try. You’ll need something else anyway to get your wireless configured, and if you do something stupid like trying to install ‘econnman’ and you blindly say ‘yes’ to uninstalling some packages you don’t know anything about, you’ll end up accidentally removing your wireless devices. So, start with Gnome, read up and switch to E later.

Save the Ada Initiative

If you believe that women are a crucial part of the future of free and open source software, you should give to the Ada Initiative.

If you think we should have more women contributing, talking about and using free and open source software, you should donate to the Ada Initiative today.

I spent this past summer working with Mary, Valerie and the many supporters and contributors to the Ada Initiative. I talked to past donors, and spent a lot of time writing and thinking about how the Ada Initiative has evolved.

I met hundreds of people in person and online who believe not only that the Ada Initiative is a crucial advocate for change in the world of open source, but that establishing gender balance in open source through their work is a worthwhile, achievable goal. That work includes research, writing, training and creating culture and community specifically designed for women to flourish.

They’ve created strong relationships across project, business and ideological boundaries, through their board, advisors and AdaCamps.

I’m a member of the Advisors board, a major contributor to PostgreSQL and a data architect at Mozilla. These relationships have formed into a strong, diverse and visible alliance of women in open technology.

Because of the Ada Initiative’s work, I have seen an important shift from identifying problems to seeking solutions among my colleagues in open source. This work is made possible because TAI provides full-time employment to focus on, write about and act on these solutions. Their work cannot continue without your support.

Between now and October 31, you can be the crucial donors who made this organization succeed in 2012. If you work for Microsoft, Google or Red Hat your donation will be doubled thanks to charitable giving matching programs. And individuals like Sumana Harihareswara and Leonard Richardson are sponsoring matching grants.

Social change is never easy, and organizations like the Ada Initiative, which chose to step into the void, need our support.

Take a few minutes and give to the Ada Initiative, to Mary and Val, and help their work continue in 2013.