Back to posting (and numpy)

Jesus, when it rains, it pours.

I haven’t had time to post in quite a while – somewhat due to the NIH goldrush and somewhat due to the fact that I have been trying to get some manuscripts out the door that have been hanging around for far too long. Then, there’s planning-time wasted for NIH projects that didn’t pan out or that we couldn’t get together before the deadlines.  I’m pretty sure everyone is sort of in the same boat.

Anyway, I’ve been working today with some things in numpy, which is probably one of my favorite libraries (in the whole world) for Python. There are things you can do in numpy that, quite honestly, are probably far simpler than they should be.

Generally speaking, numpy is meant for calculations involving large arrays. That said, the flexibility of numpy continues to amaze – recently (within the past year), I’ve used it with great success for sequence comparisons and SNP characterization (numpy arrays can hold string values). This morning, I’ve used it to calculate summary counts of interactions between many objects over periods of time (essentially 100 x 200 x 200 element arrays). Of course, the array-based nature of numpy lets you maintain structure (i.e., temporal structure) within the data while performing the needed summary operations over the data.
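As a rough sketch of both uses mentioned above (the sequences and the random interaction counts are made up for illustration, and the variable names are assumptions, not the code actually used that morning):

```python
import numpy as np

# Two short, made-up DNA sequences of equal length, one base per element.
seq_a = np.array(list("ACGTACGT"))
seq_b = np.array(list("ACGAACGC"))

# Elementwise comparison gives a boolean mask; the positions where the
# sequences differ are candidate SNPs.
mismatches = seq_a != seq_b
snp_positions = np.flatnonzero(mismatches)

# Hypothetical interaction data: 100 time steps of a 200 x 200 matrix, where
# counts[t, i, j] is the number of interactions between i and j at time t.
rng = np.random.default_rng(0)
counts = rng.integers(0, 5, size=(100, 200, 200))

# Summing over the two object axes keeps the temporal axis:
# one total per time step.
per_time_totals = counts.sum(axis=(1, 2))

# Summing over the time axis instead gives one 200 x 200 matrix
# of pairwise totals.
pairwise_totals = counts.sum(axis=0)
```

The point is that the `axis` argument decides which structure survives the summary: collapse the object axes and you keep the time series, collapse time and you keep the pairwise matrix.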

Numpy wouldn’t be quite so great if it weren’t for the impressive number of methods available for numpy objects or its integration into the larger SciPy project. Per usual, numpy arrays can be subclassed to do whatever you want.
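For the subclassing bit, here is a minimal sketch of the usual pattern for deriving from `np.ndarray`. The class name `LabeledArray` and its `label` attribute are hypothetical, picked only to show the mechanics:

```python
import numpy as np

class LabeledArray(np.ndarray):
    """Hypothetical ndarray subclass that carries a free-text label."""

    def __new__(cls, data, label=""):
        # View the input data as our subclass, then attach the extra attribute.
        obj = np.asarray(data).view(cls)
        obj.label = label
        return obj

    def __array_finalize__(self, obj):
        # Also called for views and slices; inherit the label from the
        # parent array when there is one.
        if obj is None:
            return
        self.label = getattr(obj, "label", "")

counts = LabeledArray([[1, 2], [3, 4]], label="interactions")
first_row = counts[0]  # slicing returns a LabeledArray with the same label
```

Because slices go through `__array_finalize__`, `first_row.label` is still `"interactions"`.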

Anyway, if you haven’t used it – do. And if you have, then you’re probably as impressed with it as I am.


the NSF website background

Browsing over to the NSF today to do a little work, I came across a common annoyance that arises with respect to their website background image. Basically, on my setup (OS X; Safari), the background tends to flicker/wobble, inducing a sort of nauseous, pre-seizure condition. Here’s a fix, which is worth it if you spend a lot of time at the NSF site.

For basic website filtering, I use Privoxy, which is a great way to reduce common website annoyances like cookies, javascript, etc. Beyond the defaults, you can also create your own actions to change the appearance of particular websites to your liking (similar to Greasemonkey for Firefox, although the implementation is different).

Long story short, if you have Privoxy installed (I get it via MacPorts), you can do the following to fix the “problem”:

sudo nano /opt/local/etc/privoxy/config
# uncomment the following line around line 375:
actionsfile user.action # User customizations

# while you’re at it, change line 917 to the following
# which allows you to edit things via the handy web interface
# at http://p.p/. Beware of multi-user security issues.
enable-edit-actions 1

Now, create or edit /opt/local/etc/privoxy/user.filter, adding the following contents (the code is at Snipplr due to WordPress filtering):

Privoxy nsf filter

Once completed:

  • visit http://p.p/
  • click “View & change the current configuration”
  • choose to edit “/opt/local/etc/privoxy/user.action”
  • choose to “Insert a new section at top”
  • choose “Edit” under Action
  • select “filter nsf” from the list
  • click “Submit”
  • “Add” a url pattern
  • type “.nsf.gov”
  • click “OK”

Problem solved. The NSF website should now look like this.


a new day dawns

Today, I felt prouder of my country and its citizens than at any other point in my life. America the beautiful, indeed.


Management of Lab Resources

Although I hate to be a bit of a ninny about these things… there is one thing that (1) takes quite a bit of time out of each week and (2) I hate. What could it be? In very general terms: “lab resource management”. Cases in point: the inevitable (and often required) management of chemical inventories, SOPs, MSDSs, orders, receipts, equipment, etc.

At the moment, I am working on putting together valid SOPs for the lab (this should have been done months ago). Of course, I could throw these together in LaTeX, MS Word, or Pages and keep them as files on my computer, printing out “hard-copies” whenever they are needed. Adjusting a single document requires that I open the relevant file, make the changes, and save the resulting copy. Wholesale adjustments to all files are time-consuming, and accessing the content of each document for various purposes at a later time, while not impossible, is less than fun.

I would really like all of this type of data to be in a more flexible format than that imposed by any of the options above. In short, I want to store my data (for each task) in a database (RDBMS), such that I can interact with it and produce the output that I require at any given time in whatever format I desire… be that PDF, HTML, text, etc.
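To make the “one record, many output formats” idea concrete, here is a small sketch using only the Python standard library. The record contents and the function names `render_text` and `render_html` are hypothetical, and the dictionary stands in for a row that would really come from a database query:

```python
from html import escape

# Hypothetical SOP record, as it might come back from a database query.
sop = {
    "title": "Gel electrophoresis",
    "revision": 3,
    "body": "Use 1% agarose & 1x TAE.",
}

def render_text(record):
    """Render a record as plain text."""
    return f"{record['title']} (rev. {record['revision']})\n\n{record['body']}\n"

def render_html(record):
    """Render the same record as an HTML fragment, escaping special characters."""
    return (
        f"<h1>{escape(record['title'])} (rev. {record['revision']})</h1>\n"
        f"<p>{escape(record['body'])}</p>\n"
    )

text_version = render_text(sop)
html_version = render_html(sop)
```

The data lives in one place and each output format is just another rendering function, so a wholesale change to the layout means editing one function rather than every document.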


Well, basically, the RDBMS path provides a lot of different options for the maintenance of the data and the production of each output format. For instance, I can set up the RDBMS such that updates to a particular record (say, the description of a single SOP) trigger a procedure to increment the revision number for that document, while also recording the date the SOP was updated and the username of the individual making the changes.

While this is not terribly helpful if you only have 1-2 documents to keep updated, it becomes terrifically powerful on a larger scale (100s to 1000s of documents).  Furthermore, it helps me maintain the integrity of the data present in each record.  In other words, if many of the fields are updated automatically, it removes problems associated with forgetting to make (or improperly making) manual changes to those fields of that record.
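Here is a minimal sketch of that kind of trigger. It uses SQLite through Python’s built-in `sqlite3` module purely so the example is self-contained, as a stand-in for MySQL or PostgreSQL; the table, column, and trigger names are all hypothetical. SQLite has no notion of a database user, so the username is passed in by the application here, whereas a server RDBMS could fill it in from the session (e.g. `current_user` in PostgreSQL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sop (
    id          INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    description TEXT NOT NULL,
    revision    INTEGER NOT NULL DEFAULT 1,
    updated_on  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_by  TEXT NOT NULL
);

-- Fires only when an UPDATE targets the description column.
CREATE TRIGGER sop_bump_revision
AFTER UPDATE OF description ON sop
BEGIN
    UPDATE sop
    SET revision = OLD.revision + 1,
        updated_on = CURRENT_TIMESTAMP
    WHERE id = NEW.id;
END;
""")

# Create one SOP, then change its description as a different user.
conn.execute(
    "INSERT INTO sop (title, description, updated_by) VALUES (?, ?, ?)",
    ("Gel electrophoresis", "Use 1% agarose.", "alice"),
)
conn.execute(
    "UPDATE sop SET description = ?, updated_by = ? WHERE id = ?",
    ("Use 1.5% agarose.", "bob", 1),
)

# The revision number was bumped by the trigger, not by the application.
revision, updated_by = conn.execute(
    "SELECT revision, updated_by FROM sop WHERE id = 1"
).fetchone()
```

After the update, `revision` is 2 without the application ever touching that column, which is exactly the sort of bookkeeping that is easy to forget when done by hand.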

Well, what’s the problem, then?

The largest barrier to current progress is the fact that on-the-fly PDF creation is not terribly fun or easy. In some circumstances, PDFs are neither necessary nor desirable. But, for the purpose of creating SOPs to be posted in the lab or for completing our departmental purchase requests, they are required.

I would prefer to use something that interacts with an open-source RDBMS (MySQL or PostgreSQL) and Python. Basically, this appears to mean that I will be working with ReportLab – which is certainly powerful enough to do what I both want and need. That said, with power comes complexity, and I don’t find the thought of wading through the ReportLab documentation enjoyable. Furthermore, although I have found some 3rd party documentation, there is really not too much information out there.

Perhaps the ideal solution would be to locate a software package that would allow me to interact with pre-existing PDF documents/forms.  That way, I can be lazy and use a relatively benign tool like Illustrator or LaTeX or MS Excel to create the layout for the forms that I want to use, adding the content to these forms using a combination of Python and RDBMS.

Does such a thing exist?


No time like the present…

Well, well, well. I guess I have finally decided (after much internal debate) to take a more personal step into (onto?) the intertubes that have served me well since the time when just having an email addy was the shit. It’s exciting, fun, a little scary, and, I hope, a good idea.

That said, this step is not – ahem – as personal as it could be. Given the nature of this blog (genetics, computers, molecular biology, whatever I want), my potty-mouth, and my current and future position(s) in academia, I have chosen to write using a pseudonym.

Of course, pseudonymity does not ensure that the writing will be interesting or good. However, in a backwards way, it does ensure that I can write openly about what I perceive to be the truth – while avoiding repercussions. Recently, The Boss and I were discussing the utility of blogging and writing under a pseudonym. Our consensus was largely that:

  • blogging is an outlet that can help us (generally) perform better at other, similar tasks (journals, lab notebooks, etc)
  • blogging/writing are fundamentally different from writing research manuscripts, therefore they are “good” or at least “fun”
  • pseudonymity is probably best, particularly for those at the early career stages

Of course, good things can be used for evil, and writing under a pseudonym is certainly no different. But, don’t come here for unsupported tirades, slander, etc. That’s not how I roll.

So, you may be wondering how I do, in fact, roll. Well, since this is a new thing for me, I can’t say that I’m exactly sure. I am likely (and, hopefully, consistently) to write about some of the subjects with which I work, the environment in which I work, and my interests. These include, but aren’t limited to: genetics, behavior, molecular biology, organismal biology, bioinformatics, life in academia, music, mountain-biking, cooking, reading, etc.

If you like these things, then you’re in luck. If not, tough shit.