Double Major

2010 October 7
by Leigh Honeywell

I’m back in school, as some folks have probably already gathered from my microblogging. I’m finishing up a double major in Computer Science and Equity Studies at the University of Toronto, and if all goes according to plan I’ll be graduating in May 2011.

While this may sound like a strange combination, it makes perfect sense to me – I’m interested in equity issues within the STEM fields, especially computer science.

It turns out the combination of fields comes in handy in unexpected ways sometimes. After proofreading a paper I wrote for a Women and Gender Studies class, my friend Valerie suggested that some quantitative data might be useful in supporting one of my assertions. In my paper I argued that while early feminist scholarship on sexual harassment failed at intersectionality, more recent scholarship has embraced it. To support this, I wanted to compare the number of citations for Catherine MacKinnon’s Sexual harassment of working women: a case of sex discrimination to Kimberle Crenshaw’s Demarginalizing the Intersection of Race and Sex: A Black Feminist Critique of Feminist Theory and Antiracist Politics. These are both profoundly influential works, but I wanted to quantify their relative influence on scholarly work.

So I did what any self-respecting CS student would do – I wrote a script to scrape Google Scholar for citation numbers over time and made a graph comparing the two :)

For your edification, here’s the script:

# (c) 2010 Leigh Honeywell
# Licensed under the Simplified BSD License, reuse as you will!
use strict;
use warnings;
use LWP::UserAgent;

# set up LWP user agent and cookies; pretend to be Firefox 4 just to be cheeky
my $lua = LWP::UserAgent->new(
    keep_alive => 1,
    timeout    => 180,
    cookie_jar => {},    # in-memory cookie jar (assumed; this line was missing)
    agent =>
"Mozilla/5.0 (Windows NT 6.1; rv:2.0b7pre) Gecko/20100921 Firefox/4.0b7pre"
);

# edit in your citation numbers from google scholar and the appropriate
# date ranges for what you're trying to do
my $crenshaw = getCites( "10759548619514288444", "1977", "2010" );
my $mackinnon = getCites( "2195253368518808933", "1977", "2010" );

# query one year at a time and print "cite,year,count" lines
sub getCites {
    my ( $cite, $startyear, $endyear ) = @_;
    for my $year ( $startyear .. $endyear ) {
        # construct the query URL using the above data
        # (the base URL is blank in this copy of the script; it has to be
        # filled in before the request can succeed)
        my $post =
          $lua->get( ""
              . $cite
              . "&as_ylo="
              . $year
              . "&as_yhi="
              . $year );

        # scrape the returned page for the number of results
        if ( $post->content =~ m#of (?:about )?<b>(\d*)<\/b># ) {
            print $cite . "," . $year . "," . $1 . "\n";
        }
        elsif ( $post->content =~ m#did not match any articles# ) {
            print $cite . "," . $year . ",no results\n";
        }
        else {
            # some kinda error happened, most likely google caught me!
            print $cite . "," . $year . ",error\n";
        }

        # don't kill google's servers
        sleep 5;    # pause length is an assumption; the original value was lost
    }
    return 0;
}
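
The script prints one comma-separated line per paper and year, so the output still has to be reshaped before it can be graphed. What follows is only a rough sketch of one way to do that, not the code actually used for the graph: the helper name parse_cites is hypothetical, and the sample rows are made up for illustration rather than real citation counts. It treats "no results" as zero and drops error rows so that gaps stay visible.

```perl
use strict;
use warnings;

# Parse "id,year,count" lines (the format the scraper prints) into a
# nested hash: $series->{$id}{$year} = $count.
# parse_cites is a hypothetical helper name, not part of the original script.
sub parse_cites {
    my (@lines) = @_;
    my %series;
    for my $line (@lines) {
        chomp $line;
        my ( $id, $year, $value ) = split /,/, $line, 3;
        next unless defined $value;    # malformed row, skip it
        if ( $value =~ /^\d+$/ ) {
            $series{$id}{$year} = $value;
        }
        elsif ( $value eq 'no results' ) {
            $series{$id}{$year} = 0;
        }
        # anything else (e.g. "error") is left out on purpose
    }
    return \%series;
}

# Made-up sample rows, NOT real citation data.
my @sample = (
    "111,1989,no results\n",
    "111,1990,4\n",
    "111,1991,error\n",
    "222,1990,7\n",
);

# Print one tab-separated row per paper and year, ready for a plotting tool.
my $series = parse_cites(@sample);
for my $id ( sort keys %$series ) {
    for my $year ( sort { $a <=> $b } keys %{ $series->{$id} } ) {
        print "$id\t$year\t$series->{$id}{$year}\n";
    }
}
```

Skipping error rows instead of counting them as zero keeps a blocked request from looking like a year with no citations.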

Oh and if you’re curious, Crenshaw’s paper was cited far more than MacKinnon’s, pretty much as soon as it was published. Intersectionality FTW!

And as these things always go, of course I spent the evening working on this only to find that there’s a Perl module as well.


Leigh is a student, hacker, and organizer of communities both online and offline. She blogs, dents and tweets.

One Response
  1. Sushi
    October 8, 2010

    I was a math and French double major, so you’re not alone in interesting major combinations. Being able to use quantitative data in my works like you did has helped immensely in getting various points across, and that has been the most valuable lesson from the math degree. That and drumming into people’s heads that correlation is not causation. Intersectionality FTW!

