Data feels

cn: rape, sexual violence, & CSA juxtaposed with cold data.  This is being crossposted to my other blog.

Some of my most important activist work is in volunteering technical skills for the Asexual Census, a survey of English-language online ace communities. This past week, I’ve been on a roll analyzing our 2015 survey. No numbers will be reported here; this is just a personal account.

Unsurprisingly, as soon as I was done with prep work, my attention was drawn to the statistics on sexual violence. As a programmer, I’ve been trained to always use descriptive variable names. Now I’m looking at variables named “rape” and “rapeCombined”.

It’s “combined” because it combines people who explicitly said they were raped, and people who said they’ve had sex they didn’t/couldn’t consent to. A lot of respondents see a distinction between the two. I don’t agree that there is or should be a distinction.  Probably sexual coercion should be combined too.  But I’m not sure how appropriate it is to impose my views on the data.  Probably I’ll defer to the judgment of the committee.

“rapeCombined” hovers around 10-20%, depending on the group. This number is not immediately meaningful to me. Is it high or is it low? You can’t get “high” or “low” out of a single number; you have to compare at least two. So I find myself comparing subgroups, and soon it’s a competition. Contra certain people on the internet, *this* group shows higher numbers than *that* group by two whole percentage points, which is slightly larger than our margin of error. Data trumps theory, jerks! I’m totally gonna win some arguments with this. Winning arguments, the consolation prize for losing in life.

Another thing that popped out to me: the distribution of ages at which people first experience sexual violence is bimodal. I can objectively draw a dividing line between child sexual abuse (CSA) and other sexual violence. This is information that I never asked for, but now I have it.

I mean, the dividing line is only sort of objective. Do I draw a dividing line where the distribution hits its minimum, or do I fit two Gaussians to it and draw the line halfway between the peak positions? Or maybe I should find the age where each Gaussian has an equal contribution. Okay, so maybe the methodological question is a bit ridiculous, but wouldn’t you rather think about math than about sexual violence? Anyway, someone needs to think about the math; those are the skills I’m volunteering.
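For the curious, the three candidate dividing lines can be compared in a minimal sketch like the one below. Everything in it is an assumption for illustration: the values are synthetic (drawn from two made-up Gaussians, not from the census), the function names are hypothetical, the fit is a bare-bones EM loop rather than whatever tool the analysis actually uses, and the "minimum" is taken from the fitted curve rather than from the raw histogram.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def fit_two_gaussians(data, iterations=100):
    """Fit a two-component 1D Gaussian mixture with plain EM.
    Returns [[weight, mean, sigma], ...] sorted by mean."""
    data = sorted(data)
    half = len(data) // 2
    comps = []
    for chunk in (data[:half], data[half:]):  # crude start: split the sorted data in half
        mu = sum(chunk) / len(chunk)
        var = sum((x - mu) ** 2 for x in chunk) / len(chunk)
        comps.append([0.5, mu, math.sqrt(max(var, 1e-6))])
    for _ in range(iterations):
        # E-step: how much each component is responsible for each point
        resp = []
        for x in data:
            p = [w * normal_pdf(x, mu, s) for w, mu, s in comps]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean, and sigma of each component
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - mu) ** 2 for r, x in zip(resp, data)) / nk
            comps[k] = [nk / len(data), mu, math.sqrt(max(var, 1e-6))]
    return sorted(comps, key=lambda c: c[1])

def split_at_minimum(comps, steps=2000):
    """Option 1: where the fitted density is lowest between the two peaks."""
    lo, hi = comps[0][1], comps[1][1]
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda x: sum(w * normal_pdf(x, mu, s) for w, mu, s in comps))

def split_at_midpoint(comps):
    """Option 2: halfway between the two peak positions."""
    return (comps[0][1] + comps[1][1]) / 2

def split_at_equal_contribution(comps, tol=1e-6):
    """Option 3: where both weighted Gaussians contribute equally.
    Bisection; assumes the peaks are separated enough that the sign flips between them."""
    (w1, m1, s1), (w2, m2, s2) = comps
    lo, hi = m1, m2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if w1 * normal_pdf(mid, m1, s1) > w2 * normal_pdf(mid, m2, s2):
            lo = mid
        else:
            hi = mid
    return lo

# Synthetic, made-up values purely for illustration (NOT survey data).
random.seed(0)
values = [random.gauss(9, 2.5) for _ in range(600)] + [random.gauss(19, 3.5) for _ in range(900)]

comps = fit_two_gaussians(values)
splits = {
    "minimum": split_at_minimum(comps),
    "midpoint": split_at_midpoint(comps),
    "equal contribution": split_at_equal_contribution(comps),
}
for name, value in splits.items():
    print(f"{name}: {value:.2f}")
```

On data like this the three answers land close together but not on the same value, which is the sense in which the line is only sort of objective.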

I feel like a dragon sitting on all this data. It’s important for the community to see, but that importance just means we have to take the time to do it right. You know, write a proper formal report or something. By committee. This is incredibly slow. In the meantime, I’ll just sit on this mound, admiring the bits that are especially beautiful or grotesque.

About Siggy

Siggy is a physics grad student in the U.S. He is gay gray-A, and makes amateur attempts at asexual activism. His interests include godlessness, scientific skepticism, and math. While not working or blogging, he plays video and board games with his boyfriend, and folds colored squares.

  1. queenieofaces says:

    “Winning arguments, the consolation prize for losing in life.”
    (But I know that “oh no, I didn’t want to know this information, but now I have to DO something with it” feeling. It’s not pleasant. But I’m really glad that someone is working on the data analysis, even when the subject matter is difficult.)

  2. I could take a look at some of the data; I’m a mathematics undergrad student at the moment and would love to re-verify data on asexuality.

  3. Coyote says:

    “A lot of respondents see a distinction between the two. I don’t agree that there is or should be a distinction.”

    ehhh I can sympathize with not wanting to use the word.

    • Siggy says:

      I was presenting a simplified version of that particular issue, and here’s a slightly less simplified version:

      The main question is what numbers to report. I have five yes/no/unsure questions, so there are 31 ways to combine them using OR functions. Okay, realistically there are only a handful of reasonable combinations, but I also want to cross-tabulate the results against a bunch of other questions. This requires picking out 1-2 combinations; it’s either that or exponentially growing walls of numbers.

      Some people probably want to see rape, nonconsensual sex, and sexual coercion reported together. Others want to see just rape and nonconsensual sex. Others want to see them all separated. Can’t please everyone.
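
The count of 31 in the reply above is the number of non-empty subsets of five questions (2^5 - 1). A minimal sketch of that enumeration follows. The question names `q1` to `q5` are hypothetical placeholders, and counting "unsure" as not-yes is an assumption, not necessarily how the census analysis treats it.

```python
from itertools import combinations

QUESTIONS = ["q1", "q2", "q3", "q4", "q5"]  # hypothetical placeholder names

def or_combinations(questions):
    """Every non-empty subset of questions; each is one candidate OR-combined variable."""
    return [
        combo
        for size in range(1, len(questions) + 1)
        for combo in combinations(questions, size)
    ]

def combined_yes(response, combo):
    """True if the respondent answered "yes" to any question in the combo.
    Assumption: "unsure" is treated the same as "no"."""
    return any(response[q] == "yes" for q in combo)

combos = or_combinations(QUESTIONS)
print(len(combos))  # 31

# One made-up respondent, for illustration only.
example = {"q1": "no", "q2": "yes", "q3": "unsure", "q4": "no", "q5": "no"}
print(combined_yes(example, ("q1", "q2")))  # True
print(combined_yes(example, ("q1", "q3")))  # False
```

Cross-tabulating every one of those 31 variables against each additional question multiplies the number of tables again, which is why only one or two combinations can realistically be reported.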

