Sunday, September 03, 2006

Those little traces...

As a long-time computer scientist, I am aware of just how easy it is to use computers to store personal information, and believe that as educators we need to ensure that our students' use of web 2.0 and social software technology does not land them with future problems. But as we shall see below, sometimes the choices are not so clear-cut.

Background: Once, perhaps five or six years ago, when I was teaching a public class on C++, a programming language, I had a student from a large credit card company. In the course of general chit-chat I found out what he was going to do with the programming language: aggregate thousands of data sources on millions of people to find out their buying preferences and their credit ratings. "And this huge-scale activity is going on in just one company," I thought. "What other companies are doing this?"

Of course, not wanting an eventual shed-load of (targeted) junk mail coming through my door from UK supermarkets, I have avoided their loyalty cards, only to realise that each time I use a debit card in a store, I am anyway adding to the slow accumulation of purchasing data that the store holds on me. At least one supermarket chain in the UK already classifies customers into groups according to purchasing preferences. I know this because a former colleague's partner helps run that particular operation. I never enquired as to my classification, but I guess it would be somewhere between 'gourmet' and 'base-level minimal-cooking survivalist', with a 'definite caffeine habit'.

OK, the wheels of commerce will turn, and data like my strong interest in drinking coffee will be collected. But what of our students and their use of the internet?

Recently I have become interested in university students blogging as part of their higher education activities. The University of Warwick, for example, encourages all students to blog (see here).
  • In passing, some statistics are interesting: these 4052 blogs have an average of 18 posts each and an average of 2.3 comments per post. I would say that this is a good uptake for a student body, and am particularly impressed by the comment-to-post ratio. I hypothesise that this ratio implies that a cohesive student community is starting to form.
We have been considering a similar approach, but just in the School of Computer Science that I sometimes work in, at the University of Manchester. Our conversations have sometimes centered on the problems in getting a community of computer science students to write prose.

In this context, I wondered if blogging that involves more than learning-related posts — for example about the students' social life — might help in encouraging blogging in general. Luckily for them, our students, mostly male, mostly heterosexual, have social events with students from schools that are less male-dominated, and there is a natural social linkage if we can get students in these other schools blogging too.

But there are big and important questions on what is stored where, and on potentially large real-life ramifications. Anything on the internet is fair game to be harvested and stored.

What might stop some employer-advisory service from scraping student blogs looking for key words or phrases? Text like "I drank 17 alcopops on the Friday, was I pissed or what?", or "We went clubbing and got off our faces" (i.e. took drugs). Who knows if this is recurring behaviour or once-off experimentation? Certainly there is room for creative interpretation, probably not in the best light for the person whose actions are being interpreted: An employer-advisory service could provide a paid vetting service to future employers who want to avoid employees who might 'lose it' from time to time.
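To show how little effort such scraping would take, here is a minimal sketch in Python. It is purely illustrative and not based on any real vetting service: the watch-list `WATCH_PHRASES` and the function `flag_posts` are names I have made up, and the phrases are simply lifted from the example sentences above.

```python
import re

# Hypothetical watch-list (an assumption for illustration only);
# the phrases come from the example blog sentences above.
WATCH_PHRASES = ["alcopops", "pissed", "off our faces"]


def flag_posts(posts):
    """Return (post_index, phrase) pairs for every watch phrase found in a post."""
    hits = []
    for index, text in enumerate(posts):
        for phrase in WATCH_PHRASES:
            # Case-insensitive whole-phrase match. No context is examined,
            # so a one-off remark and a regular habit look exactly the same.
            pattern = r"\b" + re.escape(phrase) + r"\b"
            if re.search(pattern, text, re.IGNORECASE):
                hits.append((index, phrase))
    return hits


posts = [
    "I drank 17 alcopops on the Friday, was I pissed or what?",
    "We went clubbing and got off our faces",
    "Finished my C++ coursework today.",
]
print(flag_posts(posts))
```

The point of the sketch is what it leaves out: a matcher like this has no notion of whether the behaviour is recurring or a single experiment, which is exactly where the room for uncharitable interpretation comes from.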

While I'm not encouraging the above behaviours, I do think that we have a duty to make our students aware of the potential consequences of their actions on the web.

And, in this vein, every so often I get a salutary example of the power of little bits of self-generated information on the web. Here are some examples:
  • Today I found out from that there is a member of a particular community who has some interest in activism, hmm that probably should go on a government list somewhere. (Not!)
  • Not long ago I found out from that one of my colleagues bookmarked a lawyer's site; "What is happening in this person's life?" I wondered, because I knew that this person had already bought a flat and did not need a lawyer for that.
  • Recently we saw AOL misguidedly dumping some information regarding searches on the web. This information was subsequently removed but not before it had been archived and chewed over by several interested parties. See here, here, here and here.
  • What does Google do with your search data? Probably every facet of your life is described there, to some level of detail.
  • And the killer app is still coming: the US government trolling social spaces; check this.
  • As for the power of social software for self-revelation and unfortunate consequences, what about this?
The last item is scary. What would you do as a teacher if you found out (by any means) the same information that was gleaned from MySpace? What if someone is making things that go bang? What if there is a teen who is self-harming? What is your duty as a teacher?

Certainly, in some circumstances, most thinking teachers would spring into action on receiving certain kinds of information. In turn this may have consequences for our view on automated trolling for data in social spaces: maybe some trolling is not so bad after all. But, then, do we trust the trollers and what they might do with the information? What is the cost-benefit equation here?

Food for thought...


Anonymous Yishay said...

Privacy? You have no privacy. Get over it.

The way I see it, those who have the resources (governments, corporates) have known everything they want to know about you for a long time. They can tap into any channel you use whenever they feel like it. The only way to fight back is to demand symmetry, and to get used to the fact that anyone can google you. In fact, I'm more comfortable with my students googling me than with my government checking my phone bills.

Tue Sep 05, 03:56:00 PM GMT+1  
Blogger Mark van Harmelen said...

Yeah, OK ....

But it's amazing how many people don't know about traces, record keeping, data aggregation, data mining, and so on.

And it's amazing how many people don't have any sense of history about misuses of data.

As a computer scientist and someone aware of the potential uses of information, on a practical front I would say, keep any data open to misinterpretation out of computers and the net.

A bit hard for purchasing history sometimes. And I can see this in some future scenarios: "Oh he eats organic food. He's probably a left-wing fruit cake. Let's put him on the WATCH LIST." Too extreme? Hey, I don't worry about things like that, for that is the way to paranoia.

But perhaps we might do well to warn our children and/or students that

-- Information is collected and aggregated

-- This information may then be used in ways that we may not even conceive of

-- This use may have quite unexpected results for the person who originally generated the information


Tue Sep 05, 09:56:00 PM GMT+1  
