May 23, 2006

Attributed Quotation as Plagiarism

Update: Jonathan Bailey sent me a note with some additional rationale behind his piece, as well as some plans for revision based on comments he's received. I offered him some additional constructive feedback; I'll post a link to the new piece he said he's working on when it's posted.

In a somewhat odd piece at Plagiarism Today, Jonathan Bailey takes weblogs to task for failing to add enough "original" content to material they quote from other sites. The piece begins (in both title and initial paragraph) by labeling the practice "the new plagiarism" (and cites a few sources that agree with him), a move that he later admits was a mistake (in an update at the end of the article) because plagiarism and fair use of intellectual property are different issues—he's apparently more interested in the latter. (He also sounds a little resentful and places the blame for this miscommunication on the reader, and so far has failed to update the incendiary title and opening paragraph. I guess he's enjoying the notoriety.)

But aside from that issue, there are some fundamental problems with Bailey's attempt at linking intellectual property theft to weblogs that are primarily attributed quotes and links:

These sites, which for this article I’ll simply call "gray", are generally identified by a large number of very short posts, with much of it in block quotes or otherwise directly lifted content. Though they meticulously credit their sources, bowing to more traditional rules for blog attribution, and work to add at least some original content, usually over half of their material comes from other sources.[...]

While certainly grey blogs don’t pose the same threat or raise the same concerns as spam blogs and other content scrapers, the cause for concern is clear. Even though blogging is about sharing and reusing information, excessive sharing threatens the authors penning the original content. The tale of the goose laying the golden egg springs to mind as, quite simply, greed can be the blogging world’s biggest enemy.

Here's a parallel, more traditional situation that might clarify things. Compiling resources from very scattered bits of information is a relatively old, accepted method of authorship. Documentaries, for example, are frequently composed of pieces of information that a documentary filmmaker has gathered from original sources by archival research and interview, often with only a very small amount of "original" narration and titling to provide some sort of narrative cohesion. The creative work in such objects comes from the activity of locating those bits of useful information and bringing them together into a single place. The "gray area" weblogs that Bailey critiques are much like that: collections of disparate information from a wide variety of sources. So Datacloud, for example, attempts (in general) to provide an ongoing set of links to conceptual and practical work that demonstrates the trends that I talked about in the printed book. I spend an embarrassingly large amount of time in NetNewsWatcher reading RSS feeds (I think I subscribe to something around 200 feeds) under the assumption that the small number of people who read Datacloud don't also read those same 200 feeds. I'd feel differently about this if I were, say, doing nothing but subscribing to an RSS feed of Boing Boing, putting the material in blockquote tags, and republishing that as my own weblog. But I'm not.

So I guess the crux of the issue is both in how Bailey understands creative work (for him, the use of "original" words is paramount) and in how he tries to link other sorts of work to the ethical and legal issue of intellectual property theft. (I do agree somewhat with Bailey's comments about blogs that quote full articles, given that the practice discourages readers from visiting the source site. He should have left his critique at that, which has nothing to do with proportions of original and quoted text, and not tossed in the inaccurate "plagiarism" tag in an attempt to generate controversy.)

Oddly, what I do post at Datacloud typically includes very little of my own text. I guess that's because I usually agree with things I'm linking to. I almost deleted much of my own text above because I didn't want it to look like I was agreeing with Bailey's proposed restrictions, but I guess I'll let it sit.

There's also an animated /. discussion of the article you can read/participate in, which is where the inaccuracies of Bailey's terms were originally identified.

Posted by johndanseven at May 23, 2006 03:52 PM