As we discussed in class on Monday, copyright is a really complicated issue, especially in our digital age, where the lines marking which information, music, and ideas belong to whom are increasingly blurred.
Academia is one place where the effects of stringent copyright laws are acutely felt. Scholars want to protect their ideas and gain credit for their work, but they must often draw upon the works of others to pursue their own research or provide supplemental material in their classrooms. The “Fair Use” doctrine provides some help by allowing scholars and teachers to use copyrighted material if their use of it meets certain criteria regarding:
- purpose
- nature
- extent
- effect on the market
“Fair use” is a broad concept that is usually applied on a case-by-case basis; reproducing a copyrighted work for an educational or non-profit purpose does not automatically exempt it from normal copyright policy. One project that exhibits this tension is Mark Davies’ Time Magazine Corpus.
Time Magazine Corpus: A ‘Subscriber’ to Fair Use?
The Time Magazine Corpus is one of seven corpora (large collections of text) created by Brigham Young University linguistics professor Mark Davies. The corpora exist to “[find] out how native speakers actually speak and write; [to look] at language variation and change; [to find] the frequency of words, phrases, and collocates; and [to design] authentic language teaching materials and resources”, according to the corpora website. These goals are achieved by digitizing vast amounts of historical text and analyzing its contents to find patterns in word usage. The corpus.byu.edu website, which draws almost a quarter of a million visitors each month, suggests that its corpora are the most widely accessed available.
The Time Magazine Corpus holds a digitized copy of every issue of Time Magazine since 1923 for analysis; collectively, the issues contain over a hundred million words. Surely one of the most popular corpora on the internet qualifies as fair use. Or does it?
- Purpose–Based upon the website’s stated goals (see above), the Time Magazine Corpus exists to further learning and research about the development of American English. That goal is educational and non-commercial.
- Nature–The Time Magazine issues that are presented in the corpus are published and generally factual (with some more subjective pieces).
- Extent–Clearly, more than 10% of all issues of Time Magazine have been utilized; the entirety of the Time Magazine archive since 1923 is housed in the BYU corpora.
- Effect on market/value–To access the full text of any of these issues, one must be a subscriber to Time Magazine, so freely available full text could compete with subscriptions.
The purpose and nature of the Time Magazine Corpus seem to be reasonable under Fair Use, but the extent and effect on value call the project into question.
The TEACH Act of 2002 adds another interesting layer to the corpus copyright debate. GMU’s Copyright Office says that the TEACH Act “allows digitizing of analog materials, [but] only if not already available in that form”. While Time Magazine had already digitized its archive, the corpus has made it available in a new digital “form”–one in which the text can be searched extensively.
Additionally, Cohen and Rosenzweig show in this table that works published before 1923 have become part of the public domain, while all works published after that year are subject to copyright policy. The Time Magazine Corpus incorporates issues from 1923 to the present: they’re all subject to copyright laws.
A visit to the Time Magazine website’s archive quickly confirms that the material is subject to copyright–and protected. While content is “available exclusively for TIME subscribers”, the “Reprints and Permissions” page does detail the process of obtaining permission to reprint or copy material. Interestingly, when I clicked on the “search here” link under the third point, “Licensing/Republishing Content in Print”, I was able to read entire articles published in a 2002 issue (even though I am not a subscriber), but I did not have access to articles published in a 1932 issue.
How, then, could it possibly be legal for Davies to utilize the whole Time Magazine archive? He tells us himself on the “Questions?” page of his site under numbers 8 and 9:
Our corpora contain hundreds of millions of words of copyrighted material. The only way that their use is legal (under US Fair Use Law) is because of the limited “Keyword in Context” (KWIC) displays. It’s kind of like the “snippet defense” used by Google. They retrieve and index billions of words of copyright material, but they only allow end users to access “snippets” of this data from their servers…We would love to allow end users to have access to full-text, but we simply cannot…We have to be 100% compliant with US Fair Use Law, and that means no full text for anyone under any circumstances — ever. Sorry about that.
Thinking back to one of our earlier class discussions, I searched for the term “solar energy” in the corpus, and when I clicked on one of the 98 results, I was directed to a few sentences that provided the context in which the words were used. There was also a link to the original article in Time Magazine, but when I clicked on it, I was directed to the Time Magazine site and received the same message as I had before: “Time Magazine content is available exclusively for TIME subscribers”.
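To get a feel for what a “Keyword in Context” display does, here is a minimal Python sketch. It is only my own illustration and not Davies’ code: the function name `kwic`, the `window` parameter, and the sample sentences are all made up for this example. The idea is simply that each hit shows a few words on either side of the search term and nothing more.

```python
import re


def kwic(text, keyword, window=5):
    """Return (left, match, right) context tuples for each hit of `keyword`.

    Hypothetical helper for illustration only; `window` is the number of
    words kept on each side of the match.
    """
    # Split the text into word tokens, keeping internal apostrophes/hyphens.
    tokens = re.findall(r"\w+(?:['’-]\w+)*", text)
    target = keyword.lower().split()
    n = len(target)
    hits = []
    for i in range(len(tokens) - n + 1):
        # Case-insensitive comparison of the phrase against this position.
        if [t.lower() for t in tokens[i:i + n]] == target:
            left = " ".join(tokens[max(0, i - window):i])
            match = " ".join(tokens[i:i + n])
            right = " ".join(tokens[i + n:i + n + window])
            hits.append((left, match, right))
    return hits


# Invented sample text, not taken from Time Magazine.
sample = (
    "Engineers say solar energy could heat homes within a decade. "
    "Critics doubt that solar energy will ever be cheap."
)

for left, match, right in kwic(sample, "solar energy", window=3):
    print(f"{left:>20} [{match}] {right}")
```

Only the short snippets are ever returned to the reader, which is the property Davies relies on in his fair use argument; the full article never leaves the server.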
By ensuring that those who use the corpus are not able to view the original full text of an article, Davies does not violate copyright law. Instead, his corpus allows for detailed research about the patterns of American speech, and is a resource that TIME should be excited to be part of (and likely is, given that they haven’t taken legal action against Davies–at least not yet).