DPLA vs. real public libraries? New camels’ noses under the tent? ‘Concept note’ goes beyond ‘digital’


Part I of Historical Society blog interview with DPLA advocate Robert Darnton

The Digital Public Library of America is out with a new “concept note,” and, alas, our DPLA friends still don’t grasp the franchise and branding issues of genuine public libraries, which are cash-strapped and already having their roles chipped away by Amazon, Google, and others.

Perhaps the DPLA steering committee members can read a series from ZDNet, starting with Digital Underclass: What Happens When the Libraries Die? Notice? When, not If.

Will the DPLA please drop the “Public”; or, despite rhetoric to the contrary, does it really intend to say with its name: “Oh, don’t worry—we’re the new public system”? LibraryCity.org is cheering on the DPLA as a potential provider of publib content and, we’d hope, maybe even a funding ally for public libraries; we love the vision of an online “Republic of Letters” blended in with an authentic public digital system, just so the latter calls the shots. But as if the “Public” in DPLA’s name isn’t problematic enough, we now read, in the Harvard-hosted initiative’s new concept statement (as transmitted today to a public discussion list by Steering Committee leader John Palfrey):

“Though the purpose of the DPLA is primarily to provide access to digital materials, it may eventually provide for the future by collecting and preserving a wide variety of information in many formats.”

Regardless of the denials we presumably would hear now, does this mean that someday you’ll be going to a DPLA branch? Or that DPLA business development people will compete with public libraries and the Library of Congress for philanthropic funding to amass physical collections? For that matter, in a video and in the concept note most recently, we’ve run across references to possible requests for public funding; how much publib money could this siphon off? No small amount, potentially, we fear. Multiple camels’ noses under the tent?

Ideally the DPLA will amend the concept statement immediately and expressly limit collection of physical materials to archiving for later digitization, and unequivocally rule out any future plans to create physical branches competing with genuine public libraries. The actual mission statement should say that, in fact; and, of course, “Public” should instantly vanish from the group’s name. It could always be restored if the DPLA ended up within the Library of Congress. But don’t count on that happening. The concept statement says: “One could imagine grafting the system onto a structure that already exists, like the Library of Congress, CLIR, or ALA. But it will be necessary also to explore the advisability of creating a new entity.” My sense is that the DPLA is headed in the direction of “new.”

Schoolchildren and other public library users could suffer, given the current DPLA’s priorities; for example, the 14-member steering committee isn’t exactly brimming with educators from the K-12 world despite the group’s pledge to serve “students of all ages, from grades K-12 to postdoctoral researchers.”

Likewise the DPLA can talk all it wants about the practical, about “data related to employment,” but both in the board composition and in the choice of the main topics discussed at the March 1 workshop, I don’t see any great passion for, say, the prompt inclusion of multimedia vocational materials or job-hunting help for the unemployed.

No, it’s the scholarly community about which the DPLA most cares, a noble priority, but still an indication that this endeavor is no public library in the modern American sense. Public libraries are the true experts here and could accomplish endless good online as a system, under a public body like the Library of Congress, just so LOC didn’t interfere with responsiveness to local needs.

Speaking of LOC, Robert Darnton, the Harvard Library director who proposed the DPLA, says in the video interview (here and here) that he envisions “a library greater than the Library of Congress” (also see Randall Stephens’ accompanying post in his Historical Society blog).

LibraryCity is disappointed that LOC isn’t coming forward with its own up-to-date vision, ideally picking up concepts from our Chronicle of Higher Education essay, including the urgency of working with truly public libraries to serve the digital underclass along with the rest of society. Public money is tight now, but political winds can shift quickly, and beyond that, by way of cost-justification, help for entrepreneurs and other features, we’ve explained how a national digital library (NDL) could please many a conservative. The late William F. Buckley Jr. was already sold on the NDL concept and wrote two columns in favor of it.

Simply put, we hope that the DPLA, LOC, and others will remember the special role that public libraries play in informed civic debate and other aspects of American life—hence, the need for public governance. As long as the DPLA retains “Public” in its name, many people will understandably wonder about the organization’s claims that it won’t steal thunder from institutions such as LOC and local public libraries.

In full, in case the already-given link does not work or vanishes in time, here is the “concept note” as distributed on the public DPLA discussion list:

The Digital Public Library of America
Concept Note
March, 2011

The Digital Public Library of America (DPLA) will make the cultural and scientific heritage of humanity available, free of charge, to all.  By adhering to the fundamental principle of free and universal access to knowledge, it will promote education in the broadest sense of the term. That is, it will function as an online library for students of all ages, from grades K-12 to postdoctoral researchers and anyone seeking self-instruction; it will be a deep resource for community colleges, vocational schools, colleges, universities, and adult education programs; it will supplement the services of public libraries in every corner of the country; and it will satisfy other needs as well—the need for data related to employment, for practical information of all kinds, and for enrichment in the use of leisure.

The process of planning for a DPLA takes as its jumping off point a statement drafted in the fall of 2010 at a workshop at the Radcliffe Institute in Cambridge, MA: “Leaders from research libraries, foundations, and a variety of cultural institutions gathered in a workshop at the Radcliffe Institute for Advanced Study on October 1-2 in order to discuss how to work together toward the creation of a Digital Public Library of America — that is an open, distributed network of comprehensive online resources that would draw on the nation’s living heritage from libraries, universities, archives, and museums in order to educate, inform and empower everyone in the current and future generations.”  The list of signatories to this statement can be found online at:  <http://cyber.law.harvard.edu/dpla/Sign_On>.

Though the purpose of the DPLA is primarily to provide access to digital materials, it may eventually provide for the future by collecting and preserving a wide variety of information in many formats.  But the DPLA cannot be everything to everyone.  For it to fulfill its mission, its scope must be carefully defined, and it must be erected incrementally, according to a realistic plan.

Scope and content:  No firm boundaries can be set to the collections of the DPLA; but if it takes the sky as its limit, it will never get off the ground.  Despite its ambitions to include all kinds of cultural products, it should concentrate at first on the written record—books, pamphlets, periodicals, manuscripts, and digital texts.  It will not neglect audio-visual materials, but it will coordinate its growth with that of the Library of Congress, the National Archives and Records Administration, the Smithsonian Institution, and other national repositories so as to avoid unnecessary duplication.  It also should avoid duplicating services that are better provided through other means.

The DPLA must respect copyright, and insofar as it will include works that are commercially available, it must do so only with the consent of the rightsholders.  With adequate funding, it might establish a pool of money to be distributed, according to the frequency of usage, to authors and publishers of works that are in print and covered by copyright.  Many authors are now making their work available online according to open-access programs. The DPLA could coordinate and help implement such voluntary contributions to the general store of knowledge.

In order to lay a solid foundation for its collections, the DPLA should begin with works in the public domain that have already been digitized and are accessible through the Internet Archive, HathiTrust, a broad range of government material, and possibly private-sector initiatives such as the Google Books Project.  These can be supplemented by digital collections of research libraries and amalgamated holdings such as the digitized newspapers from the fifty states, and other related materials, that are now on deposit in the Library of Congress.  A new program of scanning collections should then be undertaken with the goal of including all printed material up to 1923 from all of the major research libraries. 

Further material should be added incrementally to this basic foundation. If Congress passes suitable legislation on “orphan” works—those whose rightsholders have not been located—another layer could include works published between 1923 and 1964, a period when the extent of copyright is most problematic.  Next, the DPLA should attempt to provide access to the largest possible number of books that are covered by copyright but are out of print. Various solutions could be devised to clear a way through the legal obstacles: a fund to compensate the rightsholders, pay-for-view arrangements, a provision to protect the interests of authors and publishers who do not want to cooperate, voluntary agreements with those who do, or legislation to protect the DPLA from litigation on the grounds that, like all public libraries, it is a non-profit enterprise dedicated to the public welfare.  A moving wall could be established to bring in new materials.  However contemporary its holdings may become, the DPLA will remain steadfast in its respect for intellectual property rights.

Architecture: No decisions about the architecture of the DPLA have been made.   One option is to make all this material available from one gigantic database, while another is to maintain it in digital repositories that already exist and to integrate them into a single discovery environment.  Early considerations suggest that the DPLA build an approach that draws its lessons from the architectures and large-scale systems of the Web, and particularly from existing and ambitious digital library initiatives, such as Europeana.  Such a system would most likely take the form of an open, distributed network of online resources enriched by semantic data and thus made discoverable on the Internet.  At the outset, its material is likely to remain hosted, as a primary matter, in a federated series of the existing digital repositories.  The system would allow for broad and easy access to enormous existing collections, such as the Internet Archive, along with those in research libraries and other repositories and those to be created by future scanning.

To unite so much disparate material in one, seamless system with a beautiful and intuitive user interface will be an enormous technical challenge. The design must promote interoperability and also be user-friendly—that is, it should be simple enough at its front end to satisfy the needs of ordinary citizens, while the engineering at its back end links all the platforms together in such a way that the search and discovery tools operate smoothly everywhere.

Metadata: Aside from its engineering requirements, the system cannot function without adequate metadata.  Legacy digitized collections require enriched semantic metadata, and current scanning operations need updated tools to create such metadata.  Exactly what the standards for these metadata should be, and the degree of standardization, is a matter to be determined by a specially convened group of experts.  The experience of HathiTrust suggests that this kind of collaboration is feasible; HathiTrust might well be taken as a model.  But it could be necessary to design new tools in order to promote maximal compatibility.  In any case, open linked data will be necessary to provide both discoverability and context. 

As a practical matter, the DPLA cannot certify that all the items in its collection are reliable and accurate.  That kind of certification must be devolved upon the institutions that originally collected them.  But the DPLA could certify attribution and authenticity—that is, it could devise mechanisms to inform users about the identity of the creators of the documents and to verify that its copies are unmodified replicas of the originals.

Scanning:  The preparation of standardized metadata belongs to the process of digitization, which also must be coordinated in ways to ensure adherence to standards and  interoperability. The technical requirements of the scanning must be determined after careful study—no easy task, because high-quality scanning can be so expensive as to put unacceptable pressure on the finances of the whole operation, yet the quality must be adequate for the use of scholars as well as ordinary viewers and for storage and migration through various formats.

Storage and preservation:  The DPLA is about access at its core.  However, provisions for storage and preservation must be built into the budgets for digitizing.  No one has solved the problem of permanently preserving digital works, but the DPLA should work with the leading preservation technologies—HathiTrust, DuraSpace, and LOCKSS, and potentially others—to build out the nation’s existing preservation architecture.  Many research libraries understand the need to rework digital texts and to migrate them through different formats in order to preserve them from obsolescence and decay.  But all of the contributors to the DPLA should adopt compatible measures.  In fact, the need for migrating digital files could reinforce the argument for creating an additional, catch-all data base or perhaps a “dark” archive, in order to mitigate the risk of loss.

Administration and governance:  The system of decision-making and management of the DPLA, like its architecture, has not yet been determined.  Our inclination is to establish a broad-based, federated structure.  We intend to create coherence out of diversity by erecting one virtual library out of a multiplicity of collections.  But it cannot hold together without adherence to common practices, and those coordinated modes of behavior cannot be sustained unless there is an adequate administration to govern the whole system.  One could imagine grafting the system onto a structure that already exists, like the Library of Congress, CLIR, or ALA.  But it will be necessary also to explore the advisability of creating a new entity.  The design of any governing body should be the work of a commission composed of representatives from the worlds of libraries (public as well as private), information technology, publishing, and the general public.  It probably should include deputies from the research libraries whose holdings will be integrated into the system.  In order to be protected from political pressures, it should be a free and autonomous body, perhaps something like the BBC.

Current Status: During the planning phase, which is well underway early in 2011, the Berkman Center for Internet & Society at Harvard University has begun to host a series of working meetings and work streams to develop a long-term plan for the DPLA.  The research and planning is organized around six primary tracks: 1) content and scope; 2) finance and business models; 3) governance; 4) legal; 5) technical considerations; and 6) audience and participation.  The research and planning work is being recorded and developed on a public wiki, online at: http://cyber.law.harvard.edu/dpla/Main_Page.  A broad community of volunteers has been formed through participation in these work streams, which in turn are meant to prepare the way for a “big tent” project involving the general public.  The work of this planning phase is guided by members of an initial Steering Committee, whose members are posted online at http://cyber.law.harvard.edu/research/dpla/steering.

Access:  Whatever its administrative structure may be, the DPLA must be open to all Americans, free of charge.  Its openness should extend to everyone on the globe where possible, subject to legal constraints that may arise.  Thanks to the world-wide reach of modern technology, the DPLA will be a vital part of the world of knowledge, and its activities should be coordinated with those of digital libraries in other countries.  Its holdings will correspond to its global dimension, because they will include many languages and many means of communication. The use of them should be unrestricted, unless exemptions from copyright requirements may exclude commercial applications.

Funding:  Despite its international scope, most of the financial support for the DPLA will come from sources in the United States.  It may, at some point, become desirable for Congress to appropriate funds to support this public good.  But because the DPLA will be entirely independent of the U.S. government, its funding, at least initially, should come from a coalition of foundations.  One of the work streams will explore the options for financial and business models that might support the DPLA on a sustaining basis.

Timing:  The DPLA’s initial planning phase will end during the spring of 2011.  If sufficient financial support can be obtained, the alpha implementation phase will begin later in 2011 and will run for approximately 18 months, at the end of which an operating demonstration system will be released.
