Shared technology services to advance scholarship (cyberinfrastructure): A perspective from Information Services and Technology

Publication Date: March 4, 2008
David A. Greenbaum, IST—Data Services

How can we advance scholarship through the development of shared technology services? [1]

This is an increasingly important question for campus leaders, faculty, national foundations, disciplinary societies, and scholarly technology initiatives across higher education. This author argues that it is, in fact, the central question at the heart of discussions about cyberinfrastructure [2] in the arts and humanities, social sciences, and sciences — discussions in which there is often conceptual confusion about what problems one is trying to solve and what the term "cyberinfrastructure" means.

Answering the question above requires the collaborative expertise and vision of faculty and staff from diverse communities at a research university: the disciplines; organized research units; computer and information sciences; the library; and the central IT organization, among others. This article briefly sketches a perspective on this question from one of these communities, the campus's Information Services and Technology (IST) organization. It then highlights a number of specific initiatives IST is working on to help answer this question; these are described in more detail in other iNews articles.

Berkeley's OCIO/IST organization has defined as one of its four guiding principles for the future that we must focus on providing shared technology services that enhance scholarship. Central information technology organizations at a number of universities and colleges across the country are searching for better models to provide scalable technology services for scholarship, but this is a challenge for these organizations for a number of reasons [3]. The predominant focus of their resources and expertise in recent memory has been on campus infrastructure such as networking, data centers, and large enterprise-wide administrative systems. These valuable services are often at least one step removed from the disciplines and the direct work of research, teaching, and public service. For most central IT organizations, neither strategy nor culture has made clear how to partner with faculty from the arts to the sciences to add value to research in ways that can scale and be sustained (that is, without having to craft individual, almost artisanal solutions for each academic partner as short-term projects).

From the point of view of some campus technologists, the landscape of scholarly technology projects for the humanities, social sciences, and sciences appears populated by many generally discrete technology projects, large and small [4]. There exist hundreds if not thousands of individual databases, websites, local computing clusters, and web-based tools created by faculty, students, and other researchers around specific topics, interests, projects, or initiatives. In most cases, these tools and resources were created to meet the specific needs of a particular community. In many cases, the funding and support for these critical initiatives are fragile and temporary, and future technology investments are taken up in piecemeal fashion. And for a number of these projects, the faculty principal investigator (PI) has become either a software developer or the manager of software developers, spending valuable time sorting through technology options, negotiating rights, and struggling to sustain technology projects.

Complicating matters, the notion of sharing tools or data with other unrelated projects or systems is often foreign; at best, sharing is made possible through complex system- or tool-specific application programming interfaces (APIs). As a result, a scholar who has a specific research interest and wants to integrate tools and data from other projects is frequently left to figure out what resources might be available and, if some are identified, to negotiate the right to use the information or tools.

Reflecting on the current landscape of academic computing, a number of campus IT strategists believe we must and can find better ways to make the process of creating and sustaining digital scholarly projects easier. Simplifying the ways in which tools and information resources can be shared and reused across projects, disciplines, and institutions is fundamental to this endeavor. These campus technologists believe that three interrelated strategies are needed to make an evolutionary leap in digital scholarship and cyberinfrastructure:

  • The development of sustainable partnerships among researchers, instructors, libraries, domain specialists, and information technology professionals to understand scholarly practices and common and unique needs (such partnership models often extend well beyond the boundaries of any one institution, domain, or region).
  • The creation of a set of core scholarly technology capabilities and services built upon common technology frameworks.
  • The ability to embrace and use a blend of open/community-source and commercially provided tools, resources, and services.

A successful partnership model is built on a foundation of common understanding about the work and problem space to be addressed. For many campus technologists (and for others as well), a systematic understanding of the nature of the work (for example, research in the humanities) and of where the disciplines may be going does not exist. In the case of such disciplines as the humanities, this may not be just a problem of insufficient understanding on the part of technologists, but also a broader absence of analytic models for scholarly practices and workflows in humanities research. It is critical that campus technologists find new ways to partner with a broad set of faculty to create deeper and more systematic understandings of current and developing research and teaching practices both within and across disciplines. And it is essential that new collaborative, hybrid organizational models be built to improve delivery of technologies to scholars where they work: the studio, the laboratory, the field, the conference, the classroom [5].

Faculty and students across disciplines need such core digital tools as collaborative environments, authoring tools, digitization technologies, media and data repositories, digital libraries, text mining, natural language processing, geospatial and mapping services, visualization, website development, and powerful computational processing. Is there not a way to provide these tools as shared services available to many users, rather than building, installing, and supporting these over and over again as local instances?

A key technological and methodological approach that can serve as the impetus for evolution in digital scholarship is to start from a perspective of services and service architectures [6]. Service-based approaches can be seen as coming from two generally different worlds:

  • the world of the large enterprise (whether corporate, governmental, or educational) with a set of service-oriented architecture (SOA) practices that emphasize scale, management, cost-effectiveness, and long-term stability; and
  • what might be called the Wild West of data "mashups" [7] coming from the Web 2.0 [8] and cloud computing [9] worlds in which ease, flexibility, and fast innovation (with a focus on the individual "consumer" rather than the organizational citizen) are paramount.

Common to both approaches is the idea of being able to reuse and weave together loosely coupled, discrete, specialized technology services that come from other providers and projects rather than building and managing all on one's own.
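As a rough illustration of what "loosely coupled" can mean in practice, the following Python sketch weaves two independent services into a small mashup. Everything in it is an assumption made for illustration: the service names (CatalogService, GeocodingService), the in-memory stand-ins, the sample records, and the approximate coordinates are hypothetical and do not describe any actual IST or campus system. The point is only that the combining function depends on each service's narrow interface, not on how or where the service is implemented.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol, Tuple


class CatalogService(Protocol):
    """Hypothetical interface: search a collection by keyword."""
    def search(self, keyword: str) -> List[dict]: ...


class GeocodingService(Protocol):
    """Hypothetical interface: turn a place name into (lat, lon)."""
    def locate(self, place_name: str) -> Optional[Tuple[float, float]]: ...


@dataclass
class InMemoryCatalog:
    """Stand-in for a remote collection service (illustrative only)."""
    records: List[dict]

    def search(self, keyword: str) -> List[dict]:
        needle = keyword.lower()
        return [r for r in self.records if needle in r["title"].lower()]


@dataclass
class InMemoryGeocoder:
    """Stand-in for a remote mapping service (illustrative only)."""
    places: Dict[str, Tuple[float, float]]

    def locate(self, place_name: str) -> Optional[Tuple[float, float]]:
        return self.places.get(place_name)


def map_collection(catalog: CatalogService,
                   geocoder: GeocodingService,
                   keyword: str) -> List[dict]:
    """Mashup: combine search results with coordinates from a second service."""
    points = []
    for record in catalog.search(keyword):
        coords = geocoder.locate(record["place"])
        if coords is not None:  # skip records the geocoder cannot place
            points.append({"title": record["title"],
                           "lat": coords[0], "lon": coords[1]})
    return points


# Made-up sample data; coordinates are approximate and for illustration only.
catalog = InMemoryCatalog(records=[
    {"title": "Field notebook, 1908", "place": "Berkeley"},
    {"title": "Field sketches", "place": "Unknown site"},
    {"title": "Museum ledger", "place": "Berkeley"},
])
geocoder = InMemoryGeocoder(places={"Berkeley": (37.87, -122.27)})

points = map_collection(catalog, geocoder, "field")
```

Because map_collection sees only the two interfaces, either stand-in could in principle be replaced by a client for an enterprise-managed service or for a third-party web service without changing the mashup itself.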

What if one could tie together the reliability of enterprise services with the rapid development and innovation model of mashups into a single services-delivery model? Some in central IT believe the time is right to develop a lightweight, common, and easy-to-implement services architecture that could enable current and future scholarly tools and resources created across institutions and disciplines to be shared, reused, and maintained [10]. Critical to such an approach is the implementation of a web services framework. Such a framework is not a vertical application that focuses on a single in-depth function or a self-contained software tool used directly by a user, but rather a horizontally integrating set of technologies and core shared capabilities that enable the creation, aggregation, and reuse of services and resources among scholars, projects, and institutions. Frameworks of these sorts are appearing in a number of large-scale administrative domains within higher education.

Questions abound regarding the design and feasibility of a partnership-enabled, shared technology services-based approach to support research, teaching, and public service. For example, how can a services architecture help us to move towards a package of core and common services that can be provided to all scholars or to disciplinary or functional clusters? How can we blend architectures being built for large-scale administrative applications (such as Kuali Student [11]) with scholarly frameworks? How can campuses connect to and integrate services that come from computational grids such as Globus [12], community-source digital repositories such as Fedora [13], textual analysis and data mining tools such as SEASR [14], learning management systems such as Sakai [15], digital library collections from the Open Content Alliance [16], and collaborative environments from a range of providers, such as Google and Six Apart? How could these services be woven together ("mashed up") with mapping, news feed, bibliographic, blogging, collaboration, and other social tools available from the Web 2.0 world? Should campuses begin to employ "cloud computing" services as part of the platform for campus cyberinfrastructure?

These are broad, strategic questions; they must be grounded and tested in the delivery of local solutions. In IST, we are working with campus departments in a number of partnerships to provide concrete services while we help the campus to build out a strategy for the provision of cyberinfrastructure for scholarship. In other iNews articles, you can see examples of recent IST–campus partnership work to develop shared technology services. Chris Hoffman writes about our consortial efforts with the Berkeley Natural History Museums and other art and cultural heritage museums and collections on campus to build a community-source and services-based collection management solution with recent funding from the Andrew W. Mellon Foundation [17]. Rich Meyer describes a new collaborative initiative, Humanities and Arts Research Technologies (HART), between the Dean of the Arts and Humanities, the Townsend Center for the Humanities, the Library, Computer Science, and IST to find ways to better serve humanities research [18]. Steven Lance describes the community-driven efforts of campus faculty, curators, and technologists to pilot and develop what we hope will be a campuswide Media Vault Program (MVP) to both preserve and deliver far better access to the digital assets being created every day at the Berkeley campus [19]. Patrick Schmitz, a new semantic services architect for the campus, outlines initial plans to focus on increasingly needed services that leverage technologies for statistical natural language processing as well as infrastructure to support social media and community annotation activities [20]. Ian Crew provides an overview of the Campus Collaborative Tools Strategy project we are carrying out to understand the fundamental campus needs around collaboration for research, teaching, public service, and administration, and how the campus might find a more integrated, cohesive, and efficient strategy for the provision of these services [21]. Finally, Noah Wittman, in providing an overview of the OCIO/IST Open Knowledge and the Public Interest (OKAPI) initiative, writes about one of the most important questions for a research university in the 21st century: "How can a public university use the Internet and digital technologies to share its knowledge and resources with the world?" [22]

Notes

[1] The author wishes to acknowledge and thank Chad Kainz at the University of Chicago for a number of the ideas and arguments in this article.

[2] See, for example, the National Science Foundation, Revolutionizing Science and Engineering through Cyberinfrastructure: Report of the National Science Foundation Blue-Ribbon Advisory Panel on Cyberinfrastructure [PDF] (January 2003); and the American Council of Learned Societies, Our Cultural Commonwealth: The report of the American Council of Learned Societies Commission on Cyberinfrastructure for the Humanities & Social Sciences [PDF] (December 2006).

[3] EDUCAUSE Center for Applied Research (ECAR) study, IT Engagement in Research — Key Findings [PDF]. Harvey Blustain with Sandra Braman, Richard N. Katz, and Gail Salaway. EDUCAUSE, July 2006.

[4] This is a very broad assessment of a complex landscape, which often varies greatly by discipline. In the sciences, in particular, there has been much work done by disciplinary clusters (e.g., astrophysics or computational biology) to implement rich shared solutions internationally. Even here, as the NSF notes, much work remains to be done across fields and disciplines.

[5] The question of building and sustaining new types of blended organizational and consortial models on campus and between campuses and other institutions is a complex issue in itself. Efforts are being carried out to develop new "virtual organizations" to deliver shared technology services. See, for example, the recent NSF-sponsored workshop Building Effective Virtual Organizations.

[6] In his 2005 article in Science magazine, Ian Foster defines service-oriented architectures as being "standard interfaces and protocols that allow developers to encapsulate information tools as services that clients can access without knowledge of, or control over, their internal workings. Thus, tools formerly accessible only to the specialist can be made available to all; previously manual data-processing and analysis tasks can be automated by having services access services." See Ian Foster, Service-Oriented Science, Science, 308, pp. 814-17.

[7] "A mashup is a lightweight tactical integration of multisourced applications or content into a single offering. Their primary business benefit is that they can quickly meet tactical needs with reduced development costs and improved user satisfaction." Quoted from the DMReview glossary.

[8] Today, Web 2.0 tends to refer to an online experience that is interactive, social, and data-focused. Tim O'Reilly defines Web 2.0 as software that adheres to seven basic principles, quoted here: 1) "the Web as platform", 2) "harnessing collective intelligence", 3) "data is the next Intel inside", 4) "end of the software release cycle", 5) "lightweight programming models", 6) "software above the level of a single device", and 7) "a rich user experience". See O'Reilly's article What Is Web 2.0: Design Patterns and Business Models for the Next Generation of Software.

[9] The term "cloud computing" has increasingly been used to refer to software as a service provided via the Web by such companies as Google, Yahoo, Microsoft, as well as a range of smaller firms. See, for example, John Markoff, New York Times: Why Can't We Compute in the Cloud?, August 2007.

[10] For a more detailed perspective on the transformation of information technology from a services perspective, see Describing the Elephant: The Different Faces of IT as Service [PDF] by Ian Foster and Steven Tuecke. ACM Queue, July/August 2005, pp. 26-34.

[11] Kuali Foundation.

[12] The Globus Alliance.

[13] Fedora Commons.

[14] Software Environments for the Advancement of Scholarly Research (SEASR).

[15] Sakai Foundation, Sakai Learning and Collaboration Environment. At Berkeley, our bSpace learning environment is powered by Sakai.

[16] Open Content Alliance.

[17] Collection management systems for campus museums. iNews, February 2008.

[18] Humanities and Arts Research Technologies (HART). iNews, February 2008.

[19] Digital preservation for all: A report from the Media Vault Program. iNews, February 2008.

[20] New semantic services activity in IST—Data Services. iNews, February 2008.

[21] Collaborative Tools Strategy Development project. iNews, February 2008.

[22] Open Knowledge and the Public Interest (OKAPI). iNews, February 2008.