66th IFLA Council and General Conference
Jerusalem, Israel, 13-18 August
Code Number: 028-129-E
Division Number: VI
Professional Group: Statistics
Joint Meeting with: -
Meeting Number: 129
Simultaneous Interpretation: No
A New Culture of Assessment: Preliminary Report on the ARL SERVQUAL Survey
Colleen Cook, Fred Heath & Bruce Thompson
Texas A&M University, Evans Library, College Station,
Texas A&M University and the Association of Research Libraries (ARL), under the New Measures initiative, are engaged in a project to evaluate service quality in research libraries using an augmented SERVQUAL instrument. In spring 2000, 13 ARL libraries in North America invited a random sample of students and faculty to take the survey through the web. The pilot project evaluates the efficacy of web-based survey instruments, and the augmented SERVQUAL protocol will be tested for its usefulness in measuring service quality from the user perspective in research libraries. The project plan will be discussed, and preliminary results reported from the administration of the survey to selected ARL libraries in spring 2000.
The 122 member libraries of the Association of Research Libraries (ARL) are among the most important research facilities in the world. While encompassing a cadre of public and specialized libraries, its membership is composed primarily of libraries from 111 of North America's preeminent universities. The membership shares a commitment to excellence in support of research and instruction. In large measure, that commitment is acknowledged by the post-secondary world. Its members are generally regarded as the apex of an important pyramid of more than 3,000 post-secondary libraries on the continent. Their richly diverse collections support the missions of the institutions of which each library is a part and draw scholars from around the world who seek to mine their treasures.
To serve more effectively the broadly diverse constituency of students and scholars dependent upon these unique resources, the Association of Research Libraries (ARL) has undertaken a three-year program to test and develop new service effectiveness measures among its members. The study enables ARL to re-examine traditional methods of assessing effectiveness while testing new theories to measure delivery of high quality services to those who avail themselves of critical research library resources. The need for new measures is common to all libraries. While the research is grounded in the research library community, it is possible that the emergent tools, with further research in other library cohorts, could be extended to post-secondary libraries generally.
The introduction of new effectiveness measures could not be more timely. Demands for accountability are sweeping higher education. Academic research libraries are confronting a watershed of change more compelling than anything encountered by the world of scholarship since the advent of the printing press in the 15th century. The costs of scholarly communication are rising more rapidly than any other aspect of the post-secondary environment. The confluence of the spiraling information explosion, almost unsustainable inflationary pressures, and the application of information technologies to the collection and dissemination of knowledge bring unparalleled challenges as well as opportunities. Nowhere are those challenges more in evidence than among the member libraries of the Association of Research Libraries.
What, however, constitutes excellence? How does the membership demonstrate to its diverse constituencies that ARL libraries are delivering the best possible services from the considerable investments that are being made in their operation?
The ARL-sponsored project charts a bold divergence from the measurement practices now in place among the nation's research libraries. Currently, member libraries are evaluated mechanistically. The standard measures assume a single objective reality, and all ARL libraries are evaluated one against the other on an index composed of five expenditure-driven variables (ARL 2000):
- Total operating expenditures
- Total professional plus support staff
- Total current serials received
- Total volumes added
- Number of volumes held
The relationship that emerges between expenditures and excellence is widely assumed but never empirically demonstrated. Indeed, the focus on expenditures is widely at variance with new demands for evaluation and accountability. There is a growing awareness that North American research libraries represent a rich and complex fabric rather than a single objective reality that can be explained by such an index.
The problem of assessment plagues North American research libraries. For several years, ARL has espoused initiatives to develop new measures that research libraries can use to better describe and assess their operations and value. Those measures are now lacking. Indeed, the North American practice of evaluating research libraries on the expenditure metric explained above is widely criticized. Each member is mechanically assigned a ranking without analysis of the unique shaping forces in place or the particular roles and responsibilities of the universities of which each library is a part. The metrics, for example, make no distinction between the service context of a large public land grant university and that of a private, largely graduate institution with a more tightly focused mission.
The ARL SERVQUAL pilot represents a significant innovation that will provide research libraries with a well-grounded theory of library quality based upon user perceptions and will facilitate the wisest allocation of available resources. Meaningful local information will be obtained permitting libraries to identify those dimensions of quality most in need of attention, permitting energy and resources to be focused where needed most. Additionally, it may be possible to identify "best practices" among selected cohorts, enabling administrators to consider lessons learned elsewhere for their applicability to the local experience.
The study is responsive to a call among post-secondary leaders for assessments that permit a deeper understanding of local quality issues while continuing to explore the nature of the relationships among research university libraries. As outcomes are better understood in the context of each university culture, those understandings may provide "best practices" guides for others seeking to correct deficits their own analyses may have identified.
The ARL purpose is to ensure that the research library mission is meaningful to and supportive of its diverse student and faculty clientele, allowing testing of traditional constructs and beliefs. The applications to teaching, learning and research are apparent. Expenditure-driven metrics have no demonstrable correlation to effectiveness. The ARL SERVQUAL project focuses upon user perceptions of the delivery of library services relevant to their needs. Where there are deficits, libraries will have the opportunity to make improvements that fit the local situation. A cohort of best practices across all the dimensions that define library quality may emerge, facilitating the efforts of administrators to best tailor available resources to the institutional mission. Trends across the dimensions can be identified at the national level, placing local results in an important context for librarians and campus administrators alike.
Significance of the Initiative
Research libraries command vast resources on their campuses. In their aggregate, ARL libraries required 2.5 billion dollars to operate in fiscal year 1997-98. Costs of the journals and electronic information sources inflate by double-digit rates annually. Despite innovative strategies to counter rising costs, these libraries are increasingly being called upon by their institutions to show accountability and value in relation to the investments made and the needs required by the university. This accountability and value-added performance is not being fully assessed by college and university libraries at this time. Additionally, libraries must be more inventive, agile, responsive, and effective in order to hold the attention of faculty, students and administrators and in order to respond to the unparalleled changes in the environment forged by the technological revolution. The delivery of new and responsive services and products required by the user community is a central issue of accountability. The need to be both agile and innovative is a result of the acute attention paid to expenditures and performance, and their relation one to the other. Research libraries have not escaped this attention, and as one of the largest budgetary units on campuses, must be ever more articulate about how and when they are adding value to the overall performance of the institution.
The three-year study is one of several initiatives launched by the ARL New Measures program. The study proposes to answer the demands for accountability by enabling member libraries to measure the effectiveness of library services through user perceptions of service quality delivery. It is a practical reform initiative that has broad acceptance and support among the membership, where the call for new measures is universal (ARL 1999). The assessment tool is grounded in the Gap Theory of Service Quality, developed for the for-profit sector in the 1980s by the marketing research group of Parasuraman, Zeithaml and Berry (1985). Their ground-breaking research led to the development of the SERVQUAL instrument, which undertakes to measure service quality across five dimensions: reliability, assurance, tangibles, empathy and responsiveness.
The ARL-sponsored study makes several contributions. At the conclusion of its three-year project, the ARL model intends to:
- make the administration of the service quality instrument transparent at the local level,
- develop the normative data that will permit institutions to surface best practices, and
- make it possible to identify national trends while at the same time distinguishing situations of importance to the local institution.
Equally important, because of the transparency and ease of administration, which requires no local expertise, the potential library client base could be expanded with additional qualitative research and re-grounding of the instrument to different library environments.
The ARL project undertakes the measurement of service quality in research libraries on a national scale, using an augmented SERVQUAL instrument. This is ARL's first effort ever to capture and score data from a user survey using the world wide web. For the first time, the measurement of educational outcomes from the user perspective will be made known. The huge national investment in research libraries will be evaluated in the context of user perceptions of service quality.
The project intends to make four fundamental contributions to the measurement of effective service delivery of library services:
- First of all, it proposes to shift the focus of assessment from mechanical expenditure-driven metrics to user-centered measures of quality.
- Secondly, it will undertake to re-ground the SERVQUAL protocol, developed for the private sector, to meet the needs of research libraries.
- Thirdly, it will undertake the analysis to determine the degree to which the information derived from local data can be generalized across the larger cohort group, providing much-needed "best practices" information.
- And finally, it will demonstrate the efficacy of large-scale administration of user-centered assessment across the world wide web in a manner all but transparent to the local institution.
The ARL project has its origins in the experiences derived at Texas A&M University Libraries over six years in translating the SERVQUAL instrument to the research library context. Administering SERVQUAL as an assessment tool for library performance in 1995, 1997, and 1999, the Texas A&M experience determined that the dimensions evaluated by the standard SERVQUAL instrument need to be adjusted for use in the research library context (Cook, Thompson, 2000a). Corroborating results found elsewhere in the literature, Texas A&M found only three library service dimensions isolated by SERVQUAL:
- tangibles, i.e., appearance of physical facilities, equipment, personnel, and communication materials;
- reliability, i.e., ability to perform the promised service dependably and accurately; and
- affect of library service, which combines the more subjective aspects of library service, such as responsiveness, assurance, and empathy (Cook, Coleman, Heath 2000).
The project adapts the SERVQUAL instrument across the three dimensions identified at Texas A&M. It will also test two additional dimensions whose presence emerged during an extensive series of interviews conducted with faculty, graduate students and undergraduate students at universities participating in the pilot study: (1) access to library collections and information resources as a responsiveness issue and, transcending the tangibles dimension, (2) the provision of an environment for study, collaboration, and reflection (Heath, Cook 2000).
One of the central questions surrounding the use of the SERVQUAL protocol is whether it is useful for cross-functional analysis and comparisons over time as well as of strategic and diagnostic utility. There is little question that SERVQUAL serves as a useful tool for management decision-making at the local level. The studies at Texas A&M (Cook, Thompson, 2000a), and Maryland (Nitecki 1996) and others amply demonstrate that usefulness. Further study is required and perhaps further adaptation is necessary before making statements about whether results can be generalized across institutions. If, however, the research library community could adopt the instrument as a mechanism for setting normative measures, institutions could be recognized and then be further investigated to identify the best practices resulting in high marks for service satisfaction among users.
The overall design of the project is described more fully in the following section. Twelve ARL member libraries have been selected to administer a common, modified version of the SERVQUAL instrument to a sample of their patrons. The twelve are: the University of Arizona, the University of Connecticut, the University of California, Santa Barbara, the University of Houston, the University of Kansas, Michigan State University, the University of Minnesota, the University of Pennsylvania, Pittsburgh University, Virginia Tech University, Washington University, and York University. A larger cohort group has been identified to carry the project into Year Two (2001), the final year of the Texas A&M-conducted study. In keeping with the concept of transparency, the work of the local institution will be limited to developing the respondent sample and assisting the design team with the look of its customized web-based survey instrument. Ease of local administration is key to introducing new measures of assessment and accountability.
The research design is grounded in research libraries. There are three aspects of replicability that underscore the strength and the value of the ARL study:
- First of all, the survey is web-based, and its administration is transparent to the local campus.
- Secondly, while the ARL pilot undertakes to administer a newly-grounded version of the SERVQUAL instrument, the hardware system developed by the Texas A&M design team for survey administration and the software developed for data collection and scoring can accommodate any instrument, an important step in enabling theory-building in the new measures arena.
- And finally, while the theory initially under development focuses squarely upon the needs and requirements of research libraries, the tools could be re-grounded and tested upon other types of libraries. Phase Two will include libraries from outside the ARL community in order to investigate those issues.
The research design, supporting the implementation of key reform ideas, can be applied by all libraries. The robust evaluation of the data captured and scored during the pilot phases will help broaden the understanding of how effective the new measures are in meeting the calls for accountability. The far-reaching plans for dissemination, including key international conferences and a theory-building monograph, will help share the lessons learned across the post-secondary community.
Significance of the Project's Design
There are few web-based paradigms in higher education in any field. The initiative occupies the unique position of being a pragmatic, applied project that is at the same time on the leading edge of theory-building. Nothing of this scale has been attempted in the public sector with the SERVQUAL instrument. The ARL project gains its initial impetus from the experiences derived at Texas A&M University over six years in translating the SERVQUAL instrument to the research library environment. A series of approximately 40 questions is asked across the five dimensions the survey undertakes to measure. For each question, the respondent is asked to rate library service on three scales: (1) the minimum acceptable service level, (2) the desired service level, and (3) the perceived level of performance. For each question, two gap scores are calculated: one between the minimum and perceived ratings and one between the desired and perceived ratings. The zone of tolerance is the difference between the minimum and desired scores. Optimally, perceived performance assessments should fall comfortably within that zone. Administrators look for scores that fall outside the zone and for trajectories that fall over time; the latter, even when the scores remain within the zone of tolerance, are nonetheless areas of concern. The Texas A&M design team acquired the hardware, designed a web form for collecting the data, and developed the software for scoring the results (Cook, Heath, 1999). The following paragraphs outline the unfolding of the project over its three-year life.
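The gap-score arithmetic described above can be illustrated with a minimal sketch. It is not the scoring software developed by the design team; the class and function names, the sign convention (perceived minus expectation), and the 1-9 rating scale are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ItemRating:
    """One respondent's three ratings for a single survey question."""
    minimum: float    # minimum acceptable service level
    desired: float    # desired service level
    perceived: float  # perceived performance


def gap_scores(rating: ItemRating) -> dict:
    """Return both gap scores and the zone of tolerance for one question.

    Assumption: each gap is computed as perceived minus expectation, so a
    negative value means performance falls short of that expectation.
    """
    return {
        "minimum_gap": rating.perceived - rating.minimum,
        "desired_gap": rating.perceived - rating.desired,
        "zone_of_tolerance": rating.desired - rating.minimum,
        "within_zone": rating.minimum <= rating.perceived <= rating.desired,
    }


if __name__ == "__main__":
    # Hypothetical ratings on an assumed 1-9 scale.
    example = ItemRating(minimum=5, desired=8, perceived=6)
    print(gap_scores(example))
```

Under this convention, a perceived score below the minimum yields a negative minimum gap and falls outside the zone of tolerance, which is the situation the text flags for administrators.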
Selection of Participants and Re-grounding of SERVQUAL Instrument. The 12 participants of the pilot project were identified in late 1999 and liaisons were appointed to work with the Texas A&M design team. In December 1999, the liaisons and most directors from the 12 schools met with the Texas A&M design team at the American Library Association Meeting in San Antonio to discuss pilot requirements and timelines. After ALA, most of the participating libraries were visited with the purpose of building theory, qualitatively re-grounding the SERVQUAL instrument through a series of interviews with faculty, graduate students and undergraduates. Between 60 and 80 interviews were conducted on the campuses of the home institutions to hone the questions in the SERVQUAL instrument and to assist in identifying any additional possible dimensions. The interviews of faculty and students were transcribed and then subjected to content analysis. The final version of the SERVQUAL questionnaire was placed on the web in March, 2000. York University in Canada served as the first site to respond.
Procurement of Hardware and Software/Development of the Web Instrument. Texas A&M University procured and installed a Dell PowerEdge 4300 server and two Dell PowerEdge 2400 servers for the administration of the project. The two 2400 servers collected data from the 20,000 potential respondents from the 12 participating institutions. The 4300 server housed the Microsoft SQL database software, capturing and channeling the result sets into SPSS for analysis. Members of the design team from the Cognition and Instructional Technologies Laboratory (CITL) at Texas A&M configured the servers and worked with the liaisons at the participating institutions to prepare their web pages and develop their samples. The web instrument was beta-tested with the Medical Sciences Library at Texas A&M in February 2000. The York version was loaded on the web March 15. Based on experiences there, the instrument was slightly revised for the April administration.
Data Analysis and Theory Testing. In May and June 2000, the data will be collected and scored. In July, each of the pilot libraries will be provided with mean scores on each of the questions as well as for each dimension the instrument succeeds in defining. Each participant will also receive the aggregate mean scores for each question and each dimension and other descriptive statistics. In addition, each participating library will receive its SPSS file for further in-depth analysis. Discussion continues among the participants as to whether to share individual library scores with each other. That decision will be made in spring 2000.
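The kind of summary each library is to receive (local means per question and per dimension, alongside the aggregate means) might be computed along the following lines. This is a sketch only: the record layout, the field names, and the sample values are hypothetical and do not reflect the project's actual database or SPSS files.

```python
from collections import defaultdict
from statistics import mean


def mean_scores(responses, *keys):
    """Average the 'score' field over groups defined by the given keys."""
    groups = defaultdict(list)
    for row in responses:
        groups[tuple(row[k] for k in keys)].append(row["score"])
    return {group: mean(scores) for group, scores in groups.items()}


# Hypothetical responses; library names, question ids and scores are placeholders.
responses = [
    {"library": "A", "question": "q1", "dimension": "reliability", "score": 6},
    {"library": "A", "question": "q1", "dimension": "reliability", "score": 8},
    {"library": "A", "question": "q2", "dimension": "tangibles", "score": 5},
    {"library": "B", "question": "q1", "dimension": "reliability", "score": 4},
    {"library": "B", "question": "q2", "dimension": "tangibles", "score": 7},
]

# Local results: one mean per (library, question) and per (library, dimension).
local_by_question = mean_scores(responses, "library", "question")
local_by_dimension = mean_scores(responses, "library", "dimension")

# Aggregate results: one mean per question across all libraries.
aggregate_by_question = mean_scores(responses, "question")
```

Grouping by a tuple of keys keeps the same helper usable for local and aggregate reports, so a library's own mean can be set beside the cohort mean for the same question.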
Over the summer of 2000, after all data have been collected, the theoretical foundations of the instrument will be subjected to rigorous quantitative testing. These analyses will be grounded in the premise that scores, not tests, are reliable and valid (cf. Thompson & Vacha-Haase, 2000; Wilkinson & The APA Task Force on Statistical Inference, 1999). Thus, it cannot be assumed that, just because SERVQUAL functions well in business settings, scores from the same protocol will have sufficient psychometric integrity when used in library settings.
First, score reliability will be evaluated. These analyses will examine "corrected" item discrimination coefficients and alpha-if-deleted statistics, as well as total score alpha coefficients. Second, the primary methods for evaluating validity will invoke factor analysis. As Nunnally (1978) noted, "factor analysis is intimately involved with questions of validity.... Factor analysis is at the heart of the measurement of psychological constructs" (pp. 112-113).
In the academic year 2000-2001, the instrument will be further refined. From among the respondents of the first phase, some may be tagged for a longitudinal follow-up study. In this manner, it will be possible to test the findings qualitatively by going back to some of the respondents in on-line focus groups. A number of libraries have already expressed interest in being included in the second pilot in the Spring 2001.
The third year will mark the emergence of a mature instrument and its movement from the design oversight of Texas A&M University to operational administration by the Association of Research Libraries. Equipment and software similar to that procured, configured and developed by Texas A&M will be acquired by ARL for on-going administration.
The strength of the project is the rigor of its design and the robustness of the statistical analysis to which the results will be subjected. Close peer scrutiny of the findings is assured through broad dissemination of the results. The model recognizes the preeminence of local findings and surfaces best practices across institutions. The six years of data collected and analyzed by Texas A&M University General Libraries will be scaled to a national undertaking, accommodating other related research. The experience will enable a technology transfer to libraries generally as well as to a broad range of related ARL applications.
First year results will be reported at an ARL International Conference on the Culture of Assessment, in Washington, D.C., October 2000. Upon the conclusion of testing and assessment, the collaborators will issue a monograph assessing the cross-institutional data on each of the service dimensions. The ARL-sponsored monograph will include the information on aspects of quality library service derived from the interviews at the twelve participating universities. It will also focus on the practical aspects of implementing and administering a large-scale survey across the web. SERVQUAL will be evaluated for its utility as a best practices tool for research libraries. Concurrent with the completion of the monograph, the findings of the first pilot project will be disseminated at the fourth Northumbria International Conference on Performance Measurement in Libraries and Information Services in 2001.
The project plan envisions the migration of the operational oversight of the tool to ARL by 2002, with the instrument available for widespread administration. An assessment tool that is well grounded in theory and rigorously administered holds promise to finally answer the calls for greater accountability and responsiveness to user needs in college and university libraries.
References
ARL Membership Criteria Index, 1998-99 (2000). Memorandum to Directors of ARL Libraries from Martha Kyrillidou, Senior Program Officer for Statistics and Measurement and Julia Blixrud, Director of Information Services, Association of Research Libraries, March 8, 2000.
Blixrud, J. (1999) 'The continuing quest for new measures' ARL Newsletter: A Bimonthly Report on Research Library Issues and Actions from ARL, CNI, and SPARC 207 (December), 11.
Cook, C., Coleman, V., and Heath, F. (2000) 'SERVQUAL: a client-based approach to developing performance indicators' in Department of Information and Library Management. Proceedings of the 3rd Northumbria International Conference on Performance Measurement in Libraries and Information Services 27-31 August 1999. Newcastle upon Tyne: Information North 211-218.
Cook, C. and Heath, F. (1999) 'SERVQUAL and the quest for new measures' ARL Newsletter: A Bimonthly Report on Research Library Issues and Actions from ARL, CNI, and SPARC 207 (December), 12-13.
Cook, C. and Thompson, B. (2000a) 'Reliability and validity of SERVQUAL scores used to evaluate perceptions of library service quality.' Journal of Academic Librarianship (in press).
Cook, C. and Thompson, B. (2000b) 'Higher order factor analytic perspectives on users' perceptions of library service quality.' (Manuscript submitted for publication).
Heath, F. and Cook, C. (2000) 'User perceptions of service quality in research libraries: a qualitative assessment.' (Manuscript in preparation).
Nitecki, D. A. (1996) 'An assessment of the applicability of SERVQUAL dimensions as customer-based criteria for evaluating quality of services in an academic library' (Doctoral dissertation, University of Maryland, 1995). Dissertation Abstracts International 56, 2918A (University Microfilms No. 95-39, 711).
Nunnally, J.C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.
Parasuraman, A., Zeithaml, V.A., and Berry, L.L. (1985) 'A conceptual model of service quality and its implications for future research' Journal of Marketing 49(4) Fall, 41-50.
Thompson, B., & Vacha-Haase, T. (2000). Psychometrics is datametrics: The test is not reliable. Educational and Psychological Measurement, 60, 174-195.
Wilkinson, L., & The APA Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594-604. [reprint available through the APA Home Page: