66th IFLA Council and General Conference

Jerusalem, Israel, 13-18 August


Code Number: 081-174(WS)-E
Division Number: VII
Professional Group: Library History in association with the Association of Jewish Libraries, Judaica Librarians Group, and Hebraica Libraries Group: Workshop - Session 3
Joint Meeting with: -
Meeting Number: 174
Simultaneous Interpretation: No

Who invented the index? - An agenda for research on information access features of Hebrew and Latin manuscripts

Bella Hass Weinberg
Library and Information Science, St. John's University
Jamaica, New York, USA


The Internet has stimulated interest in the history of indexes, but insufficiently detailed book indexes and manuscript catalogs make research in this field difficult. It has been claimed that concordances and subject indexes were invented in France in the thirteenth century, but alphabetical lists of words and phrases from the Hebrew Bible were compiled by the tenth-century Masoretes. The Hebrew codicological database, Sfardata, lacks fields for the paratextual features of manuscripts, and the emerging standards for manuscript cataloging in the electronic environment lack detail in this area. Enhancing codicological databases and standards would facilitate tracing the origins of indexes.



The growth of the Internet and the increased number of electronic documents have stimulated interest in the history of indexing. The well-known librarians Fred Kilgour and Bill Katz have published histories of the book and of reference books, respectively.1 The history of Hebrew books and reference works is neglected in the aforementioned books and in other Western studies. Encountering a claim that the first citation index was published in England in the eighteenth century, I investigated the earliest Hebrew citation indexes and found that they are nearly six centuries older.2 The study of citation indexes stimulated me to look for the earliest Hebrew subject indexes. I researched this topic at the Vatican Library and, surprisingly, found no subject indexes in Hebrew incunabula (books printed until 1500) or manuscripts, only more citation indexes. My article discussing the reasons for this includes a chart surveying the first indexes as well as the features related to them, such as alphabetical order.3

In documenting that paper I encountered the claim that concordances and subject indexes were invented in France.4 I visited half a dozen French repositories in the summer of 1999 and examined the earliest Latin concordances and indexes. A paper based on that research has just been published.5


Masoretic lists and concordances

While in France, I spent a day at the Oriental Division of the Bibliothèque Nationale. After examining early Hebrew manuscripts in that collection for features germane to indexing, I purchased the exhibition catalogue compiled by the curator, Michel Garel.6 The book includes an illustration of a Masoretic list in which words from the Hebrew Bible are arranged in alphabetical order, with the phrases from which they are taken aligned next to them. It struck me that the format of such lists may have served as the model for the Latin concordances to the Bible that were compiled in the thirteenth century. This hypothesis led to a whole series of questions relating to the definition of index, the breakup of the Bible into chapters, the role of memory in relation to indexes, and Christian-Jewish communication in the Middle Ages.

A key characteristic of indexes is that they lead from a known order of symbols (such as the Hebrew alphabet) to an unknown order of information (e.g., Biblical passages). An index entry consists of three elements, two of which are required. The elements are: the heading, the modification (optional), and the locator. In a typical Biblical concordance, the heading is the word being looked up, the modification is the phrase in which the word is found, and the locator is the number of the chapter and verse. Masoretic lists had only two of these elements; they lacked chapter numbers.
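The three-element entry described above can be sketched as a small data structure. This is a minimal illustration only, not a reconstruction of any historical tool: the names IndexEntry and build_entries are hypothetical, and the two English verses (Genesis 1:1 and 1:3 in the King James Version) are sample text chosen for brevity. The locator is made optional here so that a Masoretic-style list, which has a heading and a phrase but no chapter number, can be represented alongside a full concordance entry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IndexEntry:
    heading: str                        # the word being looked up
    modification: Optional[str] = None  # the phrase in which the word is found
    locator: Optional[str] = None       # chapter and verse; None models a list without locators


def build_entries(verses, with_locators=True):
    """Build one entry per word occurrence, sorted by heading (the known order)."""
    entries = []
    for (book, chapter, verse), text in verses.items():
        for word in text.split():
            heading = word.strip(".,:;").lower()
            if not heading:
                continue
            locator = f"{book} {chapter}:{verse}" if with_locators else None
            entries.append(IndexEntry(heading, text, locator))
    return sorted(entries, key=lambda e: (e.heading, e.locator or ""))


# Sample text for illustration only (King James Version).
SAMPLE_VERSES = {
    ("Genesis", 1, 1): "In the beginning God created the heaven and the earth.",
    ("Genesis", 1, 3): "And God said, Let there be light: and there was light.",
}

concordance = build_entries(SAMPLE_VERSES)
masoretic_style = build_entries(SAMPLE_VERSES, with_locators=False)

for entry in concordance:
    if entry.heading == "god":
        print(entry.heading, "|", entry.modification, "|", entry.locator)
```

In the second list every entry still carries its phrase, which is the situation discussed in the text: whether that phrase functions as a modification or as an implicit locator depends on the reader, not on the structure.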

Chapter numbering of the Bible is generally credited to Stephen Langton, who lived in the thirteenth century. The use of Langton's numbers in a major Latin concordance made them a standard reference system that was eventually incorporated into Hebrew Bibles and concordances. Langton did not invent chapter numbering, however. According to Rouse and Rouse, this feature is found in the Greek New Testament and dates back to the second or third century.7

The Masoretes also divided the Bible into chapters, known as sedarim, for the purpose of designating portions of the text to be read in the synagogue. In some Masoretic manuscripts, lists of the initial phrases of the sedarim are numbered with Hebrew letters. Two well-known early Biblical manuscripts, the Cairo Codex and the Aleppo Codex, do not incorporate the numbering of sedarim into the text, but it is found in parts of the Leningrad Codex. In other parts of that codex, numbering of sedarim was added by a later hand. Experts in Rabbinic literature whom I have consulted do not know of cases where the numbering of sedarim was used in references, e.g., Genesis, seder 25. Instead, the name of the portion of the Pentateuch, or the initial words of the seder, were used to indicate the location of a passage.

The names and order of the portions of the Pentateuch are not common knowledge today,8 but it has been posited that Jews in the Middle Ages did not need concordances because they knew the Bible by heart.9 This returns us to the question of whether Masoretic lists were indexes: the Biblical phrases may have been implicit locators rather than modifications. In other words, just seeing the phrase was enough to tell the learned Jew where in the Bible it was located. It is one thing to be able to recognize the source of a passage, but quite another to recall all the passages in the twenty-four books of the Bible that contain a given word, including function words (prepositions, conjunctions, etc.). Here the focus is on how the Masoretes worked: Clearly they had no computers, but did they have index cards?

If we grant that Masoretic lists were concordance-like structures, we need to establish that Christian Biblical scholars had access to them before claiming that these Hebrew word lists served as the model for Latin concordances. There are varying opinions on this question. Beryl Smalley's theory of Christian-Jewish communication in the Middle Ages10 is not accepted by all scholars. One expert told me that few Jews during that period understood Masoretic notes, and the number of Christians who could interpret them was no doubt even smaller. My hypothesis, however, does not relate to the Masorah Parva (literally, the small Masorah), the tiny coded notes in the margins of Biblical manuscripts; I am speaking only of the independent Masorah, the separate lists such as Okhlah ve-Okhlah, which have a format that is easily copied.


Codicological databases and manuscript cataloging standards

My prior papers have discussed the problems associated with researching the history of indexing. One of them is the poor indexing of books on the subject. For example, Leila Avrin's volume on the history of the book mentions indexes, but the term is not in her index.11 Another major problem is the inconsistent use of terminology for indexes in all the languages with which I am familiar. For example, the term index is often applied to tables of contents by catalogers of Hebrew and Latin manuscripts and incunabula. Finally, the inclusion of indexes and features related to them is often not noted in manuscript catalogs. Therefore it is necessary to do sequential searches of manuscript repositories to find indexes.

Computers are now being applied to the study of manuscripts. In the world of Hebrew codicology (the study of codices, i.e., manuscript books), the major database is Sfardata, directed by Malachi Beit-Arié. The questionnaire for the project lacks fields for the paratextual features of manuscripts, i.e., those, such as indexes, which enhance information access. These features can be recorded in a comment field, but that is not likely to capture details on types of indexes and their methods of arrangement in a structured manner to facilitate retrieval.
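The difference between a free-text comment and structured fields can be illustrated with a short sketch. Everything in it is an assumption made for the example: the classes ParatextFeature and ManuscriptRecord, the field names kind and arrangement, and the shelfmarks are all hypothetical and do not reflect the actual Sfardata questionnaire or any existing cataloging standard.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ParatextFeature:
    kind: str         # hypothetical vocabulary, e.g. "subject index", "citation index"
    arrangement: str  # hypothetical vocabulary, e.g. "alphabetical", "text order"


@dataclass
class ManuscriptRecord:
    shelfmark: str
    comment: str = ""  # free text: readable, but not reliably searchable
    paratext: List[ParatextFeature] = field(default_factory=list)


def find_by_feature(records, kind: str, arrangement: Optional[str] = None):
    """Return shelfmarks of records with a matching structured paratext feature."""
    return [
        r.shelfmark
        for r in records
        if any(
            p.kind == kind and (arrangement is None or p.arrangement == arrangement)
            for p in r.paratext
        )
    ]


# Invented records for illustration only.
RECORDS = [
    ManuscriptRecord("MS A", comment="Has an index at the end."),
    ManuscriptRecord("MS B", paratext=[ParatextFeature("citation index", "text order")]),
    ManuscriptRecord("MS C", paratext=[ParatextFeature("subject index", "alphabetical")]),
]

print(find_by_feature(RECORDS, "citation index"))
print(find_by_feature(RECORDS, "subject index", "alphabetical"))
```

The record "MS A" shows the limitation of the comment field: its note mentions an index, but says nothing retrievable about the type of index or its arrangement, so a structured query cannot find it.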

In the broader field of manuscript studies, which focuses mainly on Latin manuscripts, a paratextual field has been defined for the standard under development by the Text Encoding Initiative. Its definition is inadequate, and the subfields have not been enumerated yet.12 Digitization is currently a major activity in the library world, but without adequate cataloging, we will not be able to find the documents of interest. The Digital Scriptorium13 is an example of a collection of digitized manuscripts for which the cataloging information lacks fields for paratextual features, and even for collation. The method of numbering the leaves or pages of a codex is very much related to information access. For example, if a manuscript has foliation (numbering of leaves), the locators in its index are not as specific as they would be with pagination. Thus collation data are germane to the study of indexes.
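The point about locator specificity can be made concrete with simple arithmetic. The sketch below assumes a codex in which leaf 1 carries pages 1 and 2 (recto and verso), with no unnumbered leaves; real manuscripts often depart from this, and the function names are hypothetical.

```python
from typing import Tuple


def pages_on_leaf(leaf: int) -> Tuple[int, int]:
    """Return (recto_page, verso_page) for a 1-based leaf number."""
    if leaf < 1:
        raise ValueError("leaf numbers start at 1")
    return (2 * leaf - 1, 2 * leaf)


def leaf_for_page(page: int) -> int:
    """Return the 1-based leaf number on which a 1-based page falls."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return (page + 1) // 2


# A locator that cites only leaf 12 sends the reader to two pages.
print(pages_on_leaf(12))
# Two distinct page locators collapse to the same leaf locator.
print(leaf_for_page(23), leaf_for_page(24))
```

Under these assumptions a leaf-number locator always leaves the reader two pages to scan, whereas a page locator identifies one.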


Conclusions and recommendations to the international library community

The study of manuscripts to date has focused primarily on their physical features (paper, ink) and artistic qualities (rubrication, illumination). Data on physical features contribute to the dating of indexes and to the identification of the countries in which they were produced. Artistic qualities, such as the use of color in manuscripts, can also be germane to information structure: Kilgour has shown that red ink was used to highlight keywords in Egyptian documents.14 Alternating blue and red headwords was a common feature of medieval Latin indexes, which enhanced the clarity of these tools.

The literature on the history of indexing is widely scattered, and we - librarians, who are supposed to be experts in the organization of information - have not provided good access to it. It is recognized today that organizing information is the most important thing librarians do, but it is difficult to write the history of our own field because we have not organized, or indexed, the relevant information.

Librarians are involved in developing metadata standards for many forms of document. We should see to it that these structures have fields for the features of interest to us, notably indexes. After the standards are developed, much work remains to fill in the slots, i.e., to catalog in a consistent, detailed manner the many thousands of manuscripts held in repositories throughout the world. Only when such databases exist will we be able to retrieve information on the earliest occurrences of tables of contents, indexes, and related information structures.

This paper has focused on documents in two scripts, Latin and Hebrew, although other writing systems have been mentioned. In challenging the claims that the first citation indexes and concordances were in the Latin alphabet, I have noted Hebrew precedents, but I have also mentioned features germane to indexes that are found in documents written in other scripts. Librarians who work with Greek, Arabic, and Chinese documents should share information on the earliest examples of information access features found in these works, to allow us to move closer to the answer to the question, "Who invented the index?".


Acknowledgments

The Eugene Garfield Foundation funded my research in the Vatican Library and in France. The Jesselson Foundation provided a grant for research on Hebrew manuscripts in Israel. Support for this research was also provided by St. John's University.

I am indebted to the following American and Israeli scholars for sharing their expertise with me: Malachi Beit-Arié, Robert Brody, Miles Cohen, Consuelo Dutschke, David Gilner, Sid Leiman, Yosef Ofer, Jordan Penkower, Menahem Schmelzer. (Owing to limitations of space, their individual contributions and affiliations cannot be listed.)



  1. Kilgour, Frederick G. The Evolution of the Book. New York: Oxford University Press, 1998; Katz, Bill. Cuneiform to Computer: A History of Reference Sources. Lanham, MD: Scarecrow Press, 1998.

  2. Weinberg, Bella Hass. "The Earliest Hebrew Citation Indexes". Journal of the American Society for Information Science, 48 (4): 318-330 (1997). Reprinted in: Historical Studies in Information Science. Medford, NJ: Information Today, 1998, 51-64.

  3. Weinberg, Bella Hass. "Indexes and Religion: Reflections on Research in the History of Indexes". The Indexer, 21 (3): 111-118 (1999).

  4. Rouse, Richard H. and Mary A. Rouse. Preachers, Florilegia and Sermons: Studies on the 'Manipulus Florum' of Thomas of Ireland. Toronto: Pontifical Institute of Mediaeval Studies, 1979.

  5. Weinberg, Bella Hass. "Book Indexes in France: Medieval Specimens and Modern Practices". The Indexer, 22 (1): 2-13 (2000).

  6. Garel, Michel. D'une Main forte: Manuscrits Hébreux des Collections Françaises. Paris: Bibliothèque Nationale, 1991.

  7. Rouse and Rouse, Preachers, 29.

  8. Weinberg, Bella Hass. "The Structures of Early Hebrew Reference Books: Canonical, Alphabetical, Classified". In: Association of Jewish Libraries. Proceedings of the 34th Annual Convention, June 1999. New York: Association of Jewish Libraries, 2000, 120-128.

  9. Wellisch, Hans H. "Hebrew Bible Concordances, With a Biographical Study of Solomon Mandelkern". Jewish Book Annual, 43: 56-91 (1985-86).

  10. Smalley, Beryl. The Study of the Bible in the Middle Ages. Oxford: Basil Blackwell, 1952.

  11. Avrin, Leila. Scribes, Script and Books: The Book Arts from Antiquity to the Renaissance. Chicago: American Library Association; London: The British Library, 1991. "[S]ubject indices began in the thirteenth century" (p. 221).

  12. The definition of paratext in the standard under development reads: "contains a description of any other information contained in manuscript which is intended to aid the reader, such as column numbers, running heads, etc.". The only example provided is a description of line numbering. http://www.hcu.ox.ac.uk/TEI/Master/Reference/ref/PARATEXT.htm. Visited May 11, 2000.

  13. Digital Scriptorium http://sunsite.berkeley.edu/scriptorium.

  14. Kilgour, Frederick G. "Locating Information in an Egyptian Text of the 17th century B.C.". Journal of the American Society for Information Science 44 (5): 292-297 (1993).


Latest Revision: May 29, 2000 Copyright © 1995-2000
International Federation of Library Associations and Institutions