Columbia University Receives Andrew W. Mellon Foundation Grant for Computational Linguistics for Metadata Building (CLiMB) Project in the Center for Research on Information Access at the Columbia University Libraries

New York, April 22, 2002

Dr. Judith L. Klavans, director of the Center for Research on Information Access (CRIA) at the Columbia University Libraries and research scientist in the Department of Computer Science, will lead a new project funded by the Andrew W. Mellon Foundation. The $542,000 grant will support the Computational Linguistics for Metadata Building (CLiMB) project, which creatively brings together the most recent developments in natural language processing and applies them to the problems of automatically extracting metadata from text.


The CLiMB project will run for approximately two years and will develop innovative uses of computational linguistic techniques to identify and extract descriptive metadata, with the aim of improving access to image collections.

Dr. Klavans said, "We are pleased to be able to lead this creative new project in applying the latest results in computational linguistics to the needs of users of complex digital information. The award from Mellon promises to enable a major leap in improving access to image collections."

One of the most serious bottlenecks in digitizing collections is making them easy to search. The proposed strategy has the potential to provide rich, subject-oriented indexing for large image collections that would otherwise be prohibitively expensive to describe and index manually. A further advantage of the approach is that the descriptive metadata generated may be derived from authoritative scholarship in a way not normally feasible in standard cataloging practice. The goal of CLiMB is to develop and test automatic approaches to creating descriptive metadata that improve access to digital library special collections.

In addition, CLiMB promises to provide a platform for the development and testing of other innovative approaches to text-derived metadata generation and use that could lead in time to even more powerful search, retrieval and presentation tools. The CLiMB project includes an extensive component for assessment and evaluation. This new Mellon-funded project builds on earlier results at Columbia in the development of natural language processing tools that permit large-scale robust analysis of text data.

The Center for Research on Information Access (CRIA) is an innovative interdisciplinary research unit of the University Libraries. Its goal is to develop a research program that applies the latest and most creative technologies from computer science to the serious problems libraries face in dealing with growing amounts of digital data. The Center, established in January 1995 and directed by Dr. Judith L. Klavans, has integrated and coordinated digital information activities at Columbia University and has enabled the University to advance research on technologies related to information access.

Examples include the NSF-funded interdisciplinary PERSIVAL medical digital library, which spans the Departments of Computer Science, Engineering, Medical Informatics, Academic Computing and the Libraries; the multi-institutional Digital Government Research Center, funded by several government sources; and the DARPA-funded TIDES multilingual multidocument summarization project. The CRIA website is at:

For information:

Dr. Judith L. Klavans

Center for Research on Information Access
Columbia University in the City of New York

(212) 854-7443