VOL. 17 NO. 2 APRIL-JUNE 1998 - Converting CDS/ISIS Databases to Multimedia System for Internet and CD-ROM : A Case Study

A. Sreenivas Ravi & T.B. Rajashekar
National Centre for Science Information, Indian Institute of Science, Bangalore 560012.

A. Neelameghan
Institute of Information Studies, 702 Upstairs, 42nd Cross, III Block, Rajanagar, Bangalore 560010.

Abstract: The OM Information Service (OIS) is an interactive multimedia database service based largely on Micro CDS/ISIS input records. The OIS is available on the Internet, and on CD-ROM with an off-line web browser. The objective of OIS is to provide access to a large database of extracts and quotations from a wide range of sources spanning different cultures and faiths, from Vedic times to the present. At the core of OIS is the database of extracts (OM database), interlinked with a database of life sketches of the Source persons (OMBIO database) and a bibliographical database (OMBIB database) of related documentary materials. Web browsers such as Netscape and Internet Explorer provide a user-friendly end-user interface, and HTML affords quick and easy design of the interface. The OIS interface is composed of a series of self-explanatory HTML forms. The HTML forms display the alphabetical index to the database, more or less conforming to the Field Select Table parameters of the CDS/ISIS databases. From the sequence of index pages a user can select terms, by clicking the mouse, to formulate Boolean search expressions. The user also has the option to select names from the Source names index and combine them with the Extracts index terms already selected to formulate Boolean search expressions. Hypertext links to biographical and bibliographical records are embedded in the display of retrieved extracts, so the user can view the related life sketches (with pictures in some cases) and bibliographic records. A Common Gateway Interface (CGI) program, written in C and run on the web server, handles the concept term index and the source name index, as well as the search and retrieval tasks of the Service. Interactivity is maintained, and the search process made to appear as one continuous session, by means of a number of hidden fields in the HTML forms that the user views. This is necessary because HTTP is a stateless protocol.
As the Web supports multimedia data, the images of the Sources (Authors) are incorporated where available, and the user has the option to select and play some devotional music during a search session. Sri C. Subramaniam, President, Bharatiya Vidya Bhavan, presents an introductory note to the OIS; his voice can be heard, and his picture and the text of the note can be viewed on the screen. A semi-technical overview of OIS by A. Neelameghan can also be viewed by users. The paper gives details of the contents of the database, the fields in the CDS/ISIS database, the indexing and display parameters, the user interface, and the search and retrieval process. Some issues relating to the indexing of the records are mentioned.

1. Introduction

In a rapidly changing social, political, cultural, economic, and technological environment, people are subjected to a variety of mental stress factors. For relief from such stress, many persons are taking recourse to, and benefitting from, among other things, the study of scriptures, attending spiritual, religious and scholarly discourses, and practising meditation, yoga, etc. Such efforts can be supported and supplemented with information for use at home, in work places, schools and other educational institutions. With this in view, about a year ago, A. Neelameghan started work on a CDS/ISIS database containing extracts from the writings and sayings of spiritual leaders, saints, seers, mystics, and scholars. The authors covered (here called Sources) span cultures, religions, and faiths, from Vedic times to the present. Subsequently, in addition to the Extracts database, a biographical database of the Sources and a database of bibliographic references to related documentary materials were also initiated, again using CDS/ISIS. The SELECT.PAS programme, developed at the Institute of Information Studies, Bangalore, was used for multiple database search, and to provide hypertext links among records within a database and among different databases. The retrieval facilities of the standard CDS/ISIS package are available.

As the number of records in the Extracts database increased to about 5,000, several persons evinced interest in the facility. Therefore, ways of making the database accessible to a wider range of users were considered. From preliminary discussions with NCSI emerged the idea of creating a multimedia system from CDS/ISIS inputs (excepting the audio and picture files) and mounting the database on a web server and on CD-ROM. A project proposal was formulated, and partial funding was received from DESIDOC for the OM Information Service project.

2. OM Information Service

2.1 Objective

The OM Information Service, mounted on the Internet and on CD-ROM, has the goal of assisting the moral and spiritual progress and eventual transformation of man by providing access to select extracts and quotations from a wide range of sources, thus providing a richer information environment for self-education, for the preparation of lectures and discourses, and also for analytical and comparative studies. All of this can lead to a better appreciation of divergent views and interpretations of an idea, and can enable one to perceive the fundamental unity of the basic tenets of different faiths. In self-learning and guided-learning processes, this could be conducive to a change in people's perceptions and attitudes and thus, hopefully, minimize the chances of conflicts and tensions in human relations.

2.2 Types of Queries the Service Can Respond to

The system can respond to queries such as :

  • Texts of what Saint John of the Cross and Soren Kierkegaard have said about 'control of the senses or purity of heart'.

  • What the Bhagavad Gita says about Karma, and interpretations of the relevant texts by Annie Besant and Dr. S. Radhakrishnan.

  • The relation between 'Salvation' and 'Renunciation' as perceived by Swami Vivekananda, Sri Ramakrishna Paramahamsa, Sri Sankara, Sathya Sai Baba, Meister Eckhart, Thomas à Kempis and other spiritual leaders and mystics.

  • Some anecdotes of Mother Teresa relating to Compassion.

  • Biographical sketch of Guru Nanak with photograph.

  • A random selection of an inspirational verse or saying.

  • List of selected works on/by the Sikh Gurus and Sufi Saints of India.

  • Play a piece of devotional music.

3. Databases

3.1 Searchable Files

Currently, the OIS input records are of three types:

  • Records carrying extracts and quotations from the sayings and writings of spiritual and religious leaders, saints, mystics and scholars, as well as from epics (e.g. the Bhagavad Gita, the Ramayana), Vedic texts, and religious texts (e.g. the Upanishads, the Bible, the Koran, and the Dhammapada). This file currently holds some 16,550 extracts from about 800 Sources. This principal database is called OM.

  • Records of life sketches, about 50, with pictures of selected Source persons. The database of life sketches is called OMBIO (the pictures are, of course, not stored in the database); and

  • Selected bibliography of relevant books and tracts (about 120), as a guide for further reading. This database is called OMBIB.

The sources of the music pieces include compact discs, audio cassettes, and the Internet.

New records will be added to these files frequently.

3.2 CDS/ISIS Fields and Indexing Parameters

3.2.1 Extracts Database

The database of extracts has the following fields:

Heading (R, IT 4)
Text (R, IT 4)
Source (original) (R, IT 0)
Context (R, IT 4)
Notes (R, IT 4)
Verse number (R, IT 0)

And the following internal control fields :

Secondary source
Volume
Page (s)
Record identifier (IT 0)

R = Repeating Field; IT = Indexing Technique used.

Subfields are provided to record the Mandala, Kanda, Sloka, etc. for Upanishads.

The size of a textual extract in Field Text may range from a few words to over 1000 words. An extract may be a saying, a verse or sloka, an anecdote, a parable, or a dialogue.

For an extract record, the following fields are displayed: Text, Source, Context, and Notes (plus Verse number for extracts from the Bhagavad Gita). For an extract from an Upanishad, the Kanda, Mandala, Sloka, etc. are indicated where available.

3.2.2 Biographical Record

The fields and indexing technique for a biographical record are :

Name (of biographee) (IT 0)
Date of birth
Date of death
Period
Life Sketch (R, IT 4)
Remarks (R, IT 4)

The size of a life sketch may range from 500 to 4500 words. A large record is split into two or more records, but the related records appear as a single record in the display in the OM Information Service.

3.2.3 Bibliographical Record

The MIBIS (IDRC) record format is used in the CDS/ISIS bibliographical record input.

4. MULTIMEDIA: SEARCH AND RETRIEVAL

4.1 Introductory Texts

The opening screen presents, among other things, the main menu.

Shri C. Subramaniam, President, Bharatiya Vidya Bhavan, provides an introduction to the use of the Service. His voice can be heard, while the text of his speech, and his photo are concurrently displayed.

In the few screens that follow, the text of an overview of the OM Information Service provided by A. Neelameghan is displayed.

4.2 User-friendly Search Procedure

A user-friendly search interface that can be used even by persons without a 'computer background' is provided. A user has only to follow the help cues given in each screen display, and is asked simply to click on appropriate items (buttons, boxes, icons) in the menus and sub-menus displayed, to type a word or its first few characters, or to select the desired concept term from a displayed alphabetical index list. Users need not know the intricacies of Boolean search expressions.

4.3 Background Music

An option is provided to play some background devotional music while one is working with the system - Indian and Western devotionals, Buddhist chants, or Koran recitation. By clicking in the appropriate box in a menu, a user can select a desired piece, and also change to another piece while playing a piece. A brief note on each piece of music is also displayed.

4.4 Something to Reflect On

When the user clicks in the box 'Something to Reflect On' in the opening menu, a quotation, saying, verse, or statement on a concept to reflect on is displayed. This can also serve as a starting point for further searches. The system selects the extract at random from the corresponding file.

4.5 Searching by Concept (s)/Topic (s)

The retrieval system is designed principally for searching by one or more concept terms. One may start by typing, in the blank box displayed when the system asks for it, the first few characters of the subject or topic one is interested in. For example, if one is interested in retrieving texts on COMPASSION, type in COMPAS (upper or lower case letters). An alphabetical list of terms (i.e. the index) starting with the string COMPAS will be displayed, and the user can select the particular term or terms representing his/her subject interest by moving the cursor to the term(s). The terms will be ORed. The number of hits will be indicated. The records retrieved may then be displayed, or one may further restrict the search by adding one or more other concept terms (e.g. poor or God). The search can also be restricted by adding one or more Source names (e.g. Buddha, Mother Teresa, Sogyal Rinpoche or Aurobindo Ghose), selected from the index list of Source names. The concept term(s) and the Source name(s) will be ANDed, but the Source names will be ORed among themselves. Example:

Compassion : poor
(compassion + poor) * (Buddha, Siddhartha Gautama + Teresa of Calcutta, Mother)

The particular format of the source name is selected from the Source name index displayed.
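The combination rule described above can be illustrated with a minimal Python sketch. It is not the Service's C program: the function names, the dictionary-of-sets index layout, and the record numbers are assumptions made only for illustration. The expression is formatted in the notation of the example above, where '+' stands for OR and '*' for AND.

```python
def build_expression(concept_terms, source_names):
    """Format the selection in the notation of the example: '+' = OR, '*' = AND."""
    expression = "(" + " + ".join(concept_terms) + ")"
    if source_names:
        expression += " * (" + " + ".join(source_names) + ")"
    return expression


def evaluate(concept_index, source_index, concept_terms, source_names):
    """Return the sorted record numbers that satisfy the selection."""
    hits = set()
    for term in concept_terms:          # concept terms are ORed
        hits |= concept_index.get(term.upper(), set())
    if source_names:                    # Source names are ORed, then ANDed in
        by_source = set()
        for name in source_names:
            by_source |= source_index.get(name.upper(), set())
        hits &= by_source
    return sorted(hits)


# Hypothetical postings: index term -> set of record numbers.
concept_index = {"COMPASSION": {1, 2, 5}, "POOR": {2, 7}}
source_index = {
    "BUDDHA, SIDDHARTHA GAUTAMA": {1, 9},
    "TERESA OF CALCUTTA, MOTHER": {2, 7},
}
concepts = ["compassion", "poor"]
sources = ["Buddha, Siddhartha Gautama", "Teresa of Calcutta, Mother"]
print(build_expression(concepts, sources))
print(evaluate(concept_index, source_index, concepts, sources))
```

Representing each posting list as a set keeps the OR and AND steps to plain set union and intersection; the number of hits reported to the user is then simply the length of the returned list.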

One may also extend the coverage of a concept term by adding synonymous, near-synonymous or other related terms to the starting search term. Example : If the starting search term is liberation, one may also search with salvation, freedom, moksha, nirvana, etc. These will be ORed terms. A thesaurus to assist in this process is under preparation.

One can also move from one type of record to another by clicking in the appropriate box on the screen. For instance, while perusing an extract on the concept ABSOLUTE, say as propounded by Swami Vivekananda, one can use the name Vivekananda (selected from the index) as a search term, as was done before, and a brief biographical sketch and a picture of Swami Vivekananda will be displayed. If the user then wishes to get references to books relating to Swami Vivekananda, a short list can be retrieved by using the name Vivekananda (selected from the index, if necessary) as a search term. The cues/hints on the screen help in the process.

A facility is also provided to print the retrieved records, either all or selected ones.

5. DATABASE ORGANIZATION AND CGI PROGRAM FOR WEB ACCESS

5.1 Text Files

The database is organized on the server for Web access in three different text files, generated from the corresponding CDS/ISIS databases (OM, OMBIO and OMBIB) by printing all the records into text files using the appropriate print format. This way of maintaining records in text files helps in retaining the same display formats as defined for the CDS/ISIS databases.

5.2 Index

The inverted or index files are generated by extracting terms from the text files, largely conforming to the parameters prescribed in the dbn.FST files of the CDS/ISIS databases. For the Extracts or OM database two separate index files were constructed : one for the terms extracted from the fields (except the Source field) as indicated above, and the other for the names of Sources. The Source names, it may be noted, are also the access points for the OMBIO and OMBIB records.
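As a rough illustration of the two index files, the following Python sketch builds a word index from the Heading, Text, Context and Notes fields (word-by-word extraction, as in CDS/ISIS indexing technique 4) and a separate Source name index in which each name is kept whole (as in technique 0). The record layout as a dictionary of lists, the field keys, and the upper-casing of terms are assumptions for the sketch; the paper does not describe the actual file formats.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[A-Za-z0-9]+")
WORD_FIELDS = ("heading", "text", "context", "notes")   # hypothetical field keys


def build_indexes(records):
    """Return (concept_index, source_index), each mapping a term to record ids."""
    concept_index = defaultdict(set)
    source_index = defaultdict(set)
    for record in records:
        record_id = record["id"]
        for field in WORD_FIELDS:
            for occurrence in record.get(field, []):     # repeating fields
                for word in WORD.findall(occurrence):    # one entry per word
                    concept_index[word.upper()].add(record_id)
        for name in record.get("source", []):            # whole name is one entry
            source_index[name.strip().upper()].add(record_id)
    return concept_index, source_index


# Hypothetical sample record.
sample = [{
    "id": 1,
    "heading": ["Compassion"],
    "text": ["Be kind to the poor."],
    "source": ["Teresa of Calcutta, Mother"],
}]
concepts, sources = build_indexes(sample)
print(sorted(concepts))
print(sorted(sources))
```

Keeping the Source names in their own index is what allows the same name string to serve as the access point for the OMBIO and OMBIB records as well.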

5.3 Hyperlinks

The simple user interface for searching by concept term in the Extracts database has been outlined in Sec. 4.5. Records will be displayed according to a predefined/preselected format. Hyperlinks that refer to the related biographical and bibliographical records are embedded in the extract records displayed. The user can click on a hyperlink to view biographical or bibliographical details. A biographical record also carries a photograph of the Source person, wherever available, in GIF format. The picture can be displayed along with the text of the biography. By clicking on the Back button of the browser, the user can get back to the Extract record.
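A hedged Python sketch of how such links might be embedded when an extract is rendered is given below. The CGI path `/cgi-bin/ois`, the query parameter names `db` and `name`, and the link labels are hypothetical; only the idea of linking the Source name to the OMBIO and OMBIB records comes from the description above.

```python
from html import escape
from urllib.parse import urlencode

CGI_URL = "/cgi-bin/ois"   # hypothetical path of the CGI program


def render_extract(text, source_name):
    """Return HTML for one extract with links to the related records."""
    links = []
    for database, label in (("OMBIO", "Life sketch"), ("OMBIB", "Bibliography")):
        query = urlencode({"db": database, "name": source_name})
        # escape() turns the '&' between parameters into '&amp;' for valid HTML
        links.append(f'<a href="{CGI_URL}?{escape(query)}">{label}</a>')
    return (
        f"<p>{escape(text)}</p>\n"
        f"<p>Source: {escape(source_name)} [{' | '.join(links)}]</p>"
    )


print(render_extract("Arise, awake.", "Vivekananda, Swami"))
```

Because the Source name itself is the link key, the sketch works only if the name is rendered identically in all three databases, which is the consistency requirement discussed in Sec. 6.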

5.4 Common Gateway Interface Program

The above user interaction, search and retrieval are handled by a Common Gateway Interface (CGI) program written in C. The requirements of the CGI program are :

  • Display index terms from the dictionary

  • Allow browsing in the dictionary/index

  • Formulate search expression after the user selects concept terms and/or source names

  • Retrieve the hit records

  • Get the image files for Sources where available

  • Embed the hyperlinks for biographical and bibliographical details

  • Display the retrieved records.
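The first two requirements, displaying and browsing the dictionary, can be sketched in Python as a binary search over an alphabetically sorted term list. This is an illustration under assumptions, not the C implementation: the function name, the page size, and the sample terms are invented for the example.

```python
from bisect import bisect_left


def browse(sorted_terms, prefix, page_size=10):
    """Return up to page_size index terms that start with the typed prefix."""
    key = prefix.upper()
    start = bisect_left(sorted_terms, key)     # first term >= the prefix
    page = []
    for term in sorted_terms[start:start + page_size]:
        if not term.startswith(key):           # past the last matching term
            break
        page.append(term)
    return page


# Hypothetical dictionary, already in alphabetical order.
terms = ["COMPANION", "COMPASSION", "COMPASSIONATE", "COMPLETE", "GOD"]
print(browse(terms, "compas"))
```

Since the terms are sorted, all entries sharing a prefix are adjacent, so the lookup needs one binary search followed by a short sequential scan.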

The HTTP (Hyper Text Transfer Protocol) used by the Web server and the browser for communication is a stateless protocol. Therefore, neither the server nor the browser keeps a record of the transactions between them after a request has been processed. In the present application, maintenance of continuity in the interaction between the browser and the server during a search session is an important consideration. This is achieved through a number of HTML forms. Except for the first form, all the forms are dynamically generated by the CGI program. To maintain continuity, or keep track of previous steps in a session, Input Type Hidden fields of HTML are used.
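The hidden-field technique can be sketched in Python as follows. Each dynamically generated form carries the selections made so far as hidden inputs, so that the next request returns them to the server together with the new choices. The form action, the field names `selected` and `term`, and the use of checkboxes are assumptions for illustration only.

```python
from html import escape
from urllib.parse import parse_qs


def render_form(action, state, index_terms):
    """Return an HTML form that carries earlier selections as hidden fields."""
    lines = [f'<form method="post" action="{escape(action)}">']
    for name, values in state.items():         # state from earlier steps
        for value in values:
            lines.append(
                f'<input type="hidden" name="{escape(name)}" value="{escape(value)}">'
            )
    for term in index_terms:                   # choices for the current step
        lines.append(
            f'<input type="checkbox" name="term" value="{escape(term)}"> {escape(term)}<br>'
        )
    lines.append('<input type="submit" value="Search">')
    lines.append("</form>")
    return "\n".join(lines)


def read_state(request_body):
    """Recover both hidden and newly selected fields from a submitted form."""
    return parse_qs(request_body)


form = render_form("/cgi-bin/ois", {"selected": ["COMPASSION"]}, ["POOR", "GOD"])
print(form)
print(read_state("selected=COMPASSION&term=POOR"))
```

The server keeps nothing between requests in this scheme; everything it needs to continue the session arrives in the submitted form.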

6. SOME INDEXING ISSUES

In a manual system as well as in a machine-readable database system, the problems relating to the choice and rendering of names of persons, corporate bodies, subjects and other entities are common and of a similar nature : spelling variations, plural-singular choice, synonyms, grammatical variations, etc., are examples. Authority lists and vocabulary control tools such as classification schemes, thesauri, and lists of subject headings are often used for data entry and in the search process, manually or online.

In the OIS, the name of a person can occur in the Extracts database (OM, in the Source field), in the Biography database (OMBIO, in the Name of person/source field), and also in the Bibliography database (OMBIB, in the Author field). As the Source names are hyperlinked, they should be rendered in a consistent and uniform format in the different databases to ensure successful retrieval. Provision of cross references may be necessary in the case of a popular name vis-a-vis the corresponding 'real' name, and of a pseudonym vis-a-vis the corresponding 'real' name.

As mentioned earlier, a thesaurus of concept terms is being constructed, using, among other things, the terms occurring in the OM records. In selecting related concept terms, different types of relations may be found between two terms, depending upon the viewpoint of the different authors. This may become clear to the user when he/she peruses the extract and, in some cases, the contents of the Context and/or Notes fields.

The OIS uses only English language extracts, even if the original text was in a different language, for example, Tamil or Sanskrit. Some authors give the translated term in English, others the transliterated term, for the original (e.g. Nirvana, Moksha, liberation, salvation; Samsara, Karma, Dharma, etc.). Some of the translated terms may not be exact equivalents of the corresponding original Sanskrit or other language terms. All such variations have to be taken note of in constructing the vocabulary control tool and the concept indexes, so as to secure satisfactory retrieval by, and convenience to, different users.

7. Acknowledgement

Several of the Source materials for the extracts were gifted by the Ramakrishna Mission in Mysore and Bangalore. We are grateful to the Mission.

Our thanks are due to Shri Sukdev Singh, Bhai Gurdas Library of the Guru Nanak Dev University, Amritsar, for making available pictures of the Sikh Gurus.

DESIDOC's partial financing of the project is gratefully acknowledged.