Using a Computer in Biblical and Theological Studies

Lesson 6: Computer-Assisted Bible Study, Part 2

Dr. Harry Hahne, Tyndale Seminary, Toronto

Copyright © 1996-1999 Harry Hahne


Contents:

  1. Grammatical Searches with a Bible Program
  2. Machine-Readable Original Language Bible Texts
  3. Variations in Search Results with Bible-Search Programs
  4. Historical and Theological Texts With Search Software
  5. Using Text Analysis Software to Study the Bible and Theological Texts
  6. Recommended Reading

Grammatical Searches With a Bible-Search Program:

What is a Grammatical Search?

A grammatical search finds all instances of a particular Greek or Hebrew grammatical construction or a part of speech, such as perfect imperative verbs. Grammatical searches have many uses:

What is a Tagged Text?

In order to do grammatical searches, you must have a morphologically tagged text of the Bible in Greek or Hebrew. A tagged text attaches parsing, lemmas (dictionary forms) and sometimes word definitions to each Greek and Hebrew word in the biblical text. This allows sophisticated stylistic and grammatical searches that would be impossible with the biblical text alone.

In a tagged text, every word is marked with grammatical information (e.g. tense, voice, mood of verbs, case, gender, number of nouns, etc.). Usually the lemma (dictionary form) and sometimes the meaning of the words is also included.

Why is a tagged text important?

A Greek or Hebrew Bible text alone is not enough for doing serious study of the Bible in the original languages.

Greek words change their spelling (morphology) based on the function of the word in a sentence. Thus if a noun is the direct object (accusative case) it will be spelled differently than if it is the subject (nominative). English does the same thing to a lesser extent, particularly with pronouns (I, me, mine, my) and verbs (go, going, went). Thus if you want to find a word that is a direct object (accusative) in a Greek sentence, you cannot search for the dictionary form of the word in an untagged Bible text such as those in the Online Bible.
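
The difference can be sketched in a few lines of Python. This is only an illustration, assuming a hypothetical token record with three fields (surface form, lemma, case) and a made-up sample of four transliterated forms of logos; real tagged texts use their own formats and far richer tags.

```python
from typing import NamedTuple

class Token(NamedTuple):
    surface: str  # inflected form as it appears in the text
    lemma: str    # dictionary form
    case: str     # simplified tag; real texts record much more

# Hypothetical sample: inflected forms of the noun logos ("word").
TOKENS = [
    Token("logos", "logos", "nominative"),
    Token("logon", "logos", "accusative"),
    Token("logou", "logos", "genitive"),
    Token("logon", "logos", "accusative"),
]

def surface_search(tokens, word):
    """Untagged search: matches the spelling only."""
    return [t for t in tokens if t.surface == word]

def tagged_search(tokens, lemma, case):
    """Tagged search: matches the dictionary form plus a grammatical tag."""
    return [t for t in tokens if t.lemma == lemma and t.case == case]

# Searching the spelling "logos" misses every accusative form.
print(surface_search(TOKENS, "logos"))
# Searching lemma + case finds both accusatives, however they are spelled.
print(tagged_search(TOKENS, "logos", "accusative"))
```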

Many Hebrew words are composed of several syntactical units (morphemes), such as a conjunction, preposition, article and noun. For example, be-re-shith in Gen 1:1 consists of a preposition and noun and wa-ha-eretz in Gen 1:2 consists of a conjunction, article and noun. It is essential to be able to search each component part separately to do accurate searches in Hebrew.
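
A rough Python sketch of what "searching each component separately" means. The data layout (a list of morpheme/part-of-speech pairs per word) is an assumption made for illustration and does not reflect the format of any particular Hebrew database.

```python
# Each written word is stored as a list of (morpheme, part_of_speech) pairs.
# The segmentation follows the two examples given above.
WORDS = [
    ("Gen 1:1", [("be", "preposition"), ("reshith", "noun")]),
    ("Gen 1:2", [("wa", "conjunction"), ("ha", "article"), ("eretz", "noun")]),
]

def find_pos(words, pos):
    """Return (reference, morpheme) for every morpheme with the given tag."""
    return [
        (ref, morpheme)
        for ref, morphemes in words
        for morpheme, tag in morphemes
        if tag == pos
    ]

print(find_pos(WORDS, "preposition"))
print(find_pos(WORDS, "noun"))
```

If each written word carried only a single tag, one of these two searches would necessarily miss be-re-shith.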

What kinds of searches can you do with a morphologically tagged text?

A machine-readable Bible text tagged with morphological, syntactical and lexical information allows some very exciting searches of the Bible. You can find:

For example, you might be studying Mt. 18:18 and find the unusual expression "whatever you bind on earth will be bound in heaven". You would start with word studies of "binding" and "loosing". In addition, you might do a grammatical study of "will be bound". In Greek this is a future perfect periphrastic, which literally means "will have been bound". This unusual construction consists of a future of the verb eimi ("to be") and a perfect participle. A grammatical search finds several other NT examples of this type of expression, including Mt. 16:19; Lk. 12:52; Heb. 2:13. A more complete search would also look in the Septuagint for similar constructions.

Grammatical Search Capabilities of Various Bible Programs

New Testament Greek

Bible-search programs vary considerably in their ability to do New Testament grammatical searches. The following is a relative rating of search flexibility, from greatest to least:

  1. Gramcord for Windows/Bible Companion, Gramcord for DOS, Accordance (Macintosh): Incredible power and flexibility to perform nearly any type of grammatical analysis. The Macintosh version could also be used for structural studies. Despite their power, both the Windows and Macintosh versions are easy to use.

  2. Bible Windows and TheWord: Moderate search capabilities, but not nearly as powerful as Gramcord or Accordance. Each has some limitations that prevent certain types of searches.

  3. Logos 2 and Bible Works: Less flexible, but generally adequate for non-scholarly use. The grammatical searches are harder to set up than Gramcord for Windows, Accordance and Bible Windows.

  4. QuickVerse: Only allows simple searches on a single part of speech, primarily useful for beginning Greek students and laypersons.

  5. Logos 1.6 and the TVM module in Logos 2. This is based on the Textus Receptus and Strong's numbers, even though it may display the NA26 text.

Septuagint (LXX) Greek Old Testament Translation

Gramcord, Accordance, Bible Works, Bible Windows, and Logos 2 can perform grammatical searches of the Septuagint. All use the same tagged text, except for Gramcord, which has considerably more accurate tagging. Gramcord and Accordance allow the most flexible searches.

Hebrew Old Testament

The relative ability of various Bible programs for Hebrew Bible grammatical searches is similar to their Greek New Testament search capabilities. Gramcord and Accordance allow the most flexible searches.

It is important that Hebrew morphemes be parsed and searchable separately. For example, be-re-shith ("in the beginning") should be searchable as a preposition ("in") as well as a noun ("beginning"). Programs vary in their treatment of this tricky issue:

Suggestions for Accurate Grammatical Searching:

  1. Construct your search one element at a time. Test the search with one part and then gradually add additional elements. This checks to see if you are getting reasonable results as you go.

  2. Check your results to see if known matches show up. For example, if you search for future perfect periphrastics and Mt 18:18 does not come up, you know you did something wrong.

  3. Be sure you have selected the correct Bible text. Some programs (e.g. Bible Works and TheWord) include both tagged and untagged texts. They require that you select a morphologically tagged text before performing a grammatical search. If you do a lemma search on an untagged text, you will not find all occurrences of the word. Gramcord automatically switches to the Greek tagged text to do a grammatical search, but it can display the results in any text, even an English translation.

  4. Understand your Bible text. Learn about its tagging philosophy and assumptions. How does it handle ambiguous classifications, multiple morphemes, accents, functional classifications and other complexities? Are there any known tagging errors or unusual classifications?

  5. Understand the limits and capabilities of your search engine. Is it sensitive to word order or letter case? Will it search past periods or verse boundaries? Test your program on known searches to understand it better. Many subtleties are not documented and can only be learned through experience.

  6. Search for all possible permutations of the construction. A thorough search must consider all valid orders of the search terms. For example, a search for genitive absolutes must look for constructions with the genitive noun first as well as constructions with the genitive participle first. A thorough search must also find constructions with functionally equivalent parts of speech. For example, many constructions which call for a noun would be valid with a substantival participle or substantival adjective.

  7. Study the grammatical construction before searching with a Bible program. Read about it in a conventional grammar book to understand what you are looking for. This will help you formulate all possible permutations of the construction and will show you whether your results are reasonable.

  8. Manually eliminate false matches. Even the best software will sometimes produce false matches which must be manually eliminated. Programs such as Gramcord which can exclude intervening terms produce fewer false matches than programs which cannot. Some false matches can only be determined by understanding the sense of the passage. For example, in a search for future perfect periphrastics, Lk 1:45 and 6:40 can only be eliminated by hand, since the fact that the participle functions as a substantive is only indicated by context.
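
Suggestion 2 can be turned into a routine check. The following Python sketch is only one way to do it; the function name and the sample result list are hypothetical.

```python
def missing_known_matches(results, known):
    """Return the known references that a search failed to find, sorted."""
    return sorted(set(known) - set(results))

# Hypothetical output of a future perfect periphrastic search.
results = ["Mt 16:19", "Lk 12:52"]
# References you already know must appear.
known = ["Mt 18:18"]

# A non-empty list means the search definition needs another look.
print(missing_known_matches(results, known))
```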

Machine-Readable Original Language Bible Texts:

Greek New Testament

Most programs suitable for scholarly study of the New Testament use the United Bible Societies 3rd or 4th edition (e.g. Bible Windows, Bible Works, TheWord) or Nestle-Aland 26th or 27th edition Greek texts (e.g. Logos, Gramcord (DOS/Windows), Accordance).

The grammatical tagging systems used for Greek texts differ significantly between programs. The selection of grammatical tags is based on subtle and often unstated assumptions. Although the function of a Greek word is indicated largely by its spelling (its "morphology"), at times the function of a word must be determined by its relation to the context. There is always a tension between purely morphological analysis based on word forms and a more functional analysis based on the interaction of a word with other words in the sentence.

Grammatical tagging schemes range along a spectrum from formal (morphological) to functional classifications. No scheme for classifying Greek words is purely formal or purely functional, since the function of a word is determined both by its morphology and its relation to the context. However, the more a tagging system tends toward the functional end of the spectrum, the more subjective the classifications become.

For example, Bauer's lexicon classifies óu as an adverb of place. However, in 5 instances (Mt 18:20; Rom 4:15; 5:20; 1 Cor 16:6; 2 Cor 3:17), the Friberg text, which is a largely functional system, classifies it as a conjunction, based on the nuances of the word in the context. A user would need to be aware of such functional classifications in order to find every occurrence of a particular part of speech.

There are four major morphologically-tagged Greek NT texts used in Bible-search software:

Here is the "family tree" of various texts and the programs which use these texts:

Hebrew Bible

Most Bible programs which offer the Hebrew Bible use the same machine-readable Hebrew texts. The only variation is Gramcord, which uses a completely revised Hebrew Bible. The following texts are the primary texts in use:

Septuagint (Greek translation of the Hebrew Bible):

All programs with a Septuagint text use Rahlfs' text. It was put in machine-readable form by TLG (Thesaurus Linguae Graecae, University of California Irvine, directed by Theodore F. Brunner).

There are two versions of the grammatically tagged Septuagint text available:

Further Information About Machine-Readable Biblical Texts

For further details about the history of machine-readable Bible texts, see the manual of Bible Works 3, pp. 85-93.

Variations in Search Results with Bible-Search Programs:

Using a computer to search the Bible may lend unwarranted credibility to research. Computer-assisted biblical research is subject to the same errors as traditional research methods and opens up new potential sources of error. Comparative tests of several popular Bible-search programs show that the same searches often produce radically different results.

These variations are due to several major factors:

Researchers who use these tools should be aware of how these potential pitfalls can affect the accuracy of their analysis.

The following discussion focuses on the Greek New Testament, but the principles are applicable to searching the Hebrew Bible and Septuagint.

Differences in the Underlying Biblical Texts

  1. Different Morphologically Tagged Texts

    As has been shown, there is considerable variation in the tagging schemes used in Greek New Testament texts. The Friberg texts use a more functional classification method than other texts. Even the Friberg 2 text still has many functional and unusual classifications. The Gramcord and CCAT texts use largely formal classifications.

    Unfortunately, except for Gramcord, the manuals for popular Bible-search programs rarely discuss the assumptions used in the classification of words. Yet it is essential that researchers understand the nature of the underlying machine-readable biblical text if their analysis of the text is to be meaningful.

    The print edition of the Friberg 1 text has an appendix outlining the criteria used for the tags (Barbara and Timothy Friberg, eds., Analytical Greek New Testament, Grand Rapids: Baker, 1981). Unfortunately there is no similar book explaining the classification philosophy of the revised Friberg text. In many instances TheWord deviates from the Friberg 1 tags, without documenting the differences. No program makes use of more than one of the Friberg multiple classifications of ambiguous words and no program documents the selection criteria.

  2. Database Text and Classification Errors

    Although users assume the accuracy of Bible-search tools, the underlying texts are rarely completely free from error. When the databases are created, the classifications of lemmas (dictionary forms) and grammatical forms are often performed initially by an automatic parsing program. Sometimes the human proofreaders may fail to catch errors.

    Most errors fall into three classes:

    • Errors in the biblical text. Fortunately, errors in the biblical texts are rare, since the texts often derive from machine-readable texts used for typesetting the print editions.

    • Incorrect lemmas attached to words in the text. For example, in Jn 15:26, the Friberg 1 text claims `on comes from eimi. This error was corrected in Friberg 2.

    • Incorrect grammatical classifications. James K. Tauber (jtauber@tartarus.uwa.edu.au) has collected a list of hundreds of errors in the CCAT tags, in lemmas and parsings. He suggests that many of the errors appear to be due to automatic parsing and lemmatization. For example, in Gramcord (DOS 4.1 and Bible Companion 1.1), oikoumene is classified as a participle from the verb oikew, but most other programs classify it the traditional way as a feminine noun. Gramcord reports that this will be fixed in the next release of the database, which Logos 2 already uses.

  3. Functional and Unusual Classifications

    Many tagged texts have some functional or unusual classifications of words which can produce unexpected search results.

    In Gramcord, many foreign words such as hosanna are classified as interjections. However, foreign proper nouns are classified as nouns and parsed by function in context. By contrast, Bible Windows and Bible Works classify hosanna as a particle.

    Conjunctions and particles are particularly difficult words to classify. A beginning user might miss many occurrences of kai if he only searches for the word as a conjunction. Since kai also functions as an adverb in some cases, most programs will sometimes classify it as an adverb. However, as the following chart shows, the classification choices in individual instances vary considerably:

      Program:    Conjunction:                                 Adverb:
                  No subclass:    Copulative:   Correlative:
      GRAMCORD                    8049 words    187 words      656 words
      BWorks      8214 words                                   801 words
                  4727 verses                                  733 verses
      Bwin        750 words (?)                                750 words (?)
      TheWord     5126 verses                                  753 verses
    

    Bible Windows was unable to report the total number of occurrences of kai, because it only allows 750 matches in a search. Since it is hard to predict how a program will classify the word in any given passage, the safest approach is to search for all possible classifications and manually eliminate invalid matches. The Gramcord manual documents how many times each word is classified as a conjunction, particle or adverb, which makes it easier to define searches that will find all occurrences of such words.

    Since the Friberg text (Bible Works and TheWord) attempts to classify many words by function based on discourse analysis, some classifications may be surprising to users. Friberg 1 uses the category of "substantive adjective" to refer to adjectives which are used as nouns in context. For example, agathos ("good") is classified as a substantive adjective in Mt 5:45 ("he makes the sun shine on the evil and the good"). This type of classification affects 4131 occurrences of 1068 words in 3009 verses! While adjectives can certainly function as substantives, the term "substantive adjective" is not a part of speech used by most Greek grammars. It would be easy for a user to accidentally miss many important occurrences of adjectives unless he searches both for "adjectives" and "substantive adjectives". The Friberg 2 text eliminates the substantive adjective classification, but it introduces other surprising functional classifications. For example, in most cases Friberg 2 classifies relative pronouns as adjectives, with an adjective subtype of "relative." It introduces a category of participial imperative (168 occurrences of 120 words in 135 verses) and (7813 occurrences of 1726 words in 4792 verses).

    Functional classifications such as those frequently used in Friberg's text are more subjective than formal classifications. Their value depends largely on the accuracy of the classifier's interpretation of the text. While they appear to be objective raw data, in fact they contain the prior conclusions of another researcher, which tends to skew the search results to fit the classifier's own viewpoint.

  4. Treatment of Classification Ambiguities

    Even the strictest formal classification method must classify certain words by function in context, since the morphology of these words is inconclusive. While in most cases the meaning is clear in the context, in some instances the grammatical classification is subject to scholarly debate. For example, the gender of ponerou could be either neuter or masculine. In Mt 6:13 the meaning is debated: Does the Lord's Prayer ask for deliverance from "evil" (neuter) or "the evil one" (masculine)? Since Bible Windows 2, Gramcord and Accordance classify ponerou in Mt 6:13 as neuter, a search for masculine adjectives will not find the verse. By contrast, TheWord and Bible Works classify the word as masculine and do not allow the word to be found in a search for neuter adjectives! Only Bible Windows 3 acknowledges both possible parsings and allows the word to be found with either search.

    Bible-search programs would be more useful if they marked such words as ambiguous and allowed searching on the multiple classifications. The print version of the Friberg text includes multiple classifications in many instances. However, at this time only Bible Windows 3 allows searching on Friberg's multiple classifications. Although Bible Works and TheWord both remove the multiple parsings in Friberg 1, the documentation does not explain the criteria used to make these choices.

    Gramcord makes a good attempt at handling ambiguous classifications. In many cases, it tags words in multiple ways and flags the ambiguous classification in the resulting concordance. The documentation lists all ambiguous classifications which are used. However, even Gramcord could be improved in this area. For example, it does not include the ambiguous classification of ponerou in Mt 6:13.
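
    The effect of storing one parsing versus several can be sketched in Python. The record layout below (a set of possible genders per word) is an assumption for illustration only; it is not how any of the programs discussed actually stores its tags.

```python
# The disputed adjective in Mt 6:13, tagged two different ways.
single_parse = {"ref": "Mt 6:13", "pos": "adjective", "genders": {"neuter"}}
multi_parse = {"ref": "Mt 6:13", "pos": "adjective", "genders": {"masculine", "neuter"}}

def matches(token, pos, gender):
    """A token matches when any of its recorded genders fits the query."""
    return token["pos"] == pos and gender in token["genders"]

# With a single parsing, one of the two searches silently misses the verse.
print(matches(single_parse, "adjective", "masculine"))
# With multiple parsings, the verse is found by either search.
print(matches(multi_parse, "adjective", "masculine"))
print(matches(multi_parse, "adjective", "neuter"))
```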

Differences in Search Software

Although on the surface most Bible-search programs appear to allow similar searches, there is considerable variation in the search capabilities of programs. Many programs lack the sophistication to perform finely tuned searches. Further, some of the hidden or poorly documented assumptions in the search engines can produce surprising results.

  1. Use of Wildcards

    Wildcards are symbols that indicate that any letter or letters will be accepted at a certain point in a word. Thus a search for "apo*" will find apoluw, apodidwmi and other words which begin with "apo".

    Some programs place a limit on the number of words that can match wildcards. TheWord allows a maximum of 300 words to match a wildcard and may not warn if this limit is exceeded. Logos 1.6 only allows 32 words to match a wildcard. Logos 2 has no limits and even lets you choose multiple words from a pick list matching the wildcards.

    Most programs (e.g. TheWord, Gramcord, Bible Works, Logos) assume that the search expression includes the whole word, unless wildcards are explicitly included. However, Bible Windows uses full word searches for grammatical searches and double wildcard searches for word and phrase searches. (In a double wildcard search, the search letters can be found anywhere within a word.) This inconsistent behavior in Bible Windows can easily confuse users and result in erroneous searches.

    Programs also differ in how they interpret grammatical wildcards. For example, Gramcord finds participles and infinitives in a wildcard search for verbs. In Logos 2, participles and infinitives are not found in a wildcard search for verbs. You must explicitly set up a different set of wildcards for infinitives, participles and finite verbs. Both programs use the Gramcord Greek NT database, but make different search assumptions.
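
    The three word-matching behaviors described above (whole word, trailing wildcard, double wildcard) can be compared with Python's standard fnmatch module. The word list is an illustrative transliterated sample, not taken from any program's lexicon, and the sketch does not reproduce any program's actual search engine.

```python
from fnmatch import fnmatchcase

# Illustrative transliterated word list (assumed for this example).
WORDS = ["apo", "apoluw", "apodidwmi", "sunapothneskw"]

def whole_word(words, term):
    """Whole-word search: the term must equal the entire word."""
    return [w for w in words if w == term]

def wildcard(words, pattern):
    """Wildcard search: '*' stands for any run of letters."""
    return [w for w in words if fnmatchcase(w, pattern)]

print(whole_word(WORDS, "apo"))   # the preposition only
print(wildcard(WORDS, "apo*"))    # words beginning with "apo"
print(wildcard(WORDS, "*apo*"))   # double wildcard: "apo" anywhere in the word
```

    A program that silently applies the double wildcard form will return words the user never intended to find.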

  2. Limits on the Number of Matches

    Bible Windows has an undocumented limit of 750 matches per search. Since there is no error message that warns that the maximum number of matches has been exceeded, this can lead to misleading conclusions.
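
    The danger of a silent cap can be illustrated with a short Python sketch. The function below is hypothetical; it shows how a search routine could report that it stopped early, which is the warning the user needs. When a program gives no such warning, a match count exactly equal to a round limit such as 750 should be treated with suspicion.

```python
def capped_search(items, predicate, limit):
    """Return (matches, truncated); truncated is True if more matches existed."""
    matches = []
    for item in items:
        if predicate(item):
            if len(matches) == limit:
                return matches, True  # at least one further match was dropped
            matches.append(item)
    return matches, False

# Toy data: every item matches, so 1000 real matches exist.
found, truncated = capped_search(range(1000), lambda n: True, 750)
print(len(found), truncated)
```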

  3. Different Ways of Reporting Statistics

    TheWord reports matches in terms of the number of verses which contain the desired construction. Bible Windows and Gramcord report the number of occurrences of the desired construction. Bible Works and Logos report a count of occurrences and verses.
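
    The two statistics are easy to confuse, so here is a small Python illustration using a made-up list of match locations (the reference labels are placeholders, not real search output).

```python
# One entry per match; a verse appears once for every match it contains.
HITS = ["Verse A", "Verse A", "Verse A", "Verse B"]

occurrences = len(HITS)   # what an occurrence-counting program reports
verses = len(set(HITS))   # what a verse-counting program reports

print(occurrences, verses)
```

    The same search therefore yields different totals in different programs even when they find exactly the same matches.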

  4. Word Order Sensitivity

    For some searches, word order is very important. For example, a search for substantival adjectives should find all occurrences of an article in agreement with an adjective only when the article appears just prior to the adjective, not after the adjective. In other cases, it is important to find all permutations of word order. For example, a search for genitive absolutes should allow either the genitive noun or the participle to appear first.

    Programs differ in the importance they place on word order in search expressions. Gramcord requires an exact match of the order of the search elements. However, searches can be defined that include several combinations of word order and distinguish them in the resulting concordance. Bible Works and TheWord do not distinguish the word order of search elements. This can result in many false matches. For example, a search for "men . . . de" finds 10 verses in which the order of words is "de . . . men". Bible Windows is sensitive to word order in grammatical searches but not in word searches, which produces inconsistent search results. By default Logos 2 is not sensitive to word order. However, search expressions can require that certain elements precede or follow others.
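
    A minimal Python sketch of the difference between an order-insensitive and an order-sensitive search. Verses are represented as plain lists of transliterated words, and the two sample verses are invented for illustration.

```python
def unordered(tokens, a, b):
    """Order-insensitive: both words merely have to be present."""
    return a in tokens and b in tokens

def ordered(tokens, a, b):
    """Order-sensitive: some occurrence of a must precede some occurrence of b."""
    if a not in tokens:
        return False
    first_a = tokens.index(a)
    return b in tokens[first_a + 1:]

men_then_de = ["ho", "men", "x", "ho", "de", "y"]  # valid "men . . . de"
de_then_men = ["ho", "de", "x", "ho", "men", "y"]  # reversed order

print(unordered(de_then_men, "men", "de"))  # a false match
print(ordered(de_then_men, "men", "de"))    # correctly rejected
print(ordered(men_then_de, "men", "de"))    # correctly accepted
```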

  5. Duplication of Search Terms

    Many grammatical constructions require that the same search term appear more than once (e.g. "de . . . de"). Gramcord, Bible Windows and TheWord allow the same term to appear more than once. They properly find verses in which the word de occurs twice. However, Bible Works simply finds all verses in which the word de occurs at least once. Logos 2 also finds all verses in which de occurs at least once. Although a search expression can require that a certain word precede another, Logos 2.0a does not accurately execute such a search if the search terms are identical.
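
    One plausible way a search engine ends up with this behavior is by treating the search terms as a set, so that a repeated term collapses into one. The Python sketch below contrasts that with a count-aware comparison. It is a guess at the mechanism for illustration only, not a description of how Bible Works or Logos is implemented.

```python
from collections import Counter

def set_match(tokens, terms):
    """Set-based: repeated search terms collapse into a single requirement."""
    return set(terms) <= set(tokens)

def count_match(tokens, terms):
    """Count-aware: each term must occur at least as often as it was requested."""
    have = Counter(tokens)
    need = Counter(terms)
    return all(have[term] >= n for term, n in need.items())

one_de = ["ho", "de", "eipen"]           # invented verse with a single de
two_de = ["ho", "de", "x", "ho", "de"]   # invented verse with de twice

print(set_match(one_de, ["de", "de"]))    # wrongly accepted
print(count_match(one_de, ["de", "de"]))  # correctly rejected
print(count_match(two_de, ["de", "de"]))  # correctly accepted
```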

  6. Exclude Intervening Terms

    False matches can frequently be eliminated by specifying terms that must not appear between search elements. For example, a future perfect periphrastic construction requires a future tense of eimi and the perfect participle of another verb in the same clause. Since it is highly unlikely that a finite verb will occur between these two search terms, the search can be improved if it specifies that no finite verb can intervene.

    Gramcord and Accordance allow multiple intervening exclusion and inclusion terms. They can specify words and parts of speech that may occur between search terms as well as words and parts of speech that may not occur between search terms. Other programs do not include a true "exclude intervening term" option. At first glance it would appear that the "and not" Boolean operator which is available in Bible Windows, Logos 2 and TheWord would accomplish the same thing. However, the "and not" operator defines what a search term may not be, not the types of words that cannot appear between search terms. Thus this feature can produce undesired interactions between the search terms.

    The following chart illustrates the effect of excluding intervening terms in a search for future perfect periphrastics:

      Source:                               Matches:   Invalid:   Missing:

      Nigel Turner, Syntax, p. 89[1]        6
      GRAMCORD
        Not exclude intervening verbs       12         6          0
        Exclude intervening finite verbs    8          2          0
      BWin
        Not exclude intervening verbs       9[2]       3          0
        Use "and not" indicative verbs      0          0          6
    

    When Gramcord searches without excluding intervening terms, it finds a large number of invalid matches. When Gramcord is set to exclude intervening finite verbs (indicative, subjunctive, optative, imperative), most of the false matches are eliminated. On a search with a larger number of results this could save considerable time.

    Bible Windows has no true exclusion command. When the second search term is set to "and not an indicative verb", there are no matches, because any verse with a future eimi also has a finite verb (i.e. eimi).
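
    The logic of an "exclude intervening term" option can be sketched in Python. The token fields, the transliterated lemmas and the two sample clauses are simplified assumptions; the sketch shows the idea and does not imitate Gramcord's actual search language.

```python
from typing import NamedTuple

FINITE_MOODS = {"indicative", "subjunctive", "optative", "imperative"}

class Tok(NamedTuple):
    lemma: str
    tense: str
    mood: str

def periphrastics(clause, exclude_finite):
    """Find (i, j) pairs: future of eimi at i, perfect participle at j > i."""
    hits = []
    for i, first in enumerate(clause):
        if first.lemma != "eimi" or first.tense != "future":
            continue
        for j in range(i + 1, len(clause)):
            second = clause[j]
            if second.tense != "perfect" or second.mood != "participle":
                continue
            between = clause[i + 1:j]
            if exclude_finite and any(t.mood in FINITE_MOODS for t in between):
                continue  # a finite verb intervenes, so reject this pair
            hits.append((i, j))
    return hits

# Simplified pattern of a genuine periphrastic: future eimi + perfect participle.
genuine = [Tok("eimi", "future", "indicative"), Tok("dew", "perfect", "participle")]
# Invented false match: another finite verb stands between the two terms.
spurious = [
    Tok("eimi", "future", "indicative"),
    Tok("legw", "present", "indicative"),
    Tok("grafw", "perfect", "participle"),
]

print(periphrastics(spurious, exclude_finite=False))  # false match reported
print(periphrastics(spurious, exclude_finite=True))   # false match removed
print(periphrastics(genuine, exclude_finite=True))    # genuine match kept
```

    Note that the exclusion test looks only at the words between the two search terms. An "and not indicative" condition placed on a search term itself would be a different test, which is why it cannot substitute for true exclusion.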

  7. Proximity of Multiple Search Terms

    Many grammatical constructions require that two or more words be in close proximity, though not necessarily side by side. A search program should allow restriction of search expressions to a definable number of words.

    Gramcord allows the user to specify that up to 200 words span from beginning to end of a construction and Bible Windows allows specifying up to 20 words. By default both programs assume that all elements in a search expression are juxtaposed. If this number is not set appropriately many valid examples of a construction will be missed.

    Logos 2 allows you to specify that one word occurs within a certain number of words from another word. By default it assumes that multiple words can occur anywhere in a verse.

    Bible Works has less flexibility than either of these programs. By default, all words must appear somewhere in the same verse. The user can specify a maximum number of verses in which to find the search terms. This is far less valuable for grammatical searches than a limit by number of words, though it can have value for discourse-level research.
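
    A word-based proximity limit can be sketched in Python as follows. The parameter name max_gap (the number of words allowed between the two terms) is an assumption; each program defines its own span setting, and Gramcord, for instance, counts the whole span of the construction rather than the gap.

```python
def within(tokens, a, b, max_gap):
    """True when b follows a with at most max_gap words in between."""
    for i, token in enumerate(tokens):
        if token == a and b in tokens[i + 1:i + 2 + max_gap]:
            return True
    return False

sample = ["men", "x", "y", "de"]  # invented clause: two words intervene

print(within(sample, "men", "de", max_gap=0))  # terms must be juxtaposed
print(within(sample, "men", "de", max_gap=2))  # a wide enough window
```

    With a default that requires juxtaposed terms, a valid construction like the sample is missed until the limit is raised.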

  8. Search Boundaries for Multiple Search Terms

    For grammatical searches it is more valuable to set the search boundaries by sentences or clauses than by verses. A program based on verse boundaries would have difficulty with sentences that span several verses (e.g. Eph. 1:3-14). For discourse analysis, search boundaries should be set at the paragraph, chapter or book level. An ideal program would allow setting search boundaries by clause, sentence, a specific number of verses, paragraph, chapter, book. It would also allow the option of stopping at or ignoring various types of punctuation marks.

    Accordance allows boundaries to be clause, sentence, verse, paragraph, chapter or book. TheWord allows boundaries to be verse, paragraph, chapter or book. Most other programs are more restricted. Bible Windows does not allow specifying boundaries, though it will cross verse boundaries if the word proximity is set high enough. Bible Works and Logos use verses as boundaries, but search expressions can cross verse boundaries if the proximity is set to 2 verses. Gramcord uses the sentence as a boundary, so it is more likely than a verse-oriented program to find all occurrences of a grammatical construction.

    Logos 2 is inconsistent about search boundaries. For searches with Boolean operators (AND, OR, NOT), multiple search terms must occur in the same verse. However, if you specify that one word must occur within a certain number of words of another word (or before or after the word), the default search boundary is a sentence, not a verse.

    Programs differ in how they handle a conflict between the number of words in proximity and the search boundary. Bible Windows will cross verse boundaries in an effort to compare the specified number of words. Gramcord will never cross a full stop (period, semi-colon (raised dot) or question mark), regardless of the maximum number of words allowed in the proximity. This subtle difference can produce significantly different search results. Gramcord misses 12 examples of "men . . . de" constructions that Bible Windows finds because it ends the search at a full stop. In most of these cases, the punctuation is a semi-colon, which indicates that the two clauses are closely related. The choice of a semi-colon rather than a comma is a debatable editorial decision in each instance.
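
    The interaction between a proximity limit and a punctuation boundary can be sketched in Python. Punctuation is represented here as separate ASCII tokens and the sample text is invented; the function illustrates the two policies and is not modeled on either program's code.

```python
FULL_STOPS = {".", ";", "?"}
OTHER_PUNCTUATION = {","}

def men_de(tokens, max_words, stop_at_full_stop):
    """Look for de within max_words words after men."""
    for i, token in enumerate(tokens):
        if token != "men":
            continue
        words_seen = 0
        for following in tokens[i + 1:]:
            if following in FULL_STOPS and stop_at_full_stop:
                break  # the boundary wins over the proximity setting
            if following in FULL_STOPS or following in OTHER_PUNCTUATION:
                continue  # punctuation is not counted as a word
            words_seen += 1
            if words_seen > max_words:
                break
            if following == "de":
                return True
    return False

# Invented text: a semi-colon separates the men clause from the de clause.
text = ["ho", "men", "x", ";", "ho", "de", "y", "."]

print(men_de(text, max_words=20, stop_at_full_stop=True))   # boundary-bound search
print(men_de(text, max_words=20, stop_at_full_stop=False))  # proximity-bound search
```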

  9. Forced Agreement of Grammatical Features

    Many grammatical constructions require either that certain grammatical features agree or not agree. For example, a genitive absolute requires a clause with a genitive noun and a genitive participle that agree in gender and number. If agreement cannot be required between search elements, many false matches must be manually removed.

    Since Logos does not allow specifying agreement of grammatical features, it makes it difficult to find genitive absolutes without substantial manual labor. Bible Windows and Bible Works allow specifying agreement, but they cannot limit the agreement to specific search terms. Thus a search for genitive absolutes is quite simple with Bible Windows or Bible Works, but it is difficult to find constructions in which individual pairs of search terms agree. For example, it would be difficult to find the common Greek construction "article1 article2 noun2 noun1", where article1 and noun1 agree with each other and article2 and noun2 are genitive and agree with each other (e.g. ó tou tektonos úios in Mt 13:55). Gramcord and Accordance allow great flexibility in agreement. Any grammatical feature of selected pairs of words can be required or forbidden to agree. As a result, these programs can find very complex grammatical constructions with relatively few false matches.
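
    Forced agreement amounts to comparing selected tag fields of two search terms. The Python sketch below applies this to a genitive noun and a genitive participle, using a simplified, assumed token record and an invented clause. It covers only the agreement test, not everything needed to identify a true genitive absolute.

```python
from typing import NamedTuple

class Tok(NamedTuple):
    pos: str
    case: str
    gender: str
    number: str

def genitive_pairs(clause, require_agreement):
    """Pair each genitive noun with each genitive participle, in either order."""
    hits = []
    for i, noun in enumerate(clause):
        if noun.pos != "noun" or noun.case != "genitive":
            continue
        for j, part in enumerate(clause):
            if part.pos != "participle" or part.case != "genitive":
                continue
            agrees = (noun.gender, noun.number) == (part.gender, part.number)
            if require_agreement and not agrees:
                continue
            hits.append((i, j))
    return hits

# Invented clause: one participle, one agreeing noun, one unrelated genitive noun.
clause = [
    Tok("participle", "genitive", "masculine", "singular"),
    Tok("noun", "genitive", "masculine", "singular"),
    Tok("noun", "genitive", "feminine", "plural"),
]

print(genitive_pairs(clause, require_agreement=False))  # includes a false match
print(genitive_pairs(clause, require_agreement=True))   # only the agreeing pair
```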

  10. Sensitivity to Diacritical Marks

    TheWord requires that Greek accents and breathing marks be entered as they appear in the biblical text. This makes entry of search expressions tedious and error prone. It also results in missed matches, where the context changes the accents.

    On the other hand, required entry of breathing marks is desirable, since otherwise it is difficult to distinguish similar word pairs such as eis and éis. Bible Windows includes breathing marks in the word pick list for grammatical searches and gives the option of including them in word searches. If they are not included in word searches, breathing marks are ignored. With Gramcord ambiguities can be resolved by specifying the desired part of speech. Bible Works has no way to easily distinguish óu (an adverb of place) from ou (the negative particle), since it classifies both as adverbs. Logos 2 ignores diacritical marks in word searches. A morphological search is necessary to distinguish words which only differ by diacritical marks.
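
    For texts stored in Unicode Greek, the two policies (ignore all diacritics, or ignore accents but keep breathing marks) can be sketched with Python's standard unicodedata module. This is an illustration only; the programs discussed here predate Unicode and use their own encodings. The sample words are written with escape codes so the exact characters are unambiguous: eis (the preposition) has a smooth breathing, while heis (the numeral "one") has a rough breathing and a circumflex.

```python
import unicodedata

# Combining grave, acute and Greek circumflex (perispomeni).
ACCENTS = {"\u0300", "\u0301", "\u0342"}

def strip_all_marks(word):
    """Remove every combining mark, including breathing marks."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def strip_accents_keep_breathing(word):
    """Remove accents only, so breathing marks still distinguish words."""
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(c for c in decomposed if c not in ACCENTS)
    return unicodedata.normalize("NFC", kept)

EIS = "\u03b5\u1f30\u03c2"   # eis, smooth breathing
HEIS = "\u03b5\u1f37\u03c2"  # heis, rough breathing + circumflex

# Ignoring all marks makes the two words identical.
print(strip_all_marks(EIS) == strip_all_marks(HEIS))
# Keeping breathing marks preserves the distinction.
print(strip_accents_keep_breathing(EIS) == strip_accents_keep_breathing(HEIS))
```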

  11. Comparative Results in a Typical Search

    A search for "men . . . de" is a useful case study that illustrates several of these differences in search capabilities of several programs. While the search is simple, it reveals an astonishing variation in search results:

    Source                          Matches    Invalid   Duplicate   Missing

    Nigel Turner, Syntax, p. 332    110
    GRAMCORD
      Intervening de not excluded   112        0         14          12
      Intervening de excluded       98         0         0           12
    BWorks
      Context of 1 verse            104        11        0           17
      Context of 2 verses           157        47        0           0
    BWin
      Word search mode              134        27        0           3
      Grammatical search mode       97         0         0           13
    Logos 2
      Terms up to 20 words apart    667 (!)    0         567 (?)     13
    TheWord
      Context of 1 verse            97         0         0           13

    The different search results of these programs are due to the following factors:

    1. Gramcord: This program misses 12 references that cross a full stop boundary (11 semicolons and 1 period). Gramcord also includes 14 duplicate references in verses that have multiple occurrences of de. If the search is performed with an intervening de excluded, these duplicates are eliminated.

    2. Bible Works: This program includes 11 invalid citations that are due to its inability to distinguish word order. Some occurrences are missing because the default search boundary is one verse. If the search context is set to 2 verses, many more invalid citations are found, although more valid citations are also picked up.

    3. Bible Windows: In word search mode, Bible Windows finds many invalid references due to insensitivity to word order. In grammatical search mode, Bible Windows finds several references missed by Bible Works, since it allows expressions to cross a verse boundary, as long as the specified number of words has not been exceeded.

    4. Logos 2: This search produced very erratic results. The exact number of matches depended on the number of words allowed between the two search terms (the stated result is with up to 20 words allowed between search terms).

    5. TheWord: This program misses some occurrences that extend over more than one verse. It does not find as many invalid citations as Bible Works because it is sensitive to word order.

    In most cases these programs missed citations when the valid reference crossed a verse, sentence or semicolon boundary. They picked up invalid citations because they could not distinguish word order or because one of the search terms occurred more than once in a sentence.
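
    The effect of these factors can be reproduced with a small Python sketch of a proximity search. It is a simplified model written for this lesson, not the algorithm of any of the programs tested. The function names, the parameter names and the token list are hypothetical, and the token list is schematic (w1, w2, ... stand for arbitrary words) and is not a real verse.

```python
PUNCTUATION = {".", ";", ","}


def scan_forward(tokens, start, target, max_gap, stops, nearest_only):
    """Return indexes of `target` found after tokens[start] within the limits."""
    found = []
    gap = 0  # number of words between the two search terms so far
    for j in range(start + 1, len(tokens)):
        tok = tokens[j]
        if tok in stops:
            break              # the boundary takes priority over proximity
        if tok in PUNCTUATION:
            continue           # other punctuation is not counted as a word
        if gap > max_gap:
            break
        if tok == target:
            found.append(j)
            if nearest_only:   # corresponds to excluding an intervening term
                break
        gap += 1
    return found


def find_pairs(tokens, first, second, max_gap=20,
               stops=frozenset({".", ";"}), nearest_only=True, ordered=True):
    """Find (start, end) index pairs for 'first ... second'.

    With ordered=False the terms may occur in either order, which models
    a program that is not sensitive to word order.
    """
    pairs = []
    for i, tok in enumerate(tokens):
        if tok == first:
            target = second
        elif tok == second and not ordered:
            target = first
        else:
            continue
        for j in scan_forward(tokens, i, target, max_gap, stops, nearest_only):
            pairs.append((i, j))
    return pairs


# Schematic token stream, not a real verse.
tokens = ["men", "w1", "w2", ";", "w3", "de", "w4", "de", ".", "de", "w5", "men"]

period_only = frozenset({"."})
print(find_pairs(tokens, "men", "de"))                     # semicolon is a boundary
print(find_pairs(tokens, "men", "de", stops=period_only))  # semicolon may be crossed
print(find_pairs(tokens, "men", "de", stops=period_only, nearest_only=False))
print(find_pairs(tokens, "men", "de", stops=period_only, ordered=False))
```

    On this schematic input the first call finds nothing because the only valid pair crosses a semicolon, the second finds the one valid pair, the third reports it twice because of the second de, and the fourth adds an invalid pair in which de precedes men. These correspond to the missing, duplicate and invalid columns of the table above.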

  12. Summary of Search Capabilities

    The following table summarizes the search capabilities of several programs:

    Feature                     GRAMCORD          BWorks         BWin         BWin         TheWord      Logos 2
                                                                 gram.        word
                                                                 search       search

    Wildcards                   optional          optional       none         implicit     optional     optional
    Match limit                 unlimited         unlimited      750          750          unlimited    unlimited
    Statistics reported         occurrences       occurrences    occurrences  occurrences  verses       occurrences
                                                  and verses                                            and verses
    Word order sensitivity      yes               yes            yes          no           no           optional
    Allow duplicate terms       yes               no             yes          no           yes          no
    Exclude intervening terms   yes               no             no           no           no           no
    Proximity type              words             verses         words        words        verse,       words or
                                                                                           paragraph,   verses
                                                                                           chapter,
                                                                                           book
    Proximity limit             200               unlimited      20           20           1            unlimited
    Boundary type               full stop         specified      none         none         verse,       full stop
                                (period,          number of                                paragraph,
                                question mark,    verses                                   chapter,
                                semicolon)                                                 book
                                or weak stop
                                (comma, colon)
    Priority                    boundary          proximity      proximity    proximity    proximity    boundary
    Agreement of grammatical    any               none           all search   none         none         none
    features                    combination                      terms
                                of search terms
    Diacritical marks           ignored           ignored        required     optional     required     ignored

Errors When Using the Software

The suggestions described earlier under "suggestions for accurate grammatical searching" will help to avoid common mistakes in using Bible-search software.


Historical and Theological Texts with Search Software:

Some historical and theological texts are available with integrated search software. These enable advanced study of historical texts in much the same manner as computer-assisted study of biblical texts. The best programs allow full Boolean searches. Some examples include:

Using Text Analysis Software to Study the Bible and Theological Texts:

Types of analysis

General purpose text analysis programs can be used for analyzing biblical and theological texts. Programs are available that allow textual studies such as:

Some Texts You May Want to Study:

Obtaining Texts to Study

Many biblical and non-biblical texts are available over the Internet. Usually there is no charge if the texts are used for personal research. Here are some useful places to look:

Some Useful Text Analysis Programs

A description of several general purpose text analysis programs is available from http://info.ox.ac.uk/departments/humanities/general.html. Some of the more useful ones are:

Typical Text Analysis Studies

Writing Your Own Bible Analysis Software

Advantages

There are many reasons you might want to create your own software for studying the Bible:

General-purpose programming languages such as C, Pascal and Basic can be used to create text analysis programs. There are also special languages such as ICON, SNOBOL and IBYX which are designed for processing texts. These languages have features that make it easy to perform common text manipulation tasks, such as pattern matching and string substitution.
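
As a rough illustration of how little code such tasks require, here is a minimal sketch of a word-frequency count and a keyword-in-context (KWIC) listing built on pattern matching. Python is used only as an example of a general-purpose language with built-in pattern matching; it is not one of the languages named above, and the function names are hypothetical. The sample text is Genesis 1:1 in the King James Version.

```python
import re
from collections import Counter

WORD = re.compile(r"[^\W\d_]+")  # a run of letters (no digits or underscores)


def tokenize(text):
    """Split a text into lowercase words using pattern matching."""
    return [m.group(0).lower() for m in WORD.finditer(text)]


def word_frequencies(text):
    """Count how often each word occurs."""
    return Counter(tokenize(text))


def kwic(text, keyword, width=2):
    """List each occurrence of keyword with `width` words of context."""
    words = tokenize(text)
    lines = []
    for i, word in enumerate(words):
        if word == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append(f"{left} [{word}] {right}".strip())
    return lines


text = "In the beginning God created the heaven and the earth."
print(word_frequencies(text).most_common(1))
for line in kwic(text, "the", width=1):
    print(line)
```

A real concordance program would also need to handle verse references, the Greek and Hebrew alphabets and lemmatization, none of which is attempted here.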

Where to get algorithms and sample code

Several books and journals contain algorithms and basic programs for text analysis software. Some helpful starting places are:

Some Sources of Information on Computerized Analysis of Texts

Anyone interested in the computer-assisted study of the Bible should be aware of projects in humanities computing. Many of the techniques applied to the study of modern literature could easily be adapted to the study of the Bible.

Here are some useful starting places for information on computer-assisted literary research:


Reading Assignment:

Required Reading

  1. Harry Hahne, "Interpretive Implications of Using Bible-Search Software for New Testament Grammatical Analysis," presented at the Annual Meeting of the Evangelical Theological Society, Nov. 24, 1994. Discusses how differences in tagged machine-readable New Testament texts, search assumptions of Bible programs and common user errors can affect the accuracy of search results of Bible programs. (http://www-writing.berkeley.edu/chorus/bible/essays/ntgram.html)

Recommended Reading

  1. Susan Hockey, A Guide to Computer Applications in the Humanities. Baltimore: Johns Hopkins University, 1980. Although the discussion of specific hardware and software is dated, this is an excellent introduction to the major issues in analyzing a text with a computer. It discusses issues such as encoding machine-readable texts, word studies, concordances, dictionaries, morphological and syntactical analysis, stylistic analysis, authorship studies, textual criticism, sound patterns and indexing texts.

  2. John R. Abercrombie, Computer Programs for Literary Analysis. Philadelphia: University of Pennsylvania Press, 1984. Discusses algorithms and presents sample programs in Basic and Pascal for textual analysis. Topics include indexing and concordance generation, textual criticism, searching algorithms and morphological analysis.

  3. Willard McCarty discusses text analysis and concordance generating with TACT (http://ilex.cc.kcl.ac.uk/toronto/1001h/06textr.html). Follow the links at the bottom of each page to additional discussions on this topic.

Footnotes:

  1. James Hope Moulton, A Grammar of New Testament Greek, vol. 3, Syntax, by Nigel Turner (Edinburgh: T & T Clark, 1963), p. 89.

  2. Since this program reports the number of matching verses, not constructions, there may be more than one matching construction in the same verse.