debian/0000755000000000000000000000000011741121015007157 5ustar debian/patches/0000755000000000000000000000000011721445631010621 5ustar debian/patches/10-manpage-th.patch0000755000000000000000000000067311721407562014113 0ustar From: Daniel Baumann Subject: Fix typo in manpage diff -Naur agrep-4.17.orig/agrep.1 agrep-4.17/agrep.1 --- agrep-4.17.orig/agrep.1 1999-11-03 20:37:06.000000000 +0000 +++ agrep-4.17/agrep.1 2005-12-20 13:39:59.101562480 +0000 @@ -1,4 +1,4 @@ -.TH AGREP l "Jan 17, 1992" +.TH AGREP 1 "Jan 17, 1992" .SH NAME agrep \- search a file for a string or regular expression, with approximate matching capabilities .SH SYNOPSIS debian/patches/15-manpage-url.patch0000644000000000000000000000120711721445631014275 0ustar From: Jari Aalto Subject: Point path somewhere else: /usr/dict not in not FHS. --- agrep.1 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/agrep.1 +++ b/agrep.1 @@ -298,7 +298,7 @@ distance, by YZ, with up to one additional insertion (\-D2 and \-S2 make deletions and substitutions too "expensive"). .TP -agrep \-5 \-p abcdefghij /usr/dict/words +agrep \-5 \-p abcdefghij /path/to/dictionary/words outputs the list of all words containing at least 5 of the first 10 letters of the alphabet \fIin order\fR. (Try it: any list starting with academia and ending with sacrilegious must mean something!) debian/patches/01-makefile.patch0000755000000000000000000000064611721407570013646 0ustar From: Anreas Jochens Subject: Fix FTBS on AMD64 diff -Naur agrep-4.17.orig/Makefile agrep-4.17/Makefile --- agrep-4.17.orig/Makefile 2003-10-14 09:58:05.000000000 +0000 +++ agrep-4.17/Makefile 2005-12-27 15:55:52.872184392 +0000 @@ -19,7 +19,7 @@ ISO_CHAR_SET = 1 # You might have to change this depending on your machine configuration. 
-CC = gcc -march=i486 +CC = gcc SHELL = /bin/sh debian/patches/12-manpage-hyphen.patch0000644000000000000000000001122111721410445014753 0ustar From: Jari Aalto Subject: Fix hyphes in manpage --- agrep.1 | 32 ++++++++++++++++---------------- 1 file changed, 16 insertions(+), 16 deletions(-) --- a/agrep.1 +++ b/agrep.1 @@ -7,7 +7,7 @@ .B \-#cdehiklnpstvwxBDGIS ] .I pattern -[ -f +[ \-f .I patternfile ] [ @@ -20,7 +20,7 @@ for records containing strings which either \fIexactly\fP or \fIapproximately\fP match a pattern. A record is by default a line, but it can be defined differently using -the -d option (see below). +the \-d option (see below). Normally, each record found is copied to the standard output. Approximate matching allows finding records that contain the pattern with several errors including substitutions, insertions, and @@ -28,7 +28,7 @@ For example, Massechusets matches Massachusetts with two errors (one substitution and one insertion). Running .B agrep --2 Massechusets foo outputs all lines in foo containing any string with +\-2 Massechusets foo outputs all lines in foo containing any string with at most 2 errors from Massechusets. .LP .B agrep @@ -89,7 +89,7 @@ permitted in finding the approximate matches (defaults to zero). Generally, each insertion, deletion, or substitution counts as one error. It is possible to adjust the relative cost of insertions, -deletions and substitutions (see -I -D and -S options). +deletions and substitutions (see \-I \-D and \-S options). .TP .B \-c Display only the count of matching records. @@ -103,7 +103,7 @@ a regular expression. Text between two \fIdelim\fP's, before the first \fIdelim\fP, and after the last \fIdelim\fP is considered as one record. -For example, -d '$$' defines paragraphs as records and -d '^From\ ' +For example, \-d '$$' defines paragraphs as records and \-d '^From\ ' defines mail messages as records. .B agrep matches each record separately. 
@@ -135,14 +135,14 @@ .TP .B \-k No symbol in the pattern is treated as a meta character. -For example, agrep -k 'a(b|c)*d' foo will find +For example, agrep \-k 'a(b|c)*d' foo will find the occurrences of a(b|c)*d in foo whereas agrep 'a(b|c)*d' foo will find substrings in foo that match the regular expression 'a(b|c)*d'. .TP .B \-l List only the files that contain a match. This option is useful for looking for files containing a certain pattern. -For example, " agrep -l 'wonderful' * " will list the names of those +For example, " agrep \-l 'wonderful' * " will list the names of those files in current directory that contain the word 'wonderful'. .TP .B \-n @@ -179,7 +179,7 @@ surround the match; they cannot be counted as errors. For example, .B agrep --w -1 car will match cars, but not characters. +\-w \-1 car will match cars, but not characters. .TP .B \-x The pattern must match the whole line. @@ -289,37 +289,37 @@ .SH EXAMPLES .LP .TP -agrep -2 -c ABCDEFG foo +agrep \-2 \-c ABCDEFG foo gives the number of lines in file foo that contain ABCDEFG within two errors. .TP -agrep -1 -D2 -S2 'ABCD#YZ' foo +agrep \-1 \-D2 \-S2 'ABCD#YZ' foo outputs the lines containing ABCD followed, within arbitrary distance, by YZ, with up to one additional insertion -(-D2 and -S2 make deletions and substitutions too "expensive"). +(\-D2 and \-S2 make deletions and substitutions too "expensive"). .TP -agrep -5 -p abcdefghij /usr/dict/words +agrep \-5 \-p abcdefghij /usr/dict/words outputs the list of all words containing at least 5 of the first 10 letters of the alphabet \fIin order\fR. (Try it: any list starting with academia and ending with sacrilegious must mean something!) .TP -agrep -1 'abc[0-9](de|fg)*[x-z]' foo +agrep \-1 'abc[0-9](de|fg)*[x-z]' foo outputs the lines containing, within up to one error, the string that starts with abc followed by one digit, followed by zero or more repetitions of either de or fg, followed by either x, y, or z. 
.TP -agrep -d '^From\ ' 'breakdown;internet' mbox +agrep \-d '^From\ ' 'breakdown;internet' mbox outputs all mail messages (the pattern '^From\ ' separates mail messages in a mail file) that contain keywords 'breakdown' and 'internet'. .TP -agrep -d '$$' -1 ' ' foo +agrep \-d '$$' \-1 ' ' foo finds all paragraphs that contain word1 followed by word2 with one error in place of the blank. In particular, if word1 is the last word in a line and word2 is the first word in the next line, then the space will be substituted by a newline symbol and it will match. Thus, this is a way to overcome separation by a newline. -Note that -d '$$' (or another delim which spans more than one line) +Note that \-d '$$' (or another delim which spans more than one line) is necessary, because otherwise agrep searches only one line at a time. .TP debian/patches/series0000644000000000000000000000012311721445307012032 0ustar 01-makefile.patch 10-manpage-th.patch 12-manpage-hyphen.patch 15-manpage-url.patch debian/compat0000644000000000000000000000000211741120512010356 0ustar 9 debian/control0000644000000000000000000000140311741120512010561 0ustar Source: agrep Section: non-free/text Priority: optional Maintainer: Jari Aalto Build-Depends: debhelper (>= 9) Standards-Version: 3.9.3.1 Vcs-Browser: http://git.debian.org/?p=collab-maint/debian.git Vcs-Git: git://git.debian.org/git/collab-maint/debian.git Homepage: http://freshmeat.net/projects/agrep Package: agrep Architecture: any Depends: ${shlibs:Depends}, ${misc:Depends} Description: text search tool with support for approximate patterns agrep is a version of standard grep with the following enhancements: . * the ability to search for approximate patterns * it is record oriented rather than just line oriented * multiple patterns with AND OR logic queries . This package contains glimpse's (4.x) last free version of grep. 
debian/install0000644000000000000000000000001611721406111010546 0ustar agrep usr/bin debian/changelog0000644000000000000000000000767611741120600011050 0ustar agrep (4.17-9) unstable; urgency=low * debian/control - (Build-Depends): Rm dpkg-dev; not needed with DEB_*_MAINT_* variables. - (Standards-Version): Update to 3.9.3.1. * debian/copyright - Update to format 1.0. * debian/rules - Use DEB_*_MAINT_* variables to enable all hardening features. -- Jari Aalto Wed, 11 Apr 2012 00:09:20 +0300 agrep (4.17-8) unstable; urgency=low * debian/compat - Update to 9 * debian/control - (Build-Depends): update to debhelper 9, dpkg-dev 1.16.1. Remove dpatch. * debian/install - New file. * debian/copyright: - Update to DEP5. * debian/patches - Convert all patches to quilt. - Renumber patches: number 10-19 are used for manual pages. * debian/rules - Update to dh(1). - Use hardened CFLAGS. http://wiki.debian.org/ReleaseGoals/SecurityHardeningBuildFlags * debian/source/format - New. Update to 3.0. * debian/debian-vars.mk - Delete. No longer needed. -- Jari Aalto Thu, 23 Feb 2012 09:26:12 -0500 agrep (4.17-7) unstable; urgency=low * debian/debian-vars.mk: - Adjust MAKE_FLAGS check (FTBFS sparc; Closes: #502862). * debian/patches/02-manpage-hyphen.dpatch: - New. Fix hyphens with \- (Lintian). * debian/patches/03-manpage-fhs-dir-dict.dpatch: - New. Fix FHS /usr/share/dict/words (Lintian). -- Jari Aalto Mon, 20 Oct 2008 17:52:00 +0300 agrep (4.17-6) unstable; urgency=low * debian/copyright: - Update to latest dh-make template. - Use ISO 8601 dates. - Add current maintainer to Debian package copyright line. * debian/debian-vars.mk: - New file. * debian/control: - (Description): remove word UNIX, mention glimpse. - (Homepage): Move to new field. - (Standards-Version): Update to latest. - (Vcs-*): New fields * debian/rules: - (build-stamp): Add MAKE_FLAGS to build call. - (clean): Rewrite error suppression '-' (Lintian fix). - (clean-debian-dir): New. 
Comment out dh_clean calls to preserve original sources' *.orig and *.rej files and use this make target. -- Jari Aalto Sun, 19 Oct 2008 10:18:44 +0300 agrep (4.17-5) unstable; urgency=low * New maintainer. (O/ITA; Closes: #407649) * debian/control: Add homepage URL. * debian/watch: New file. * debian/copyright: Clarify the glimpse-agrep license issue (new versions cannot be packaged). -- Jari Aalto Thu, 25 Jan 2007 21:11:14 +0200 agrep (4.17-4) unstable; urgency=low * Orphaning package. -- Daniel Baumann Sat, 20 Jan 2007 11:42:00 +0100 agrep (4.17-3) unstable; urgency=low * New email address. * Some packaging style changes. -- Daniel Baumann Wed, 5 Jul 2006 13:29:00 +0200 agrep (4.17-2) unstable; urgency=low * Added patch to fix FTBFS on amd64 (Closes: #344909). -- Daniel Baumann Tue, 27 Dec 2005 17:01:00 +0100 agrep (4.17-1) unstable; urgency=low * New upstream release (Closes: #307442). * Using dpatch for source modifications. * Updated debian/ to newer debhelper templates. -- Daniel Baumann Tue, 20 Dec 2005 13:59:00 +0100 agrep (2.04-3) unstable; urgency=low * New maintainer (Closes: #288371). * debian/*: updated to new standard. * debian/rules rewritten. * fixed lintian errors. * added extra documentation from upstream site, removed useless README.Debian. -- Daniel Baumann Wed, 2 Feb 2005 01:30:00 +0100 agrep (2.04-2) unstable; urgency=low * New maintainer (closes: #201367). * Updated to standards version 3.6.0. * Description on multiple lines (closes: #180727). -- Luk Claes Tue, 15 Jul 2003 18:57:32 +0200 agrep (2.04-1) unstable; urgency=low * Initial Release.
-- Michael-John Turner Fri, 11 May 2001 21:19:29 +0200 debian/watch0000644000000000000000000000007711721405564010230 0ustar version=3 ftp://ftp.cs.arizona.edu/agrep/ .*agrep-([\d.]+).tar debian/doc/0000755000000000000000000000000011721405564007740 5ustar debian/doc/agrep.ps.10000644000000000000000000050177011721405564011553 0ustar %!PS-Adobe-1.0 %%Creator: aloe:udi (Udi Manber) %%Title: stdin (ditroff) %%CreationDate: Mon Jun 10 12:20:27 1991 %%EndComments % Start of psdit.pro -- prolog for ditroff translator % Copyright (c) 1985,1987 Adobe Systems Incorporated. All Rights Reserved. % GOVERNMENT END USERS: See Notice file in TranScript library directory % -- probably /usr/lib/ps/Notice % RCS: $Header: psdit.pro,v 2.2 87/11/17 16:40:42 byron Rel $ % Psfig RCSID $Header: psdit.pro,v 1.5 88/01/04 17:48:22 trevor Exp $ /$DITroff 180 dict def $DITroff begin /DocumentInitState [ matrix currentmatrix currentlinewidth currentlinecap currentlinejoin currentdash currentgray currentmiterlimit ] cvx def %% Psfig additions /startFig { /SavedState save def userdict maxlength dict begin currentpoint transform DocumentInitState setmiterlimit setgray setdash setlinejoin setlinecap setlinewidth setmatrix itransform moveto /ury exch def /urx exch def /lly exch def /llx exch def /y exch 72 mul resolution div def /x exch 72 mul resolution div def currentpoint /cy exch def /cx exch def /sx x urx llx sub div def % scaling for x /sy y ury lly sub div def % scaling for y sx sy scale % scale by (sx,sy) cx sx div llx sub cy sy div ury sub translate /DefFigCTM matrix currentmatrix def /initmatrix { DefFigCTM setmatrix } def /defaultmatrix { DefFigCTM exch copy } def /initgraphics { DocumentInitState setmiterlimit setgray setdash setlinejoin setlinecap setlinewidth setmatrix DefFigCTM setmatrix } def /showpage { initgraphics } def } def % Args are llx lly urx ury (in figure coordinates) /clipFig { currentpoint 6 2 roll newpath 4 copy 4 2 roll moveto 6 -1 roll exch lineto exch lineto exch 
lineto closepath clip newpath moveto } def % doclip, if called, will always be just after a `startfig' /doclip { llx lly urx ury clipFig } def /endFig { end SavedState restore } def /globalstart { % Push details about the enviornment on the stack. fontnum fontsize fontslant fontheight % firstpage mh my resolution slotno currentpoint pagesave restore gsave } def /globalend { grestore moveto /slotno exch def /resolution exch def /my exch def /mh exch def % /firstpage exch def /fontheight exch def /fontslant exch def /fontsize exch def /fontnum exch def F /pagesave save def } def %% end XMOD additions /fontnum 1 def /fontsize 10 def /fontheight 10 def /fontslant 0 def /xi {0 72 11 mul translate 72 resolution div dup neg scale 0 0 moveto /fontnum 1 def /fontsize 10 def /fontheight 10 def /fontslant 0 def F /pagesave save def}def /PB{save /psv exch def currentpoint translate resolution 72 div dup neg scale 0 0 moveto}def /PE{psv restore}def /m1 matrix def /m2 matrix def /m3 matrix def /oldmat matrix def /tan{dup sin exch cos div}bind def /point{resolution 72 div mul}bind def /dround {transform round exch round exch itransform}bind def /xT{/devname exch def}def /xr{/mh exch def /my exch def /resolution exch def}def /xp{}def /xs{docsave restore end}def /xt{}def /xf{/fontname exch def /slotno exch def fontnames slotno get fontname eq not {fonts slotno fontname findfont put fontnames slotno fontname put}if}def /xH{/fontheight exch def F}bind def /xS{/fontslant exch def F}bind def /s{/fontsize exch def /fontheight fontsize def F}bind def /f{/fontnum exch def F}bind def /F{fontheight 0 le {/fontheight fontsize def}if fonts fontnum get fontsize point 0 0 fontheight point neg 0 0 m1 astore fontslant 0 ne{1 0 fontslant tan 1 0 0 m2 astore m3 concatmatrix}if makefont setfont .04 fontsize point mul 0 dround pop setlinewidth}bind def /X{exch currentpoint exch pop moveto show}bind def /N{3 1 roll moveto show}bind def /Y{exch currentpoint pop exch moveto show}bind def /S /show load 
def /ditpush{}def/ditpop{}def /AX{3 -1 roll currentpoint exch pop moveto 0 exch ashow}bind def /AN{4 2 roll moveto 0 exch ashow}bind def /AY{3 -1 roll currentpoint pop exch moveto 0 exch ashow}bind def /AS{0 exch ashow}bind def /MX{currentpoint exch pop moveto}bind def /MY{currentpoint pop exch moveto}bind def /MXY /moveto load def /cb{pop}def % action on unknown char -- nothing for now /n{}def/w{}def /p{pop showpage pagesave restore /pagesave save def}def /abspoint{currentpoint exch pop add exch currentpoint pop add exch}def /dstroke{currentpoint stroke moveto}bind def /Dl{2 copy gsave rlineto stroke grestore rmoveto}bind def /arcellipse{oldmat currentmatrix pop currentpoint translate 1 diamv diamh div scale /rad diamh 2 div def rad 0 rad -180 180 arc oldmat setmatrix}def /Dc{gsave dup /diamv exch def /diamh exch def arcellipse dstroke grestore diamh 0 rmoveto}def /De{gsave /diamv exch def /diamh exch def arcellipse dstroke grestore diamh 0 rmoveto}def /Da{currentpoint /by exch def /bx exch def /fy exch def /fx exch def /cy exch def /cx exch def /rad cx cx mul cy cy mul add sqrt def /ang1 cy neg cx neg atan def /ang2 fy fx atan def cx bx add cy by add 2 copy rad ang1 ang2 arcn stroke exch fx add exch fy add moveto}def /Barray 200 array def % 200 values in a wiggle /D~{mark}def /D~~{counttomark Barray exch 0 exch getinterval astore /Bcontrol exch def pop /Blen Bcontrol length def Blen 4 ge Blen 2 mod 0 eq and {Bcontrol 0 get Bcontrol 1 get abspoint /Ycont exch def /Xcont exch def Bcontrol 0 2 copy get 2 mul put Bcontrol 1 2 copy get 2 mul put Bcontrol Blen 2 sub 2 copy get 2 mul put Bcontrol Blen 1 sub 2 copy get 2 mul put /Ybi /Xbi currentpoint 3 1 roll def def 0 2 Blen 4 sub {/i exch def Bcontrol i get 3 div Bcontrol i 1 add get 3 div Bcontrol i get 3 mul Bcontrol i 2 add get add 6 div Bcontrol i 1 add get 3 mul Bcontrol i 3 add get add 6 div /Xbi Xcont Bcontrol i 2 add get 2 div add def /Ybi Ycont Bcontrol i 3 add get 2 div add def /Xcont Xcont Bcontrol i 2 add 
get add def /Ycont Ycont Bcontrol i 3 add get add def Xbi currentpoint pop sub Ybi currentpoint exch pop sub rcurveto }for dstroke}if}def end /ditstart{$DITroff begin /nfonts 60 def % NFONTS makedev/ditroff dependent! /fonts[nfonts{0}repeat]def /fontnames[nfonts{()}repeat]def /docsave save def }def % character outcalls /oc {/pswid exch def /cc exch def /name exch def /ditwid pswid fontsize mul resolution mul 72000 div def /ditsiz fontsize resolution mul 72 div def ocprocs name known{ocprocs name get exec}{name cb} ifelse}def /fractm [.65 0 0 .6 0 0] def /fraction {/fden exch def /fnum exch def gsave /cf currentfont def cf fractm makefont setfont 0 .3 dm 2 copy neg rmoveto fnum show rmoveto currentfont cf setfont(\244)show setfont fden show grestore ditwid 0 rmoveto} def /oce {grestore ditwid 0 rmoveto}def /dm {ditsiz mul}def /ocprocs 50 dict def ocprocs begin (14){(1)(4)fraction}def (12){(1)(2)fraction}def (34){(3)(4)fraction}def (13){(1)(3)fraction}def (23){(2)(3)fraction}def (18){(1)(8)fraction}def (38){(3)(8)fraction}def (58){(5)(8)fraction}def (78){(7)(8)fraction}def (sr){gsave .05 dm .16 dm rmoveto(\326)show oce}def (is){gsave 0 .15 dm rmoveto(\362)show oce}def (->){gsave 0 .02 dm rmoveto(\256)show oce}def (<-){gsave 0 .02 dm rmoveto(\254)show oce}def (==){gsave 0 .05 dm rmoveto(\272)show oce}def end % DIThacks fonts for some special chars 50 dict dup begin /FontType 3 def /FontName /DIThacks def /FontMatrix [.001 0.0 0.0 .001 0.0 0.0] def /FontBBox [-220 -280 900 900] def% a lie but ... 
/Encoding 256 array def 0 1 255{Encoding exch /.notdef put}for Encoding dup 8#040/space put %space dup 8#110/rc put %right ceil dup 8#111/lt put %left top curl dup 8#112/bv put %bold vert dup 8#113/lk put %left mid curl dup 8#114/lb put %left bot curl dup 8#115/rt put %right top curl dup 8#116/rk put %right mid curl dup 8#117/rb put %right bot curl dup 8#120/rf put %right floor dup 8#121/lf put %left floor dup 8#122/lc put %left ceil dup 8#140/sq put %square dup 8#141/bx put %box dup 8#142/ci put %circle dup 8#143/br put %box rule dup 8#144/rn put %root extender dup 8#145/vr put %vertical rule dup 8#146/ob put %outline bullet dup 8#147/bu put %bullet dup 8#150/ru put %rule dup 8#151/ul put %underline pop /DITfd 100 dict def /BuildChar{0 begin /cc exch def /fd exch def /charname fd /Encoding get cc get def /charwid fd /Metrics get charname get def /charproc fd /CharProcs get charname get def charwid 0 fd /FontBBox get aload pop setcachedevice 40 setlinewidth newpath 0 0 moveto gsave charproc grestore end}def /BuildChar load 0 DITfd put %/UniqueID 5 def /CharProcs 50 dict def CharProcs begin /space{}def /.notdef{}def /ru{500 0 rls}def /rn{0 750 moveto 500 0 rls}def /vr{20 800 moveto 0 -770 rls}def /bv{20 800 moveto 0 -1000 rls}def /br{20 770 moveto 0 -1040 rls}def /ul{0 -250 moveto 500 0 rls}def /ob{200 250 rmoveto currentpoint newpath 200 0 360 arc closepath stroke}def /bu{200 250 rmoveto currentpoint newpath 200 0 360 arc closepath fill}def /sq{80 0 rmoveto currentpoint dround newpath moveto 640 0 rlineto 0 640 rlineto -640 0 rlineto closepath stroke}def /bx{80 0 rmoveto currentpoint dround newpath moveto 640 0 rlineto 0 640 rlineto -640 0 rlineto closepath fill}def /ci{355 333 rmoveto currentpoint newpath 333 0 360 arc 50 setlinewidth stroke}def /lt{20 -200 moveto 0 550 rlineto currx 800 2cx s4 add exch s4 a4p stroke}def /lb{20 800 moveto 0 -550 rlineto currx -200 2cx s4 add exch s4 a4p stroke}def /rt{20 -200 moveto 0 550 rlineto currx 800 2cx s4 sub exch s4 a4p 
stroke}def /rb{20 800 moveto 0 -500 rlineto currx -200 2cx s4 sub exch s4 a4p stroke}def /lk{20 800 moveto 20 300 -280 300 s4 arcto pop pop 1000 sub currentpoint stroke moveto 20 300 4 2 roll s4 a4p 20 -200 lineto stroke}def /rk{20 800 moveto 20 300 320 300 s4 arcto pop pop 1000 sub currentpoint stroke moveto 20 300 4 2 roll s4 a4p 20 -200 lineto stroke}def /lf{20 800 moveto 0 -1000 rlineto s4 0 rls}def /rf{20 800 moveto 0 -1000 rlineto s4 neg 0 rls}def /lc{20 -200 moveto 0 1000 rlineto s4 0 rls}def /rc{20 -200 moveto 0 1000 rlineto s4 neg 0 rls}def end /Metrics 50 dict def Metrics begin /.notdef 0 def /space 500 def /ru 500 def /br 0 def /lt 250 def /lb 250 def /rt 250 def /rb 250 def /lk 250 def /rk 250 def /rc 250 def /lc 250 def /rf 250 def /lf 250 def /bv 250 def /ob 350 def /bu 350 def /ci 750 def /bx 750 def /sq 750 def /rn 500 def /ul 500 def /vr 0 def end DITfd begin /s2 500 def /s4 250 def /s3 333 def /a4p{arcto pop pop pop pop}def /2cx{2 copy exch}def /rls{rlineto stroke}def /currx{currentpoint pop}def /dround{transform round exch round exch itransform} def end end /DIThacks exch definefont pop ditstart (psc)xT 576 1 1 xr 1(Times-Roman)xf 1 f 2(Times-Italic)xf 2 f 3(Times-Bold)xf 3 f 4(Times-BoldItalic)xf 4 f 5(Helvetica)xf 5 f 6(Helvetica-Bold)xf 6 f 7(Courier)xf 7 f 8(Courier-Bold)xf 8 f 9(Symbol)xf 9 f 10(DIThacks)xf 10 f 10 s 1 f xi %%EndProlog %%Page: 1 1 10 s 10 xH 0 xS 1 f 288 768 MXY 0 0 MXY PB 130 -108 moveto 351 0 rlineto 0 -234 rlineto -351 0 rlineto closepath 1 setlinewidth stroke % draw box outline /ctext { 3 1 roll moveto dup stringwidth pop -2 div 0 rmoveto show } def /Times-Roman findfont 18 scalefont setfont 306 -450 (DEPARTMENT OF COMPUTER SCIENCE) ctext 192 -612 translate % position for wordmark -187 -638 translate % cancel wordmark coordinate system %!PS-Adobe-3.0 EPSF-3.0 %%%Creator: Adobe Illustrator(TM) 3.0 %%%For: (Pat Crowe) (PPSS) %%%Title: (wm) %%%CreationDate: (1/2/91) (8:53 AM) %%%DocumentProcessColors: Black 
%%%DocumentSuppliedResources: procset Adobe_packedarray 2.0 0 %%%+ procset Adobe_cmykcolor 1.1 0 %%%+ procset Adobe_cshow 1.1 0 %%%+ procset Adobe_customcolor 1.0 0 %%%+ procset Adobe_IllustratorA_AI3 1.0 0 %%%BoundingBox: 187 638 416 733 %%AI3_ColorUsage: Black&White %%AI3_TemplateBox: 306 396 306 396 %%AI3_TileBox: -522 761 30 1491 %%AI3_DocumentPreview: Macintosh_Pic %%%EndComments %%%BeginProlog %%%BeginResource: procset Adobe_packedarray 2.0 0 %%%Title: (Packed Array Operators) %%%Version: 2.0 %%%CreationDate: (8/2/90) () %%%Copyright: ((C) 1987-1990 Adobe Systems Incorporated All Rights Reserved) userdict /Adobe_packedarray 5 dict dup begin put /initialize % - initialize - { /packedarray where { pop } { Adobe_packedarray begin Adobe_packedarray { dup xcheck { bind } if userdict 3 1 roll put } forall end } ifelse } def /terminate % - terminate - { } def /packedarray % arguments count packedarray array { array astore readonly } def /setpacking % boolean setpacking - { pop } def /currentpacking % - setpacking boolean { false } def currentdict readonly pop end %%%EndResource Adobe_packedarray /initialize get exec %%%BeginResource: procset Adobe_cmykcolor 1.1 0 %%%Title: (CMYK Color Operators) %%%Version: 1.1 %%%CreationDate: (1/23/89) () %%%Copyright: ((C) 1987-1990 Adobe Systems Incorporated All Rights Reserved) currentpacking true setpacking userdict /Adobe_cmykcolor 4 dict dup begin put /initialize % - initialize - { /setcmykcolor where { pop } { userdict /Adobe_cmykcolor_vars 2 dict dup begin put /_setrgbcolor /setrgbcolor load def /_currentrgbcolor /currentrgbcolor load def Adobe_cmykcolor begin Adobe_cmykcolor { dup xcheck { bind } if pop pop } forall end end Adobe_cmykcolor begin } ifelse } def /terminate % - terminate - { currentdict Adobe_cmykcolor eq { end } if } def /setcmykcolor % cyan magenta yellow black setcmykcolor - { 1 sub 4 1 roll 3 { 3 index add neg dup 0 lt { pop 0 } if 3 1 roll } repeat Adobe_cmykcolor_vars /_setrgbcolor get exec pop } def 
/currentcmykcolor % - currentcmykcolor cyan magenta yellow black { Adobe_cmykcolor_vars /_currentrgbcolor get exec 3 { 1 sub neg 3 1 roll } repeat 0 } def currentdict readonly pop end setpacking %%%EndResource %%%BeginResource: procset Adobe_cshow 1.1 0 %%%Title: (cshow Operator) %%%Version: 1.1 %%%CreationDate: (1/23/89) () %%%Copyright: ((C) 1987-1990 Adobe Systems Incorporated All Rights Reserved) currentpacking true setpacking userdict /Adobe_cshow 3 dict dup begin put /initialize % - initialize - { /cshow where { pop } { userdict /Adobe_cshow_vars 1 dict dup begin put /_cshow % - _cshow proc {} def Adobe_cshow begin Adobe_cshow { dup xcheck { bind } if userdict 3 1 roll put } forall end end } ifelse } def /terminate % - terminate - { } def /cshow % proc string cshow - { exch Adobe_cshow_vars exch /_cshow exch put { 0 0 Adobe_cshow_vars /_cshow get exec } forall } def currentdict readonly pop end setpacking %%%EndResource %%%BeginResource: procset Adobe_customcolor 1.0 0 %%%Title: (Custom Color Operators) %%%Version: 1.0 %%%CreationDate: (5/9/88) () %%%Copyright: ((C) 1987-1990 Adobe Systems Incorporated All Rights Reserved) currentpacking true setpacking userdict /Adobe_customcolor 5 dict dup begin put /initialize % - initialize - { /setcustomcolor where { pop } { Adobe_customcolor begin Adobe_customcolor { dup xcheck { bind } if pop pop } forall end Adobe_customcolor begin } ifelse } def /terminate % - terminate - { currentdict Adobe_customcolor eq { end } if } def /findcmykcustomcolor % cyan magenta yellow black name findcmykcustomcolor object { 5 packedarray } def /setcustomcolor % object tint setcustomcolor - { exch aload pop pop 4 { 4 index mul 4 1 roll } repeat 5 -1 roll pop setcmykcolor } def /setoverprint % boolean setoverprint - { pop } def currentdict readonly pop end setpacking %%%EndResource %%%BeginResource: procset Adobe_IllustratorA_AI3 1.0 0 %%%Title: (Adobe Illustrator (R) Version 3.0 Abbreviated Prolog) %%%Version: 1.0 %%%CreationDate: 
(7/22/89) () %%%Copyright: ((C) 1987-1990 Adobe Systems Incorporated All Rights Reserved) currentpacking true setpacking userdict /Adobe_IllustratorA_AI3 61 dict dup begin put %% initialization /initialize % - initialize - { userdict /Adobe_IllustratorA_AI3_vars 46 dict dup begin put %% paint operands /_lp /none def /_pf {} def /_ps {} def /_psf {} def /_pss {} def /_pjsf {} def /_pjss {} def /_pola 0 def /_doClip 0 def %% paint operators /cf currentflat def % - cf flatness %% typography operands /_tm matrix def /_renderStart [/e0 /r0 /a0 /o0 /i0 /i0 /i0 /i0] def /_renderEnd [null null null null /e1 /r1 /a1 /clip] def /_render -1 def /_rise 0 def /_ax 0 def % x character spacing (_ax, _ay, _cx, _cy follows awidthshow naming convention) /_ay 0 def % y character spacing /_cx 0 def % x word spacing /_cy 0 def % y word spacing /_leading [0 0] def /_ctm matrix def /_mtx matrix def /_sp 16#020 def /_hyphen (-) def /_fScl 0 def /_cnt 0 def /_hs 1 def /_nativeEncoding 0 def /_useNativeEncoding 0 def /_tempEncode 0 def /_pntr 0 def %% typography operators /Tx {} def /Tj {} def %% compound path operators /CRender {} def %% printing /_AI3_savepage {} def %% color operands /_gf null def /_cf 4 array def /_if null def /_of false def /_fc {} def /_gs null def /_cs 4 array def /_is null def /_os false def /_sc {} def /_i null def Adobe_IllustratorA_AI3 begin Adobe_IllustratorA_AI3 { dup xcheck { bind } if pop pop } forall end end Adobe_IllustratorA_AI3 begin Adobe_IllustratorA_AI3_vars begin newpath } def /terminate % - terminate - { end end } def %% definition operators /_ % - _ null null def /ddef % key value ddef - { Adobe_IllustratorA_AI3_vars 3 1 roll put } def /xput % key value literal xput - { dup load dup length exch maxlength eq { dup dup load dup length 2 mul dict copy def } if load begin def end } def /npop % integer npop - { { pop } repeat } def %% marking operators /sw % ax ay string sw x y { dup length exch stringwidth exch 5 -1 roll 3 index 1 sub mul add 4 1 roll 3 
1 roll 1 sub mul add } def /swj % cx cy fillchar ax ay string swj x y { dup 4 1 roll dup length exch stringwidth exch 5 -1 roll 3 index 1 sub mul add 4 1 roll 3 1 roll 1 sub mul add 6 2 roll /_cnt 0 ddef {1 index eq {/_cnt _cnt 1 add ddef} if} forall pop exch _cnt mul exch _cnt mul 2 index add 4 1 roll 2 index add 4 1 roll pop pop } def /ss % ax ay string matrix ss - { 4 1 roll { % matrix ax ay char 0 0 {proc} - 2 npop (0) exch 2 copy 0 exch put pop gsave false charpath currentpoint 4 index setmatrix stroke grestore moveto 2 copy rmoveto } exch cshow 3 npop } def /jss % cx cy fillchar ax ay string matrix jss - { 4 1 roll { % cx cy fillchar matrix ax ay char 0 0 {proc} - 2 npop (0) exch 2 copy 0 exch put gsave _sp eq { exch 6 index 6 index 6 index 5 -1 roll widthshow currentpoint } { false charpath currentpoint 4 index setmatrix stroke }ifelse grestore moveto 2 copy rmoveto } exch cshow 6 npop } def %% path operators /sp % ax ay string sp - { { 2 npop (0) exch 2 copy 0 exch put pop false charpath 2 copy rmoveto } exch cshow 2 npop } def /jsp % cx cy fillchar ax ay string jsp - { { % cx cy fillchar ax ay char 0 0 {proc} - 2 npop (0) exch 2 copy 0 exch put _sp eq { exch 5 index 5 index 5 index 5 -1 roll widthshow } { false charpath }ifelse 2 copy rmoveto } exch cshow 5 npop } def %% path construction operators /pl % x y pl x y { transform 0.25 sub round 0.25 add exch 0.25 sub round 0.25 add exch itransform } def /setstrokeadjust where { pop true setstrokeadjust /c % x1 y1 x2 y2 x3 y3 c - { curveto } def /C /c load def /v % x2 y2 x3 y3 v - { currentpoint 6 2 roll curveto } def /V /v load def /y % x1 y1 x2 y2 y - { 2 copy curveto } def /Y /y load def /l % x y l - { lineto } def /L /l load def /m % x y m - { moveto } def } {%else /c { pl curveto } def /C /c load def /v { currentpoint 6 2 roll pl curveto } def /V /v load def /y { pl 2 copy curveto } def /Y /y load def /l { pl lineto } def /L /l load def /m { pl moveto } def }ifelse %% graphic state operators /d % array 
phase d - { setdash } def /cf {} def % - cf flatness /i % flatness i - { dup 0 eq { pop cf } if setflat } def /j % linejoin j - { setlinejoin } def /J % linecap J - { setlinecap } def /M % miterlimit M - { setmiterlimit } def /w % linewidth w - { setlinewidth } def %% path painting operators /H % - H - {} def /h % - h - { closepath } def /N % - N - { _pola 0 eq { _doClip 1 eq {clip /_doClip 0 ddef} if newpath } { /CRender {N} ddef }ifelse } def /n % - n - {N} def /F % - F - { _pola 0 eq { _doClip 1 eq { gsave _pf grestore clip newpath /_lp /none ddef _fc /_doClip 0 ddef } { _pf }ifelse } { /CRender {F} ddef }ifelse } def /f % - f - { closepath F } def /S % - S - { _pola 0 eq { _doClip 1 eq { gsave _ps grestore clip newpath /_lp /none ddef _sc /_doClip 0 ddef } { _ps }ifelse } { /CRender {S} ddef }ifelse } def /s % - s - { closepath S } def /B % - B - { _pola 0 eq { _doClip 1 eq % F clears _doClip gsave F grestore { gsave S grestore clip newpath /_lp /none ddef _sc /_doClip 0 ddef } { S }ifelse } { /CRender {B} ddef }ifelse } def /b % - b - { closepath B } def /W % - W - { /_doClip 1 ddef } def /* % - [string] * - { count 0 ne { dup type (stringtype) eq {pop} if } if _pola 0 eq {newpath} if } def %% group operators /u % - u - {} def /U % - U - {} def /q % - q - { _pola 0 eq {gsave} if } def /Q % - Q - { _pola 0 eq {grestore} if } def /*u % - *u - { _pola 1 add /_pola exch ddef } def /*U % - *U - { _pola 1 sub /_pola exch ddef _pola 0 eq {CRender} if } def /D % polarized D - {pop} def /*w % - *w - {} def /*W % - *W - {} def %% place operators /` % matrix llx lly urx ury string ` - { /_i save ddef 6 1 roll 4 npop concat userdict begin /showpage {} def false setoverprint pop } def /~ % - ~ - { end _i restore } def %% color operators /O % flag O - { 0 ne /_of exch ddef /_lp /none ddef } def /R % flag R - { 0 ne /_os exch ddef /_lp /none ddef } def /g % gray g - { /_gf exch ddef /_fc { _lp /fill ne { _of setoverprint _gf setgray /_lp /fill ddef } if } ddef /_pf { _fc 
fill } ddef /_psf { _fc ashow } ddef /_pjsf { _fc awidthshow } ddef /_lp /none ddef } def /G % gray G - { /_gs exch ddef /_sc { _lp /stroke ne { _os setoverprint _gs setgray /_lp /stroke ddef } if } ddef /_ps { _sc stroke } ddef /_pss { _sc ss } ddef /_pjss { _sc jss } ddef /_lp /none ddef } def /k % cyan magenta yellow black k - { _cf astore pop /_fc { _lp /fill ne { _of setoverprint _cf aload pop setcmykcolor /_lp /fill ddef } if } ddef /_pf { _fc fill } ddef /_psf { _fc ashow } ddef /_pjsf { _fc awidthshow } ddef /_lp /none ddef } def /K % cyan magenta yellow black K - { _cs astore pop /_sc { _lp /stroke ne { _os setoverprint _cs aload pop setcmykcolor /_lp /stroke ddef } if } ddef /_ps { _sc stroke } ddef /_pss { _sc ss } ddef /_pjss { _sc jss } ddef /_lp /none ddef } def /x % cyan magenta yellow black name gray x - { /_gf exch ddef findcmykcustomcolor /_if exch ddef /_fc { _lp /fill ne { _of setoverprint _if _gf 1 exch sub setcustomcolor /_lp /fill ddef } if } ddef /_pf { _fc fill } ddef /_psf { _fc ashow } ddef /_pjsf { _fc awidthshow } ddef /_lp /none ddef } def /X % cyan magenta yellow black name gray X - { /_gs exch ddef findcmykcustomcolor /_is exch ddef /_sc { _lp /stroke ne { _os setoverprint _is _gs 1 exch sub setcustomcolor /_lp /stroke ddef } if } ddef /_ps { _sc stroke } ddef /_pss { _sc ss } ddef /_pjss { _sc jss } ddef /_lp /none ddef } def %% locked object operator /A % value A - { pop } def currentdict readonly pop end setpacking %% annotate page operator /annotatepage { } def %%%EndResource %%%EndProlog %%%BeginSetup Adobe_cmykcolor /initialize get exec Adobe_cshow /initialize get exec Adobe_customcolor /initialize get exec Adobe_IllustratorA_AI3 /initialize get exec %%%EndSetup 0 A u 0 O 0 g 0 i 0 J 0 j 1 w 4 M []0 d %%AI3_Note: 0 D 236.1606 721.7052 m 236.1606 720.565 236.1406 719.7848 237.2008 719.1647 c 237.2008 719.1047 L 233.24 719.1047 L 233.24 719.1647 L 234.2402 719.5448 234.1201 721.0051 234.1201 721.9053 c 234.1201 730.8272 L 
232.3598 730.8272 L 231.4996 730.8272 230.5994 730.6071 229.9793 730.027 c 229.9192 730.027 L 230.6194 732.4075 L 230.6794 732.4075 L 230.9195 732.3075 231.1795 732.3075 231.4396 732.2675 c 231.9397 732.2675 L 239.4013 732.2675 L 239.7413 732.2675 240.0614 732.2875 240.3215 732.4075 c 240.3815 732.4075 L 239.7413 730.027 L 239.6813 730.027 L 239.4213 730.7272 238.6211 730.8272 237.961 730.8272 c 236.1606 730.8272 L 236.1606 721.7052 l f 242.7238 724.2428 m 242.7238 721.3456 L 242.7238 720.6253 242.6277 719.4568 243.4281 719.1527 c 243.4281 719.1047 L 240.3868 719.1047 L 240.3868 719.1527 L 241.1872 719.4568 241.0911 720.6253 241.0911 721.3456 c 241.0911 727.396 L 241.0911 728.1163 241.2032 729.2848 240.3868 729.5889 c 240.3868 729.6369 L 243.4281 729.6369 L 243.4281 729.5889 L 242.6277 729.2848 242.7238 728.1163 242.7238 727.38 c 242.7238 725.3952 L 248.5021 725.3952 L 248.5021 727.38 L 248.5021 728.1163 248.6142 729.2848 247.7978 729.5889 c 247.7978 729.6369 L 250.8551 729.6369 L 250.8551 729.5889 L 250.0387 729.2848 250.1348 728.1163 250.1348 727.396 c 250.1348 721.3456 L 250.1348 720.6253 250.0387 719.4568 250.8551 719.1527 c 250.8551 719.1047 L 247.7978 719.1047 L 247.7978 719.1527 L 248.6142 719.4568 248.5021 720.6253 248.5021 721.3456 c 248.5021 724.2428 L 242.7238 724.2428 l f 254.4034 720.4492 m 256.4362 720.2572 L 257.4926 720.1611 258.5971 720.4972 259.3974 721.2015 c 259.4454 721.2015 L 258.7251 719.1047 L 252.0504 719.1047 L 252.0504 719.1527 L 252.8668 719.4408 252.7707 720.6253 252.7707 721.3456 c 252.7707 727.38 L 252.7707 728.1163 252.8668 729.2848 252.0504 729.5889 c 252.0504 729.6369 L 257.1725 729.6369 L 257.4446 729.6369 257.7167 729.6049 257.9248 729.717 c 257.9728 729.717 L 257.9728 727.8442 L 257.9248 727.8442 L 257.4286 728.4205 256.7723 728.4845 256.036 728.4845 c 255.4758 728.4845 254.9156 728.4685 254.4034 728.3724 c 254.4034 725.3952 L 256.3081 725.3952 L 256.5643 725.3952 256.8204 725.3952 257.0124 725.4913 c 257.0605 725.4913 L 
257.0605 723.7946 L 257.0124 723.7946 L 256.7563 724.2588 256.0841 724.2428 255.5879 724.2428 c 254.4034 724.2428 L 254.4034 720.4492 l f 274.0932 720.425 m 273.2131 719.2647 271.5727 718.7846 270.1724 718.7846 c 268.8521 718.7846 267.4518 719.2047 266.5516 720.1849 c 265.4914 721.3452 265.5314 722.6855 265.5314 724.1258 c 265.5314 729.4669 L 265.5314 730.3671 265.6714 731.8274 264.6512 732.2075 c 264.6512 732.2675 L 268.452 732.2675 L 268.452 732.2075 L 267.4518 731.8274 267.5718 730.3671 267.5718 729.4669 c 267.5718 724.1258 L 267.5718 721.4652 268.6921 720.2249 270.7525 720.2249 c 271.8728 720.2249 273.013 720.685 273.6732 721.6252 c 274.1132 722.2254 274.0932 722.7255 274.0932 723.4456 c 274.0932 729.4669 L 274.0932 730.3671 274.2133 731.8274 273.2131 732.2075 c 273.2131 732.2675 L 277.0139 732.2675 L 277.0339 732.2075 L 276.0137 731.8274 276.1337 730.3671 276.1337 729.4469 c 276.1337 721.9053 L 276.1337 721.0051 276.0137 719.5448 277.0339 719.1647 c 277.0339 719.1047 L 274.0932 719.1047 L 274.0932 720.425 l f 280.2988 721.1855 m 280.2988 720.2892 280.2828 719.6489 281.1311 719.1527 c 281.1311 719.1047 L 278.298 719.1047 L 278.298 719.1527 L 279.1463 719.6489 279.1463 720.2892 279.1463 721.1855 c 279.1463 727.5721 L 279.1463 728.4685 279.1463 729.1087 278.314 729.5889 c 278.314 729.6369 L 280.6509 729.6369 L 280.6509 729.6209 L 280.7149 729.4289 280.779 729.3488 280.891 729.2208 c 281.1151 728.9006 L 286.9735 721.5057 L 286.9735 727.5721 L 286.9735 728.4685 286.9895 729.1087 286.1411 729.5889 c 286.1411 729.6369 L 288.9583 729.6369 L 288.9583 729.5889 L 288.1259 729.1087 288.1259 728.4685 288.1259 727.5721 c 288.1259 718.5925 L 286.9895 718.9766 286.4933 719.5048 285.789 720.4172 c 280.2988 727.38 L 280.2988 721.1855 l f 290.5858 727.38 m 290.5858 728.1163 290.6979 729.2848 289.8815 729.5889 c 289.8815 729.6369 L 292.9228 729.6369 L 292.9228 729.5889 L 292.1224 729.2848 292.2185 728.1003 292.2185 727.38 c 292.2185 721.3456 L 292.2185 720.6253 292.1224 719.4568 
292.9228 719.1527 c 292.9228 719.1047 L 289.8815 719.1047 L 289.8815 719.1527 L 290.6819 719.4408 290.5858 720.6253 290.5858 721.3456 C 290.5858 727.38 l f 300.6645 726.7398 m 301.0966 727.8762 301.6408 729.1888 300.6484 729.5889 c 300.6484 729.6369 L 303.0334 729.6369 L 298.8077 718.6245 L 297.6873 719.3608 297.3191 719.4888 296.8549 720.7534 c 294.4379 727.3 L 294.1338 728.1163 293.8937 729.0447 293.1894 729.5889 c 293.1894 729.6369 L 295.5264 729.6369 L 295.5104 729.4449 L 295.5104 729.0927 295.8145 728.3084 295.9425 727.9563 c 298.4876 720.9934 L 300.6645 726.7398 l f 306.0684 720.4492 m 308.1012 720.2572 L 309.1577 720.1611 310.2621 720.4972 311.0624 721.2015 c 311.1105 721.2015 L 310.3902 719.1047 L 303.7155 719.1047 L 303.7155 719.1527 L 304.5318 719.4408 304.4358 720.6253 304.4358 721.3456 c 304.4358 727.38 L 304.4358 728.1163 304.5318 729.2848 303.7155 729.5889 c 303.7155 729.6369 L 308.8375 729.6369 L 309.1096 729.6369 309.3818 729.6049 309.5898 729.717 c 309.6379 729.717 L 309.6379 727.8442 L 309.5898 727.8442 L 309.0936 728.4205 308.4374 728.4845 307.7011 728.4845 c 307.1408 728.4845 306.5806 728.4685 306.0684 728.3724 c 306.0684 725.3952 L 307.9732 725.3952 L 308.2293 725.3952 308.4854 725.3952 308.6775 725.4913 c 308.7255 725.4913 L 308.7255 723.7946 L 308.6775 723.7946 L 308.4214 724.2588 307.7491 724.2428 307.2529 724.2428 c 306.0684 724.2428 L 306.0684 720.4492 l f 315.7934 729.6369 m 317.9383 729.6369 319.1067 728.5165 319.1067 727.0599 c 319.1067 725.6513 317.8742 724.5469 316.5617 724.2428 c 318.8666 721.3456 L 319.5389 720.5133 320.5473 719.6809 321.4597 719.1047 c 319.9871 719.1047 L 319.1387 719.1047 318.6105 719.3128 318.1303 719.905 c 316.1775 722.322 L 314.705 724.5789 L 315.9855 724.771 317.4741 725.3632 317.4741 726.8678 c 317.4741 728.0203 316.4817 728.6926 315.4092 728.6445 c 315.0411 728.6285 314.6889 728.5805 314.3208 728.5165 c 314.3208 721.3456 L 314.3208 720.6093 314.2248 719.4408 315.0411 719.1527 c 315.0411 719.1047 L 311.9839 
719.1047 L 311.9839 719.1527 L 312.8002 719.4408 312.6881 720.6253 312.6881 721.3456 c 312.6881 727.38 L 312.6881 728.1163 312.8002 729.2848 311.9839 729.5889 c 311.9839 729.6369 L 315.7934 729.6369 l f 326.8471 727.7322 m 326.2228 728.3724 325.2304 728.7406 324.3341 728.7406 c 323.4217 728.7406 322.2692 728.3884 322.2692 727.284 c 322.2692 725.0911 327.7755 725.1231 327.7755 722.1619 c 327.7755 720.4492 325.9827 718.8486 323.4857 718.8486 c 322.5093 718.8486 321.5329 718.9926 320.6206 719.3288 c 320.1564 721.3296 L 321.1008 720.5133 322.4133 720.001 323.6618 720.001 c 324.5742 720.001 325.9507 720.5453 325.9507 721.6657 c 325.9507 724.1627 320.4445 723.7145 320.4445 727.1079 c 320.4445 729.1247 322.5093 729.893 324.4621 729.893 c 325.2624 729.893 326.0788 729.781 326.8471 729.5409 C 326.8471 727.7322 l f 329.6092 727.38 m 329.6092 728.1163 329.7213 729.2848 328.905 729.5889 c 328.905 729.6369 L 331.9462 729.6369 L 331.9462 729.5889 L 331.1459 729.2848 331.2419 728.1003 331.2419 727.38 c 331.2419 721.3456 L 331.2419 720.6253 331.1459 719.4568 331.9462 719.1527 c 331.9462 719.1047 L 328.905 719.1047 L 328.905 719.1527 L 329.7053 719.4408 329.6092 720.6253 329.6092 721.3456 C 329.6092 727.38 l f 336.8708 721.1855 m 336.8708 720.2732 336.8547 719.6489 337.7031 719.1527 c 337.7031 719.1047 L 334.5338 719.1047 L 334.5338 719.1527 L 335.3341 719.4568 335.2381 720.6253 335.2381 721.3456 c 335.2381 728.4845 L 333.8295 728.4845 L 333.1412 728.4845 332.421 728.3084 331.9248 727.8442 c 331.8767 727.8442 L 332.437 729.749 L 332.485 729.749 L 332.6771 729.669 332.8851 729.669 333.0932 729.6369 c 333.4934 729.6369 L 339.4638 729.6369 L 339.7359 729.6369 339.992 729.6529 340.2001 729.749 c 340.2481 729.749 L 339.7359 727.8442 L 339.6879 727.8442 L 339.4798 728.4044 338.8395 728.4845 338.3113 728.4845 c 336.8708 728.4845 L 336.8708 721.1855 l f 345.1265 721.1855 m 345.1265 720.2251 345.1265 719.7449 345.9428 719.1527 c 345.9428 719.1047 L 342.8216 719.1047 L 342.8216 719.1527 L 
343.6059 719.5529 343.4938 720.5933 343.4938 721.3456 c 343.4938 723.7145 L 340.8368 727.9883 L 340.4046 728.6926 340.0845 729.2208 339.3161 729.6369 c 340.7727 729.6369 L 341.2849 729.6369 341.7651 729.5249 341.9732 729.1888 c 344.5343 725.0111 L 346.5191 727.9883 L 346.7912 728.3884 347.1593 729.1567 346.263 729.5889 c 346.263 729.6369 L 349.0641 729.6369 L 345.1265 723.7145 L 345.1265 721.1855 l f 365.0425 724.4829 m 365.0425 721.2175 362.3374 718.8486 359.1521 718.8486 c 355.9829 718.8486 353.3098 721.1215 353.3098 724.4028 c 353.3098 727.4441 355.9508 729.9891 359.3122 729.9091 c 362.6736 729.9251 365.0425 727.364 365.0425 724.4829 c f 1 g 359.2002 728.7566 m 361.7292 728.7566 363.2178 726.4997 363.2178 724.1627 c 363.2178 721.7778 361.6652 720.001 359.2322 720.001 c 356.7192 720.001 355.1345 722.306 355.1345 724.5469 c 355.1345 726.9639 356.7192 728.7566 359.2002 728.7566 c f 0 g 368.3314 721.3456 m 368.3314 720.6253 368.2033 719.4568 369.0036 719.1527 c 369.0036 719.1047 L 365.9784 719.1047 L 365.9784 719.1527 L 366.7947 719.4408 366.6987 720.6253 366.6987 721.3456 c 366.6987 727.38 L 366.6987 728.1163 366.7947 729.2848 365.9784 729.5889 c 365.9784 729.6369 L 371.1005 729.6369 L 371.3726 729.6369 371.6447 729.6049 371.8528 729.717 c 371.9008 729.717 L 371.9008 727.8602 L 371.8528 727.8602 L 371.3566 728.4205 370.7003 728.4845 369.964 728.4845 c 369.4038 728.4845 368.8436 728.4685 368.3314 728.3724 c 368.3314 725.3952 L 370.2521 725.3952 L 370.4922 725.3952 370.7483 725.3952 370.9404 725.4913 c 370.9884 725.4913 L 370.9884 723.7946 L 370.9404 723.7946 L 370.6843 724.2748 370.0121 724.2428 369.5158 724.2428 c 368.3314 724.2428 L 368.3314 721.3456 l f 213.3082 708.0982 m 225.5802 677.8043 L 226.7732 673.7297 228.8982 670.9797 230.96 669.2847 c 230.96 669.1047 L 218.6605 669.1047 L 218.6605 669.2847 L 221.6003 670.6046 221.1204 671.2646 219.5004 675.4644 c 216.2606 683.9241 L 201.1412 683.9241 l 198.0213 675.4644 L 196.7613 672.1046 196.3414 670.3646 199.2212 
669.2847 c 199.2212 669.1047 L 188.1817 669.1047 L 188.1817 669.2847 L 191.6616 671.0246 192.6815 674.3845 194.0015 677.8043 c 204.0211 702.8233 L 204.6491 704.4382 205.5119 706.858 205.9074 707.9764 C 209.3357 717.8547 l 213.3082 708.0982 L 213.3082 708.0982 L f 1 g 214.5206 688.3639 m 202.7011 688.3639 L 208.6409 703.4833 l 214.5206 688.3639 L f 0 g 246.0102 702.0067 m 252.7106 702.0067 256.3608 698.5065 256.3608 693.9562 c 256.3608 689.5559 252.5106 686.1057 248.4104 685.1557 c 255.6108 676.1051 L 257.7109 673.505 260.8611 670.9048 263.7113 669.1047 c 259.111 669.1047 L 256.4608 669.1047 254.8107 669.7547 253.3106 671.6048 c 247.2103 679.1553 L 242.61 686.2057 L 246.6102 686.8058 251.2605 688.6559 251.2605 693.3562 c 251.2605 696.9564 248.1603 699.0565 244.8101 698.9065 c 243.6601 698.8565 242.56 698.7065 241.4099 698.5065 c 241.4099 676.1051 L 241.4099 673.805 241.1099 670.1548 243.6601 669.2547 c 243.6601 669.1047 L 234.1095 669.1047 L 234.1095 669.2547 L 236.6596 670.1548 236.3096 673.855 236.3096 676.1051 c 236.3096 694.9563 L 236.3096 697.2564 236.6596 700.9066 234.1095 701.8567 c 234.1095 702.0067 L 246.0102 702.0067 l f 265.5988 694.9563 m 265.5988 697.2564 265.9488 700.9066 263.3987 701.8567 c 263.3987 702.0067 L 272.8992 702.0067 L 272.8992 701.8567 L 270.3991 700.9066 270.6991 697.2064 270.6991 694.9563 c 270.6991 676.1051 L 270.6991 673.855 270.3991 670.2048 272.8992 669.2547 c 272.8992 669.1047 L 263.3987 669.1047 L 263.3987 669.2547 L 265.8988 670.1548 265.5988 673.855 265.5988 676.1051 C 265.5988 694.9563 l f 275.8864 669.1047 m 293.9875 697.9064 L 285.4869 698.4065 L 282.3868 698.6065 280.5867 697.9564 277.9865 696.2563 c 277.8365 696.2563 L 280.3866 702.0067 L 302.488 702.0067 L 284.4869 673.3049 L 294.9875 672.7049 L 298.6378 672.5049 301.7879 673.605 304.8881 675.4551 c 305.0381 675.4551 L 302.238 669.1047 L 275.8864 669.1047 l f 342.1312 685.9057 m 342.1312 675.7051 333.6806 668.3046 323.73 668.3046 c 313.8294 668.3046 305.4789 675.4051 
305.4789 685.6557 c 305.4789 695.1563 313.7294 703.1068 324.2301 702.8567 c 334.7307 702.9067 342.1312 694.9063 342.1312 685.9057 c f 1 g 323.88 699.2565 m 331.7805 699.2565 336.4308 692.2061 336.4308 684.9057 c 336.4308 677.4552 331.5805 671.9049 323.9801 671.9049 c 316.1296 671.9049 311.1793 679.1053 311.1793 686.1057 c 311.1793 693.6562 316.1296 699.2565 323.88 699.2565 c f 0 g 349.9185 675.6051 m 349.9185 672.8049 349.8685 670.8048 352.5187 669.2547 c 352.5187 669.1047 L 343.6682 669.1047 L 343.6682 669.2547 L 346.3183 670.8048 346.3183 672.8049 346.3183 675.6051 c 346.3183 695.5563 L 346.3183 698.3565 346.3183 700.3566 343.7182 701.8567 c 343.7182 702.0067 L 351.0186 702.0067 L 351.0186 701.9567 L 351.2186 701.3567 351.4186 701.1066 351.7687 700.7066 c 352.4687 699.7066 L 370.7698 676.6051 L 370.7698 695.5563 L 370.7698 698.3565 370.8198 700.3566 368.1697 701.8567 c 368.1697 702.0067 L 376.9702 702.0067 L 376.9702 701.8567 L 374.37 700.3566 374.37 698.3565 374.37 695.5563 c 374.37 667.5046 L 370.8198 668.7047 369.2697 670.3548 367.0696 673.2049 c 349.9185 694.9563 L 349.9185 675.6051 l f 389.9668 681.4554 m 387.3666 674.405 L 386.3166 671.6048 385.9665 670.1548 388.3667 669.2547 c 388.3667 669.1047 L 379.1661 669.1047 L 379.1661 669.2547 L 382.0663 670.7048 382.9163 673.505 384.0164 676.3551 c 392.3669 697.2064 L 393.067 699.0065 394.017 701.1066 391.6169 701.8567 c 391.6169 702.0067 L 399.5174 702.0067 L 409.918 676.3551 L 411.0681 673.505 411.9681 670.7048 414.8183 669.2547 c 414.8183 669.1047 L 404.5677 669.1047 L 404.5677 669.2547 L 407.0178 670.3548 406.6178 670.9048 405.2677 674.405 c 402.5675 681.4554 L 389.9668 681.4554 l f 1 g 401.1175 685.1557 m 391.2669 685.1557 L 396.2172 697.7564 l 401.1175 685.1557 L f 0 g 237.3542 641.7052 m 237.3542 640.565 237.3342 639.7848 238.3944 639.1647 c 238.3944 639.1047 L 234.4336 639.1047 L 234.4336 639.1647 L 235.4338 639.5448 235.3137 641.0051 235.3137 641.9053 c 235.3137 650.8272 L 233.5534 650.8272 L 232.6932 
650.8272 231.793 650.6071 231.1729 650.027 c 231.1128 650.027 L 231.813 652.4075 L 231.873 652.4075 L 232.1131 652.3075 232.3731 652.3075 232.6332 652.2675 c 233.1333 652.2675 L 240.5949 652.2675 L 240.9349 652.2675 241.255 652.2875 241.5151 652.4075 c 241.5751 652.4075 L 240.9349 650.027 L 240.8749 650.027 L 240.6149 650.7272 239.8147 650.8272 239.1546 650.8272 c 237.3542 650.8272 L 237.3542 641.7052 l f 248.9914 640.1611 m 248.2872 639.2327 246.9746 638.8486 245.8542 638.8486 c 244.7977 638.8486 243.6773 639.1847 242.957 639.969 c 242.1087 640.8974 242.1407 641.9698 242.1407 643.1223 c 242.1407 647.396 L 242.1407 648.1163 242.2527 649.2848 241.4364 649.5889 c 241.4364 649.6369 L 244.4776 649.6369 L 244.4776 649.5889 L 243.6773 649.2848 243.7733 648.1163 243.7733 647.396 c 243.7733 643.1223 L 243.7733 640.9934 244.6697 640.001 246.3184 640.001 c 247.2147 640.001 248.1271 640.3692 248.6553 641.1215 c 249.0074 641.6017 248.9914 642.0019 248.9914 642.5781 c 248.9914 647.396 L 248.9914 648.1163 249.0875 649.2848 248.2872 649.5889 c 248.2872 649.6369 L 251.3284 649.6369 L 251.3444 649.5889 L 250.5281 649.2848 250.6241 648.1163 250.6241 647.38 c 250.6241 641.3456 L 250.6241 640.6253 250.5281 639.4568 251.3444 639.1527 c 251.3444 639.1047 L 248.9914 639.1047 L 248.9914 640.1611 l f 260.7827 647.6521 m 259.9024 648.2124 258.8779 648.5645 257.8215 648.5645 c 255.6126 648.5645 254.076 646.9479 254.076 644.6269 c 254.076 642.274 255.6766 640.1771 258.1416 640.1771 c 259.3421 640.1771 260.5426 640.6093 261.551 641.1855 c 261.599 641.1855 L 260.7827 639.3288 L 260.0304 638.9766 259.1981 638.8486 258.3657 638.8486 c 254.5882 638.8486 252.2513 640.9614 252.2513 644.3868 c 252.2513 647.7002 254.5882 649.893 257.8695 649.893 c 258.8459 649.893 259.8383 649.717 260.7827 649.4929 C 260.7827 647.6521 l f 268.3796 647.7322 m 267.7553 648.3724 266.7629 648.7406 265.8665 648.7406 c 264.9542 648.7406 263.8017 648.3884 263.8017 647.284 c 263.8017 645.0911 269.3079 645.1231 269.3079 
642.1619 c 269.3079 640.4492 267.5152 638.8486 265.0182 638.8486 c 264.0418 638.8486 263.0654 638.9926 262.153 639.3288 c 261.6889 641.3296 L 262.6332 640.5133 263.9458 640.001 265.1943 640.001 c 266.1066 640.001 267.4832 640.5453 267.4832 641.6657 c 267.4832 644.1627 261.977 643.7145 261.977 647.1079 c 261.977 649.1247 264.0418 649.893 265.9946 649.893 c 266.7949 649.893 267.6112 649.781 268.3796 649.5409 C 268.3796 647.7322 l f 281.9461 644.4829 m 281.9461 641.2175 279.241 638.8486 276.0557 638.8486 c 272.8864 638.8486 270.2133 641.1215 270.2133 644.4028 c 270.2133 647.4441 272.8544 649.9891 276.2158 649.9091 c 279.5771 649.9251 281.9461 647.364 281.9461 644.4829 c f 1 g 276.1037 648.7566 m 278.6327 648.7566 280.1213 646.4997 280.1213 644.1627 c 280.1213 641.7778 278.5687 640.001 276.1357 640.001 c 273.6227 640.001 272.0381 642.306 272.0381 644.5469 c 272.0381 646.9639 273.6227 648.7566 276.1037 648.7566 c f 0 g 284.8508 641.1855 m 284.8508 640.2892 284.8348 639.6489 285.6831 639.1527 c 285.6831 639.1047 L 282.85 639.1047 L 282.85 639.1527 L 283.6983 639.6489 283.6983 640.2892 283.6983 641.1855 c 283.6983 647.5721 L 283.6983 648.4685 283.6983 649.1087 282.866 649.5889 c 282.866 649.6369 L 285.2029 649.6369 L 285.2029 649.6209 L 285.2669 649.4289 285.331 649.3488 285.443 649.2208 c 285.6671 648.9006 L 291.5255 641.5057 L 291.5255 647.5721 L 291.5255 648.4685 291.5415 649.1087 290.6931 649.5889 c 290.6931 649.6369 L 293.5103 649.6369 L 293.5103 649.5889 L 292.6779 649.1087 292.6779 648.4685 292.6779 647.5721 c 292.6779 638.5925 L 291.5415 638.9766 291.0453 639.5048 290.341 640.4172 c 284.8508 647.38 L 284.8508 641.1855 l f 306.5596 652.2675 m 310.7205 642.0053 L 311.1806 640.8651 311.5407 639.7448 312.6809 639.1647 c 312.6809 639.1047 L 308.58 639.1047 L 308.58 639.1647 L 309.5603 639.6048 309.4002 639.8248 308.8601 641.2251 c 307.7799 644.0457 L 302.7388 644.0457 l 301.6986 641.2251 L 301.2785 640.1049 301.1385 639.5248 302.0987 639.1647 c 302.0987 639.1047 L 
298.4179 639.1047 L 298.4179 639.1647 L 299.5781 639.7448 299.9182 640.8651 300.3583 642.0053 c 303.699 650.3471 L 305.4607 655.1672 l 306.5596 652.2675 L f 1 g 307.1997 645.5261 m 303.2589 645.5261 L 305.2393 650.5671 l 307.1997 645.5261 L f 0 g 317.1511 649.6369 m 319.2959 649.6369 320.4644 648.5165 320.4644 647.0599 c 320.4644 645.6513 319.2319 644.5469 317.9194 644.2428 c 320.2243 641.3456 L 320.8966 640.5133 321.905 639.6809 322.8173 639.1047 c 321.3448 639.1047 L 320.4964 639.1047 319.9682 639.3128 319.488 639.905 c 317.5352 642.322 L 316.0626 644.5789 L 317.3431 644.771 318.8317 645.3632 318.8317 646.8678 c 318.8317 648.0203 317.8393 648.6926 316.7669 648.6445 c 316.3988 648.6285 316.0466 648.5805 315.6785 648.5165 c 315.6785 641.3456 L 315.6785 640.6093 315.5824 639.4408 316.3988 639.1527 c 316.3988 639.1047 L 313.3415 639.1047 L 313.3415 639.1527 L 314.1578 639.4408 314.0458 640.6253 314.0458 641.3456 c 314.0458 647.38 L 314.0458 648.1163 314.1578 649.2848 313.3415 649.5889 c 313.3415 649.6369 L 317.1511 649.6369 l f 322.9703 647.38 m 322.9703 648.1163 323.0824 649.2848 322.266 649.5889 c 322.266 649.6369 L 325.3073 649.6369 L 325.3073 649.5889 L 324.507 649.2848 324.603 648.1003 324.603 647.38 c 324.603 641.3456 L 324.603 640.6253 324.507 639.4568 325.3073 639.1527 c 325.3073 639.1047 L 322.266 639.1047 L 322.266 639.1527 L 323.0664 639.4408 322.9703 640.6253 322.9703 641.3456 C 322.9703 647.38 l f 325.9741 639.1047 m 331.7685 648.3244 L 329.0474 648.4845 L 328.055 648.5485 327.4787 648.3404 326.6464 647.7962 c 326.5984 647.7962 L 327.4147 649.6369 L 334.4896 649.6369 L 328.7272 640.4492 L 332.0886 640.2572 L 333.2571 640.1931 334.2655 640.5453 335.2579 641.1375 c 335.3059 641.1375 L 334.4095 639.1047 L 325.9741 639.1047 l f 347.5286 644.4829 m 347.5286 641.2175 344.8235 638.8486 341.6382 638.8486 c 338.4689 638.8486 335.7959 641.1215 335.7959 644.4028 c 335.7959 647.4441 338.4369 649.9891 341.7983 649.9091 c 345.1596 649.9251 347.5286 647.364 347.5286 
644.4829 c f 1 g 341.6862 648.7566 m 344.2153 648.7566 345.7039 646.4997 345.7039 644.1627 c 345.7039 641.7778 344.1512 640.001 341.7183 640.001 c 339.2052 640.001 337.6206 642.306 337.6206 644.5469 c 337.6206 646.9639 339.2052 648.7566 341.6862 648.7566 c f 0 g 350.4333 641.1855 m 350.4333 640.2892 350.4173 639.6489 351.2656 639.1527 c 351.2656 639.1047 L 348.4325 639.1047 L 348.4325 639.1527 L 349.2808 639.6489 349.2808 640.2892 349.2808 641.1855 c 349.2808 647.5721 L 349.2808 648.4685 349.2808 649.1087 348.4485 649.5889 c 348.4485 649.6369 L 350.7854 649.6369 L 350.7854 649.6209 L 350.8495 649.4289 350.9135 649.3488 351.0255 649.2208 c 351.2496 648.9006 L 357.108 641.5057 L 357.108 647.5721 L 357.108 648.4685 357.124 649.1087 356.2757 649.5889 c 356.2757 649.6369 L 359.0928 649.6369 L 359.0928 649.5889 L 358.2605 649.1087 358.2605 648.4685 358.2605 647.5721 c 358.2605 638.5925 L 357.124 638.9766 356.6278 639.5048 355.9235 640.4172 c 350.4333 647.38 L 350.4333 641.1855 l f 362.9612 643.0583 m 362.1289 640.8014 L 361.7928 639.905 361.6807 639.4408 362.449 639.1527 c 362.449 639.1047 L 359.5038 639.1047 L 359.5038 639.1527 L 360.4322 639.6169 360.7043 640.5133 361.0565 641.4256 c 363.7295 648.1003 L 363.9536 648.6766 364.2578 649.3488 363.4895 649.5889 c 363.4895 649.6369 L 366.0185 649.6369 L 369.3478 641.4256 L 369.716 640.5133 370.0041 639.6169 370.9165 639.1527 c 370.9165 639.1047 L 367.6351 639.1047 L 367.6351 639.1527 L 368.4194 639.5048 368.2914 639.6809 367.8592 640.8014 c 366.9949 643.0583 L 362.9612 643.0583 l f 1 g 366.5307 644.2428 m 363.3774 644.2428 L 364.962 648.2764 l 366.5307 644.2428 L f U %%%PageTrailer gsave annotatepage grestore %% showpage %%%Trailer Adobe_IllustratorA_AI3 /terminate get exec Adobe_customcolor /terminate get exec Adobe_cshow /terminate get exec Adobe_cmykcolor /terminate get exec Adobe_packedarray /terminate get exec %%%EOF PE 3 f 14 s 1657 1536(Fast)N 1891(Text)X 2137(Searching)X 2644(W)X (ith)S 2915(Errors)X 1 f 12 s 1965 
1920(Sun)N 2138(Wu)X 2301(and)X 2464(Udi)X 2632(Manber)X 2262 2208(TR)N 2409(91-11)X 1 p %%Page: 1 2 12 s 12 xH 0 xS 1 f 10 s 3 f 12 s 1521 816(FAST)N 1790(TEXT)X 2075(SEARCHING)X 2679(WITH)X 2975(ERRORS)X 1 f 10 s 2021 1312(Sun)N 2165(Wu)X 2301(and)X 2437(Udi)X 2577(Manber)X 7 s 2827 1280(1)N 10 s 1910 1552(Department)N 2309(of)X 2396(Computer)X 2736(Science)X 2096 1672(University)N 2454(of)X 2541(Arizona)X 2146 1792(Tucson,)N 2422(AZ)X 2549(85721)X 2284 2152(June)N 2451(1991)X 4 f 2244 2632(ABSTRACT)N 1 f 864 2905(Searching)N 1215(for)X 1339(a)X 1405(pattern)X 1658(in)X 1750(a)X 1816(text)X 1966(\256le)X 2098(is)X 2181(a)X 2248(very)X 2422(common)X 2733(operation)X 3067(in)X 3160(many)X 3369(applications)X 3787(ranging)X 864 3025(from)N 1047(text)X 1194(editors)X 1439(and)X 1582(databases)X 1917(to)X 2006(applications)X 2420(in)X 2509(molecular)X 2856(biology.)X 3166(In)X 3259(many)X 3463(instances)X 3783(the)X 3907(pat-)X 864 3145(tern)N 1011(does)X 1180(not)X 1304(appear)X 1541(in)X 1625(the)X 1745(text)X 1887(exactly.)X 2181(Errors)X 2404(in)X 2488(the)X 2608(text)X 2750(or)X 2839(in)X 2923(the)X 3043(query)X 3248(can)X 3382(result)X 3582(from)X 3761(misspel-)X 864 3265(ling)N 1025(or)X 1129(from)X 1322(experimental)X 1778(errors)X 2003(\(e.g.,)X 2203(when)X 2414(the)X 2549(text)X 2706(is)X 2796(a)X 2869(DNA)X 3080(sequence\).)X 3478(The)X 3639(use)X 3782(of)X 3885(such)X 864 3385(approximate)N 1290(pattern)X 1538(matching)X 1861(has)X 1993(been)X 2171(limited)X 2423(until)X 2595(now)X 2759(to)X 2847(speci\256c)X 3118(applications.)X 3571(Most)X 3761(text)X 3907(edi-)X 864 3505(tors)N 1021(and)X 1174(searching)X 1519(programs)X 1859(do)X 1976(not)X 2115(support)X 2392(searching)X 2737(with)X 2916(errors)X 3141(because)X 3433(of)X 3537(the)X 3672(complexity)X 864 3625(involved)N 1180(in)X 1278(implementing)X 1758(it.)X 1879(In)X 1983(this)X 2135(paper)X 2351(we)X 2482(present)X 2751(a)X 2824(new)X 2995(algorithm)X 3343(for)X 3474(approximate)X 3912(text)X 864 
3745(searching)N 1193(which)X 1410(is)X 1484(very)X 1648(fast)X 1785(and)X 1922(very)X 2086(\257exible.)X 2387(We)X 2520(believe)X 2772(that)X 2912(the)X 3030(new)X 3184(algorithm)X 3515(will)X 3659(\256nd)X 3803(its)X 3898(way)X 864 3865(to)N 957(many)X 1166(searching)X 1506(applications)X 1925(and)X 2073(will)X 2229(enable)X 2471(searching)X 2811(with)X 2985(errors)X 3205(to)X 3299(be)X 3407(just)X 3554(as)X 3653(common)X 3965(as)X 864 3985(searching)N 1192(exactly.)X 6 f 14 s 864 4257(1.)N 1019(Introduction)X 1 f 10 s 864 4410(The)N 1033(string-matching)X 1584(problem)X 1895(is)X 1992(a)X 2072(very)X 2259(common)X 2583(problem.)X 2935(We)X 3092(are)X 3236(searching)X 3589(for)X 3728(a)X 3809(pattern)X 2 f 864 4530(P)N 9 f 932(=)X 2 f 989(p)X 1 f 7 s 1038 4546(1)N 2 f 10 s 1072 4530(p)N 1 f 7 s 1121 4546(2)N 2 f 10 s 1155 4530(...p)N 7 s 4546(m)Y 1 f 10 s 1334 4530(inside)N 1558(a)X 1627(large)X 1821(text)X 1974(\256le)X 2 f 2109(T)X 9 f 2172(=)X 2 f 2229(t)X 1 f 7 s 2260 4546(1)N 2 f 10 s 2294 4530(t)N 1 f 7 s 2325 4546(2)N 10 s 2379 4506(.)N 2419(.)X 2459(.)X 2 f 2499 4530(t)N 7 s 2521 4546(n)N 1 f 10 s 2555 4530(.)N 2628(The)X 2786(pattern)X 3042(and)X 3191(the)X 3322(text)X 3475(are)X 3607(sequences)X 3965(of)X 2 f 864 4650(characters)N 1 f 1233(from)X 1415(a)X 1477(\256nite)X 1667(character)X 1989(set)X 2 f 9 f 2104(S)X 1 f 2151(.)X 2217(The)X 2368(characters)X 2721(may)X 2885(be)X 2987(English)X 3257(characters)X 3611(in)X 3700(a)X 3763(text)X 3910(\256le,)X 864 4770(DNA)N 1067(base)X 1239(pairs,)X 1444(lines)X 1624(of)X 1720(source)X 1959(code,)X 2160(angles)X 2394(between)X 2691(edges)X 2903(in)X 2993(polygons,)X 3334(machines)X 3665(or)X 3760(machine)X 864 4890(parts)N 1041(in)X 1124(a)X 1181(production)X 1549(schedule,)X 1871(music)X 2083(notes)X 2273(and)X 2410(tempo)X 2631(in)X 2714(a)X 2771(musical)X 3042(score,)X 3254(etc.)X 3410(We)X 3544(want)X 3722(to)X 3806(\256nd)X 3952(all)X 864 5010 0.3125(occurrences)AN 1313(of)X 2 f 1444(P)X 1 f 1557(in)X 2 
f 1683(T)X 1 f 1727(;)X 1832(namely,)X 2151(we)X 2308(are)X 2470(searching)X 2841(for)X 2998(the)X 3159(set)X 3311(of)X 3441(starting)X 3744(positions)X 2 f 864 5130(F)N 1 f 2 f 9 f 939(=)X 1 f 1003({)X 2 f 1054(i)X 1 f 9 f 1108(|)X 1 f 1150(1)X 2 f 9 f 1203(\243)X 2 f 1260(i)X 9 f 1301(\243)X 2 f 1358(n)X 9 f 1411(-)X 2 f 1455(m)X 9 f 1526(+)X 1 f 1570(1)X 1630(such)X 1797(that)X 2 f 1937(t)X 7 s 1959 5146(i)N 10 s 1981 5130(t)N 7 s 2003 5146(i)N 9 f 2028(+)X 1 f 2059(1)X 10 s 2113 5106(.)N 2153(.)X 2193(.)X 2 f 2233 5130(t)N 7 s 2255 5146(i)N 9 f 2280(+)X 2 f 2311(m)X 9 f 2360(-)X 1 f 2391(1)X 10 s 2 f 9 f 2445 5130(=)N 1 f 2 f 2509(P)X 1 f 2558(}.)X 2665(The)X 2820(two)X 2970(most)X 3155(famous)X 3421(algorithms)X 3793(for)X 3917(this)X 864 5250(problem)N 1155(are)X 1277(the)X 1398(Knuth)X 1621(Morris)X 1862(Pratt)X 2036(algorithm)X 2370([KMP77])X 2700(and)X 2839(the)X 2960(Boyer-Moore)X 3420(algorithm)X 3754([BM77].)X 864 5370(There)N 1074(are)X 1195(many)X 1395(extensions)X 1755(to)X 1839(this)X 1976(problem;)X 2307(for)X 2423(example,)X 2737(we)X 2853(may)X 3014(be)X 3113(looking)X 3380(for)X 3497(a)X 3556(set)X 3668(of)X 3758(patterns,)X 8 s 10 f 864 5466(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)N 5 s 1 f 864 5568(1)N 8 s 906 5593(Supported)N 1191(in)X 1263(part)X 1384(by)X 1470(an)X 1552(NSF)X 1692(Presidential)X 2016(Young)X 2212(Investigator)X 2539(Award)X 2734(\(grant)X 2908(DCR-8451397\),)X 3345(with)X 3482(matching)X 3743(funds)X 3908(from)X 864 5689(AT&T,)N 1070(and)X 1178(by)X 1258(an)X 1334(NSF)X 1468(grant)X 1615(CCR-9002351.)X 2 p %%Page: 2 3 8 s 8 xH 0 xS 1 f 10 s 3 f 2428 696(2)N 1 f 864 984(a)N 934(regular)X 1197(expression,)X 1595(a)X 1666(pattern)X 1924(with)X 2101(``wild)X 2332(cards,'')X 2611(etc.)X 2780(String-matching)X 3335(tools)X 3525(are)X 3659(included)X 3970(in)X 864 1104(every)N 1063(reasonable)X 1427(text)X 1567(editor)X 1774(and)X 1910(they)X 2068(serve)X 2258(many)X 2456(different)X 2753(applications.)X 1064 1257(In)N 1174(some)X 
1386(instances,)X 1744(however,)X 2085(the)X 2227(pattern)X 2494(and/or)X 2743(the)X 2885(text)X 3049(are)X 3192(not)X 3338(exact.)X 3592(We)X 3748(may)X 3930(not)X 864 1377(remember)N 1214(the)X 1336(exact)X 1530(spelling)X 1807(of)X 1897(a)X 1956(name)X 2153(we)X 2270(are)X 2392(searching,)X 2743(the)X 2864(name)X 3061(may)X 3222(be)X 3321(misspelled)X 3686(in)X 3771(the)X 3892(text,)X 864 1497(the)N 985(text)X 1128(may)X 1290(correspond)X 1671(to)X 1757(a)X 1817(sequence)X 2136(of)X 2227(numbers)X 2527(with)X 2693(a)X 2753(certain)X 2996(property)X 3292(and)X 3432(we)X 3550(do)X 3654(not)X 3780(have)X 3956(an)X 864 1617(exact)N 1058(pattern,)X 1325(the)X 1447(text)X 1591(may)X 1753(be)X 1853(a)X 1913(sequence)X 2232(of)X 2323(DNA)X 2521(molecules)X 2869(and)X 3008(we)X 3125(are)X 3247(looking)X 3514(for)X 3631(approximate)X 864 1737(patterns,)N 1159(etc.)X 1314(The)X 1460(approximate)X 1883(string-matching)X 2412(problem)X 2701(is)X 2776(to)X 2860(\256nd)X 3006(all)X 3108(substrings)X 3454(in)X 2 f 3538(T)X 1 f 3604(that)X 3746(are)X 2 f 3867(close)X 1 f 864 1857(to)N 2 f 951(P)X 1 f 1025(under)X 1233(some)X 1427(measure)X 1720(of)X 1812(closeness.)X 2180(The)X 2330(most)X 2510(common)X 2814(measure)X 3106(of)X 3197(closeness)X 3524(is)X 3601(known)X 3843(as)X 3934(the)X 864 1977(edit)N 1012(distance)X 1303(\(also)X 1487(the)X 1613(Levenshtein)X 2033(measure)X 2329([Le66]\).)X 2643(A)X 2729(string)X 2 f 2939(P)X 1 f 3016(is)X 3098(said)X 3256(to)X 3347(be)X 3452(of)X 3548(distance)X 2 f 3840(k)X 1 f 3905(to)X 3996(a)X 864 2097(string)N 2 f 1067(Q)X 1 f 1146(if)X 1216(we)X 1331(can)X 1464(transform)X 2 f 1797(P)X 1 f 1867(to)X 1950(be)X 2046(equal)X 2240(to)X 2 f 2322(Q)X 1 f 2400(with)X 2562(a)X 2618(sequence)X 2933(of)X 2 f 3020(k)X 1 f 3076(insertions)X 3407(of)X 3494(single)X 3705(characters)X 864 2217(in)N 955(\(arbitrary)X 1289(places)X 1520(in\))X 2 f 1639(P)X 1 f 1688(,)X 1738(deletions)X 2057(of)X 2154(single)X 2375(characters)X 2732(in)X 2 f 2824(P)X 1 
f 2873(,)X 2923(or)X 3020(substitutions)X 3453(of)X 3550(characters.)X 3947(In)X 864 2337(some)N 1070(cases)X 1277(we)X 1408(may)X 1583(want)X 1776(to)X 1875(de\256ne)X 2108(closeness)X 2448(differently.)X 2864(For)X 3012(example,)X 3340(a)X 3412(policeman)X 3782(may)X 3956(be)X 864 2457(searching)N 1202(for)X 1326(a)X 1392(license)X 1645(plate)X 1831(ABC123)X 2145(with)X 2317(the)X 2445(knowledge)X 2827(that)X 2977(the)X 3106(letters)X 3333(are)X 3463(correct,)X 3738(but)X 3871(there)X 864 2577(may)N 1029(be)X 1132(an)X 1235(error)X 1419(with)X 1588(the)X 1713(numbers.)X 2056(In)X 2150(this)X 2292(case,)X 2478(a)X 2540(string)X 2748(is)X 2827(of)X 2920(distance)X 3209(1)X 3275(to)X 3363(ABC123)X 3673(if)X 3748(only)X 3916(one)X 864 2697(error)N 1044(occurs)X 1277(and)X 1416(it)X 1483(is)X 1560(within)X 1788(the)X 1910(digits)X 2111(area.)X 2310(Maybe)X 2557(there)X 2742(are)X 2865(always)X 3112(3)X 3176(digits)X 3377(in)X 3463(a)X 3523(license)X 3770(plate,)X 3970(in)X 864 2817(which)N 1085(case)X 1249(only)X 1416(substitutions)X 1843(are)X 1966(allowed.)X 2284(Sometimes)X 2663(one)X 2803(wants)X 3014(to)X 3100(vary)X 3267(the)X 3389(cost)X 3542(of)X 3633(the)X 3755(different)X 864 2937(edit)N 1004(operations,)X 1378(say)X 1505(deletions)X 1814(cost)X 1963(3,)X 2043(insertions)X 2374(2,)X 2454(and)X 2590(substitutions)X 3013(1.)X 1064 3090(Many)N 1294(different)X 1615(approximate)X 2060(string-matching)X 2611(algorithms)X 2997(have)X 3193(been)X 3389(suggested)X 3749(\([CL90],)X 864 3210([GG88],)N 1156([GP90],)X 1434([HD80],)X 1726([LV88],)X 2009([LV89],)X 2292([My86],)X 2579([TU90],)X 2861(and)X 2998([Uk85a])X 3287(is)X 3361(a)X 3418(partial)X 3644(list\).)X 3829(In)X 3917(this)X 864 3330(paper)N 1066(we)X 1183(present)X 1438(a)X 1497(new)X 1654(algorithm)X 1988(which)X 2207(is)X 2283(very)X 2450(fast)X 2590(in)X 2676(practice,)X 2975(reasonably)X 3347(simple)X 3584(to)X 3670(implement,)X 864 3450(and)N 1004(supports)X 1299(a)X 1359(large)X 1544(number)X 
of variations of the approximate string-matching problem. The algorithm is based on a numeric scheme for exact string matching developed by Baeza-Yates and Gonnet [BG89] (see also [BG91]). The algorithm can handle several variations of measures and most of the common types of queries, including arbitrary regular expressions. In our experiments, the algorithm was at least twice as fast as other algorithms we tested (which are not as flexible), and for many cases an order of magnitude faster. For example, finding all occurrences of Homogenos allowing two errors in a one megabyte bibliographic text takes about 0.4 seconds on a SUN SparcStation II, which is about twice as fast as running the program egrep (which will not find anything because of the misspelling). We actually used this example and found a misspelling in our text.

The paper is organized as follows. We first describe
the algorithm for the pure string-matching problem (i.e., the pattern is a simple string). In Section 3, we present many variations and extensions of the basic algorithm, culminating with matching arbitrary regular expressions with errors. Experimental results are given in Section 4. In Section 5 we describe a tool called agrep for approximate string matching based on the algorithm. Agrep is available through anonymous ftp from cs.arizona.edu.

2. The Algorithm

We first describe the case of exact string matching. The algorithm for this case is identical with that of Baeza-Yates and Gonnet [BG89]. We then show how to extend the algorithm to search with errors. We then describe how to speed up the search with errors by using an exact search most of the time.

2.1. Exact Matching

Let $R$ be a bit array of size $m$ (the size of the
pattern). We denote by $R_j$ the value of the array $R$ after the $j$-th character of the text has been processed. The array $R_j$ contains information about all matches of prefixes of $P$ that end at $j$. More precisely, $R_j[i] = 1$ if the first $i$ characters of the pattern match exactly the last $i$ characters up to $j$ in the text (i.e., $p_1 p_2 \ldots p_i = t_{j-i+1} t_{j-i+2} \ldots t_j$). When we read $t_{j+1}$ we need to determine whether $t_{j+1}$ can extend any of the partial matches so far. For each $i$ such that $R_j[i] = 1$ we
need to check whether $t_{j+1}$ is equal to $p_{i+1}$. If $R_j[i] = 0$ then there is no match up to $i$ and there cannot be a match up to $i+1$. If $t_{j+1} = p_1$ then $R_{j+1}[1] = 1$. If $R_{j+1}[m] = 1$ then we have a complete match, starting at $j - m + 2$, and we output it. The transition from $R_j$ to $R_{j+1}$ can be summarized as follows:

Initially, $R_0[i] = 0$ for all $i$, $1 \le i \le m$; $R_0[0] = 1$ (to avoid having a special case for $i = 1$).

$$R_{j+1}[i] = \begin{cases} 1 & \text{if } R_j[i-1] = 1 \text{ and } p_i = t_{j+1} \\ 0 & \text{otherwise} \end{cases}$$

If $R_{j+1}[m] = 1$ then we output a match at $j - m + 2$.

This transition, which we have to compute once for every text character, seems quite complicated. Other fast string-matching algorithms avoid the need to maintain the whole array by storing only the best match so far and some more information that depends on the pattern. The main observation about this transition, due to Baeza-Yates and Gonnet [BG89], is that it can be computed very fast in practice as follows. Let the alphabet be $\Sigma = s_1, s_2, \ldots, s_{|\Sigma|}$. For each character $s_i$
in the alphabet we construct a bit array $S_i$ of size $m$ such that $S_i[r] = 1$ if $p_r = s_i$. (It is sufficient to construct the $S$ arrays only for the characters that appear in the pattern.) In other words, $S_i$ denotes the indices in the pattern that contain $s_i$. It is easy to verify now that the transition from $R_j$ to $R_{j+1}$ amounts to no more than a right shift of $R_j$ and an AND operation with $S_i$, where $s_i = t_{j+1}$. So, each transition can be executed with only two simple arithmetic operations, a shift and an AND. We assume that the right shift fills the first position with a 1. If only 0-filled
shifts are available (as is the case with C), then we can add one more OR operation with a mask that has one bit. (Baeza-Yates and Gonnet [BG89] used 0 to indicate a match and an OR operation instead of an AND; that way, 0-filled shifts are sufficient. This is counterintuitive to explain, so we opted for the easier definition.) An example is given in Figure 1a, where the pattern is aabac and the text is aabaacaabacab. The masks for a, b, and c are given in Figure 1b.

        a a b a a c a a b a c a b          a b c
    a   1 1 0 1 1 0 1 1 0 1 0 1 0          1 0 0
    a   0 1 0 0 1 0 0 1 0 0 0 0 0          1 0 0
    b   0 0 1 0 0 0 0 0 1 0 0 0 0          0 1 0
    a   0 0 0 1 0 0 0 0 0 1 0 0 0          1 0 0
    c   0 0 0 0 0 0 0 0 0 0 1 0 0          0 0 1
                    a.                       b.

Figure 1: An example of exact matching and the corresponding masks.

The discussion above assumes, of course, that the pattern's size is no more than the word size, which is often the case. If the pattern's size is twice the word size, then 4 arithmetic operations will suffice. Patterns of more than 64 characters are quite rare in practice, although there are applications for which they can appear. We discuss this issue further in Section 3, but for now we'll assume that the pattern's size is no more than the word size. This algorithm is clearly very easy to implement. Its running time is totally predictable because it depends only on the size of the text (assuming again that the pattern fits into a word) and not on the actual text or the alphabet.

2.2. Matching With Errors

We now show how to adapt the previous algorithm to allow errors. (Baeza-Yates and Gonnet [BG89] showed how to handle only mismatches by essentially counting $k$ of them with a $\log_2 k$ size counter, but they did not handle insertions or deletions.) We start with a very simple case: only one insertion is allowed into the pattern at any position. In other words, we want to find all intervals of size at most $m + 1$ in the text that contain the pattern as a subsequence. We define the $R$ and $S$ arrays as before, but now we have two possibilities for each prefix match. We can have an exact match or a match with one insertion. Therefore, we introduce another array, denoted by $R^1_j$, which indicates all possible matches up to $t_j$ with at most one insertion. More precisely,
$R^1_j[i] = 1$ if the first $i$ characters of the pattern match $i$ of the last $i + 1$ characters up to $j$ in the text. If we can maintain both $R$ and $R^1$ then we can find all matches with at most one insertion: $R_j[m] = 1$ indicates that there is an exact match and $R^1_j[m] = 1$ indicates that there is a match with at most one insertion (sometimes both will equal 1 at the same time).

The transition for the $R$ array is the same as before. We need only to specify the transition for $R^1$. There are two cases for a match with at most one insertion of the first $i$ characters of $P$ up to $t_{j+1}$:

I1. There is an exact match
of the first $i$ characters up to $t_j$. In this case, inserting $t_{j+1}$ at the end of the exact match creates a match with one insertion.

I2. There is a match of the first $i - 1$ characters up to $t_j$ with one insertion and $t_{j+1} = p_i$. In this case, the insertion is somewhere inside the pattern and not at the end.

Case I1 can be handled by just copying the value of $R$ to $R^1$ and case I2 can be handled with a right shift of $R^1$ and an AND operation with $S_i$ such that $s_i = t_{j+1}$. So, to compute $R^1_j$ we need one additional shift (the shift of $R$ is done already), one AND
operation and one OR operation. An example (with the same pattern and text as the example for the exact matching) is given in Figure 2.

Consider now allowing one deletion from the pattern (and no insertions). We will define $R$, $R^1$ (which now indicates one deletion), and $S$ as before. There are again two cases for a match with at most one deletion of the first $i$ characters of $P$ up to $t_{j+1}$:

D1. There is an exact match of the first $i - 1$ characters up to $t_{j+1}$ (which is indicated by the new value of the $R$ array, $R_{j+1}[i - 1]$). This case corresponds to deleting $p_i$ and matching the first $i - 1$ characters.

D2. There is a match of the first $i - 1$ characters up to $t_j$ with one deletion and $t_{j+1} = p_i$. In this case, the deletion is somewhere inside the pattern and not at the end.

Case D2 is handled as before (it is exactly the same), and case D1 is handled by a right shift of the new value of $R_{j+1}$.

Finally, let's consider a substitution. That is, we allow replacing one character of $P$ with one character of $T$. (We can achieve substitution with one deletion and one insertion, but in many cases we want substitution to count as only one error.) We again have two cases:

S1. There is an exact match of the first $i - 1$ characters up to $t_j$. This case corresponds to substituting $t_{j+1}$ with $p_i$ (whether or not they are equal; the equality will be
indicated in $R$) and matching the first $i - 1$ characters.

S2. There is a match of the first $i - 1$ characters up to $t_j$ with one substitution and $t_{j+1} = p_i$. In this case, the substitution is somewhere inside the pattern and not at the end.

        a a b a a c a a b a c a b          a a b a a c a a b a c a b
    a   1 1 0 1 1 0 1 1 0 1 0 1 0      a   1 1 1 1 1 1 1 1 1 1 1 1 1
    a   0 1 0 0 1 0 0 1 0 0 0 0 0      a   0 1 1 1 1 1 1 1 1 1 0 0 0
    b   0 0 1 0 0 0 0 0 1 0 0 0 0      b   0 0 1 1 0 0 0 0 1 1 0 0 0
    a   0 0 0 1 0 0 0 0 0 1 0 0 0      a   0 0 0 1 1 0 0 0 0 1 1 0 0
    c   0 0 0 0 0 0 0 0 0 0 1 0 0      c   0 0 0 0 0 0 0 0 0 0 1 1 0
                    R                                  R^1

Figure 2: An example for approximate matching with one insertion.

Case S2 is again the same. Case S1 corresponds to looking at $R_j[i - 1]$ as opposed to looking at $R_{j+1}[i - 1]$ in case D1. Still very few operations cover one substitution as well.

We are now ready to consider the general case of up to $k$ errors, where an error can be either an insertion, a deletion, or a substitution (the Levenshtein or the edit-distance measure). Overall, instead of one additional $R^1$ array, we will maintain $k$ additional arrays $R^1, R^2, \ldots, R^k$, such that array
$R^d$ stores all possible matches with up to $d$ errors. We need to determine the transition from array $R^d_j$ to $R^d_{j+1}$. There are 4 possibilities for obtaining a match of the first $i$ characters with $\le d$ errors up to $t_{j+1}$:

1. There is a match of the first $i - 1$ characters with $\le d$ errors up to $t_j$ and $t_{j+1} = p_i$. This case corresponds to matching $t_{j+1}$.

2. There is a match of the first $i - 1$ characters with $\le d - 1$ errors up to $t_j$. This case corresponds to substituting $t_{j+1}$.

3. There is a match of the first $i - 1$ characters with
$\le d - 1$ errors up to $t_{j+1}$. This case corresponds to deleting $p_i$.

4. There is a match of the first $i$ characters with $\le d - 1$ errors up to $t_j$. This case corresponds to inserting $t_{j+1}$.

Let's denote $R$ as $R^0$, and assume that $t_{j+1} = s_c$. Overall, we have the following expression for $R^d_{j+1}$:

$$R^d_0 = \underbrace{11\ldots1}_{d \text{ ones}}00\ldots000.$$

$$\begin{aligned}
R^d_{j+1} &= \mathrm{Rshift}[R^d_j] \text{ AND } S_c \text{ OR } \mathrm{Rshift}[R^{d-1}_j] \text{ OR } \mathrm{Rshift}[R^{d-1}_{j+1}] \text{ OR } R^{d-1}_j \\
          &= \mathrm{Rshift}[R^d_j] \text{ AND } S_c \text{ OR } \mathrm{Rshift}[R^{d-1}_j \text{ OR } R^{d-1}_{j+1}] \text{ OR } R^{d-1}_j.
\end{aligned} \tag{2.1}$$

Overall, we have a total of two shifts, one AND, and three ORs for each $R^d$. There are $k + 1$ arrays, so the total amount of work is $O((k + 1) n)$. An important feature of this algorithm is that it can be relatively easily extended to several more complicated patterns. This is the topic of Section 3.

2.3. An Improvement to the Main Algorithm

If the number of errors is small compared to the size of the pattern, then we can improve the running time sometimes by what we call the partition approach. Suppose again that the pattern $P$ is of size
1096(m)X 1 f 1174(and)X 1310(that)X 1450(at)X 1528(most)X 2 f 1703(k)X 1 f 1759(errors)X 1967(are)X 2086(allowed.)X 2400(Let)X 2 f 2527(r)X 1 f 2 f 9 f 2584(=)X 1 f 2 f 10 f 2648(Q)X 2 f 2701 5071(k)N 9 f 2750(+)X 1 f 2794(1)X 2 f 2739 4967(m)N 1 f 10 f 2690 4991(hhhh)N 2 f 10 f 2868 5015(P)N 1 f (,)S 2928(and)X 3064(let)X 2 f 3164(P)X 1 f 7 s 3222 5031(1)N 10 s 3256 5015(,)N 2 f 3295(P)X 1 f 7 s 3353 5031(2)N 10 s 3387 5015(,)N 2 f 3426(...)X 1 f (,)S 2 f 3525(P)X 7 s 3574 5031(k)N 9 f 3608(+)X 1 f 3639(1)X 10 s 3693 5015(be)N 3789(the)X 3908(\256rst)X 2 f 864 5175(k)N 9 f 913(+)X 1 f 957(1)X 1022(blocks)X 1256(of)X 2 f 1347(P)X 1 f 1420(each)X 1592(of)X 1683(size)X 2 f 1832(r)X 1 f 1863(.)X 1927(In)X 2018(other)X 2207(words,)X 2 f 2447(P)X 1 f 7 s 2505 5191(1)N 10 s 2 f 9 f 2559 5175(=)N 1 f 2 f 2623(p)X 1 f 7 s 2672 5191(1)N 2 f 10 s 2706 5175(p)N 1 f 7 s 2755 5191(2)N 10 s 2822 5151(.)N 2862(.)X 2902(.)X 2 f 2955 5175(p)N 7 s 5191(r)Y 1 f 10 s 3023 5175(,)N 3067(...,)X 2 f 3171(P)X 7 s 3224 5191(j)N 10 s 9 f 3259 5175(=)N 2 f 3316(p)X 1 f 7 s 3365 5191(\()N 2 f 3388(j)X 9 f 3413(-)X 1 f 3444(1\))X 2 f 3491(r)X 9 f 3522(+)X 1 f 3553(1)X 10 s 3620 5151(.)N 3660(.)X 3700(.)X 2 f 3753 5175(p)N 7 s 3797 5191(jr)N 1 f 10 s 3841 5175(.)N 3905(If)X 2 f 3983(P)X 1 f 864 5295(matches)N 1153(the)X 1277(text)X 1423(with)X 1592(at)X 1677(most)X 2 f 1859(k)X 1 f 1922(errors,)X 2157(then)X 2322(at)X 2407(least)X 2581(one)X 2724(of)X 2818(the)X 2 f 2943(P)X 7 s 2996 5311(j)N 1 f 10 s 3018 5295('s)N 3103(must)X 3285(match)X 3508(the)X 3633(text)X 3780(exactly.)X 864 5415(We)N 998(can)X 1132(search)X 1360(for)X 1476(all)X 2 f 1578(P)X 7 s 1631 5431(j)N 1 f 10 s 1653 5415('s)N 1733(at)X 1813(the)X 1933(same)X 2120(time)X 2284(\(we)X 2427(discuss)X 2680(how)X 2840(to)X 2923(do)X 3024(that)X 3165(in)X 3248(the)X 3367(next)X 3526(paragraph\))X 3896(and,)X 864 5535(if)N 941(one)X 1085(of)X 1180(them)X 1368(matches,)X 1679(then)X 1845(we)X 1967(check)X 2183(the)X 2309(whole)X 
2533(pattern)X 2784(directly)X 3057(\(using)X 3286(the)X 3413(scheme)X 3683(in)X 3774(2.2\))X 3930(but)X 864 5655(only)N 1028(within)X 1254(a)X 1312(neighborhood)X 1779(of)X 1868(size)X 2 f 2015(m)X 1 f 2095(from)X 2273(the)X 2393(position)X 2672(of)X 2761(the)X 2881(match.)X 3139(Since)X 3339(we)X 3455(are)X 3576(looking)X 3841(for)X 3956(an)X 7 p
%%Page: 7 8
10 s 10 xH 0 xS 1 f 3 f 2428 696(7)N 1 f 864 984(exact)N 1057(match,)X 1296(there)X 1480(is)X 1556(no)X 1659(need)X 1834(to)X 1919(maintain)X 2222(all)X 2 f 2326(k)X 1 f 2386(of)X 2477(the)X 2 f 2599(R)X 7 s 2657 952(d)N 1 f 10 s 2715 984(vectors.)N 3011(This)X 3177(scheme)X 3442(will)X 3590(run)X 3721(fast)X 3861(if)X 3934(the)X 864 1104(number)N 1140(of)X 1238(exact)X 1439(matches)X 1733(to)X 1826(any)X 1972(one)X 2118(of)X 2215(the)X 2 f 2343(P)X 7 s 2396 1120(j)N 1 f 10 s 2418 1104('s)N 2506(is)X 2589(not)X 2721(too)X 2853(high.)X 3065(The)X 3220(number)X 3495(of)X 3592(such)X 3769(matches)X 864 1224(depend)N 1119(on)X 1222(many)X 1423(factors)X 1665(including)X 1990(the)X 2111(size)X 2259(of)X 2349(the)X 2470(alphabet,)X 2785(the)X 2906(actual)X 3121(text,)X 3284(and)X 3423(the)X 3544(values)X 3772(of)X 2 f 3862(r)X 1 f 3916(and)X 2 f 864 1344(m)N 1 f 922(.)X 985(For)X 1119(example,)X 1434(if)X 2 f 1506(r)X 9 f 1556(=)X 1 f 1613(1,)X 1696(then)X 1857(we)X 1973(will)X 2119(need)X 2293(to)X 2377(check)X 2587(any)X 2725(time)X 2889(there)X 3072(is)X 3147(a)X 3205(character)X 3523(match,)X 3761(which)X 3979(is)X 864 1464(probably)N 1174(too)X 1301(often.)X 1531(On)X 1654(the)X 1777(other)X 1967(hand,)X 2168(if)X 2 f 2242(r)X 9 f 2292(=)X 1 f 2349(3,)X 2 f 2434(m)X 9 f 2511(=)X 1 f 2568(12)X 2673(\(which)X 2921(implies)X 2 f 3181(k)X 9 f 3236(=)X 1 f 3293(3\),)X 3406(the)X 3530(alphabet)X 3828(size)X 3979(is)X 864 1584(26,)N 990(and)X 1132(the)X 1256(text)X 1402(is)X 1481(uniformly)X 1826(random)X 2096(\(i.e.,)X 2266(each)X 2439(character)X 2760(appears)X 3031(with)X 3198(the)X 3321(same)X
3511(probability\),)X 3934(the)X 864 1704(expected)N 1171(number)X 1437(of)X 1525(matches)X 1809(of)X 1897(any)X 2034(of)X 2123(the)X 2 f 2243(P)X 7 s 2296 1720(j)N 1 f 10 s 2318 1704('s)N 2398(is)X 2473(about)X 2673(0.02%)X 2902(of)X 2991(the)X 3111(time.)X 3315(In)X 3404(this)X 3541(case,)X 3722(it)X 3788(is)X 3863(obvi-)X 864 1824(ously)N 1063(advantageous)X 1525(to)X 1612(search)X 1843(for)X 1962(exact)X 2157(matches)X 2445(and)X 2586(use)X 2718(the)X 2841(approximate)X 3267(scheme)X 3533(only)X 3700(at)X 3783(the)X 3906(rare)X 864 1944(occasions)N 1199(where)X 1419(a)X 1478(match)X 1697(occurs.)X 1970(The)X 2118(running)X 2390(time)X 2555(in)X 2640(this)X 2778(case)X 2940(is)X 3016(essentially)X 3377(the)X 3498(same)X 3686(as)X 3776(the)X 3898(run-)X 864 2064(ning)N 1030(time)X 1196(of)X 1287(a)X 1347(search)X 1576(without)X 1843(errors.)X 2094(Experiments)X 2522(using)X 2718(this)X 2856(partition)X 3150(scheme)X 3414(for)X 3531(different)X 3831(alpha-)X 864 2184(bet)N 982(sizes)X 1158(are)X 1277(given)X 1475(in)X 1557(Section)X 1817(4.)X 1064 2337(The)N 1219(main)X 1409(advantage)X 1765(of)X 1862(this)X 2007(scheme)X 2278(is)X 2361(that)X 2511(the)X 2639(algorithm)X 2980(for)X 3104(exact)X 3304(matching)X 3632(presented)X 3970(in)X 864 2457(Section)N 1125(2.1)X 1246(can)X 1379(be)X 1476(adapted)X 1747(in)X 1830(an)X 1926(elegant)X 2178(way)X 2332(to)X 2414(support)X 2674(it.)X 2778(We)X 2910(illustrate)X 3210(the)X 3328(idea)X 3482(with)X 3644(an)X 3740(example.)X 864 2577(Suppose)N 1171(that)X 1327(the)X 1461(pattern)X 1720(is)X 1809(ABCDEFGHIJKL)X 2441(\()X 2 f 2468(m)X 9 f 2545(=)X 1 f 2602(12\))X 2745(and)X 2 f 2898(k)X 9 f 2953(=)X 1 f 3010(3.)X 3127(We)X 3276(divide)X 3513(the)X 3648(pattern)X 3908(into)X 2 f 864 2697(k)N 9 f 913(+)X 1 f 957(1)X 2 f 9 f 1010(=)X 1 f 1067(4)X 1132(blocks:)X 1388(ABC,)X 1597(DEF,)X 1793(GHI,)X 1981(and)X 2122(JKL.)X 2325(We)X 2462(need)X 2639(to)X 2726(\256nd)X 2875(whether)X 3159(any)X 3300(of)X 3392(them)X 
3577(appears)X 3848(in)X 3934(the)X 864 2817(text.)N 1049(We)X 1186(create)X 1404(one)X 1545(combined)X 1886(pattern)X 2134(by)X 2239(interleaving)X 2647(the)X 2771(4)X 2837(blocks:)X 3094(ADGJBEHKCFIL.)X 3756(We)X 3894(then)X 864 2937(build)N 1049(the)X 1168(mask)X 1358(vector)X 2 f 1580(R)X 1 f 1650(as)X 1738(usual)X 1928(for)X 2043(this)X 2179(interleaved)X 2557(pattern)X 2801(\(see)X 2951(Section)X 3211(2.1\).)X 3398(The)X 3543(only)X 3705(difference)X 864 3057(is)N 938(that,)X 1099(instead)X 1347(of)X 1435(shifting)X 1700(by)X 1801(one)X 1938(in)X 2021(each)X 2190(step,)X 2360(we)X 2475(shift)X 2638(by)X 2739(four!)X 2942(There)X 3152(is)X 3227(a)X 3285(match)X 3503(if)X 3574(any)X 3712(of)X 3801(the)X 3921(last)X 864 3177(four)N 1026(bits)X 1169(is)X 1250(1.)X 1358(\(When)X 1605(we)X 1727(shift)X 1897(we)X 2019(need)X 2199(to)X 2289(\256ll)X 2405(the)X 2531(\256rst)X 2683(four)X 2845(positions)X 3161(with)X 3331(1's,)X 3476(or)X 3570(better)X 3780(yet,)X 3925(use)X 864 3297(shift-OR.\))N 1238(Thus,)X 1445(the)X 1570(match)X 1793(for)X 1914(all)X 2021(blocks)X 2257(can)X 2396(be)X 2499(done)X 2682(exactly)X 2941(the)X 3066(same)X 3258(way)X 3419(as)X 3513(regular)X 3769(matches)X 864 3417(and)N 1000(it)X 1064(takes)X 1249(essentially)X 1607(the)X 1725(same)X 1910(running)X 2179(time.)X 6 f 14 s 864 3689(3.)N 1019(Extensions)X 1 f 10 s 864 3842(An)N 984(important)X 1317(feature)X 1563(of)X 1653(our)X 1783(algorithm)X 2117(is)X 2193(its)X 2291(\257exibility.)X 2664(In)X 2754(addition)X 3039(to)X 3124(asking)X 3356(about)X 3557(a)X 3616(single)X 3830(string,)X 864 3962(the)N 994(algorithm)X 1337(supports)X 1640(range)X 1851(of)X 1950(characters)X 2309(\(e.g.,)X 2503(``0-9''\),)X 2796(complements)X 3254(\(e.g.,)X 3448(everything)X 3822(except)X 864 4082(blank\),)N 1110(arbitrary)X 1408(sets)X 1549(of)X 1637(characters)X 1985(\(e.g.,)X 2169({a,e,i,o,u}\),)X 2567(unlimited)X 2894(``wild)X 3111(cards,'')X 3376(and)X 3513(combinations)X 3965(of)X 864 4202(the)N 
985(above.)X 1240(Searching)X 1584(for)X 1701(several)X 1952(strings)X 2188(at)X 2269(the)X 2390(same)X 2578(time)X 2743(is)X 2819(also)X 2971(possible,)X 3276(although)X 3578(the)X 3698(size)X 3845(of)X 3934(the)X 864 4322(pattern)N 1107(becomes)X 1408(the)X 1526(sum)X 1679(of)X 1766(the)X 1884(sizes)X 2060(of)X 2147(the)X 2265(different)X 2563(strings)X 2797(\(and)X 2961(might)X 3168(thus)X 3322(require)X 3571(more)X 3757(than)X 3916(one)X 864 4442(word)N 1050(to)X 1133(represent\).)X 1516(The)X 1662(algorithm)X 1994(can)X 2127(be)X 2224(extended)X 2535(to)X 2618(support)X 2879(any)X 3016(regular)X 3265(expression.)X 3669(We)X 3801(discuss)X 864 4562(regular)N 1112(expressions)X 1506(brie\257y)X 1735(in)X 1817(section)X 2064(3.8.)X 6 f 12 s 864 4802(3.1.)N 1078(Sets)X 1307(of)X 1425(Characters)X 1 f 10 s 864 4955(Replacing)N 1213(one)X 1353(character)X 1673(with)X 1839(a)X 1899(set)X 2012(of)X 2103(allowable)X 2439(characters)X 2790(is)X 2867(very)X 3034(easy)X 3202(to)X 3289(achieve)X 3560(with)X 3727(this)X 3867(algo-)X 864 5075(rithm)N 1064(\(as)X 1185(was)X 1336(shown)X 1571(by)X 1677(Baeza-Yates)X 2110(and)X 2252(Gonnet)X 2514([BG89]\).)X 2852(Suppose)X 3149(that)X 3295(the)X 3419(pattern)X 3668(we)X 3788(want)X 3970(to)X 864 5195(\256nd)N 1011(is)X 2 f 1087(P)X 1 f 7 s 1145 5211(1)N 10 s 1202 5195(followed)N 1510(by)X 1613(one)X 1752(digit)X 1921(followed)X 2229(by)X 2 f 2332(P)X 1 f 7 s 2390 5211(2)N 10 s 2447 5195(and)N 2586(that)X 2729(we)X 2846(allow)X 3047(up)X 3150(to)X 2 f 3235(k)X 1 f 3294(errors.)X 3545(We)X 3680(denote)X 3917(this)X 864 5315(pattern)N 1109(by)X 2 f 1211(P)X 1 f 7 s 1269 5331(1)N 10 s 1303 5315([0)N 2 f 9 f 1370(-)X 1 f 1414(9])X 2 f 1481(P)X 1 f 7 s 1539 5331(2)N 10 s 1573 5315(.)N 1635(The)X 1782(only)X 1946(thing)X 2132(we)X 2248(need)X 2422(to)X 2506(do)X 2607(to)X 2690(accept)X 2917(a)X 2974(set)X 3084(of)X 3172(characters)X 3520(is)X 3594(to)X 3677(include)X 3934(the)X 864 5435(position)N 1142(of)X 1230([0)X 9 f 1297(-)X 1 f 
1341(9])X 1429(in)X 1512(the)X 2 f 1631(S)X 1 f 1692(arrays)X 1910(for)X 2026(all)X 2128(digits.)X 2367(That)X 2536(is,)X 2631(in)X 2715(the)X 2835(preprocessing)X 3303(stage,)X 3510(when)X 3706(we)X 3822(decide)X 864 5555(for)N 982(each)X 1154(character)X 1474(the)X 1596(positions)X 1908(that)X 2051(this)X 2189(character)X 2508(matches)X 2794(in)X 2879(the)X 3000(pattern,)X 3266(we)X 3383(include)X 3642(all)X 3745(the)X 3866(char-)X 864 5675(acters)N 1089(in)X 1188(the)X 1323(set)X 1449(within)X 1690(that)X 1847(position.)X 2181(The)X 2343(rest)X 2496(of)X 2600(the)X 2735(algorithm)X 3083(is)X 3174(identical)X 3488(with)X 3668(the)X 3804(regular)X 8 p
%%Page: 8 9
10 s 10 xH 0 xS 1 f 3 f 2428 696(8)N 1 f 864 984(algorithm.)N 1242(A)X 1327(complement)X 1750(of)X 1844(a)X 1907(character)X 2230(is)X 2310(a)X 2373(special)X 2623(case)X 2789(of)X 2883(a)X 2946(set)X 3062(of)X 3156(characters)X 3510(and)X 3653(it)X 3724(can)X 3863(obvi-)X 864 1104(ously)N 1057(be)X 1153(handled)X 1427(in)X 1509(the)X 1627(same)X 1812(way.)X 6 f 12 s 864 1344(3.2.)N 1078(Wild)X 1309(Cards)X 1 f 10 s 864 1497(A)N 944(single)X 1157(wild)X 1321(card)X 1482(is)X 1557(a)X 1615(symbol)X 1872(that)X 2014(matches)X 2299(all)X 2402(characters.)X 2792(As)X 2904(such,)X 3094(it)X 3161(is)X 3237(a)X 3296(special)X 3542(case)X 3704(of)X 3794(a)X 3853(set)X 3965(of)X 864 1617(characters)N 1216(and)X 1357(can)X 1494(be)X 1595(handled)X 1874(as)X 1966(we)X 2085(discussed)X 2417(in)X 2504(the)X 2627(previous)X 2927(section.)X 3218(Sometimes,)X 3617(however,)X 3938(we)X 864 1737(want)N 1044(to)X 1130(indicate)X 1408(that)X 1552(we)X 1670(allow)X 1872(an)X 1972(unbounded)X 2352(number)X 2621(of)X 2712(characters)X 3063(to)X 3149(appear)X 3388(in)X 3474(the)X 3596(middle)X 3842(of)X 3934(the)X 864 1857(pattern)N 1109(\(or)X 1225(even)X 1399(do)X 1501(it)X 1567(several)X 1817(times)X 2012(in)X 2096(the)X 2216(middle)X 2459(of)X 2547(the)X 2666(pattern\).)X 2977(This)X 3140(case)X 3300(requires)X
3580(modifying)X 3934(the)X 864 1977(algorithm)N 1196(slightly.)X 1496(Let)X 1624(the)X 1743(pattern)X 1987(be)X 2 f 2084(P)X 9 f 2152(=)X 2 f 2209(p)X 1 f 7 s 2258 1993(1)N 2 f 10 s 2292 1977(p)N 1 f 7 s 2341 1993(2)N 10 s 2395 1953(.)N 2435(.)X 2475(.)X 2 f 2515 1977(p)N 7 s 1993(m)Y 1 f 10 s 2601 1977(,)N 2642(and)X 2779(assume)X 3036(that)X 3177(the)X 3296(positions)X 3605(of)X 3693(`#')X 3809(\(which)X 864 2097(indicates)N 1171(unlimited)X 1499(wild)X 1663(cards)X 1855(in)X 1938(agrep\))X 2165(are)X 2285(after)X 2454(the)X 2573(characters)X 2 f 2921(p)X 7 s 2113(i)Y 1 f 4 s 2982 2124(1)N 10 s 3008 2097(,)N 2 f 3047(p)X 7 s 2113(i)Y 1 f 4 s 3108 2124(2)N 10 s 3134 2097(,)N 2 f 3173(...)X 1 f (,)S 2 f 3272(p)X 7 s 2113(i)Y 4 s 3328 2124(s)N 1 f 10 s 3350 2097(.)N 3411(\(There)X 3647(is)X 3721(no)X 3822(reason)X 864 2228(to)N 946(have)X 1118(two)X 1258(#'s)X 1376(in)X 1458(a)X 1514(row.\))X 1726(Let)X 2 f 1853(S)X 7 s 1902 2196(#)N 1 f 10 s 1956 2228(be)N 2052(a)X 2108(bit)X 2212(array)X 2398(that)X 2538(has)X 2665(1)X 2725(in)X 2808(exactly)X 3061(the)X 3180(positions)X 2 f 3489(i)X 1 f 7 s 3520 2244(1)N 10 s 3554 2228(,)N 2 f 3593(i)X 1 f 7 s 3624 2244(2)N 10 s 3658 2228(,)N 2 f 3697(...)X 1 f (,)S 2 f 3796(i)X 7 s 3818 2244(s)N 1 f 10 s 3846 2228(.)N 3907(The)X 864 2348(effect)N 1072(of)X 1163(putting)X 1413(a)X 1473(`#')X 1591(following)X 2 f 1926(p)X 7 s 2364(i)Y 1 f 10 s 2012 2348(can)N 2147(be)X 2246(de\256ned)X 2505(as)X 2595(follows.)X 2898(If)X 2975(we)X 3092(are)X 3214(scanning)X 2 f 3522(t)X 7 s 3548 2364(j)N 1 f 10 s 3593 2348(and)N 3732(we)X 3849(\256nd)X 3996(a)X 864 2468(match)N 1087(with)X 1256(up)X 1363(to)X 2 f 1452(d)X 1 f 1519(errors)X 1734(that)X 1881(ends)X 2055(at)X 2 f 2140(p)X 7 s 2484(i)Y 1 f 10 s 2202 2468(,)N 2249(then)X 2414(later)X 2584(when)X 2785(we)X 2906(scan)X 2 f 3076(t)X 7 s 3098 2484(r)N 1 f 10 s 3126 2468(,)N 3173(for)X 3295(any)X 2 f 3439(r)X 3489(>)X 3562(j)X 1 f 3584(,)X 3632(we)X 3754(can)X 3894(start)X 864 
2588(matching)N 2 f 1189(t)X 7 s 1211 2604(r)N 1 f 10 s 1266 2588(to)N 2 f 1355(p)X 7 s 2604(i)Y 9 f 1420(+)X 1 f 1451(1)X 10 s 1512 2588(no)N 1619(matter)X 1851(how)X 2016(many)X 2220(characters)X 2573(we)X 2693(skipped.)X 2988(In)X 3081(other)X 3272(words,)X 3514(if)X 3589(at)X 3673(some)X 3868(point)X 864 2708(there)N 1045(is)X 1118(a)X 1174(match)X 1390(up)X 1490(to)X 2 f 1572(p)X 7 s 2724(i)Y 1 f 10 s 1654 2708(then)N 1812(this)X 1947(match)X 2163(is)X 2236(always)X 2479(valid)X 2659(later)X 2822(on)X 2922(\(because)X 3224(all)X 3324(the)X 3442(characters)X 3789(later)X 3952(on)X 864 2828(can)N 996(be)X 1092(considered)X 1460(as)X 1547(part)X 1692(of)X 1779(the)X 1897(`#'\).)X 1064 2981(We)N 1205(can)X 1346(adjust)X 1566(the)X 1693(algorithm)X 2033(for)X 2156(this)X 2301(case)X 2470(as)X 2567(follows.)X 2877(At)X 2987(each)X 3165(step,)X 3344(we)X 3468(apply)X 3676(the)X 3804(regular)X 864 3101(algorithm)N 1201(to)X 1289(compute)X 1591(all)X 1697(the)X 2 f 1821(R)X 1 f 1896(arrays.)X 2158(That)X 2330(is,)X 2428(we)X 2547(compute)X 2 f 2848(R)X 7 s 2905 3117(j)N 1 f 2901 3069(0)N 10 s 2935 3101(,)N 2 f 2974(R)X 7 s 3031 3117(j)N 1 f 3027 3069(1)N 10 s 3061 3101(,)N 2 f 3100(...)X 1 f (,)S 2 f 3199(R)X 7 s 3256 3117(j)N 3252 3069(d)N 1 f 10 s 3311 3101(using)N 3509(\(2.1\).)X 3728(Then,)X 3938(for)X 864 3221(each)N 2 f 1034(i)X 1 f 1056(,)X 1098(1)X 2 f 9 f 1151(\243)X 2 f 1208(i)X 9 f 1249(\243)X 2 f 1306(d)X 1 f (,)S 1388(we)X 1504(set)X 2 f 1615(R)X 7 s 1672 3237(j)N 1668 3189(i)N 1 f 10 s 1717 3221(=)N 2 f 1785(R)X 7 s 1842 3237(j)N 1838 3189(i)N 1 f 10 s 2 f 1884 3221(OR)N 1 f 2017([)X 2 f 2044(R)X 7 s 2101 3237(j)N 9 f 2126(-)X 1 f 2157(1)X 2 f 2097 3189(i)N 1 f 10 s 2 f 2217 3221(AND)N 1 f 2 f 2403(S)X 7 s 2452 3189(#)N 1 f 10 s 2486 3221(].)N 2556(This)X 2721(step)X 2873(corresponds)X 3284(to)X 3369(the)X 3490(action)X 3709(``if)X 3835(at)X 3916(any)X 864 3341(point,)N 1068(there)X 1249(is)X 1322(a)X 1378(1)X 1438(entry)X 1623(in)X 2 f 1705(R)X 7 s 1763 
3309(i)N 1 f 10 s 2 f 1805 3341(AND)N 1 f 2 f 1991(S)X 7 s 2040 3309(#)N 1 f 10 s 2074 3341(,)N 2114(then)X 2272(this)X 2407(entry)X 2592(should)X 2825(remain)X 3068(1)X 3128(from)X 3304(now)X 3462(on.'')X 6 f 12 s 864 3581(3.3.)N 1078(Unknown)X 1538(Number)X 1927(of)X 2045(Errors)X 1 f 10 s 864 3821(In)N 972(some)X 1182(cases,)X 1413(we)X 1549(do)X 1671(not)X 1815(know)X 2035(the)X 2175(number)X 2462(of)X 2571(errors)X 2801(a-priori.)X 3124(We)X 3278(would)X 3520(like)X 3682(to)X 3786(\256nd)X 3952(all)X 864 3941 0.3125(occurrences)AN 1283(of)X 1384(the)X 1516(pattern)X 1773(with)X 1949(the)X 2 f 2080(minimal)X 1 f 2375(number)X 2653(of)X 2753(errors)X 2974(possible.)X 3309(The)X 3467(algorithm)X 3811(can)X 3956(be)X 864 4061(extended)N 1178(to)X 1264(this)X 1403(case)X 1567(as)X 1659(follows.)X 1964(We)X 2101(\256rst)X 2250(try)X 2364(to)X 2451(\256nd)X 2600(the)X 2723(pattern)X 2971(with)X 3138(no)X 3243(errors.)X 3496(If)X 3575(we)X 3694(are)X 3818(unsuc-)X 864 4181(cessful,)N 1137(we)X 1261(try)X 1380(with)X 1552(one)X 1698(error,)X 1905(then)X 2073(with)X 2245(three)X 2436(errors,)X 2674(then)X 2842(with)X 3014(7)X 3083(errors,)X 3320(and)X 3465(so)X 3565(on,)X 3694(essentially)X 864 4301(doubling)N 1171(the)X 1292(number)X 1560(of)X 1650(errors)X 1861(\(and)X 2027(adding)X 2269(one\))X 2436(at)X 2518(each)X 2690(attempt.)X 2974(If)X 3052(the)X 3174(number)X 3443(of)X 3534(errors)X 3746(turns)X 3930(out)X 864 4421(to)N 956(be)X 2 f 1062(k)X 1 f 1098(,)X 1148(then)X 1316(the)X 1444(running)X 1723(time)X 1895(will)X 2049(be)X 2 f 2155(O)X 1 f 2226(\(1)X 2306 4397(.)N 2 f 2339 4421(n)N 1 f 2 f 9 f 2405(+)X 1 f 2469(2)X 2522 4397(.)N 2 f 2555 4421(n)N 1 f 2 f 9 f 2621(+)X 1 f 2685(4)X 2738 4397(.)N 2 f 2771 4421(n)N 1 f 2 f 9 f 2837(+)X 1 f 2921 4397(.)N 2961(.)X 3001(.)X 2 f 9 f 3061 4421(+)N 1 f 3125(2)X 2 f 7 s 4389(b)Y 1 f 10 s 3212 4397(.)N 2 f 3245 4421(n)N 1 f 3291(\),)X 3368(where)X 3595(2)X 2 f 7 s 4389(b)Y 1 f 10 s 3699 4421(is)N 3781(the)X 
3908(\256rst)X 864 4541(power)N 1086(of)X 1174(2)X 1235(greater)X 1480(than)X 2 f 1640(k)X 1 f 1676(.)X 1738(In)X 1827(the)X 1947(worst)X 2147(case,)X 2328(we)X 2444(perform)X 2725(4)X 2787(times)X 2982(as)X 3071(many)X 3271(operations)X 3627(as)X 3716(we)X 3832(would)X 864 4661(have)N 1042(had)X 1184(we)X 1304(known)X 2 f 1548(k)X 1 f 1610(\(in)X 1725(most)X 1906(cases,)X 2122(the)X 2246(factor)X 2460(is)X 2539(actually)X 2819(2)X 2885(or)X 2978(3\).)X 3091(This)X 3259(is)X 3338(not)X 3466(desirable,)X 3802(but)X 3930(not)X 864 4781(prohibitive.)N 1275(There)X 1483(are)X 1602(other)X 1787(methods)X 2078(to)X 2160(\256nd)X 2304(the)X 2422(minimum)X 2752(number)X 3017(of)X 3104(errors.)X 6 f 12 s 864 5021(3.4.)N 1078(A)X 1174(Combination)X 1789(of)X 1907(Patterns)X 2317(With)X 2553(and)X 2751(Without)X 3137(Errors)X 1 f 10 s 864 5174(Sometimes)N 1241(we)X 1357(do)X 1459(not)X 1583(want)X 1761(to)X 1845(allow)X 2045(parts)X 2223(of)X 2312(the)X 2432(pattern)X 2677(to)X 2761(have)X 2935(errors.)X 3165(For)X 3298(example,)X 3612(we)X 3729(may)X 3890(look)X 864 5294(for)N 981(license)X 1227(plate)X 1406(ABC123,)X 1733(and)X 1872(we)X 1989(know)X 2190(that)X 2333(the)X 2454(letters)X 2673(are)X 2794(correct)X 3040(but)X 3164(the)X 3284(numbers)X 3582(may)X 3742(have)X 3916(one)X 864 5414(error)N 1052(in)X 1145(them.)X 1376(We)X 1519(denote)X 1764(this)X 1910(pattern)X 2165(by)X 2277(123.)X 2723(We)X 2867(can)X 3011(modify)X 3274(the)X 3404(algorithm)X 3747(to)X 3841(shield)X 864 5534(parts)N 1043(of)X 1133(the)X 1253(pattern)X 1498(from)X 1676(having)X 1916(any)X 2054(errors)X 2264(in)X 2348(them.)X 2570(Let's)X 2757(assume)X 3015(that)X 2 f 3157(I)X 1 f 3206(is)X 3281(the)X 3401(set)X 3512(of)X 3601(indices)X 3850(in)X 3934(the)X 864 5654(pattern)N 1115(where)X 1340(no)X 1448(error)X 1633(is)X 1715(allowed,)X 2018(and)X 2163(let)X 2 f 2272(M)X 1 f 2368(be)X 2473(a)X 2538(masking)X 2838(array)X 3033(\(of)X 3156(size)X 2 f 3310(m)X 1 f 3368(\))X 3424(that)X 3573(has)X 
3709(a)X 3774(0)X 3843(in)X 3934(the)X 9 p
%%Page: 9 10
10 s 10 xH 0 xS 1 f 3 f 2428 696(9)N 1 f 864 984(indices)N 1115(of)X 2 f 1206(I)X 1 f 1257(and)X 1397(a)X 1457(1)X 1521(otherwise.)X 1897(We)X 2033(would)X 2257(like)X 2401(to)X 2486(modify)X 2740(\(2.1\))X 2917(such)X 3087(that)X 3230(insertions,)X 3584(deletions,)X 3916(and)X 864 1104(substitutions)N 1301(can)X 1447(only)X 1623(occur)X 1836(outside)X 2102(of)X 2 f 2204(I)X 1 f 2231(.)X 2306(This)X 2483(is)X 2571(done)X 2762(by)X 2877(masking)X 3183(these)X 3383(cases)X 3588(with)X 2 f 3765(M)X 1 f 3832(.)X 3907(The)X 864 1224(expression)N 1227(in)X 1309(\(2.1\))X 1483(is)X 1556(changed)X 1844(to)X 3878 1416(\(2.2\))N 2 f 1024 1408(R)N 7 s 1081 1424(j)N 9 f 1106(+)X 1 f 1137(1)X 2 f 1077 1376(d)N 10 s 9 f 1217 1408(=)N 1 f 10 f 1314 1368(I)N 1314 1448(L)N 2 f 1354 1408(Rshift)N 1 f 1546([)X 2 f 1573(R)X 7 s 1630 1424(j)N 1626 1376(d)N 1 f 10 s 1660 1408(])N 1707(AND)X 2 f 1901(S)X 7 s 1424(c)Y 1 f 10 s 10 f 1992 1368(M)N 1992 1448(O)N 1 f 2032 1408(OR)N 10 f 2176 1368(I)N 2176 1448(L)N 2 f 2216 1408(Rshift)N 1 f 2408([)X 2 f 2435(R)X 7 s 2492 1424(j)N 2488 1376(d)N 9 f 2525(-)X 1 f 2556(1)X 10 s 2610 1408(OR)N 2 f 2741(R)X 7 s 2798 1424(j)N 9 f 2823(+)X 1 f 2854(1)X 2 f 2794 1376(d)N 9 f 2831(-)X 1 f 2862(1)X 10 s 2902 1408(])N 2949(OR)X 2 f 3080(R)X 7 s 3137 1424(j)N 3133 1376(d)N 9 f 3170(-)X 1 f 3201(1)X 10 s 10 f 3255 1368(M)N 3255 1448(O)N 1 f 3295 1408(AND)N 2 f 3489(M.)X 6 f 12 s 864 1728(3.5.)N 1078(Non-Uniform)X 1692(Costs)X 1 f 10 s 864 1881(The)N 1017(edit)X 1165(distance)X 1456(measure,)X 1772(de\256ned)X 2036(in)X 2126(section)X 2381(1,)X 2469(assumes)X 2765(that)X 2914(insertions,)X 3274(deletions,)X 3612(and)X 3757(substitu-)X 864 2001(tions)N 1043(all)X 1147(have)X 1323(the)X 1445(same)X 1634(cost.)X 1827(But)X 1966(in)X 2052(some)X 2245(cases,)X 2459(we)X 2577(want)X 2757(to)X 2843(allow)X 3044(fewer)X 3251(deletions,)X 3583(say,)X 3733(than)X 3894(sub-)X 864 2121(stitutions.)N 1220(The)X
1369(algorithm)X 1704(can)X 1840(be)X 1941(extended,)X 2276(albeit)X 2479(in)X 2566(a)X 2627(limited)X 2878(way,)X 3057(to)X 3144(the)X 3267(case)X 3431(where)X 3653(each)X 3826(opera-)X 864 2241(tion)N 1011(has)X 1141(a)X 1200(different)X 1500(cost.)X 1692(We)X 1827(illustrate)X 2130(this)X 2268(extension)X 2598(with)X 2762(an)X 2860(example.)X 3194(Suppose)X 3487(that)X 3629(substitutions)X 864 2361(add)N 1000(1)X 1060(to)X 1142(the)X 1261(distance,)X 1565(but)X 1688(insertions)X 2020(and)X 2157(deletions)X 2467(add)X 2604(3)X 2665(each.)X 2874(Insertions)X 3211(and)X 3348(deletions)X 3658(are)X 3778(handled)X 864 2481(in)N 949(cases)X 1142(4)X 1204(and)X 1342(3)X 1404(\(see)X 1556(section)X 1805(2\).)X 1934(Insertions)X 2272(contribute)X 2619(the)X 2739(OR)X 2872(of)X 2 f 2961(R)X 7 s 3018 2497(j)N 3014 2449(d)N 9 f 3051(-)X 1 f 3082(1)X 10 s 3138 2481(and)N 3276(deletions)X 3587(contribute)X 3934(the)X 864 2601(OR)N 996(of)X 2 f 1084(Rshift)X 1 f 1283([)X 2 f 1310(R)X 7 s 1367 2617(j)N 9 f 1392(+)X 1 f 1423(1)X 2 f 1363 2569(d)N 9 f 1400(-)X 1 f 1431(1)X 10 s 1471 2601(])N 1519(\(2.1\).)X 1734(We)X 1867(would)X 2088(like)X 2229(them)X 2410(to)X 2493(cost)X 2643(3)X 2704(times)X 2898(as)X 2986(much.)X 3205(In)X 3293(other)X 3479(words,)X 3716(a)X 3774(deletion)X 864 2721(or)N 956(insertion)X 1261(that)X 1406(leads)X 1596(to)X 1683(a)X 1744(match)X 1965(with)X 2 f 2132(d)X 1 f 2197(errors)X 2410(should)X 2648(come)X 2847(from)X 3028(a)X 3089(match)X 3310(with)X 2 f 3477(d)X 9 f 3530(-)X 1 f 3574(3)X 3638(errors.)X 3890(This)X 864 2841(can)N 1000(be)X 1100(achieved)X 1410(by)X 1514(simply)X 1755(replacing)X 2078(the)X 2 f 2200(d)X 9 f 2253(-)X 1 f 2297(1)X 2361(in)X 2448(both)X 2615(expressions)X 3014(with)X 2 f 3181(d)X 9 f 3234(-)X 1 f 3278(3.)X 3383(This)X 3550(modi\256cation)X 3979(is)X 864 2961(very)N 1038(simple)X 1282(and)X 1429(it)X 1504(does)X 1682(not)X 1815(add)X 1962(to)X 2055(the)X 2184(running)X 2464(time;)X 2679(however,)X 3007(it)X 3082(works)X 
3309(only)X 3482(for)X 3606(small)X 3809(integer)X 864 3081(costs.)N 6 f 12 s 864 3321(3.6.)N 1078(A)X 1174(Set)X 1350(of)X 1468(Patterns)X 1 f 10 s 864 3474(If)N 939(we)X 1054(have)X 1227(several)X 1476(patterns)X 1751(and)X 1888(we)X 2004(want)X 2182(to)X 2266(\256nd)X 2412(all)X 2514 0.3125(occurrences)AX 2921(of)X 3010(any)X 3148(of)X 3237(them,)X 3439(then)X 3599(we)X 3715(can)X 3849(either)X 864 3594(search)N 1094(them)X 1278(one)X 1418(at)X 1499(a)X 1558(time)X 1723(or)X 1813(together.)X 2139(The)X 2287(advantage)X 2636(of)X 2726(searching)X 3057(for)X 3174(all)X 3277(of)X 3367(them)X 3550(together)X 3836(is)X 3912(that)X 864 3714(it)N 930(can)X 1064(be)X 1162(done)X 1340(in)X 1424(one)X 1562(scan)X 1727(\(and)X 1892(in)X 1976(one)X 2114(command\).)X 2519(Suppose)X 2812(that)X 2954(we)X 3070(are)X 3191(looking)X 3457(for)X 2 f 3574(P)X 1 f 7 s 3632 3730(1)N 10 s 3666 3714(,)N 2 f 3705(P)X 1 f 7 s 3763 3730(2)N 10 s 3797 3714(,)N 2 f 3836(...)X 1 f (,)S 2 f 3935(P)X 7 s 3984 3730(r)N 1 f 10 s 4012 3714(.)N 864 3834(We)N 1001(concatenate)X 1406(all)X 1511(the)X 1634(patterns)X 1913(and)X 2054(put)X 2181(them)X 2366(in)X 2453(one)X 2594(array)X 2785(\(using)X 3010(as)X 3102(many)X 3305(words)X 3526(as)X 3617(needed\),)X 3916(and)X 864 3954(apply)N 1062(the)X 1180(algorithm)X 1511(on)X 1611(that)X 1751(array)X 1937(with)X 2099(the)X 2217(following)X 2548(modi\256cations.)X 3043(Let)X 2 f 3170(M)X 1 f 3257(be)X 3353(a)X 3409(bit)X 3513(array)X 3700(the)X 3819(size)X 3965(of)X 864 4074(the)N 986(combined)X 1326(pattern,)X 1593(and)X 1733(let)X 1837(bit)X 2 f 1945(i)X 1 f 1991(be)X 2091(1)X 2155(if)X 2228(and)X 2368(only)X 2534(if)X 2 f 2607(i)X 1 f 2653(corresponds)X 3064(to)X 3149(the)X 3270(\256rst)X 3417(character)X 3736(of)X 3826(any)X 3965(of)X 864 4194(the)N 997(patterns.)X 1326(For)X 1472(each)X 2 f 1656(s)X 9 f 1706(\316)X 1776(S)X 1 f 1823(,)X 1879(we)X 2009(build)X 2209(two)X 2365(bit)X 2485(arrays.)X 2758(The)X 2919(\256rst,)X 2 f 3099(S)X 7 s 4210(s)Y 
1 f 10 s 3203 4194(is)N 3292(identical)X 3604(with)X 3782(the)X 3916(one)X 864 4314(described)N 1213(in)X 1315(section)X 1582(2.)X 1702(It)X 1791(is)X 1884(used)X 2071(to)X 2173(determine)X 2534(if)X 2623(a)X 2699(match)X 2935(occurs.)X 3225(The)X 3390(second)X 3653(array)X 2 f 3859(S)X 9 f (\242)S 7 s 2 f 3919 4330(s)N 1 f 10 s 3987 4314(=)N 2 f 864 4434(S)N 7 s 4450(s)Y 1 f 10 s 2 f 952 4434(AND)N 1 f 2 f 1138(M)X 1 f 1205(.)X 1266(It)X 1336(indicates)X 1642(whether)X 2 f 1922(s)X 1 f 1974(is)X 2048(the)X 2167(\256rst)X 2312(character)X 2629(of)X 2717(any)X 2854(pattern.)X 3138(If)X 3213(so,)X 3325(then)X 3484(we)X 3599(must)X 3775(start)X 3934(the)X 864 4554(match)N 1082(at)X 1162(that)X 1304(pattern:)X 1571(we)X 1686(do)X 1787(not)X 1910(want)X 2087(to)X 2170(depend)X 2423(on)X 2524(the)X 2643(end)X 2780(of)X 2868(the)X 2987(previous)X 3284(pattern.)X 3568(Thus,)X 3769(after)X 3938(we)X 864 4674(compute)N 2 f 1171(R)X 7 s 1224 4690(j)N 1 f 10 s 1246 4674(,)N 1297(we)X 1422(OR)X 1564(it)X 1639(with)X 2 f 1812(S)X 9 f (\242)S 7 s 2 f 1872 4690(s)N 1 f 10 s 1931 4674(\(where)N 2 f 2186(s)X 9 f 2236(=)X 2 f 2293(t)X 7 s 2319 4690(j)N 1 f 10 s 2341 4674(\).)N 2439(We)X 2582(compute)X 2890(the)X 3020(rest)X 3168(of)X 3267(the)X 2 f 3397(R)X 1 f 3478(arrays)X 3707(as)X 3806(before,)X 864 4794(except)N 1106(that)X 1258(in)X 1352(each)X 1532(step)X 1693(we)X 1819(OR)X 1962(them)X 2153(to)X 2246(a)X 2313(special)X 2567(mask)X 2767(that)X 2918(sets)X 3069(the)X 3198(\256rst)X 2 f 3353(d)X 1 f 3424(bits)X 3570(in)X 2 f 3663(R)X 7 s 3721 4762(d)N 1 f 10 s 3786 4794(of)N 3884(each)X 864 4914(separate)N 1148(pattern)X 1391(to)X 1473(1;)X 1555(this)X 1690(allows)X 2 f 1919(d)X 1 f 1980(initial)X 2187(errors)X 2396(in)X 2479(each)X 2648(pattern.)X 2932(\(This)X 3122(is)X 3196(not)X 3319(the)X 3438(most)X 3614(ef\256cient)X 3898(way)X 864 5034(to)N 947(solve)X 1137(this)X 1273(problem,)X 1581(but)X 1704(it's)X 1827(reasonably)X 2196(simple.\))X 2497(This)X 2660(case)X 2820(is)X 
2894(a)X 2951(special)X 3195(case)X 3355(of)X 3443(patterns)X 3717(as)X 3804(regular)X 864 5154(expressions,)N 1278(which)X 1494(we)X 1608(will)X 1752(discuss)X 2003(shortly.)X 10 p
%%Page: 10 11
10 s 10 xH 0 xS 1 f 3 f 2408 696(10)N 6 f 12 s 864 984(3.7.)N 1078(Long)X 1341(Patterns)X 1 f 10 s 864 1137(Suppose)N 1170(that)X 1325(the)X 1458(pattern)X 1716(occupies)X 2032(several)X 2295(words)X 2526(and)X 2677(that)X 2832(it)X 2912(is)X 3001(a)X 3073(simple)X 3322(string.)X 3560(The)X 3721(algorithm)X 864 1257(proceeds)N 1184(in)X 1280(the)X 1412(same)X 1611(fashion)X 1881(by)X 1995(computing)X 2371(the)X 2 f 2503(R)X 7 s 2561 1225(d)N 1 f 10 s 2629 1257(arrays)N 2860(for)X 2988(all)X 3102(words.)X 3352(However,)X 3701(unless)X 3934(the)X 864 1377(number)N 1130(of)X 1218(errors)X 1427(is)X 1501(large,)X 1703(the)X 1822(\256rst)X 1967(part)X 2113(of)X 2201(the)X 2320(pattern)X 2564(will)X 2709(not)X 2832(match)X 3049(the)X 3169(text)X 3311(quite)X 3493(often.)X 3720(If)X 3796(there)X 3979(is)X 864 1497(no)N 966(match)X 1184(with)X 2 f 1348(k)X 1 f 1406(errors)X 1616(starting)X 1878(after)X 2048(position)X 2 f 2327(r)X 1 f 2380(of)X 2469(the)X 2589(pattern,)X 2854(then)X 3014(there)X 3197(is)X 3272(no)X 3374(need)X 3548(to)X 3632(maintain)X 3934(the)X 2 f 864 1617(R)N 1 f 938(arrays)X 1160(corresponding)X 1644(to)X 1731(positions)X 2044(larger)X 2257(than)X 2 f 2420(r)X 1 f 2476(\(their)X 2675(values)X 2905(will)X 3054(be)X 3155(0\).)X 3287(Thus,)X 3492(most)X 3673(of)X 3766(the)X 3890(time)X 864 1737(there)N 1046(will)X 1191(be)X 1287(no)X 1387(need)X 1559(to)X 1641(maintain)X 1941(the)X 2 f 2059(R)X 7 s 2117 1705(d)N 1 f 10 s 2171 1737(arrays)N 2388(for)X 2502(the)X 2620(right)X 2791(side)X 2940(of)X 3027(the)X 3145(pattern.)X 3408(We)X 3540(only)X 3702(need)X 3874(to)X 3956(be)X 864 1857(alerted)N 1105(when)X 1301(the)X 1421(last)X 1554(bit)X 1660(of)X 1749(the)X 1869(last)X 2 f 2002(R)X 7 s 2060 1825(d)N 1 f 10 s 2116 1857(array)N 2304(that)X 2446(we)X
2562(maintain)X 2864(gets)X 3015(the)X 3135(value)X 3331(of)X 3420(1.)X 3523(In)X 3613(that)X 3756(case,)X 3938(we)X 864 1977(start)N 1027(maintaining)X 1434(the)X 2 f 1556(R)X 7 s 1605 1993(d)N 1 f 10 s 1663 1977(arrays)N 1884(for)X 2002(the)X 2124(next)X 2286(part)X 2435(of)X 2526(the)X 2648(pattern.)X 2935(This)X 3101(improvement)X 3552(works)X 3772(only)X 3938(for)X 864 2097(simple)N 1097(strings)X 1330(and)X 1466(not)X 1588(for)X 1702(sets)X 1842(of)X 1929(strings)X 2162(or)X 2249(regular)X 2497(expressions.)X 6 f 12 s 864 2337(3.8.)N 1078(Regular)X 1462(Expressions)X 1 f 10 s 864 2490(The)N 1021(algorithm)X 1364(can)X 1508(be)X 1616(extended)X 1938(to)X 2032(allow)X 2242(any)X 2390(regular)X 2650(expression)X 3025(as)X 3124(a)X 3192(pattern.)X 3488(We)X 3633(describe)X 3934(the)X 864 2610(method)N 1130(here)X 1295(only)X 1463(brie\257y.)X 1738(Algorithms)X 2127(for)X 2246(matching)X 2569(regular)X 2822(expressions)X 3221(with)X 3388(errors,)X 3621(based)X 3829(on)X 3934(the)X 864 2730(dynamic-programming)N 1624(approach,)X 1960(appear)X 2196(in)X 2279([MM89].)X 2616(First,)X 2803(we)X 2918(illustrate)X 3219(the)X 3338(algorithm)X 3670(with)X 3833(a)X 3890(sim-)X 864 2850(ple)N 984(example.)X 1318(We)X 1452(do)X 1554(not)X 1678(try)X 1789(to)X 1873(optimize)X 2175(the)X 2295(algorithm)X 2628(at)X 2708(this)X 2845(stage;)X 3074(we)X 3190(try)X 3300(to)X 3383(make)X 3578(it)X 3643(as)X 3731(simple)X 3965(as)X 864 2970(possible)N 1153(\(a)X 1243(more)X 1435(detailed)X 1716(discussion)X 2076(and)X 2219(more)X 2411(ef\256cient)X 2701(algorithms)X 3071(will)X 3223(be)X 3327(presented)X 3663(elsewhere\).)X 864 3090(Let)N 997(the)X 1121(pattern)X 1370(be)X 2 f 1472(P)X 9 f 1540(=)X 2 f 1597(ab)X 1 f 1690(\()X 2 f 1717(cd)X 1 f 9 f 1806(|)X 2 f 1835(e)X 1 f 1877(\))X 2 f 7 s 1904 3058(*)N 10 s 1944 3090(fg)N 1 f 2032(\(i.e.,)X 2203(starting)X 2469(with)X 2 f 2637(ab)X 1 f 2743(and)X 2884(ending)X 3127(with)X 2 f 3300(fg)X 1 f 3387(with)X 3554(any)X 3695(number)X 
3965(of)X 864 3210(either)N 2 f 1078(cd)X 1 f 1185(or)X 2 f 1283(e)X 1 f 1350(in)X 1443(between\).)X 1809(This)X 1983(regular)X 2243(expression)X 2618(is)X 2703(translated)X 3047(to)X 3141(the)X 3271(non-deterministic)X 3868(\256nite)X 864 3330(automata)N 1181(shown)X 1413(in)X 1498(Fig.)X 1647(3)X 1710(\(for)X 1854(more)X 2042(on)X 2145(such)X 2315(translations)X 2708(see)X 2835([HU79]\).)X 3176(We)X 3312(now)X 3474(assign)X 3698(a)X 3758(bit)X 3866(array)X 864 3450(to)N 947(represent)X 1263(the)X 1382(automata.)X 1736(We)X 1868(number)X 2133(the)X 2251(states)X 2449(including)X 2771(the)X 2889(null)X 3033(states)X 3231(that)X 3371(do)X 3471(not)X 3593(correspond)X 3970(to)X 864 3570(any)N 1000(character)X 1316(\(see)X 1466(Fig.)X 1612(3\).)X 1739(This)X 1901(``linearizes'')X 2337(the)X 2455(automata.)X 2809(Each)X 2990(state)X 3157(corresponds)X 3565(to)X 3647(one)X 3784(entry)X 3970(in)X 864 3690(the)N 984(array.)X 1212(Thus,)X 1414(for)X 2 f 1530(P)X 1 f 1601(we)X 1717(have)X 1891(an)X 1989(array)X 2177(of)X 2266(size)X 2413(11.)X 2555(Notice)X 2791(that)X 2933(all)X 3035(the)X 3155(non-)X 2 f 9 f 3302(e)X 1 f 3359(moves)X 3590(go)X 3692(to)X 3775(the)X 3894(next)X 864 3810(state)N 1032(and)X 1169(thus)X 1323(can)X 1456(be)X 1553(handled)X 1828(by)X 1929(essentially)X 2288(a)X 2345(shift)X 2508(and)X 2645(an)X 2742(AND)X 2937(operation.)X 3302(We)X 3436(need)X 3610(to)X 3694(\256nd)X 3840(a)X 3898(way)X 3 f 3695 4448(10)N 3298(9)X 2900(8)X 2503 4846(7)N 2106(6)X 2702 4051(5)N 2304(4)X 1907(3)X 1709 4448(2)N 1311(1)X 914(0)X 3670 4523 MXY 49 Dc 1 f 2466 4026(d)N 2081 4039(c)N 3471 4622(g)N 3087(f)X 2 f 9 f 1882 4337(e)N 2851 4275(e)N 2578 4697(e)N 1932 4635(e)N 1 f 2292 5019(e)N 2 f 9 f 2292 5206(e)N 1 f 1473 4635(b)N 1076(a)X 1733 MX -8 39 Dl 16 -19 Dl 25 4 Dl -33 -24 Dl 2876 MX -33 24 Dl 25 -4 Dl 16 19 Dl -8 -39 Dl 2702 5119 MXY 198 -547 Dl 1907 5119 MXY 795 0 Dl 1709 4572 MXY 198 547 Dl 1994 4833 MXY -2 -40 Dl -11 23 Dl -25 2 Dl 38 15 Dl 1833 4212 MXY -29 29 
Dl 24 -7 Dl 18 17 Dl -13 -39 Dl 2789 4597 MXY -38 15 Dl 25 3 Dl 10 23 Dl 3 -41 Dl 2826 4423 MXY 13 -38 Dl -18 17 Dl -25 -6 Dl 30 27 Dl 3596 4523 MXY -35 -22 Dl 13 22 Dl -13 22 Dl 35 -22 Dl 3347 MX 298 0 Dl 2751 4126 MXY 100 397 Dl 2553 4920 MXY 298 -397 Dl 2156 4920 MXY 297 0 Dl 1758 4523 MXY 298 397 Dl 1758 4523 MXY 100 -397 Dl 3645 4523 MXY 99 Dc 2453 4920 MXY 99 Dc 2056 MX 99 Dc 3198 4523 MXY -34 -22 Dl 12 22 Dl -12 22 Dl 34 -22 Dl 2602 4126 MXY -34 -22 Dl 13 22 Dl -13 21 Dl 34 -21 Dl 2205 MX -34 -22 Dl 12 22 Dl -12 21 Dl 34 -21 Dl 1609 4523 MXY -34 -22 Dl 13 22 Dl -13 22 Dl 34 -22 Dl 1212 MX -34 -22 Dl 12 22 Dl -12 22 Dl 34 -22 Dl 2950 MX 298 0 Dl 2354 4126 MXY 298 0 Dl 1957 MX 298 0 Dl 1361 4523 MXY 298 0 Dl 964 MX 298 0 Dl 3248 MX 99 Dc 2851 MX 99 Dc 2652 4126 MXY 99 Dc 2255 MX 99 Dc 1858 MX 99 Dc 1659 4523 MXY 99 Dc 1262 MX 99 Dc 864 MX 99 Dc 1262 5446(Figure)N 1491(3:)X 1593(The)X 1738(non-deterministic)X 2323(automata)X 2637(corresponding)X 3116(to)X 2 f 3198(ab)X 1 f 3291(\()X 2 f 3318(cd)X 1 f 9 f 3407(|)X 2 f 3436(e)X 1 f 3478(\))X 2 f 3505(*)X 3551(fg)X 1 f 3613(.)X 11 p %%Page: 11 12 10 s 10 xH 0 xS 1 f 3 f 2408 696(11)N 1 f 864 984(to)N 948(deal)X 1104(with)X 1268(arbitrary)X 1567(jumps)X 1784(required)X 2074(by)X 2176(the)X 2 f 9 f 2296(e)X 1 f 2353(moves)X 2584(\(e.g.,)X 2770(from)X 2949(state)X 3119(2)X 3182(to)X 3267(state)X 3437(8\))X 3527(and)X 3666(with)X 3831(``non-)X 864 1104(jumps'')N 1138(that)X 1282(happen)X 1538(to)X 1624(be)X 1724(in)X 1810(consecutive)X 2213(states)X 2415(\(e.g.,)X 2602(from)X 2782(state)X 2953(5)X 3017(to)X 3103(state)X 3274(6\).)X 3405(The)X 3554(non-jumps)X 3920(can)X 864 1224(be)N 967(handled)X 1248(easily)X 1462(with)X 1631(a)X 1694(mask.)X 1930(The)X 2082(arbitrary)X 2386(jumps)X 2608(are)X 2735(harder)X 2969(to)X 3059(handle.)X 3341(The)X 3494(meaning)X 3798(of)X 3893(an)X 2 f 9 f 3997(e)X 1 f 864 1344(move)N 1071(from)X 1256(state)X 2 f 1432(i)X 1 f 1483(to)X 1574(state)X 2 f 1756(j)X 1 f 1807(is)X 1889(that)X 
2038(if,)X 2136(at)X 2223(any)X 2368(point,)X 2581(we)X 2704(match)X 2928(up)X 3036(to)X 3126(state)X 2 f 3301(i)X 1 f 3351(then)X 3517(the)X 3643(same)X 3836(match)X 864 1464(holds)N 1057(also)X 1206(up)X 1306(to)X 1388(state)X 2 f 1561(j)X 1 f 1583(.)X 1643(In)X 1730(other)X 1915(words,)X 2151(if)X 2220(there)X 2401(is)X 2474(a)X 2531(1)X 2592(corresponding)X 3072(to)X 3155(state)X 2 f 3323(i)X 1 f 3366(in)X 3449(the)X 3568(array,)X 3775(then)X 3934(the)X 2 f 9 f 864 1584(e)N 1 f 921(move)X 1121(from)X 2 f 1299(i)X 1 f 1343(to)X 2 f 1433(j)X 1 f 1477(implies)X 1734(that)X 1876(there)X 2059(should)X 2294(be)X 2392(a)X 2450(1)X 2511(corresponding)X 2991(to)X 3074(state)X 2 f 3248(j)X 1 f 3270(.)X 3331(The)X 3477(main)X 3658(observation)X 864 1704(is)N 937(that)X 1077(a)X 1133(given)X 1331(bit)X 1435(array)X 1621(and)X 1757(set)X 1866(of)X 2 f 9 f 1953(e)X 1 f 2008(moves)X 2237(completely)X 2613(determine)X 2954(the)X 3072(value)X 3266(of)X 3354(the)X 3473(bit)X 3578(array)X 3765(after)X 3934(the)X 2 f 9 f 864 1824(e)N 1 f 925(moves)X 1160(are)X 1285(taken.)X 1545(Thus,)X 1751(the)X 1875(set)X 1990(of)X 2 f 9 f 2083(e)X 1 f 2144(moves)X 2379(de\256nes)X 2631(a)X 2692(function)X 2984(that)X 3129(maps)X 3323(a)X 3384(bit)X 3493(array)X 3684(to)X 3771(another.)X 864 1944(We)N 996(need)X 1168(to)X 1250(be)X 1346(able)X 1500(to)X 1582(implement)X 1944(this)X 2079(function)X 2366(ef\256ciently.)X 1064 2097(Let)N 2 f 1197(f)X 1 f 1239(denote)X 1473(the)X 1591(function)X 1878(that)X 2018(maps)X 2207(one)X 2343(bit)X 2447(array)X 2633(to)X 2715(another)X 2976(by)X 3076(applying)X 3376(all)X 3476(the)X 2 f 9 f 3594(e)X 1 f 3650(moves.)X 3920(We)X 864 2217(divide)N 1087(the)X 1208(bit)X 1315(array)X 1504(into)X 1651(bytes,)X 1863(i.e.,)X 2004(groups)X 2245(of)X 2335(8)X 2398(bits)X 2536(each.)X 2747(Consider)X 3059(the)X 3180(\256rst)X 3327(8)X 3390(bits)X 3528(of)X 3618(the)X 3739(bit)X 3846(array.)X 864 2337(The)N 1010(values)X 1236(of)X 1324(these)X 1510(bits)X 
1646(determine)X 1988(which)X 2206(1's)X 2326(should)X 2561(be)X 2659(set)X 2770(when)X 2966(we)X 3082(apply)X 3282(the)X 2 f 9 f 3402(e)X 1 f 3459(moves)X 3690(on)X 3792(states)X 3992(1)X 864 2457(to)N 951(8.)X 1036(Since)X 1239(there)X 1425(are)X 1549(only)X 1716(256)X 1860(\()X 2 f 9 f 1887(=)X 1 f 1944(2)X 7 s 2425(8)Y 10 s 2018 2457(\))N 2069(possible)X 2355(values)X 2584(for)X 2702(8)X 2766(bits,)X 2925(we)X 3043(can)X 3179(preprocess)X 3547(all)X 3651(possibilities)X 864 2577(and)N 1003(construct)X 1320(a)X 1379(table)X 1558(of)X 1648(size)X 1796(256)X 1939(which)X 2158(will)X 2306(hold,)X 2492(for)X 2610(each)X 2782(possible)X 3068(byte,)X 3250(the)X 3372(whole)X 3592(bit)X 3700(array)X 3890(with)X 864 2697(1's)N 984(only)X 1148(in)X 1232(places)X 1455(corresponding)X 1936(to)X 2020(the)X 2 f 9 f 2140(e)X 1 f 2197(moves.)X 2468(\(We)X 2629(need)X 2803(the)X 2923(whole)X 3141(array)X 3329(and)X 3467(not)X 3591(just)X 3728(the)X 3847(\256rst)X 3992(8)X 864 2817(bits,)N 1022(because)X 1300(there)X 1484(might)X 1693(be)X 1792(forward)X 2070(jumps.\))X 2355(We)X 2490(can)X 2625(do)X 2728(that)X 2871(for)X 2988(each)X 3160(byte.)X 3362(Given)X 3582(now)X 3744(a)X 3804(current)X 864 2937(value)N 1071(of)X 2 f 1171(R)X 1 f 1220(,)X 1273(we)X 1400(\256rst)X 1557(apply)X 1768(the)X 1899(regular)X 2160(algorithm,)X 2524(taking)X 2757(care)X 2925(of)X 3025(regular)X 3286(non)X 2 f 9 f 3439(e)X 1 f 3507(moves,)X 3768(then)X 3938(we)X 864 3057(divide)N 1096(the)X 1226(array)X 1424(into)X 1580(bytes,)X 1801(\256nd)X 1957(the)X 2087(corresponding)X 2578(arrays)X 2807(in)X 2901(the)X 3032(tables)X 3252(\(we)X 3406(have)X 3591(one)X 3740(table)X 3929(per)X 864 3177(byte\),)N 1075(and)X 1217(OR)X 1354(all)X 1460(of)X 1553(them)X 1739(to)X 2 f 1827(R)X 1 f 1876(.)X 1942(This)X 2110(implements)X 2509(all)X 2615(the)X 2739(jumps,)X 2980(and)X 3122(if)X 3197(the)X 3320(pattern)X 3646(occupies)X 3952(no)X 864 3297(more)N 1054(than)X 1217(32)X 1322(bits)X 1462(\(as)X
1581(is)X 1659(often)X 1849(the)X 1972(case\),)X 2183(only)X 2350(4)X 2415(more)X 2605(steps)X 2790(are)X 2914(required.)X 3247(\(If)X 3353(the)X 3477(text)X 3623(is)X 3702(large,)X 3909(it)X 3979(is)X 864 3417(worthwhile)N 1249(to)X 1331(preprocess)X 1695(16)X 1795(bits)X 1930(for)X 2044(a)X 2100(table)X 2276(of)X 2363(size)X 2508(65536)X 2728(and)X 2864(half)X 3009(as)X 3096(many)X 3294(steps.\))X 1064 3570(The)N 1215(algorithm)X 1552(sketched)X 1859(above)X 2077(is)X 2156(not)X 2284(the)X 2408(most)X 2589(ef\256cient)X 2878(algorithm,)X 3235(but)X 3364(it)X 3435(is)X 3515(reasonably)X 3890(sim-)X 864 3690(ple.)N 1022(More)X 1216(ef\256cient)X 1499(algorithms)X 1861(will)X 2005(be)X 2101(described)X 2429(elsewhere.)X 6 f 12 s 864 3930(3.9.)N 1078(Very)X 1312(Large)X 1600(Alphabets)X 1 f 10 s 864 4083(Sometimes)N 1247(the)X 1373(alphabet)X 1673(is)X 1754(very)X 1925(large.)X 2154(For)X 2293(example,)X 2613(the)X 2740(pattern)X 2992(can)X 3133(be)X 3238(a)X 3303(segment)X 3599(of)X 3695(a)X 3760(program)X 864 4203(which)N 1081(we)X 1196(want)X 1373(to)X 1456(\256nd)X 1601(inside)X 1813(a)X 1870(large)X 2052(program.)X 2385(Instead)X 2638(of)X 2726(counting)X 3027(each)X 3195(text)X 3335(character)X 3651(as)X 3738(a)X 3794(charac-)X 864 4323(ter)N 972(in)X 1057(the)X 1178(pattern,)X 1444(we)X 1561(may)X 1722(wish)X 1896(to)X 1982(count)X 2184(each)X 2356(line)X 2500(as)X 2591(a)X 2651(character)X 2971(\(this)X 3137(is)X 3214(done,)X 3414(for)X 3532(example,)X 3848(in)X 3934(the)X 864 4443(program)N 2 f 1156(diff)X 1 f 1282(which)X 1498(is)X 1571(used)X 1738(to)X 1820(compare)X 2117(\256les\).)X 2337(The)X 2482(problem)X 2769(we)X 2883(have)X 3055(with)X 3217(large)X 3398(alphabets)X 3721(is)X 3794(that)X 3934(the)X 864 4563(preprocessing)N 1340(stage,)X 1555(where)X 1782(all)X 1892(the)X 2 f 2020(S)X 7 s 4579(s)Y 1 f 10 s 2119 4563(arrays)N 2347(are)X 2477(constructed,)X 2898(will)X 3053(take)X 3218(too)X 3351(long)X 3524(and)X 3671(require)X 3930(too)X 864 
4683(much)N 1064(space)X 1265(\(the)X 1412(set)X 1523(of)X 1612(all)X 1714(possible)X 1998(program)X 2292(lines)X 2465(is,)X 2560(for)X 2676(all)X 2778(practical)X 3077(purposes,)X 3403(in\256nite\).)X 3717(However,)X 864 4803(we)N 986(can)X 1126(use)X 1261(hashing)X 1538(to)X 1628(map)X 1794(the)X 1920(alphabet)X 2220(to)X 2310(a)X 2374(reasonable)X 2746(size.)X 2939(We)X 3079(\256rst)X 3232(decide)X 3471(the)X 3598(hashing)X 3876(table)X 864 4923(size,)N 1042(which)X 1271(should)X 1517(be)X 1626(large)X 1820(enough)X 2089(to)X 2184(avoid)X 2395(many)X 2606(collisions)X 2944(but)X 3078(not)X 3212(too)X 3346(large)X 3539(so)X 3642(that)X 3794(we)X 3920(can)X 864 5043(afford)N 1088(the)X 1213(space)X 1419(\(e.g.,)X 1609(1024\).)X 1863(The)X 2015(alphabet)X 2314(becomes)X 2622(the)X 2747(set)X 2863(of)X 2957(integers)X 3238(from)X 3421(1)X 3488(to)X 3577(the)X 3703(table)X 3887(size.)X 864 5163(The)N 1020(algorithm)X 1362(is)X 1446(applied)X 1713(to)X 1806(the)X 1935(hash)X 2113(values.)X 2388(At)X 2498(the)X 2626(end,)X 2792(all)X 2902(matches)X 3195(should)X 3438(be)X 3544(checked)X 3838(again,)X 864 5283(however,)N 1186(to)X 1274(remove)X 1541(matchings)X 1896(that)X 2042(result)X 2246(from)X 2428(collisions.)X 2800(We)X 2938(do)X 3044(not)X 3172(support,)X 3458(at)X 3542(this)X 3683(time,)X 3871(large)X 864 5403(alphabets)N 1187(in)X 2 f 1269(agrep)X 1 f 1456(.)X 12 p %%Page: 12 13 10 s 10 xH 0 xS 1 f 3 f 2408 696(12)N 6 f 14 s 864 984(4.)N 1019(Experim)X 1460(ental)X 1752(Results)X 1 f 10 s 864 1137(We)N 1000(have)X 1176(implemented)X 1618(the)X 1740(algorithm)X 2075(and)X 2215(many)X 2417(of)X 2508(its)X 2607(extensions)X 2969(and)X 3109(tested)X 3320(them)X 3504(against)X 3756(previous)X 864 1257(algorithms.)N 1272(All)X 1400(tests)X 1568(were)X 1751(run)X 1884(on)X 1990(a)X 2051(SUN)X 2236(SparcStation)X 2670(II)X 2749(running)X 3023(UNIX.)X 3289(A)X 3372(description)X 3753(of)X 2 f 3845(agrep)X 1 f 864 1377(\320)N 968(a)X 1028(tool)X 1176(for)X 
1294(approximate)X 1719(string)X 1925(matching)X 2247(based)X 2454(on)X 2558(this)X 2697(algorithm,)X 3052(which)X 3272(we)X 3390(used)X 3562(for)X 3681(the)X 3804(experi-)X 864 1497(ments)N 1082(\320)X 1189(is)X 1269(given)X 1474(in)X 1563(the)X 1687(next)X 1851(section.)X 2144(In)X 2237(almost)X 2476(all)X 2582(the)X 2706(cases,)X 2922(our)X 3055(algorithm)X 3392(beat)X 3552(the)X 3676(other)X 3867(algo-)X 864 1617(rithms,)N 1108(sometimes)X 1470(by)X 1570(a)X 1626(wide)X 1802(margin.)X 1064 1770(The)N 1222(numbers)X 1531(given)X 1742(here)X 1914(should)X 2160(be)X 2269(taken)X 2476(with)X 2651(caution.)X 2960(Any)X 3131(such)X 3311(results)X 3554(depend)X 3820(on)X 3934(the)X 864 1890(architecture,)N 1296(the)X 1426(operating)X 1761(system,)X 2035(and)X 2183(the)X 2313(compilers)X 2661(used.)X 2879(We)X 3022(probably)X 3338(put)X 3471(more)X 3667(efforts)X 3908(into)X 864 2010(optimizing)N 1233(our)X 1363(algorithm)X 1697(than)X 1858(we)X 1975(did)X 2100(for)X 2218(other)X 2407(algorithms)X 2773(\(although)X 3104(we)X 3222(put)X 3348(signi\256cant)X 3705(effort)X 3908(into)X 864 2130(that)N 1006(too\),)X 1177(and)X 1315(we)X 1431(did)X 1555(not)X 1679(test)X 1812(all)X 1914(possible)X 2198(scenarios.)X 2559(However,)X 2896(we)X 3012(tried)X 3181(not)X 3304(only)X 3467(to)X 3550(be)X 3647(fair)X 3780(but)X 3903(also)X 864 2250(to)N 946(be)X 1042(conservative.)X 1508(We)X 1640(used)X 1807(the)X 1925(\256nal)X 2087(agrep)X 2286(program)X 2578(for)X 2693(all)X 2794(our)X 2922(tests)X 3085(even)X 3258(though)X 3501(it)X 3566(contains)X 3854(many)X 864 2370(options)N 1126(that)X 1273(slow)X 1451(it)X 1521(down)X 1725(and)X 1867(are)X 1992(not)X 2120(available)X 2436(in)X 2524(the)X 2648(other)X 2839(programs)X 3168(\(e.g.,)X 3357(the)X 3481(fact)X 3628(that)X 3774(agrep)X 3979(is)X 864 2490(record)N 1093(oriented)X 1379(\320)X 1482(see)X 1608(next)X 1769(section)X 2020(\320)X 2124(slows)X 2330(it)X 2398(down)X 2600(considerably\).)X 3101(For)X 3236(the)X 3358(simple)X 
3595(cases)X 3789(that)X 3933(are)X 864 2610(listed)N 1065(in)X 1155(the)X 1281(tables)X 1496(below,)X 1740(we)X 1862(sometimes)X 2232(obtained)X 2536(20-30%)X 2818(faster)X 3025(running)X 3302(times)X 3502(with)X 3671(versions)X 3965(of)X 864 2730(the)N 982(program)X 1274(that)X 1414(has)X 1541(only)X 1703(the)X 1821(tested)X 2028(options.)X 2323(We)X 2455(believe)X 2707(that)X 2847(the)X 2965(main)X 3145(strength)X 3423(of)X 3511(this)X 3647(algorithm)X 3979(is)X 864 2850(that)N 1015(it)X 1090(is)X 1174(more)X 1370(\257exible,)X 1661(general,)X 1949(and)X 2096(convenient)X 2479(than)X 2648(all)X 2759(previous)X 3066(algorithms.)X 3479(The)X 3635(fact)X 3787(that)X 3938(for)X 864 2970(most)N 1046(of)X 1140(the)X 1265(common)X 1573(applications)X 1988(of)X 2083(it,)X 2175(agrep)X 2382(is)X 2463(also)X 2620(signi\256cantly)X 3043(faster)X 3250(than)X 3416(other)X 3609(algorithms)X 3979(is)X 864 3090(nice,)N 1038(but)X 1160(speed)X 1363(is)X 1436(mostly)X 1673(a)X 1729(secondary)X 2075(issue)X 2255(here;)X 2456(other)X 2641(algorithms)X 3003(are)X 3122(reasonably)X 3490(fast)X 3626(already.)X 1064 3243(First,)N 1252(we)X 1368(tested)X 1577(searching)X 1907(without)X 2173(errors)X 2384(vs.)X 2498(the)X 2619(grep)X 2785(family)X 3017(of)X 3107(programs)X 3433(available)X 3746(in)X 3831(UNIX)X 864 3363(and)N 1003(against)X 1253(an)X 1351(implementation)X 1875(of)X 1964(the)X 2084(Boyer-Moore)X 2543(algorithm.)X 2916(The)X 3063(text)X 3205(was)X 3352(a)X 3410(bibliography)X 3841(\256le)X 3965(of)X 864 3483(size)N 1012(one)X 1151(Megabyte.)X 1567(We)X 1703(used)X 1874(5)X 1938(patterns)X 2216(of)X 2307(varying)X 2576(sizes)X 2756(\(4-10\))X 2981(and)X 3121(averaged)X 3436(the)X 3558(results.)X 3831(Agrep)X 864 3603(consistently)N 1284(beats)X 1487(the)X 1623(grep)X 1804(family,)X 2071(but)X 2211(it)X 2293(is)X 2383(not)X 2522(as)X 2626(fast)X 2779(as)X 2883(the)X 3018(Boyer-Moore)X 3492(algorithm.)X 3880(\(The)X 864 3723(Boyer-Moore)N 1321(algorithm)X 1652(cannot)X
1886(be)X 1982(used,)X 2169(as)X 2256(far)X 2366(as)X 2453(we)X 2567(know,)X 2785(for)X 2899(most)X 3074(of)X 3161(the)X 3279(extensions)X 3638(in)X 3721(the)X 3840(previ-)X 864 3843(ous)N 1002(section;)X 1298(even)X 1477(\256nding)X 1730(line)X 1877(numbers)X 2180(for)X 2301(all)X 2408(the)X 2533(matches)X 2823(is)X 2902(not)X 3030(trivial)X 3247(and)X 3389(slows)X 3597(the)X 3721(algorithm)X 864 3963(down)N 1062(considerably.\))X 1064 4116(We)N 1196(then)X 1354(tested)X 1561(the)X 1679(approximate)X 2100(string-matching)X 2627(algorithm)X 2958(against)X 3206(two)X 3347(other)X 3533(algorithms,)X 3916(one)X 864 4236(by)N 969(Ukkonen)X 1288([Uk85a])X 1581(\(which)X 1829(we)X 1948(implemented\))X 2418(and)X 2559(one)X 2699(by)X 2803(Galil)X 2987(and)X 3127(Park)X 3298([GP90])X 3558(\(labeled)X 3841(MN2;)X 864 4356(the)N 983(program)X 1276(was)X 1422(provided)X 1728(to)X 1811(us)X 1903(by)X 2004(W.)X 2121(I.)X 2189(Chang\))X 2446(which)X 2663(is)X 2737(based)X 2941(on)X 3042(another)X 3304(technique)X 3637(by)X 3738(Ukkonen)X 864 4476([Uk85b].)N 1200(The)X 1349(last)X 1484(algorithm)X 1818(was)X 1966(found)X 2176(by)X 2279(Chang)X 2511(and)X 2650(Lawler)X 2901([CL90])X 3160(to)X 3245(be)X 3344(the)X 3465(fastest)X 3693(among)X 3934(the)X 864 4596(algorithms)N 1227(they)X 1386(tested.)X 1635(We)X 1769(used)X 1938(random)X 2205(text)X 2347(\(of)X 2463(size)X 2610(1,000,000\))X 2979(and)X 3117(pattern)X 3362(\(of)X 3478(size)X 3625(20\),)X 3774(and)X 3912(two)X 864 4716(different)N 1166(alphabet)X 1463(sizes.)X 1684(In)X 1776(this)X 1916(case,)X 2099(since)X 2288(we)X 2406(use)X 2537(the)X 2659(idea)X 2817(of)X 2908(partitioning)X 3305(the)X 3427(pattern,)X 3694(the)X 3816(size)X 3965(of)X 864 4836(the)N 983(alphabet)X 1276(makes)X 1502(a)X 1560(big)X 1684(difference.)X 2073(A)X 2153(large)X 2336(alphabet)X 2630(leads)X 2817(to)X 2901(very)X 3066(few)X 3209(accidental)X 3557(exact)X 3749(matches,)X 10 f 1632 5005(i)N 1663(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X 1 f 
1680 5125(BM)N 1932(agrep)X 2231(egrep)X 2530(grep)X 2793(fgrep)X 3106(wc)X 10 f 1632 5165(i)N 1663(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X 1 f 1672 5285(0.21)N 1951(0.35)X 2250(0.79)X 2531(1.22)X 2808(1.61)X 3083(1.19)X 10 f 1632 5325(i)N 1663(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X 1632(c)X 5245(c)Y 5165(c)Y 5085(c)Y 3263 5325(c)N 5245(c)Y 5165(c)Y 5085(c)Y 1 f 1758 5478(Table)N 1961(1:)X 2063(Exact)X 2266(matching)X 2584(of)X 2671(simple)X 2904(strings.)X 13 p %%Page: 13 14 10 s 10 xH 0 xS 1 f 3 f 2408 696(13)N 1 f 864 984(thus)N 1017(the)X 1135(running)X 1404(time)X 1566(is)X 1639(essentially)X 1997(the)X 2115(same)X 2300(as)X 2387(the)X 2505(one)X 2641(for)X 2755(exact)X 2945(matching.)X 3303(A)X 3381(small)X 3574(alphabet)X 3867(leads)X 864 1104(to)N 953(many)X 1158(matches)X 1447(and)X 1589(the)X 1713(algorithm's)X 2108(performance)X 2541(degrades.)X 2893(The)X 3044(case)X 3209(of)X 3302(binary)X 3533(alphabet)X 3831(serves)X 864 1224(as)N 952(the)X 1071(worst)X 1270(case)X 1430(for)X 1545(this)X 1681(purpose.)X 1976(Results)X 2232(are)X 2352(shown)X 2582(in)X 2666(Table)X 2871(2.)X 2973(The)X 3120(\256nal)X 3284(test)X 3417(was)X 3564(for)X 3680(more)X 3867(com-)X 864 1344(plicated)N 1146(patterns,)X 1448(including)X 1778(some)X 1974(of)X 2068(the)X 2193(extensions)X 2558(discussed)X 2892(in)X 2981(the)X 3106(previous)X 3409(section.)X 3703(\(Anything)X 864 1464(within)N 1093(the)X 1216(<>)X 1331(brackets)X 1624(must)X 1804(match)X 2025(exactly;)X 2305(a)X 2367(`#')X 2487(stands)X 2713(for)X 2833(a)X 2895(wild)X 3063(card)X 3228(of)X 3321(arbitrary)X 3624(length;)X 3872(A)X 3956(`;')X 864 1584(serves)N 1088(as)X 1178(the)X 1299(Boolean)X 1589(AND)X 1786(operation,)X 2132(namely)X 2391(all)X 2494(patterns)X 2771(must)X 2949(appear)X 3187(within)X 3413(the)X 3533(same)X 3720(line;)X 3904(a)X 3962(`)X 9 f 3989(|)X 1 f (')S 864 1704(is)N 937(the)X 1055(regular)X 1303(expression)X 1666(union)X 1868(operation;)X 2213(and)X 2349(a)X 2405(`*')X 2519(is)X 
2592(the)X 2710(Kleene)X 2958(closure.\))X 3277(The)X 3422(results)X 3651(are)X 3771(given)X 3970(in)X 864 1824(Table)N 1068(3)X 1129(\(the)X 1275(\256le)X 1398(was)X 1544(the)X 1663(same)X 1849(bibliographic)X 2297(\256le)X 2420(used)X 2588(in)X 2671(Table)X 2874(1\).)X 3001(The)X 3146(best)X 3295(algorithm)X 3626(we)X 3740(know)X 3938(for)X 864 1944(approximate)N 1291(matching)X 1615(to)X 1703(arbitrary)X 2006(regular)X 2260(expressions)X 2660(is)X 2739(by)X 2846(Myers)X 3078(and)X 3221(Miller)X 3448([MM89].)X 3791(Its)X 3898(run-)X 864 2064(ning)N 1034(times)X 1235(for)X 1357(the)X 1483(cases)X 1681(we)X 1803(tested)X 2018(were)X 2203(more)X 2396(than)X 2562(an)X 2666(order)X 2864(of)X 2959(magnitude)X 3325(slower)X 3567(than)X 3733(our)X 3867(algo-)X 864 2184(rithm,)N 1083(but)X 1211(this)X 1352(is)X 1431(not)X 1559(a)X 1621(fair)X 1759(test,)X 1916(because)X 2197(Myers)X 2428(and)X 2570(Miller's)X 2854(algorithm)X 3191(can)X 3329(handle)X 3569(arbitrary)X 3872(costs)X 864 2304(\(which)N 1120(we)X 1246(cannot)X 1492(handle\))X 1765(and)X 1913(its)X 2020(running)X 2301(time)X 2475(is)X 2560(independent)X 2984(of)X 3083(the)X 3213(number)X 3490(of)X 3589(errors)X 3809(\(which)X 864 2424(makes)N 1089(it)X 1153(high)X 1315(for)X 1429(small)X 1622(errors\).)X 10 f 1381 2746(i)N 1394(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X 1 f 1726 2866(agrep)N 2501(MN2)X 3111(Uk85a)X 9 f 1706 2986(S)N 1 f 1773(=)X 1838(2)X 9 f 1998(S)X 1 f 2065(=)X 2130(30)X 9 f 2334(S)X 1 f 2401(=)X 2466(2)X 9 f 2630(S)X 1 f 2697(=)X 2762(30)X 9 f 2966(S)X 1 f 3033(=)X 3098(2)X 9 f 3262(S)X 1 f 3329(=)X 3394(30)X 10 f 1381 3026(i)N 1394(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X 1 f 1421 3146(k)N 1481(=)X 1546(0)X 1722(0.35)X 2034(0.35)X 2350(1.21)X 2666(0.98)X 2982(2.36)X 3298(0.90)X 1421 3266(k)N 1481(=)X 1546(1)X 1722(0.52)X 2034(0.38)X 2350(3.03)X 2666(2.42)X 2982(5.01)X 3298(2.06)X 1421 3386(k)N 1481(=)X 1546(2)X 1722(1.78)X 2034(0.38)X 2350(4.94)X 2666(3.87)X 
2982(7.93)X 3298(3.19)X 1421 3506(k)N 1481(=)X 1546(3)X 1722(2.56)X 2034(0.39)X 2350(6.68)X 2666(5.33)X 2962(11.80)X 3298(4.38)X 1421 3626(k)N 1481(=)X 1546(4)X 1722(3.83)X 2034(0.41)X 2350(8.72)X 2666(6.89)X 2962(13.40)X 3298(5.55)X 1421 3746(k)N 1481(=)X 1546(5)X 1722(4.42)X 2034(0.42)X 2330(10.41)X 2666(8.28)X 2962(15.45)X 3298(6.77)X 1421 3866(k)N 1481(=)X 1546(6)X 1722(5.13)X 2034(0.73)X 2330(11.83)X 2666(9.75)X 2962(17.07)X 3298(7.99)X 10 f 1381 3906(i)N 1394(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X 1381(c)X 3866(c)Y 3786(c)Y 3706(c)Y 3626(c)Y 3546(c)Y 3466(c)Y 3386(c)Y 3306(c)Y 3226(c)Y 3146(c)Y 3066(c)Y 2986(c)Y 2906(c)Y 2826(c)Y 1646 3906(c)N 3826(c)Y 3746(c)Y 3666(c)Y 3586(c)Y 3506(c)Y 3426(c)Y 3346(c)Y 3266(c)Y 3186(c)Y 3106(c)Y 2270 3906(c)N 3866(c)Y 3786(c)Y 3706(c)Y 3626(c)Y 3546(c)Y 3466(c)Y 3386(c)Y 3306(c)Y 3226(c)Y 3146(c)Y 3066(c)Y 2986(c)Y 2906(c)Y 2826(c)Y 2902 3906(c)N 3866(c)Y 3786(c)Y 3706(c)Y 3626(c)Y 3546(c)Y 3466(c)Y 3386(c)Y 3306(c)Y 3226(c)Y 3146(c)Y 3066(c)Y 2986(c)Y 2906(c)Y 2826(c)Y 3514 3906(c)N 3866(c)Y 3786(c)Y 3706(c)Y 3626(c)Y 3546(c)Y 3466(c)Y 3386(c)Y 3306(c)Y 3226(c)Y 3146(c)Y 3066(c)Y 2986(c)Y 2906(c)Y 2826(c)Y 1 f 1537 4123(Table)N 1740(2:)X 1842(Approximate)X 2285(string)X 2487(matching)X 2805(of)X 2892(simple)X 3125(strings.)X 10 f 1335 4292(i)N 1361(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X 1 f 1375 4412(pattern)N 2501(k)X 2561(=)X 2626(0)X 2786(k)X 2846(=)X 2911(1)X 3071(k)X 3131(=)X 3196(2)X 3356(k)X 3416(=)X 3481(3)X 10 f 1335 4452(i)N 1361(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X 1 f 1375 4572(Homogenious)N 2513(0.35)X 2798(0.39)X 3083(0.41)X 3368(0.47)X 1375 4692(ogenious)N 2513(0.53)X 2798(1.10)X 3083(1.42)X 3368(1.74)X 1375 4812(JACM;)N 1630(1981;)X 1832(Graph)X 2513(0.53)X 2798(1.10)X 3083(1.43)X 3368(1.75)X 1375 4932(Prob#tic;)N 1688(Algo#m)X 2513(0.55)X 2798(1.10)X 3083(1.42)X 3368(1.76)X 1375 5052(<[CJ]ACM>;)N 1827(Prob#tic;)X 2140(trees)X 2513(0.54)X 2798(1.11)X 
3083(1.43)X 3368(1.75)X 1375 5172(\(<[23]>\\)N 9 f 1648(-)X 1 f 1692([23]*)X 9 f 1866(|)X 1 f (\).*ees)S 2513(0.66)X 2798(1.53)X 3083(2.19)X 3368(2.83)X 10 f 1335 5212(i)N 1361(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X 1335(c)X 5172(c)Y 5092(c)Y 5012(c)Y 4932(c)Y 4852(c)Y 4772(c)Y 4692(c)Y 4612(c)Y 4532(c)Y 4452(c)Y 4372(c)Y 3561 5212(c)N 5172(c)Y 5092(c)Y 5012(c)Y 4932(c)Y 4852(c)Y 4772(c)Y 4692(c)Y 4612(c)Y 4532(c)Y 4452(c)Y 4372(c)Y 1 f 1528 5429(Table)N 1731(3:)X 1833(Approximate)X 2276(matching)X 2594(of)X 2681(complicated)X 3093(patterns.)X 14 p %%Page: 14 15 10 s 10 xH 0 xS 1 f 3 f 2408 696(14)N 6 f 14 s 864 984(5.)N 1019(A)X 1131(Description)X 1777(of)X 1914(agrep)X 2 f 10 s 864 1137(agrep)N 1 f 1078(is)X 1158(used)X 1332(similarly)X 1643(to)X 1732(egrep)X 1938(and)X 2082(it)X 2154(supports)X 2453(most)X 2636(of)X 2731(the)X 2857(features)X 3140(of)X 3235(egrep)X 3442(\(although)X 3777(it)X 3849(is)X 3930(not)X 864 1257(100%)N 1079(compatible\))X 1490(and)X 1634(many)X 1840(additional)X 2188(features.)X 2511(One)X 2673(notable)X 2936(difference)X 3290(between)X 3585(agrep)X 3791(and)X 3934(the)X 864 1377(grep)N 1042(family)X 1286(\(besides)X 1584(the)X 1717(additional)X 2072(features\))X 2389(is)X 2477(that)X 2632(agrep)X 2846(is)X 2 f 2934(record)X 3184(oriented)X 1 f 3487(\(rather)X 3738(than)X 3912(line)X 864 1497(oriented\).)N 1230(A)X 1324(record)X 1566(is)X 1655(de\256ned)X 1927(by)X 2043(the)X 2176(user)X 2345(\(with)X 2549(the)X 2682(default)X 2940(being)X 3153(a)X 3224(line\).)X 3446(Agrep)X 3682(outputs)X 3952(all)X 864 1617(records)N 1135(that)X 1289(match)X 1519(the)X 1651(query)X 1868(\(see)X 2032(also)X 2195(the)X 2327(-d)X 2428(option)X 2666(described)X 3008(below\).)X 3305(Agrep)X 3540(is)X 3627(available)X 3952(by)X 864 1737(anonymous)N 1253(ftp)X 1362(from)X 1538(cs.arizona.edu)X 2018(\(IP)X 2136(number)X 2401(192.12.69.5\).)X 3 f 1008 1921(agrep)N 1 f 2 f 1250(pattern)X 1 f 1527(\256le-name)X 1856(\320)X 1962(\256nds)X 
2143(all)X 2249 0.3125(occurrences)AX 2661(of)X 2 f 2755(pattern)X 1 f 3013(\(that)X 3187(is,)X 3287(output)X 3518(all)X 3625(records)X 3889(con-)X 864 2041(taining)N 2 f 1107(pattern)X 1 f 1338(\))X 1386(in)X 1469(the)X 1588(text)X 1729(\256le)X 1852(\256le-name)X 2176(without)X 2441(errors.)X 2690(The)X 2836(pattern)X 3080(can)X 3213(be)X 3310(a)X 3367(string)X 3570(or)X 3658(an)X 3755(arbitrary)X 864 2161(regular)N 1114(expression.)X 1519(We)X 1653(describe)X 1943(below)X 2162(the)X 2283(unusual)X 2555(features)X 2833(of)X 2923(agrep)X 3125(that)X 3268(are)X 3390(not)X 3515(found)X 3725(in)X 3810(similar)X 864 2281(programs.)N 1227(A)X 1305(complete)X 1619(description)X 1995(is)X 2068(given)X 2266(in)X 2348(the)X 2466(manual)X 2722(pages)X 2925(distributed)X 3287(with)X 3449(agrep.)X 864 2434(-)N 2 f 891(k)X 1 f 1064(\256nds)X 1240(all)X 1341 0.3125(occurrences)AX 1747(with)X 1910(at)X 1989(most)X 2 f 2165(k)X 1 f 2222(errors)X 2431(\(insertions,)X 2810(deletions,)X 3141(or)X 3230(substitutions\),)X 3702(where)X 2 f 3921(k)X 1 f 3979(is)X 1064 2554(a)N 1120(positive)X 1393(integer.)X 864 2707(-D)N 2 f 949(c)X 1 f 1064(each)X 1232(deletion)X 1510(counts)X 1739(as)X 2 f 1826(c)X 1 f 1882(errors;)X 2 f 2112(c)X 1 f 2168(must)X 2343(be)X 2439(a)X 2495(non-negative)X 2934(integer;)X 3199(the)X 3317(default)X 3560(value)X 3754(of)X 2 f 3841(c)X 1 f 3897(is)X 3970(1)X 864 2860(-I)N 2 f 918(c)X 1 f 1064(each)X 1232(insertion)X 1532(counts)X 1761(as)X 2 f 1848(c)X 1 f 1904(errors;)X 2 f 2134(c)X 1 f 2190(must)X 2365(be)X 2461(a)X 2517(non-negative)X 2956(integer;)X 3221(the)X 3339(default)X 3582(value)X 3776(of)X 2 f 3863(c)X 1 f 3919(is)X 3992(1)X 864 3013(-S)N 2 f 935(c)X 1 f 1064(each)X 1234(substitution)X 1628(counts)X 1859(as)X 2 f 1948(c)X 1 f 2006(errors;)X 2 f 2238(c)X 1 f 2296(must)X 2474(be)X 2573(a)X 2632(non-negative)X 3074(integer;)X 3342(the)X 3463(default)X 3709(value)X 3906(of)X 2 f 3996(c)X 1 f 1064 3133(is)N 1137(1)X 864 3286(-d)N 951(`)X 3 f 
978(delim)X 1 f 1169(')X 3 f 1064 3406(delim)N 1 f 1282(is)X 1362(a)X 1425(user-de\256ned)X 1849(symbol)X 2111(\(or)X 2232(string\))X 2468(for)X 2589(record)X 2823(delimiter)X 3140(\(the)X 3293(default)X 3544(is)X 3625(the)X 3751(new-line)X 1064 3526(symbol\).)N 1386(This)X 1548(enables)X 1809(searching)X 2137(paragraphs)X 2510(\(in)X 2619(which)X 2835(case)X 3 f 2994(delim)X 1 f 3205(=)X 3270(2)X 3330(new)X 3484(lines)X 3655(in)X 3737(a)X 3793(row\))X 3965(or)X 1064 3646(mail)N 1239(messages)X 1575(\()X 3 f 1602(delim)X 1 f 1826(=)X 1904('\303From)X 2151('\).)X 2278(This)X 2453(feature)X 2710(makes)X 2948(agrep)X 3160(a)X 3230(record-oriented)X 3760(program)X 1064 3766(rather)N 1272(than)X 1430(just)X 1565(a)X 1621(line-oriented)X 2051(program.)X 2383(We)X 2515(believe)X 2767(that)X 2907(it)X 2971(will)X 3115(be)X 3211(very)X 3374(useful.)X 6 f 12 s 864 3919(Examples)N 2 f 10 s 864 4072(agrep)N 1 f 1071(-3)X 1158(-D2)X 1064 4192(\256nds)N 1250(all)X 1361 0.3125(occurrences)AX 1777(with)X 1950(at)X 2039(most)X 2225(3)X 2297(errors)X 2517(where)X 2746(a)X 2814(deletion)X 3104(counts)X 3345(as)X 3444(2)X 3516(errors)X 3736(and)X 3884(each)X 1064 4312(insertion)N 1364(or)X 1451(substitution)X 1843(counts)X 2072(as)X 2159(one)X 2295(error.)X 2 f 864 4465(agrep)N 1 f 1071(-4)X 1158(-I5)X 1064 4585(\256nds)N 1246(all)X 1353 0.3125(occurrences)AX 1765(with)X 1934(at)X 2019(most)X 2201(4)X 2268(errors)X 2483(but)X 2612(no)X 2719(insertions)X 3057(allowed)X 3338(\(because)X 3647(their)X 3822(cost)X 3979(is)X 1064 4705(prohibited\).)N 864 4858(agrep)N 1063(-d)X 1150('\303From)X 1397(')X 1444('breakdown;)X 1870(\(inter)X 9 f 2044(|)X 1 f (arpa)S 9 f 2199(|)X 1 f (bit\)net')S 1064 4978(outputs)N 1324(all)X 1429(mail)X 1596(messages)X 1924(\(the)X 2074(pattern)X 2322('\303From)X 2569(')X 2621(separates)X 2942(mail)X 3110(messages)X 3439(in)X 3527(a)X 3589(mail)X 3757(\256le\))X 3912(that)X 1064 5098(contain)N 1320(breakdown)X 1697(and)X 1833(one)X 1969(of)X 2056(either)X 
2259(internet,)X 2544(arpanet,)X 2821(or)X 2908(bitnet.)X 2 f 864 5251(agrep)N 1 f 1071(-d)X 1158('$$')X 1312(-1)X 1399(')X 1761(')X 1064 5371(\256nds)N 1248(all)X 1357(paragraphs)X 1739(that)X 1888(contain)X 2153(word1)X 2387(followed)X 2701(by)X 2810(word2)X 3044(with)X 3215(one)X 3360(error)X 3546(in)X 3637(place)X 3837(of)X 3934(the)X 1064 5491(blank)N 1264(between)X 1554(the)X 1673(words)X 1890(\(the)X 2036(<>)X 2147(indicate)X 2422(that)X 2563(no)X 2664(error)X 2842(is)X 2916(allowed)X 3191(inside;)X 3425(see)X 3549(section)X 3797(3.4\).)X 3965(In)X 1064 5611(particular,)N 1412(if)X 1481(word1)X 1706(is)X 1779(the)X 1897(last)X 2028(word)X 2213(in)X 2295(a)X 2351(line)X 2491(and)X 2627(word2)X 2852(is)X 2925(the)X 3043(\256rst)X 3187(word)X 3372(in)X 3455(the)X 3574(next)X 3733(line,)X 3894(then)X 15 p %%Page: 15 16 10 s 10 xH 0 xS 1 f 3 f 2408 696(15)N 1 f 1064 984(the)N 1187(space)X 1391(will)X 1540(be)X 1641(substituted)X 2012(by)X 2117(a)X 2178(newline)X 2457(symbol)X 2716(and)X 2856(it)X 2924(will)X 3072(match.)X 3332(Thus,)X 3536(this)X 3675(is)X 3752(a)X 3812(way)X 3970(to)X 1064 1104(overcome)N 1407(separation)X 1763(by)X 1869(a)X 1931(newline.)X 2251(Note)X 2433(that)X 2579(-d)X 2672('')X 2752(\(or)X 2873(another)X 3141(delim)X 3350(that)X 3497(spans)X 3702(more)X 3894(than)X 1064 1224(one)N 1200(line\))X 1367(is)X 1440(necessary,)X 1793(because)X 2068(otherwise)X 2400(agrep)X 2599(searches)X 2892(only)X 3054(one)X 3190(line)X 3330(at)X 3408(a)X 3464(time.)X 6 f 14 s 864 1496(6.)N 1019 -0.3063(Conclusions)AX 1 f 10 s 864 1649(Searching)N 1208(text)X 1351(in)X 1436(the)X 1557(presence)X 1862(of)X 1952(errors)X 2163(is)X 2239(commonly)X 2604(done)X 2783(`by)X 2914(hand')X 3121(\320)X 3225(one)X 3365(tries)X 3527(all)X 3631(possibilities.)X 864 1769(This)N 1033(is)X 1113(frustrating,)X 1494(slow,)X 1692(and)X 1834(with)X 2002(no)X 2108(guarantee)X 2447(of)X 2540(success.)X 2847(The)X 2998(new)X 3158(algorithm)X 3495(presented)X 3829(in)X 3917(this)X 864 
1889(paper)N 1073(for)X 1197(searching)X 1535(with)X 1707(errors)X 1925(can)X 2068(alleviate)X 2371(this)X 2517(problem)X 2815(and)X 2962(make)X 3167(searching)X 3506(in)X 3599(general)X 3867(more)X 864 2009(robust.)N 1127(It)X 1198(also)X 1349(makes)X 1576(searching)X 1906(more)X 2093(convenient)X 2467(by)X 2569(not)X 2693(having)X 2933(to)X 3017(spell)X 3190(everything)X 3555(precisely.)X 3907(The)X 864 2129(algorithm)N 1195(is)X 1268(very)X 1431(fast)X 1567(and)X 1703(general)X 1960(and)X 2096(it)X 2160(should)X 2393(\256nd)X 2537(numerous)X 2873(applications.)X 1064 2282(There)N 1273(is)X 1347(one)X 1484(important)X 1816(area)X 1972(of)X 2060(searching)X 2389(with)X 2552(errors)X 2761(that)X 2902(we)X 3017(did)X 3140(not)X 3263(address)X 3525(\320)X 3626(searching)X 3956(an)X 864 2402(indexed)N 1147(\256le.)X 1318(Throughout)X 1725(the)X 1852(paper)X 2060(we)X 2183(assumed)X 2488(that)X 2637(the)X 2764(\256les)X 2926(are)X 3054(not)X 3185(indexed)X 3468 0.2500(\(preprocessed\))AX 3970(in)X 864 2522(any)N 1000(way,)X 1174(thus)X 1327(a)X 1383(sequential)X 1728(scan)X 1891(through)X 2160(them)X 2340(is)X 2413(necessary.)X 2786(We)X 2918(believe)X 3170(that)X 3311(the)X 3430(problem)X 3718(of)X 3806(\256nding)X 864 2642(good)N 1060(indexing)X 1376(schemes)X 1684(that)X 1840(allow)X 2053(approximate)X 2489(search)X 2730(is)X 2818(the)X 2951(main)X 3146(open)X 3337(problem)X 3639(in)X 3727(this)X 3877(area.)X 864 2762(Unfortunately,)N 1355(we)X 1470(do)X 1571(not)X 1694(know)X 1893(of)X 1981(any)X 2118(satisfactory)X 2509(solution)X 2787(at)X 2866(this)X 3002(point.)X 3228(However,)X 3565(with)X 3729(the)X 3849(speed)X 864 2882(of)N 957(current)X 1211(computers,)X 1590(scanning)X 1900(large)X 2086(\256les)X 2244(\(up)X 2376(to)X 2463(tens)X 2617(of)X 2709(megabytes\))X 3104(can)X 3241(be)X 3342(done)X 3523(reasonably)X 3896(fast.)X 864 3002(One)N 1023(can)X 1160(argue)X 1364(that)X 1509(the)X 1632(size)X 1782(of)X 1874(our)X 2006(data)X 2165(increases)X 2485(as)X
2577(our)X 2709(speed)X 2918(of)X 3011(processing)X 3380(it)X 3450(increases.)X 3811(This)X 3979(is)X 864 3122(certainly)N 1173(true)X 1326(for)X 1448(some)X 1645(applications,)X 2079(but)X 2208(not)X 2337(for)X 2458(all.)X 2605(Many)X 2819(applications)X 3233(have)X 3412(an)X 3515(upper)X 3725(bound)X 3952(on)X 864 3242(size)N 1009(and)X 1145(sequential)X 1490(search)X 1716(for)X 1830(those)X 2019(applications)X 2426(will)X 2570(be)X 2666(realistic.)X 6 f 14 s 864 3482(Acknowledgem)N 1683(ents:)X 1 f 10 s 864 3635(We)N 1000(thank)X 1202(Ricardo)X 1480(Baeza-Yates,)X 1931(Gene)X 2125(Myers,)X 2375(and)X 2516(Chunghwa)X 2888(H.)X 2991(Rao)X 3145(for)X 3264(many)X 3467(helpful)X 3719(conversa-)X 864 3755(tions)N 1051(about)X 1261(approximate)X 1694(string)X 1908(matching)X 2238(and)X 2386(for)X 2512(comments)X 2873(that)X 3025(improved)X 3364(the)X 3493(manuscript.)X 3920(We)X 864 3875(thank)N 1084(Ric)X 1237(Anderson,)X 1611(Cliff)X 1804(Hathaway,)X 2192(and)X 2351(Shu-Ing)X 2652(Tsuei)X 2873(for)X 3010(their)X 3200(help)X 3381(and)X 3540(comments)X 3912(that)X 864 3995(improved)N 1193(the)X 1312(implementation)X 1835(of)X 1923(agrep.)X 2163(We)X 2296(also)X 2446(thank)X 2645(William)X 2928(I.)X 2996(Chang)X 3226(for)X 3341(kindly)X 3566(providing)X 3898(pro-)X 864 4115(grams)N 1080(for)X 1194(some)X 1383(of)X 1470(the)X 1588(experiments.)X 6 f 14 s 864 4355(References)N 1 f 10 s 864 4628([BG89])N 1064 4748(Baeza-Yates)N 1492(R.)X 1586(A.,)X 1705(and)X 1842(G.)X 1941(H.)X 2040(Gonnet,)X 2317(``A)X 2450(new)X 2605(approach)X 2921(to)X 3004(text)X 3145(searching,'')X 2 f 3548(Proceedings)X 3970(of)X 1064 4868(the)N 1192(12th)X 1364(Annual)X 1624(ACM-SIGIR)X 2050(conference)X 2432(on)X 2541(Information)X 2952(Retrieval,)X 1 f 3295(Cambridge,)X 3700(MA)X 3858(\(June)X 1064 4988(1989\),)N 1291(pp.)X 1411(168)X 9 f (-)S 1 f 1575(175.)X 864 5141([BM77])N 1064 5261(Boyer)N 1284(R.)X 1381(S.,)X 1489(and)X 1629(J.)X 1704(S.)X 1793(Moore,)X 2052(``A)X 2189(fast)X 2330(string)X 
2537(searching)X 2870(algorithm,'')X 2 f 3280(Communications)X 3847(of)X 3934(the)X 1064 5381(ACM,)N 3 f 1273(20)X 1 f 1373(\(October)X 1679(1977\),)X 1906(pp.)X 2026(762)X 9 f (-)S 1 f 2190(772.)X 864 5534([CL90])N 1064 5654(Chang)N 1294(W.)X 1411(I.,)X 1499(and)X 1636(E.)X 1726(L.)X 1816(Lawler,)X 2085(``Approximate)X 2584(string)X 2788(matching)X 3108(in)X 3192(sublinear)X 3508(expected)X 3816(time,'')X 16 p %%Page: 16 17 10 s 10 xH 0 xS 1 f 3 f 2408 696(16)N 1 f 1064 984(FOCS)N 1283(90,)X 1403(pp.)X 1523(116)X 9 f (-)S 1 f 1687(124.)X 864 1137([GG88])N 1064 1257(Galil)N 1250(Z.,)X 1365(and)X 1507(R.)X 1606(Giancarlo,)X 1969(``Data)X 2201(structures)X 2539(and)X 2682(algorithms)X 3051(for)X 3172(approximate)X 3600(string)X 3809(match-)X 1064 1377(ing,'')N 2 f 1260(Journal)X 1529(of)X 1611(Complexity,)X 3 f 2016(4)X 1 f 2076(\(1988\),)X 2330(pp.)X 2450(33)X 9 f (-)S 1 f 2574(72.)X 864 1530([GP90])N 1064 1650(Galil)N 1247(Z.,)X 1359(and)X 1498(K.)X 1599(Park,)X 1789(``An)X 1964(improved)X 2294(algorithm)X 2628(for)X 2745(approximate)X 3169(string)X 3374(matching,'')X 2 f 3769(SIAM)X 3976(J.)X 1064 1770(on)N 1164(Computing,)X 3 f 1559(19)X 1 f 1659(\(December)X 2037(1990\),)X 2264(pp.)X 2384(989)X 9 f (-)S 1 f 2548(999.)X 864 1923([GB91])N 1064 2043(Gonnet,)N 1360(G.)X 1478(H.)X 1596(and)X 1752(R.)X 1865(A.)X 1983(Baeza-Yates,)X 2 f 2450(Handbook)X 2824(of)X 2926(Algorithms)X 3321(and)X 3482(Data)X 3683(Structures,)X 1 f 1064 2163(Second)N 1320(Edition,)X 1595(Addison-Wesley,)X 2174(Reading,)X 2481(MA,)X 2650(1991.)X 864 2316([HD80])N 1064 2436(Hall,)N 1250(P.)X 1343(A.)X 1450(V.,)X 1577(and)X 1722(G.)X 1829(R.)X 1931(Dowling,)X 2260(``Approximate)X 2766(string)X 2977(matching,'')X 2 f 3378(Computing)X 3762(Surveys,)X 1 f 1064 2556(\(December)N 1442(1980\),)X 1669(pp.)X 1789(381)X 9 f (-)S 1 f 1953(402.)X 864 2709([HU79])N 1064 2829(Hopcroft,)N 1396(J.E.,)X 1558(and)X 1696(J.D.)X 1847(Ullman,)X 2 f 2129(Introduction)X 2551(to)X 2635(Automata)X 2968(Theory,)X 
3237(Languages,)X 3631(and)X 3774(Compu-)X 1064 2949(tation)N 1 f 1250(,)X 1290(Addison-Wesley,)X 1869(Reading,)X 2176(Mass)X 2365(\(1979\).)X 864 3102([KMP77])N 1064 3222(Knuth)N 1287(D.)X 1388(E.,)X 1501(J.)X 1576(H.)X 1678(Morris,)X 1940(and)X 2080(V.)X 2182(R.)X 2279(Pratt,)X 2474(``Fast)X 2685(pattern)X 2932(matching)X 3254(in)X 3340(strings,'')X 2 f 3651(SIAM)X 3858(Jour-)X 1064 3342(nal)N 1186(on)X 1286(Computing,)X 3 f 1681(6)X 1 f 1741(\(June)X 1935(1977\),)X 2162(pp.)X 2282(323)X 9 f (-)S 1 f 2446(350.)X 864 3495([LV88])N 1064 3615(Landau)N 1326(G.)X 1425(M.,)X 1557(and)X 1694(U.)X 1793(Vishkin,)X 2087(``Fast)X 2295(string)X 2499(matching)X 2819(with)X 2983(k)X 3045 0.2692(differences,'')AX 2 f 3499(Journal)X 3770(of)X 3854(Com-)X 1064 3735(puter)N 1253(and)X 1393(System)X 1636(Sciences,)X 3 f 1953(37)X 1 f 2053(\(1988\),)X 2307(pp.)X 2427(63)X 9 f (-)S 1 f 2551(78.)X 864 3888([LV89])N 1064 4008(Landau)N 1338(G.)X 1449(M.,)X 1593(and)X 1742(U.)X 1853(Vishkin,)X 2159(``Fast)X 2379(parallel)X 2653(and)X 2802(serial)X 3009(approximate)X 3444(string)X 3660(matching,'')X 2 f 1064 4128(Journal)N 1333(of)X 1415(Algorithms,)X 3 f 1810(10)X 1 f 1910(\(1989\).)X 864 4281([Le66])N 1064 4401(Levenshtein,)N 1509(V.)X 1620(I.,)X 1720(``Binary)X 2025(codes)X 2241(capable)X 2520(of)X 2621(correcting)X 2981(deletions,)X 3324(insertions,)X 3689(and)X 3839(rever-)X 1064 4521(sals,'')N 2 f 1278(Sov.)X 1434(Phys.)X 1630(Dokl.,)X 1 f 1846(\(February)X 2183(1966\),)X 2410(pp.)X 2530(707)X 9 f (-)S 1 f 2694(710.)X 864 4674([My86])N 1064 4794(Myers,)N 1309(E.)X 1398(W.,)X 1534(``An)X 1706(O\(ND\))X 1954(difference)X 2301(algorithm)X 2632(and)X 2768(its)X 2863(variations,'')X 2 f 3274(Algorithmica,)X 3 f 3737(1)X 1 f 3798(\(1986\),)X 1064 4914(pp.)N 1184(251)X 9 f (-)S 1 f 1348(266.)X 864 5067([MM89])N 1064 5187(Myers,)N 1321(E.)X 1422(W.,)X 1570(and)X 1719(W.)X 1848(Miller,)X 2101(``Approximate)X 2611(matching)X 2942(of)X 3042(regular)X 3303(expressions,'')X 2 f 3784(Bull.)X 
3970(of)X 1064 5307(Mathematical)N 1529(Biology)X 1 f 1778(,)X 3 f 1818(51)X 1 f 1918(\(1989\),)X 2172(pp.)X 2292(5)X 9 f (-)S 1 f 2376(37.)X 864 5460([TU90])N 1064 5580(Tarhio)N 1322(J.,)X 1437(and)X 1597(E.)X 1710(Ukkonen,)X 2068(``Approximate)X 2589(Boyer-Moore)X 3071(string)X 3298(matching,'')X 3715(Technical)X 1064 5700(Report)N 1302(#A-1990-3,)X 1694(Dept.)X 1890(of)X 1977(Computer)X 2317(Science,)X 2607(University)X 2965(of)X 3052(Helsinki)X 3343(\(March)X 3600(1990\))X 17 p %%Page: 17 18 10 s 10 xH 0 xS 1 f 3 f 2408 696(17)N 1 f 864 984([Uk85a])N 1064 1104(Ukkonen)N 1387(E.,)X 1505(``Finding)X 1836(approximate)X 2266(patterns)X 2549(in)X 2640(strings,'')X 2 f 2956(Journal)X 3234(of)X 3325(Algorithms,)X 3 f 3729(6)X 1 f 3798(\(1985\),)X 1064 1224(pp.)N 1184(132)X 9 f (-)S 1 f 1348(137.)X 864 1377([Uk85b])N 1064 1497(Ukkonen)N 1382(E.,)X 1495(``Algorithms)X 1938(for)X 2057(approximate)X 2483(string)X 2690(matching,'')X 2 f 3087(Information)X 3494(and)X 3639(Control,)X 3 f 3932(64)X 1 f (,)S 1064 1617(\(1985\),)N 1318(pp.)X 1438(100)X 9 f (-)S 1 f 1602(118.)X 18 p %%Trailer xt xs debian/doc/agrep.ps.20000644000000000000000000026017611721405564011556 0ustar %!PS-Adobe-1.0 %%Creator: optima:sw (Sun Wu) %%Title: stdin (ditroff) %%CreationDate: Thu Jan 9 13:58:28 1992 %%EndComments % Start of psdit.pro -- prolog for ditroff translator % Copyright (c) 1985,1987 Adobe Systems Incorporated. All Rights Reserved. 
% GOVERNMENT END USERS: See Notice file in TranScript library directory % -- probably /usr/lib/ps/Notice % RCS: $Header: psdit.pro,v 2.2 87/11/17 16:40:42 byron Rel $ % Psfig RCSID $Header: psdit.pro,v 1.5 88/01/04 17:48:22 trevor Exp $ /$DITroff 180 dict def $DITroff begin /DocumentInitState [ matrix currentmatrix currentlinewidth currentlinecap currentlinejoin currentdash currentgray currentmiterlimit ] cvx def %% Psfig additions /startFig { /SavedState save def userdict maxlength dict begin currentpoint transform DocumentInitState setmiterlimit setgray setdash setlinejoin setlinecap setlinewidth setmatrix itransform moveto /ury exch def /urx exch def /lly exch def /llx exch def /y exch 72 mul resolution div def /x exch 72 mul resolution div def currentpoint /cy exch def /cx exch def /sx x urx llx sub div def % scaling for x /sy y ury lly sub div def % scaling for y sx sy scale % scale by (sx,sy) cx sx div llx sub cy sy div ury sub translate /DefFigCTM matrix currentmatrix def /initmatrix { DefFigCTM setmatrix } def /defaultmatrix { DefFigCTM exch copy } def /initgraphics { DocumentInitState setmiterlimit setgray setdash setlinejoin setlinecap setlinewidth setmatrix DefFigCTM setmatrix } def /showpage { initgraphics } def } def % Args are llx lly urx ury (in figure coordinates) /clipFig { currentpoint 6 2 roll newpath 4 copy 4 2 roll moveto 6 -1 roll exch lineto exch lineto exch lineto closepath clip newpath moveto } def % doclip, if called, will always be just after a `startfig' /doclip { llx lly urx ury clipFig } def /endFig { end SavedState restore } def /globalstart { % Push details about the enviornment on the stack. 
fontnum fontsize fontslant fontheight % firstpage mh my resolution slotno currentpoint pagesave restore gsave } def /globalend { grestore moveto /slotno exch def /resolution exch def /my exch def /mh exch def % /firstpage exch def /fontheight exch def /fontslant exch def /fontsize exch def /fontnum exch def F /pagesave save def } def %% end XMOD additions /fontnum 1 def /fontsize 10 def /fontheight 10 def /fontslant 0 def /xi {0 72 11 mul translate 72 resolution div dup neg scale 0 0 moveto /fontnum 1 def /fontsize 10 def /fontheight 10 def /fontslant 0 def F /pagesave save def}def /PB{save /psv exch def currentpoint translate resolution 72 div dup neg scale 0 0 moveto}def /PE{psv restore}def /m1 matrix def /m2 matrix def /m3 matrix def /oldmat matrix def /tan{dup sin exch cos div}bind def /point{resolution 72 div mul}bind def /dround {transform round exch round exch itransform}bind def /xT{/devname exch def}def /xr{/mh exch def /my exch def /resolution exch def}def /xp{}def /xs{docsave restore end}def /xt{}def /xf{/fontname exch def /slotno exch def fontnames slotno get fontname eq not {fonts slotno fontname findfont put fontnames slotno fontname put}if}def /xH{/fontheight exch def F}bind def /xS{/fontslant exch def F}bind def /s{/fontsize exch def /fontheight fontsize def F}bind def /f{/fontnum exch def F}bind def /F{fontheight 0 le {/fontheight fontsize def}if fonts fontnum get fontsize point 0 0 fontheight point neg 0 0 m1 astore fontslant 0 ne{1 0 fontslant tan 1 0 0 m2 astore m3 concatmatrix}if makefont setfont .04 fontsize point mul 0 dround pop setlinewidth}bind def /X{exch currentpoint exch pop moveto show}bind def /N{3 1 roll moveto show}bind def /Y{exch currentpoint pop exch moveto show}bind def /S /show load def /ditpush{}def/ditpop{}def /AX{3 -1 roll currentpoint exch pop moveto 0 exch ashow}bind def /AN{4 2 roll moveto 0 exch ashow}bind def /AY{3 -1 roll currentpoint pop exch moveto 0 exch ashow}bind def /AS{0 exch ashow}bind def /MX{currentpoint exch 
pop moveto}bind def /MY{currentpoint pop exch moveto}bind def /MXY /moveto load def /cb{pop}def % action on unknown char -- nothing for now /n{}def/w{}def /p{pop showpage pagesave restore /pagesave save def}def /abspoint{currentpoint exch pop add exch currentpoint pop add exch}def /dstroke{currentpoint stroke moveto}bind def /Dl{2 copy gsave rlineto stroke grestore rmoveto}bind def /arcellipse{oldmat currentmatrix pop currentpoint translate 1 diamv diamh div scale /rad diamh 2 div def rad 0 rad -180 180 arc oldmat setmatrix}def /Dc{gsave dup /diamv exch def /diamh exch def arcellipse dstroke grestore diamh 0 rmoveto}def /De{gsave /diamv exch def /diamh exch def arcellipse dstroke grestore diamh 0 rmoveto}def /Da{currentpoint /by exch def /bx exch def /fy exch def /fx exch def /cy exch def /cx exch def /rad cx cx mul cy cy mul add sqrt def /ang1 cy neg cx neg atan def /ang2 fy fx atan def cx bx add cy by add 2 copy rad ang1 ang2 arcn stroke exch fx add exch fy add moveto}def /Barray 200 array def % 200 values in a wiggle /D~{mark}def /D~~{counttomark Barray exch 0 exch getinterval astore /Bcontrol exch def pop /Blen Bcontrol length def Blen 4 ge Blen 2 mod 0 eq and {Bcontrol 0 get Bcontrol 1 get abspoint /Ycont exch def /Xcont exch def Bcontrol 0 2 copy get 2 mul put Bcontrol 1 2 copy get 2 mul put Bcontrol Blen 2 sub 2 copy get 2 mul put Bcontrol Blen 1 sub 2 copy get 2 mul put /Ybi /Xbi currentpoint 3 1 roll def def 0 2 Blen 4 sub {/i exch def Bcontrol i get 3 div Bcontrol i 1 add get 3 div Bcontrol i get 3 mul Bcontrol i 2 add get add 6 div Bcontrol i 1 add get 3 mul Bcontrol i 3 add get add 6 div /Xbi Xcont Bcontrol i 2 add get 2 div add def /Ybi Ycont Bcontrol i 3 add get 2 div add def /Xcont Xcont Bcontrol i 2 add get add def /Ycont Ycont Bcontrol i 3 add get add def Xbi currentpoint pop sub Ybi currentpoint exch pop sub rcurveto }for dstroke}if}def end /ditstart{$DITroff begin /nfonts 60 def % NFONTS makedev/ditroff dependent! 
/fonts[nfonts{0}repeat]def /fontnames[nfonts{()}repeat]def /docsave save def }def % character outcalls /oc {/pswid exch def /cc exch def /name exch def /ditwid pswid fontsize mul resolution mul 72000 div def /ditsiz fontsize resolution mul 72 div def ocprocs name known{ocprocs name get exec}{name cb} ifelse}def /fractm [.65 0 0 .6 0 0] def /fraction {/fden exch def /fnum exch def gsave /cf currentfont def cf fractm makefont setfont 0 .3 dm 2 copy neg rmoveto fnum show rmoveto currentfont cf setfont(\244)show setfont fden show grestore ditwid 0 rmoveto} def /oce {grestore ditwid 0 rmoveto}def /dm {ditsiz mul}def /ocprocs 50 dict def ocprocs begin (14){(1)(4)fraction}def (12){(1)(2)fraction}def (34){(3)(4)fraction}def (13){(1)(3)fraction}def (23){(2)(3)fraction}def (18){(1)(8)fraction}def (38){(3)(8)fraction}def (58){(5)(8)fraction}def (78){(7)(8)fraction}def (sr){gsave .05 dm .16 dm rmoveto(\326)show oce}def (is){gsave 0 .15 dm rmoveto(\362)show oce}def (->){gsave 0 .02 dm rmoveto(\256)show oce}def (<-){gsave 0 .02 dm rmoveto(\254)show oce}def (==){gsave 0 .05 dm rmoveto(\272)show oce}def end % DIThacks fonts for some special chars 50 dict dup begin /FontType 3 def /FontName /DIThacks def /FontMatrix [.001 0.0 0.0 .001 0.0 0.0] def /FontBBox [-220 -280 900 900] def% a lie but ... 
/Encoding 256 array def 0 1 255{Encoding exch /.notdef put}for Encoding dup 8#040/space put %space dup 8#110/rc put %right ceil dup 8#111/lt put %left top curl dup 8#112/bv put %bold vert dup 8#113/lk put %left mid curl dup 8#114/lb put %left bot curl dup 8#115/rt put %right top curl dup 8#116/rk put %right mid curl dup 8#117/rb put %right bot curl dup 8#120/rf put %right floor dup 8#121/lf put %left floor dup 8#122/lc put %left ceil dup 8#140/sq put %square dup 8#141/bx put %box dup 8#142/ci put %circle dup 8#143/br put %box rule dup 8#144/rn put %root extender dup 8#145/vr put %vertical rule dup 8#146/ob put %outline bullet dup 8#147/bu put %bullet dup 8#150/ru put %rule dup 8#151/ul put %underline pop /DITfd 100 dict def /BuildChar{0 begin /cc exch def /fd exch def /charname fd /Encoding get cc get def /charwid fd /Metrics get charname get def /charproc fd /CharProcs get charname get def charwid 0 fd /FontBBox get aload pop setcachedevice 40 setlinewidth newpath 0 0 moveto gsave charproc grestore end}def /BuildChar load 0 DITfd put %/UniqueID 5 def /CharProcs 50 dict def CharProcs begin /space{}def /.notdef{}def /ru{500 0 rls}def /rn{0 750 moveto 500 0 rls}def /vr{20 800 moveto 0 -770 rls}def /bv{20 800 moveto 0 -1000 rls}def /br{20 770 moveto 0 -1040 rls}def /ul{0 -250 moveto 500 0 rls}def /ob{200 250 rmoveto currentpoint newpath 200 0 360 arc closepath stroke}def /bu{200 250 rmoveto currentpoint newpath 200 0 360 arc closepath fill}def /sq{80 0 rmoveto currentpoint dround newpath moveto 640 0 rlineto 0 640 rlineto -640 0 rlineto closepath stroke}def /bx{80 0 rmoveto currentpoint dround newpath moveto 640 0 rlineto 0 640 rlineto -640 0 rlineto closepath fill}def /ci{355 333 rmoveto currentpoint newpath 333 0 360 arc 50 setlinewidth stroke}def /lt{20 -200 moveto 0 550 rlineto currx 800 2cx s4 add exch s4 a4p stroke}def /lb{20 800 moveto 0 -550 rlineto currx -200 2cx s4 add exch s4 a4p stroke}def /rt{20 -200 moveto 0 550 rlineto currx 800 2cx s4 sub exch s4 a4p 
stroke}def /rb{20 800 moveto 0 -500 rlineto currx -200 2cx s4 sub exch s4 a4p stroke}def /lk{20 800 moveto 20 300 -280 300 s4 arcto pop pop 1000 sub currentpoint stroke moveto 20 300 4 2 roll s4 a4p 20 -200 lineto stroke}def /rk{20 800 moveto 20 300 320 300 s4 arcto pop pop 1000 sub currentpoint stroke moveto 20 300 4 2 roll s4 a4p 20 -200 lineto stroke}def /lf{20 800 moveto 0 -1000 rlineto s4 0 rls}def /rf{20 800 moveto 0 -1000 rlineto s4 neg 0 rls}def /lc{20 -200 moveto 0 1000 rlineto s4 0 rls}def /rc{20 -200 moveto 0 1000 rlineto s4 neg 0 rls}def end /Metrics 50 dict def Metrics begin /.notdef 0 def /space 500 def /ru 500 def /br 0 def /lt 250 def /lb 250 def /rt 250 def /rb 250 def /lk 250 def /rk 250 def /rc 250 def /lc 250 def /rf 250 def /lf 250 def /bv 250 def /ob 350 def /bu 350 def /ci 750 def /bx 750 def /sq 750 def /rn 500 def /ul 500 def /vr 0 def end DITfd begin /s2 500 def /s4 250 def /s3 333 def /a4p{arcto pop pop pop pop}def /2cx{2 copy exch}def /rls{rlineto stroke}def /currx{currentpoint pop}def /dround{transform round exch round exch itransform} def end end /DIThacks exch definefont pop ditstart (psc)xT 576 1 1 xr 1(Times-Roman)xf 1 f 2(Times-Italic)xf 2 f 3(Times-Bold)xf 3 f 4(Times-BoldItalic)xf 4 f 5(Helvetica)xf 5 f 6(Helvetica-Bold)xf 6 f 7(Courier)xf 7 f 8(Courier-Bold)xf 8 f 9(Symbol)xf 9 f 10(DIThacks)xf 10 f 10 s 1 f xi %%EndProlog %%Page: 1 1 10 s 10 xH 0 xS 1 f 3 f 12 s 1025 816(AGREP)N 1385(\320)X 1505(A)X 1598(FAST)X 1867(APPROXIMATE)X 2616(PATTERN-MATCHING)X 3679(TOOL)X 1 f 10 s 2147 1000(\(Preliminary)N 2572(version\))X 2064 1200(Sun)N 2208(Wu)X 2344(and)X 2480(Udi)X 2620(Manber)X 7 s 2870 1168(1)N 10 s 1953 1384(Department)N 2352(of)X 2439(Computer)X 2779(Science)X 2139 1504(University)N 2497(of)X 2584(Arizona)X 2189 1624(Tucson,)N 2465(AZ)X 2592(85721)X 2073 1744(\(sw)N 9 f 2209(|)X 1 f 2245(udi\)@cs.arizona.edu)X 4 f 2287 2194(ABSTRACT)N 1 f 648 2447(Searching)N 995(for)X 1115(a)X 1178(pattern)X 1428(in)X 1517(a)X 1580(text)X 
1727(\256le)X 1856(is)X 1936(a)X 1999(very)X 2169(common)X 2476(operation)X 2806(in)X 2895(many)X 3100(applications)X 3514(ranging)X 3786(from)X 3969(text)X 4116(editors)X 648 2557(and)N 796(databases)X 1136(to)X 1230(applications)X 1649(in)X 1743(molecular)X 2096(biology.)X 2412(In)X 2511(many)X 2721(instances)X 3047(the)X 3177(pattern)X 3432(does)X 3611(not)X 3745(appear)X 3992(in)X 4085(the)X 4214(text)X 648 2667(exactly.)N 943(Errors)X 1167(in)X 1252(the)X 1373(text)X 1516(or)X 1606(in)X 1691(the)X 1812(query)X 2018(can)X 2153(result)X 2354(from)X 2533(misspelling)X 2925(or)X 3016(from)X 3196(experimental)X 3639(errors)X 3851(\(e.g.,)X 4038(when)X 4236(the)X 648 2777(text)N 796(is)X 877(a)X 941(DNA)X 1143(sequence\).)X 1533(The)X 1686(use)X 1821(of)X 1916(such)X 2091(approximate)X 2520(pattern)X 2771(matching)X 3096(has)X 3230(been)X 3409(limited)X 3662(until)X 3835(now)X 4000(to)X 4089(speci\256c)X 648 2887(applications.)N 1099(Most)X 1287(text)X 1431(editors)X 1673(and)X 1813(searching)X 2145(programs)X 2472(do)X 2576(not)X 2702(support)X 2966(searching)X 3298(with)X 3464(errors)X 3676(because)X 3955(of)X 4046(the)X 4169(com-)X 648 2997(plexity)N 896(involved)X 1202(in)X 1290(implementing)X 1760(it.)X 1870(In)X 1963(this)X 2104(paper)X 2309(we)X 2429(describe)X 2723(a)X 2785(new)X 2945(tool,)X 3115(called)X 2 f 3333(agrep)X 1 f 3520(,)X 3566(for)X 3685(approximate)X 4111(pattern)X 648 3107(matching.)N 1014(Agrep)X 1243(is)X 1325(based)X 1537(on)X 1646(a)X 1711(new)X 1874(ef\256cient)X 2166(and)X 2311(\257exible)X 2580(algorithm)X 2920(for)X 3043(approximate)X 3473(string)X 3684(matching.)X 4051(Agrep)X 4281(is)X 648 3217(also)N 808(competitive)X 1217(with)X 1390(other)X 1586(tools)X 1772(for)X 1897(exact)X 2098(string)X 2311(matching;)X 2662(it)X 2737(include)X 3004(many)X 3212(options)X 3477(that)X 3627(make)X 3831(searching)X 4169(more)X 648 3327(powerful)N 958(and)X 1094(convenient.)X 6 f 14 s 648 3579(1.)N 803(Introduction)X 1 f 10 s 648 3754(The)N 
797(most)X 976(common)X 1280(string-searching)X 1821(problem)X 2112(is)X 2189(to)X 2275(\256nd)X 2423(all)X 2527 0.3125(occurrences)AX 2936(of)X 3027(a)X 3087(string)X 2 f 3293(P)X 9 f 3361(=)X 2 f 3418(p)X 1 f 7 s 3467 3770(1)N 2 f 10 s 3501 3754(p)N 1 f 7 s 3550 3770(2)N 2 f 10 s 3584 3754(...p)N 7 s 3770(m)Y 1 f 10 s 3754 3754(inside)N 3969(a)X 4029(large)X 4214(text)X 648 3864(\256le)N 2 f 771(T)X 9 f 834(=)X 2 f 891(t)X 1 f 7 s 922 3880(1)N 2 f 10 s 956 3864(t)N 1 f 7 s 987 3880(2)N 10 s 1041 3840(.)N 1081(.)X 1121(.)X 2 f 1161 3864(t)N 7 s 1183 3880(n)N 1 f 10 s 1217 3864(.)N 1258(We)X 1391(assume)X 1648(that)X 1789(the)X 1908(string)X 2111(and)X 2248(the)X 2367(text)X 2508(are)X 2628(sequences)X 2975(of)X 2 f 3063(characters)X 1 f 3426(from)X 3602(a)X 3658(\256nite)X 3842(character)X 4158(set)X 2 f 9 f 4267(S)X 1 f 4314(.)X 648 3974(The)N 803(characters)X 1160(may)X 1328(be)X 1434(English)X 1708(characters)X 2065(in)X 2157(a)X 2223(text)X 2373(\256le,)X 2525(DNA)X 2729(base)X 2902(pairs,)X 3108(lines)X 3289(of)X 3386(source)X 3627(code,)X 3830(angles)X 4066(between)X 648 4084(edges)N 852(in)X 935(polygons,)X 1269(machines)X 1593(or)X 1681(machine)X 1974(parts)X 2151(in)X 2234(a)X 2291(production)X 2659(schedule,)X 2981(music)X 3192(notes)X 3381(and)X 3517(tempo)X 3737(in)X 3819(a)X 3875(musical)X 4144(score,)X 648 4194(etc.)N 811(The)X 965(two)X 1114(most)X 1298(famous)X 1564(algorithms)X 1936(for)X 2060(this)X 2205(problem)X 2502(are)X 2631(the)X 2759(Knuth-Morris-Pratt)X 3412(algorithm)X 3753([KMP77])X 4090(and)X 4236(the)X 648 4304(Boyer-Moore)N 1113(algorithm)X 1452([BM77])X 1738(\(see)X 1896(also)X 2053([Ba89])X 2304(and)X 2448([HS91]\).)X 2759(There)X 2975(are)X 3102(many)X 3308(extensions)X 3673(to)X 3762(this)X 3904(problem;)X 4240(for)X 648 4414(example,)N 965(we)X 1084(may)X 1247(be)X 1348(looking)X 1617(for)X 1736(a)X 1797(set)X 1911(of)X 2003(patterns,)X 2303(a)X 2365(regular)X 2619(expression,)X 3008(a)X 3070(pattern)X 3319(with)X 3487(``wild)X 
3709(cards,'')X 3979(etc.)X 4139(String)X 648 4524(searching)N 976(in)X 1058(Unix)X 1238(is)X 1311(most)X 1486(often)X 1671(done)X 1847(with)X 2009(the)X 2 f 2127(grep)X 1 f 2294(family.)X 848 4667(In)N 936(some)X 1126(instances,)X 1461(however,)X 1779(the)X 1898(pattern)X 2142(and/or)X 2368(the)X 2487(text)X 2628(are)X 2748(not)X 2871(exact.)X 3102(We)X 3235(may)X 3394(not)X 3518(remember)X 3866(the)X 3986(exact)X 4178(spel-)X 648 4777(ling)N 794(of)X 882(a)X 939(name)X 1134(we)X 1249(are)X 1369(searching,)X 1718(the)X 1837(name)X 2032(may)X 2191(be)X 2288(misspelled)X 2651(in)X 2734(the)X 2853(text,)X 3014(the)X 3133(text)X 3274(may)X 3433(correspond)X 3811(to)X 3894(a)X 3951(sequence)X 4267(of)X 648 4887(numbers)N 959(with)X 1136(a)X 1207(certain)X 1461(property)X 1768(and)X 1919(we)X 2048(do)X 2163(not)X 2300(have)X 2487(an)X 2598(exact)X 2803(pattern,)X 3081(the)X 3214(text)X 3369(may)X 3542(be)X 3654(a)X 3726(sequence)X 4057(of)X 4160(DNA)X 648 4997(molecules)N 998(and)X 1139(we)X 1258(are)X 1382(looking)X 1651(for)X 1770(approximate)X 2195(patterns,)X 2493(etc.)X 2651(The)X 2800(approximate)X 3225(string-matching)X 3756(problem)X 4047(is)X 4124(to)X 4210(\256nd)X 648 5107(all)N 757(substrings)X 1110(in)X 2 f 1201(T)X 1 f 1274(that)X 1423(are)X 2 f 1551(close)X 1 f 1745(to)X 2 f 1836(P)X 1 f 1914(under)X 2126(some)X 2324(measure)X 2621(of)X 2717(closeness.)X 3089(We)X 3230(will)X 3383(concentrate)X 3783(here)X 3951(on)X 4060(the)X 4187(edit-)X 648 5217(distance)N 936(measure)X 1229(\(also)X 1410(known)X 1653(as)X 1744(the)X 2 f 1866(Levenshtein)X 2273(measure)X 1 f 2545(\).)X 2636(A)X 2718(string)X 2 f 2924(P)X 1 f 2997(is)X 3074(said)X 3227(to)X 3313(be)X 3413(of)X 3504(distance)X 2 f 3791(k)X 1 f 3851(to)X 3937(a)X 3997(string)X 2 f 4203(Q)X 1 f 4285(if)X 648 5327(we)N 764(can)X 898(transform)X 2 f 1232(P)X 1 f 1303(to)X 1387(be)X 1485(equal)X 1681(to)X 2 f 1765(Q)X 1 f 1845(with)X 2009(a)X 2067(sequence)X 2384(of)X 2 f 2473(k)X 1 f 2531(insertions)X 
2864(of)X 2953(single)X 3167(characters)X 3517(in)X 3602(\(arbitrary)X 3929(places)X 4153(in\))X 2 f 4265(P)X 1 f 4314(,)X 8 s 10 f 648 5423(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)N 5 s 1 f 648 5525(1)N 8 s 684 5550(Supported)N 963(in)X 1029(part)X 1144(by)X 1224(an)X 1300(NSF)X 1434(Presidential)X 1753(Young)X 1944(Investigator)X 2266(Award)X 2456(\(grant)X 2625(DCR-8451397\),)X 3056(with)X 3187(matching)X 3442(funds)X 3601(from)X 3742(AT&T,)X 3949(and)X 4058(by)X 4139(an)X 4216(NSF)X 648 5646(grant)N 795(CCR-9002351.)X 2 p %%Page: 2 2 8 s 8 xH 0 xS 1 f 10 s 3 f 1 f 648 686(deletions)N 958(of)X 1046(single)X 1258(characters)X 1606(in)X 2 f 1689(P)X 1 f 1738(,)X 1779(or)X 1867(substitutions)X 2291(of)X 2379(characters.)X 2767(Sometimes)X 3143(one)X 3280(wants)X 3489(to)X 3573(vary)X 3738(the)X 3858(cost)X 4009(of)X 4098(the)X 4218(dif-)X 648 796(ferent)N 856(edit)X 996(operations,)X 1370(say)X 1497(deletions)X 1806(cost)X 1955(3,)X 2035(insertions)X 2366(2,)X 2446(and)X 2582(substitutions)X 3005(1.)X 848 939(Many)N 1065(different)X 1372(approximate)X 1803(string-matching)X 2340(algorithms)X 2712(have)X 2895(been)X 3078(suggested)X 3425(\(too)X 3585(many)X 3794(to)X 3887(list)X 4015(here\),)X 4232(but)X 648 1049(none)N 831(is)X 911(widely)X 1156(used,)X 1350(mainly)X 1599(because)X 1881(of)X 1975(their)X 2149(complexity)X 2536(and/or)X 2767(lack)X 2927(of)X 3020(generality.)X 3407(We)X 3545(present)X 3803(here)X 3968(a)X 4030(new)X 4190(tool,)X 648 1159(called)N 2 f 863(agrep)X 1 f 1073(\(for)X 2 f 1217(approximate)X 1 f 1646(grep\),)X 1860(which)X 2080(has)X 2211(a)X 2271(very)X 2438(similar)X 2684(user)X 2842(interface)X 3148(to)X 3234(the)X 3356(grep)X 3523(family)X 3756(\(although)X 4087(it)X 4155(is)X 4232(not)X 648 1269(100%)N 858(compatible\),)X 1284(and)X 1423(which)X 1642(supports)X 1936(several)X 2187(important)X 2521(extensions)X 2882(to)X 2967(grep.)X 3173(Version)X 3450(1.0)X 3573(of)X 3663(agrep)X 3865(is)X 3941(available)X 4254(by)X 648 1379(anonymous)N 
1055(ftp)X 1182(from)X 1376(cs.arizona.edu)X 1874(\(IP)X 2010(192.12.69.5\))X 2455(as)X 2560 0.1912(agrep/agrep.tar.Z.)AX 3192(It)X 3279(has)X 3424(been)X 3614(developed)X 3982(on)X 4100(a)X 4174(SUN)X 648 1489(SparcStation)N 1080(and)X 1219(has)X 1349(been)X 1524(successfully)X 1939(ported)X 2167(to)X 2252(DECstation)X 2648(5000,)X 2851(NeXT,)X 3094(Sequent,)X 3394(HP)X 3518(9000,)X 3720(and)X 3858(Silicon)X 4106(Graph-)X 648 1599(ics)N 758(workstations.)X 1208(We)X 1341(expect)X 1572(version)X 1829(2.0)X 1950(to)X 2033(be)X 2130(available)X 2441(\(at)X 2547(the)X 2666(same)X 2852(place\))X 3070(by)X 3171(the)X 3290(end)X 3428(of)X 3517(1991;)X 3721(most)X 3898(of)X 3987(the)X 4107(discus-)X 648 1709(sion)N 802(here)X 962(refers)X 1167(to)X 1250(version)X 1507(2.)X 1608(The)X 1754(three)X 1936(most)X 2112(signi\256cant)X 2466(features)X 2741(of)X 2828(agrep)X 3027(that)X 3167(are)X 3286(not)X 3408(supported)X 3744(by)X 3844(the)X 3962(grep)X 4125(family)X 648 1819(are)N 768(1\))X 856(searching)X 1185(for)X 1300(approximate)X 1722(patterns,)X 2017(2\))X 2105(searching)X 2435(for)X 2551(records)X 2810(rather)X 3020(than)X 3180(just)X 3317(lines,)X 3510(and)X 3648(3\))X 3737(searching)X 4067(for)X 4183(mul-)X 648 1929(tiple)N 825(patterns)X 1114(with)X 1291(AND)X 1500(\(or)X 1629(OR\))X 1802(logic)X 1997(queries.)X 2304(\(All)X 2468(3)X 2543(features)X 2833(are)X 2967(available)X 3292(in)X 3389(version)X 3660(1.0.\))X 3862(Other)X 4079(features)X 648 2039(include)N 912(searching)X 1249(for)X 1372(regular)X 1629(expressions)X 2032(\(with)X 2230(or)X 2326(without)X 2599(errors\),)X 2863(ef\256cient)X 3155(multi-pattern)X 3602(search,)X 3857(unlimited)X 4192(wild)X 648 2149(cards,)N 865(limiting)X 1144(the)X 1269(errors)X 1484(to)X 1573(only)X 1742(insertions)X 2080(or)X 2174(only)X 2343(substitutions)X 2773(or)X 2867(any)X 3010(combination,)X 3456(allowing)X 3762(each)X 3936(deletion,)X 4240(for)X 648 2259(example,)N 960(to)X 1042(be)X 1138(counted)X 1412(as,)X 
1519(say,)X 1667(2)X 1728(substitutions)X 2152(or)X 2240(3)X 2301(insertions,)X 2653(restricting)X 2999(parts)X 3176(of)X 3264(the)X 3383(query)X 3587(to)X 3670(be)X 3767(exact)X 3958(and)X 4095(parts)X 4272(to)X 648 2369(be)N 744(approximate,)X 1185(and)X 1321(many)X 1519(more.)X 1744(Examples)X 2080(of)X 2167(the)X 2285(use)X 2412(of)X 2499(agrep)X 2698(are)X 2817(given)X 3015(in)X 3097(the)X 3215(next)X 3373(section.)X 848 2512(Agrep)N 1074(not)X 1202(only)X 1370(supports)X 1667(a)X 1729(large)X 1916(number)X 2187(of)X 2280(options,)X 2561(but)X 2689(it)X 2759(is)X 2838(also)X 2993(very)X 3162(ef\256cient.)X 3491(In)X 3584(our)X 3717(experiments,)X 4155(agrep)X 648 2622(was)N 794(competitive)X 1193(with)X 1356(the)X 1475(best)X 1625(exact)X 1815(string-matching)X 2342(tools)X 2517(that)X 2657(we)X 2771(could)X 2969(\256nd)X 3113(\(Hume's)X 3414(gre)X 3537([Hu91])X 3789(and)X 3925(GNU)X 4119(e?grep)X 648 2732([Ha89]\),)N 949(and)X 1091(in)X 1179(many)X 1383(cases)X 1579(one)X 1721(to)X 1810(two)X 1957(orders)X 2185(of)X 2279(magnitude)X 2644(faster)X 2850(than)X 3015(other)X 3207(approximate)X 3635(string-matching)X 4169(algo-)X 648 2842(rithms.)N 913(For)X 1045(example,)X 1358(\256nding)X 1605(all)X 1706 0.3125(occurrences)AX 2112(of)X 2 f 2200(Homogenos)X 1 f 2604(allowing)X 2905(two)X 3046(errors)X 3255(in)X 3338(a)X 3395(1MB)X 3580(bibliographic)X 4028(text)X 4169(takes)X 648 2952(about)N 851(0.2)X 976(seconds)X 1255(on)X 1360(a)X 1421(SUN)X 1606(SparcStation)X 2040(II.)X 2159(\(We)X 2323(actually)X 2602(used)X 2774(this)X 2914(example)X 3211(and)X 3352(found)X 3564(a)X 3626(misspelling)X 4020(in)X 4108(the)X 4232(bib)X 648 3062(\256le.\))N 837(This)X 999(is)X 1072(almost)X 1305(as)X 1392(fast)X 1528(as)X 1615(exact)X 1805(string)X 2007(matching.)X 848 3205(This)N 1010(paper)X 1209(is)X 1282(organized)X 1619(as)X 1706(follows.)X 2006(We)X 2138(start)X 2296(by)X 2396(giving)X 2620(examples)X 2943(of)X 3030(the)X 3148(use)X 3276(of)X 3364(agrep)X 3564(that)X 
illustrate how flexible and general it is. We then briefly describe the main ideas behind the algorithms and their extensions. (More details are given in the technical report and man pages which are available by ftp.) We then give some experimental results, and we close with conclusions.

2. Using Agrep

We have been using agrep for about 6 months now and find it an indispensable tool. We present here only a sample of the uses that we found. As we said in the introduction, the three most significant features of agrep that are not supported by the grep family are

1. the ability to search for approximate patterns
   for example, agrep -2 Homogenos bib will find Homogeneous as well as any other word that can be obtained from Homogenos with at most 2 substitutions, insertions, or deletions. It is possible to assign different costs to insertions, deletions, or substitutions. For example, agrep -1 -I2 -D2 555-3217 phone will find all numbers that differ from 555-3217 in at most one digit. The -I (-D) option sets the cost of insertions (deletions); in this case, setting it to 2 prevents insertions and deletions.

2. agrep is record oriented rather than just line oriented
   a record is by default a line, but it can be user defined; for example, agrep -d '^From ' 'pizza' mbox outputs all mail messages that contain the keyword "pizza". Another example: agrep -d '$$' pattern foo will output all paragraphs (separated by an empty line) that contain pattern.

3. multiple patterns with AND (or OR) logic queries
   For example, agrep -d '^From ' 'burger,pizza' mbox outputs all mail messages containing at least one of the two keywords (`,' stands for OR); agrep -d '^From ' 'good;pizza' mbox outputs
   all mail messages containing both keywords (`;' stands for AND).

Putting these options together one can ask queries like

    agrep -d '$$' -1 '<CACM>;TheAuthor;Curriculum;<198[5-9]>' bib-file

which outputs all paragraphs referencing articles in CACM between 1985 and 1989 by TheAuthor dealing with curriculum. One error is allowed in any of the sub-patterns, but it cannot be in either CACM or the year (the <> brackets forbid errors in the pattern between them).

These features and several more enable users to compose complex queries rather easily. We give several examples of the daily use of agrep from our experience. For a complete list of options, see the manual pages distributed with agrep.

2.1. Finding words in a dictionary

The most common tool available in UNIX for finding the correct spelling of a word is the program look, which outputs all words in the dictionary with a given prefix. We have many times looked for the spelling of words for which we did not know a prefix. We use the following alias for findword:

    alias findword agrep -i -!:2 !:1 /usr/dict/web2

(web2 is a large collection of words, about 2.5MB long; one can use /usr/dict/words instead.) For example, one of the authors can never remember the correct spelling of bureaucracy (and he is irritated enough with it not to want to remember). findword breacracy 2 searches for all occurrences of breacracy with at most two errors. (web2 contains one more match - squireocracy.)

One can also use the -w option which matches the pattern to a complete word (rather than possibly a subword). In the example above, the extra match (squireocracy) will not be a match, because with the -w option its beginning (squi) will count as 4 extra errors.

2.2. Searching a Mail File

We found that one of the most frequent uses of agrep is to search inside mail files for mail messages using the record option. We use the following alias

    alias agmail agrep -!:2 -d '^From ' !:1

Notice that it is possible with this alias to use complicated queries; for example,
agmail ';;Manbar' 1 mail/food, or
agmail '\.gov;October;surprise' 0 mail/*, which searches all mail messages from .gov (a . without the \ matches every character) that include the two keywords.

2.3. Extracting Procedures

It is usually possible to easily extract a procedure from a large program by defining a procedure as a record and using agrep. For example, agrep -t -d '^}' '^routine1' prog1/*.c > routine1.c will work assuming routines in C always end with } at the beginning of a line (and that '^routine1' uniquely identifies that routine). One should be careful when dealing with other people's programs (because the conventions may not be followed). Other programming languages have other ways to identify the end (or beginning) of a procedure. The -t option puts the record delimiter at the end of the record rather than at the beginning (which is more appropriate for mail messages, for example).

2.4. Finding Interesting Words

At some point we needed to find all words in the dictionary with 4-7 characters. This can be done with one agrep command: agrep -3 -w -D4 '....' /usr/dict/words. (The -D4 prevents deletions, and the . in the pattern stands for any character.)

We end this section with a cute example, which, although not important, shows how flexible agrep can be. The following query finds all words in the dictionary that contain 5 of the first 10 letters of the alphabet in order: agrep -5 'a#b#c#d#e#f#g#h#i#j'
/usr/dict/words (the # symbol stands for a wild card of any size - the same as .*). Try it. The answer starts with the word academia and ends with sacrilegious; it must mean something...

3. The Algorithms

Agrep utilizes several different algorithms to optimize the performance for the different cases. For simple exact queries we use a variant of the Boyer-Moore algorithm. For simple patterns with errors, we use a partition scheme, described at the end of section 3.2, hand in hand with the Boyer-Moore scheme. For more complicated patterns, e.g., patterns with unlimited wild cards, patterns with uneven costs of the different edit operations, multi-patterns, arbitrary regular expressions, we use new algorithms altogether. In this section, we briefly outline the basis for two of the interesting new algorithms that we use, the algorithm for arbitrary patterns with errors and the algorithm for multi
patterns. For some more details on the algorithms see [WM91, WM92].

3.1. Arbitrary Patterns With Errors

We describe only the main idea behind the simplest case of the algorithm, finding all occurrences of a given string in a given text. The algorithm is based on the `shift-or' algorithm of Baeza-Yates and Gonnet [BG89]. Let $R$ be a bit array of size $m$ (the size of the pattern). We denote by $R_j$ the value of the array $R$ after the $j$-th character of the text has been processed. The array $R_j$ contains information about all matches of prefixes of $P$ with a suffix of the text that ends at $j$. More precisely, $R_j[i] = 1$ if the first $i$ characters of the pattern match exactly the last $i$ characters up to $j$ in the text. These are all the partial matches that may lead to full matches later on. When we read $t_{j+1}$ we need to determine whether $t_{j+1}$ can extend any of the partial matches so far. The transition from $R_j$ to $R_{j+1}$ can be summarized as follows:

    Initially, $R_0[i] = 0$ for all $i$, $1 \le i \le m$; $R_0[0] = 1$.

    \[ R_{j+1}[i] = \begin{cases} 1 & \text{if } R_j[i-1] = 1 \text{ and } p_i = t_{j+1} \\ 0 & \text{otherwise} \end{cases} \]

    If $R_{j+1}[m] = 1$ then we output a match that ends at position $j+1$.

This transition, which we have to compute once for every text character, seems
quite complicated. However, suppose that $m \le 32$ (which is usually the case in practice), and that $R$ is represented as a bit vector using one 32-bit word. For each character $s_i$ in the alphabet we construct a bit array $S_i$ of size $m$ such that $S_i[r] = 1$ if $p_r = s_i$. (It is sufficient to construct the $S$ arrays only for the characters that appear in the pattern.) It is easy to verify now that the transition from $R_j$ to $R_{j+1}$ amounts to no more than a right shift of $R_j$ and an AND operation with $S_i$, where $s_i = t_{j+1}$. So, each transition can be executed with only two simple arithmetic operations, a shift and an AND.
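
The shift-and-AND transition just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not agrep's actual code: the function names build_masks and shift_and_search are hypothetical, bit r-1 of an integer stands for position r of the arrays R and S (so the "right shift" of the array becomes << 1 on the integer), and the 1 that must enter the first position is supplied by an explicit OR with 1.

```python
def build_masks(pattern):
    """Return {char: bitmask}; bit r-1 is set iff the r-th pattern character is char."""
    masks = {}
    for r, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << r)
    return masks


def shift_and_search(pattern, text):
    """Return the 0-based end positions of all exact occurrences of pattern in text."""
    m = len(pattern)
    if m == 0:
        return []
    masks = build_masks(pattern)   # the S arrays, only for characters in the pattern
    accept = 1 << (m - 1)          # bit meaning "all m pattern characters matched"
    state = 0                      # R_0: no non-empty prefix has matched yet
    ends = []
    for j, ch in enumerate(text):
        # Shift, feed in a 1 for the always-matching empty prefix, then AND with S.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            ends.append(j)
    return ends


if __name__ == "__main__":
    print(shift_and_search("aba", "cababab"))  # end positions of the two overlapping matches
```

Python integers are unbounded, so this sketch does not need the restriction $m \le 32$; in a language with fixed-width words the same loop would keep the state in a single machine word.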
[Footnote 2: We assume that the right shift fills the first position with a 1. If only 0-filled shifts are available (as is the case with C), then we can add one more OR operation with a mask that has one bit. Alternatively, we can use 0 to indicate a match and an OR operation instead of an AND; that way, 0-filled shifts are sufficient. This is counterintuitive to explain (and it is not adaptable to some of the extensions), so we opted for the easier definition.]

Suppose now that we want to allow one substitution error. We introduce one more array, denoted by $R^1_j$, which indicates all possible matches up to $t_j$ with at most one substitution. The transition for the $R$ array is the same as before. We need only to specify the transition for $R^1$. There are two cases for a match with at most one substitution of the first $i$ characters of $P$ up to $t_{j+1}$:

S1. There is an exact match of the first $i-1$ characters up to $t_j$. This case corresponds to substituting $t_{j+1}$ with $p_i$ (whether or not they are equal; the equality will be indicated in $R$) and matching the first $i-1$ characters.

S2. There is a match of the first $i-1$ characters up to $t_j$ with one substitution and $t_{j+1} = p_i$.

It turns out that both cases can be handled with two arithmetic operations on $R^1$. If we allow insertions, deletions, and substitutions, then we will need 4 operations on $R^1$. If we want to
allow more than one error, then we maintain more than one additional $R^1$ array. Overall, the number of operations is proportional to the number of errors. But we can do even better than that.

Suppose again that the pattern $P$ is of size $m$ and that at most $k$ errors are allowed. Let $r = \lfloor m/(k+1) \rfloor$; divide $P$ into $k+1$ blocks each of size $r$ and call them $P_1, P_2, \ldots, P_{k+1}$. If $P$ matches the text with at most $k$ errors, then at least one of the $P_j$'s must match the text exactly. We can search for all $P_j$'s at the same time (we discuss how to do that in the next paragraph) and, if one of them matches, then we check the whole pattern directly (using the previous scheme) but only within a neighborhood of size $m$ from the position of the match. Since we are looking for an exact match, there is no need to maintain the additional vectors. This scheme will run fast if the number of exact matches to any one of the $P_j$'s is not too high.

The main advantage of this scheme is that the algorithm for exact matching can be adapted in an elegant way to support it. We illustrate the idea with an example. Suppose that the pattern is ABCDEFGHIJKL ($m = 12$) and $k = 3$. We divide the pattern into $k+1 = 4$ blocks: ABC, DEF, GHI, and JKL. We need to find whether any of them appears in the text. We create one combined pattern by interleaving the 4 blocks: ADGJBEHKCFIL. We then build the mask vector $R$ as usual for this interleaved pattern. The only difference is that, instead of shifting by one in each step, we shift by four! There is a match if any of the last four bits is 1. (When we shift we need to fill the first four positions with 1's, or better yet, use shift-OR.) Thus, the match for all blocks can be done exactly the same way as regular matches and it takes essentially the same running time.

The algorithm described so far is efficient for simple string matching. But more important, it is also adaptable to many extensions of the basic problem. For example, suppose that we want to search for ABC followed by a digit, which is defined in agrep by ABC[0-9]. The only thing we need to do is in the preprocessing, for each digit, allow a match at position 4. Everything else remains exactly the same. Other extensions include arbitrary wild cards, a combination of
patterns with and without errors, different costs for insertions, deletions, and/or substitutions, and probably the most important, arbitrary regular expressions. We have no room to describe the implementation of these extensions (see [WM91]). The main technique is to use additional masking and preprocessing. It is sometimes relatively easy (as is the case with wild cards) and it sometimes requires clever ideas (as is the case with arbitrary regular expressions with errors). Next, we describe a very fast algorithm for multiple patterns which also leads to a fast approximate-matching algorithm for simple patterns.

3.2. An Algorithm for Multi Patterns

Suppose that the pattern consists of a set of $k$ simple patterns $P_1, P_2, \ldots, P_k$, such that each $P_i$ is a string of $m$ characters from a fixed alphabet $\Sigma$. The text is again a large string $T$ of characters from $\Sigma$. (We assume that all subpatterns have the same size for simplicity of description only; agrep makes no such assumption.) The multi-pattern string matching problem is to find all substrings in the text that match at least one of the patterns in the set.

The first efficient algorithm for solving this problem is by Aho and Corasick [AC75], which solves the problem in linear time. (This algorithm is the basis of fgrep.) Commentz-Walter [CW79] presented an algorithm which combines the Boyer-Moore technique with the Aho-Corasick algorithm. The Commentz-Walter algorithm is substantially faster than the Aho-Corasick algorithm when the number of patterns is not too big. The pattern matching tool gre [Hu91] (which covers almost all functions of egrep/grep/fgrep) developed by Andrew Hume has incorporated the Commentz-Walter algorithm
2267(for)X 2381(the)X 2499(multi-pattern)X 2937(string)X 3139(matching)X 3457(problem.)X 6 p %%Page: 6 6 10 s 10 xH 0 xS 1 f 3 f 1 f 848 686(Our)N 1008(algorithm)X 1354(uses)X 1527(a)X 1598(hashing)X 1882(technique)X 2229(combined)X 2580(with)X 2757(a)X 2828(different)X 3140(Boyer-Moore)X 3612(technique.)X 3999(Instead)X 4267(of)X 648 796(building)N 936(a)X 994(shift)X 1158(table)X 1336(based)X 1541(on)X 1643(single)X 1856(characters,)X 2225(we)X 2341(build)X 2527(the)X 2647(shift)X 2811(table)X 2989(based)X 3194(on)X 3295(a)X 3352(block)X 3551(of)X 3639(characters.)X 4027(\(The)X 4200(idea)X 648 906(of)N 745(using)X 948(a)X 1014(block)X 1222(of)X 1319(characters)X 1676(was)X 1831(\256rst)X 1985(proposed)X 2309(by)X 2419(Knuth-Morris-Pratt)X 3072(in)X 3164(section)X 3421(8)X 3491(of)X 3588([KMP77].\))X 3992(Like)X 4169(other)X 648 1016(Boyer-Moore)N 1109(style)X 1284(algorithms,)X 1670(our)X 1801(algorithm)X 2135(preprocesses)X 2569(the)X 2690(patterns)X 2967(to)X 3052(build)X 3239(some)X 3431(data)X 3588(structures)X 3923(such)X 4093(that)X 4236(the)X 648 1126(search)N 877(process)X 1141(can)X 1276(be)X 1375(speeded)X 1657(up.)X 1800(Let)X 2 f 1930(c)X 1 f 1989(denote)X 2226(the)X 2347(size)X 2495(of)X 2585(the)X 2706(alphabet,)X 2 f 3021(M)X 1 f 3111(=)X 2 f 3180(k)X 1 f 3235 1102(.)N 2 f 3268 1126(m)N 1 f 3326(,)X 3370(and)X 2 f 3510(b)X 9 f 3569(=)X 10 f 3626(R)X 1 f 3659(log)X 2 f 7 s 3761 1142(c)N 10 s 3792 1126(M)N 10 f 3878(H)X 1 f (.)S 3962(We)X 4098(assume)X 648 1236(that)N 2 f 791(b)X 9 f 863(\243)X 2 f 933(m)X 997(/)X 1 f 1025(2.)X 1108(In)X 1198(the)X 1319(preprocessing,)X 1808(a)X 1867(shift)X 2032(table)X 2 f 2211(SHIFT)X 1 f 2452(and)X 2591(a)X 2649(hashing)X 2920(table)X 2 f 3098(HASH)X 1 f 3325(are)X 3446(built.)X 3654(We)X 3788(look)X 3952(at)X 4032(the)X 4152(text)X 2 f 4294(b)X 1 f 648 1346(characters)N 995(at)X 1073(a)X 1129(time.)X 1331(The)X 1477(values)X 1703(in)X 1786(the)X 1905(SHIFT)X 2148(table)X 2325(determine)X 2667(how)X 2826(far)X 
we can shift forward during the search process. The shift table $\mathit{SHIFT}$ is an array of size $c^b$. Each entry of $\mathit{SHIFT}$ corresponds to a distinct substring of length $b$. Let $X = x_1 x_2 \ldots x_b$ be a string corresponding to the $i$'th entry of $\mathit{SHIFT}$. There are two cases: either $X$ appears somewhere in one of the $P_j$'s or not. For the latter case, we store $m - b + 1$ in $\mathit{SHIFT}[i]$. For the former case, we find the rightmost occurrence of $X$ in any of the $P_j$'s that contain it; suppose it is in $P_y$ and that $X$ ends at position $q$ of $P_y$. Then we store $m - q$ in $\mathit{SHIFT}[i]$.

If the shift value
is $> 0$, then we can safely shift. Otherwise, it is possible that the current substring we are looking at in the text matches some pattern in the pattern list. To avoid comparing the substring to every pattern in the pattern list, we use a hashing technique to minimize the number of patterns to be compared. In the preprocessing we build a hash table $\mathit{HASH}$ such that a pattern with hash value $j$ is put in a linked list pointed to by $\mathit{HASH}[j]$. So, in the search process, whenever we are going to compare the current aligned substring to the patterns, we first compute the hash value of the substring and compare the substring only to those patterns that have the same hash value. The algorithm for searching the text is sketched in Figure 1.

The multi-pattern matching algorithm described above can be used to solve the approximate string-matching problem. Let $P = p_1, p_2, \ldots, p_M$ be a pattern string, and let $T = a_1, a_2, \ldots, a_N$ be a text string. We partition $P$ into $k + 1$ fragments $P_1, P_2, \ldots, P_{k+1}$, each of size $m = M/(k+1)$. Let $T_{ij} = a_i, \ldots, a_j$ be a substring of $T$. By a pigeonhole principle, if $T_{ij}$ differs from $P$ by no more than $k$ errors, then one of the fragments must match a substring of $T_{ij}$ exactly.

The approximate string matching algorithm is conducted in two phases. In the first phase we partition the pattern into $k + 1$ fragments and use the
multi-pattern string matching algorithm to find all those places that contain one of the fragments. If there is a match of a fragment at position $i$ of the text, we mark the positions $i - M - k$ to $i + M + k - m$ as a 'candidate' area. After the first phase is done we apply the approximate matching algorithm described in section 3.1 to find the actual matches in those marked areas. (If the pattern size is $> 32$, we use Ukkonen's $O(Nk)$ expected-time algorithm [Uk85].)

    Algorithm Multi-Patterns
        Let p be the current position of the text;
        while (p < N)   /* N is the end position of the text */
        {
            blk_idx = map(a_{p-b+1} a_{p-b+2} ... a_p)   /* map transforms a string of size b into an integer */
            shift_value = SHIFT[blk_idx];
            if (shift_value > 0) p = p + shift_value;
            else
                compute the hash value of a_{p-m+1} ... a_p;
                compare a_{p-m+1} ... a_p to every pattern that has the same hash value;
                if there is a match then report a_{p-m+1} ... a_p;
                p = p + 1;
        }

    Figure 1: A sketch of the algorithm for multi-pattern searching.

Our algorithm is very fast when the pattern is long and the number of errors is not high (assuming that $k < M / \log_b M$). Unlike the approximate Boyer-Moore string matching algorithm in [TU90], whose performance
degrades greatly when the size of the alphabet set is small, our algorithm is not sensitive to the alphabet size. For example, for DNA patterns of size 500, allowing 25 errors, our algorithm is about two orders of magnitude faster than Ukkonen's $O(Nk)$ expected-time algorithm [Uk85] and algorithm MN2 [GP90] (which are the two fastest algorithms among the algorithms compared in [CL90]). Experimental results are given in the next section. The algorithm is very fast for practical applications for searching both English text and DNA data.

4. Experimental Results

We present four brief experiments. The numbers given here should be taken with caution. Any such results depend on the architecture, the operating system, the compilers used, not to mention the patterns and test files. These tests are by no means comprehensive. They are given here to show that agrep is fast enough to be
used for large files. Differences of 20-30% in the running times are not significant. Thus, all Boyer-Moore type programs are about the same for simple patterns. Agrep seems better for multi patterns. For approximate matching, agrep is one to two orders of magnitude faster than other programs that we tested. We believe that the main strength of agrep is that it is more flexible, general, and convenient than all previous programs.

All tests were run on a SUN SparcStation II running UNIX. Two files were used, both of size 1MB, one a sub-dictionary and one a collection of bibliographic data. The numbers are in seconds and are the averages of several experiments. They were measured by the time facility in UNIX and only user times were taken (which adds considerably to their impreciseness).

Table 1 compares agrep against other programs for exact string matching. The first three programs use
Boyer-Moore type algorithms. The original egrep does not. We used 50 words of varying sizes (3-12) as patterns and averaged the results.

    text size    agrep    gre      e?grep   egrep
    ----------------------------------------------
    1Mb          0.09     0.11     0.11     0.79
    200Kb        0.028    0.024    0.038    0.218

    Table 1: Exact matching of simple strings.

Table 2 shows results of searching for multi patterns. In the first line the pattern consisted of 50 words (the same words that were used in Table 1, but all at once) searched inside a dictionary; in the second line the pattern consisted of 20 separate titles (each two words long), searched in a bibliographic file.

    pattern      agrep    gre      e?grep   fgrep
    ----------------------------------------------
    50 words     1.15     2.57     6.11     8.13
    20 titles    0.18     0.71     1.53     5.64

    Table 2: Exact matching of multi patterns.

Table 3 shows typical running times for approximate matching. Two patterns were used, 'matching' and 'string matching', and we tested each one with 1, 2, and 3 errors (denoted by Er). Other programs that we tested did not come close.

    pattern              Er = 1   Er = 2   Er = 3
    ----------------------------------------------
    'string matching'    0.26     0.55     0.76
    'matching'           0.22     0.66     1.14

    Table 3: Approximate matching of simple strings.

Table 4 shows typical running times for more complicated patterns, including regular expressions. Agrep does not yet use any Boyer-Moore type filtering for these patterns. As a result, the running times are slower than they are for simpler patterns. The best
algorithm we know for approximate matching to arbitrary regular expressions is by Myers and Miller [MM89]. Its running times for the cases we tested were more than an order of magnitude slower than our algorithm, but this is not a fair test, because Myers and Miller's algorithm can handle arbitrary costs (which we cannot handle) and its running time is independent of the number of errors (which makes it competitive with or better than ours if the number of errors is very large).

    pattern                       Er = 0   Er = 1   Er = 2   Er = 3
    ----------------------------------------------------------------
    ogenious                      0.53     1.10     1.42     1.74
    JACM; 1981; Graph             0.53     1.10     1.43     1.75
    Prob#tic; Algo#m              0.55     1.10     1.42     1.76
    <[CJ]ACM>; Prob#tic; trees    0.54     1.11     1.43     1.75
    (<[23]>-[23]*|).*ees          0.66     1.53     2.19     2.83

    Table 4: Approximate matching of complicated patterns.

5. Conclusions

Searching text in the presence of errors is commonly done 'by hand': one tries many possibilities. This is frustrating, slow, and with no guarantee of success. Agrep can alleviate this problem and make searching in general more robust. It also makes searching more convenient by not having to spell everything precisely. Agrep is very fast and general and it should find numerous applications. It has already been used in the Collaboratory system [HPS90], in a new tool (under development) for locating files in a UNIX system [FMW92], and in a new algorithm for finding information in a distributed environment [FM92]. In the last two applications, agrep is modified in a novel way to search inside specially compressed files without having to decompress
them first.

Acknowledgements:

We thank Ricardo Baeza-Yates, Gene Myers, and Chunghwa Rao for many helpful conversations about approximate string matching and for comments that improved the manuscript. We thank Ric Anderson, Cliff Hathaway, Andrew Hume, David Sanderson, and Shu-Ing Tsuei for their help and comments that improved the implementation of agrep. We also thank William Chang and Andrew Hume for kindly providing programs for some of the experiments.

References

[AC75]
    Aho, A. V., and M. J. Corasick, "Efficient string matching: an aid to bibliographic search," Communications of the ACM 18 (June 1975), pp. 333-340.

[Ba89]
    Baeza-Yates, R. A., "Improved string searching," Software: Practice and Experience 19 (1989), pp. 257-271.

[BG89]
    Baeza-Yates, R. A., and G. H. Gonnet, "A new approach to text searching," Proceedings of the 12th
    Annual ACM-SIGIR Conference on Information Retrieval, Cambridge, MA (June 1989), pp. 168-175.

[BM77]
    Boyer, R. S., and J. S. Moore, "A fast string searching algorithm," Communications of the ACM 20 (October 1977), pp. 762-772.

[CL90]
    Chang, W. I., and E. L. Lawler, "Approximate string matching in sublinear expected time," 31st Annual Symp. on Foundations of Computer Science (October 1990), pp. 116-124.

[CW79]
    Commentz-Walter, B., "A string matching algorithm fast on the average," Proc. 6th International Colloquium on Automata, Languages, and Programming (1979), pp. 118-132.

[FM92]
    Finkel, R. A., and U. Manber, "The design and implementation of a server for retrieving distributed data," in preparation.

[FMW92]
    Finkel, R. A., U. Manber, and S. Wu, "Findfile: a tool for locating files in a large file system," in preparation.

[GP90]
    Galil,
    Z., and K. Park, "An improved algorithm for approximate string matching," SIAM J. on Computing 19 (December 1990), pp. 989-999.

[Ha89]
    Haertel, M., "GNU e?grep," Usenet archive comp.source.unix, Volume 17 (February 1989).

[HPS90]
    Hudson, S. E., L. L. Peterson, and B. R. Schatz, "Systems Technology for Building a National Collaboratory," University of Arizona Technical Report #TR 90-24 (July 1990).

[HS91]
    Hume, A., and D. Sunday, "Fast string searching," Software: Practice and Experience 21 (November 1991), pp. 1221-1248.

[Hu91]
    Hume, A., personal communication (1991).

[KMP77]
    Knuth, D. E., J. H. Morris, and V. R. Pratt, "Fast pattern matching in strings," SIAM Journal on Computing 6 (June 1977), pp. 323-350.

[MM89]
    Myers, E. W., and W. Miller, "Approximate matching of regular expressions," Bull. of Mathematical Biology 51
    (1989), pp. 5-37.

[TU90]
    Tarhio, J., and E. Ukkonen, "Approximate Boyer-Moore string matching," Technical Report #A-1990-3, Dept. of Computer Science, University of Helsinki (March 1990).

[Uk85]
    Ukkonen, E., "Finding approximate patterns in strings," Journal of Algorithms 6 (1985), pp. 132-137.

[WM91]
    Wu, S., and U. Manber, "Fast Text Searching With Errors," Technical Report TR-91-11, Department of Computer Science, University of Arizona (June 1991).

[WM92]
    Wu, S., and U. Manber, "Filtering search approach for some string matching problems," in preparation.

Biographical Sketches

Sun Wu is a Ph.D. candidate in computer science at the University of Arizona. His research interests include design of algorithms, in particular, string matching and graph algorithms.

Udi Manber is a professor of computer science at the University of Arizona, where he has been since 1987. He received his Ph.D. degree in
computer science from the University of Washington in 1982. His research interests include design of algorithms and computer networks. He is the author of "Introduction to Algorithms - A Creative Approach" (Addison-Wesley, 1989). He received a Presidential Young Investigator Award in 1985, the best paper award of the seventh International Conference on Distributed Computing Systems, 1987, and a Distinguished Teaching Award of the Faculty of Science at the University of Arizona, 1990.

debian/rules0000755000000000000000000000127211741120771010251 0ustar
#!/usr/bin/make -f

export DEB_BUILD_MAINT_OPTIONS = hardening=+all
export DEB_CFLAGS_MAINT_APPEND = -Wall -pedantic -I.
export DEB_LDFLAGS_MAINT_APPEND = -Wl,--as-needed

override_dh_clean:
	# The original sources contain *.orig and *.rej files
	# that must not be cleaned.
	# So do *not* run 'dh_clean' commands.
	rm -rf debian/agrep debian/.#* debian/*[~#]
	rm -f debian/files debian/*.log debian/*.substvars
	[ ! -f Makefile ] || $(MAKE) clean

override_dh_auto_build:
	# These are the only variables that are free for user
	$(MAKE) DEBUGFLAGS="$(CFLAGS) $(CPPFLAGS)" OTHERLIBS="$(LDFLAGS)"

override_dh_installchangelogs:
	dh_installchangelogs agrep.chronicle

%:
	dh $@

# End of file
debian/docs0000644000000000000000000000012611721405564010045 0ustar
README
agrep.algorithms
contribution.list
debian/doc/agrep.ps.1
debian/doc/agrep.ps.2
debian/copyright0000644000000000000000000000520611741120626011124 0ustar
Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0
Upstream-Name: agrep
Upstream-Contact:
Source: http://freshmeat.net/projects/agrep
X-Source: ftp://ftp.cs.arizona.edu/agrep
 There is also
 It claims 4.x, but in reality it contains the 2.x version of
 tar.gz of Sun Wu.
X-Upstream-Comment:
 Latest agrep is distributed as part of glimpse (4.x) .
 The agrep included in glimpse contains non-DFSG compliant license,
 which prevents packaging newer versions.

Files: *
Copyright:
 1991-1992 Sun Wu
 1991-1992 Udi Manber
 1993 Burra Gopal
License: Custom

Files: debian/*
Copyright:
 2007-2012 Jari Aalto
 2005-2007 Daniel Baumann
License: GPL-2+

License: Custom
 This material was developed by Sun Wu, Udi Manber, and Burra Gopal at
 the University of Arizona, Department of Computer Science. Permission
 is granted to copy this software, to redistribute it on a nonprofit
 basis, and to use it for any purpose, subject to the following
 restrictions and understandings.
 .
 1. Any copy made of this software must include this copyright notice
 in full.
 .
 2. All materials developed as a consequence of the use of this
 software shall duly acknowledge such use, in accordance with the
 usual standards of acknowledging credit in academic research.
 .
 3. The authors have made no warranty or representation that the
 operation of this software will be error-free or suitable for any
 application, and they are under under no obligation to provide any
 services, by way of maintenance, update, or otherwise.
 The software is an experimental prototype offered on an as-is basis.
 .
 4. Redistribution for profit requires the express, written
 permission of the authors.

License: GPL-2+
 This package is free software; you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
 the Free Software Foundation; either version 2 of the License, or
 (at your option) any later version.
 .
 This package is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 GNU General Public License for more details.
 .
 You should have received a copy of the GNU General Public License
 along with this program. If not, see .
 .
 On Debian systems, the complete text of the GNU General
 Public License can be found in "/usr/share/common-licenses/GPL-2".
debian/source/0000755000000000000000000000000011721407315010467 5ustar
debian/source/format0000644000000000000000000000001411721407316011676 0ustar
3.0 (quilt)
debian/manpages0000644000000000000000000000001011721405564010700 0ustar
agrep.1