/* * $Id: Perl5Substitution.java,v 1.13 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." 
* Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; import java.util.*; /** * Perl5Substitution implements a Substitution consisting of a * literal string, but allowing Perl5 variable interpolation referencing * saved groups in a match. This class is intended for use with * {@link Util#substitute Util.substitute}. *

* The substitution string may contain variable interpolations referring * to the saved parenthesized groups of the search pattern. * A variable interpolation is denoted by $1, or $2, * or $3, etc. If you want such expressions to be * interpreted literally, you should set the numInterpolations * parameter to INTERPOLATE_NONE. It is easiest to explain * what an interpolated variable does by giving an example: *

*

* A final thing to keep in mind is that if you use an interpolation variable * that corresponds to a group not contained in the match, then it is * interpreted as the empty string. So given the regular expression from the * example, and a substitution expression of a$2-, the result * of the last sample input would be: *

Tank a- 85  Tank a- 32  Tank a- 22
* The special substitution $& will interpolate the entire portion * of the input matched by the regular expression. $0 will * do the same, but it is recommended that it be avoided because the * latest versions of Perl use $0 to store the program name rather * than duplicate the behavior of $&. * Also, the result of substituting $ followed by a non-positive integer * is undefined. In order to include a $ in a substitution, it should * be escaped with a backslash (e.g., "\\$0"). *
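The interpolation rules described above ($1, $2, ..., $& for the whole match, an empty string for a group that is not part of the match, and a backslash to get a literal $) can be sketched without the ORO classes. The following is only an illustration built on java.util.regex as a stand-in; the class InterpolationSketch and its methods are hypothetical names, not part of Jakarta ORO, and edge cases such as a lone $ or the case-modification escapes are not claimed to match Perl5Substitution.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in, not part of Jakarta ORO: expands $n, $& and \$
// in a substitution template against a java.util.regex match.
final class InterpolationSketch {

    static String interpolate(String template, Matcher m) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < template.length()) {
            char c = template.charAt(i);
            boolean hasNext = i + 1 < template.length();
            if (c == '\\' && hasNext) {
                // A backslash makes the next character literal, so \$ yields $.
                out.append(template.charAt(i + 1));
                i += 2;
            } else if (c == '$' && hasNext && template.charAt(i + 1) == '&') {
                // $& stands for the entire matched text.
                out.append(m.group());
                i += 2;
            } else if (c == '$' && hasNext && Character.isDigit(template.charAt(i + 1))) {
                int j = i + 1;
                int group = 0;
                while (j < template.length() && Character.isDigit(template.charAt(j))) {
                    // Stop accumulating once the number is absurdly large (avoids overflow).
                    if (group <= Character.MAX_VALUE) {
                        group = group * 10 + Character.digit(template.charAt(j), 10);
                    }
                    j++;
                }
                // A group that does not exist or did not participate becomes "".
                if (group <= m.groupCount() && m.group(group) != null) {
                    out.append(m.group(group));
                }
                i = j;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    // Replaces every match of regex in input with the interpolated template.
    static String replaceAll(String regex, String template, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start());
            out.append(interpolate(template, m));
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }
}
```

For example, under these assumptions replaceAll("(\\w+)@(\\w+)", "$2 at $1", "user@host") swaps the two saved groups, and a template that refers to a group the pattern does not define contributes nothing for that reference.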

* Perl5 double-quoted string case modification is also supported in * the substitution. The following escape sequences are supported: *

*
\\U
make substitution uppercase until end of substitution or \\E *
\\u
make next character uppercase *
\\L
make substitution lowercase until end of substitution or \\E *
\\l
make next character lowercase *
\\E
mark the end of the case modification *
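A rough way to picture the escape sequences listed above is a small state machine: \\U and \\L switch on a mode that lasts until \\E, while \\u and \\l affect only the next character. The sketch below applies them to a plain string. CaseEscapeSketch is a hypothetical name, the code is a simplification rather than the opcode-based implementation in this class, and it assumes that \\u and \\l are ignored while a \\U or \\L mode is active.

```java
// Hypothetical illustration, not part of Jakarta ORO: applies the
// \U, \L, \E, \u and \l escapes to a plain template string.
final class CaseEscapeSketch {

    static String apply(String template) {
        StringBuilder out = new StringBuilder();
        char mode = 'E'; // 'U' or 'L' while a mode is active, 'E' otherwise
        char once = 0;   // 'u' or 'l' when the next character must change case
        for (int i = 0; i < template.length(); i++) {
            char c = template.charAt(i);
            if (c == '\\' && i + 1 < template.length()) {
                char n = template.charAt(++i);
                if (n == 'U' || n == 'L' || n == 'E') {
                    mode = n;
                    continue;
                }
                if (n == 'u' || n == 'l') {
                    // Assumption: single-character modifiers are ignored inside \U or \L.
                    if (mode == 'E') {
                        once = n;
                    }
                    continue;
                }
                c = n; // any other escaped character is copied literally
            }
            if (once == 'u') {
                c = Character.toUpperCase(c);
            } else if (once == 'l') {
                c = Character.toLowerCase(c);
            } else if (mode == 'U') {
                c = Character.toUpperCase(c);
            } else if (mode == 'L') {
                c = Character.toLowerCase(c);
            }
            once = 0;
            out.append(c);
        }
        return out.toString();
    }
}
```

In Java source the template is written with doubled backslashes, so apply("\\Uabc\\E def") passes the characters \Uabc\E def to the method.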
* The double backslashes are shown to remind you that to make a * backslash get past Java's string handling and appear as a backslash * to the substitution, you must escape the backslash. * * @version @version@ * @since 1.1 * @see Substitution * @see Util * @see Util#substitute * @see StringSubstitution */ public class Perl5Substitution extends StringSubstitution { /** * A constant used when creating a Perl5Substitution indicating that * interpolation variables should be computed relative to the most * recent pattern match. */ public static final int INTERPOLATE_ALL = 0; /** * A constant used when creating a Perl5Substitution indicating that * interpolation variables should be interpreted literally, effectively * disabling interpolation. */ public static final int INTERPOLATE_NONE = -1; /** * The initial size and unit of growth for the * {@link #_subOpcodes _subOpcodes} array. */ private static final int __OPCODE_STORAGE_SIZE = 32; /** * The maximum number of groups supported by interpolation. */ private static final int __MAX_GROUPS = Character.MAX_VALUE; /** * A constant declaring opcode for copy operation. */ static final int _OPCODE_COPY = -1; /** * A constant declaring opcode for lowercase char operation. */ static final int _OPCODE_LOWERCASE_CHAR = -2; /** * A constant declaring opcode for uppercase char operation. */ static final int _OPCODE_UPPERCASE_CHAR = -3; /** * A constant declaring opcode for lowercase mode operation. */ static final int _OPCODE_LOWERCASE_MODE = -4; /** * A constant declaring opcode for uppercase mode operation. */ static final int _OPCODE_UPPERCASE_MODE = -5; /** * A constant declaring opcode for the end of case mode operation. 
*/ static final int _OPCODE_ENDCASE_MODE = -6; int _numInterpolations; int[] _subOpcodes; int _subOpcodesCount; char[] _substitutionChars; transient String _lastInterpolation; private static final boolean __isInterpolationCharacter(char ch) { return (Character.isDigit(ch) || ch == '&'); } private void __addElement(int value) { int len = _subOpcodes.length; if (_subOpcodesCount == len) { int[] newarray = new int[len + __OPCODE_STORAGE_SIZE]; System.arraycopy(_subOpcodes, 0, newarray, 0, len); _subOpcodes = newarray; } _subOpcodes[_subOpcodesCount++] = value; } private void __parseSubs(String sub) { boolean saveDigits, escapeMode, caseMode; int posParam; int offset; char[] subChars = _substitutionChars = sub.toCharArray(); int subLength = subChars.length; _subOpcodes = new int[__OPCODE_STORAGE_SIZE]; _subOpcodesCount = 0; posParam = 0; offset = -1; saveDigits = false; escapeMode = false; caseMode = false; for (int current = 0; current < subLength; current++) { char c = subChars[current]; char nextc; int next = current + 1; // Save digits if (saveDigits) { int digit = Character.digit(c, 10); if (digit > -1) { if (posParam <= __MAX_GROUPS) { posParam *= 10; posParam += digit; } if (next == subLength) { __addElement(posParam); } continue; } else if(c == '&') { if(/*current > 0 &&*/subChars[current - 1] == '$') { __addElement(0); posParam = 0; saveDigits = false; continue; } } __addElement(posParam); posParam = 0; saveDigits = false; } if ((c != '$' && c != '\\') || escapeMode) { escapeMode = false; if (offset < 0) { offset = current; __addElement(_OPCODE_COPY); __addElement(offset); } if (next == subLength) { __addElement(next - offset); } continue; } if (offset >= 0) { __addElement(current - offset); offset = -1; } // Only do positional and escapes if we have a next char if (next == subLength) continue; nextc = subChars[next]; // Positional params if (c == '$') { saveDigits = __isInterpolationCharacter(nextc); } else if (c == '\\') { // Escape codes if (nextc == 'l') { 
if (!caseMode){ __addElement(_OPCODE_LOWERCASE_CHAR); current++; } } else if (nextc == 'u') { if (!caseMode) { __addElement(_OPCODE_UPPERCASE_CHAR); current++; } } else if (nextc == 'L') { __addElement(_OPCODE_LOWERCASE_MODE); current++; caseMode = true; } else if (nextc == 'U') { __addElement(_OPCODE_UPPERCASE_MODE); current++; caseMode = true; } else if (nextc == 'E') { __addElement(_OPCODE_ENDCASE_MODE); current++; caseMode = false; } else { escapeMode = true; } } } } String _finalInterpolatedSub(MatchResult result) { StringBuffer buffer = new StringBuffer(10); _calcSub(buffer, result); return buffer.toString(); } void _calcSub(StringBuffer buffer, MatchResult result) { int size, offset, count, caseMode; char[] sub, str, match; int[] subOpcodes = _subOpcodes; caseMode = 0; str = _substitutionChars; match = result.group(0).toCharArray(); size = _subOpcodesCount; for (int element = 0; element < size; element++) { int value = subOpcodes[element]; // If we have a group, set up interpolation, else // interpret op code. 
if(value >= 0 && value < result.groups()) { int end, len; offset = result.begin(value); if (offset < 0) continue; end = result.end(value); if (end < 0) continue; len = result.length(); if (offset >= len || end > len || offset >= end) continue; count = end - offset; sub = match; } else if (value == _OPCODE_COPY) { element++; if (element >= size) continue; offset = subOpcodes[element]; element++; if (element >= size) continue; count = subOpcodes[element]; sub = str; } else if (value == _OPCODE_LOWERCASE_CHAR || value == _OPCODE_UPPERCASE_CHAR) { if (caseMode != _OPCODE_LOWERCASE_MODE && caseMode != _OPCODE_UPPERCASE_MODE) caseMode = value; continue; } else if (value == _OPCODE_LOWERCASE_MODE || value == _OPCODE_UPPERCASE_MODE) { caseMode = value; continue; } else if (value == _OPCODE_ENDCASE_MODE) { caseMode = 0; continue; } else continue; // Apply modes to buf if (caseMode == _OPCODE_LOWERCASE_CHAR) { buffer.append(Character.toLowerCase(sub[offset++])); buffer.append(sub, offset, --count); caseMode = 0; } else if (caseMode == _OPCODE_UPPERCASE_CHAR) { buffer.append(Character.toUpperCase(sub[offset++])); buffer.append(sub, offset, --count); caseMode = 0; } else if (caseMode == _OPCODE_LOWERCASE_MODE) { for (int end = offset + count; offset < end; ) { buffer.append(Character.toLowerCase(sub[offset++])); } } else if (caseMode == _OPCODE_UPPERCASE_MODE) { for (int end = offset + count; offset < end; ) { buffer.append(Character.toUpperCase(sub[offset++])); } } else buffer.append(sub, offset, count); } } /** * Default constructor initializing substitution to a zero length * String and the number of interpolations to * {@link #INTERPOLATE_ALL}. */ public Perl5Substitution() { this("", INTERPOLATE_ALL); } /** * Creates a Perl5Substitution using the specified substitution * and setting the number of interpolations to * {@link #INTERPOLATE_ALL}. *

* @param substitution The string to use as a substitution. */ public Perl5Substitution(String substitution) { this(substitution, INTERPOLATE_ALL); } /** * Creates a Perl5Substitution using the specified substitution * and setting the number of interpolations to the specified value. *

* @param substitution The string to use as a substitution. * @param numInterpolations * If set to INTERPOLATE_NONE, interpolation variables are * interpreted literally and not as references to the saved * parenthesized groups of a pattern match. If set to * INTERPOLATE_ALL , all variable interpolations * are computed relative to the pattern match responsible for * the current substitution. If set to a positive integer, * the first numInterpolations substitutions have * their variable interpolation performed relative to the * most recent match, but the remaining substitutions have * their variable interpolations performed relative to the * numInterpolations 'th match. */ public Perl5Substitution(String substitution, int numInterpolations) { setSubstitution(substitution, numInterpolations); } /** * Sets the substitution represented by this Perl5Substitution, also * setting the number of interpolations to * {@link #INTERPOLATE_ALL}. * You should use this method in order to avoid repeatedly allocating new * Perl5Substitutions. It is recommended that you allocate a single * Perl5Substitution and reuse it by using this method when appropriate. *

* @param substitution The string to use as a substitution. */ public void setSubstitution(String substitution) { setSubstitution(substitution, INTERPOLATE_ALL); } /** * Sets the substitution represented by this Perl5Substitution, also * setting the number of interpolations to the specified value. * You should use this method in order to avoid repeatedly allocating new * Perl5Substitutions. It is recommended that you allocate a single * Perl5Substitution and reuse it by using this method when appropriate. *

* @param substitution The string to use as a substitution. * @param numInterpolations * If set to INTERPOLATE_NONE, interpolation variables are * interpreted literally and not as references to the saved * parenthesized groups of a pattern match. If set to * INTERPOLATE_ALL, all variable interpolations * are computed relative to the pattern match responsible for * the current substitution. If set to a positive integer, * the first numInterpolations substitutions have * their variable interpolation performed relative to the * most recent match, but the remaining substitutions have * their variable interpolations performed relative to the * numInterpolations'th match. */ public void setSubstitution(String substitution, int numInterpolations) { super.setSubstitution(substitution); _numInterpolations = numInterpolations; if(numInterpolations != INTERPOLATE_NONE && (substitution.indexOf('$') != -1 || substitution.indexOf('\\') != -1)) __parseSubs(substitution); else _subOpcodes = null; _lastInterpolation = null; } /** * Appends the substitution to a buffer containing the original input * with substitutions applied for the pattern matches found so far. * See * {@link Substitution#appendSubstitution Substitution.appendSubstitution()} * for more details regarding the expected behavior of this method. *
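The effect of a positive numInterpolations value is easier to see on a concrete input. The sketch below mirrors the counting logic of appendSubstitution (interpolate per match until the count reaches numInterpolations, then keep reusing the text computed for that match), but it uses java.util.regex as a stand-in and a fixed, hypothetical template equivalent to "<$1>". NumInterpolationsSketch and substituteAll are illustrative names, not ORO API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not part of Jakarta ORO: shows how a positive
// numInterpolations freezes the substitution text at the n-th match.
final class NumInterpolationsSketch {
    static final int INTERPOLATE_ALL = 0;

    // Replaces every match with "<" + group 1 + ">"; the regex must define group 1.
    static String substituteAll(String regex, String input, int numInterpolations) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        String lastInterpolation = null;
        int count = 0; // 1-based number of the current substitution
        int last = 0;
        while (m.find()) {
            count++;
            out.append(input, last, m.start());
            if (numInterpolations < 1 || count < numInterpolations) {
                // Interpolate relative to the current match.
                out.append('<').append(m.group(1)).append('>');
            } else {
                if (count == numInterpolations) {
                    // Compute once, at the n-th match, and reuse afterwards.
                    lastInterpolation = "<" + m.group(1) + ">";
                }
                out.append(lastInterpolation);
            }
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }
}
```

With the input "1 2 3 4" and the pattern (\d), INTERPOLATE_ALL wraps each digit in turn, while a value of 2 wraps the first two digits and then repeats the text produced for the second match.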

* @param appendBuffer The buffer containing the new string resulting * from performing substitutions on the original input. * @param match The current match causing a substitution to be made. * @param substitutionCount The number of substitutions that have been * performed so far by Util.substitute. * @param originalInput The original input upon which the substitutions are * being performed. This is a read-only parameter and is not modified. * @param matcher The PatternMatcher used to find the current match. * @param pattern The Pattern used to find the current match. */ public void appendSubstitution(StringBuffer appendBuffer, MatchResult match, int substitutionCount, PatternMatcherInput originalInput, PatternMatcher matcher, Pattern pattern) { if(_subOpcodes == null) { super.appendSubstitution(appendBuffer, match, substitutionCount, originalInput, matcher, pattern); return; } if(_numInterpolations < 1 || substitutionCount < _numInterpolations) _calcSub(appendBuffer, match); else { if(substitutionCount == _numInterpolations) _lastInterpolation = _finalInterpolatedSub(match); appendBuffer.append(_lastInterpolation); } } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/Util.java0000644000175000017500000005072307773723336023466 0ustar arnaudarnaud/* * $Id: Util.java,v 1.15 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000-2002 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. 
For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; import java.util.*; /** * The Util class is a holder for useful static utility methods that can * be generically applied to Pattern and PatternMatcher instances. * This class cannot and is not meant to be instantiated. * The Util class currently contains versions of the split() and substitute() * methods inspired by Perl's split function and s operation * respectively, although they are implemented in such a way as not to * rely on the Perl5 implementations of the OROMatcher packages regular * expression interfaces. They may operate on any interface implementations * conforming to the OROMatcher API specification for the PatternMatcher, * Pattern, and MatchResult interfaces. Future versions of the class may * include additional utility methods. *

* A grep method is not included for two reasons: *

    *
  1. The details of reading a line at a time from an input stream * differ in JDK 1.0.2 and JDK 1.1, making it difficult to * retain compatibility across both Java releases. *
  2. Grep style processing is trivial for the programmer to implement * in a while loop. Rarely does anyone want to retrieve all * occurrences of a pattern and then process them. More often a * programmer will retrieve pattern matches and process them as they * are retrieved, which is more efficient than storing them all in a * Vector and then accessing them. *
* * @version @version@ * @since 1.0 * @see Pattern * @see PatternMatcher */ public final class Util { /** * A constant passed to the {@link #substitute substitute()} * methods indicating that all occurrences of a pattern should be * substituted. */ public static final int SUBSTITUTE_ALL = -1; /** * A constant passed to the {@link #split split()} methods * indicating that all occurrences of a pattern should be used to * split a string. */ public static final int SPLIT_ALL = 0; /** * The default constructor for the Util class. It is made private * to prevent the instantiation of the class. */ private Util() { } /** * Splits up a String instance and stores results as a * List of substrings numbering no more than a specified * limit. The string is split with a regular expression as the delimiter. * The limit parameter essentially says to split the * string only on at most the first limit - 1 number of pattern * occurrences. *

* This method is inspired by the Perl split() function and behaves * identically to it when used in conjunction with the Perl5Matcher and * Perl5Pattern classes except for the following difference: *

*

* @param results A Collection to which the split results are appended. * After the method returns, it contains the substrings of the input * that occur between the regular expression delimiter occurences. * The input will not be split into any more substrings than the * specified limit. A way of thinking of this is that * only the first limit - 1 matches of the delimiting * regular expression will be used to split the input. * @param matcher The regular expression matcher to execute the split. * @param pattern The regular expression to use as a split delimiter. * @param input The String to split. * @param limit The limit on the number of resulting split elements. * Values <= 0 produce the same behavior as using the * SPLIT_ALL constant which causes the limit to be * ignored and splits to be performed on all occurrences of * the pattern. You should use the SPLIT_ALL constant * to achieve this behavior instead of relying on the default * behavior associated with non-positive limit values. * @since 2.0 */ public static void split(Collection results, PatternMatcher matcher, Pattern pattern, String input, int limit) { int beginOffset; MatchResult currentResult; PatternMatcherInput pinput; pinput = new PatternMatcherInput(input); beginOffset = 0; while(--limit != 0 && matcher.contains(pinput, pattern)) { currentResult = matcher.getMatch(); results.add(input.substring(beginOffset, currentResult.beginOffset(0))); beginOffset = currentResult.endOffset(0); } results.add(input.substring(beginOffset, input.length())); } /** * Splits up a String instance and stores results as a * Collection of all its substrings using a regular expression * as the delimiter. * This method is inspired by the Perl split() function and behaves * identically to it when used in conjunction with the Perl5Matcher and * Perl5Pattern classes except for the following difference: *

*

*

* This method is identical to calling: *

   * split(results, matcher, pattern, input, Util.SPLIT_ALL);
   * 
*
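The limit handling of split() can be tried out without the ORO classes. The following sketch copies the shape of the loop in the Collection-based split() method (decrement the limit before each search and stop when it reaches zero), with java.util.regex standing in for PatternMatcher and Pattern. SplitSketch is a hypothetical name, and the behaviour for trailing empty fields is simply what this loop produces, not a statement about Perl.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not part of Jakarta ORO: the split loop
// re-expressed with java.util.regex.
final class SplitSketch {
    static final int SPLIT_ALL = 0;

    static List<String> split(String regex, String input, int limit) {
        List<String> results = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        int begin = 0;
        // A limit of n allows at most n - 1 delimiter matches; a limit of
        // SPLIT_ALL (0) goes negative at once and so never stops the loop.
        while (--limit != 0 && m.find()) {
            results.add(input.substring(begin, m.start()));
            begin = m.end();
        }
        // Whatever remains after the last used delimiter is the final element.
        results.add(input.substring(begin));
        return results;
    }
}
```

Under these assumptions, splitting "a,b,c,d" on "," with a limit of 2 yields the two elements "a" and "b,c,d", and a limit of 1 returns the input as a single element.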

* @param results A Collection to which all the substrings of * the input that occur between the regular expression delimiter * occurences are appended. * @param matcher The regular expression matcher to execute the split. * @param pattern The regular expression to use as a split delimiter. * @param input The String to split. * @since 2.0 */ public static void split(Collection results, PatternMatcher matcher, Pattern pattern, String input) { split(results, matcher, pattern, input, SPLIT_ALL); } /** * Splits up a String instance into strings contained in a * Vector of size not greater than a specified limit. The * string is split with a regular expression as the delimiter. * The limit parameter essentially says to split the * string only on at most the first limit - 1 number of pattern * occurences. *

* This method is inspired by the Perl split() function and behaves * identically to it when used in conjunction with the Perl5Matcher and * Perl5Pattern classes except for the following difference: *

*

* @deprecated Use * {@link #split(Collection, PatternMatcher, Pattern, String, int)} instead. * @param matcher The regular expression matcher to execute the split. * @param pattern The regular expression to use as a split delimiter. * @param input The String to split. * @param limit The limit on the size of the returned Vector. * Values <= 0 produce the same behavior as using the * SPLIT_ALL constant which causes the limit to be * ignored and splits to be performed on all occurrences of * the pattern. You should use the SPLIT_ALL constant * to achieve this behavior instead of relying on the default * behavior associated with non-positive limit values. * @return A Vector containing the substrings of the input * that occur between the regular expression delimiter occurences. * The input will not be split into any more substrings than the * specified limit. A way of thinking of this is that * only the first limit - 1 matches of the delimiting * regular expression will be used to split the input. * @since 1.0 */ public static Vector split(PatternMatcher matcher, Pattern pattern, String input, int limit) { Vector results = new Vector(20); split(results, matcher, pattern, input, limit); return results; } /** * Splits up a String instance into a Vector * of all its substrings using a regular expression as the delimiter. * This method is inspired by the Perl split() function and behaves * identically to it when used in conjunction with the Perl5Matcher and * Perl5Pattern classes except for the following difference: *

*

*

* This method is identical to calling: *

   * split(matcher, pattern, input, Util.SPLIT_ALL);
   * 
*

* @deprecated Use * {@link #split(Collection, PatternMatcher, Pattern, String)} instead. * @param matcher The regular expression matcher to execute the split. * @param pattern The regular expression to use as a split delimiter. * @param input The String to split. * @return A Vector containing all the substrings of the input * that occur between the regular expression delimiter occurences. * @since 1.0 */ public static Vector split( PatternMatcher matcher, Pattern pattern, String input) { return split(matcher, pattern, input, SPLIT_ALL); } /** * Searches a string for a pattern and replaces the first occurrences * of the pattern with a Substitution up to the number of * substitutions specified by the numSubs parameter. A * numSubs value of SUBSTITUTE_ALL will cause all occurrences * of the pattern to be replaced. *

* @param matcher The regular expression matcher to execute the pattern * search. * @param pattern The regular expression to search for and substitute * occurrences of. * @param sub The Substitution used to substitute pattern occurences. * @param input The String on which to perform substitutions. * @param numSubs The number of substitutions to perform. Only the * first numSubs patterns encountered are * substituted. If you want to substitute all occurences * set this parameter to SUBSTITUTE_ALL . * @return A String comprising the input string with the substitutions, * if any, made. If no substitutions are made, the returned String * is the original input String. * @since 1.0 */ public static String substitute(PatternMatcher matcher, Pattern pattern, Substitution sub, String input, int numSubs) { StringBuffer buffer = new StringBuffer(input.length()); PatternMatcherInput pinput = new PatternMatcherInput(input); // Users have indicated that they expect the result to be the // original input string, rather than a copy, if no substitutions // are performed, if(substitute(buffer, matcher, pattern, sub, pinput, numSubs) != 0) return buffer.toString(); return input; } /** * Searches a string for a pattern and substitutes only the first * occurence of the pattern. *

* This method is identical to calling: *

   * substitute(matcher, pattern, sub, input, 1);
   * 
*
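As a rough illustration of how numSubs and SUBSTITUTE_ALL interact, the sketch below follows the same loop shape as the substitute() methods: the loop tests numSubs != 0 so that the value -1 never stops it, and the original String is returned untouched when nothing matched. It uses java.util.regex and a literal replacement string as stand-ins for PatternMatcher, Pattern and Substitution; SubstituteSketch is a hypothetical name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not part of Jakarta ORO: the substitute loop
// re-expressed with java.util.regex and a literal replacement.
final class SubstituteSketch {
    static final int SUBSTITUTE_ALL = -1;

    static String substitute(String regex, String replacement, String input, int numSubs) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder(input.length());
        int begin = 0;
        int count = 0;
        // Must be != 0 rather than > 0, because SUBSTITUTE_ALL is -1.
        while (numSubs != 0 && m.find()) {
            --numSubs;
            ++count;
            out.append(input, begin, m.start());
            out.append(replacement);
            begin = m.end();
        }
        if (count == 0) {
            // No substitutions: hand back the original instance, not a copy.
            return input;
        }
        out.append(input, begin, input.length());
        return out.toString();
    }
}
```

With these assumptions, substitute("o", "0", "foo boo", 1) changes only the first "o", while passing SUBSTITUTE_ALL changes all of them.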

* @param matcher The regular expression matcher to execute the pattern * search. * @param pattern The regular expression to search for and substitute * occurrences of. * @param sub The Substitution used to substitute pattern occurences. * @param input The String on which to perform substitutions. * @return A String comprising the input string with the substitutions, * if any, made. If no substitutions are made, the returned String * is the original input String. * @since 1.0 */ public static String substitute(PatternMatcher matcher, Pattern pattern, Substitution sub, String input) { return substitute(matcher, pattern, sub, input, 1); } /** * Searches a string for a pattern and replaces the first occurrences * of the pattern with a Substitution up to the number of * substitutions specified by the numSubs parameter. A * numSubs value of SUBSTITUTE_ALL will cause all occurrences * of the pattern to be replaced. The number of substitutions made * is returned. *

* @param result The StringBuffer in which to store the result of the * substitutions. The buffer is only appended to. * @param matcher The regular expression matcher to execute the pattern * search. * @param pattern The regular expression to search for and substitute * occurrences of. * @param sub The Substitution used to substitute pattern occurences. * @param input The input on which to perform substitutions. * @param numSubs The number of substitutions to perform. Only the * first numSubs patterns encountered are * substituted. If you want to substitute all occurences * set this parameter to SUBSTITUTE_ALL . * @return The number of substitutions made. * @since 2.0.6 */ public static int substitute(StringBuffer result, PatternMatcher matcher, Pattern pattern, Substitution sub, String input, int numSubs) { PatternMatcherInput pinput = new PatternMatcherInput(input); return substitute(result, matcher, pattern, sub, pinput, numSubs); } /** * Searches a string for a pattern and replaces the first occurrences * of the pattern with a Substitution up to the number of * substitutions specified by the numSubs parameter. A * numSubs value of SUBSTITUTE_ALL will cause all occurrences * of the pattern to be replaced. The number of substitutions made * is returned. *

* @param result The StringBuffer in which to store the result of the * substitutions. The buffer is only appended to. * @param matcher The regular expression matcher to execute the pattern * search. * @param pattern The regular expression to search for and substitute * occurrences of. * @param sub The Substitution used to substitute pattern occurences. * @param input The input on which to perform substitutions. * @param numSubs The number of substitutions to perform. Only the * first numSubs patterns encountered are * substituted. If you want to substitute all occurences * set this parameter to SUBSTITUTE_ALL . * @return The number of substitutions made. * @since 2.0.3 */ public static int substitute(StringBuffer result, PatternMatcher matcher, Pattern pattern, Substitution sub, PatternMatcherInput input, int numSubs) { int beginOffset, subCount; char[] inputBuffer; subCount = 0; beginOffset = input.getBeginOffset(); inputBuffer = input.getBuffer(); // Must be != 0 because SUBSTITUTE_ALL is represented by -1. // Do NOT change to numSubs > 0. while(numSubs != 0 && matcher.contains(input, pattern)) { --numSubs; ++subCount; result.append(inputBuffer, beginOffset, input.getMatchBeginOffset() - beginOffset); sub.appendSubstitution(result, matcher.getMatch(), subCount, input, matcher, pattern); beginOffset = input.getMatchEndOffset(); } result.append(inputBuffer, beginOffset, input.length() - beginOffset); return subCount; } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/Perl5Compiler.java0000644000175000017500000016003007773723336025224 0ustar arnaudarnaud/* * $Id: Perl5Compiler.java,v 1.21 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. 
IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; import java.util.*; /** * The Perl5Compiler class is used to create compiled regular expressions * conforming to the Perl5 regular expression syntax. It generates * Perl5Pattern instances upon compilation to be used in conjunction * with a Perl5Matcher instance. Please see the user's guide for more * information about Perl5 regular expressions. *
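 * A minimal usage sketch (an illustration under stated assumptions, not
 * part of the original documentation).  It assumes the compile(String, int)
 * method declared by PatternCompiler and the contains and getMatch methods
 * of PatternMatcher; the pattern and input strings are made up for the
 * example.
 * <pre>
 * PatternCompiler compiler = new Perl5Compiler();
 * PatternMatcher matcher = new Perl5Matcher();
 * try {
 *   Pattern pattern =
 *     compiler.compile("(\\d+)-(\\d+)", Perl5Compiler.CASE_INSENSITIVE_MASK);
 *   if(matcher.contains("range 10-20", pattern)) {
 *     MatchResult match = matcher.getMatch();
 *     // match.group(1) is "10" and match.group(2) is "20"
 *   }
 * } catch(MalformedPatternException e) {
 *   // The expression was not a valid Perl5 regular expression.
 * }
 * </pre>
 *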

* Perl5Compiler and Perl5Matcher are designed with the intent that * you use a separate instance of each per thread to avoid the overhead * of both synchronization and concurrent access (e.g., a match that takes * a long time in one thread will block the progress of another thread with * a shorter match). If you want to use a single instance of each * in a concurrent program, you must appropriately protect access to * the instances with critical sections. If you want to share Perl5Pattern * instances between concurrently executing instances of Perl5Matcher, you * must compile the patterns with {@link Perl5Compiler#READ_ONLY_MASK}. * * @version @version@ * @since 1.0 * @see PatternCompiler * @see MalformedPatternException * @see Perl5Pattern * @see Perl5Matcher */ public final class Perl5Compiler implements PatternCompiler { private static final int __WORSTCASE = 0, __NONNULL = 0x1, __SIMPLE = 0x2, __SPSTART = 0x4, __TRYAGAIN = 0x8; private static final char __CASE_INSENSITIVE = 0x0001, __GLOBAL = 0x0002, __KEEP = 0x0004, __MULTILINE = 0x0008, __SINGLELINE = 0x0010, __EXTENDED = 0x0020, __READ_ONLY = 0x8000; private static final String __HEX_DIGIT = "0123456789abcdef0123456789ABCDEFx"; private CharStringPointer __input; private boolean __sawBackreference; private char[] __modifierFlags = { 0 }; // IMPORTANT: __numParentheses starts out equal to 1 during compilation. // It is always one greater than the number of parentheses encountered // so far in the regex. That is because it refers to the number of groups // to save, and the entire match is always saved (group 0) private int __numParentheses, __programSize, __cost; // When doing the second pass and actually generating code, __programSize // keeps track of the current offset. 
  private char[] __program;

  /** Lookup table for POSIX character class names */
  private static final HashMap __hashPOSIX;

  static {
    __hashPOSIX = new HashMap();
    __hashPOSIX.put("alnum", new Character(OpCode._ALNUMC));
    __hashPOSIX.put("word", new Character(OpCode._ALNUM));
    __hashPOSIX.put("alpha", new Character(OpCode._ALPHA));
    __hashPOSIX.put("blank", new Character(OpCode._BLANK));
    __hashPOSIX.put("cntrl", new Character(OpCode._CNTRL));
    __hashPOSIX.put("digit", new Character(OpCode._DIGIT));
    __hashPOSIX.put("graph", new Character(OpCode._GRAPH));
    __hashPOSIX.put("lower", new Character(OpCode._LOWER));
    __hashPOSIX.put("print", new Character(OpCode._PRINT));
    __hashPOSIX.put("punct", new Character(OpCode._PUNCT));
    __hashPOSIX.put("space", new Character(OpCode._SPACE));
    __hashPOSIX.put("upper", new Character(OpCode._UPPER));
    __hashPOSIX.put("xdigit", new Character(OpCode._XDIGIT));
    __hashPOSIX.put("ascii", new Character(OpCode._ASCII));
  }

  /**
   * The default mask for the {@link #compile compile} methods.
   * It is equal to 0.
   * The default behavior is for a regular expression to be case sensitive
   * and to not specify if it is multiline or singleline.  When MULTILINE_MASK
   * and SINGLELINE_MASK are not defined, the ^, $, and .
   * metacharacters are
   * interpreted according to the value of isMultiline() in Perl5Matcher.
   * The default behavior of Perl5Matcher is to treat the Perl5Pattern
   * as though MULTILINE_MASK were enabled.  If isMultiline() returns false,
   * then the pattern is treated as though SINGLELINE_MASK were set.  However,
   * compiling a pattern with the MULTILINE_MASK or SINGLELINE_MASK masks
   * will ALWAYS override whatever behavior is specified by setMultiline()
   * in Perl5Matcher.
   */
  public static final int DEFAULT_MASK          = 0;

  /**
   * A mask passed as an option to the {@link #compile compile} methods
   * to indicate a compiled regular expression should be case insensitive.
   */
  public static final int CASE_INSENSITIVE_MASK = __CASE_INSENSITIVE;

  /**
   * A mask passed as an option to the {@link #compile compile} methods
   * to indicate a compiled regular expression should treat input as having
   * multiple lines.  This option affects the interpretation of
   * the ^ and $ metacharacters.  When this mask is used,
   * the ^ metacharacter matches at the beginning of every line,
   * and the $ metacharacter matches at the end of every line.
   * Additionally the . metacharacter will not match newlines when
   * an expression is compiled with MULTILINE_MASK, which is its
   * default behavior.
   */
  public static final int MULTILINE_MASK        = __MULTILINE;

  /**
   * A mask passed as an option to the {@link #compile compile} methods
   * to indicate a compiled regular expression should treat input as being
   * a single line.  This option affects the interpretation of
   * the ^ and $ metacharacters.  When this mask is used,
   * the ^ metacharacter matches at the beginning of the input,
   * and the $ metacharacter matches at the end of the input.
   * The ^ and $ metacharacters will not match at the beginning
   * and end of lines occurring between the beginning and end of the input.
   * Additionally, the . metacharacter will match newlines when
   * an expression is compiled with SINGLELINE_MASK, unlike its
   * default behavior.
   */
  public static final int SINGLELINE_MASK       = __SINGLELINE;

  /**
   * A mask passed as an option to the {@link #compile compile} methods
   * to indicate a compiled regular expression should be treated as a Perl5
   * extended pattern (i.e., a pattern using the /x modifier).  This
   * option tells the compiler to ignore whitespace that is not backslashed or
   * within a character class.  It also tells the compiler to treat the
   * # character as a metacharacter introducing a comment as in
   * Perl.  In other words, the # character will comment out any
   * text in the regular expression between it and the next newline.
* The intent of this option is to allow you to divide your patterns * into more readable parts. It is provided to maintain compatibility * with Perl5 regular expressions, although it will not often * make sense to use it in Java. */ public static final int EXTENDED_MASK = __EXTENDED; /** * A mask passed as an option to the {@link #compile compile} methods * to indicate that the resulting Perl5Pattern should be treated as a * read only data structure by Perl5Matcher, making it safe to share * a single Perl5Pattern instance among multiple threads without needing * synchronization. Without this option, Perl5Matcher reserves the right * to store heuristic or other information in Perl5Pattern that might * accelerate future matches. When you use this option, Perl5Matcher will * not store or modify any information in a Perl5Pattern. Use this option * when you want to share a Perl5Pattern instance among multiple threads * using different Perl5Matcher instances. */ public static final int READ_ONLY_MASK = __READ_ONLY; /** * Given a character string, returns a Perl5 expression that interprets * each character of the original string literally. In other words, all * special metacharacters are quoted/escaped. This method is useful for * converting user input meant for literal interpretation into a safe * regular expression representing the literal input. *

* In effect, this method is the analog of the Perl5 quotemeta() builtin * method. *

* @param expression The expression to convert. * @return A String containing a Perl5 regular expression corresponding to * a literal interpretation of the pattern. */ public static final String quotemeta(char[] expression) { int ch; StringBuffer buffer; buffer = new StringBuffer(2*expression.length); for(ch = 0; ch < expression.length; ch++) { if(!OpCode._isWordCharacter(expression[ch])) buffer.append('\\'); buffer.append(expression[ch]); } return buffer.toString(); } /** * Given a character string, returns a Perl5 expression that interprets * each character of the original string literally. In other words, all * special metacharacters are quoted/escaped. This method is useful for * converting user input meant for literal interpretation into a safe * regular expression representing the literal input. *

* In effect, this method is the analog of the Perl5 quotemeta() builtin * method. *
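   * For example (a sketch based on the loop in quotemeta(char[]), which
   * prefixes every non-word character with a backslash; the input string
   * is illustrative only):
   * <pre>
   * String literal = Perl5Compiler.quotemeta("a.b*c");
   * // literal is "a\.b\*c", which matches only the text a.b*c
   * </pre>
   *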

   * @param expression  The expression to convert.
   * @return A String containing a Perl5 regular expression corresponding to
   *         a literal interpretation of the pattern.
   */
  public static final String quotemeta(String expression) {
    return quotemeta(expression.toCharArray());
  }

  private static boolean __isSimpleRepetitionOp(char ch) {
    return (ch == '*' || ch == '+' || ch == '?');
  }

  private static boolean __isComplexRepetitionOp(char[] ch, int offset) {
    if(offset < ch.length && offset >= 0)
      return (ch[offset] == '*' || ch[offset] == '+' || ch[offset] == '?'
              || (ch[offset] == '{' && __parseRepetition(ch, offset)));
    return false;
  }

  // determines if {\d+,\d*} is the next part of the string
  private static boolean __parseRepetition(char[] str, int offset) {
    if(str[offset] != '{')
      return false;
    ++offset;

    if(offset >= str.length || !Character.isDigit(str[offset]))
      return false;

    while(offset < str.length && Character.isDigit(str[offset]))
      ++offset;

    if(offset < str.length && str[offset] == ',')
      ++offset;

    while(offset < str.length && Character.isDigit(str[offset]))
      ++offset;

    if(offset >= str.length || str[offset] != '}')
      return false;

    return true;
  }

  private static int __parseHex(char[] str, int offset, int maxLength,
                                int[] scanned)
  {
    int val = 0, index;

    scanned[0] = 0;
    while(offset < str.length && maxLength-- > 0 &&
          (index = __HEX_DIGIT.indexOf(str[offset])) != -1) {
      val <<= 4;
      val |= (index & 15);
      ++offset;
      ++scanned[0];
    }

    return val;
  }

  private static int __parseOctal(char[] str, int offset, int maxLength,
                                  int[] scanned)
  {
    int val = 0;

    scanned[0] = 0;
    while(offset < str.length &&
          maxLength > 0 && str[offset] >= '0' && str[offset] <= '7') {
      val <<= 3;
      val |= (str[offset] - '0');
      --maxLength;
      ++offset;
      ++scanned[0];
    }

    return val;
  }

  private static void __setModifierFlag(char[] flags, char ch) {
    switch(ch) {
    case 'i' : flags[0] |= __CASE_INSENSITIVE; return;
    case 'g' : flags[0] |= __GLOBAL; return;
    case 'o' : flags[0] |= __KEEP; return;
    case 'm' : flags[0] |= __MULTILINE; return;
    case 's' :
    flags[0] |= __SINGLELINE; return;
    case 'x' : flags[0] |= __EXTENDED; return;
    }
  }

  // Emit a specific character code.
  private void __emitCode(char code) {
    if(__program != null)
      __program[__programSize] = code;

    ++__programSize;
  }

  // Emit an operator with no arguments.
  // Return an offset into the __program array as a pointer to node.
  private int __emitNode(char operator) {
    int offset;

    offset = __programSize;

    if(__program == null)
      __programSize+=2;
    else {
      __program[__programSize++] = operator;
      __program[__programSize++] = OpCode._NULL_POINTER;
    }

    return offset;
  }

  // Emit an operator with arguments.
  // Return an offset into the __program array as a pointer to node.
  private int __emitArgNode(char operator, char arg) {
    int offset;

    offset = __programSize;

    if(__program == null)
      __programSize+=3;
    else {
      __program[__programSize++] = operator;
      __program[__programSize++] = OpCode._NULL_POINTER;
      __program[__programSize++] = arg;
    }

    return offset;
  }

  // Insert an operator at a given offset.
  private void __programInsertOperator(char operator, int operand) {
    int src, dest, offset;

    offset = (OpCode._opType[operator] == OpCode._CURLY ?
2 : 0); if(__program== null) { __programSize+=(2 + offset); return; } src = __programSize; __programSize+=(2 + offset); dest = __programSize; while(src > operand) { --src; --dest; __program[dest] = __program[src]; } __program[operand++] = operator; __program[operand++] = OpCode._NULL_POINTER; while(offset-- > 0) __program[operand++] = OpCode._NULL_POINTER; } private void __programAddTail(int current, int value) { int scan, temp, offset; if(__program == null || current == OpCode._NULL_OFFSET) return; scan = current; while(true) { temp = OpCode._getNext(__program, scan); if(temp == OpCode._NULL_OFFSET) break; scan = temp; } if(__program[scan] == OpCode._BACK) offset = scan - value; else offset = value - scan; __program[scan + 1] = (char)offset; } private void __programAddOperatorTail(int current, int value) { if(__program == null || current == OpCode._NULL_OFFSET || OpCode._opType[__program[current]] != OpCode._BRANCH) return; __programAddTail(OpCode._getNextOperator(current), value); } private char __getNextChar() { char ret, value; ret = __input._postIncrement(); while(true) { value = __input._getValue(); if(value == '(' && __input._getValueRelative(1) == '?' 
&& __input._getValueRelative(2) == '#') { // Skip comments while(value != CharStringPointer._END_OF_STRING && value != ')') value = __input._increment(); __input._increment(); continue; } if((__modifierFlags[0] & __EXTENDED) != 0) { if(Character.isWhitespace(value)) { __input._increment(); continue; } else if(value == '#') { while(value != CharStringPointer._END_OF_STRING && value != '\n') value = __input._increment(); __input._increment(); continue; } } return ret; } } private int __parseAlternation(int[] retFlags) throws MalformedPatternException { int chain, offset, latest; int flags = 0; char value; retFlags[0] = __WORSTCASE; offset = __emitNode(OpCode._BRANCH); chain = OpCode._NULL_OFFSET; if(__input._getOffset() == 0) { __input._setOffset(-1); __getNextChar(); } else { __input._decrement(); __getNextChar(); } value = __input._getValue(); while(value != CharStringPointer._END_OF_STRING && value != '|' && value != ')') { flags &= ~__TRYAGAIN; latest = __parseBranch(retFlags); if(latest == OpCode._NULL_OFFSET) { if((flags & __TRYAGAIN) != 0){ value = __input._getValue(); continue; } return OpCode._NULL_OFFSET; } retFlags[0] |= (flags & __NONNULL); if(chain == OpCode._NULL_OFFSET) retFlags[0] |= (flags & __SPSTART); else { ++__cost; __programAddTail(chain, latest); } chain = latest; value = __input._getValue(); } // If loop was never entered. if(chain == OpCode._NULL_OFFSET) __emitNode(OpCode._NOTHING); return offset; } private int __parseAtom(int[] retFlags) throws MalformedPatternException { boolean doDefault; char value; int offset, flags[] = { 0 }; retFlags[0] = __WORSTCASE; doDefault = false; offset = OpCode._NULL_OFFSET; tryAgain: while(true) { value = __input._getValue(); switch(value) { case '^' : __getNextChar(); // The order here is important in order to support /ms. // /m takes precedence over /s for ^ and $, but not for . 
if((__modifierFlags[0] & __MULTILINE) != 0) offset = __emitNode(OpCode._MBOL); else if((__modifierFlags[0] & __SINGLELINE) != 0) offset = __emitNode(OpCode._SBOL); else offset = __emitNode(OpCode._BOL); break tryAgain; case '$': __getNextChar(); // The order here is important in order to support /ms. // /m takes precedence over /s for ^ and $, but not for . if((__modifierFlags[0] & __MULTILINE) != 0) offset = __emitNode(OpCode._MEOL); else if((__modifierFlags[0] & __SINGLELINE) != 0) offset = __emitNode(OpCode._SEOL); else offset = __emitNode(OpCode._EOL); break tryAgain; case '.': __getNextChar(); // The order here is important in order to support /ms. // /m takes precedence over /s for ^ and $, but not for . if((__modifierFlags[0] & __SINGLELINE) != 0) offset = __emitNode(OpCode._SANY); else offset = __emitNode(OpCode._ANY); ++__cost; retFlags[0] |= (__NONNULL | __SIMPLE); break tryAgain; case '[': __input._increment(); offset = __parseUnicodeClass(); retFlags[0] |= (__NONNULL | __SIMPLE); break tryAgain; case '(': __getNextChar(); offset = __parseExpression(true, flags); if(offset == OpCode._NULL_OFFSET) { if((flags[0] & __TRYAGAIN) != 0) continue tryAgain; return OpCode._NULL_OFFSET; } retFlags[0] |= (flags[0] & (__NONNULL | __SPSTART)); break tryAgain; case '|': case ')': if((flags[0] & __TRYAGAIN) != 0) { retFlags[0] |= __TRYAGAIN; return OpCode._NULL_OFFSET; } throw new MalformedPatternException("Error in expression at " + __input._toString(__input._getOffset())); //break tryAgain; case '?': case '+': case '*': throw new MalformedPatternException( "?+* follows nothing in expression"); //break tryAgain; case '\\': value = __input._increment(); switch(value) { case 'A' : offset = __emitNode(OpCode._SBOL); retFlags[0] |= __SIMPLE; __getNextChar(); break; case 'G': offset = __emitNode(OpCode._GBOL); retFlags[0] |= __SIMPLE; __getNextChar(); break; case 'Z': offset = __emitNode(OpCode._SEOL); retFlags[0] |= __SIMPLE; __getNextChar(); break; case 'w': offset = 
__emitNode(OpCode._ALNUM); retFlags[0] |= (__NONNULL | __SIMPLE); __getNextChar(); break; case 'W': offset = __emitNode(OpCode._NALNUM); retFlags[0] |= (__NONNULL | __SIMPLE); __getNextChar(); break; case 'b': offset = __emitNode(OpCode._BOUND); retFlags[0] |= __SIMPLE; __getNextChar(); break; case 'B': offset = __emitNode(OpCode._NBOUND); retFlags[0] |= __SIMPLE; __getNextChar(); break; case 's': offset = __emitNode(OpCode._SPACE); retFlags[0] |= (__NONNULL | __SIMPLE); __getNextChar(); break; case 'S': offset = __emitNode(OpCode._NSPACE); retFlags[0] |= (__NONNULL | __SIMPLE); __getNextChar(); break; case 'd': offset = __emitNode(OpCode._DIGIT); retFlags[0] |= (__NONNULL | __SIMPLE); __getNextChar(); break; case 'D': offset = __emitNode(OpCode._NDIGIT); retFlags[0] |= (__NONNULL | __SIMPLE); __getNextChar(); break; case 'n': case 'r': case 't': case 'f': case 'e': case 'a': case 'x': case 'c': case '0': doDefault = true; break tryAgain; case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': int num; StringBuffer buffer = new StringBuffer(10); num = 0; value = __input._getValueRelative(num); while(Character.isDigit(value)) { buffer.append(value); ++num; value = __input._getValueRelative(num); } try { num = Integer.parseInt(buffer.toString()); } catch(NumberFormatException e) { throw new MalformedPatternException( "Unexpected number format exception. Please report this bug." 
+ "NumberFormatException message: " + e.getMessage()); } if(num > 9 && num >= __numParentheses) { doDefault = true; break tryAgain; } else { // A backreference may only occur AFTER its group if(num >= __numParentheses) throw new MalformedPatternException("Invalid backreference: \\" + num); __sawBackreference = true; offset = __emitArgNode(OpCode._REF, (char)num); retFlags[0] |= __NONNULL; value = __input._getValue(); while(Character.isDigit(value)) value = __input._increment(); __input._decrement(); __getNextChar(); } break; case '\0': case CharStringPointer._END_OF_STRING: if(__input._isAtEnd()) throw new MalformedPatternException("Trailing \\ in expression."); // fall through to default default: doDefault = true; break tryAgain; } break tryAgain; case '#': // skip over comments if((__modifierFlags[0] & __EXTENDED) != 0) { while(!__input._isAtEnd() && __input._getValue() != '\n') __input._increment(); if(!__input._isAtEnd()) continue tryAgain; } // fall through to default default: __input._increment(); doDefault = true; break tryAgain; }// end master switch } // end tryAgain if(doDefault) { char ender; int length, pOffset, maxOffset, lastOffset, numLength[]; offset = __emitNode(OpCode._EXACTLY); // Not sure that it's ok to use 0 to mark end. 
//__emitCode((char)0); __emitCode((char)CharStringPointer._END_OF_STRING); forLoop: for(length = 0, pOffset = __input._getOffset() - 1, maxOffset = __input._getLength(); length < 127 && pOffset < maxOffset; ++length) { lastOffset = pOffset; value = __input._getValue(pOffset); switch(value) { case '^': case '$': case '.': case '[': case '(': case ')': case '|': break forLoop; case '\\': value = __input._getValue(++pOffset); switch(value) { case 'A': case 'G': case 'Z': case 'w': case 'W': case 'b': case 'B': case 's': case 'S': case 'd': case 'D': --pOffset; break forLoop; case 'n': ender = '\n'; ++pOffset; break; case 'r': ender = '\r'; ++pOffset; break; case 't': ender = '\t'; ++pOffset; break; case 'f': ender = '\f'; ++pOffset; break; case 'e': ender = '\033'; ++pOffset; break; case 'a': ender = '\007'; ++pOffset; break; case 'x': numLength = new int[1]; ender = (char)__parseHex(__input._array, ++pOffset, 2, numLength); pOffset+=numLength[0]; break; case 'c': ++pOffset; ender = __input._getValue(pOffset++); if(Character.isLowerCase(ender)) ender = Character.toUpperCase(ender); ender ^= 64; break; case '0': case '1': case '2': case'3': case '4': case '5': case '6': case '7': case '8': case '9': boolean doOctal = false; value = __input._getValue(pOffset); if(value == '0') doOctal = true; value = __input._getValue(pOffset + 1); if(Character.isDigit(value)) { int num; StringBuffer buffer = new StringBuffer(10); num = pOffset; value = __input._getValue(num); while(Character.isDigit(value)){ buffer.append(value); ++num; value = __input._getValue(num); } try { num = Integer.parseInt(buffer.toString()); } catch(NumberFormatException e) { throw new MalformedPatternException( "Unexpected number format exception. Please report this bug." 
+ "NumberFormatException message: " + e.getMessage()); } if(!doOctal) doOctal = (num >= __numParentheses); } if(doOctal) { numLength = new int[1]; ender = (char)__parseOctal(__input._array, pOffset, 3, numLength); pOffset+=numLength[0]; } else { --pOffset; break forLoop; } break; case CharStringPointer._END_OF_STRING: case '\0': if(pOffset >= maxOffset) throw new MalformedPatternException("Trailing \\ in expression."); // fall through to default default: ender = __input._getValue(pOffset++); break; } // end backslash switch break; case '#': if((__modifierFlags[0] & __EXTENDED) != 0) { while(pOffset < maxOffset && __input._getValue(pOffset) != '\n') ++pOffset; } // fall through to whitespace handling case ' ': case '\t': case '\n': case '\r': case '\f': case '\013': if((__modifierFlags[0] & __EXTENDED) != 0) { ++pOffset; --length; continue; } // fall through to default default: ender = __input._getValue(pOffset++); break; } // end master switch if((__modifierFlags[0] & __CASE_INSENSITIVE) != 0 && Character.isUpperCase(ender)) ender = Character.toLowerCase(ender); if(pOffset < maxOffset && __isComplexRepetitionOp(__input._array, pOffset)) { if(length > 0) pOffset = lastOffset; else { ++length; __emitCode(ender); } break; } __emitCode(ender); } // end for loop __input._setOffset(pOffset - 1); __getNextChar(); if(length < 0) throw new MalformedPatternException( "Unexpected compilation failure. Please report this bug!"); if(length > 0) retFlags[0] |= __NONNULL; if(length == 1) retFlags[0] |= __SIMPLE; if(__program!= null) __program[OpCode._getOperand(offset)] = (char)length; //__emitCode('\0'); // debug __emitCode(CharStringPointer._END_OF_STRING); } return offset; } // These are the original 8-bit character class handling methods. // We don't want to delete them just yet only to have to dig it out // of revision control later. /* // Set the bits in a character class. Only recognizes ascii. 
private void __setCharacterClassBits(char[] bits, int offset, char deflt, char ch) { if(__program== null || ch >= 256) return; ch &= 0xffff; if(deflt == 0) { bits[offset + (ch >> 4)] |= (1 << (ch & 0xf)); } else { bits[offset + (ch >> 4)] &= ~(1 << (ch & 0xf)); } } private int __parseCharacterClass() throws MalformedPatternException { boolean range = false, skipTest; char clss, deflt, lastclss = Character.MAX_VALUE; int offset, bits, numLength[] = { 0 }; offset = __emitNode(OpCode._ANYOF); if(__input._getValue() == '^') { ++__cost; __input._increment(); deflt = 0; } else { deflt = 0xffff; } bits = __programSize; for(clss = 0; clss < 16; clss++) __emitCode(deflt); clss = __input._getValue(); if(clss == ']' || clss == '-') skipTest = true; else skipTest = false; while((!__input._isAtEnd() && (clss = __input._getValue()) != ']') || skipTest) { // It sucks, but we have to make this assignment every time skipTest = false; __input._increment(); if(clss == '\\') { clss = __input._postIncrement(); switch(clss){ case 'w': for(clss = 0; clss < 256; clss++) if(OpCode._isWordCharacter(clss)) __setCharacterClassBits(__program, bits, deflt, clss); lastclss = Character.MAX_VALUE; continue; case 'W': for(clss = 0; clss < 256; clss++) if(!OpCode._isWordCharacter(clss)) __setCharacterClassBits(__program, bits, deflt, clss); lastclss = Character.MAX_VALUE; continue; case 's': for(clss = 0; clss < 256; clss++) if(Character.isWhitespace(clss)) __setCharacterClassBits(__program, bits, deflt, clss); lastclss = Character.MAX_VALUE; continue; case 'S': for(clss = 0; clss < 256; clss++) if(!Character.isWhitespace(clss)) __setCharacterClassBits(__program, bits, deflt, clss); lastclss = Character.MAX_VALUE; continue; case 'd': for(clss = '0'; clss <= '9'; clss++) __setCharacterClassBits(__program, bits, deflt, clss); lastclss = Character.MAX_VALUE; continue; case 'D': for(clss = 0; clss < '0'; clss++) __setCharacterClassBits(__program, bits, deflt, clss); for(clss = (char)('9' + 1); clss < 
256; clss++) __setCharacterClassBits(__program, bits, deflt, clss); lastclss = Character.MAX_VALUE; continue; case 'n': clss = '\n'; break; case 'r': clss = '\r'; break; case 't': clss = '\t'; break; case 'f': clss = '\f'; break; case 'b': clss = '\b'; break; case 'e': clss = '\033'; break; case 'a': clss = '\007'; break; case 'x': clss = (char)__parseHex(__input._array, __input._getOffset(), 2, numLength); __input._increment(numLength[0]); break; case 'c': clss = __input._postIncrement(); if(Character.isLowerCase(clss)) clss = Character.toUpperCase(clss); clss ^= 64; break; case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': clss = (char)__parseOctal(__input._array, __input._getOffset() - 1, 3, numLength); __input._increment(numLength[0] - 1); break; } } if(range) { if(lastclss > clss) throw new MalformedPatternException( "Invalid [] range in expression."); range = false; } else { lastclss = clss; if(__input._getValue() == '-' && __input._getOffset() + 1 < __input._getLength() && __input._getValueRelative(1) != ']') { __input._increment(); range = true; continue; } } while(lastclss <= clss) { __setCharacterClassBits(__program, bits, deflt, lastclss); if((__modifierFlags[0] & __CASE_INSENSITIVE) != 0 && Character.isUpperCase(lastclss)) __setCharacterClassBits(__program, bits, deflt, Character.toLowerCase(lastclss)); ++lastclss; } lastclss = clss; } if(__input._getValue() != ']') throw new MalformedPatternException("Unmatched [] in expression."); __getNextChar(); return offset; } */ private int __parseUnicodeClass() throws MalformedPatternException { boolean range = false, skipTest; char clss, lastclss = Character.MAX_VALUE; int offset, numLength[] = { 0 }; boolean negFlag[] = { false }; boolean opcodeFlag; /* clss isn't character when this flag true. 
*/ if(__input._getValue() == '^') { offset = __emitNode(OpCode._NANYOFUN); __input._increment(); } else { offset = __emitNode(OpCode._ANYOFUN); } clss = __input._getValue(); if(clss == ']' || clss == '-') skipTest = true; else skipTest = false; while((!__input._isAtEnd() && (clss = __input._getValue()) != ']') || skipTest) { // It sucks, but we have to make this assignment every time skipTest = false; opcodeFlag = false; __input._increment(); if(clss == '\\' || clss == '[') { if(clss == '\\') { /* character is escaped */ clss = __input._postIncrement(); } else { /* try POSIX expression */ char posixOpCode = __parsePOSIX(negFlag); if(posixOpCode != 0){ opcodeFlag = true; clss = posixOpCode; } } if (opcodeFlag != true) { switch(clss){ case 'w': opcodeFlag = true; clss = OpCode._ALNUM; lastclss = Character.MAX_VALUE; break; case 'W': opcodeFlag = true; clss = OpCode._NALNUM; lastclss = Character.MAX_VALUE; break; case 's': opcodeFlag = true; clss = OpCode._SPACE; lastclss = Character.MAX_VALUE; break; case 'S': opcodeFlag = true; clss = OpCode._NSPACE; lastclss = Character.MAX_VALUE; break; case 'd': opcodeFlag = true; clss = OpCode._DIGIT; lastclss = Character.MAX_VALUE; break; case 'D': opcodeFlag = true; clss = OpCode._NDIGIT; lastclss = Character.MAX_VALUE; break; case 'n': clss = '\n'; break; case 'r': clss = '\r'; break; case 't': clss = '\t'; break; case 'f': clss = '\f'; break; case 'b': clss = '\b'; break; case 'e': clss = '\033'; break; case 'a': clss = '\007'; break; case 'x': clss = (char)__parseHex(__input._array, __input._getOffset(), 2, numLength); __input._increment(numLength[0]); break; case 'c': clss = __input._postIncrement(); if(Character.isLowerCase(clss)) clss = Character.toUpperCase(clss); clss ^= 64; break; case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': clss = (char)__parseOctal(__input._array, __input._getOffset() - 1, 3, numLength); __input._increment(numLength[0] - 1); break; default: 
            break;
          }
        }
      }

      if(range) {
        if(lastclss > clss)
          throw new MalformedPatternException(
                                 "Invalid [] range in expression.");
        range = false;
      } else {
        lastclss = clss;

        if(opcodeFlag == false &&
           __input._getValue() == '-' &&
           __input._getOffset() + 1 < __input._getLength() &&
           __input._getValueRelative(1) != ']') {
          __input._increment();
          range = true;
          continue;
        }
      }

      if(lastclss == clss) {
        if(opcodeFlag == true) {
          if(negFlag[0] == false)
            __emitCode(OpCode._OPCODE);
          else
            __emitCode(OpCode._NOPCODE);
        } else
          __emitCode(OpCode._ONECHAR);

        __emitCode(clss);

        if((__modifierFlags[0] & __CASE_INSENSITIVE) != 0 &&
           Character.isUpperCase(clss) && Character.isUpperCase(lastclss)){
          __programSize--;
          __emitCode(Character.toLowerCase(clss));
        }
      }

      if(lastclss < clss) {
        __emitCode(OpCode._RANGE);
        __emitCode(lastclss);
        __emitCode(clss);

        if((__modifierFlags[0] & __CASE_INSENSITIVE) != 0 &&
           Character.isUpperCase(clss) && Character.isUpperCase(lastclss)){
          __programSize-=2;
          __emitCode(Character.toLowerCase(lastclss));
          __emitCode(Character.toLowerCase(clss));
        }

        lastclss = Character.MAX_VALUE;
        range = false;
      }

      lastclss = clss;
    }

    if(__input._getValue() != ']')
      throw new MalformedPatternException("Unmatched [] in expression.");

    __getNextChar();

    __emitCode(OpCode._END);

    return offset;
  }

  /**
   * Parses a POSIX expression like [:foo:].
   *
   * @return The OpCode, or 0 if parsing the POSIX expression fails.
*/ private char __parsePOSIX(boolean negFlag[]) throws MalformedPatternException { int offset = __input._getOffset(); int len = __input._getLength(); int pos = offset; char value = __input._getValue(pos++); StringBuffer buf; Object opcode; if( value != ':' ) return 0; if( __input._getValue(pos) == '^' ) { negFlag[0] = true; pos++; } else { negFlag[0] = false; } buf = new StringBuffer(); try { while ( (value = __input._getValue(pos++)) != ':' && pos < len) { buf.append(value); } } catch (Exception e){ return 0; } if( __input._getValue(pos++) != ']'){ return 0; } opcode = __hashPOSIX.get(buf.toString()); if( opcode == null ) return 0; __input._setOffset(pos); return ((Character)opcode).charValue(); } private int __parseBranch(int[] retFlags) throws MalformedPatternException { boolean nestCheck = false, handleRepetition = false; int offset, next, min, max, flags[] = { 0 }; char operator, value; min = 0; max = Character.MAX_VALUE; offset = __parseAtom(flags); if(offset == OpCode._NULL_OFFSET) { if((flags[0] & __TRYAGAIN) != 0) retFlags[0] |= __TRYAGAIN; return OpCode._NULL_OFFSET; } operator = __input._getValue(); if(operator == '(' && __input._getValueRelative(1) == '?' 
&& __input._getValueRelative(2) == '#') { while(operator != CharStringPointer._END_OF_STRING && operator != ')') operator = __input._increment(); if(operator != CharStringPointer._END_OF_STRING) { __getNextChar(); operator = __input._getValue(); } } if(operator == '{' && __parseRepetition(__input._array, __input._getOffset())) { int maxOffset, pos; next = __input._getOffset() + 1; pos = maxOffset = __input._getLength(); value = __input._getValue(next); while(Character.isDigit(value) || value == ',') { if(value == ',') { if(pos != maxOffset) break; else pos = next; } ++next; value = __input._getValue(next); } if(value == '}') { int num; StringBuffer buffer = new StringBuffer(10); if(pos == maxOffset) pos = next; __input._increment(); num = __input._getOffset(); value = __input._getValue(num); while(Character.isDigit(value)) { buffer.append(value); ++num; value = __input._getValue(num); } try { min = Integer.parseInt(buffer.toString()); } catch(NumberFormatException e) { throw new MalformedPatternException( "Unexpected number format exception. Please report this bug." + "NumberFormatException message: " + e.getMessage()); } value = __input._getValue(pos); if(value == ',') ++pos; else pos = __input._getOffset(); num = pos; buffer = new StringBuffer(10); value = __input._getValue(num); while(Character.isDigit(value)){ buffer.append(value); ++num; value = __input._getValue(num); } try { if(num != pos) max = Integer.parseInt(buffer.toString()); } catch(NumberFormatException e) { throw new MalformedPatternException( "Unexpected number format exception. Please report this bug." + "NumberFormatException message: " + e.getMessage()); } if(max == 0 && __input._getValue(pos) != '0') max = Character.MAX_VALUE; __input._setOffset(next); __getNextChar(); nestCheck = true; handleRepetition = true; } } if(!nestCheck) { handleRepetition = false; if(!__isSimpleRepetitionOp(operator)) { retFlags[0] = flags[0]; return offset; } __getNextChar(); retFlags[0] = ((operator != '+') ? 
(__WORSTCASE | __SPSTART) : (__WORSTCASE | __NONNULL)); if(operator == '*' && ((flags[0] & __SIMPLE) != 0)) { __programInsertOperator(OpCode._STAR, offset); __cost+=4; } else if(operator == '*') { min = 0; handleRepetition = true; } else if(operator == '+' && (flags[0] & __SIMPLE) != 0) { __programInsertOperator(OpCode._PLUS, offset); __cost+=3; } else if(operator == '+') { min = 1; handleRepetition = true; } else if(operator == '?') { min = 0; max = 1; handleRepetition = true; } } if(handleRepetition) { // handle repetition if((flags[0] & __SIMPLE) != 0){ __cost+= ((2 + __cost) / 2); __programInsertOperator(OpCode._CURLY, offset); } else { __cost += (4 + __cost); __programAddTail(offset, __emitNode(OpCode._WHILEM)); __programInsertOperator(OpCode._CURLYX, offset); __programAddTail(offset, __emitNode(OpCode._NOTHING)); } if(min > 0) retFlags[0] = (__WORSTCASE | __NONNULL); if(max != 0 && max < min) throw new MalformedPatternException( "Invalid interval {" + min + "," + max + "}"); if(__program!= null) { __program[offset + 2] = (char)min; __program[offset + 3] = (char)max; } } if(__input._getValue() == '?') { __getNextChar(); __programInsertOperator(OpCode._MINMOD, offset); __programAddTail(offset, offset + 2); } if(__isComplexRepetitionOp(__input._array, __input._getOffset())) throw new MalformedPatternException( "Nested repetitions *?+ in expression"); return offset; } private int __parseExpression(boolean isParenthesized, int[] hintFlags) throws MalformedPatternException { char value, paren; char[] modifierFlags, posFlags = { 0 }, negFlags = { 0 }; int nodeOffset = OpCode._NULL_OFFSET, parenthesisNum = 0, br, ender; int[] flags = { 0 };; String modifiers = "iogmsx-"; modifierFlags = posFlags; // Initially we assume expression doesn't match null string. hintFlags[0] = __NONNULL; if (isParenthesized) { paren = 1; if(__input._getValue() == '?') { __input._increment(); paren = value = __input._postIncrement(); switch(value) { case ':' : case '=' : case '!' 
: break; case '#' : value = __input._getValue(); while(value != CharStringPointer._END_OF_STRING && value != ')') value = __input._increment(); if(value != ')') throw new MalformedPatternException( "Sequence (?#... not terminated"); __getNextChar(); hintFlags[0] = __TRYAGAIN; return OpCode._NULL_OFFSET; default : __input._decrement(); value = __input._getValue(); while(value != CharStringPointer._END_OF_STRING && modifiers.indexOf(value) != -1) { if(value == '-') modifierFlags = negFlags; else __setModifierFlag(modifierFlags, value); value = __input._increment(); } __modifierFlags[0] |= posFlags[0]; __modifierFlags[0] &= ~negFlags[0]; if(value != ')') throw new MalformedPatternException( "Sequence (?" + value + "...) not recognized"); __getNextChar(); hintFlags[0] = __TRYAGAIN; return OpCode._NULL_OFFSET; } } else { parenthesisNum = __numParentheses; ++__numParentheses; nodeOffset = __emitArgNode(OpCode._OPEN, (char)parenthesisNum); } } else paren = 0; br = __parseAlternation(flags); if(br == OpCode._NULL_OFFSET) return OpCode._NULL_OFFSET; if(nodeOffset != OpCode._NULL_OFFSET) __programAddTail(nodeOffset, br); else nodeOffset = br; if((flags[0] & __NONNULL) == 0) hintFlags[0] &= ~__NONNULL; hintFlags[0] |= (flags[0] & __SPSTART); while(__input._getValue() == '|') { __getNextChar(); br = __parseAlternation(flags); if(br == OpCode._NULL_OFFSET) return OpCode._NULL_OFFSET; __programAddTail(nodeOffset, br); if((flags[0] & __NONNULL) == 0) hintFlags[0] &= ~__NONNULL; hintFlags[0] |= (flags[0] & __SPSTART); } switch(paren) { case ':' : ender = __emitNode(OpCode._NOTHING); break; case 1: ender = __emitArgNode(OpCode._CLOSE, (char)parenthesisNum); break; case '=': case '!': ender = __emitNode(OpCode._SUCCEED); hintFlags[0] &= ~__NONNULL; break; case 0 : default : ender = __emitNode(OpCode._END); break; } __programAddTail(nodeOffset, ender); for(br = nodeOffset; br != OpCode._NULL_OFFSET; br = OpCode._getNext(__program, br)) __programAddOperatorTail(br, ender); if(paren == 
'=') { __programInsertOperator(OpCode._IFMATCH, nodeOffset); __programAddTail(nodeOffset, __emitNode(OpCode._NOTHING)); } else if(paren == '!') { __programInsertOperator(OpCode._UNLESSM, nodeOffset); __programAddTail(nodeOffset, __emitNode(OpCode._NOTHING)); } if(paren != 0 && (__input._isAtEnd() || __getNextChar() != ')')) { throw new MalformedPatternException("Unmatched parentheses."); } else if(paren == 0 && !__input._isAtEnd()) { if(__input._getValue() == ')') throw new MalformedPatternException("Unmatched parentheses."); else // Should never happen. throw new MalformedPatternException( "Unreached characters at end of expression. Please report this bug!"); } return nodeOffset; } /** * Compiles a Perl5 regular expression into a Perl5Pattern instance that * can be used by a Perl5Matcher object to perform pattern matching. * Please see the user's guide for more information about Perl5 regular * expressions. *

* @param pattern A Perl5 regular expression to compile. * @param options A set of flags giving the compiler instructions on * how to treat the regular expression. The flags * are a logical OR of any number of the five MASK * constants. For example: *

   * regex =
   *   compiler.compile(pattern,
   *                    Perl5Compiler.CASE_INSENSITIVE_MASK |
   *                    Perl5Compiler.MULTILINE_MASK);
   *                 
* This says to compile the pattern so that it treats * input as consisting of multiple lines and to perform * matches in a case insensitive manner. * @return A Pattern instance constituting the compiled regular expression. * This instance will always be a Perl5Pattern and can be reliably * casted to a Perl5Pattern. * @exception MalformedPatternException If the compiled expression * is not a valid Perl5 regular expression. */ public Pattern compile(char[] pattern, int options) throws MalformedPatternException { int[] flags = { 0 }; int caseInsensitive, scan; Perl5Pattern regexp; String mustString, startString; int first; boolean sawOpen = false, sawPlus = false; StringBuffer lastLongest, longest; int length, minLength = 0, curBack, back, backmost; __input = new CharStringPointer(pattern); caseInsensitive = options & __CASE_INSENSITIVE; __modifierFlags[0] = (char)options; __sawBackreference = false; __numParentheses = 1; __programSize = 0; __cost = 0; __program= null; __emitCode((char)0); if(__parseExpression(false, flags) == OpCode._NULL_OFFSET) throw new MalformedPatternException("Unknown compilation error."); if(__programSize >= Character.MAX_VALUE - 1) throw new MalformedPatternException("Expression is too large."); __program= new char[__programSize]; regexp = new Perl5Pattern(); regexp._program = __program; regexp._expression = new String(pattern); __input._setOffset(0); __numParentheses = 1; __programSize = 0; __cost = 0; __emitCode((char)0); if(__parseExpression(false, flags) == OpCode._NULL_OFFSET) throw new MalformedPatternException("Unknown compilation error."); caseInsensitive = __modifierFlags[0] & __CASE_INSENSITIVE; regexp._isExpensive = (__cost >= 10); regexp._startClassOffset = OpCode._NULL_OFFSET; regexp._anchor = 0; regexp._back = -1; regexp._options = options; regexp._startString = null; regexp._mustString = null; mustString = null; startString = null; scan = 1; if(__program[OpCode._getNext(__program, scan)] == OpCode._END){ boolean doItAgain; // 
bad variables names! char op; first = scan = OpCode._getNextOperator(scan); op = __program[first]; while((op == OpCode._OPEN && (sawOpen = true)) || (op == OpCode._BRANCH && __program[OpCode._getNext(__program, first)] != OpCode._BRANCH) || op == OpCode._PLUS || op == OpCode._MINMOD || (OpCode._opType[op] == OpCode._CURLY && OpCode._getArg1(__program, first) > 0)) { if(op == OpCode._PLUS) sawPlus = true; else first+=OpCode._operandLength[op]; first = OpCode._getNextOperator(first); op = __program[first]; } doItAgain = true; while(doItAgain) { doItAgain = false; op = __program[first]; if(op == OpCode._EXACTLY) { startString = new String(__program, OpCode._getOperand(first + 1), __program[OpCode._getOperand(first)]); } else if(OpCode._isInArray(op, OpCode._opLengthOne, 2)) regexp._startClassOffset = first; else if(op == OpCode._BOUND || op == OpCode._NBOUND) regexp._startClassOffset = first; else if(OpCode._opType[op] == OpCode._BOL) { if(op == OpCode._BOL) regexp._anchor = Perl5Pattern._OPT_ANCH_BOL; else if(op == OpCode._MBOL) regexp._anchor = Perl5Pattern._OPT_ANCH_MBOL; else regexp._anchor = Perl5Pattern._OPT_ANCH; first = OpCode._getNextOperator(first); doItAgain = true; continue; } else if(op == OpCode._STAR && OpCode._opType[__program[OpCode._getNextOperator(first)]] == OpCode._ANY && (regexp._anchor & Perl5Pattern._OPT_ANCH) != 0) { regexp._anchor = Perl5Pattern._OPT_ANCH | Perl5Pattern._OPT_IMPLICIT; first = OpCode._getNextOperator(first); doItAgain = true; continue; } } // end while do it again if(sawPlus && (!sawOpen || !__sawBackreference)) regexp._anchor |= Perl5Pattern._OPT_SKIP; lastLongest = new StringBuffer(); longest = new StringBuffer(); length = 0; minLength = 0; curBack = 0; back = 0; backmost = 0; while(scan > 0 && (op = __program[scan]) != OpCode._END) { if(op == OpCode._BRANCH) { if(__program[OpCode._getNext(__program, scan)] == OpCode._BRANCH) { curBack = -30000; while(__program[scan] == OpCode._BRANCH) scan = OpCode._getNext(__program, 
scan); } else scan = OpCode._getNextOperator(scan); continue; } if(op == OpCode._UNLESSM) { curBack = -30000; scan = OpCode._getNext(__program, scan); continue; } if(op == OpCode._EXACTLY) { int temp; first = scan; while(__program[(temp = OpCode._getNext(__program, scan))] == OpCode._CLOSE) scan = temp; minLength += __program[OpCode._getOperand(first)]; temp = __program[OpCode._getOperand(first)]; if(curBack - back == length) { lastLongest.append(new String(__program, OpCode._getOperand(first) + 1, temp)); length += temp; curBack += temp; first = OpCode._getNext(__program, scan); } else if(temp >= (length + (curBack >= 0 ? 1 : 0))) { length = temp; lastLongest = new StringBuffer(new String(__program, OpCode._getOperand(first) + 1, temp)); back = curBack; curBack += length; first = OpCode._getNext(__program, scan); } else curBack += temp; } else if(OpCode._isInArray(op, OpCode._opLengthVaries, 0)) { curBack = -30000; length = 0; if(lastLongest.length() > longest.length()) { longest = lastLongest; backmost = back; } lastLongest = new StringBuffer(); if(op == OpCode._PLUS && OpCode._isInArray(__program[OpCode._getNextOperator(scan)], OpCode._opLengthOne, 0)) ++minLength; else if(OpCode._opType[op] == OpCode._CURLY && OpCode._isInArray(__program[OpCode._getNextOperator(scan) + 2], OpCode._opLengthOne, 0)) minLength += OpCode._getArg1(__program, scan); } else if(OpCode._isInArray(op, OpCode._opLengthOne, 0)) { ++curBack; ++minLength; length = 0; if(lastLongest.length() > longest.length()) { longest = lastLongest; backmost = back; } lastLongest = new StringBuffer(); } scan = OpCode._getNext(__program, scan); } // end while if(lastLongest.length() + ((OpCode._opType[__program[first]] == OpCode._EOL) ? 
1 : 0) > longest.length()) { longest = lastLongest; backmost = back; } else lastLongest = new StringBuffer(); if(longest.length() > 0 && startString == null) { mustString = longest.toString(); if(backmost < 0) backmost = -1; regexp._back = backmost; /* if(longest.length() > (((caseInsensitive & __CASE_INSENSITIVE) != 0 || OpCode._opType[__program[first]] == OpCode._EOL) ? 1 : 0)) */ } else longest = null; } // end if regexp._isCaseInsensitive = ((caseInsensitive & __CASE_INSENSITIVE) != 0); regexp._numParentheses = __numParentheses - 1; regexp._minLength = minLength; if(mustString != null) { regexp._mustString = mustString.toCharArray(); regexp._mustUtility = 100; } if(startString != null) regexp._startString = startString.toCharArray(); return regexp; } /** * Same as calling compile(pattern, Perl5Compiler.DEFAULT_MASK); *

* @param pattern A regular expression to compile. * @return A Pattern instance constituting the compiled regular expression. * This instance will always be a Perl5Pattern and can be reliably * cast to a Perl5Pattern. * @exception MalformedPatternException If the compiled expression * is not a valid Perl5 regular expression. */ public Pattern compile(char[] pattern) throws MalformedPatternException { return compile(pattern, DEFAULT_MASK); } /** * Same as calling compile(pattern, Perl5Compiler.DEFAULT_MASK); *

* @param pattern A regular expression to compile. * @return A Pattern instance constituting the compiled regular expression. * This instance will always be a Perl5Pattern and can be reliably * cast to a Perl5Pattern. * @exception MalformedPatternException If the compiled expression * is not a valid Perl5 regular expression. */ public Pattern compile(String pattern) throws MalformedPatternException { return compile(pattern.toCharArray(), DEFAULT_MASK); } /** * Compiles a Perl5 regular expression into a Perl5Pattern instance that * can be used by a Perl5Matcher object to perform pattern matching. * Please see the user's guide for more information about Perl5 regular * expressions. *

* @param pattern A Perl5 regular expression to compile. * @param options A set of flags giving the compiler instructions on * how to treat the regular expression. The flags * are a logical OR of any number of the five MASK * constants. For example: *

   * regex =
   *   compiler.compile("^\\w+\\d+$",
   *                    Perl5Compiler.CASE_INSENSITIVE_MASK |
   *                    Perl5Compiler.MULTILINE_MASK);
   *                 
* This says to compile the pattern so that it treats * input as consisting of multiple lines and to perform * matches in a case insensitive manner. * @return A Pattern instance constituting the compiled regular expression. * This instance will always be a Perl5Pattern and can be reliably * casted to a Perl5Pattern. * @exception MalformedPatternException If the compiled expression * is not a valid Perl5 regular expression. */ public Pattern compile(String pattern, int options) throws MalformedPatternException { return compile(pattern.toCharArray(), options); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/Perl5Repetition.java0000644000175000017500000000623007773723336025575 0ustar arnaudarnaud/* * $Id: Perl5Repetition.java,v 1.7 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. 
The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; /** * Perl5Repetition is a support class for Perl5Matcher. It was originally * defined as a top-level class rather than as an inner class to allow * compilation for JDK 1.0.2. 
* * @version @version@ * @since 1.0 * @see Perl5Matcher */ final class Perl5Repetition { int _parenFloor; int _numInstances, _min, _max; boolean _minMod; int _scan; int _next; int _lastLocation; Perl5Repetition _lastRepetition; } jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/Perl5MatchResult.java0000644000175000017500000002205607773723336025712 0ustar arnaudarnaud/* * $Id: Perl5MatchResult.java,v 1.8 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. 
* * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; /** * A class used to store and access the results of a Perl5Pattern match. * * @version @version@ * @since 1.0 * @see PatternMatcher * @see Perl5Matcher */ final class Perl5MatchResult implements MatchResult { /** * The character offset into the line or stream where the match * begins. Pattern matching methods that look for matches a line at * a time should use this field as the offset into the line * of the match. Methods that look for matches independent of line * boundaries should use this field as the offset into the entire * text stream. */ int _matchBeginOffset; /** * Arrays containing the beginning and end offsets of the pattern * groups matched within the actual matched pattern contained in the * variable match. * Pattern matching methods that do not match subgroups, will only contain * entries for group 0, which always refers to the entire pattern. 
* beginGroupOffset contains the start offset of the groups, * indexed by group number, which will always be 0 for group 0. * endGroupOffset contains the ending offset + 1 of the groups. * A group matching the null string will have beginGroupOffset * and endGroupOffset entries of equal value. Following a * convention established by the GNU regular expression library for the * C language, groups that are not part of a match contain -1 as their * begin and end offsets. */ int[] _beginGroupOffset, _endGroupOffset; /** * The entire string that matched the pattern. */ String _match; /** * Constructs a MatchResult able to store match information for * a number of subpattern groups. *

* @param groups The number of groups this MatchResult can store. * Only values greater than or equal to 1 make any * sense. At minimum, a MatchResult stores one group which * represents the entire pattern matched including all subparts. */ Perl5MatchResult(int groups){ _beginGroupOffset = new int[groups]; _endGroupOffset = new int[groups]; } /** * @return The length of the match. */ public int length(){ int length; length = (_endGroupOffset[0] - _beginGroupOffset[0]); return (length > 0 ? length : 0); } /** * @return The number of groups contained in the result. This number * includes the 0th group. In other words, the result refers * to the number of parenthesized subgroups plus the entire match * itself. */ public int groups(){ return _beginGroupOffset.length; } /** * @param group The pattern subgroup to return. * @return A string containing the indicated pattern subgroup. Group * 0 always refers to the entire match. If a group was never * matched, it returns null. This is not to be confused with * a group matching the null string, which will return a String * of length 0. */ public String group(int group){ int begin, end, length; if(group < _beginGroupOffset.length){ begin = _beginGroupOffset[group]; end = _endGroupOffset[group]; length = _match.length(); if(begin >= 0 && end >= 0) { if(begin < length && end <= length && end > begin) return _match.substring(begin, end); else if(begin <= end) return ""; } } return null; } /** * @param group The pattern subgroup. * @return The offset into group 0 of the first token in the indicated * pattern subgroup. If a group was never matched or does * not exist, returns -1. 
*/ public int begin(int group){ int begin, end;//, length; if(group < _beginGroupOffset.length){ begin = _beginGroupOffset[group]; end = _endGroupOffset[group]; //length = _match.length(); if(begin >= 0 && end >= 0)// && begin < length && end <= length) //return _beginGroupOffset[group]; return begin; } return -1; } /** * @param group The pattern subgroup. * @return Returns one plus the offset into group 0 of the last token in * the indicated pattern subgroup. If a group was never matched * or does not exist, returns -1. A group matching the null * string will return its start offset. */ public int end(int group){ int begin, end; //, length; if(group < _beginGroupOffset.length){ begin = _beginGroupOffset[group]; end = _endGroupOffset[group]; //length = _match.length(); if(begin >= 0 && end >= 0)// && begin < length && end <= length) //return _endGroupOffset[group]; return end; } return -1; } /** * Returns an offset marking the beginning of the pattern match * relative to the beginning of the input. *
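   * A minimal sketch of how this differs from begin(), assuming
   * result is a MatchResult for a match found partway into a
   * longer input (the variable names are hypothetical):
   * <pre>
   * int relative = result.begin(0);        // always 0 for group 0
   * int absolute = result.beginOffset(0);  // start of the match in the input
   * int length   = result.endOffset(0) - result.beginOffset(0);
   * </pre>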

* @param group The pattern subgroup. * @return The offset of the first token in the indicated * pattern subgroup. If a group was never matched or does * not exist, returns -1. */ public int beginOffset(int group){ int begin, end;//, length; if(group < _beginGroupOffset.length){ begin = _beginGroupOffset[group]; end = _endGroupOffset[group]; //length = _match.length(); if(begin >= 0 && end >= 0)// && begin < length && end <= length) //return _matchBeginOffset + _beginGroupOffset[group]; return _matchBeginOffset + begin; } return -1; } /** * Returns an offset marking the end of the pattern match * relative to the beginning of the input. *

* @param group The pattern subgroup. * @return Returns one plus the offset of the last token in * the indicated pattern subgroup. If a group was never matched * or does not exist, returns -1. A group matching the null * string will return its start offset. */ public int endOffset(int group){ int begin, end;//, length; if(group < _endGroupOffset.length){ begin = _beginGroupOffset[group]; end = _endGroupOffset[group]; //length = _match.length(); if(begin >= 0 && end >= 0)// && begin < length && end <= length) //return _matchBeginOffset + _endGroupOffset[group]; return _matchBeginOffset + end; } return -1; } /** * The same as group(0). * * @return A string containing the entire match. */ public String toString() { return group(0); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/Substitution.java0000644000175000017500000001335707773723336025267 0ustar arnaudarnaud/* * $Id: Substitution.java,v 1.7 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." 
* Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; /** * The Substitution interface provides a means for you to control how * a substitution is performed when using the * {@link Util#substitute Util.substitute} method. Two standard * implementations are provided, * {@link StringSubstitution} and {@link Perl5Substitution}. 
To * achieve custom control over the behavior of substitutions, you can * create your own implementations. A common use for customization is * to make a substitution a function of a match. * * @version @version@ * @since 1.1 * @see Util * @see Util#substitute * @see StringSubstitution * @see Perl5Substitution */ public interface Substitution { /** * Appends the substitution to a buffer containing the original input * with substitutions applied for the pattern matches found so far. * For maximum flexibility, the original input as well as the * PatternMatcher and Pattern used to find the match are included as * arguments. However, you will almost never find a need to use those * arguments when creating your own Substitution implementations. *

* For performance reasons, rather than provide a getSubstitution method * that returns a String used by Util.substitute, we have opted to pass * a StringBuffer argument from Util.substitute to which the Substitution * must append data. The contract that an appendSubstitution * implementation must abide by is that the appendBuffer may only be * appended to. appendSubstitution() may not alter the appendBuffer in * any way other than appending to it. *

   * This method is invoked by Util.substitute every time it finds a match.
   * After finding a match, Util.substitute appends to the appendBuffer
   * all of the original input occurring between the end of the last match
   * and the beginning of the current match.  Then it invokes
   * appendSubstitution(), passing the appendBuffer, current match, and
   * other information as arguments.  The substitutionCount keeps track
   * of how many substitutions have been performed so far by an invocation
   * of Util.substitute.  Its value starts at 1 when the first substitution
   * is found and appendSubstitution is invoked for the first time.  It
   * will NEVER be zero or a negative value.
   *
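   * The contract described above (append only, a count that starts at 1)
   * can be sketched without this package.  The sketch below is an
   * assumption-laden stand-in: it swaps in the JDK's java.util.regex,
   * whose Matcher.appendReplacement and Matcher.appendTail play roughly
   * the role of Util.substitute, so that it runs without any extra jar.
   * The class name NumberedSubstitution and the method numberMatches are
   * hypothetical and are not part of this package.
   *
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NumberedSubstitution {

  // Replaces every match with "<count>:<match>", where the count starts
  // at 1, mirroring the substitutionCount semantics described above.
  public static String numberMatches(String regex, String input) {
    Matcher matcher = Pattern.compile(regex).matcher(input);
    StringBuffer appendBuffer = new StringBuffer();
    int substitutionCount = 0;

    while (matcher.find()) {
      ++substitutionCount; // 1 on the first match, never zero or negative
      String replacement = substitutionCount + ":" + matcher.group();
      // Appends the input lying between the previous match and this one,
      // followed by the replacement.  The buffer is only appended to.
      matcher.appendReplacement(appendBuffer,
                                Matcher.quoteReplacement(replacement));
    }
    matcher.appendTail(appendBuffer); // input after the last match
    return appendBuffer.toString();
  }

  public static void main(String[] args) {
    System.out.println(numberMatches("[a-z]+", "foo 12 bar"));
  }
}
```
   *
   * A real implementation of this interface would do the equivalent work
   * inside appendSubstitution(), using the match and substitutionCount
   * arguments it receives instead of keeping its own counter.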

* @param appendBuffer The buffer containing the new string resulting * from performing substitutions on the original input. * @param match The current match causing a substitution to be made. * @param substitutionCount The number of substitutions that have been * performed so far by Util.substitute. * @param originalInput The original input upon which the substitutions are * being performed. The Substitution must treat this parameter as read only. * @param matcher The PatternMatcher used to find the current match. * @param pattern The Pattern used to find the current match. */ public void appendSubstitution(StringBuffer appendBuffer, MatchResult match, int substitutionCount, PatternMatcherInput originalInput, PatternMatcher matcher, Pattern pattern); } jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/PatternMatcherInput.java0000644000175000017500000004453607773723336026517 0ustar arnaudarnaud/* * $Id: PatternMatcherInput.java,v 1.7 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." 
* Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; /** * The PatternMatcherInput class is used to preserve state across * calls to the contains() methods of PatternMatcher instances. * It is also used to specify that only a subregion of a string * should be used as input when looking for a pattern match. 
All that * is meant by preserving state is that the end offset of the last match * is remembered, so that the next match is performed from that point * where the last match left off. This offset can be accessed from * the {@link #getCurrentOffset()} method and can be set with the * {@link #setCurrentOffset(int)} method. *

* You would use a PatternMatcherInput object when you want to search for * more than just the first occurrence of a pattern in a string, or when * you only want to search a subregion of the string for a match. An * example of its most common use is: *

 * PatternMatcher matcher;
 * PatternCompiler compiler;
 * Pattern pattern;
 * PatternMatcherInput input;
 * MatchResult result;
 *
 * compiler = new Perl5Compiler();
 * matcher  = new Perl5Matcher();
 *
 * try {
 *   pattern = compiler.compile(somePatternString);
 * } catch(MalformedPatternException e) {
 *   System.out.println("Bad pattern.");
 *   System.out.println(e.getMessage());
 *   return;
 * }
 *
 * input   = new PatternMatcherInput(someStringInput);
 *
 * while(matcher.contains(input, pattern)) {
 *   result = matcher.getMatch();  
 *   // Perform whatever processing on the result you want.
 * }
 * // Suppose we want to start searching from the beginning again with
 * // a different pattern.
 * // Just set the current offset to the begin offset.
 * input.setCurrentOffset(input.getBeginOffset());
 *
 * // Second search omitted
 *
 * // Suppose we're done with this input, but want to search another string.
 * // There's no need to create another PatternMatcherInput instance.
 * // We can just use the setInput() method.
 * input.setInput(aNewInputString);
 *
 * 
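 * What "preserving state" means can be shown with a much smaller
 * stand-in.  The sketch below is not this class and does no regular
 * expression matching: it assumes a plain literal search through
 * String.indexOf, and the names OffsetStateSketch, containsNext, rewind
 * and count are hypothetical.  It only illustrates how a begin offset,
 * an end offset and a remembered current offset cooperate.
 *
```java
public class OffsetStateSketch {
  private final String text;
  private final int beginOffset; // first offset of the searchable region
  private final int endOffset;   // one past the last offset of the region
  private int currentOffset;     // where the next search starts

  public OffsetStateSketch(String text, int begin, int length) {
    this.text = text;
    this.beginOffset = begin;
    this.endOffset = begin + length;
    this.currentOffset = begin;
  }

  // Finds the next occurrence of a non-empty literal inside the region.
  // On success the current offset moves to the end of the match, so the
  // following call resumes where this one left off.
  public boolean containsNext(String literal) {
    if (literal.isEmpty()) {
      throw new IllegalArgumentException("literal must not be empty");
    }
    int found = text.indexOf(literal, currentOffset);
    if (found < 0 || found + literal.length() > endOffset) {
      return false;
    }
    currentOffset = found + literal.length();
    return true;
  }

  public int getCurrentOffset() {
    return currentOffset;
  }

  // Rewinds to the start of the region, comparable to
  // input.setCurrentOffset(input.getBeginOffset()) in the example above.
  public void rewind() {
    currentOffset = beginOffset;
  }

  // Counts the occurrences of the literal in the region, then rewinds.
  public int count(String literal) {
    int n = 0;
    while (containsNext(literal)) {
      ++n;
    }
    rewind();
    return n;
  }

  public static void main(String[] args) {
    // The region covers offsets 3 to 7 ("ab ab") of "ab ab ab ab".
    OffsetStateSketch input = new OffsetStateSketch("ab ab ab ab", 3, 5);
    System.out.println(input.count("ab"));
  }
}
```
 *
 * Keeping the offsets in the input object rather than in the matcher is
 * what lets one matcher be reused across several inputs.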
* * @version @version@ * @since 1.0 * @see PatternMatcher */ public final class PatternMatcherInput { String _originalStringInput; char[] _originalCharInput, _originalBuffer, _toLowerBuffer; int _beginOffset, _endOffset, _currentOffset; int _matchBeginOffset = -1, _matchEndOffset = -1; /** * Creates a PatternMatcherInput object, associating a region of a String * as input to be used for pattern matching by PatternMatcher objects. * A copy of the string is not made, therefore you should not modify * the string unless you know what you are doing. * The current offset of the PatternMatcherInput is set to the begin * offset of the region. *

   * @param input  The input to associate with the PatternMatcherInput.
   * @param begin  The offset into the String to use as the beginning of
   *               the input.
   * @param length The length of the region starting from the begin offset
   *               to use as the input for pattern matching purposes.
   */
  public PatternMatcherInput(String input, int begin, int length) {
    setInput(input, begin, length);
  }

  /**
   * Like calling
   *

   * PatternMatcherInput(input, 0, input.length());
   * 
*

* @param input The input to associate with the PatternMatcherInput. */ public PatternMatcherInput(String input) { this(input, 0, input.length()); } /** * Creates a PatternMatcherInput object, associating a region of a string * (represented as a char[]) as input * to be used for pattern matching by PatternMatcher objects. * A copy of the string is not made, therefore you should not modify * the string unless you know what you are doing. * The current offset of the PatternMatcherInput is set to the begin * offset of the region. *

   * @param input  The input to associate with the PatternMatcherInput.
   * @param begin  The offset into the char[] to use as the beginning of
   *               the input.
   * @param length The length of the region starting from the begin offset
   *               to use as the input for pattern matching purposes.
   */
  public PatternMatcherInput(char[] input, int begin, int length) {
    setInput(input, begin, length);
  }

  /**
   * Like calling:
   *

   * PatternMatcherInput(input, 0, input.length);
   * 
*

   * @param input  The input to associate with the PatternMatcherInput.
   */
  public PatternMatcherInput(char[] input) {
    this(input, 0, input.length);
  }

  /**
   * @return The length of the region to be considered input for pattern
   *         matching purposes.  Essentially this is the end offset minus
   *         the begin offset.
   */
  public int length() {
    return (_endOffset - _beginOffset);
    //return _originalBuffer.length;
  }

  /**
   * Associates a region of a String as input
   * to be used for pattern matching by PatternMatcher objects.
   * The current offset of the PatternMatcherInput is set to the begin
   * offset of the region.
   *

   * @param input  The input to associate with the PatternMatcherInput.
   * @param begin  The offset into the String to use as the beginning of
   *               the input.
   * @param length The length of the region starting from the begin offset
   *               to use as the input for pattern matching purposes.
   */
  public void setInput(String input, int begin, int length) {
    _originalStringInput = input;
    _originalCharInput = null;
    _toLowerBuffer = null;
    _originalBuffer = input.toCharArray();
    setCurrentOffset(begin);
    setBeginOffset(begin);
    setEndOffset(_beginOffset + length);
  }

  /**
   * This method is identical to calling:
   *

   * setInput(input, 0, input.length());
   * 
*

* @param input The input to associate with the PatternMatcherInput. */ public void setInput(String input) { setInput(input, 0, input.length()); } /** * Associates a region of a string (represented as a char[]) as input * to be used for pattern matching by PatternMatcher objects. * A copy of the string is not made, therefore you should not modify * the string unless you know what you are doing. * The current offset of the PatternMatcherInput is set to the begin * offset of the region. *

   * @param input  The input to associate with the PatternMatcherInput.
   * @param begin  The offset into the char[] to use as the beginning of
   *               the input.
   * @param length The length of the region starting from the begin offset
   *               to use as the input for pattern matching purposes.
   */
  public void setInput(char[] input, int begin, int length) {
    _originalStringInput = null;
    _toLowerBuffer = null;
    _originalBuffer = _originalCharInput = input;
    setCurrentOffset(begin);
    setBeginOffset(begin);
    setEndOffset(_beginOffset + length);
  }

  /**
   * This method is identical to calling:
   *

   * setInput(input, 0, input.length);
   * 
*

* @param input The input to associate with the PatternMatcherInput. */ public void setInput(char[] input) { setInput(input, 0, input.length); } /** * Returns the character at a particular offset relative to the begin * offset of the input. *

   * @param offset The offset at which to fetch a character (relative to
   *               the begin offset).
   * @return The character at a particular offset.
   * @exception ArrayIndexOutOfBoundsException If the offset does not occur
   *            within the bounds of the input.
   */
  public char charAt(int offset) {
    return _originalBuffer[_beginOffset + offset];
  }

  /**
   * Returns a new string that is a substring of the PatternMatcherInput
   * instance.  The substring begins at the specified beginOffset relative
   * to the begin offset and extends to the specified endOffset - 1
   * relative to the begin offset of the PatternMatcherInput instance.
   *

* @param beginOffset The offset relative to the begin offset of the * PatternMatcherInput at which to start the substring (inclusive). * @param endOffset The offset relative to the begin offset of the * PatternMatcherInput at which to end the substring (exclusive). * @return The specified substring. * @exception ArrayIndexOutOfBoundsException If one of the offsets does * not occur within the bounds of the input. */ public String substring(int beginOffset, int endOffset) { return new String(_originalBuffer, _beginOffset+beginOffset, endOffset - beginOffset); } /** * Returns a new string that is a substring of the PatternMatcherInput * instance. The substring begins at the specified beginOffset relative * to the begin offset and extends to the end offset of the * PatternMatcherInput. *

   * @param beginOffset The offset relative to the begin offset of the
   *        PatternMatcherInput at which to start the substring.
   * @return The specified substring.
   * @exception ArrayIndexOutOfBoundsException If the offset does not occur
   *            within the bounds of the input.
   */
  public String substring(int beginOffset) {
    beginOffset+=_beginOffset;
    return new String(_originalBuffer, beginOffset, _endOffset - beginOffset);
  }

  /**
   * Retrieves the original input used to initialize the PatternMatcherInput
   * instance.  If a String was used, the String instance will be returned.
   * If a char[] was used, the char[] instance will be returned.  This violates
   * data encapsulation and hiding principles, but it is a great convenience
   * for the programmer.
   *

   * @return The String or char[] input used to initialize the
   *         PatternMatcherInput instance.
   */
  public Object getInput(){
    if(_originalStringInput == null)
      return _originalCharInput;
    return _originalStringInput;
  }

  /**
   * Retrieves the char[] buffer to be used as input by PatternMatcher
   * implementations to look for matches.  This array should be treated
   * as read only by the programmer.
   *

* @return The char[] buffer to be used as input by PatternMatcher * implementations. */ public char[] getBuffer() { return _originalBuffer; } /** * Returns whether or not the end of the input has been reached. *

* @return True if the current offset is greater than or equal to the * end offset. */ public boolean endOfInput(){ return (_currentOffset >= _endOffset); } /** * @return The offset of the input that should be considered the start * of the region to be considered as input by PatternMatcher * methods. */ public int getBeginOffset() { return _beginOffset; } /** * @return The offset of the input that should be considered the end * of the region to be considered as input by PatternMatcher * methods. This offset is actually 1 plus the last offset * that is part of the input region. */ public int getEndOffset() { return _endOffset; } /** * @return The offset of the input that should be considered the current * offset where PatternMatcher methods should start looking for * matches. */ public int getCurrentOffset() { return _currentOffset; } /** * Sets the offset of the input that should be considered the start * of the region to be considered as input by PatternMatcher * methods. In other words, everything before this offset is ignored * by a PatternMatcher. *

* @param offset The offset to use as the beginning of the input. */ public void setBeginOffset(int offset) { _beginOffset = offset; } /** * Sets the offset of the input that should be considered the end * of the region to be considered as input by PatternMatcher * methods. This offset is actually 1 plus the last offset * that is part of the input region. *

* @param offset The offset to use as the end of the input. */ public void setEndOffset(int offset) { _endOffset = offset; } /** * Sets the offset of the input that should be considered the current * offset where PatternMatcher methods should start looking for * matches. Also resets all match offset information to -1. By calling * this method, you invalidate all previous match information. Therefore * a PatternMatcher implementation must call this method before setting * match offset information. *

* @param offset The offset to use as the current offset. */ public void setCurrentOffset(int offset) { _currentOffset = offset; setMatchOffsets(-1, -1); } /** * Returns the string representation of the input, where the input is * considered to start from the begin offset and end at the end offset. *

* @return The string representation of the input. */ public String toString() { return new String(_originalBuffer, _beginOffset, length()); } /** * A convenience method returning the part of the input occurring before * the last match found by a call to a Perl5Matcher * {@link Perl5Matcher#contains contains} method. *

   * @return The input preceding a match.
   */
  public String preMatch() {
    return new String(_originalBuffer, _beginOffset,
                      _matchBeginOffset - _beginOffset);
  }

  /**
   * A convenience method returning the part of the input occurring after
   * the last match found by a call to a Perl5Matcher
   * {@link Perl5Matcher#contains contains} method.
   *

* @return The input succeeding a contains() match. */ public String postMatch() { return new String(_originalBuffer, _matchEndOffset, _endOffset - _matchEndOffset); } /** * A convenience method returning the part of the input corresponding * to the last match found by a call to a Perl5Matcher * {@link Perl5Matcher#contains contains} method. * The method is not called getMatch() so as not to confuse it * with Perl5Matcher's getMatch() which returns a MatchResult instance * and also for consistency with preMatch() and postMatch(). *
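   * The offset arithmetic shared by preMatch(), match() and postMatch()
   * can be tried in isolation.  The sketch below copies the three
   * expressions from this class into one static helper; the class name
   * MatchSlices and the method slice are hypothetical, and the offsets
   * are passed in by hand instead of being recorded by a matcher.
   *
```java
public class MatchSlices {

  // Returns {preMatch, match, postMatch} for a match occupying
  // [matchBegin, matchEnd) inside the region [begin, end) of buffer.
  public static String[] slice(char[] buffer, int begin, int end,
                               int matchBegin, int matchEnd) {
    String pre   = new String(buffer, begin, matchBegin - begin);
    String match = new String(buffer, matchBegin, matchEnd - matchBegin);
    String post  = new String(buffer, matchEnd, end - matchEnd);
    return new String[] { pre, match, post };
  }

  public static void main(String[] args) {
    char[] buffer = "key = value".toCharArray();
    // Pretend a matcher found "=" at offsets 4 (inclusive) to 5 (exclusive).
    String[] parts = slice(buffer, 0, buffer.length, 4, 5);
    System.out.println("[" + parts[0] + "][" + parts[1] + "][" + parts[2] + "]");
  }
}
```
   *
   * With the -1 offsets that setCurrentOffset() stores before any match
   * has been found, the same arithmetic produces a negative length and
   * the String constructor throws, so these methods are only meaningful
   * after a successful contains() call.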

* @return The input consisting of the match found by contains(). */ public String match() { return new String(_originalBuffer, _matchBeginOffset, _matchEndOffset - _matchBeginOffset); } /** * This method is intended for use by PatternMatcher implementations. * It is necessary to record the location of the previous match so that * consecutive contains() matches involving null string matches are * properly handled. If you are not implementing a PatternMatcher, forget * this method exists. If you use it outside of its intended context, you * will only disrupt the stored state. *

* As a note, the preMatch(), postMatch(), and match() methods are provided * as conveniences because PatternMatcherInput must store match offset * information to completely preserve state for consecutive PatternMatcher * contains() matches. *

* @param matchBeginOffset The begin offset of a match found by contains(). * @param matchEndOffset The end offset of a match found by contains(). */ public void setMatchOffsets(int matchBeginOffset, int matchEndOffset) { _matchBeginOffset = matchBeginOffset; _matchEndOffset = matchEndOffset; } /** * Returns the offset marking the beginning of the match found by * contains(). *

* @return The begin offset of a contains() match. */ public int getMatchBeginOffset() { return _matchBeginOffset; } /** * Returns the offset marking the end of the match found by contains(). *

* @return The end offset of a contains() match. */ public int getMatchEndOffset() { return _matchEndOffset; } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/Perl5Pattern.java0000644000175000017500000001216407773723336025073 0ustar arnaudarnaud/* * $Id: Perl5Pattern.java,v 1.8 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. 
* * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; import java.io.*; /** * An implementation of the Pattern interface for Perl5 regular expressions. * This class is compatible with the Perl5Compiler and Perl5Matcher * classes. When a Perl5Compiler instance compiles a regular expression * pattern, it produces a Perl5Pattern instance containing internal * data structures used by Perl5Matcher to perform pattern matches. * This class cannot be subclassed and * cannot be directly instantiated by the programmer as it would not * make sense. Perl5Pattern instances should only be created through calls * to a Perl5Compiler instance's compile() methods. The class implements * the Serializable interface so that instances may be pre-compiled and * saved to disk if desired. 
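 * Saving a compiled pattern relies only on standard Java serialization.
 * The sketch below is a stand-in under stated assumptions: it uses the
 * JDK's java.util.regex.Pattern, which is also Serializable, so that it
 * runs without this package, and it writes to a byte array where a real
 * program would more likely use a file stream.  The class name
 * PatternRoundTrip and the method roundTrip are hypothetical.  A
 * Perl5Pattern returned by a Perl5Compiler could presumably be passed to
 * writeObject() in the same way.
 *
```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.regex.Pattern;

public class PatternRoundTrip {

  // Serializes a compiled pattern to bytes and reads it back.
  public static Pattern roundTrip(Pattern pattern) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(pattern);
      }
      try (ObjectInputStream in = new ObjectInputStream(
               new ByteArrayInputStream(bytes.toByteArray()))) {
        return (Pattern) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("round trip failed", e);
    }
  }

  public static void main(String[] args) {
    Pattern restored = roundTrip(Pattern.compile("\\d+"));
    System.out.println(restored.matcher("2003").matches());
  }
}
```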
* * @version @version@ * @since 1.0 * @see Perl5Compiler * @see Perl5Matcher */ public final class Perl5Pattern implements Pattern, Serializable, Cloneable { static final int _OPT_ANCH_BOL = 0x01, _OPT_ANCH_MBOL = 0x02, _OPT_SKIP = 0x04, _OPT_IMPLICIT = 0x08; static final int _OPT_ANCH = (_OPT_ANCH_BOL | _OPT_ANCH_MBOL); String _expression; char[] _program; int _mustUtility; int _back; int _minLength; int _numParentheses; boolean _isCaseInsensitive, _isExpensive; int _startClassOffset; int _anchor; int _options; char[] _mustString, _startString; /** * A dummy constructor with default visibility to override the default * public constructor that would be created otherwise by the compiler. */ Perl5Pattern(){ } /* private void readObject(ObjectInputStream stream) throws IOException, ClassNotFoundException { stream.defaultReadObject(); _beginMatchOffsets = new int[_numParentheses + 1]; _endMatchOffsets = new int[_numParentheses + 1]; } */ /** * This method returns the string representation of the pattern. *

* @return The original string representation of the regular expression * pattern. */ public String getPattern() { return _expression; } /** * This method returns an integer containing the compilation options used * to compile this pattern. *

* @return The compilation options used to compile the pattern. */ public int getOptions() { return _options; } /* // For testing public String toString() { return "Parens: " + _numParentheses + " " + _beginMatchOffsets.length + " " + _endMatchOffsets.length; } */ } jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/PatternCompiler.java0000644000175000017500000001664407773723336025665 0ustar arnaudarnaud/* * $Id: PatternCompiler.java,v 1.7 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. 
* * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; /** * The PatternCompiler interface defines the operations a regular * expression compiler must implement. However, the types of * regular expressions recognized by a compiler and the Pattern * implementations produced as a result of compilation are not * restricted. *

* A PatternCompiler instance is used to compile the string representation * (either as a String or char[]) of a regular expression into a Pattern * instance. The Pattern can then be used in conjunction with the appropriate * PatternMatcher instance to perform pattern searches. A form * of use might be: *

*

 * PatternCompiler compiler;
 * PatternMatcher matcher;
 * Pattern pattern;
 * String input;
 *
 * // Initialization of compiler, matcher, and input omitted;
 *
 * try {
 *   pattern = compiler.compile("\\d+");
 * } catch(MalformedPatternException e) {
 *   System.out.println("Bad pattern.");
 *   System.out.println(e.getMessage());
 *   System.exit(1);
 * }
 * 
 *
 * if(matcher.matches(input, pattern))
 *    System.out.println(input + " is a number");
 * else
 *    System.out.println(input + " is not a number");
 *
 * 
*

* Specific PatternCompiler implementations such as Perl5Compiler may have * variations of the compile() methods that take extra options affecting * the compilation of a pattern. However, the PatternCompiler method * implementations should provide the default behavior of the class. * * @version @version@ * @since 1.0 * @see Pattern * @see PatternMatcher * @see MalformedPatternException */ public interface PatternCompiler { /** * Compiles a regular expression into a data structure that can be used * by a PatternMatcher implementation to perform pattern matching. *

* @param pattern A regular expression to compile. * @return A Pattern instance constituting the compiled regular expression. * @exception MalformedPatternException If the compiled expression * does not conform to the grammar understood by the PatternCompiler or * if some other error in the expression is encountered. */ public Pattern compile(String pattern) throws MalformedPatternException; /** * Compiles a regular expression into a data structure that can be * used by a PatternMatcher implementation to perform pattern matching. * Additional regular expression syntax specific options can be passed * as a bitmask of options. *
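   * The idea of passing options as a bitmask can be sketched with the
   * JDK's java.util.regex as a stand-in; it is not this interface, and
   * the flag constants shown belong to java.util.regex.Pattern rather
   * than to any PatternCompiler implementation, which defines its own
   * constants.  The class name OptionsSketch and the method matches are
   * hypothetical.
   *
```java
import java.util.regex.Pattern;

public class OptionsSketch {

  // Compiles the expression with the given option bits and tests whether
  // the whole input matches it.
  public static boolean matches(String regex, int options, String input) {
    return Pattern.compile(regex, options).matcher(input).matches();
  }

  public static void main(String[] args) {
    // Individual flags are combined with a logical OR.
    int options = Pattern.CASE_INSENSITIVE | Pattern.MULTILINE;
    System.out.println(matches("^hello$", options, "HELLO"));
    System.out.println(matches("^hello$", 0, "HELLO"));
  }
}
```
   *
   * Passing 0 means "no options", which is why the second call behaves
   * like the single-argument form of compile.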

* @param pattern A regular expression to compile. * @param options A set of flags giving the compiler instructions on * how to treat the regular expression. The flags * are a logical OR of any number of the allowable * constants permitted by the PatternCompiler * implementation. * @return A Pattern instance constituting the compiled regular expression. * @exception MalformedPatternException If the compiled expression * does not conform to the grammar understood by the PatternCompiler or * if some other error in the expression is encountered. */ public Pattern compile(String pattern, int options) throws MalformedPatternException; /** * Compiles a regular expression into a data structure that can be used * by a PatternMatcher implementation to perform pattern matching. *

* @param pattern A regular expression to compile. * @return A Pattern instance constituting the compiled regular expression. * @exception MalformedPatternException If the compiled expression * does not conform to the grammar understood by the PatternCompiler or * if some other error in the expression is encountered. */ public Pattern compile(char[] pattern) throws MalformedPatternException; /** * Compiles a regular expression into a data structure that can be * used by a PatternMatcher implementation to perform pattern matching. * Additional regular expression syntax specific options can be passed * as a bitmask of options. *

* @param pattern A regular expression to compile. * @param options A set of flags giving the compiler instructions on * how to treat the regular expression. The flags * are a logical OR of any number of the allowable * constants permitted by the PatternCompiler * implementation. * @return A Pattern instance constituting the compiled regular expression. * @exception MalformedPatternException If the compiled expression * does not conform to the grammar understood by the PatternCompiler or * if some other error in the expression is encountered. */ public Pattern compile(char[] pattern, int options) throws MalformedPatternException; } jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/MalformedPatternException.java0000644000175000017500000000733407773723336027674 0ustar arnaudarnaud/* * $Id: MalformedPatternException.java,v 1.8 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. 
The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; /** * A class used to signify the occurrence of a syntax error in a * regular expression that is being compiled. The class is not * declared final so that it may be subclassed for identifying * more specific pattern compilation errors. However, at this point * in time, this package does not subclass MalformedPatternException * for any purpose. 
This does not preclude users and third party * implementors of the interfaces of this package from subclassing it * for their own purposes. * * @version @version@ * @since 1.0 * @see PatternCompiler */ public class MalformedPatternException extends Exception { /** * Simply calls the corresponding constructor of its superclass. */ public MalformedPatternException() { super(); } /** * Simply calls the corresponding constructor of its superclass. *

* @param message A message indicating the nature of the parse error. */ public MalformedPatternException(String message) { super(message); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/OpCode.java0000644000175000017500000002117207773723336023716 0ustar arnaudarnaud/* * $Id: OpCode.java,v 1.11 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. 
* * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; /** * The OpCode class should not be instantiated. It is a holder of various * constants and static methods pertaining to the manipulation of the * op-codes used in a compiled regular expression. * * @version @version@ * @since 1.0 */ final class OpCode { private OpCode() { } // Names, values, and descriptions of operators correspond to those of // Perl regex bytecodes and for compatibility purposes are drawn from // regcomp.h in the Perl source tree by Larry Wall. static final char // Has Operand Meaning _END = 0, // no End of program. _BOL = 1, // no Match "" at beginning of line. _MBOL = 2, // no Same, assuming multiline. _SBOL = 3, // no Same, assuming singleline. _EOL = 4, // no Match "" at end of line. _MEOL = 5, // no Same, assuming multiline. _SEOL = 6, // no Same, assuming singleline. _ANY = 7, // no Match any one character (except newline). _SANY = 8, // no Match any one character. 
_ANYOF = 9, // yes Match character in (or not in) this class. _CURLY = 10, // yes Match this simple thing {n,m} times. _CURLYX = 11, // yes Match this complex thing {n,m} times. _BRANCH = 12, // yes Match this alternative, or the next... _BACK = 13, // no Match "", "next" ptr points backward. _EXACTLY = 14, // yes Match this string (preceded by length). _NOTHING = 15, // no Match empty string. _STAR = 16, // yes Match this (simple) thing 0 or more times. _PLUS = 17, // yes Match this (simple) thing 1 or more times. _ALNUM = 18, // no Match any word character _NALNUM = 19, // no Match any non-word character _BOUND = 20, // no Match "" at any word boundary _NBOUND = 21, // no Match "" at any word non-boundary _SPACE = 22, // no Match any whitespace character _NSPACE = 23, // no Match any non-whitespace character _DIGIT = 24, // no Match any numeric character _NDIGIT = 25, // no Match any non-numeric character _REF = 26, // yes Match some already matched string _OPEN = 27, // yes Mark this point in input as start of #n. _CLOSE = 28, // yes Analogous to OPEN. _MINMOD = 29, // no Next operator is not greedy. _GBOL = 30, // no Matches where last m//g left off. _IFMATCH = 31, // no Succeeds if the following matches. _UNLESSM = 32, // no Fails if the following matches. _SUCCEED = 33, // no Return from a subroutine, basically. _WHILEM = 34, // no Do curly processing and see if rest matches. _ANYOFUN = 35, // yes Match unicode character in this class. _NANYOFUN= 36, // yes Match unicode character not in this class. _RANGE = 37, // yes Range flag in // Change the names of these constants later to make it clear they // are POSIX classes. _ALPHA = 38, _BLANK = 39, _CNTRL = 40, _GRAPH = 41, _LOWER = 42, _PRINT = 43, _PUNCT = 44, _UPPER = 45, _XDIGIT = 46, _OPCODE = 47, _NOPCODE = 48, _ONECHAR = 49, _ALNUMC = 50, _ASCII = 51; // Lengths of the various operands. 
static final int _operandLength[] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // OpCode 0-9 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, // OpCode 10-19 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, // OpCode 20-29 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // OpCode 30-39 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // OpCode 40-49 0, 0 // OpCode 50-51 }; static final char _opType[] = { _END, _BOL, _BOL, _BOL, _EOL, _EOL, _EOL, _ANY, _ANY, _ANYOF, _CURLY, _CURLY, _BRANCH, _BACK, _EXACTLY, _NOTHING, _STAR, _PLUS, _ALNUM, _NALNUM, _BOUND, _NBOUND, _SPACE, _NSPACE, _DIGIT, _NDIGIT, _REF, _OPEN, _CLOSE, _MINMOD, _BOL, _BRANCH, _BRANCH, _END, _WHILEM, _ANYOFUN, _NANYOFUN, _RANGE, _ALPHA, _BLANK, _CNTRL, _GRAPH, _LOWER, _PRINT, _PUNCT, _UPPER, _XDIGIT, _OPCODE, _NOPCODE, _ONECHAR, _ALNUMC, _ASCII }; static final char _opLengthVaries[] = { _BRANCH, _BACK, _STAR, _PLUS, _CURLY, _CURLYX, _REF, _WHILEM }; static final char _opLengthOne[] = { _ANY, _SANY, _ANYOF, _ALNUM, _NALNUM, _SPACE, _NSPACE, _DIGIT, _NDIGIT, _ANYOFUN, _NANYOFUN, _ALPHA, _BLANK, _CNTRL, _GRAPH, _LOWER, _PRINT, _PUNCT, _UPPER, _XDIGIT, _OPCODE, _NOPCODE, _ONECHAR, _ALNUMC, _ASCII }; static final int _NULL_OFFSET = -1; static final char _NULL_POINTER = 0; static final int _getNextOffset(char[] program, int offset) { return ((int)program[offset + 1]); } static final char _getArg1(char[] program, int offset) { return program[offset + 2]; } static final char _getArg2(char[] program, int offset) { return program[offset + 3]; } static final int _getOperand(int offset) { return (offset + 2); } static final boolean _isInArray(char ch, char[] array, int start) { while(start < array.length) if(ch == array[start++]) return true; return false; } static final int _getNextOperator(int offset) { return (offset + 2); } static final int _getPrevOperator(int offset) { return (offset - 2); } static final int _getNext(char[] program, int offset) { int offs; if(program == null) return _NULL_OFFSET; offs = _getNextOffset(program, offset); if(offs == _NULL_POINTER) return _NULL_OFFSET; 
if(program[offset] == OpCode._BACK) return (offset - offs); return (offset + offs); } // doesn't really belong in this class, but we want Perl5Matcher not to // depend on Perl5Compiler // Matches Perl's definition of \w, which is different from [:alnum:] static final boolean _isWordCharacter(char token) { return (Character.isLetterOrDigit(token) || token == '_'); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/MatchResult.java0000644000175000017500000002223707773723336025003 0ustar arnaudarnaud/* * $Id: MatchResult.java,v 1.7 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. 
Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; /** * The MatchResult interface allows PatternMatcher implementors to return * results storing match information in whatever format they like, while * presenting a consistent way of accessing that information. However, * MatchResult implementations should strictly follow the behavior * described for the interface methods. *

* * A MatchResult instance contains a pattern match and its saved groups. * You can access the entire match directly using the * {@link #group(int)} method with an argument of 0, * or by the {@link #toString()} method which is * defined to return the same thing. It is also possible to obtain * the beginning and ending offsets of a match relative to the input * producing the match by using the * {@link #beginOffset(int)} and {@link #endOffset(int)} methods. The * {@link #begin(int)} and {@link #end(int)} are useful in some * circumstances and return the begin and end offsets of the subgroups * of a match relative to the beginning of the match. *

* * You might use a MatchResult as follows: *

 * int groups;
 * PatternMatcher matcher;
 * PatternCompiler compiler;
 * Pattern pattern;
 * PatternMatcherInput input;
 * MatchResult result;
 *
 * compiler = new Perl5Compiler();
 * matcher  = new Perl5Matcher();
 *
 * try {
 *   pattern = compiler.compile(somePatternString);
 * } catch(MalformedPatternException e) {
 *   System.out.println("Bad pattern.");
 *   System.out.println(e.getMessage());
 *   return;
 * }
 *
 * input   = new PatternMatcherInput(someStringInput);
 *
 * while(matcher.contains(input, pattern)) {
 *   result = matcher.getMatch();  
 *   // Perform whatever processing on the result you want.
 *   // Here we just print out all its elements to show how its
 *   // methods are used.
 * 
 *   System.out.println("Match: " + result.toString());
 *   System.out.println("Length: " + result.length());
 *   groups = result.groups();
 *   System.out.println("Groups: " + groups);
 *   System.out.println("Begin offset: " + result.beginOffset(0));
 *   System.out.println("End offset: " + result.endOffset(0));
 *   System.out.println("Saved Groups: ");
 *
 *   // Start at 1 because we just printed out group 0
 *   for(int group = 1; group < groups; group++) {
 *     System.out.println(group + ": " + result.group(group));
 *     System.out.println("Begin: " + result.begin(group));
 *     System.out.println("End: " + result.end(group));
 *   }
 * }
 * 
* * @version @version@ * @since 1.0 * @see PatternMatcher */ public interface MatchResult { /** * A convenience method returning the length of the entire match. * If you want to get the length of a particular subgroup you should * use the {@link #group(int)} method to get * the string and then access its length() method as follows: *

*

   * int length = -1; // Use -1 to indicate group doesn't exist
   * MatchResult result;
   * String subgroup;
   * 
   * // Initialization of result omitted
   *
   * subgroup = result.group(1);
   * if(subgroup != null)
   *   length = subgroup.length();
   *
   * 
*

* * The length() method serves as a more efficient way to do: *

*

   * length = result.group(0).length();
   * 
*

* * @return The length of the match. */ public int length(); /** * @return The number of groups contained in the result. This number * includes the 0th group. In other words, the result refers * to the number of parenthesized subgroups plus the entire match * itself. */ public int groups(); /** * Returns the contents of the parenthesized subgroups of a match, * counting parentheses from left to right and starting from 1. * Group 0 always refers to the entire match. For example, if the * pattern foo(\d+) is used to extract a match * from the input abfoo123, then group(0) * will return foo123 and group(1) will return * 123. group(2) will return * null because there is only one subgroup in the original * pattern. *

* @param group The pattern subgroup to return. * @return A string containing the indicated pattern subgroup. Group * 0 always refers to the entire match. If a group was never * matched, it returns null. This is not to be confused with * a group matching the null string, which will return a String * of length 0. */ public String group(int group); /** * @param group The pattern subgroup. * @return The offset into group 0 of the first token in the indicated * pattern subgroup. If a group was never matched or does * not exist, returns -1. Be aware that a group that matches * the null string at the end of a match will have an offset * equal to the length of the string, so you shouldn't blindly * use the offset to index an array or String. */ public int begin(int group); /** * @param group The pattern subgroup. * @return Returns one plus the offset into group 0 of the last token in * the indicated pattern subgroup. If a group was never matched * or does not exist, returns -1. A group matching the null * string will return its start offset. */ public int end(int group); /** * Returns an offset marking the beginning of the pattern match * relative to the beginning of the input from which the match * was extracted. *

* @param group The pattern subgroup. * @return The offset of the first token in the indicated * pattern subgroup. If a group was never matched or does * not exist, returns -1. */ public int beginOffset(int group); /** * Returns an offset marking the end of the pattern match * relative to the beginning of the input from which the match was * extracted. *

* @param group The pattern subgroup. * @return Returns one plus the offset of the last token in * the indicated pattern subgroup. If a group was never matched * or does not exist, returns -1. A group matching the null * string will return its start offset. */ public int endOffset(int group); /** * Returns the same as group(0). * * @return A string containing the entire match. */ public String toString(); } jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/Perl5Debug.java0000644000175000017500000002331507773723336024504 0ustar arnaudarnaud/* * $Id: Perl5Debug.java,v 1.11 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. 
Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; /** * The Perl5Debug class is not intended for general use and should not * be instantiated, but is provided because some users may find the output * of its single method to be useful. * The Perl5Compiler class generates a representation of a * regular expression identical to that of Perl5 in the abstract, but * not in terms of actual data structures. The Perl5Debug class allows * the bytecode program contained by a Perl5Pattern to be printed out for * comparison with the program generated by Perl5 with the -r option. * * @version @version@ * @since 1.0 * @see Perl5Pattern */ public final class Perl5Debug { /** * A dummy constructor to prevent instantiation of Perl5Debug. 
*/ private Perl5Debug() { } /** * This method prints to a String the bytecode program contained in a * Perl5Pattern. The program byte codes are identical to those * generated by Perl5 with the -r option, but the offsets are * different due to the different data structures used. This * method is useful for diagnosing suspected bugs. The Perl5Compiler * class is designed to produce regular expression programs identical * to those produced by Perl5. By comparing the output of this method * and the output of Perl5 with the -r option on the same regular * expression, you can determine whether Perl5Compiler correctly compiled * an expression. *

* @param regexp The Perl5Pattern to print. * @return A string representation of the bytecode program defining the * regular expression. */ public static String printProgram(Perl5Pattern regexp) { StringBuffer buffer; char operator = OpCode._OPEN, prog[]; int offset, next; prog = regexp._program; offset = 1; buffer = new StringBuffer(); while(operator != OpCode._END) { operator = prog[offset]; buffer.append(offset); _printOperator(prog, offset, buffer); next = OpCode._getNext(prog, offset); offset+=OpCode._operandLength[operator]; buffer.append("(" + next + ")"); offset+=2; if(operator == OpCode._ANYOF) { offset += 16; } else if(operator == OpCode._ANYOFUN || operator == OpCode._NANYOFUN) { while(prog[offset] != OpCode._END) { if(prog[offset] == OpCode._RANGE) offset+=3; else offset+=2; } ++offset; } else if(operator == OpCode._EXACTLY) { ++offset; buffer.append(" <"); //while(prog[offset] != '0') while(prog[offset] != CharStringPointer._END_OF_STRING) { //while(prog[offset] != 0 && // prog[offset] != CharStringPointer._END_OF_STRING) { buffer.append(prog[offset]); ++offset; } buffer.append(">"); ++offset; } buffer.append('\n'); } // Can print some other stuff here. 
if(regexp._startString != null) buffer.append("start `" + new String(regexp._startString) + "' "); if(regexp._startClassOffset != OpCode._NULL_OFFSET) { buffer.append("stclass `"); _printOperator(prog, regexp._startClassOffset, buffer); buffer.append("' "); } if((regexp._anchor & Perl5Pattern._OPT_ANCH) != 0) buffer.append("anchored "); if((regexp._anchor & Perl5Pattern._OPT_SKIP) != 0) buffer.append("plus "); if((regexp._anchor & Perl5Pattern._OPT_IMPLICIT) != 0) buffer.append("implicit "); if(regexp._mustString != null) buffer.append("must have \""+ new String(regexp._mustString) + "\" back " + regexp._back + " "); buffer.append("minlen " + regexp._minLength + '\n'); return buffer.toString(); } static void _printOperator(char[] program, int offset, StringBuffer buffer) { String str = null; buffer.append(":"); switch(program[offset]) { case OpCode._BOL : str = "BOL"; break; case OpCode._MBOL : str = "MBOL"; break; case OpCode._SBOL : str = "SBOL"; break; case OpCode._EOL : str = "EOL"; break; case OpCode._MEOL : str = "MEOL"; break; case OpCode._ANY : str = "ANY"; break; case OpCode._SANY : str = "SANY"; break; case OpCode._ANYOF : str = "ANYOF"; break; case OpCode._ANYOFUN : str = "ANYOFUN"; break; case OpCode._NANYOFUN : str = "NANYOFUN"; break; /* case OpCode._ANYOF : // debug buffer.append("ANYOF\n\n"); int foo = OpCode._OPERAND(offset); char ch; for(ch=0; ch < 256; ch++) { if(ch % 16 == 0) buffer.append(" "); buffer.append((program[foo + (ch >> 4)] & (1 << (ch & 0xf))) == 0 ? 
0 : 1); } buffer.append("\n\n"); break; */ case OpCode._BRANCH : str = "BRANCH"; break; case OpCode._EXACTLY : str = "EXACTLY"; break; case OpCode._NOTHING : str = "NOTHING"; break; case OpCode._BACK : str = "BACK"; break; case OpCode._END : str = "END"; break; case OpCode._ALNUM : str = "ALNUM"; break; case OpCode._NALNUM : str = "NALNUM"; break; case OpCode._BOUND : str = "BOUND"; break; case OpCode._NBOUND : str = "NBOUND"; break; case OpCode._SPACE : str = "SPACE"; break; case OpCode._NSPACE : str = "NSPACE"; break; case OpCode._DIGIT : str = "DIGIT"; break; case OpCode._NDIGIT : str = "NDIGIT"; break; case OpCode._ALPHA : str = "ALPHA"; break; case OpCode._BLANK : str = "BLANK"; break; case OpCode._CNTRL : str = "CNTRL"; break; case OpCode._GRAPH : str = "GRAPH"; break; case OpCode._LOWER : str = "LOWER"; break; case OpCode._PRINT : str = "PRINT"; break; case OpCode._PUNCT : str = "PUNCT"; break; case OpCode._UPPER : str = "UPPER"; break; case OpCode._XDIGIT : str = "XDIGIT"; break; case OpCode._ALNUMC : str = "ALNUMC"; break; case OpCode._ASCII : str = "ASCII"; break; case OpCode._CURLY : buffer.append("CURLY {"); buffer.append((int)OpCode._getArg1(program, offset)); buffer.append(','); buffer.append((int)OpCode._getArg2(program, offset)); buffer.append('}'); break; case OpCode._CURLYX: buffer.append("CURLYX {"); buffer.append((int)OpCode._getArg1(program, offset)); buffer.append(','); buffer.append((int)OpCode._getArg2(program, offset)); buffer.append('}'); break; case OpCode._REF: buffer.append("REF"); buffer.append((int)OpCode._getArg1(program, offset)); break; case OpCode._OPEN: buffer.append("OPEN"); buffer.append((int)OpCode._getArg1(program, offset)); break; case OpCode._CLOSE: buffer.append("CLOSE"); buffer.append((int)OpCode._getArg1(program, offset)); break; case OpCode._STAR : str = "STAR"; break; case OpCode._PLUS : str = "PLUS"; break; case OpCode._MINMOD : str = "MINMOD"; break; case OpCode._GBOL : str = "GBOL"; break; case OpCode._UNLESSM: str 
= "UNLESSM"; break; case OpCode._IFMATCH: str = "IFMATCH"; break; case OpCode._SUCCEED: str = "SUCCEED"; break; case OpCode._WHILEM : str = "WHILEM"; break; default: buffer.append("Operator is unrecognized. Faulty expression code!"); break; } if(str != null) buffer.append(str); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/package.html0000644000175000017500000001271407773723336024165 0ustar arnaudarnaud This package used to be the OROMatcher library and provides both generic regular expression interfaces and Perl5 regular expression compatible implementation classes.

Note: The following information will be moved into the user's guide.

Perl5 regular expressions

Here we summarize the syntax of Perl5.003 regular expressions, all of which is supported by the Perl5 classes in this package. However, for a definitive reference, you should consult the perlre man page that accompanies the Perl5 distribution and also the book Programming Perl, 2nd Edition from O'Reilly & Associates. We are working toward implementing the features added after Perl5.003 up to and including Perl 5.6. Please remember, we only guarantee support for Perl5.003 expressions in version 2.0.

  • Alternatives separated by |
  • Quantified atoms
    {n,m}
    Match at least n but not more than m times.
    {n,}
    Match at least n times.
    {n}
    Match exactly n times.
    *
    Match 0 or more times.
    +
    Match 1 or more times.
    ?
    Match 0 or 1 times.
  • Atoms
    • regular expression within parentheses
    • a . matches everything except \n
    • a ^ is a null token matching the beginning of a string or line (i.e., the position right after a newline or right before the beginning of a string)
    • a $ is a null token matching the end of a string or line (i.e., the position right before a newline or right after the end of a string)
    • Character classes (e.g., [abcd]) and ranges (e.g. [a-z])
      • Special backslashed characters work within a character class (except for backreferences and boundaries).
      • \b is backspace inside a character class
    • Special backslashed characters
      \b
      null token matching a word boundary (\w on one side and \W on the other)
      \B
      null token matching a boundary that isn't a word boundary
      \A
      Match only at beginning of string
      \Z
      Match only at end of string (or before newline at the end)
      \n
      newline
      \r
      carriage return
      \t
      tab
      \f
      formfeed
      \d
      digit [0-9]
      \D
      non-digit [^0-9]
      \w
      word character [0-9a-z_A-Z]
      \W
      a non-word character [^0-9a-z_A-Z]
      \s
      a whitespace character [ \t\n\r\f]
      \S
      a non-whitespace character [^ \t\n\r\f]
      \xnn
      hexadecimal representation of character
      \cD
      matches the corresponding control character
      \nn or \nnn
      octal representation of character unless a backreference.
      \1, \2, \3, etc.
      match whatever the first, second, third, etc. parenthesized group matched. This is called a backreference. If there is no corresponding group, the number is interpreted as an octal representation of a character.
      \0
      matches null character
      Any other backslashed character matches itself
  • Expressions within parentheses are matched as subpattern groups and saved for use by certain methods.
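
As a rough illustration of a few of the atoms listed above, the sketch below uses the JDK's java.util.regex package as a stand-in rather than the classes in this package, on the assumption that both treat these particular constructs ({n,m}, character ranges, \b, \w, \s and the backreference \1) the same way. The class AtomSketch and its helper methods are hypothetical names introduced only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; it is not part of this package and uses
// the JDK's java.util.regex.Pattern, not this package's Pattern interface.
final class AtomSketch {
    // Returns the first match of regex in input, or null when nothing matches.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    // Returns the given saved group of the first match, or null when nothing matches.
    static String firstGroup(String regex, String input, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(group) : null;
    }

    public static void main(String[] args) {
        // {n,m}: at least 2 but not more than 4 digits.
        System.out.println(firstMatch("\\d{2,4}", "a12345b"));
        // \b, \w, \s and the backreference \1: the same word twice in a row.
        String doubled = "\\b(\\w+)\\s+\\1\\b";
        System.out.println(firstMatch(doubled, "this is is a test"));
        System.out.println(firstGroup(doubled, "this is is a test", 1));
        // A character class containing a range.
        System.out.println(firstMatch("[a-c]+", "xxabcabdxx"));
    }
}
```

The backreference pattern relies on the parenthesized group being saved, which is the behavior described in the last item of the list above.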

By default, a quantified subpattern is greedy. In other words, it matches as many times as possible without causing the rest of the pattern not to match. To make a quantifier match the minimum number of times possible, again without causing the rest of the pattern not to match, place a "?" right after the quantifier.

*?
Match 0 or more times
+?
Match 1 or more times
??
Match 0 or 1 time
{n}?
Match exactly n times
{n,}?
Match at least n times
{n,m}?
Match at least n but not more than m times
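
The difference between greedy and minimal quantifiers can be sketched as follows. This is only an illustration: it uses the JDK's java.util.regex package as a stand-in for the classes in this package, assuming both handle these quantifiers identically, and the class name LazyQuantifierSketch and its firstMatch helper are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; it is not part of this package and uses
// the JDK's java.util.regex.Pattern, not this package's Pattern interface.
final class LazyQuantifierSketch {
    // Returns the first match of regex in input, or null when nothing matches.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String quoted = "say \"one\" and \"two\"";
        // Greedy: runs from the first quote to the last quote.
        System.out.println(firstMatch("\".*\"", quoted));
        // Minimal: stops at the earliest closing quote.
        System.out.println(firstMatch("\".*?\"", quoted));
        // Minimal quantifiers take as few repetitions as they are allowed.
        System.out.println(firstMatch("a+?", "aaa"));
        System.out.println(firstMatch("a{2,3}?", "aaa"));
        // The rest of the pattern must still match, so a+? is forced to
        // consume every 'a' before the 'b' here.
        System.out.println(firstMatch("a+?b", "aaab"));
    }
}
```

The last call shows that a minimal quantifier is not a fixed-length match: it grows only as far as the remainder of the pattern requires.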

Perl5 extended regular expressions are fully supported.

(?#text)
An embedded comment causing text to be ignored.
(?:regexp)
Groups things like "()" but doesn't cause the group match to be saved.
(?=regexp)
A zero-width positive lookahead assertion. For example, \w+(?=\s) matches a word followed by whitespace, without including whitespace in the MatchResult.
(?!regexp)
      A zero-width negative lookahead assertion. For example, foo(?!bar) matches any occurrence of "foo" that isn't followed by "bar". Remember that this is a zero-width assertion, which means that a(?!b)d will match ad because a is followed by a character that is not b (the d) and a d follows the zero-width assertion.
(?imsx)
One or more embedded pattern-match modifiers. i enables case insensitivity, m enables multiline treatment of the input, s enables single line treatment of the input, and x enables extended whitespace comments. jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/Pattern.java0000644000175000017500000001044207773723336024160 0ustar arnaudarnaud/* * $Id: Pattern.java,v 1.7 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. 
* * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; /** * The Pattern interface allows multiple representations of a regular * expression to be defined. In general, different regular expression * compilers will produce different types of pattern representations. * Some will produce state transition tables derived from syntax trees, * others will produce byte code representations of an NFA, etc. The * Pattern interface does not impose any specific internal pattern * representation, and consequently, Pattern implementations are not meant * to be interchangeable among differing PatternCompiler and PatternMatcher * implementations. The documentation accompanying a specific implementation * will define what other classes a Pattern can interact with. * * @version @version@ * @since 1.0 * @see PatternCompiler * @see PatternMatcher */ public interface Pattern { /** * This method returns the string representation of the pattern. 
Its * purpose is to allow a pattern to be reconstructed after compilation. * In other words, when you compile a pattern, the resulting data * structures bear no relation to the string defining the pattern. * It is often useful to be able to access the string defining a pattern * after it has been compiled. *

* @return The original string representation of the regular expression * pattern. */ public String getPattern(); /** * This method returns an integer containing the compilation options used * to compile this pattern. *

* @return The compilation options used to compile the pattern. */ public int getOptions(); } jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/CharStringPointer.java0000644000175000017500000001075407773723336026156 0ustar arnaudarnaud/* * $Id: CharStringPointer.java,v 1.7 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. 
* * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; /** * The CharStringPointer class is used to facilitate traversal of a char[] * in the manner pointer traversals of strings are performed in C/C++. * It is expected that the compiler will inline all the functions. 
* * @since 1.0 * @version @version@ */ final class CharStringPointer { static final char _END_OF_STRING = Character.MAX_VALUE; int _offset; char[] _array; CharStringPointer(char[] charArray, int offset) { _array = charArray; _offset = offset; } CharStringPointer(char[] charArray) { this(charArray, 0); } char _getValue() { return _getValue(_offset); } char _getValue(int offset) { if(offset < _array.length && offset >= 0) return _array[offset]; return _END_OF_STRING; } char _getValueRelative(int offset) { return _getValue(_offset + offset); } int _getLength() { return _array.length; } int _getOffset() { return _offset; } void _setOffset(int offset) { _offset = offset; } boolean _isAtEnd() { return (_offset >= _array.length); } char _increment(int inc) { _offset+=inc; if(_isAtEnd()) { _offset = _array.length; return _END_OF_STRING; } return _array[_offset]; } char _increment() { return _increment(1); } char _decrement(int inc) { _offset-=inc; if(_offset < 0) _offset = 0; return _array[_offset]; } char _decrement() { return _decrement(1); } char _postIncrement() { char ret; ret = _getValue(); _increment(); return ret; } char _postDecrement() { char ret; ret = _getValue(); _decrement(); return ret; } String _toString(int offset) { return new String(_array, offset, _array.length - offset); } public String toString() { return _toString(0); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/StringSubstitution.java0000644000175000017500000001315107773723336026446 0ustar arnaudarnaud/* * $Id: StringSubstitution.java,v 1.8 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. 
Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; /** * StringSubstitution implements a Substitution consisting of a simple * literal string. This class is intended for use with * {@link Util#substitute Util.substitute}. * * @version @version@ * @since 1.1 * @see Substitution * @see Util * @see Util#substitute * @see Substitution * @see Perl5Substitution */ public class StringSubstitution implements Substitution { int _subLength; String _substitution; /** * Default constructor initializing substitution to a zero length * String. */ public StringSubstitution() { this(""); } /** * Creates a StringSubstitution representing the given string. *

* @param substitution The string to use as a substitution. */ public StringSubstitution(String substitution) { setSubstitution(substitution); } /** * Sets the substitution represented by this StringSubstitution. You * should use this method in order to avoid repeatedly allocating new * StringSubstitutions. It is recommended that you allocate a single * StringSubstitution and reuse it by using this method when appropriate. *

* @param substitution The string to use as a substitution. */ public void setSubstitution(String substitution) { _substitution = substitution; _subLength = substitution.length(); } /** * Returns the string substitution represented by this object. *

* @return The string substitution represented by this object. */ public String getSubstitution() { return _substitution; } /** * Returns the same value as {@link #getSubstitution()}. *

* @return The string substitution represented by this object. */ public String toString() { return getSubstitution(); } /** * Appends the substitution to a buffer containing the original input * with substitutions applied for the pattern matches found so far. * See * {@link Substitution#appendSubstitution Substitution.appendSubstitution()} * for more details regarding the expected behavior of this method. *

* @param appendBuffer The buffer containing the new string resulting * from performing substitutions on the original input. * @param match The current match causing a substitution to be made. * @param substitutionCount The number of substitutions that have been * performed so far by Util.substitute. * @param originalInput The original input upon which the substitutions are * being performed. This is a read-only parameter and is not modified. * @param matcher The PatternMatcher used to find the current match. * @param pattern The Pattern used to find the current match. */ public void appendSubstitution(StringBuffer appendBuffer, MatchResult match, int substitutionCount, PatternMatcherInput originalInput, PatternMatcher matcher, Pattern pattern) { if(_subLength == 0) return; appendBuffer.append(_substitution); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/PatternMatcher.java0000644000175000017500000003113107773723336025462 0ustar arnaudarnaud/* * $Id: PatternMatcher.java,v 1.7 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." 
* Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; /** * The PatternMatcher interface defines the operations a regular * expression matcher must implement. However, the types of the Pattern * implementations recognized by a matcher are not restricted. Typically * PatternMatcher instances will only recognize a specific type of Pattern. 
* For example, the Perl5Matcher only recognizes Perl5Pattern instances. * However, none of the PatternMatcher methods are required to throw an * exception in case of the use of an invalid pattern. This is done for * efficiency reasons, although usually a ClassCastException will be * thrown by the Java runtime system if you use the wrong Pattern * implementation. It is the responsibility of the programmer to make * sure he uses the correct Pattern instance with a given PatternMatcher * instance. The current version of this package only contains the Perl5 * suite of pattern matching classes, but future ones for other regular * expression grammars may be added and users may also create their own * implementations of the provided interfaces. Therefore the programmer * should be careful not to mismatch classes. * * @version @version@ * @since 1.0 * @see Pattern * @see PatternCompiler * @see MatchResult */ public interface PatternMatcher { /** * Determines if a prefix of a string (represented as a char[]) * matches a given pattern, starting from a given offset into the string. * If a prefix of the string matches the pattern, a MatchResult instance * representing the match is made accessible via * {@link #getMatch()}. *

* This method is useful for certain common token identification tasks * that are made more difficult without this functionality. *
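The token identification use case mentioned above can be sketched as a tiny tokenizer that repeatedly asks whether a pattern matches exactly at the current offset. This is only an illustration under stated assumptions: it uses the JDK's java.util.regex classes (Matcher.region and Matcher.lookingAt) as a stand-in for a prefix match at an offset, and the names PrefixTokenizerSketch, prefixLength and tokenize are hypothetical rather than part of this interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PrefixTokenizerSketch {
    private static final Pattern SPACE = Pattern.compile("\\s+");
    private static final Pattern NUMBER = Pattern.compile("\\d+");
    private static final Pattern WORD = Pattern.compile("[A-Za-z_]\\w*");

    // Length of the match of p that starts exactly at offset, or -1 if there is none.
    static int prefixLength(Pattern p, CharSequence input, int offset) {
        Matcher m = p.matcher(input).region(offset, input.length());
        return m.lookingAt() ? m.end() - offset : -1;
    }

    // Splits input into tokens by trying each token pattern at the current offset.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int offset = 0;
        while (offset < input.length()) {
            int len;
            if ((len = prefixLength(SPACE, input, offset)) > 0) {
                // Whitespace separates tokens and is not reported.
            } else if ((len = prefixLength(NUMBER, input, offset)) > 0) {
                tokens.add("NUMBER:" + input.substring(offset, offset + len));
            } else if ((len = prefixLength(WORD, input, offset)) > 0) {
                tokens.add("WORD:" + input.substring(offset, offset + len));
            } else {
                // Anything else is reported as a single-character token.
                tokens.add("CHAR:" + input.charAt(offset));
                len = 1;
            }
            offset += len;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("x1 = 42"));
    }
}
```

Without a prefix test, each step would need either a substring copy or an anchored variant of every pattern, which is the extra difficulty the paragraph above alludes to.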

* @param input The char[] to test for a prefix match. * @param pattern The Pattern to be matched. * @param offset The offset at which to start searching for the prefix. * @return True if a prefix of input matches pattern, false otherwise. */ public boolean matchesPrefix(char[] input, Pattern pattern, int offset); /** * Determines if a prefix of a string matches a given pattern. * If a prefix of the string matches the pattern, a MatchResult instance * representing the match is made accessible via * {@link #getMatch()}. *

* This method is useful for certain common token identification tasks * that are made more difficult without this functionality. *

* @param input The String to test for a prefix match. * @param pattern The Pattern to be matched. * @return True if a prefix of input matches pattern, false otherwise. */ public boolean matchesPrefix(String input, Pattern pattern); /** * Determines if a prefix of a string (represented as a char[]) * matches a given pattern. * If a prefix of the string matches the pattern, a MatchResult instance * representing the match is made accessible via * {@link #getMatch()}. *

* This method is useful for certain common token identification tasks * that are made more difficult without this functionality. *

* @param input The char[] to test for a prefix match. * @param pattern The Pattern to be matched. * @return True if a prefix of input matches pattern, false otherwise. */ public boolean matchesPrefix(char[] input, Pattern pattern); /** * Determines if a prefix of a PatternMatcherInput instance * matches a given pattern. If there is a match, a MatchResult instance * representing the match is made accessible via * {@link #getMatch()}. Unlike the * {@link #contains(PatternMatcherInput, Pattern)} * method, the current offset of the PatternMatcherInput argument * is not updated. You should remember that the region starting * from the begin offset of the PatternMatcherInput will be * tested for a prefix match. *

* This method is useful for certain common token identification tasks * that are made more difficult without this functionality. *

* @param input The PatternMatcherInput to test for a prefix match. * @param pattern The Pattern to be matched. * @return True if a prefix of input matches pattern, false otherwise. */ public boolean matchesPrefix(PatternMatcherInput input, Pattern pattern); /** * Determines if a string exactly matches a given pattern. If * there is an exact match, a MatchResult instance * representing the match is made accessible via * {@link #getMatch()}. *

* @param input The String to test for an exact match. * @param pattern The Pattern to be matched. * @return True if input matches pattern, false otherwise. */ public boolean matches(String input, Pattern pattern); /** * Determines if a string (represented as a char[]) exactly matches * a given pattern. If there is an exact match, a MatchResult * instance representing the match is made accessible via * {@link #getMatch()}. *

* @param input The char[] to test for a match. * @param pattern The Pattern to be matched. * @return True if input matches pattern, false otherwise. */ public boolean matches(char[] input, Pattern pattern); /** * Determines if the contents of a PatternMatcherInput instance * exactly match a given pattern. If * there is an exact match, a MatchResult instance * representing the match is made accessible via * {@link #getMatch()}. Unlike the * {@link #contains(PatternMatcherInput, Pattern)} * method, the current offset of the PatternMatcherInput argument * is not updated. You should remember that the region between * the begin and end offsets of the PatternMatcherInput will be * tested for an exact match. *

* @param input The PatternMatcherInput to test for a match. * @param pattern The Pattern to be matched. * @return True if input matches pattern, false otherwise. */ public boolean matches(PatternMatcherInput input, Pattern pattern); /** * Determines if a string contains a pattern. If the pattern is * matched by some substring of the input, a MatchResult instance * representing the first such match is made accessible via * {@link #getMatch()}. If you want to access * subsequent matches you should either use a PatternMatcherInput object * or use the offset information in the MatchResult to create a substring * representing the remaining input. Using the MatchResult offset * information is the recommended method of obtaining the parts of the * string preceding the match and following the match. *

* @param input The String to test for a match. * @param pattern The Pattern to be matched. * @return True if the input contains a pattern match, false otherwise. */ public boolean contains(String input, Pattern pattern); /** * Determines if a string (represented as a char[]) contains a pattern. * If the pattern is matched by some substring of the input, a MatchResult * instance representing the first such match is made accessible via * {@link #getMatch()}. If you want to access * subsequent matches you should either use a PatternMatcherInput object * or use the offset information in the MatchResult to create a substring * representing the remaining input. Using the MatchResult offset * information is the recommended method of obtaining the parts of the * string preceding the match and following the match. *

* @param input The char[] to test for a match. * @param pattern The Pattern to be matched. * @return True if the input contains a pattern match, false otherwise. */ public boolean contains(char[] input, Pattern pattern); /** * Determines if the contents of a PatternMatcherInput, starting from the * current offset of the input, contains a pattern. * If a pattern match is found, a MatchResult * instance representing the first such match is made accessible via * {@link #getMatch()}. The current offset of the * PatternMatcherInput is set to the offset corresponding to the end * of the match, so that a subsequent call to this method will continue * searching where the last call left off. You should remember that the * region between the begin and end offsets of the PatternMatcherInput is * considered the input to be searched, and that the current offset * of the PatternMatcherInput reflects where a search will start from. * Matches extending beyond the end offset of the PatternMatcherInput * will not be matched. In other words, a match must occur entirely * between the begin and end offsets of the input. See * {@link PatternMatcherInput} for more details. *

* This method is usually used in a loop as follows: *

   * PatternMatcher matcher;
   * PatternCompiler compiler;
   * Pattern pattern;
   * PatternMatcherInput input;
   * MatchResult result;
   *
   * compiler = new Perl5Compiler();
   * matcher  = new Perl5Matcher();
   *
   * try {
   *   pattern = compiler.compile(somePatternString);
   * } catch(MalformedPatternException e) {
   *   System.out.println("Bad pattern.");
   *   System.out.println(e.getMessage());
   *   return;
   * }
   *
   * input   = new PatternMatcherInput(someStringInput);
   *
   * while(matcher.contains(input, pattern)) {
   *   result = matcher.getMatch();  
   *   // Perform whatever processing on the result you want.
   * }
   *
   * 
*
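For readers who want to experiment with the offset behavior described above without this package on the classpath, the following is a rough analogue built on the JDK's java.util.regex classes. It is an assumption-laden stand-in, not this interface: Matcher.region plays the role of the begin and end offsets, and each Matcher.find call resumes where the previous match ended. MatchLoopSketch and matchesInRegion are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchLoopSketch {
    // Collects every match of regex that lies entirely within [begin, end) of text,
    // formatted as "start-end:matchedText" using offsets into the full text.
    static List<String> matchesInRegion(String regex, String text, int begin, int end) {
        Matcher m = Pattern.compile(regex).matcher(text).region(begin, end);
        List<String> found = new ArrayList<>();
        // Each successful find() continues from the end of the previous match.
        while (m.find()) {
            found.add(m.start() + "-" + m.end() + ":" + m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "10 200 3000";
        System.out.println(matchesInRegion("\\d+", text, 0, text.length()));
        // A narrower region: nothing before offset 3 or past offset 9 is considered.
        System.out.println(matchesInRegion("\\d+", text, 3, 9));
    }
}
```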

* @param input The PatternMatcherInput to test for a match. * @param pattern The Pattern to be matched. * @return True if the input contains a pattern match, false otherwise. */ public boolean contains(PatternMatcherInput input, Pattern pattern); /** * Fetches the last match found by a call to a matches() or contains() * method. *

* @return A MatchResult instance containing the pattern match found * by the last call to any one of the matches() or contains() * methods. If no match was found by the last call, * returns null. */ public MatchResult getMatch(); } jakarta-oro-2.0.8/src/java/org/apache/oro/text/regex/Perl5Matcher.java0000644000175000017500000015725007773723336025047 0ustar arnaudarnaud/* * $Id: Perl5Matcher.java,v 1.27 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. 
* * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.regex; import java.util.*; /** * The Perl5Matcher class is used to match regular expressions * (conforming to the Perl5 regular expression syntax) generated by * Perl5Compiler. *

* Perl5Compiler and Perl5Matcher are designed with the intent that * you use a separate instance of each per thread to avoid the overhead * of both synchronization and concurrent access (e.g., a match that takes * a long time in one thread will block the progress of another thread with * a shorter match). If you want to use a single instance of each * in a concurrent program, you must appropriately protect access to * the instances with critical sections. If you want to share Perl5Pattern * instances between concurrently executing instances of Perl5Matcher, you * must compile the patterns with {@link Perl5Compiler#READ_ONLY_MASK}. * * @version @version@ * @since 1.0 * @see PatternMatcher * @see Perl5Compiler */ public final class Perl5Matcher implements PatternMatcher { private static final char __EOS = Character.MAX_VALUE; private static final int __INITIAL_NUM_OFFSETS = 20; private boolean __multiline = false, __lastSuccess = false; private boolean __caseInsensitive = false; private char __previousChar, __input[], __originalInput[]; private Perl5Repetition __currentRep; private int __numParentheses, __bol, __eol, __currentOffset, __endOffset; private char[] __program; private int __expSize, __inputOffset, __lastParen; private int[] __beginMatchOffsets, __endMatchOffsets; private Stack __stack = new Stack(); private Perl5MatchResult __lastMatchResult = null; private static boolean __compare(char[] s1, int s1Offs, char[] s2, int s2Offs, int n) { int cnt; for(cnt = 0; cnt < n; cnt++, s1Offs++, s2Offs++) { if(s1Offs >= s1.length) return false; if(s2Offs >= s2.length) return false; if(s1[s1Offs] != s2[s2Offs]) return false; } return true; } private static int __findFirst(char[] input, int current, int endOffset, char[] mustString) { int count, saveCurrent; char ch; if(input.length == 0) return endOffset; ch = mustString[0]; // Find the offset of the first character of the must string while(current < endOffset) { if(ch == input[current]){ saveCurrent = current; count = 
0; while(current < endOffset && count < mustString.length) { if(mustString[count] != input[current]) break; ++count; ++current; } current = saveCurrent; if(count >= mustString.length) break; } ++current; } return current; } private void __pushState(int parenFloor) { int[] state; int stateEntries, paren; stateEntries = 3*(__expSize - parenFloor); if(stateEntries <= 0) state = new int[3]; else state = new int[stateEntries + 3]; state[0] = __expSize; state[1] = __lastParen; state[2] = __inputOffset; for(paren = __expSize; paren > parenFloor; --paren, stateEntries-=3) { state[stateEntries] = __endMatchOffsets[paren]; state[stateEntries + 1] = __beginMatchOffsets[paren]; state[stateEntries + 2] = paren; } __stack.push(state); } private void __popState() { int[] state; int entry, paren; state = (int[])__stack.pop(); __expSize = state[0]; __lastParen = state[1]; __inputOffset = state[2]; for(entry = 3; entry < state.length; entry+=3) { paren = state[entry + 2]; __beginMatchOffsets[paren] = state[entry + 1]; if(paren <= __lastParen) __endMatchOffsets[paren] = state[entry]; } for(paren = __lastParen + 1; paren <= __numParentheses; paren++) { if(paren > __expSize) __beginMatchOffsets[paren] = OpCode._NULL_OFFSET; __endMatchOffsets[paren] = OpCode._NULL_OFFSET; } } // Initialize globals needed before calling __tryExpression for first time private void __initInterpreterGlobals(Perl5Pattern expression, char[] input, int beginOffset, int endOffset, int currentOffset) { // Remove this hack after more efficient case-folding and unicode // character classes are implemented __caseInsensitive = expression._isCaseInsensitive; __input = input; __endOffset = endOffset; __currentRep = new Perl5Repetition(); __currentRep._numInstances = 0; __currentRep._lastRepetition = null; __program = expression._program; __stack.setSize(0); // currentOffset should always be >= beginOffset and should // always be equal to zero when beginOffset equals 0, but we // make a weak attempt to protect against 
a violation of this // precondition if(currentOffset == beginOffset || currentOffset <= 0) __previousChar = '\n'; else { __previousChar = input[currentOffset - 1]; if(!__multiline && __previousChar == '\n') __previousChar = '\0'; } __numParentheses = expression._numParentheses; __currentOffset = currentOffset; __bol = beginOffset; __eol = endOffset; // Ok, here we're using endOffset as a temporary variable. endOffset = __numParentheses + 1; if(__beginMatchOffsets == null || endOffset > __beginMatchOffsets.length) { if(endOffset < __INITIAL_NUM_OFFSETS) endOffset = __INITIAL_NUM_OFFSETS; __beginMatchOffsets = new int[endOffset]; __endMatchOffsets = new int[endOffset]; } } // Set the match result information. Only call this if we successfully // matched. private void __setLastMatchResult() { int offs, maxEndOffs = 0; //endOffset+=dontTry; __lastMatchResult = new Perl5MatchResult(__numParentheses + 1); // This can happen when using Perl5StreamInput if(__endMatchOffsets[0] > __originalInput.length) throw new ArrayIndexOutOfBoundsException(); __lastMatchResult._matchBeginOffset = __beginMatchOffsets[0]; while(__numParentheses >= 0) { offs = __beginMatchOffsets[__numParentheses]; if(offs >= 0) __lastMatchResult._beginGroupOffset[__numParentheses] = offs - __lastMatchResult._matchBeginOffset; else __lastMatchResult._beginGroupOffset[__numParentheses] = OpCode._NULL_OFFSET; offs = __endMatchOffsets[__numParentheses]; if(offs >= 0) { __lastMatchResult._endGroupOffset[__numParentheses] = offs - __lastMatchResult._matchBeginOffset; if(offs > maxEndOffs && offs <= __originalInput.length) maxEndOffs = offs; } else __lastMatchResult._endGroupOffset[__numParentheses] = OpCode._NULL_OFFSET; --__numParentheses; } __lastMatchResult._match = new String(__originalInput, __beginMatchOffsets[0], maxEndOffs - __beginMatchOffsets[0]); // Free up for garbage collection __originalInput = null; } // Expects to receive a valid regular expression program. 
No checking // is done to ensure validity. // __originalInput must be set before calling this method for // __lastMatchResult to be set correctly. // beginOffset marks the beginning of the string // currentOffset marks where to start the pattern search private boolean __interpret(Perl5Pattern expression, char[] input, int beginOffset, int endOffset, int currentOffset) { boolean success; int minLength = 0, dontTry = 0, offset; char ch, mustString[]; __initInterpreterGlobals(expression, input, beginOffset, endOffset, currentOffset); success = false; mustString = expression._mustString; _mainLoop: while(true) { if(mustString != null && ((expression._anchor & Perl5Pattern._OPT_ANCH) == 0 || ((__multiline || (expression._anchor & Perl5Pattern._OPT_ANCH_MBOL) != 0) && expression._back >= 0))) { __currentOffset = __findFirst(__input, __currentOffset, endOffset, mustString); if(__currentOffset >= endOffset) { if((expression._options & Perl5Compiler.READ_ONLY_MASK) == 0) expression._mustUtility++; success = false; break _mainLoop; } else if(expression._back >= 0) { __currentOffset-=expression._back; if(__currentOffset < currentOffset) __currentOffset = currentOffset; minLength = expression._back + mustString.length; } else if(!expression._isExpensive && (expression._options & Perl5Compiler.READ_ONLY_MASK) == 0 && (--expression._mustUtility < 0)) { // Be careful! The preceding logical expression is constructed // so that mustUtility is only decremented if the expression is // compiled without READ_ONLY_MASK. 
mustString = expression._mustString = null; __currentOffset = currentOffset; } else { __currentOffset = currentOffset; minLength = mustString.length; } } if((expression._anchor & Perl5Pattern._OPT_ANCH) != 0) { if(__currentOffset == beginOffset && __tryExpression(beginOffset)) { success = true; break _mainLoop; } else if(__multiline || (expression._anchor & Perl5Pattern._OPT_ANCH_MBOL) != 0 || (expression._anchor & Perl5Pattern._OPT_IMPLICIT) != 0) { if(minLength > 0) dontTry = minLength - 1; endOffset-=dontTry; if(__currentOffset > currentOffset) --__currentOffset; while(__currentOffset < endOffset) { if(__input[__currentOffset++] == '\n') { if(__currentOffset < endOffset && __tryExpression(__currentOffset)) { success = true; break _mainLoop; } } } } break _mainLoop; } if(expression._startString != null) { mustString = expression._startString; if((expression._anchor & Perl5Pattern._OPT_SKIP) != 0) { ch = mustString[0]; while(__currentOffset < endOffset) { if(ch == __input[__currentOffset]) { if(__tryExpression(__currentOffset)){ success = true; break _mainLoop; } ++__currentOffset; while(__currentOffset < endOffset && __input[__currentOffset] == ch) ++__currentOffset; } ++__currentOffset; } } else { while((__currentOffset = __findFirst(__input, __currentOffset, endOffset, mustString)) < endOffset){ if(__tryExpression(__currentOffset)) { success = true; break _mainLoop; } ++__currentOffset; } } break _mainLoop; } if((offset = expression._startClassOffset) != OpCode._NULL_OFFSET) { boolean doEvery, tmp; char op; doEvery = ((expression._anchor & Perl5Pattern._OPT_SKIP) == 0); if(minLength > 0) dontTry = minLength - 1; endOffset -= dontTry; tmp = true; switch(op = __program[offset]) { case OpCode._ANYOF: offset = OpCode._getOperand(offset); while(__currentOffset < endOffset) { ch = __input[__currentOffset]; if(ch < 256 && (__program[offset + (ch >> 4)] & (1 << (ch & 0xf))) == 0) { if(tmp && __tryExpression(__currentOffset)) { success = true; break _mainLoop; } else 
tmp = doEvery; } else tmp = true; ++__currentOffset; } break; case OpCode._ANYOFUN: case OpCode._NANYOFUN: offset = OpCode._getOperand(offset); while(__currentOffset < endOffset) { ch = __input[__currentOffset]; if(__matchUnicodeClass(ch, __program, offset, op)) { if(tmp && __tryExpression(__currentOffset)) { success = true; break _mainLoop; } else tmp = doEvery; } else tmp = true; ++__currentOffset; } break; case OpCode._BOUND: if(minLength > 0) { ++dontTry; --endOffset; } if(__currentOffset != beginOffset) { ch = __input[__currentOffset - 1]; tmp = OpCode._isWordCharacter(ch); } else tmp = OpCode._isWordCharacter(__previousChar); while(__currentOffset < endOffset) { ch = __input[__currentOffset]; if(tmp != OpCode._isWordCharacter(ch)){ tmp = !tmp; if(__tryExpression(__currentOffset)) { success = true; break _mainLoop; } } ++__currentOffset; } if((minLength > 0 || tmp) && __tryExpression(__currentOffset)) { success = true; break _mainLoop; } break; case OpCode._NBOUND: if(minLength > 0) { ++dontTry; --endOffset; } if(__currentOffset != beginOffset) { ch = __input[__currentOffset - 1]; tmp = OpCode._isWordCharacter(ch); } else tmp = OpCode._isWordCharacter(__previousChar); while(__currentOffset < endOffset) { ch = __input[__currentOffset]; if(tmp != OpCode._isWordCharacter(ch)) tmp = !tmp; else if(__tryExpression(__currentOffset)) { success = true; break _mainLoop; } ++__currentOffset; } if((minLength > 0 || !tmp) && __tryExpression(__currentOffset)) { success = true; break _mainLoop; } break; case OpCode._ALNUM: while(__currentOffset < endOffset) { ch = __input[__currentOffset]; if(OpCode._isWordCharacter(ch)) { if(tmp && __tryExpression(__currentOffset)) { success = true; break _mainLoop; } else tmp = doEvery; } else tmp = true; ++__currentOffset; } break; case OpCode._NALNUM: while(__currentOffset < endOffset) { ch = __input[__currentOffset]; if(!OpCode._isWordCharacter(ch)) { if(tmp && __tryExpression(__currentOffset)) { success = true; break _mainLoop; } else 
tmp = doEvery; } else tmp = true; ++__currentOffset; } break; case OpCode._SPACE: while(__currentOffset < endOffset) { if(Character.isWhitespace(__input[__currentOffset])) { if(tmp && __tryExpression(__currentOffset)) { success = true; break _mainLoop; } else tmp = doEvery; } else tmp = true; ++__currentOffset; } break; case OpCode._NSPACE: while(__currentOffset < endOffset) { if(!Character.isWhitespace(__input[__currentOffset])) { if(tmp && __tryExpression(__currentOffset)) { success = true; break _mainLoop; } else tmp = doEvery; } else tmp = true; ++__currentOffset; } break; case OpCode._DIGIT: while(__currentOffset < endOffset) { if(Character.isDigit(__input[__currentOffset])) { if(tmp && __tryExpression(__currentOffset)) { success = true; break _mainLoop; } else tmp = doEvery; } else tmp = true; ++__currentOffset; } break; case OpCode._NDIGIT: while(__currentOffset < endOffset) { if(!Character.isDigit(__input[__currentOffset])) { if(tmp && __tryExpression(__currentOffset)) { success = true; break _mainLoop; } else tmp = doEvery; } else tmp = true; ++__currentOffset; } break; } // end switch } else { if(minLength > 0) dontTry = minLength - 1; endOffset-=dontTry; do { if(__tryExpression(__currentOffset)) { success = true; break _mainLoop; } } while(__currentOffset++ < endOffset); } break _mainLoop; } // end while __lastSuccess = success; __lastMatchResult = null; return success; } private boolean __matchUnicodeClass(char code, char __program[], int offset ,char opcode) { boolean isANYOF = ( opcode == OpCode._ANYOFUN ); while( __program[offset] != OpCode._END ){ if( __program[offset] == OpCode._RANGE ){ offset++; if((code >= __program[offset]) && (code <= __program[offset+1])){ return isANYOF; } else { offset+=2; } } else if(__program[offset] == OpCode._ONECHAR) { offset++; if(__program[offset++] == code) return isANYOF; } else { isANYOF = (__program[offset] == OpCode._OPCODE) ? 
isANYOF : !isANYOF; offset++; switch ( __program[offset++] ) { case OpCode._ALNUM: if(OpCode._isWordCharacter(code)) return isANYOF; break; case OpCode._NALNUM: if(!OpCode._isWordCharacter(code)) return isANYOF; break; case OpCode._SPACE: if(Character.isWhitespace(code)) return isANYOF; break; case OpCode._NSPACE: if(!Character.isWhitespace(code)) return isANYOF; break; case OpCode._DIGIT: if(Character.isDigit(code)) return isANYOF; break; case OpCode._NDIGIT: if(!Character.isDigit(code)) return isANYOF; break; case OpCode._ALNUMC: if(Character.isLetterOrDigit(code)) return isANYOF; break; case OpCode._ALPHA: if(Character.isLetter(code)) return isANYOF; break; case OpCode._BLANK: if(Character.isSpaceChar(code)) return isANYOF; break; case OpCode._CNTRL: if(Character.isISOControl(code)) return isANYOF; break; case OpCode._LOWER: if(Character.isLowerCase(code)) return isANYOF; // Remove this hack after more efficient case-folding and unicode // character classes are implemented if(__caseInsensitive && Character.isUpperCase(code)) return isANYOF; break; case OpCode._UPPER: if(Character.isUpperCase(code)) return isANYOF; // Remove this hack after more efficient case-folding and unicode // character classes are implemented if(__caseInsensitive && Character.isLowerCase(code)) return isANYOF; break; case OpCode._PRINT: if(Character.isSpaceChar(code)) return isANYOF; // Fall through to check if the character is alphanumeric, // or a punctuation mark. Printable characters are either // alphanumeric, punctuation marks, or spaces. case OpCode._GRAPH: if(Character.isLetterOrDigit(code)) return isANYOF; // Fall through to check if the character is a punctuation mark. // Graph characters are either alphanumeric or punctuation. 
case OpCode._PUNCT: switch ( Character.getType(code) ) { case Character.DASH_PUNCTUATION: case Character.START_PUNCTUATION: case Character.END_PUNCTUATION: case Character.CONNECTOR_PUNCTUATION: case Character.OTHER_PUNCTUATION: case Character.MATH_SYMBOL: case Character.CURRENCY_SYMBOL: case Character.MODIFIER_SYMBOL: return isANYOF; default: break; } break; case OpCode._XDIGIT: if( (code >= '0' && code <= '9') || (code >= 'a' && code <= 'f') || (code >= 'A' && code <= 'F')) return isANYOF; break; case OpCode._ASCII: if(code < 0x80)return isANYOF; } } } return !isANYOF; } private boolean __tryExpression(int offset) { int count; __inputOffset = offset; __lastParen = 0; __expSize = 0; if(__numParentheses > 0) { for(count=0; count <= __numParentheses; count++) { __beginMatchOffsets[count] = OpCode._NULL_OFFSET; __endMatchOffsets[count] = OpCode._NULL_OFFSET; } } if(__match(1)){ __beginMatchOffsets[0] = offset; __endMatchOffsets[0] = __inputOffset; return true; } return false; } private int __repeat(int offset, int max) { int scan, eol, operand, ret; char ch; char op; scan = __inputOffset; eol = __eol; if(max != Character.MAX_VALUE && max < eol - scan) eol = scan + max; operand = OpCode._getOperand(offset); switch(op = __program[offset]) { case OpCode._ANY: while(scan < eol && __input[scan] != '\n') ++scan; break; case OpCode._SANY: scan = eol; break; case OpCode._EXACTLY: ++operand; while(scan < eol && __program[operand] == __input[scan]) ++scan; break; case OpCode._ANYOF: if(scan < eol && (ch = __input[scan]) < 256) { while((ch < 256 ) && (__program[operand + (ch >> 4)] & (1 << (ch & 0xf))) == 0) { if(++scan < eol) ch = __input[scan]; else break; } } break; case OpCode._ANYOFUN: case OpCode._NANYOFUN: if(scan < eol) { ch = __input[scan]; while(__matchUnicodeClass(ch, __program, operand, op)){ if(++scan < eol) ch = __input[scan]; else break; } } break; case OpCode._ALNUM: while(scan < eol && OpCode._isWordCharacter(__input[scan])) ++scan; break; case OpCode._NALNUM: 
while(scan < eol && !OpCode._isWordCharacter(__input[scan])) ++scan; break; case OpCode._SPACE: while(scan < eol && Character.isWhitespace(__input[scan])) ++scan; break; case OpCode._NSPACE: while(scan < eol && !Character.isWhitespace(__input[scan])) ++scan; break; case OpCode._DIGIT: while(scan < eol && Character.isDigit(__input[scan])) ++scan; break; case OpCode._NDIGIT: while(scan < eol && !Character.isDigit(__input[scan])) ++scan; break; default: break; } ret = scan - __inputOffset; __inputOffset = scan; return ret; } private boolean __match(int offset) { char nextChar, op; int scan, next, input, maxScan, current, line, arg; boolean inputRemains = true, minMod = false; Perl5Repetition rep; input = __inputOffset; inputRemains = (input < __endOffset); nextChar = (inputRemains ? __input[input] : __EOS); scan = offset; maxScan = __program.length; while(scan < maxScan /*&& scan > 0*/){ next = OpCode._getNext(__program, scan); switch(op = __program[scan]) { case OpCode._BOL: if(input == __bol ? __previousChar == '\n' : (__multiline && (inputRemains || input < __eol) && __input[input - 1] == '\n')) break; return false; case OpCode._MBOL: if(input == __bol ? __previousChar == '\n' : ((inputRemains || input < __eol) && __input[input - 1] == '\n')) break; return false; case OpCode._SBOL: if(input == __bol && __previousChar == '\n') break; return false; case OpCode._GBOL: if(input == __bol) break; return true; case OpCode._EOL : if((inputRemains || input < __eol) && nextChar != '\n') return false; if(!__multiline && __eol - input > 1) return false; break; case OpCode._MEOL: if((inputRemains || input < __eol) && nextChar != '\n') return false; break; case OpCode._SEOL: if((inputRemains || input < __eol) && nextChar != '\n') return false; if(__eol - input > 1) return false; break; case OpCode._SANY: if(!inputRemains && input >= __eol) return false; inputRemains = (++input < __endOffset); nextChar = (inputRemains ? 
__input[input] : __EOS); break; case OpCode._ANY: if((!inputRemains && input >= __eol) || nextChar == '\n') return false; inputRemains = (++input < __endOffset); nextChar = (inputRemains ? __input[input] : __EOS); break; case OpCode._EXACTLY: current = OpCode._getOperand(scan); line = __program[current++]; if(__program[current] != nextChar) return false; if(__eol - input < line) return false; if(line > 1 && !__compare(__program, current, __input, input, line)) return false; input+=line; inputRemains = (input < __endOffset); nextChar = (inputRemains ? __input[input] : __EOS); break; case OpCode._ANYOF: current = OpCode._getOperand(scan); if(nextChar == __EOS && inputRemains) nextChar = __input[input]; if(nextChar >= 256 || (__program[current + (nextChar >> 4)] & (1 << (nextChar & 0xf))) != 0) return false; if(!inputRemains && input >= __eol) return false; inputRemains = (++input < __endOffset); nextChar = (inputRemains ? __input[input] : __EOS); break; case OpCode._ANYOFUN: case OpCode._NANYOFUN: current = OpCode._getOperand(scan); if(nextChar == __EOS && inputRemains) nextChar = __input[input]; if(!__matchUnicodeClass(nextChar, __program, current, op)) return false; if(!inputRemains && input >= __eol) return false; inputRemains = (++input < __endOffset); nextChar = (inputRemains ? __input[input] : __EOS); break; case OpCode._ALNUM: if(!inputRemains) return false; if(!OpCode._isWordCharacter(nextChar)) return false; inputRemains = (++input < __endOffset); nextChar = (inputRemains ? __input[input] : __EOS); break; case OpCode._NALNUM: if(!inputRemains && input >= __eol) return false; if(OpCode._isWordCharacter(nextChar)) return false; inputRemains = (++input < __endOffset); nextChar = (inputRemains ? 
__input[input] : __EOS); break; case OpCode._NBOUND: case OpCode._BOUND: boolean a, b; if(input == __bol) a = OpCode._isWordCharacter(__previousChar); else a = OpCode._isWordCharacter(__input[input - 1]); b = OpCode._isWordCharacter(nextChar); if((a == b) == (__program[scan] == OpCode._BOUND)) return false; break; case OpCode._SPACE: if(!inputRemains && input >= __eol) return false; if(!Character.isWhitespace(nextChar)) return false; inputRemains = (++input < __endOffset); nextChar = (inputRemains ? __input[input] : __EOS); break; case OpCode._NSPACE: if(!inputRemains) return false; if(Character.isWhitespace(nextChar)) return false; inputRemains = (++input < __endOffset); nextChar = (inputRemains ? __input[input] : __EOS); break; case OpCode._DIGIT: if(!Character.isDigit(nextChar)) return false; inputRemains = (++input < __endOffset); nextChar = (inputRemains ? __input[input] : __EOS); break; case OpCode._NDIGIT: if(!inputRemains && input >= __eol) return false; if(Character.isDigit(nextChar)) return false; inputRemains = (++input < __endOffset); nextChar = (inputRemains ? __input[input] : __EOS); break; case OpCode._REF: arg = OpCode._getArg1(__program, scan); current = __beginMatchOffsets[arg]; if(current == OpCode._NULL_OFFSET) return false; if(__endMatchOffsets[arg] == OpCode._NULL_OFFSET) return false; if(current == __endMatchOffsets[arg]) break; if(__input[current] != nextChar) return false; line = __endMatchOffsets[arg] - current; if(input + line > __eol) return false; if(line > 1 && !__compare(__input, current, __input, input, line)) return false; input+=line; inputRemains = (input < __endOffset); nextChar = (inputRemains ? 
__input[input] : __EOS); break; case OpCode._NOTHING: break; case OpCode._BACK: break; case OpCode._OPEN: arg = OpCode._getArg1(__program, scan); __beginMatchOffsets[arg] = input; if(arg > __expSize) __expSize = arg; break; case OpCode._CLOSE: arg = OpCode._getArg1(__program, scan); __endMatchOffsets[arg] = input; if(arg > __lastParen) __lastParen = arg; break; case OpCode._CURLYX: rep = new Perl5Repetition(); rep._lastRepetition = __currentRep; __currentRep = rep; rep._parenFloor = __lastParen; rep._numInstances = -1; rep._min = OpCode._getArg1(__program, scan); rep._max = OpCode._getArg2(__program, scan); rep._scan = OpCode._getNextOperator(scan) + 2; rep._next = next; rep._minMod = minMod; // Must initialize to -1 because if we initialize to 0 and are // at the beginning of the input the OpCode._WHILEM case will // not work right. rep._lastLocation = -1; __inputOffset = input; // use minMod as temporary minMod = __match(OpCode._getPrevOperator(next)); // leave scope call not pertinent? 
__currentRep = rep._lastRepetition; return minMod; case OpCode._WHILEM: rep = __currentRep; arg = rep._numInstances + 1; __inputOffset = input; if(input == rep._lastLocation) { __currentRep = rep._lastRepetition; line = __currentRep._numInstances; if(__match(rep._next)) return true; __currentRep._numInstances = line; __currentRep = rep; return false; } if(arg < rep._min) { rep._numInstances = arg; rep._lastLocation = input; if(__match(rep._scan)) return true; rep._numInstances = arg - 1; return false; } if(rep._minMod) { __currentRep = rep._lastRepetition; line = __currentRep._numInstances; if(__match(rep._next)) return true; __currentRep._numInstances = line; __currentRep = rep; if(arg >= rep._max) return false; __inputOffset = input; rep._numInstances = arg; rep._lastLocation = input; if(__match(rep._scan)) return true; rep._numInstances = arg - 1; return false; } if(arg < rep._max) { __pushState(rep._parenFloor); rep._numInstances = arg; rep._lastLocation = input; if(__match(rep._scan)) return true; __popState(); __inputOffset = input; } __currentRep = rep._lastRepetition; line = __currentRep._numInstances; if(__match(rep._next)) return true; rep._numInstances = line; __currentRep = rep; rep._numInstances = arg - 1; return false; case OpCode._BRANCH: if(__program[next] != OpCode._BRANCH) next = OpCode._getNextOperator(scan); else { int lastParen; lastParen = __lastParen; do { __inputOffset = input; if(__match(OpCode._getNextOperator(scan))) return true; for(arg = __lastParen; arg > lastParen; --arg) //__endMatchOffsets[arg] = 0; __endMatchOffsets[arg] = OpCode._NULL_OFFSET; __lastParen = arg; scan = OpCode._getNext(__program, scan); } while(scan != OpCode._NULL_OFFSET && __program[scan] == OpCode._BRANCH); return false; } break; case OpCode._MINMOD: minMod = true; break; case OpCode._CURLY: case OpCode._STAR: case OpCode._PLUS: if(op == OpCode._CURLY) { line = OpCode._getArg1(__program, scan); arg = OpCode._getArg2(__program, scan); scan = 
OpCode._getNextOperator(scan) + 2; } else if(op == OpCode._STAR) { line = 0; arg = Character.MAX_VALUE; scan = OpCode._getNextOperator(scan); } else { line = 1; arg = Character.MAX_VALUE; scan = OpCode._getNextOperator(scan); } if(__program[next] == OpCode._EXACTLY) { nextChar = __program[OpCode._getOperand(next) + 1]; current = 0; } else { nextChar = __EOS; current = -1000; } __inputOffset = input; if(minMod) { minMod = false; if(line > 0 && __repeat(scan, line) < line) return false; while(arg >= line || (arg == Character.MAX_VALUE && line > 0)) { // there may be a bug here with respect to // __inputOffset >= __endOffset, but it seems to be right for // now. the issue is with __inputOffset being reset later. // is this test really supposed to happen here? if(current == -1000 || __inputOffset >= __endOffset || __input[__inputOffset] == nextChar) { if(__match(next)) return true; } __inputOffset = input + line; if(__repeat(scan, 1) != 0) { ++line; __inputOffset = input + line; } else return false; } } else { arg = __repeat(scan, arg); if(line < arg && OpCode._opType[__program[next]] == OpCode._EOL && ((!__multiline && __program[next] != OpCode._MEOL) || __program[next] == OpCode._SEOL)) line = arg; while(arg >= line) { // there may be a bug here with respect to // __inputOffset >= __endOffset, but it seems to be right for // now. the issue is with __inputOffset being reset later. // is this test really supposed to happen here? if(current == -1000 || __inputOffset >= __endOffset || __input[__inputOffset] == nextChar) { if(__match(next)) return true; } --arg; __inputOffset = input + arg; } } return false; case OpCode._SUCCEED: case OpCode._END: __inputOffset = input; // This enforces the rule that two consecutive matches cannot have // the same end offset. 
if(__inputOffset == __lastMatchInputEndOffset) return false; return true; case OpCode._IFMATCH: __inputOffset = input; scan = OpCode._getNextOperator(scan); if(!__match(scan)) return false; break; case OpCode._UNLESSM: __inputOffset = input; scan = OpCode._getNextOperator(scan); if(__match(scan)) return false; break; default: // todo: Need to throw an exception here. } // end switch //scan = (next > 0 ? next : 0); scan = next; } // end while scan return false; } /** * Set whether or not subsequent calls to {@link #matches matches()} * or {@link #contains contains()} should treat the input as * consisting of multiple lines. The default behavior is for * input to be treated as consisting of multiple lines. This method * should only be called if the Perl5Pattern used for a match was * compiled without either of the Perl5Compiler.MULTILINE_MASK or * Perl5Compiler.SINGLELINE_MASK flags, and you want to alter the * behavior of how the ^, $, and . metacharacters are * interpreted on the fly. The compilation options used when compiling * a pattern ALWAYS override the behavior specified by setMultiline(). See * {@link Perl5Compiler} for more details. *
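   * A minimal sketch of what treating input as multiple lines means for
   * the ^ metacharacter. It uses java.util.regex as a stand-in (an
   * assumption made so the example is self-contained; it is not this
   * class's API). There the choice is a compile-time flag, whereas
   * setMultiline() is a matcher-level switch. MultilineAnchorDemo and
   * countMatches are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultilineAnchorDemo {
  // Counts how many times regex is found in input under the given flags.
  static int countMatches(String regex, int flags, String input) {
    Matcher m = Pattern.compile(regex, flags).matcher(input);
    int count = 0;
    while (m.find()) {
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    String input = "alpha\nbeta\ngamma";
    // Single-line treatment: ^ matches only at the start of the input.
    System.out.println(countMatches("^\\w+", 0, input)); // prints 1
    // Multi-line treatment: ^ also matches right after each newline.
    System.out.println(countMatches("^\\w+", Pattern.MULTILINE, input)); // prints 3
  }
}
```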

   * @param multiline  If set to true treats the input as consisting of
   *        multiple lines with respect to the ^ and $
   *        metacharacters.  If set to false treats the input as consisting
   *        of a single line with respect to the ^ and $
   *        metacharacters.
   */
  public void setMultiline(boolean multiline) { __multiline = multiline; }

  /**
   * @return True if the matcher is treating input as consisting of multiple
   *         lines with respect to the ^ and $ metacharacters,
   *         false otherwise.
   */
  public boolean isMultiline() { return __multiline; }

  char[] _toLower(char[] input) {
    int current;
    char[] inp;
    // todo:
    // Certainly not the best way to do case insensitive matching.
    // Must definitely change this in some way, but for now we
    // do what Perl does and make a copy of the input, converting
    // it all to lowercase.  This is truly better handled in the
    // compilation phase.
    inp = new char[input.length];
    System.arraycopy(input, 0, inp, 0, input.length);
    input = inp;

    // todo: Need to inline toLowerCase()
    for(current = 0; current < input.length; current++)
      if(Character.isUpperCase(input[current]))
        input[current] = Character.toLowerCase(input[current]);

    return input;
  }

  /**
   * Determines if a prefix of a string (represented as a char[])
   * matches a given pattern, starting from a given offset into the string.
   * If a prefix of the string matches the pattern, a MatchResult instance
   * representing the match is made accessible via
   * {@link #getMatch()}.
   *

* This method is useful for certain common token identification tasks * that are made more difficult without this functionality. *

   * @param input  The char[] to test for a prefix match.
   * @param pattern  The Pattern to be matched.
   * @param offset The offset at which to start searching for the prefix.
   * @return True if a prefix of input matches pattern, false otherwise.
   */
  public boolean matchesPrefix(char[] input, Pattern pattern, int offset) {
    Perl5Pattern expression;

    expression = (Perl5Pattern)pattern;

    __originalInput = input;
    if(expression._isCaseInsensitive)
      input = _toLower(input);

    __initInterpreterGlobals(expression, input, 0, input.length, offset);

    __lastSuccess = __tryExpression(offset);
    __lastMatchResult = null;

    return __lastSuccess;
  }

  /**
   * Determines if a prefix of a string (represented as a char[])
   * matches a given pattern.
   * If a prefix of the string matches the pattern, a MatchResult instance
   * representing the match is made accessible via
   * {@link #getMatch()}.
   *

* This method is useful for certain common token identification tasks * that are made more difficult without this functionality. *

   * @param input  The char[] to test for a prefix match.
   * @param pattern  The Pattern to be matched.
   * @return True if a prefix of input matches pattern, false otherwise.
   */
  public boolean matchesPrefix(char[] input, Pattern pattern) {
    return matchesPrefix(input, pattern, 0);
  }

  /**
   * Determines if a prefix of a string matches a given pattern.
   * If a prefix of the string matches the pattern, a MatchResult instance
   * representing the match is made accessible via
   * {@link #getMatch()}.
   *

* This method is useful for certain common token identification tasks * that are made more difficult without this functionality. *

   * @param input  The String to test for a prefix match.
   * @param pattern  The Pattern to be matched.
   * @return True if a prefix of input matches pattern, false otherwise.
   */
  public boolean matchesPrefix(String input, Pattern pattern) {
    return matchesPrefix(input.toCharArray(), pattern, 0);
  }

  /**
   * Determines if a prefix of a PatternMatcherInput instance
   * matches a given pattern.  If there is a match, a MatchResult instance
   * representing the match is made accessible via
   * {@link #getMatch()}.  Unlike the
   * {@link #contains(PatternMatcherInput, Pattern)}
   * method, the current offset of the PatternMatcherInput argument
   * is not updated.  However, unlike the
   * {@link #matches matches(PatternMatcherInput, Pattern)} method,
   * matchesPrefix() will start its search from the current offset
   * rather than the begin offset of the PatternMatcherInput.
   *

* This method is useful for certain common token identification tasks * that are made more difficult without this functionality. *
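   * As an illustration of the token identification use case, here is a
   * minimal sketch of a tokenizer driven by anchored prefix matches at a
   * moving offset. It uses java.util.regex (region() plus lookingAt()) as
   * a stand-in for matchesPrefix(), an assumption made so the example is
   * self-contained. PrefixTokenizerDemo, prefixEnd, tokenize and the three
   * token patterns are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PrefixTokenizerDemo {
  private static final Pattern SPACE = Pattern.compile("\\s+");
  private static final Pattern NUMBER = Pattern.compile("\\d+");
  private static final Pattern WORD = Pattern.compile("[A-Za-z_]\\w*");

  // Returns the end offset of a match of p anchored at offset, or -1 if
  // p does not match a prefix of the input starting there.
  static int prefixEnd(Pattern p, String input, int offset) {
    Matcher m = p.matcher(input);
    m.region(offset, input.length());
    return m.lookingAt() ? m.end() : -1;
  }

  // Splits input into labelled tokens by trying each token pattern as a
  // prefix match at the current offset and advancing past the match.
  static List<String> tokenize(String input) {
    List<String> tokens = new ArrayList<>();
    int offset = 0;
    while (offset < input.length()) {
      int end;
      if ((end = prefixEnd(SPACE, input, offset)) >= 0) {
        // Whitespace separates tokens and is not recorded.
      } else if ((end = prefixEnd(NUMBER, input, offset)) >= 0) {
        tokens.add("NUMBER:" + input.substring(offset, end));
      } else if ((end = prefixEnd(WORD, input, offset)) >= 0) {
        tokens.add("WORD:" + input.substring(offset, end));
      } else {
        throw new IllegalArgumentException(
            "Unexpected character at offset " + offset);
      }
      offset = end; // every pattern consumes at least one character
    }
    return tokens;
  }

  public static void main(String[] args) {
    // prints [WORD:width, NUMBER:42, WORD:height, NUMBER:7]
    System.out.println(tokenize("width 42 height 7"));
  }
}
```

   * Without an anchored prefix match, each step would need a search
   * followed by a check that the match began exactly at the current offset.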

   * @param input  The PatternMatcherInput to test for a prefix match.
   * @param pattern  The Pattern to be matched.
   * @return True if a prefix of input matches pattern, false otherwise.
   */
  public boolean matchesPrefix(PatternMatcherInput input, Pattern pattern) {
    char[] inp;
    Perl5Pattern expression;

    expression = (Perl5Pattern)pattern;

    __originalInput = input._originalBuffer;
    if(expression._isCaseInsensitive) {
      if(input._toLowerBuffer == null)
        input._toLowerBuffer = _toLower(__originalInput);
      inp = input._toLowerBuffer;
    } else
      inp = __originalInput;

    __initInterpreterGlobals(expression, inp, input._beginOffset,
                             input._endOffset, input._currentOffset);
    __lastSuccess = __tryExpression(input._currentOffset);
    __lastMatchResult = null;

    return __lastSuccess;
  }

  /**
   * Determines if a string (represented as a char[]) exactly
   * matches a given pattern.  If
   * there is an exact match, a MatchResult instance
   * representing the match is made accessible via
   * {@link #getMatch()}.  The pattern must be
   * a Perl5Pattern instance, otherwise a ClassCastException will
   * be thrown.  You are not required to, and indeed should NOT try to
   * (for performance reasons), catch a ClassCastException because it
   * will never be thrown as long as you use a Perl5Pattern as the pattern
   * parameter.
   *

* Note: matches() is not the same as sticking a ^ in front of * your expression and a $ at the end of your expression in Perl5 * and using the =~ operator, even though in many cases it will be * equivalent. matches() literally looks for an exact match according * to the rules of Perl5 expression matching. Therefore, if you have * a pattern foo|foot and are matching the input foot * it will not produce an exact match. But foot|foo will * produce an exact match for either foot or foo. * Remember, Perl5 regular expressions do not match the longest * possible match. From the perlre manpage: *

* Alternatives are tried from left to right, so the first * alternative found for which the entire expression matches, * is the one that is chosen. This means that alternatives * are not necessarily greedy. For example: when matching * foo|foot against "barefoot", only the "foo" part will * match, as that is the first alternative tried, and it * successfully matches the target string. *
*
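   * The foo|foot case above can be sketched as follows. This is a minimal
   * illustration using java.util.regex as a stand-in (an assumption made
   * so the example is self-contained; it is not this class's API). The
   * helper makes one anchored attempt at offset 0 and then checks whether
   * that first successful match ends at the end of the input, which
   * mirrors the exact-match rule described here. Note that
   * java.util.regex's own matches() differs: it backtracks into later
   * alternatives until the whole input is consumed.
   * LeftmostAlternativeDemo and firstMatchCoversInput are hypothetical
   * names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LeftmostAlternativeDemo {
  // One anchored attempt at offset 0; the result is "exact" only if the
  // first successful match happens to end at the end of the input.
  static boolean firstMatchCoversInput(String regex, String input) {
    Matcher m = Pattern.compile(regex).matcher(input);
    return m.lookingAt() && m.end() == input.length();
  }

  public static void main(String[] args) {
    // "foo" is tried first and succeeds, leaving the final "t" unmatched.
    System.out.println(firstMatchCoversInput("foo|foot", "foot")); // prints false
    // "foot" is tried first and consumes the whole input.
    System.out.println(firstMatchCoversInput("foot|foo", "foot")); // prints true
    // "foot" fails on "foo", so the second alternative is used.
    System.out.println(firstMatchCoversInput("foot|foo", "foo"));  // prints true
  }
}
```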

   * @param input  The char[] to test for an exact match.
   * @param pattern  The Perl5Pattern to be matched.
   * @return True if input matches pattern, false otherwise.
   * @exception ClassCastException If a Pattern instance other than a
   *         Perl5Pattern is passed as the pattern parameter.
   */
  public boolean matches(char[] input, Pattern pattern) {
    Perl5Pattern expression;

    expression = (Perl5Pattern)pattern;

    __originalInput = input;
    if(expression._isCaseInsensitive)
      input = _toLower(input);

    __initInterpreterGlobals(expression, input, 0, input.length, 0);
    __lastSuccess = (__tryExpression(0) &&
                     __endMatchOffsets[0] == input.length);
    __lastMatchResult = null;

    return __lastSuccess;
  }

  /**
   * Determines if a string exactly matches a given pattern.  If
   * there is an exact match, a MatchResult instance
   * representing the match is made accessible via
   * {@link #getMatch()}.  The pattern must be
   * a Perl5Pattern instance, otherwise a ClassCastException will
   * be thrown.  You are not required to, and indeed should NOT try to
   * (for performance reasons), catch a ClassCastException because it
   * will never be thrown as long as you use a Perl5Pattern as the pattern
   * parameter.
   *

* Note: matches() is not the same as sticking a ^ in front of * your expression and a $ at the end of your expression in Perl5 * and using the =~ operator, even though in many cases it will be * equivalent. matches() literally looks for an exact match according * to the rules of Perl5 expression matching. Therefore, if you have * a pattern foo|foot and are matching the input foot * it will not produce an exact match. But foot|foo will * produce an exact match for either foot or foo. * Remember, Perl5 regular expressions do not match the longest * possible match. From the perlre manpage: *

* Alternatives are tried from left to right, so the first * alternative found for which the entire expression matches, * is the one that is chosen. This means that alternatives * are not necessarily greedy. For example: when matching * foo|foot against "barefoot", only the "foo" part will * match, as that is the first alternative tried, and it * successfully matches the target string. *
*

   * @param input  The String to test for an exact match.
   * @param pattern  The Perl5Pattern to be matched.
   * @return True if input matches pattern, false otherwise.
   * @exception ClassCastException If a Pattern instance other than a
   *         Perl5Pattern is passed as the pattern parameter.
   */
  public boolean matches(String input, Pattern pattern) {
    return matches(input.toCharArray(), pattern);
  }


  /**
   * Determines if the contents of a PatternMatcherInput instance
   * exactly match a given pattern.  If
   * there is an exact match, a MatchResult instance
   * representing the match is made accessible via
   * {@link #getMatch()}.  Unlike the
   * {@link #contains(PatternMatcherInput, Pattern)}
   * method, the current offset of the PatternMatcherInput argument
   * is not updated.  You should remember that the region between
   * the begin (NOT the current) and end offsets of the PatternMatcherInput
   * will be tested for an exact match.
   *

* The pattern must be a Perl5Pattern instance, otherwise a * ClassCastException will be thrown. You are not required to, and * indeed should NOT try to (for performance reasons), catch a * ClassCastException because it will never be thrown as long as you use * a Perl5Pattern as the pattern parameter. *

* Note: matches() is not the same as sticking a ^ in front of * your expression and a $ at the end of your expression in Perl5 * and using the =~ operator, even though in many cases it will be * equivalent. matches() literally looks for an exact match according * to the rules of Perl5 expression matching. Therefore, if you have * a pattern foo|foot and are matching the input foot * it will not produce an exact match. But foot|foo will * produce an exact match for either foot or foo. * Remember, Perl5 regular expressions do not match the longest * possible match. From the perlre manpage: *

* Alternatives are tried from left to right, so the first * alternative found for which the entire expression matches, * is the one that is chosen. This means that alternatives * are not necessarily greedy. For example: when matching * foo|foot against "barefoot", only the "foo" part will * match, as that is the first alternative tried, and it * successfully matches the target string. *
*

   * @param input  The PatternMatcherInput to test for a match.
   * @param pattern  The Perl5Pattern to be matched.
   * @return True if input matches pattern, false otherwise.
   * @exception ClassCastException If a Pattern instance other than a
   *         Perl5Pattern is passed as the pattern parameter.
   */
  public boolean matches(PatternMatcherInput input, Pattern pattern) {
    char[] inp;
    Perl5Pattern expression;

    expression = (Perl5Pattern)pattern;

    __originalInput = input._originalBuffer;
    if(expression._isCaseInsensitive) {
      if(input._toLowerBuffer == null)
        input._toLowerBuffer = _toLower(__originalInput);
      inp = input._toLowerBuffer;
    } else
      inp = __originalInput;

    __initInterpreterGlobals(expression, inp, input._beginOffset,
                             input._endOffset, input._beginOffset);

    __lastMatchResult = null;

    if(__tryExpression(input._beginOffset)) {
      if(__endMatchOffsets[0] == input._endOffset ||
         input.length() == 0 || input._beginOffset == input._endOffset) {
        __lastSuccess = true;
        return true;
      }
    }

    __lastSuccess = false;

    return false;
  }


  /**
   * Determines if a string contains a pattern.  If the pattern is
   * matched by some substring of the input, a MatchResult instance
   * representing the first such match is made accessible via
   * {@link #getMatch()}.  If you want to access
   * subsequent matches you should either use a PatternMatcherInput object
   * or use the offset information in the MatchResult to create a substring
   * representing the remaining input.  Using the MatchResult offset
   * information is the recommended method of obtaining the parts of the
   * string preceding the match and following the match.
   *

* The pattern must be a Perl5Pattern instance, otherwise a * ClassCastException will be thrown. You are not required to, and * indeed should NOT try to (for performance reasons), catch a * ClassCastException because it will never be thrown as long as you use * a Perl5Pattern as the pattern parameter. *

   * @param input  The String to test for a match.
   * @param pattern  The Perl5Pattern to be matched.
   * @return True if the input contains a pattern match, false otherwise.
   * @exception ClassCastException If a Pattern instance other than a
   *         Perl5Pattern is passed as the pattern parameter.
   */
  public boolean contains(String input, Pattern pattern) {
    return contains(input.toCharArray(), pattern);
  }


  /**
   * Determines if a string (represented as a char[]) contains a pattern.
   * If the pattern is
   * matched by some substring of the input, a MatchResult instance
   * representing the first such match is made accessible via
   * {@link #getMatch()}.  If you want to access
   * subsequent matches you should either use a PatternMatcherInput object
   * or use the offset information in the MatchResult to create a substring
   * representing the remaining input.  Using the MatchResult offset
   * information is the recommended method of obtaining the parts of the
   * string preceding the match and following the match.
   *

* The pattern must be a Perl5Pattern instance, otherwise a * ClassCastException will be thrown. You are not required to, and * indeed should NOT try to (for performance reasons), catch a * ClassCastException because it will never be thrown as long as you use * a Perl5Pattern as the pattern parameter. *

   * @param input  The char[] to test for a match.
   * @param pattern  The Perl5Pattern to be matched.
   * @return True if the input contains a pattern match, false otherwise.
   * @exception ClassCastException If a Pattern instance other than a
   *         Perl5Pattern is passed as the pattern parameter.
   */
  public boolean contains(char[] input, Pattern pattern) {
    Perl5Pattern expression;

    expression = (Perl5Pattern)pattern;

    __originalInput = input;
    if(expression._isCaseInsensitive)
      input = _toLower(input);

    return __interpret(expression, input, 0, input.length, 0);
  }


  private static final int __DEFAULT_LAST_MATCH_END_OFFSET = -100;
  private int __lastMatchInputEndOffset = __DEFAULT_LAST_MATCH_END_OFFSET;

  /**
   * Determines if the contents of a PatternMatcherInput, starting from the
   * current offset of the input, contain a pattern.
   * If a pattern match is found, a MatchResult
   * instance representing the first such match is made accessible via
   * {@link #getMatch()}.  The current offset of the
   * PatternMatcherInput is set to the offset corresponding to the end
   * of the match, so that a subsequent call to this method will continue
   * searching where the last call left off.  You should remember that the
   * region between the begin and end offsets of the PatternMatcherInput is
   * considered the input to be searched, and that the current offset
   * of the PatternMatcherInput reflects where a search will start from.
   * Matches extending beyond the end offset of the PatternMatcherInput
   * will not be matched.  In other words, a match must occur entirely
   * between the begin and end offsets of the input.  See
   * {@link PatternMatcherInput} for more details.
   *

* As a side effect, if a match is found, the PatternMatcherInput match * offset information is updated. See the * {@link PatternMatcherInput#setMatchOffsets(int, int)} * method for more details. *

* The pattern must be a Perl5Pattern instance, otherwise a * ClassCastException will be thrown. You are not required to, and * indeed should NOT try to (for performance reasons), catch a * ClassCastException because it will never be thrown as long as you use * a Perl5Pattern as the pattern parameter. *

* This method is usually used in a loop as follows: *

   * PatternMatcher matcher;
   * PatternCompiler compiler;
   * Pattern pattern;
   * PatternMatcherInput input;
   * MatchResult result;
   *
   * compiler = new Perl5Compiler();
   * matcher  = new Perl5Matcher();
   *
   * try {
   *   pattern = compiler.compile(somePatternString);
   * } catch(MalformedPatternException e) {
   *   System.err.println("Bad pattern.");
   *   System.err.println(e.getMessage());
   *   return;
   * }
   *
   * input   = new PatternMatcherInput(someStringInput);
   *
   * while(matcher.contains(input, pattern)) {
   *   result = matcher.getMatch();  
   *   // Perform whatever processing on the result you want.
   * }
   *
   * 
*

   * @param input  The PatternMatcherInput to test for a match.
   * @param pattern  The Pattern to be matched.
   * @return True if the input contains a pattern match, false otherwise.
   * @exception ClassCastException If a Pattern instance other than a
   *         Perl5Pattern is passed as the pattern parameter.
   */
  public boolean contains(PatternMatcherInput input, Pattern pattern) {
    char[] inp;
    Perl5Pattern expression;
    boolean matchFound;

    //if(input.length() > 0) {
    // We want to allow a null string to match at the end of the input
    // which is why we don't check endOfInput.  Not sure if this is a
    // safe thing to do or not.
    if(input._currentOffset > input._endOffset)
      return false;
    //}
    /* else
      if(input._endOfInput())
        return false;
    */
    expression = (Perl5Pattern)pattern;
    __originalInput = input._originalBuffer;

    // Todo:
    // Really should only reduce to lowercase that part of the
    // input that is necessary, instead of the whole thing.
    // Adjust MatchResult offsets accordingly.  Actually, pass an adjustment
    // value to __interpret.
    if(expression._isCaseInsensitive) {
      if(input._toLowerBuffer == null)
        input._toLowerBuffer = _toLower(__originalInput);
      inp = input._toLowerBuffer;
    } else
      inp = __originalInput;

    __lastMatchInputEndOffset = input.getMatchEndOffset();

    matchFound =
      __interpret(expression, inp, input._beginOffset, input._endOffset,
                  input._currentOffset);

    if(matchFound) {
      input.setCurrentOffset(__endMatchOffsets[0]);
      input.setMatchOffsets(__beginMatchOffsets[0], __endMatchOffsets[0]);
    } else {
      input.setCurrentOffset(input._endOffset + 1);
    }

    // Restore so it doesn't interfere with other unrelated matches.
    __lastMatchInputEndOffset = __DEFAULT_LAST_MATCH_END_OFFSET;

    return matchFound;
  }


  /**
   * Fetches the last match found by a call to a matches() or contains()
   * method.
If you plan on modifying the original search input, you * must call this method BEFORE you modify the original search input, * as a lazy evaluation technique is used to create the MatchResult. * This reduces the cost of pattern matching when you don't care about * the actual match and only care if the pattern occurs in the input. * Otherwise, a MatchResult would be created for every match found, * whether or not the MatchResult was later used by a call to getMatch(). *

* @return A MatchResult instance containing the pattern match found * by the last call to any one of the matches() or contains() * methods. If no match was found by the last call, returns * null. */ public MatchResult getMatch() { if(!__lastSuccess) return null; if(__lastMatchResult == null) __setLastMatchResult(); return __lastMatchResult; } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/0000755000175000017500000000000010423237774021337 5ustar arnaudarnaudjakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/CharacterClassNode.java0000644000175000017500000000663007773723336025707 0ustar arnaudarnaud/* * $Id: CharacterClassNode.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. 
Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . 
*/ package org.apache.oro.text.awk; import java.util.*; /** * @version @version@ * @since 1.0 */ class CharacterClassNode extends LeafNode { BitSet _characterSet; CharacterClassNode(int position) { super(position); _characterSet = new BitSet(LeafNode._NUM_TOKENS + 1); } void _addToken(int token) { _characterSet.set(token); } void _addTokenRange(int min, int max) { while(min <= max) _characterSet.set(min++); } boolean _matches(char token) { return _characterSet.get(token); } SyntaxNode _clone(int pos[]) { CharacterClassNode node; node = new CharacterClassNode(pos[0]++); node._characterSet = (BitSet)_characterSet.clone(); return node; } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/QuestionNode.java0000644000175000017500000000606707773723336024640 0ustar arnaudarnaud/* * $Id: QuestionNode.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. 
The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . 
*/ package org.apache.oro.text.awk; /** * @version @version@ * @since 1.0 */ final class QuestionNode extends OrNode { final static SyntaxNode _epsilon = new EpsilonNode(); QuestionNode(SyntaxNode child){ super(child, _epsilon); } boolean _nullable() { return true; } SyntaxNode _clone(int pos[]) { return new QuestionNode(_left._clone(pos)); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/AwkStreamInput.java0000644000175000017500000001542707773723336025141 0ustar arnaudarnaud/* * $Id: AwkStreamInput.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. 
Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.awk; import java.io.*; import org.apache.oro.text.regex.*; /** * The AwkStreamInput class is used to look for pattern matches in an * input stream (actually a java.io.Reader instance) in conjunction with * the AwkMatcher class. It is called * AwkStreamInput instead of AwkInputStream to stress that it is a form * of streamed input for the AwkMatcher class to use rather than a subclass of * InputStream. * AwkStreamInput performs special internal buffering to accelerate * pattern searches through a stream. You can determine the size of this * buffer and how it grows by using the appropriate constructor. *

* If you want to perform line by line * matches on an input stream, you should use a DataInput or BufferedReader * instance in conjunction * with one of the PatternMatcher methods taking a String, char[], or * PatternMatcherInput as an argument. The DataInput and BufferedReader * readLine() methods will likely be implemented as native methods and * therefore more efficient than supporting line by line searching within * AwkStreamInput. *
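* A minimal sketch of the line by line approach suggested above. To stay
* self-contained it uses java.util.regex in place of a PatternMatcher (an
* assumption for illustration only); LineByLineSketch and matchingLines are
* hypothetical names and are not part of this package.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Illustration only: java.util.regex is used in place of a PatternMatcher.
final class LineByLineSketch {

    // Reads the source one line at a time and collects every line that
    // contains a match for the given regular expression.
    static List<String> matchingLines(Reader source, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> hits = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (pattern.matcher(line).find()) {
                    hits.add(line);
                }
            }
        } catch (IOException e) {
            // Wrapped so that callers need not declare a checked exception.
            throw new UncheckedIOException(e);
        }
        return hits;
    }

    public static void main(String[] args) {
        Reader source = new StringReader("alpha\nbeta\ngamma\n");
        System.out.println(matchingLines(source, "^b")); // [beta]
    }
}
```

* With the classes in this library, the same loop would pass each line to
* one of the PatternMatcher methods that accept a String.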

 * In the future the programmer will be able to set this class to save
 * all the input it sees so that it can be accessed later.  This will avoid
 * having to read a stream more than once for whatever reason.
 *
 * @version @version@
 * @since 1.0
 * @see AwkMatcher
 */
public final class AwkStreamInput {
  static final int _DEFAULT_BUFFER_INCREMENT = 2048;
  private Reader __searchStream;
  private int __bufferIncrementUnit;

  boolean _endOfStreamReached;
  // The offset into the stream corresponding to buffer[0]
  int _bufferSize, _bufferOffset, _currentOffset;
  char[] _buffer;

  /**
   * We use this default constructor only within the package to create a dummy
   * AwkStreamInput instance.
   */
  AwkStreamInput() {
    _currentOffset = 0;
  }


  /**
   * Creates an AwkStreamInput instance bound to a Reader with a
   * specified initial buffer size and default buffer increment.
   *

   * @param input  The Reader to associate with the AwkStreamInput
   *        instance.
   * @param bufferIncrement  The initial buffer size and the default buffer
   *        increment to use when the input buffer has to be increased in
   *        size.
   */
  public AwkStreamInput(Reader input, int bufferIncrement) {
    __searchStream = input;
    __bufferIncrementUnit = bufferIncrement;
    _buffer = new char[bufferIncrement];
    _bufferOffset = _bufferSize = _currentOffset = 0;
    _endOfStreamReached = false;
  }


  /**
   * Creates an AwkStreamInput instance bound to a Reader with an
   * initial buffer size and default buffer increment of 2048 chars.
   *

   * @param input  The Reader to associate with the AwkStreamInput
   *        instance.
   */
  public AwkStreamInput(Reader input) {
    this(input, _DEFAULT_BUFFER_INCREMENT);
  }

  // Only called when buffer overflows
  int _reallocate(int initialOffset) throws IOException {
    int offset, bytesRead;
    char[] tmpBuffer;

    if(_endOfStreamReached)
      return _bufferSize;

    offset    = _bufferSize - initialOffset;
    tmpBuffer = new char[offset + __bufferIncrementUnit];

    bytesRead =
      __searchStream.read(tmpBuffer, offset, __bufferIncrementUnit);

    if(bytesRead <= 0){
      _endOfStreamReached = true;

      /* bytesRead should never equal zero, but if it does, we don't
         want to continue to try and read, running the risk of entering
         an infinite loop.  Throw an IOException instead, because this
         really IS an exception. */
      if(bytesRead == 0)
        throw new IOException("read from input stream returned 0 bytes.");
      return _bufferSize;
    } else {
      _bufferOffset += initialOffset;
      _bufferSize = offset + bytesRead;

      System.arraycopy(_buffer, initialOffset, tmpBuffer, 0, offset);
      _buffer = tmpBuffer;
    }

    return offset;
  }

  boolean read() throws IOException {
    _bufferOffset+=_bufferSize;
    _bufferSize = __searchStream.read(_buffer);
    _endOfStreamReached = (_bufferSize == -1);
    return (!_endOfStreamReached);
  }

  public boolean endOfStream() { return _endOfStreamReached; }

}
jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/StarNode.java0000644000175000017500000000667507773723336023747 0ustar arnaudarnaud
/*
 * $Id: StarNode.java,v 1.7 2003/11/07 20:16:24 dfs Exp $
 *
 * ====================================================================
 * The Apache Software License, Version 1.1
 *
 * Copyright (c) 2000 The Apache Software Foundation.  All rights
 * reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2.
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. 
For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.awk; import java.util.*; /** * @version @version@ * @since 1.0 */ class StarNode extends SyntaxNode { SyntaxNode _left; StarNode(SyntaxNode child){ _left = child; } boolean _nullable() { return true; } BitSet _firstPosition() { return _left._firstPosition(); } BitSet _lastPosition() { return _left._lastPosition(); } void _followPosition(BitSet[] follow, SyntaxNode[] nodes) { BitSet last, first; int size; _left._followPosition(follow, nodes); last = _lastPosition(); first = _firstPosition(); size = last.size(); while(0 < size--) if(last.get(size)) follow[size].or(first); } SyntaxNode _clone(int pos[]) { return new StarNode(_left._clone(pos)); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/AwkMatcher.java0000644000175000017500000006560507773723336024254 0ustar arnaudarnaud/* * $Id: AwkMatcher.java,v 1.11 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." 
* Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.awk; import java.io.*; import org.apache.oro.text.regex.*; /** * The AwkMatcher class is used to match regular expressions * (conforming to the Awk regular expression syntax) generated by * AwkCompiler. AwkMatcher only supports 8-bit ASCII. Any attempt * to match Unicode values greater than 255 will result in undefined * behavior. 
AwkMatcher finds true leftmost-longest matches, so * you must take care with how you formulate your regular expression * to avoid matching more than you really want. *

 * It is important for you to remember that AwkMatcher does not save
 * parenthesized sub-group information.  Therefore the number of groups
 * saved in a MatchResult produced by AwkMatcher will always be 1.
 *
 * @version @version@
 * @since 1.0
 * @see org.apache.oro.text.regex.PatternMatcher
 * @see AwkCompiler
 */
public final class AwkMatcher implements PatternMatcher {
  private int __lastMatchedBufferOffset;
  private AwkMatchResult __lastMatchResult = null;
  private AwkStreamInput __scratchBuffer, __streamSearchBuffer;
  private AwkPattern __awkPattern;
  private int __offsets[] = new int[2];

  /**
   * A kluge variable to make PatternMatcherInput matches work when
   * their begin offset is non-zero.  This kluge is caused by the
   * misguided notion that AwkStreamInput could be overloaded to do
   * both stream and fixed buffer matches.  The whole input representation
   * scheme has to be scrapped and redone. -- dfs 2001/07/10
   */
  private int __beginOffset;

  public AwkMatcher() {
    __scratchBuffer = new AwkStreamInput();
    __scratchBuffer._endOfStreamReached = true;
  }

  /**
   * Determines if a prefix of a string (represented as a char[])
   * matches a given pattern, starting from a given offset into the string.
   * If a prefix of the string matches the pattern, a MatchResult instance
   * representing the match is made accessible via
   * {@link #getMatch()}.
   *

* This method is useful for certain common token identification tasks * that are made more difficult without this functionality. *
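   *
   * The token identification use case can be sketched as follows. This is
   * a self-contained illustration that uses java.util.regex and its
   * Matcher.lookingAt() method as a stand-in for matchesPrefix(), so that
   * it runs without this library on the classpath. The class name and the
   * token classes are assumptions. Note also that java.util.regex is not
   * leftmost-longest, so results can differ from AwkMatcher for patterns
   * with alternation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PrefixTokenizerSketch {
  // Hypothetical token classes, tried in order at each position.
  static final String[] NAMES = { "NUMBER", "IDENT", "SPACE" };
  static final Pattern[] PATTERNS = {
    Pattern.compile("[0-9]+"),
    Pattern.compile("[A-Za-z_][A-Za-z0-9_]*"),
    Pattern.compile("[ \t]+")
  };

  // Splits input into tokens by repeatedly asking which pattern matches
  // a prefix of the remaining input, then advancing past that prefix.
  static List<String> tokenize(String input) {
    List<String> tokens = new ArrayList<String>();
    int offset = 0;
    while (offset < input.length()) {
      boolean matched = false;
      for (int i = 0; i < PATTERNS.length && !matched; i++) {
        Matcher m = PATTERNS[i].matcher(input);
        m.region(offset, input.length());
        // lookingAt() succeeds only if the match starts at the region start.
        if (m.lookingAt()) {
          tokens.add(NAMES[i] + ":" + m.group());
          offset = m.end();
          matched = true;
        }
      }
      if (!matched) {
        throw new IllegalArgumentException("No token at offset " + offset);
      }
    }
    return tokens;
  }

  public static void main(String[] args) {
    System.out.println(tokenize("x1 42"));
  }
}
```

   * Without a prefix test, each step would need a general search followed
   * by a check that the match began at the current offset.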

   * @param input The char[] to test for a prefix match.
   * @param pattern The Pattern to be matched.
   * @param offset The offset at which to start searching for the prefix.
   * @return True if a prefix of input matches pattern, false otherwise.
   */
  // I reimplemented this method in terms of __streamMatchPrefix
  // to reduce the code size. This is not very elegant and
  // reduces performance by a small degree.
  public boolean matchesPrefix(char[] input, Pattern pattern, int offset){
    int result = -1;

    __awkPattern = (AwkPattern)pattern;

    __scratchBuffer._buffer = input;
    __scratchBuffer._bufferSize = input.length;
    __scratchBuffer._bufferOffset = __beginOffset = 0;
    __scratchBuffer._endOfStreamReached = true;
    __streamSearchBuffer = __scratchBuffer;
    __offsets[0] = offset;
    try {
      result = __streamMatchPrefix();
    } catch(IOException e){
      // Don't do anything because we're not doing any I/O
      result = -1;
    }

    if(result < 0) {
      __lastMatchResult = null;
      return false;
    }

    // The matched text starts at offset, not at 0; result is its length.
    __lastMatchResult =
      new AwkMatchResult(new String(input, offset, result), offset);

    return true;
  }

  /**
   * Determines if a prefix of a string (represented as a char[])
   * matches a given pattern.
   * If a prefix of the string matches the pattern, a MatchResult instance
   * representing the match is made accessible via
   * {@link #getMatch()}.
   *

* This method is useful for certain common token identification tasks * that are made more difficult without this functionality. *

   * @param input The char[] to test for a prefix match.
   * @param pattern The Pattern to be matched.
   * @return True if a prefix of input matches pattern, false otherwise.
   */
  public boolean matchesPrefix(char[] input, Pattern pattern){
    return matchesPrefix(input, pattern, 0);
  }

  /**
   * Determines if a prefix of a string matches a given pattern.
   * If a prefix of the string matches the pattern, a MatchResult instance
   * representing the match is made accessible via
   * {@link #getMatch()}.
   *

* This method is useful for certain common token identification tasks * that are made more difficult without this functionality. *

   * @param input The String to test for a prefix match.
   * @param pattern The Pattern to be matched.
   * @return True if a prefix of input matches pattern, false otherwise.
   */
  public boolean matchesPrefix(String input, Pattern pattern) {
    return matchesPrefix(input.toCharArray(), pattern, 0);
  }

  /**
   * Determines if a prefix of a PatternMatcherInput instance
   * matches a given pattern. If there is a match, a MatchResult instance
   * representing the match is made accessible via
   * {@link #getMatch()}. Unlike the
   * {@link #contains(PatternMatcherInput, Pattern)}
   * method, the current offset of the PatternMatcherInput argument
   * is not updated. You should remember that the region starting
   * from the begin offset of the PatternMatcherInput will be
   * tested for a prefix match.
   *

* This method is useful for certain common token identification tasks * that are made more difficult without this functionality. *

   * @param input The PatternMatcherInput to test for a prefix match.
   * @param pattern The Pattern to be matched.
   * @return True if a prefix of input matches pattern, false otherwise.
   */
  public boolean matchesPrefix(PatternMatcherInput input, Pattern pattern){
    int result = -1;

    __awkPattern = (AwkPattern)pattern;
    __scratchBuffer._buffer = input.getBuffer();
    __scratchBuffer._bufferOffset = __beginOffset = input.getBeginOffset();
    __offsets[0] = input.getCurrentOffset();

    __scratchBuffer._bufferSize = input.length();
    __scratchBuffer._endOfStreamReached = true;
    __streamSearchBuffer = __scratchBuffer;
    try {
      result = __streamMatchPrefix();
    } catch(IOException e) {
      // Don't do anything because we're not doing any I/O
      result = -1;
    }

    if(result < 0) {
      __lastMatchResult = null;
      return false;
    }

    __lastMatchResult =
      new AwkMatchResult(new String(__scratchBuffer._buffer, __offsets[0],
                                    result), __offsets[0]);

    return true;
  }

  /**
   * Determines if a string (represented as a char[]) exactly
   * matches a given pattern. If
   * there is an exact match, a MatchResult instance
   * representing the match is made accessible via
   * {@link #getMatch()}. The pattern must be
   * an AwkPattern instance, otherwise a ClassCastException will
   * be thrown. You are not required to, and indeed should NOT try to
   * (for performance reasons), catch a ClassCastException because it
   * will never be thrown as long as you use an AwkPattern as the pattern
   * parameter.
   *

   * @param input The char[] to test for an exact match.
   * @param pattern The AwkPattern to be matched.
   * @return True if input matches pattern, false otherwise.
   * @exception ClassCastException If a Pattern instance other than an
   *         AwkPattern is passed as the pattern parameter.
   */
  public boolean matches(char[] input, Pattern pattern) {
    int result = -1;

    __awkPattern = (AwkPattern)pattern;
    __scratchBuffer._buffer = input;
    __scratchBuffer._bufferSize = input.length;
    __scratchBuffer._bufferOffset = __beginOffset = 0;
    __scratchBuffer._endOfStreamReached = true;
    __streamSearchBuffer = __scratchBuffer;
    __offsets[0] = 0;
    try {
      result = __streamMatchPrefix();
    } catch(IOException e){
      // Don't do anything because we're not doing any I/O
      result = -1;
    }

    if(result != input.length) {
      __lastMatchResult = null;
      return false;
    }

    __lastMatchResult =
      new AwkMatchResult(new String(input, 0, result), 0);

    return true;
  }

  /**
   * Determines if a string exactly matches a given pattern. If
   * there is an exact match, a MatchResult instance
   * representing the match is made accessible via
   * {@link #getMatch()}. The pattern must be
   * an AwkPattern instance, otherwise a ClassCastException will
   * be thrown. You are not required to, and indeed should NOT try to
   * (for performance reasons), catch a ClassCastException because it
   * will never be thrown as long as you use an AwkPattern as the pattern
   * parameter.
   *

   * @param input The String to test for an exact match.
   * @param pattern The AwkPattern to be matched.
   * @return True if input matches pattern, false otherwise.
   * @exception ClassCastException If a Pattern instance other than an
   *         AwkPattern is passed as the pattern parameter.
   */
  public boolean matches(String input, Pattern pattern){
    return matches(input.toCharArray(), pattern);
  }

  /**
   * Determines if the contents of a PatternMatcherInput instance
   * exactly matches a given pattern. If
   * there is an exact match, a MatchResult instance
   * representing the match is made accessible via
   * {@link #getMatch()}. Unlike the
   * {@link #contains(PatternMatcherInput, Pattern)}
   * method, the current offset of the PatternMatcherInput argument
   * is not updated. You should remember that the region between
   * the begin and end offsets of the PatternMatcherInput will be
   * tested for an exact match.
   *

* The pattern must be an AwkPattern instance, otherwise a * ClassCastException will be thrown. You are not required to, and * indeed should NOT try to (for performance reasons), catch a * ClassCastException because it will never be thrown as long as you use * an AwkPattern as the pattern parameter. *

   * @param input The PatternMatcherInput to test for a match.
   * @param pattern The AwkPattern to be matched.
   * @return True if input matches pattern, false otherwise.
   * @exception ClassCastException If a Pattern instance other than an
   *         AwkPattern is passed as the pattern parameter.
   */
  public boolean matches(PatternMatcherInput input, Pattern pattern){
    int result = -1;

    __awkPattern = (AwkPattern)pattern;
    __scratchBuffer._buffer = input.getBuffer();
    __scratchBuffer._bufferSize = input.length();
    __scratchBuffer._bufferOffset = __beginOffset = input.getBeginOffset();
    __offsets[0] = input.getBeginOffset();
    __scratchBuffer._endOfStreamReached = true;
    __streamSearchBuffer = __scratchBuffer;
    try {
      result = __streamMatchPrefix();
    } catch(IOException e){
      // Don't do anything because we're not doing any I/O
      result = -1;
    }

    if(result != __scratchBuffer._bufferSize) {
      __lastMatchResult = null;
      return false;
    }

    __lastMatchResult =
      new AwkMatchResult(new String(__scratchBuffer._buffer, __offsets[0],
                                    __scratchBuffer._bufferSize),
                         __offsets[0]);

    return true;
  }

  /**
   * Determines if a string (represented as a char[]) contains a pattern.
   * If the pattern is
   * matched by some substring of the input, a MatchResult instance
   * representing the first such match is made accessible via
   * {@link #getMatch()}. If you want to access
   * subsequent matches you should either use a PatternMatcherInput object
   * or use the offset information in the MatchResult to create a substring
   * representing the remaining input. Using the MatchResult offset
   * information is the recommended method of obtaining the parts of the
   * string preceding the match and following the match.
   *

* The pattern must be an AwkPattern instance, otherwise a * ClassCastException will be thrown. You are not required to, and * indeed should NOT try to (for performance reasons), catch a * ClassCastException because it will never be thrown as long as you use * an AwkPattern as the pattern parameter. *

   * @param input The char[] to test for a match.
   * @param pattern The AwkPattern to be matched.
   * @return True if the input contains a pattern match, false otherwise.
   * @exception ClassCastException If a Pattern instance other than an
   *         AwkPattern is passed as the pattern parameter.
   */
  public boolean contains(char[] input, Pattern pattern) {
    __awkPattern = (AwkPattern)pattern;

    // Begin anchor requires match occur at beginning of input
    if(__awkPattern._hasBeginAnchor && !__awkPattern._fastMap[input[0]]){
      __lastMatchResult = null;
      return false;
    }

    __scratchBuffer._buffer = input;
    __scratchBuffer._bufferSize = input.length;
    __scratchBuffer._bufferOffset = __beginOffset = 0;
    __scratchBuffer._endOfStreamReached = true;
    __streamSearchBuffer = __scratchBuffer;
    __lastMatchedBufferOffset = 0;
    try {
      _search();
    } catch(IOException e) {
      // do nothing
    }
    return (__lastMatchResult != null);
  }

  /**
   * Determines if a string contains a pattern. If the pattern is
   * matched by some substring of the input, a MatchResult instance
   * representing the first such match is made accessible via
   * {@link #getMatch()}. If you want to access
   * subsequent matches you should either use a PatternMatcherInput object
   * or use the offset information in the MatchResult to create a substring
   * representing the remaining input. Using the MatchResult offset
   * information is the recommended method of obtaining the parts of the
   * string preceding the match and following the match.
   *

* The pattern must be an AwkPattern instance, otherwise a * ClassCastException will be thrown. You are not required to, and * indeed should NOT try to (for performance reasons), catch a * ClassCastException because it will never be thrown as long as you use * an AwkPattern as the pattern parameter. *

   * @param input The String to test for a match.
   * @param pattern The AwkPattern to be matched.
   * @return True if the input contains a pattern match, false otherwise.
   * @exception ClassCastException If a Pattern instance other than an
   *         AwkPattern is passed as the pattern parameter.
   */
  public boolean contains(String input, Pattern pattern){
    return contains(input.toCharArray(), pattern);
  }

  /**
   * Determines if the contents of a PatternMatcherInput, starting from the
   * current offset of the input, contains a pattern.
   * If a pattern match is found, a MatchResult
   * instance representing the first such match is made accessible via
   * {@link #getMatch()}. The current offset of the
   * PatternMatcherInput is set to the offset corresponding to the end
   * of the match, so that a subsequent call to this method will continue
   * searching where the last call left off. You should remember that the
   * region between the begin and end offsets of the PatternMatcherInput is
   * considered the input to be searched, and that the current offset
   * of the PatternMatcherInput reflects where a search will start from.
   * Matches extending beyond the end offset of the PatternMatcherInput
   * will not be matched. In other words, a match must occur entirely
   * between the begin and end offsets of the input. See
   * {@link org.apache.oro.text.regex.PatternMatcherInput PatternMatcherInput}
   * for more details.
   *

* As a side effect, if a match is found, the PatternMatcherInput match * offset information is updated. See the PatternMatcherInput * {@link org.apache.oro.text.regex.PatternMatcherInput#setMatchOffsets * setMatchOffsets(int, int)} method for more details. *

* The pattern must be an AwkPattern instance, otherwise a * ClassCastException will be thrown. You are not required to, and * indeed should NOT try to (for performance reasons), catch a * ClassCastException because it will never be thrown as long as you use * an AwkPattern as the pattern parameter. *

* This method is usually used in a loop as follows: *

   * PatternMatcher matcher;
   * PatternCompiler compiler;
   * Pattern pattern;
   * PatternMatcherInput input;
   * MatchResult result;
   *
   * compiler = new AwkCompiler();
   * matcher  = new AwkMatcher();
   *
   * try {
   *   pattern = compiler.compile(somePatternString);
   * } catch(MalformedPatternException e) {
   *   System.err.println("Bad pattern.");
   *   System.err.println(e.getMessage());
   *   return;
   * }
   *
   * input   = new PatternMatcherInput(someStringInput);
   *
   * while(matcher.contains(input, pattern)) {
   *   result = matcher.getMatch();  
   *   // Perform whatever processing on the result you want.
   * }
   *
   * 
*

   * @param input The PatternMatcherInput to test for a match.
   * @param pattern The Pattern to be matched.
   * @return True if the input contains a pattern match, false otherwise.
   * @exception ClassCastException If a Pattern instance other than an
   *         AwkPattern is passed as the pattern parameter.
   */
  public boolean contains(PatternMatcherInput input, Pattern pattern) {
    __awkPattern = (AwkPattern)pattern;
    __scratchBuffer._buffer = input.getBuffer();
    __scratchBuffer._bufferOffset = __beginOffset = input.getBeginOffset();
    __lastMatchedBufferOffset = input.getCurrentOffset();

    // Begin anchor requires match occur at beginning of input
    // No need to adjust current offset if no match found.
    if(__awkPattern._hasBeginAnchor) {
      if(__beginOffset != __lastMatchedBufferOffset ||
         !__awkPattern._fastMap[__scratchBuffer._buffer[__beginOffset]]) {
        __lastMatchResult = null;
        return false;
      }
    }

    __scratchBuffer._bufferSize = input.length();
    __scratchBuffer._endOfStreamReached = true;
    __streamSearchBuffer = __scratchBuffer;
    try {
      _search();
    } catch(IOException e) {
      // do nothing
    }
    input.setCurrentOffset(__lastMatchedBufferOffset);

    if(__lastMatchResult == null)
      return false;

    input.setMatchOffsets(__lastMatchResult.beginOffset(0),
                          __lastMatchResult.endOffset(0));

    return true;
  }

  /**
   * Determines if the contents of an AwkStreamInput, starting from the
   * current offset of the input, contains a pattern.
   * If a pattern match is found, a MatchResult
   * instance representing the first such match is made accessible via
   * {@link #getMatch()}. The current offset of the
   * input stream is advanced to the end offset corresponding to the end
   * of the match. Consequently a subsequent call to this method will continue
   * searching where the last call left off.
   * See {@link AwkStreamInput} for more details.
   *

* Note, patterns matching the null string do NOT match at end of input * stream. This is different from the behavior you get from the other * contains() methods. *

* The pattern must be an AwkPattern instance, otherwise a * ClassCastException will be thrown. You are not required to, and * indeed should NOT try to (for performance reasons), catch a * ClassCastException because it will never be thrown as long as you use * an AwkPattern as the pattern parameter. *

* This method is usually used in a loop as follows: *

   * PatternMatcher matcher;
   * PatternCompiler compiler;
   * Pattern pattern;
   * AwkStreamInput input;
   * MatchResult result;
   *
   * compiler = new AwkCompiler();
   * matcher  = new AwkMatcher();
   *
   * try {
   *   pattern = compiler.compile(somePatternString);
   * } catch(MalformedPatternException e) {
   *   System.err.println("Bad pattern.");
   *   System.err.println(e.getMessage());
   *   return;
   * }
   *
   * input   = new AwkStreamInput(
   *             new BufferedInputStream(new FileInputStream(someFileName)));
   *
   * while(matcher.contains(input, pattern)) {
   *   result = matcher.getMatch();  
   *   // Perform whatever processing on the result you want.
   * }
   *
   * 
*

   * @param input The AwkStreamInput to test for a match.
   * @param pattern The Pattern to be matched.
   * @return True if the input contains a pattern match, false otherwise.
   * @exception ClassCastException If a Pattern instance other than an
   *         AwkPattern is passed as the pattern parameter.
   * @exception IOException If an error occurs while reading from the
   *         input stream.
   */
  public boolean contains(AwkStreamInput input, Pattern pattern)
       throws IOException
  {
    __awkPattern = (AwkPattern)pattern;

    // Begin anchor requires match occur at beginning of input
    if(__awkPattern._hasBeginAnchor) {
      // Do read here instead of in _search() so we can test first char
      if(input._bufferOffset == 0) {
        if(input.read() && !__awkPattern._fastMap[input._buffer[0]]) {
          __lastMatchResult = null;
          return false;
        }
      } else {
        __lastMatchResult = null;
        return false;
      }
    }

    __lastMatchedBufferOffset = input._currentOffset;
    __streamSearchBuffer = input;
    __beginOffset = 0;
    _search();
    input._currentOffset = __lastMatchedBufferOffset;

    if(__lastMatchResult != null) {
      // Adjust match begin offset to be relative to beginning of stream.
__lastMatchResult._incrementMatchBeginOffset(input._bufferOffset); return true; } return false; } private int __streamMatchPrefix() throws IOException { int token, current = AwkPattern._START_STATE, lastState; int offset, initialOffset, maxOffset; int lastMatchedOffset = -1; int[] tstateArray; offset = initialOffset = __offsets[0]; maxOffset = __streamSearchBuffer._bufferSize + __beginOffset; test: while(offset < maxOffset) { token = __streamSearchBuffer._buffer[offset++]; if(current < __awkPattern._numStates) { lastState = current; tstateArray = __awkPattern._getStateArray(current); current = tstateArray[token]; if(current == 0){ __awkPattern._createNewState(lastState, token, tstateArray); current = tstateArray[token]; } if(current == AwkPattern._INVALID_STATE){ break test; } else if(__awkPattern._endStates.get(current)){ lastMatchedOffset = offset; } if(offset == maxOffset){ offset = __streamSearchBuffer._reallocate(initialOffset) + __beginOffset; maxOffset = __streamSearchBuffer._bufferSize + __beginOffset; // If we're at the end of the stream, don't reset values if(offset != maxOffset){ if(lastMatchedOffset != -1) lastMatchedOffset-=initialOffset; initialOffset = 0; } } } else break; } __offsets[0] = initialOffset; __offsets[1] = lastMatchedOffset - 1; if(lastMatchedOffset == -1 && __awkPattern._matchesNullString) return 0; // End anchor requires match occur at end of input if(__awkPattern._hasEndAnchor && (!__streamSearchBuffer._endOfStreamReached || lastMatchedOffset < __streamSearchBuffer._bufferSize + __beginOffset)) return -1; return (lastMatchedOffset - initialOffset); } void _search() throws IOException { int position, tokensMatched; __lastMatchResult = null; while(true){ if(__lastMatchedBufferOffset >= __streamSearchBuffer._bufferSize + __beginOffset) { if(__streamSearchBuffer._endOfStreamReached){ // Get rid of reference now that it should no longer be used. 
__streamSearchBuffer = null; return; } else { if(!__streamSearchBuffer.read()) return; __lastMatchedBufferOffset = 0; } } for(position = __lastMatchedBufferOffset; position < __streamSearchBuffer._bufferSize + __beginOffset; position = __offsets[0] + 1) { __offsets[0] = position; if(__awkPattern._fastMap[__streamSearchBuffer._buffer[position]] && (tokensMatched = __streamMatchPrefix()) > -1) { __lastMatchResult = new AwkMatchResult( new String(__streamSearchBuffer._buffer, __offsets[0], tokensMatched), __offsets[0]); __lastMatchedBufferOffset = (tokensMatched > 0 ? __offsets[1] + 1 : __offsets[0] + 1); return; } else if(__awkPattern._matchesNullString) { __lastMatchResult = new AwkMatchResult(new String(), position); __lastMatchedBufferOffset = position + 1; return; } } __lastMatchedBufferOffset = position; } } /** * Fetches the last match found by a call to a matches() or contains() * method. *

* @return A MatchResult instance containing the pattern match found * by the last call to any one of the matches() or contains() * methods. If no match was found by the last call, returns * null. */ public MatchResult getMatch() { return __lastMatchResult; } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/SyntaxNode.java0000644000175000017500000000657007773723336024316 0ustar arnaudarnaud/* * $Id: SyntaxNode.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. 
* * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.awk; import java.util.*; /** * @version @version@ * @since 1.0 */ abstract class SyntaxNode { abstract boolean _nullable(); abstract BitSet _firstPosition(); abstract BitSet _lastPosition(); abstract void _followPosition(BitSet[] follow, SyntaxNode[] nodes); /** * This method is designed specifically to accommodate the expansion of * an interval into its subparts. *
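   *
   * The single-element array idiom described for the pos parameter can be
   * sketched as below. This is a self-contained illustration, not the
   * library's node classes: the Leaf class and the cloneAt method are
   * hypothetical names chosen for the example. It shows only how an
   * int[1] lets a callee advance a counter that the caller still sees.

```java
final class PositionCounterSketch {
  // Hypothetical stand-in for a syntax tree leaf that owns a position number.
  static final class Leaf {
    final char token;
    final int position;

    Leaf(char token, int position) {
      this.token = token;
      this.position = position;
    }

    // Copies this leaf, assigning it the next free position. Because pos is
    // an array, the increment is visible to the caller after the call.
    Leaf cloneAt(int[] pos) {
      return new Leaf(token, pos[0]++);
    }
  }

  public static void main(String[] args) {
    int[] pos = { 5 };
    Leaf original = new Leaf('a', 0);
    Leaf first = original.cloneAt(pos);
    Leaf second = original.cloneAt(pos);
    // Each clone received a distinct position: prints "5 6 7"
    System.out.println(first.position + " " + second.position + " " + pos[0]);
  }
}
```

   * A plain int parameter would be copied on each call, so every clone
   * would receive the same position.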

* @param pos A single element array containing a variable representing * the current position. It is made an array to cause it * to be passed by reference to allow incrementing. */ abstract SyntaxNode _clone(int pos[]); } jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/OrNode.java0000644000175000017500000000730507773723336023405 0ustar arnaudarnaud/* * $Id: OrNode.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. 
* * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . 
*/ package org.apache.oro.text.awk; import java.util.*; /** * @version @version@ * @since 1.0 */ class OrNode extends SyntaxNode { SyntaxNode _left, _right; OrNode(SyntaxNode left, SyntaxNode right) { _left = left; _right = right; } boolean _nullable() { return (_left._nullable() || _right._nullable()); } BitSet _firstPosition() { BitSet ls, rs, bs; ls = _left._firstPosition(); rs = _right._firstPosition(); bs = new BitSet(Math.max(ls.size(), rs.size())); bs.or(rs); bs.or(ls); return bs; } BitSet _lastPosition() { BitSet ls, rs, bs; ls = _left._lastPosition(); rs = _right._lastPosition(); bs = new BitSet(Math.max(ls.size(), rs.size())); bs.or(rs); bs.or(ls); return bs; } void _followPosition(BitSet[] follow, SyntaxNode[] nodes) { _left._followPosition(follow, nodes); _right._followPosition(follow, nodes); } SyntaxNode _clone(int pos[]) { return new OrNode(_left._clone(pos), _right._clone(pos)); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/CatNode.java0000644000175000017500000001014107773723336023524 0ustar arnaudarnaud/* * $Id: CatNode.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. 
The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . 
*/ package org.apache.oro.text.awk; import java.util.*; /** * @version @version@ * @since 1.0 */ final class CatNode extends SyntaxNode { SyntaxNode _left, _right; boolean _nullable() { return (_left._nullable() && _right._nullable()); } BitSet _firstPosition() { if(_left._nullable()){ BitSet ls, rs, bs; ls = _left._firstPosition(); rs = _right._firstPosition(); bs = new BitSet(Math.max(ls.size(), rs.size())); bs.or(rs); bs.or(ls); return bs; } return _left._firstPosition(); } BitSet _lastPosition() { if(_right._nullable()) { BitSet ls, rs, bs; ls = _left._lastPosition(); rs = _right._lastPosition(); bs = new BitSet(Math.max(ls.size(), rs.size())); bs.or(rs); bs.or(ls); return bs; } return _right._lastPosition(); } void _followPosition(BitSet[] follow, SyntaxNode[] nodes) { int size; BitSet leftLast, rightFirst; _left._followPosition(follow, nodes); _right._followPosition(follow, nodes); leftLast = _left._lastPosition(); rightFirst = _right._firstPosition(); size = leftLast.size(); while(0 < size--) if(leftLast.get(size)) follow[size].or(rightFirst); } SyntaxNode _clone(int pos[]) { CatNode node; node = new CatNode(); node._left = _left._clone(pos); node._right = _right._clone(pos); return node; } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/package.html0000644000175000017500000000036107773723336023630 0ustar arnaudarnaud This package used to be the AwkTools library and provides AWK-like regular expression classes that implement the {@link org.apache.oro.text.regex} interfaces. jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/LeafNode.java0000644000175000017500000000670107773723336023673 0ustar arnaudarnaud/* * $Id: LeafNode.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. 
IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.awk; import java.util.*; /** * @version @version@ * @since 1.0 */ abstract class LeafNode extends SyntaxNode { static final int _NUM_TOKENS = 256; static final int _END_MARKER_TOKEN = _NUM_TOKENS; protected int _position; protected BitSet _positionSet; LeafNode(int position){ _position = position; _positionSet = new BitSet(position + 1); _positionSet.set(position); } abstract boolean _matches(char token); final boolean _nullable() { return false; } final BitSet _firstPosition() { return _positionSet; } final BitSet _lastPosition() { return _positionSet; } final void _followPosition(BitSet[] follow, SyntaxNode[] nodes) { nodes[_position] = this; } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/SyntaxTree.java0000644000175000017500000001014307773723336024317 0ustar arnaudarnaud/* * $Id: SyntaxTree.java,v 1.8 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. 
IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.awk; import java.util.*; /* * IMPORTANT!!!!!!!!!!!!! * Don't forget to optimize this module. The calculation of follow can * be accelerated by calculating first and last only once for each node and * saving instead of doing dynamic calculation every time. 
*/ /** * @version @version@ * @since 1.0 */ final class SyntaxTree { int _positions; SyntaxNode _root; LeafNode[] _nodes; BitSet[] _followSet; SyntaxTree(SyntaxNode root, int positions) { _root = root; _positions = positions; } void _computeFollowPositions() { int index; _followSet = new BitSet[_positions]; _nodes = new LeafNode[_positions]; index = _positions; while(0 < index--) _followSet[index] = new BitSet(_positions); _root._followPosition(_followSet, _nodes); } private void __addToFastMap(BitSet pos, boolean[] fastMap, boolean[] done){ int token, node; for(node = 0; node < _positions; node++){ if(pos.get(node) && !done[node]){ done[node] = true; for(token=0; token < LeafNode._NUM_TOKENS; token++){ if(!fastMap[token]) fastMap[token] = _nodes[node]._matches((char)token); } } } } boolean[] createFastMap(){ boolean[] fastMap, done; fastMap = new boolean[LeafNode._NUM_TOKENS]; done = new boolean[_positions]; __addToFastMap(_root._firstPosition(), fastMap, done); return fastMap; } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/AwkMatchResult.java0000644000175000017500000001614007773723336025112 0ustar arnaudarnaud/* * $Id: AwkMatchResult.java,v 1.8 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. 
The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.awk; import org.apache.oro.text.regex.*; /** * A class used to store and access the results of an AwkPattern match. 
* It is important for you to remember that AwkMatcher does not save * parenthesized sub-group information. Therefore the number of groups * saved in an AwkMatchResult will always be 1. * * @version @version@ * @since 1.0 * @see org.apache.oro.text.regex.PatternMatcher * @see AwkMatcher * @see AwkCompiler */ final class AwkMatchResult implements MatchResult { /** * The character offset into the line or stream where the match * begins. Pattern matching methods that look for matches a line at * a time should use this field as the offset into the line * of the match. Methods that look for matches independent of line * boundaries should use this field as the offset into the entire * text stream. */ private int __matchBeginOffset; /** * The length of the match. Stored as a convenience to avoid calling * the String length(). Since groups aren't saved, all we need is the * length and the offset into the stream. */ private int __length; /** * The entire string that matched the pattern. */ private String __match; /** * Default constructor given default access to prevent instantiation * outside the package. */ AwkMatchResult(String match, int matchBeginOffset){ __match = match; __length = match.length(); __matchBeginOffset = matchBeginOffset; } /** * Adjusts the relative offset where the match begins to an absolute * value. Only used by AwkMatcher to adjust the offset for stream * matches. */ void _incrementMatchBeginOffset(int streamOffset) { __matchBeginOffset+=streamOffset; } /** * @return The length of the match. */ public int length(){ return __length; } /** * @return The number of groups contained in the result. This number * includes the 0th group. In other words, the result refers * to the number of parenthesized subgroups plus the entire match * itself. Because Awk doesn't save parenthesized groups, this * always returns 1. */ public int groups(){ return 1; } /** * @param group The pattern subgroup to return. 
* @return A string containing the indicated pattern subgroup. Group * 0 always refers to the entire match. If a group was never * matched, it returns null. This is not to be confused with * a group matching the null string, which will return a String * of length 0. */ public String group(int group){ return (group == 0 ? __match : null); } /** * @param group The pattern subgroup. * @return The offset into group 0 of the first token in the indicated * pattern subgroup. If a group was never matched or does * not exist, returns -1. */ public int begin(int group){ return (group == 0 ? 0 : -1); } /** * @param group The pattern subgroup. * @return Returns one plus the offset into group 0 of the last token in * the indicated pattern subgroup. If a group was never matched * or does not exist, returns -1. A group matching the null * string will return its start offset. */ public int end(int group){ return (group == 0 ? __length : -1); } /** * Returns an offset marking the beginning of the pattern match * relative to the beginning of the input. *

* @param group The pattern subgroup. * @return The offset of the first token in the indicated * pattern subgroup. If a group was never matched or does * not exist, returns -1. */ public int beginOffset(int group){ return (group == 0 ? __matchBeginOffset : -1); } /** * Returns an offset marking the end of the pattern match * relative to the beginning of the input. *

* @param group The pattern subgroup. * @return Returns one plus the offset of the last token in * the indicated pattern subgroup. If a group was never matched * or does not exist, returns -1. A group matching the null * string will return its start offset. */ public int endOffset(int group){ return (group == 0 ? __matchBeginOffset + __length : -1); } /** * The same as group(0). * * @return A string containing the entire match. */ public String toString() { return group(0); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/PlusNode.java0000644000175000017500000000575107773723336023753 0ustar arnaudarnaud/* * $Id: PlusNode.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. 
Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.awk; /** * @version @version@ * @since 1.0 */ final class PlusNode extends StarNode { PlusNode(SyntaxNode child) { super(child); } boolean _nullable() { return false; } SyntaxNode _clone(int pos[]) { return new PlusNode(_left._clone(pos)); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/AwkPattern.java0000644000175000017500000001637307773723336024304 0ustar arnaudarnaud/* * $Id: AwkPattern.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved.
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. 
IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.awk; import java.io.Serializable; import java.util.*; import org.apache.oro.text.regex.*; final class DFAState { int _stateNumber; BitSet _state; DFAState(BitSet s, int num){ _state = s; _stateNumber = num; } } /** * An implementation of the Pattern interface for Awk regular expressions. * This class is compatible with the AwkCompiler and AwkMatcher * classes. When an AwkCompiler instance compiles a regular expression * pattern, it produces an AwkPattern instance containing internal * data structures used by AwkMatcher to perform pattern matches. * This class cannot be subclassed and cannot be directly instantiated * by the programmer as it would not make sense. It is however serializable * so that pre-compiled patterns may be saved to disk and re-read at a later * time. 
AwkPattern instances should only be created through calls to an * AwkCompiler instance's compile() methods * * @version @version@ * @since 1.0 * @see AwkCompiler * @see AwkMatcher */ public final class AwkPattern implements Pattern, Serializable { final static int _INVALID_STATE = -1, _START_STATE = 1; int _numStates, _endPosition, _options; String _expression; Vector _Dtrans, _nodeList[], _stateList; BitSet _U, _emptySet, _followSet[], _endStates; Hashtable _stateMap; boolean _matchesNullString, _fastMap[]; boolean _hasBeginAnchor = false, _hasEndAnchor = false; AwkPattern(String expression, SyntaxTree tree){ int token, node, tstateArray[]; DFAState dfaState; _expression = expression; // Assume endPosition always occurs at end of parse. _endPosition = tree._positions - 1; _followSet = tree._followSet; _Dtrans = new Vector(); _stateList = new Vector(); _endStates = new BitSet(); _U = new BitSet(tree._positions); _U.or(tree._root._firstPosition()); tstateArray = new int[LeafNode._NUM_TOKENS]; _Dtrans.addElement(tstateArray); // this is a dummy entry because we // number our states starting from 1 _Dtrans.addElement(tstateArray); _numStates = _START_STATE; if(_U.get(_endPosition)) _endStates.set(_numStates); dfaState = new DFAState((BitSet)_U.clone(), _numStates); _stateMap = new Hashtable(); _stateMap.put(dfaState._state, dfaState); _stateList.addElement(dfaState); // this is a dummy entry because we // number our states starting from 1 _stateList.addElement(dfaState); _numStates++; _U.xor(_U); // clear bits _emptySet = new BitSet(tree._positions); _nodeList = new Vector[LeafNode._NUM_TOKENS]; for(token = 0; token < LeafNode._NUM_TOKENS; token++){ _nodeList[token] = new Vector(); for(node=0; node < tree._positions; node++) if(tree._nodes[node]._matches((char)token)) _nodeList[token].addElement(tree._nodes[node]); } _fastMap = tree.createFastMap(); _matchesNullString = _endStates.get(_START_STATE); } // tstateArray is assumed to have been set before calling this 
method void _createNewState(int current, int token, int[] tstateArray) { int node, pos; DFAState T, dfaState; T = (DFAState)_stateList.elementAt(current); node = _nodeList[token].size(); _U.xor(_U); // clear bits while(node-- > 0){ pos = ((LeafNode)_nodeList[token].elementAt(node))._position; if(T._state.get(pos)) _U.or(_followSet[pos]); } if(!_stateMap.containsKey(_U)){ dfaState = new DFAState((BitSet)_U.clone(), _numStates++); _stateList.addElement(dfaState); _stateMap.put(dfaState._state, dfaState); _Dtrans.addElement(new int[LeafNode._NUM_TOKENS]); if(!_U.equals(_emptySet)){ tstateArray[token] = _numStates - 1; if(_U.get(_endPosition)) _endStates.set(_numStates - 1); } else tstateArray[token] = _INVALID_STATE; } else { if(_U.equals(_emptySet)) tstateArray[token] = _INVALID_STATE; else tstateArray[token] = ((DFAState)_stateMap.get(_U))._stateNumber; } } int[] _getStateArray(int state) { return ((int[])_Dtrans.elementAt(state)); } /** * This method returns the string representation of the pattern. *

* @return The original string representation of the regular expression * pattern. */ public String getPattern() { return _expression; } /** * This method returns an integer containing the compilation options used * to compile this pattern. *

* @return The compilation options used to compile the pattern. */ public int getOptions() { return _options; } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/EpsilonNode.java0000644000175000017500000000622207773723336024433 0ustar arnaudarnaud/* * $Id: EpsilonNode.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. 
* * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.awk; import java.util.*; /** * @version @version@ * @since 1.0 */ final class EpsilonNode extends SyntaxNode { BitSet _positionSet = new BitSet(1); boolean _nullable() { return true; } BitSet _firstPosition() { return _positionSet; } BitSet _lastPosition() { return _positionSet; } void _followPosition(BitSet[] follow, SyntaxNode[] nodes) { } SyntaxNode _clone(int pos[]){ return new EpsilonNode(); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/AwkCompiler.java0000644000175000017500000006746607773723336024452 0ustar arnaudarnaud/* * $Id: AwkCompiler.java,v 1.10 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. 
IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.awk; import org.apache.oro.text.regex.*; /** * The AwkCompiler class is used to create compiled regular expressions * conforming to the Awk regular expression syntax. It generates * AwkPattern instances upon compilation to be used in conjunction * with an AwkMatcher instance. AwkMatcher finds true leftmost-longest * matches, so you must take care with how you formulate your regular * expression to avoid matching more than you really want. *

* The supported regular expression syntax is a superset of traditional AWK, * but NOT to be confused with GNU AWK or other AWK variants. Additionally, * this AWK implementation is DFA-based and only supports 8-bit ASCII. * Consequently, these classes can perform very fast pattern matches in * most cases. *

* This is the traditional Awk syntax that is supported:
*   • Alternatives separated by |
*   • Quantified atoms
*       *   Match 0 or more times.
*       +   Match 1 or more times.
*       ?   Match 0 or 1 times.
*   • Atoms
*     • regular expression within parentheses
*     • a . matches everything including newline
*     • a ^ is a null token matching the beginning of a string
*       but has no relation to newlines (and is only valid at the
*       beginning of a regex; this differs from traditional awk
*       for the sake of efficiency in Java).
*     • a $ is a null token matching the end of a string but has
*       no relation to newlines (and is only valid at the
*       end of a regex; this differs from traditional awk for the
*       sake of efficiency in Java).
*     • Character classes (e.g., [abcd]) and ranges (e.g., [a-z])
*       • Special backslashed characters work within a character class
*     • Special backslashed characters
*         \b            backspace
*         \n            newline
*         \r            carriage return
*         \t            tab
*         \f            formfeed
*         \xnn          hexadecimal representation of character
*         \nn or \nnn   octal representation of character
*         Any other backslashed character matches itself
*
* This is the extended syntax that is supported: *

 *   • Quantified atoms
 *       {n,m}    Match at least n but not more than m times.
 *       {n,}     Match at least n times.
 *       {n}      Match exactly n times.
 *   • Atoms
 *       • Special backslashed characters
 *           \d   digit [0-9]
 *           \D   non-digit [^0-9]
 *           \w   word character [0-9a-z_A-Z]
 *           \W   a non-word character [^0-9a-z_A-Z]
 *           \s   a whitespace character [ \t\n\r\f]
 *           \S   a non-whitespace character [^ \t\n\r\f]
 *           \cD  matches the corresponding control character
 *           \0   matches null character
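An interval quantifier needs no counting machinery of its own, because it can be rewritten using only concatenation and the traditional quantifiers. The sketch below performs that rewriting on pattern strings rather than on syntax nodes. It assumes the atom is a single character or a parenthesized group; IntervalExpansionSketch, expand and UNBOUNDED are hypothetical names chosen for the example.

```java
final class IntervalExpansionSketch {

  // Marks an interval with no upper bound, as in {n,}.
  static final int UNBOUNDED = -1;

  // Rewrites atom{min,max} using only concatenation, *, + and ?.
  static String expand(String atom, int min, int max) {
    if (min < 0 || max == 0 || (max != UNBOUNDED && max < min)) {
      throw new IllegalArgumentException(
          "invalid interval {" + min + "," + max + "}");
    }
    if (max == UNBOUNDED) {
      if (min == 0) {
        return atom + "*";          // {0,} is the star quantifier
      }
      if (min == 1) {
        return atom + "+";          // {1,} is the plus quantifier
      }
    }

    StringBuilder expanded = new StringBuilder();
    for (int i = 0; i < min; i++) {
      expanded.append(atom);        // the mandatory repetitions
    }
    if (max == UNBOUNDED) {
      expanded.append(atom).append('*');
    } else {
      for (int i = min; i < max; i++) {
        expanded.append(atom).append('?');  // the optional repetitions
      }
    }
    return expanded.toString();
  }

  public static void main(String[] args) {
    System.out.println(expand("a", 2, 4));         // aaa?a?
    System.out.println(expand("a", 3, UNBOUNDED)); // aaaa*
    System.out.println(expand("(ab)", 0, 2));      // (ab)?(ab)?
  }
}
```

One consequence of this kind of expansion is that large bounds produce proportionally large expressions, so very wide intervals are best avoided.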
 *
 * @version @version@
 * @since 1.0
 * @see org.apache.oro.text.regex.PatternCompiler
 * @see org.apache.oro.text.regex.MalformedPatternException
 * @see AwkPattern
 * @see AwkMatcher
 */
public final class AwkCompiler implements PatternCompiler {

  /**
   * The default mask for the {@link #compile compile} methods.
   * It is equal to 0 and indicates no special options are active.
   */
  public static final int DEFAULT_MASK = 0;

  /**
   * A mask passed as an option to the {@link #compile compile} methods
   * to indicate a compiled regular expression should be case insensitive.
   */
  public static final int CASE_INSENSITIVE_MASK = 0x0001;

  /**
   * A mask passed as an option to the {@link #compile compile} methods
   * to indicate a compiled regular expression should treat input as having
   * multiple lines.  This option affects the interpretation of
   * the . metacharacter.  When this mask is used,
   * the . metacharacter will not match newlines.  The default
   * behavior is for . to match newlines.
   */
  public static final int MULTILINE_MASK = 0x0002;

  static final char _END_OF_INPUT = '\uFFFF';

  // All of these are initialized by the compile() and _parse() methods
  // so there is no need or use in initializing them in the constructor
  // although this may change in the future.
  private boolean __inCharacterClass, __caseSensitive, __multiline;
  private boolean __beginAnchor, __endAnchor;
  private char __lookahead;
  private int __position, __bytesRead, __expressionLength;
  private char[] __regularExpression;
  private int __openParen, __closeParen;

  // We do not currently need to initialize any state, but keep this
  // commented out as a reminder that we may have to at some point.
  //public AwkCompiler() { }

  private static boolean __isMetachar(char token) {
    return (token == '*' || token == '?'
            || token == '+' || token == '[' || token == ']'
            || token == '(' || token == ')' || token == '|'
            || /* token == '^' || token == '$' || */ token == '.');
  }

  static boolean _isWordCharacter(char token) {
    return ((token >= 'a' && token <= 'z') ||
            (token >= 'A' && token <= 'Z') ||
            (token >= '0' && token <= '9') ||
            (token == '_'));
  }

  static boolean _isLowerCase(char token){
    return (token >= 'a' && token <= 'z');
  }

  static boolean _isUpperCase(char token){
    return (token >= 'A' && token <= 'Z');
  }

  static char _toggleCase(char token){
    if(_isUpperCase(token))
      return (char)(token + 32);
    else if(_isLowerCase(token))
      return (char)(token - 32);

    return token;
  }

  private void __match(char token) throws MalformedPatternException {
    if(token == __lookahead){
      if(__bytesRead < __expressionLength)
        __lookahead = __regularExpression[__bytesRead++];
      else
        __lookahead = _END_OF_INPUT;
    }
    else
      throw new MalformedPatternException("token: " + token +
                          " does not match lookahead: " +
                          __lookahead + " at position: " + __bytesRead);
  }

  private void __putback() {
    if(__lookahead != _END_OF_INPUT)
      --__bytesRead;
    __lookahead = __regularExpression[__bytesRead - 1];
  }

  private SyntaxNode __regex() throws MalformedPatternException {
    SyntaxNode left;

    left = __branch();

    if(__lookahead == '|') {
      __match('|');
      return (new OrNode(left, __regex()));
    }

    return left;
  }

  private SyntaxNode __branch() throws MalformedPatternException {
    CatNode current;
    SyntaxNode left, root;

    left = __piece();

    if(__lookahead == ')'){
      if(__openParen > __closeParen)
        return left;
      else
        throw
          new MalformedPatternException("Parse error: close parenthesis" +
            " without matching open parenthesis at position " + __bytesRead);
    } else if(__lookahead == '|' || __lookahead == _END_OF_INPUT)
      return left;

    root = current = new CatNode();
    current._left = left;

    while(true) {
      left = __piece();

      if(__lookahead == ')'){
        if(__openParen > __closeParen){
          current._right = left;
          break;
        } else
          throw
            new MalformedPatternException("Parse error: close parenthesis" +
              " without matching open parenthesis at position " + __bytesRead);
      } else if(__lookahead == '|' || __lookahead == _END_OF_INPUT){
        current._right = left;
        break;
      }

      current._right = new CatNode();
      current = (CatNode)current._right;
      current._left = left;
    }

    return root;
  }

  private SyntaxNode __piece() throws MalformedPatternException {
    SyntaxNode left;

    left = __atom();

    switch(__lookahead){
    case '+' : __match('+'); return (new PlusNode(left));
    case '?' : __match('?'); return (new QuestionNode(left));
    case '*' : __match('*'); return (new StarNode(left));
    case '{' : return __repetition(left);
    }

    return left;
  }

  // if numChars is 0, this means match as many as you want
  private int __parseUnsignedInteger(int radix, int minDigits, int maxDigits)
    throws MalformedPatternException
  {
    int num, digits = 0;
    StringBuffer buf;

    // We don't expect huge numbers, so an initial buffer of 4 is fine.
    buf = new StringBuffer(4);

    while(Character.digit(__lookahead, radix) != -1 && digits < maxDigits){
      buf.append((char)__lookahead);
      __match(__lookahead);
      ++digits;
    }

    if(digits < minDigits || digits > maxDigits)
      throw
        new MalformedPatternException(
          "Parse error: unexpected number of digits at position " +
          __bytesRead);

    try {
      num = Integer.parseInt(buf.toString(), radix);
    } catch(NumberFormatException e) {
      throw
        new MalformedPatternException("Parse error: numeric value at " +
          "position " + __bytesRead + " is invalid");
    }

    return num;
  }

  private SyntaxNode __repetition(SyntaxNode atom)
    throws MalformedPatternException
  {
    int min, max, startPosition[];
    SyntaxNode root = null;
    CatNode catNode;

    __match('{');

    min = __parseUnsignedInteger(10, 1, Integer.MAX_VALUE);
    startPosition = new int[1];
    startPosition[0] = __position;

    if(__lookahead == '}'){
      // Match exactly min times.  Concatenate the atom min times.
      __match('}');

      if(min == 0)
        throw
          new MalformedPatternException(
            "Parse error: Superfluous interval specified at position " +
            __bytesRead + ".  Number of occurrences was set to zero.");

      if(min == 1)
        return atom;

      root = catNode = new CatNode();
      catNode._left = atom;

      while(--min > 1) {
        atom = atom._clone(startPosition);

        catNode._right = new CatNode();
        catNode = (CatNode)catNode._right;
        catNode._left = atom;
      }

      catNode._right = atom._clone(startPosition);
    } else if(__lookahead == ','){
      __match(',');

      if(__lookahead == '}') {
        // match at least min times
        __match('}');

        if(min == 0)
          return new StarNode(atom);

        if(min == 1)
          return new PlusNode(atom);

        root = catNode = new CatNode();
        catNode._left = atom;

        while(--min > 0) {
          atom = atom._clone(startPosition);

          catNode._right = new CatNode();
          catNode = (CatNode)catNode._right;
          catNode._left = atom;
        }

        catNode._right = new StarNode(atom._clone(startPosition));
      } else {
        // match at least min times and at most max times
        max = __parseUnsignedInteger(10, 1, Integer.MAX_VALUE);
        __match('}');

        if(max < min)
          throw
            new MalformedPatternException("Parse error: invalid interval; " +
              max + " is less than " + min + " at position " + __bytesRead);

        if(max == 0)
          throw
            new MalformedPatternException(
              "Parse error: Superfluous interval specified at position " +
              __bytesRead + ".  Number of occurrences was set to zero.");

        if(min == 0) {
          if(max == 1)
            return new QuestionNode(atom);

          root = catNode = new CatNode();
          atom = new QuestionNode(atom);
          catNode._left = atom;

          while(--max > 1) {
            atom = atom._clone(startPosition);

            catNode._right = new CatNode();
            catNode = (CatNode)catNode._right;
            catNode._left = atom;
          }

          catNode._right = atom._clone(startPosition);
        } else if(min == max) {
          if(min == 1)
            return atom;

          root = catNode = new CatNode();
          catNode._left = atom;

          while(--min > 1) {
            atom = atom._clone(startPosition);

            catNode._right = new CatNode();
            catNode = (CatNode)catNode._right;
            catNode._left = atom;
          }

          catNode._right = atom._clone(startPosition);
        } else {
          int count;

          root = catNode = new CatNode();
          catNode._left = atom;

          for(count=1; count < min; count++) {
            atom = atom._clone(startPosition);

            catNode._right = new CatNode();
            catNode = (CatNode)catNode._right;
            catNode._left = atom;
          }

          atom = new QuestionNode(atom._clone(startPosition));

          count = max-min;

          if(count == 1)
            catNode._right = atom;
          else {
            catNode._right = new CatNode();
            catNode = (CatNode)catNode._right;
            catNode._left = atom;

            while(--count > 1) {
              atom = atom._clone(startPosition);

              catNode._right = new CatNode();
              catNode = (CatNode)catNode._right;
              catNode._left = atom;
            }

            catNode._right = atom._clone(startPosition);
          }
        }
      }
    } else
      throw
        new MalformedPatternException("Parse error: unexpected character " +
          __lookahead + " in interval at position " + __bytesRead);

    __position = startPosition[0];
    return root;
  }

  private SyntaxNode __backslashToken() throws MalformedPatternException {
    SyntaxNode current;
    char token;
    int number;

    __match('\\');

    if(__lookahead == 'x'){
      __match('x');
      // Parse a hexadecimal number
      current = _newTokenNode((char)__parseUnsignedInteger(16, 2, 2),
                              __position++);
    } else if(__lookahead == 'c') {
      __match('c');
      // Create a control character
      token = Character.toUpperCase(__lookahead);
      token = (char)(token > 63 ?
                     token - 64 : token + 64);
      current = new TokenNode(token, __position++);
      __match(__lookahead);
    } else if(__lookahead >= '0' && __lookahead <= '9') {
      __match(__lookahead);

      if(__lookahead >= '0' && __lookahead <= '9'){
        // We have an octal character or a multi-digit backreference.
        // Assume octal character for now.
        __putback();
        number = __parseUnsignedInteger(10, 2, 3);
        number = Integer.parseInt(Integer.toString(number), 8);
        current = _newTokenNode((char)number, __position++);
      } else {
        // We have either \0, an escaped digit, or a backreference.
        __putback();
        if(__lookahead == '0'){
          // \0 matches the null character
          __match('0');
          current = new TokenNode('\0', __position++);
        } else {
          // Either an escaped digit or backreference.
          number = Character.digit(__lookahead, 10);
          current = _newTokenNode(__lookahead, __position++);
          __match(__lookahead);
        }
      }
    } else if(__lookahead == 'b') {
      // Inside of a character class the \b means backspace, otherwise
      // it means a word boundary
      //if(__inCharacterClass)
      // \b always means backspace
      current = new TokenNode('\b', __position++);
      /*
      else
        current = new TokenNode((char)LeafNode._WORD_BOUNDARY_MARKER_TOKEN,
                                position++);
      */
      __match('b');
    } /*else if(__lookahead == 'B' && !__inCharacterClass){
      current = new TokenNode((char)LeafNode._NONWORD_BOUNDARY_MARKER_TOKEN,
                              position++);
      __match('B');
    } */ else {
      CharacterClassNode characterSet;
      token = __lookahead;

      switch(__lookahead){
      case 'n' : token = '\n'; break;
      case 'r' : token = '\r'; break;
      case 't' : token = '\t'; break;
      case 'f' : token = '\f'; break;
      }

      switch(token) {
      case 'd' :
        characterSet = new CharacterClassNode(__position++);
        characterSet._addTokenRange('0', '9');
        current = characterSet;
        break;
      case 'D' :
        characterSet = new NegativeCharacterClassNode(__position++);
        characterSet._addTokenRange('0', '9');
        current = characterSet;
        break;
      case 'w' :
        characterSet = new CharacterClassNode(__position++);
        characterSet._addTokenRange('0', '9');
        characterSet._addTokenRange('a', 'z');
        characterSet._addTokenRange('A', 'Z');
        characterSet._addToken('_');
        current = characterSet;
        break;
      case 'W' :
        characterSet = new NegativeCharacterClassNode(__position++);
        characterSet._addTokenRange('0', '9');
        characterSet._addTokenRange('a', 'z');
        characterSet._addTokenRange('A', 'Z');
        characterSet._addToken('_');
        current = characterSet;
        break;
      case 's' :
        characterSet = new CharacterClassNode(__position++);
        characterSet._addToken(' ');
        characterSet._addToken('\f');
        characterSet._addToken('\n');
        characterSet._addToken('\r');
        characterSet._addToken('\t');
        current = characterSet;
        break;
      case 'S' :
        characterSet = new NegativeCharacterClassNode(__position++);
        characterSet._addToken(' ');
        characterSet._addToken('\f');
        characterSet._addToken('\n');
        characterSet._addToken('\r');
        characterSet._addToken('\t');
        current = characterSet;
        break;
      default :
        current = _newTokenNode(token, __position++);
        break;
      }

      __match(__lookahead);
    }

    return current;
  }

  private SyntaxNode __atom() throws MalformedPatternException {
    SyntaxNode current;

    if(__lookahead == '(') {
      __match('(');
      ++__openParen;
      current = __regex();
      __match(')');
      ++__closeParen;
    } else if(__lookahead == '[')
      current = __characterClass();
    else if(__lookahead == '.') {
      CharacterClassNode characterSet;

      __match('.');
      characterSet = new NegativeCharacterClassNode(__position++);
      if(__multiline)
        characterSet._addToken('\n');
      current = characterSet;
    } else if(__lookahead == '\\') {
      current = __backslashToken();
    } /*else if(__lookahead == '^') {
      current =
        new TokenNode((char)LeafNode._BEGIN_LINE_MARKER_TOKEN, __position++);
      __match('^');
    } else if(__lookahead == '$') {
      current =
        new TokenNode((char)LeafNode._END_LINE_MARKER_TOKEN, __position++);
      __match('$');
    } */ else if(!__isMetachar(__lookahead)) {
      current = _newTokenNode(__lookahead, __position++);
      __match(__lookahead);
    } else
      throw
        new MalformedPatternException("Parse error: unexpected character " +
          __lookahead + " at position " + __bytesRead);

    return current;
  }

  private
  SyntaxNode __characterClass() throws MalformedPatternException {
    char lastToken, token;
    SyntaxNode node;
    CharacterClassNode current;

    __match('[');
    __inCharacterClass = true;

    if(__lookahead == '^'){
      __match('^');
      current = new NegativeCharacterClassNode(__position++);
    } else
      current = new CharacterClassNode(__position++);

    while(__lookahead != ']' && __lookahead != _END_OF_INPUT) {

      if(__lookahead == '\\'){
        node = __backslashToken();
        --__position;

        // __backslashToken() (actually newTokenNode()) does not take care of
        // case insensitivity when __inCharacterClass is true.
        if(node instanceof TokenNode){
          lastToken = ((TokenNode)node)._token;
          current._addToken(lastToken);
          if(!__caseSensitive)
            current._addToken(_toggleCase(lastToken));
        } else {
          CharacterClassNode slash;
          slash = (CharacterClassNode)node;
          // This could be made more efficient by manipulating the
          // characterSet elements of the CharacterClassNodes but
          // for the moment, this is more clear.
          for(token=0; token < LeafNode._NUM_TOKENS; token++){
            if(slash._matches(token))
              current._addToken(token);
          }

          // A byproduct of this act is that when a '-' occurs after
          // a \d, \w, etc. it is not interpreted as a range and no
          // parse exception is thrown.
          // This is considered a feature and not a bug for now.
          continue;
        }
      } else {
        lastToken = __lookahead;
        current._addToken(__lookahead);
        if(!__caseSensitive)
          current._addToken(_toggleCase(__lookahead));
        __match(__lookahead);
      }

      // In Perl, a - is a token if it occurs at the beginning
      // or end of the character class.  Anywhere else, it indicates
      // a range.
      // A byproduct of this implementation is that if a '-' occurs
      // after the end of a range, it is interpreted as a '-' and no
      // exception is thrown. e.g., the second dash in [a-z-x]
      // This is considered a feature and not a bug for now.
      if(__lookahead == '-'){
        __match('-');
        if(__lookahead == ']'){
          current._addToken('-');
          break;
        } else if(__lookahead == '\\') {
          node = __backslashToken();
          --__position;
          if(node instanceof TokenNode)
            token = ((TokenNode)node)._token;
          else
            throw new MalformedPatternException(
              "Parse error: invalid range specified at position " +
              __bytesRead);
        } else {
          token = __lookahead;
          __match(__lookahead);
        }

        if(token < lastToken)
          throw new MalformedPatternException(
            "Parse error: invalid range specified at position " +
            __bytesRead);
        current._addTokenRange(lastToken + 1, token);
        if(!__caseSensitive)
          current._addTokenRange(_toggleCase((char)(lastToken + 1)),
                                 _toggleCase(token));
      }
    }

    __match(']');
    __inCharacterClass = false;
    return current;
  }

  SyntaxNode _newTokenNode(char token, int position){
    if(!__inCharacterClass && !__caseSensitive &&
       (_isUpperCase(token) || _isLowerCase(token))){
      CharacterClassNode node = new CharacterClassNode(position);
      node._addToken(token);
      node._addToken(_toggleCase(token));
      return node;
    }

    return new TokenNode(token, position);
  }

  SyntaxTree _parse(char[] expression) throws MalformedPatternException {
    SyntaxTree tree;

    __openParen = __closeParen = 0;
    __regularExpression = expression;
    __bytesRead = 0;
    __expressionLength = expression.length;
    __inCharacterClass = false;

    __position = 0;
    __match(__lookahead); // Call match to read first input.
    if(__lookahead == '^') {
      __beginAnchor = true;
      __match(__lookahead);
    }

    if(__expressionLength > 0 && expression[__expressionLength - 1] == '$') {
      --__expressionLength;
      __endAnchor = true;
    }

    if(__expressionLength > 1 || (__expressionLength == 1 && !__beginAnchor)) {
      CatNode root;
      root = new CatNode();
      root._left = __regex();
      // end marker
      root._right =
        new TokenNode((char)LeafNode._END_MARKER_TOKEN, __position++);
      tree = new SyntaxTree(root, __position);
    } else
      tree = new SyntaxTree(
        new TokenNode((char)LeafNode._END_MARKER_TOKEN, 0), 1);

    tree._computeFollowPositions();

    return tree;
  }

  /**
   * Compiles an Awk regular expression into an AwkPattern instance that
   * can be used by an AwkMatcher object to perform pattern matching.
   *

   * @param pattern  An Awk regular expression to compile.
   * @param options  A set of flags giving the compiler instructions on
   *                 how to treat the regular expression.  The meaningful
   *                 flags are AwkCompiler.CASE_INSENSITIVE_MASK and
   *                 AwkCompiler.MULTILINE_MASK.
   * @return A Pattern instance constituting the compiled regular expression.
   *         This instance will always be an AwkPattern and can reliably
   *         be cast to an AwkPattern.
   * @exception MalformedPatternException  If the compiled expression
   *  is not a valid Awk regular expression.
   */
  public Pattern compile(char[] pattern, int options)
       throws MalformedPatternException
  {
    SyntaxTree tree;
    AwkPattern regexp;

    __beginAnchor = __endAnchor = false;
    __caseSensitive = ((options & CASE_INSENSITIVE_MASK) == 0);
    __multiline = ((options & MULTILINE_MASK) != 0);
    tree = _parse(pattern);
    regexp = new AwkPattern(new String(pattern), tree);
    regexp._options = options;
    regexp._hasBeginAnchor = __beginAnchor;
    regexp._hasEndAnchor = __endAnchor;

    return regexp;
  }

  /**
   * Compiles an Awk regular expression into an AwkPattern instance that
   * can be used by an AwkMatcher object to perform pattern matching.
   *

   * @param pattern  An Awk regular expression to compile.
   * @param options  A set of flags giving the compiler instructions on
   *                 how to treat the regular expression.  The meaningful
   *                 flags are AwkCompiler.CASE_INSENSITIVE_MASK and
   *                 AwkCompiler.MULTILINE_MASK.
   * @return A Pattern instance constituting the compiled regular expression.
   *         This instance will always be an AwkPattern and can reliably
   *         be cast to an AwkPattern.
   * @exception MalformedPatternException  If the compiled expression
   *  is not a valid Awk regular expression.
   */
  public Pattern compile(String pattern, int options)
       throws MalformedPatternException
  {
    SyntaxTree tree;
    AwkPattern regexp;

    __beginAnchor = __endAnchor = false;
    __caseSensitive = ((options & CASE_INSENSITIVE_MASK) == 0);
    __multiline = ((options & MULTILINE_MASK) != 0);
    tree = _parse(pattern.toCharArray());
    regexp = new AwkPattern(pattern, tree);
    regexp._options = options;
    regexp._hasBeginAnchor = __beginAnchor;
    regexp._hasEndAnchor = __endAnchor;

    return regexp;
  }

  /**
   * Same as calling compile(pattern, AwkCompiler.DEFAULT_MASK);
   *

   * @param pattern  A regular expression to compile.
   * @return A Pattern instance constituting the compiled regular expression.
   *         This instance will always be an AwkPattern and can reliably
   *         be cast to an AwkPattern.
   * @exception MalformedPatternException  If the compiled expression
   *  is not a valid Awk regular expression.
   */
  public Pattern compile(char[] pattern) throws MalformedPatternException {
    return compile(pattern, DEFAULT_MASK);
  }

  /**
   * Same as calling compile(pattern, AwkCompiler.DEFAULT_MASK);
   *

   * @param pattern  A regular expression to compile.
   * @return A Pattern instance constituting the compiled regular expression.
   *         This instance will always be an AwkPattern and can reliably
   *         be cast to an AwkPattern.
   * @exception MalformedPatternException  If the compiled expression
   *  is not a valid Awk regular expression.
   */
  public Pattern compile(String pattern) throws MalformedPatternException {
    return compile(pattern, DEFAULT_MASK);
  }

}
jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/TokenNode.java
/*
 * $Id: TokenNode.java,v 1.7 2003/11/07 20:16:24 dfs Exp $
 *
 * ====================================================================
 * The Apache Software License, Version 1.1
 *
 * Copyright (c) 2000 The Apache Software Foundation.  All rights
 * reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *    distribution.
 *
 * 3. The end-user documentation included with the redistribution,
 *    if any, must include the following acknowledgment:
 *       "This product includes software developed by the
 *        Apache Software Foundation (http://www.apache.org/)."
 *    Alternately, this acknowledgment may appear in the software itself,
 *    if and wherever such third-party acknowledgments normally appear.
 *
 * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro"
 *    must not be used to endorse or promote products derived from this
 *    software without prior written permission. For written
 *    permission, please contact apache@apache.org.
 *
 * 5.
Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.awk; /** * @version @version@ * @since 1.0 */ class TokenNode extends LeafNode { char _token; TokenNode(char token, int position) { super(position); _token = token; } boolean _matches(char token) { return (_token == token); } SyntaxNode _clone(int pos[]) { return new TokenNode(_token, pos[0]++); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/awk/NegativeCharacterClassNode.java0000644000175000017500000000643307773723336027373 0ustar arnaudarnaud/* * $Id: NegativeCharacterClassNode.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. 
IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.awk; import java.util.*; /** * @version @version@ * @since 1.0 */ final class NegativeCharacterClassNode extends CharacterClassNode { NegativeCharacterClassNode(int position) { super(position); _characterSet.set(LeafNode._END_MARKER_TOKEN); } boolean _matches(char token) { return (!_characterSet.get(token)); } SyntaxNode _clone(int pos[]) { NegativeCharacterClassNode node; node = new NegativeCharacterClassNode(pos[0]++); node._characterSet = (BitSet)_characterSet.clone(); return node; } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/perl/0000755000175000017500000000000010423237774021517 5ustar arnaudarnaudjakarta-oro-2.0.8/src/java/org/apache/oro/text/perl/MalformedPerl5PatternException.java0000644000175000017500000001036707773723336030434 0ustar arnaudarnaud/* * $Id: MalformedPerl5PatternException.java,v 1.7 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. 
IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.perl; import org.apache.oro.text.MalformedCachePatternException; /** * An exception used to indicate errors in Perl style regular expressions. * It is derived from RuntimeException, and therefore does not have to be * caught. You should generally make an effort to catch * MalformedPerl5PatternException whenever you use dynamically generated * patterns (from user input or some other source). Static expressions * represented as strings in your source code don't require exception * handling because as you write and test run your program you will * correct any errors in those expressions when you run into an uncaught * MalformedPerl5PatternException. By the time you complete your * project, those static expressions will be guaranteed to be correct. * However, pieces of code with expressions that you cannot guarantee to * be correct should catch MalformedPerl5PatternException to ensure * reliability. 
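The advice above about dynamically generated patterns can be sketched as follows. Because this example is meant to run without Jakarta-ORO on the classpath, it substitutes the JDK's java.util.regex.PatternSyntaxException, which is likewise an unchecked exception, for MalformedPerl5PatternException. DynamicPatternSketch and compileOrNull are hypothetical names.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class DynamicPatternSketch {

  // Compiles a pattern that comes from user input.  The exception is
  // unchecked, so the compiler will not force this try/catch; it is
  // written out because the input cannot be trusted to be well formed.
  static Pattern compileOrNull(String userSupplied) {
    try {
      return Pattern.compile(userSupplied);
    } catch (PatternSyntaxException e) {
      // A real program would report e.getDescription() to the user.
      return null;
    }
  }

  public static void main(String[] args) {
    System.out.println(compileOrNull("a+b") != null); // true
    System.out.println(compileOrNull("a(b") != null); // false
  }
}
```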
 *
 * @version @version@
 * @since 1.0
 * @see org.apache.oro.text.regex.MalformedPatternException
 */
public final class MalformedPerl5PatternException
  extends MalformedCachePatternException {

  /**
   * Simply calls the corresponding constructor of its superclass.
   */
  public MalformedPerl5PatternException() {
    super();
  }

  /**
   * Simply calls the corresponding constructor of its superclass.
   *

* @param message A message indicating the nature of the error. */ public MalformedPerl5PatternException(String message) { super(message); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/perl/package.html0000644000175000017500000000041207773723336024005 0ustar arnaudarnaud This package used to be the PerlTools library and adds Perl5 regular expression syntactic sugar built on top of the {@link org.apache.oro.text.regex} Perl5 regular expression classes. jakarta-oro-2.0.8/src/java/org/apache/oro/text/perl/Perl5Util.java0000644000175000017500000013602007773723336024221 0ustar arnaudarnaud/* * $Id: Perl5Util.java,v 1.19 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. 
Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text.perl; import java.util.*; import org.apache.oro.text.*; import org.apache.oro.text.regex.*; import org.apache.oro.util.*; /** * This is a utility class implementing the 3 most common Perl5 operations * involving regular expressions: *

    *
  • [m]/pattern/[i][m][s][x], *
  • s/pattern/replacement/[g][i][m][o][s][x], *
  • and split(). *
* As with Perl, any non-alphanumeric character can be used in lieu of * the slashes. *

* The objective of the class is to minimize the amount of code a Java * programmer using Jakarta-ORO * has to write to achieve the same results as Perl by * transparently handling regular expression compilation, caching, and * matching. A second objective is to use the same Perl pattern matching * syntax to ease the task of Perl programmers transitioning to Java * (this also reduces the number of parameters to a method). * All the state affecting methods are synchronized to avoid * the maintenance of explicit locks in multithreaded programs. This * philosophy differs from the * {@link org.apache.oro.text.regex} package, where * you are expected to either maintain explicit locks, or more preferably * create separate compiler and matcher instances for each thread. *

* To use this class, first create an instance using the default constructor * or initialize the instance with a PatternCache of your choosing using * the alternate constructor. The default cache used by Perl5Util is a * PatternCacheLRU of capacity GenericPatternCache.DEFAULT_CAPACITY. You may * want to create a cache with a different capacity, a different * cache replacement policy, or even devise your own PatternCache * implementation. The PatternCacheLRU is probably the best general purpose * pattern cache, but your specific application may be better served by * a different cache replacement policy. You should remember that you can * front-load a cache with all the patterns you will be using before * initializing a Perl5Util instance, or you can just let Perl5Util * fill the cache as you use it. *
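The idea of a least-recently-used replacement policy can be sketched as follows. This is only an illustration, not the PatternCacheLRU implementation: it assumes the JDK's java.util.regex.Pattern and LinkedHashMap in place of the ORO classes, and the class name LruPatternCacheSketch and its methods are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch of an LRU pattern cache; not part of Jakarta-ORO.
final class LruPatternCacheSketch {
    private final LinkedHashMap<String, Pattern> map;

    LruPatternCacheSketch(final int maxEntries) {
        // accessOrder = true keeps the least recently used entry first.
        this.map = new LinkedHashMap<String, Pattern>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Pattern> eldest) {
                // Evict the eldest entry once the capacity is exceeded.
                return size() > maxEntries;
            }
        };
    }

    // Returns a cached compiled pattern, compiling and storing it on a miss.
    synchronized Pattern getPattern(String regex) {
        Pattern compiled = map.get(regex);
        if (compiled == null) {
            compiled = Pattern.compile(regex);
            map.put(regex, compiled);
        }
        return compiled;
    }

    // containsKey does not count as an access, so it leaves the order alone.
    synchronized boolean contains(String regex) {
        return map.containsKey(regex);
    }

    synchronized int size() {
        return map.size();
    }
}
```

Front-loading such a cache amounts to calling getPattern() once for each expression before the cache is handed to the code that uses it.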

* You might use the class as follows: *

 * Perl5Util util = new Perl5Util();
 * String line;
 * DataInputStream input;
 * PrintStream output;
 * 
 * // Initialization of input and output omitted
 * while((line = input.readLine()) != null) {
 *     // First find the line with the string we want to substitute because
 *     // it is cheaper than blindly substituting each line.
 *     if(util.match("/HREF=\"description1.html\"/", line)) {
 *        line = util.substitute("s/description1\\.html/about1.html/", line);
 *     }
 *    output.println(line);
 * }
 * 
*

* A couple of things to remember when using this class are that the * {@link #match match()} methods have the same meaning as * {@link org.apache.oro.text.regex.Perl5Matcher#contains * Perl5Matcher.contains()} * and =~ m/pattern/ in Perl. The methods are named match * to more closely associate them with Perl and to differentiate them * from {@link org.apache.oro.text.regex.Perl5Matcher#matches * Perl5Matcher.matches()}. * A further thing to keep in mind is that the * {@link MalformedPerl5PatternException} class is derived from * RuntimeException which means you DON'T have to catch it. The reasoning * behind this is that you will detect your regular expression mistakes * as you write and debug your program when a MalformedPerl5PatternException * is thrown during a test run. However, we STRONGLY recommend that you * ALWAYS catch MalformedPerl5PatternException whenever you deal with a * DYNAMICALLY created pattern. Relying on a fatal * MalformedPerl5PatternException being thrown to detect errors while * debugging is only useful for dealing with static patterns, that is, actual * pregenerated strings present in your program. Patterns created from user * input or some other dynamic method CANNOT be relied upon to be correct * and MUST be handled by catching MalformedPerl5PatternException for your * programs to be robust. *

* Finally, as a convenience Perl5Util implements * the {@link org.apache.oro.text.regex.MatchResult MatchResult} interface. * The methods are merely wrappers which call the corresponding method of * the last {@link org.apache.oro.text.regex.MatchResult MatchResult} * found (which can be accessed with {@link #getMatch()}) by a match or * substitution (or even a split, but this isn't particularly useful). * At the moment, the * {@link org.apache.oro.text.regex.MatchResult MatchResult} returned * by {@link #getMatch()} is not stored in a thread-local variable. Therefore * concurrent calls to {@link #getMatch()} will produce unpredictable * results. So if your concurrent program requires the match results, * you must protect the matching and the result retrieval in a critical * section. If you do not need match results, you don't need to do anything * special. If you feel the J2SE implementation of {@link #getMatch()} * should use a thread-local variable and obviate the need for a critical * section, please express your views on the oro-dev mailing list. * * @version @version@ * @since 1.0 * @see MalformedPerl5PatternException * @see org.apache.oro.text.PatternCache * @see org.apache.oro.text.PatternCacheLRU * @see org.apache.oro.text.regex.MatchResult */ public final class Perl5Util implements MatchResult { /** The regular expression to use to parse match expression. */ private static final String __matchExpression = "m?(\\W)(.*)\\1([imsx]*)"; /** The pattern cache to compile and store patterns */ private PatternCache __patternCache; /** The hashtable to cache higher-level expressions */ private Cache __expressionCache; /** The pattern matcher to perform matching operations. */ private Perl5Matcher __matcher; /** The compiled match expression parsing regular expression. */ private Pattern __matchPattern; /** The last match from a successful call to a matching method. 
*/ private MatchResult __lastMatch; /** * A container for temporarily holding the results of a split before * deleting trailing empty fields. */ private ArrayList __splitList; /** * Keeps track of the original input (for postMatch() and preMatch()) * methods. This will be discarded if the preMatch() and postMatch() * methods are moved into the MatchResult interface. */ private Object __originalInput; /** * Keeps track of the begin and end offsets of the original input for * the postMatch() and preMatch() methods. */ private int __inputBeginOffset, __inputEndOffset; /** Used for default return value of post and pre Match() */ private static final String __nullString = ""; /** * A constant passed to the {@link #split split()} methods indicating * that all occurrences of a pattern should be used to split a string. */ public static final int SPLIT_ALL = Util.SPLIT_ALL; /** * A secondary constructor for Perl5Util. It initializes the Perl5Matcher * used by the class to perform matching operations, but requires the * programmer to provide a PatternCache instance for the class * to use to compile and store regular expressions. You would want to * use this constructor if you want to change the capacity or policy * of the cache used. Example uses might be: *

   * // We know we're going to use close to 50 expressions a whole lot, so
   * // we create a cache of the proper size.
   * util = new Perl5Util(new PatternCacheLRU(50));
   * 
* or *
   * // We're only going to use a few expressions and know that second-chance
   * // fifo is best suited to the order in which we are using the patterns.
   * util = new Perl5Util(new PatternCacheFIFO2(10));
   * 
*/ public Perl5Util(PatternCache cache) { __splitList = new ArrayList(); __matcher = new Perl5Matcher(); __patternCache = cache; __expressionCache = new CacheLRU(cache.capacity()); __compilePatterns(); } /** * Default constructor for Perl5Util. This initializes the Perl5Matcher * used by the class to perform matching operations and creates a * default PatternCacheLRU instance to use to compile and cache regular * expressions. The size of this cache is * GenericPatternCache.DEFAULT_CAPACITY. */ public Perl5Util() { this(new PatternCacheLRU()); } /** * Compiles the patterns (currently only the match expression) used to * parse Perl5 expressions. Right now it initializes __matchPattern. */ private void __compilePatterns() { Perl5Compiler compiler = new Perl5Compiler(); try { __matchPattern = compiler.compile(__matchExpression, Perl5Compiler.SINGLELINE_MASK); } catch(MalformedPatternException e) { // This should only happen during debugging. //e.printStackTrace(); throw new RuntimeException(e.getMessage()); } } /** * Parses a match expression and returns a compiled pattern. * First checks the expression cache and if the pattern is not found, * then parses the expression and fetches a compiled pattern from the * pattern cache. Otherwise, just uses the pattern found in the * expression cache. __matchPattern is used to parse the expression. *
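How the match expression regular expression takes an expression apart can be sketched with the JDK's java.util.regex, used here only as a stand-in for the ORO classes. The class MatchExpressionSketch and its methods are hypothetical, Pattern.DOTALL is assumed to play the role of SINGLELINE_MASK, and the flag mapping in toFlags() is an assumed analogue of the Perl5Compiler masks.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; mirrors the shape of __matchExpression only.
final class MatchExpressionSketch {
    // Optional m, a non-word delimiter, the regex, the same delimiter, options.
    private static final Pattern EXPRESSION =
        Pattern.compile("m?(\\W)(.*)\\1([imsx]*)", Pattern.DOTALL);

    // Returns { delimiter, regex, options } or throws if the form is invalid.
    static String[] parse(String expression) {
        Matcher matcher = EXPRESSION.matcher(expression);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Invalid expression: " + expression);
        }
        return new String[] { matcher.group(1), matcher.group(2), matcher.group(3) };
    }

    // Translates the trailing option letters into java.util.regex flags.
    static int toFlags(String options) {
        int flags = 0;
        for (int i = 0; i < options.length(); i++) {
            switch (options.charAt(i)) {
                case 'i': flags |= Pattern.CASE_INSENSITIVE; break;
                case 'm': flags |= Pattern.MULTILINE; break;
                case 's': flags |= Pattern.DOTALL; break;
                case 'x': flags |= Pattern.COMMENTS; break;
                default:
                    throw new IllegalArgumentException("Invalid options: " + options);
            }
        }
        return flags;
    }
}
```

Because the regex group is greedy, the last occurrence of the delimiter that is followed only by option letters closes the expression, so an expression written with hashmarks may contain unescaped slashes.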

* @param pattern The Perl5 match expression to parse. * @exception MalformedPerl5PatternException If there is an error parsing * the expression. */ private Pattern __parseMatchExpression(String pattern) throws MalformedPerl5PatternException { int index, compileOptions; String options, regex; MatchResult result; Object obj; Pattern ret; obj = __expressionCache.getElement(pattern); // Must catch ClassCastException because someone might incorrectly // pass an s/// expression. try block is cheaper than checking // instanceof try { if(obj != null) return (Pattern)obj; } catch(ClassCastException e) { // Fall through and parse expression } if(!__matcher.matches(pattern, __matchPattern)) throw new MalformedPerl5PatternException("Invalid expression: " + pattern); result = __matcher.getMatch(); regex = result.group(2); compileOptions = Perl5Compiler.DEFAULT_MASK; options = result.group(3); if(options != null) { index = options.length(); while(index-- > 0) { switch(options.charAt(index)) { case 'i' : compileOptions |= Perl5Compiler.CASE_INSENSITIVE_MASK; break; case 'm' : compileOptions |= Perl5Compiler.MULTILINE_MASK; break; case 's' : compileOptions |= Perl5Compiler.SINGLELINE_MASK; break; case 'x' : compileOptions |= Perl5Compiler.EXTENDED_MASK; break; default : throw new MalformedPerl5PatternException("Invalid options: " + options); } } } ret = __patternCache.getPattern(regex, compileOptions); __expressionCache.addElement(pattern, ret); return ret; } /** * Searches for the first pattern match somewhere in a character array * taking a pattern specified in Perl5 native format: *

   * [m]/pattern/[i][m][s][x]
   * 
* The m prefix is optional and the meaning of the optional * trailing options are: *
*
i
case insensitive match *
m
treat the input as consisting of multiple lines *
s
treat the input as consisting of a single line *
x
enable extended expression syntax incorporating whitespace * and comments *
* As with Perl, any non-alphanumeric character can be used in lieu of * the slashes. *

* If the input contains the pattern, the org.apache.oro.text.regex.MatchResult * can be obtained by calling {@link #getMatch()}. * However, Perl5Util implements the MatchResult interface as a wrapper * around the last MatchResult found, so you can call its methods to * access match information. *

* @param pattern The pattern to search for. * @param input The char[] input to search. * @return True if the input contains the pattern, false otherwise. * @exception MalformedPerl5PatternException If there is an error in * the pattern. You are not forced to catch this exception * because it is derived from RuntimeException. */ public synchronized boolean match(String pattern, char[] input) throws MalformedPerl5PatternException { boolean result; result = __matcher.contains(input, __parseMatchExpression(pattern)); if(result) { __lastMatch = __matcher.getMatch(); __originalInput = input; __inputBeginOffset = 0; __inputEndOffset = input.length; } return result; } /** * Searches for the first pattern match in a String taking * a pattern specified in Perl5 native format: *

   * [m]/pattern/[i][m][s][x]
   * 
* The m prefix is optional and the meaning of the optional * trailing options are: *
*
i
case insensitive match *
m
treat the input as consisting of multiple lines *
s
treat the input as consisting of a single line *
x
enable extended expression syntax incorporating whitespace * and comments *
* As with Perl, any non-alphanumeric character can be used in lieu of * the slashes. *

* If the input contains the pattern, the * {@link org.apache.oro.text.regex.MatchResult MatchResult} * can be obtained by calling {@link #getMatch()}. * However, Perl5Util implements the MatchResult interface as a wrapper * around the last MatchResult found, so you can call its methods to * access match information. *

* @param pattern The pattern to search for. * @param input The String input to search. * @return True if the input contains the pattern, false otherwise. * @exception MalformedPerl5PatternException If there is an error in * the pattern. You are not forced to catch this exception * because it is derived from RuntimeException. */ public synchronized boolean match(String pattern, String input) throws MalformedPerl5PatternException { return match(pattern, input.toCharArray()); } /** * Searches for the next pattern match somewhere in a * org.apache.oro.text.regex.PatternMatcherInput instance, taking * a pattern specified in Perl5 native format: *

   * [m]/pattern/[i][m][s][x]
   * 
* The m prefix is optional and the meaning of the optional * trailing options are: *
*
i
case insensitive match *
m
treat the input as consisting of multiple lines *
s
treat the input as consisting of a single line *
x
enable extended expression syntax incorporating whitespace * and comments *
* As with Perl, any non-alphanumeric character can be used in lieu of * the slashes. *

* If the input contains the pattern, the * {@link org.apache.oro.text.regex.MatchResult MatchResult} * can be obtained by calling {@link #getMatch()}. * However, Perl5Util implements the MatchResult interface as a wrapper * around the last MatchResult found, so you can call its methods to * access match information. * After the call to this method, the PatternMatcherInput current offset * is advanced to the end of the match, so you can use it to repeatedly * search for expressions in the entire input using a while loop as * explained in the {@link org.apache.oro.text.regex.PatternMatcherInput * PatternMatcherInput} documentation. *

* @param pattern The pattern to search for. * @param input The PatternMatcherInput to search. * @return True if the input contains the pattern, false otherwise. * @exception MalformedPerl5PatternException If there is an error in * the pattern. You are not forced to catch this exception * because it is derived from RuntimeException. */ public synchronized boolean match(String pattern, PatternMatcherInput input) throws MalformedPerl5PatternException { boolean result; result = __matcher.contains(input, __parseMatchExpression(pattern)); if(result) { __lastMatch = __matcher.getMatch(); __originalInput = input.getInput(); __inputBeginOffset = input.getBeginOffset(); __inputEndOffset = input.getEndOffset(); } return result; } /** * Returns the last match found by a call to a match(), substitute(), or * split() method. This method is only intended to retrieve the match * found by the most recent call to a match() method. This method should * be used when you want to save MatchResult instances. Otherwise, for * simply accessing match information, it is more convenient to use the * Perl5Util methods implementing the MatchResult interface. *

* @return The org.apache.oro.text.regex.MatchResult instance containing the * last match found. */ public synchronized MatchResult getMatch() { return __lastMatch; } /** * Substitutes a pattern in a given input with a replacement string. * The substitution expression is specified in Perl5 native format: *

   * s/pattern/replacement/[g][i][m][o][s][x]
   * 
* The s prefix is mandatory and the meaning of the optional * trailing options are: *
*
g
Substitute all occurrences of pattern with replacement. * The default is to replace only the first occurrence. *
i
perform a case insensitive match *
m
treat the input as consisting of multiple lines *
o
If variable interpolation is used, only evaluate the * interpolation once (the first time). This is equivalent * to using a numInterpolations argument of 1 in * {@link org.apache.oro.text.regex.Util#substitute Util.substitute()}. * The default is to compute each interpolation independently. * See * {@link org.apache.oro.text.regex.Util#substitute Util.substitute()} * and {@link org.apache.oro.text.regex.Perl5Substitution Perl5Substitution} * for more details on variable interpolation in * substitutions. *
s
treat the input as consisting of a single line *
x
enable extended expression syntax incorporating whitespace * and comments *
* As with Perl, any non-alphanumeric character can be used in lieu of * the slashes. This is helpful to avoid backslashing. For example, * using slashes you would have to do: *
   * numSubs = util.substitute(result, "s/foo\\/bar/goo\\/\\/baz/", input);
   * 
* when you could more easily write: *
   * numSubs = util.substitute(result, "s#foo/bar#goo//baz#", input);
   * 
* where the hashmarks are used instead of slashes. *

* There is a special case of backslashing that you need to pay attention * to. As demonstrated above, to denote a delimiter in the substituted * string it must be backslashed. However, this can be a problem * when you want to denote a backslash at the end of the substituted * string. As of PerlTools 1.3, a new means of handling this * situation has been implemented. * In previous versions, the behavior was that *

* "... a double backslash (quadrupled in the Java String) always * represents two backslashes unless the second backslash is followed * by the delimiter, in which case it represents a single backslash." *
*

* The new behavior is that a backslash is always a backslash * in the substitution portion of the expression unless it is used to * escape a delimiter. A backslash is considered to escape a delimiter * if an even number of contiguous backslashes precede the backslash * and the delimiter following the backslash is not the FINAL delimiter * in the expression. Therefore, backslashes preceding final delimiters * are never considered to escape the delimiter. The following, which * used to be an invalid expression and require a special-case extra * backslash, will now replace all instances of / with \: *

   * numSubs = util.substitute(result, "s#/#\\#g", input);
   * 
*
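The delimiter and backslash rules described above can be followed in a standalone sketch. It mirrors the parsing loops of this method but compiles nothing and does not validate the option letters. The class SubstitutionExpressionSketch and its parse() method are hypothetical names introduced for illustration only.

```java
// Hypothetical sketch of the parsing rules; not part of Jakarta-ORO.
final class SubstitutionExpressionSketch {
    // Returns { pattern, replacement, options } for a substitution expression.
    static String[] parse(String expression) {
        char[] exp = expression.toCharArray();
        if (exp.length < 4 || exp[0] != 's'
            || Character.isLetterOrDigit(exp[1]) || exp[1] == '-') {
            throw new IllegalArgumentException("Invalid expression: " + expression);
        }
        char delimiter = exp[1];
        int secondOffset = -1;
        int thirdOffset = -1;

        // Pattern part: ends at the first delimiter not escaped by a backslash.
        boolean backslash = false;
        for (int i = 2; i < exp.length; i++) {
            if (exp[i] == '\\') {
                backslash = !backslash;
            } else if (exp[i] == delimiter && !backslash) {
                secondOffset = i;
                break;
            } else {
                backslash = false;
            }
        }
        if (secondOffset == -1 || secondOffset == exp.length - 1) {
            throw new IllegalArgumentException("Invalid expression: " + expression);
        }

        // Replacement part: a backslash escapes a delimiter only when it is an
        // odd backslash and that delimiter is not the final one.
        StringBuilder replacement = new StringBuilder();
        int lastDelimiter = expression.lastIndexOf(delimiter);
        boolean finalDelimiter = true;
        backslash = false;
        for (int i = secondOffset + 1; i < exp.length; i++) {
            if (exp[i] == '\\') {
                backslash = !backslash;
                if (backslash && i + 1 < exp.length && exp[i + 1] == delimiter
                    && lastDelimiter != i + 1) {
                    finalDelimiter = false;
                    continue; // drop the escaping backslash itself
                }
            } else if (exp[i] == delimiter && finalDelimiter) {
                thirdOffset = i;
                break;
            } else {
                backslash = false;
                finalDelimiter = true;
            }
            replacement.append(exp[i]);
        }
        if (thirdOffset == -1) {
            throw new IllegalArgumentException("Invalid expression: " + expression);
        }
        return new String[] {
            new String(exp, 2, secondOffset - 2),
            replacement.toString(),
            new String(exp, thirdOffset + 1, exp.length - thirdOffset - 1)
        };
    }
}
```

With this sketch, the hashmark expression from the example above yields the pattern /, a replacement consisting of a single backslash, and the option g.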

* @param result The StringBuffer in which to store the result of the * substitutions. The buffer is only appended to. * @param expression The Perl5 substitution regular expression. * @param input The input on which to perform substitutions. * @return The number of substitutions made. * @exception MalformedPerl5PatternException If there is an error in * the expression. You are not forced to catch this exception * because it is derived from RuntimeException. * @since 2.0.6 */ // Expression parsing will have to be moved into a separate method if // there are going to be variations of this method. public synchronized int substitute(StringBuffer result, String expression, String input) throws MalformedPerl5PatternException { boolean backslash, finalDelimiter; int index, compileOptions, numSubstitutions, numInterpolations; int firstOffset, secondOffset, thirdOffset, subCount; StringBuffer replacement; Pattern compiledPattern; char exp[], delimiter; ParsedSubstitutionEntry entry; Perl5Substitution substitution; Object obj; obj = __expressionCache.getElement(expression); __nullTest: if(obj != null) { // Must catch ClassCastException because someone might incorrectly // pass an m// expression. try block is cheaper than checking // instanceof. We want to go ahead with parsing just in case so // we break. try { entry = (ParsedSubstitutionEntry)obj; } catch(ClassCastException e) { break __nullTest; } subCount = Util.substitute(result, __matcher, entry._pattern, entry._substitution, input, entry._numSubstitutions); __lastMatch = __matcher.getMatch(); return subCount; } exp = expression.toCharArray(); // Make sure basic conditions for a valid substitution expression hold. 
if(exp.length < 4 || exp[0] != 's' || Character.isLetterOrDigit(exp[1]) || exp[1] == '-') throw new MalformedPerl5PatternException("Invalid expression: " + expression); delimiter = exp[1]; firstOffset = 2; secondOffset = thirdOffset = -1; backslash = false; // Parse pattern for(index = firstOffset; index < exp.length; index++) { if(exp[index] == '\\') backslash = !backslash; else if(exp[index] == delimiter && !backslash) { secondOffset = index; break; } else if(backslash) backslash = !backslash; } if(secondOffset == -1 || secondOffset == exp.length - 1) throw new MalformedPerl5PatternException("Invalid expression: " + expression); // Parse replacement string backslash = false; finalDelimiter = true; replacement = new StringBuffer(exp.length - secondOffset); for(index = secondOffset + 1; index < exp.length; index++) { if(exp[index] == '\\') { backslash = !backslash; // 05/05/99 dfs // We unbackslash backslashed delimiters in the replacement string // only if we're on an odd backslash and there is another occurrence // of a delimiter later in the string. 
if(backslash && index + 1 < exp.length && exp[index + 1] == delimiter && expression.lastIndexOf(delimiter, exp.length - 1) != (index + 1)) { finalDelimiter = false; continue; } } else if(exp[index] == delimiter && finalDelimiter) { thirdOffset = index; break; } else { backslash = false; finalDelimiter = true; } replacement.append(exp[index]); } if(thirdOffset == -1) throw new MalformedPerl5PatternException("Invalid expression: " + expression); compileOptions = Perl5Compiler.DEFAULT_MASK; numSubstitutions = 1; // Single quotes cause no interpolations to be performed in replacement if(delimiter != '\'') numInterpolations = Perl5Substitution.INTERPOLATE_ALL; else numInterpolations = Perl5Substitution.INTERPOLATE_NONE; // Parse options for(index = thirdOffset + 1; index < exp.length; index++) { switch(exp[index]) { case 'i' : compileOptions |= Perl5Compiler.CASE_INSENSITIVE_MASK; break; case 'm' : compileOptions |= Perl5Compiler.MULTILINE_MASK; break; case 's' : compileOptions |= Perl5Compiler.SINGLELINE_MASK; break; case 'x' : compileOptions |= Perl5Compiler.EXTENDED_MASK; break; case 'g' : numSubstitutions = Util.SUBSTITUTE_ALL; break; case 'o' : numInterpolations = 1; break; default : throw new MalformedPerl5PatternException("Invalid option: " + exp[index]); } } compiledPattern = __patternCache.getPattern(new String(exp, firstOffset, secondOffset - firstOffset), compileOptions); substitution = new Perl5Substitution(replacement.toString(), numInterpolations); entry = new ParsedSubstitutionEntry(compiledPattern, substitution, numSubstitutions); __expressionCache.addElement(expression, entry); subCount = Util.substitute(result, __matcher, compiledPattern, substitution, input, numSubstitutions); __lastMatch = __matcher.getMatch(); return subCount; } /** * Substitutes a pattern in a given input with a replacement string. * The substitution expression is specified in Perl5 native format. *

*
Calling this method is the same as:
*
*
   *      String result;
   *      StringBuffer buffer = new StringBuffer();
   *      perl.substitute(buffer, expression, input);
   *      result = buffer.toString();
   *     
*
*
* @param expression The Perl5 substitution regular expression. * @param input The input on which to perform substitutions. * @return The input as a String after substitutions have been performed. * @exception MalformedPerl5PatternException If there is an error in * the expression. You are not forced to catch this exception * because it is derived from RuntimeException. * @since 1.0 * @see #substitute */ public synchronized String substitute(String expression, String input) throws MalformedPerl5PatternException { StringBuffer result = new StringBuffer(); substitute(result, expression, input); return result.toString(); } /** * Splits a String into strings that are appended to a List, but no more * than a specified limit. The String is split using a regular expression * as the delimiter. The regular expression is a pattern specified * in Perl5 native format: *
   * [m]/pattern/[i][m][s][x]
   * 
* The m prefix is optional and the meaning of the optional * trailing options are: *
*
i
case insensitive match *
m
treat the input as consisting of multiple lines *
s
treat the input as consisting of a single line *
x
enable extended expression syntax incorporating whitespace * and comments *
* As with Perl, any non-alphanumeric character can be used in lieu of * the slashes. *

* The limit parameter causes the string to be split on at most the first * limit - 1 occurrences of the pattern. *

* Of special note is that this split method performs EXACTLY the same * as the Perl split() function. In other words, if the split pattern * contains parentheses, additional elements are created from * each of the matching subgroups in the pattern. Using an example * similar to the one from the Camel book: *

   * split(list, "/([,-])/", "8-12,15,18")
   * 
* produces a list containing: *
   * { "8", "-", "12", ",", "15", ",", "18" }
   * 
* Furthermore, the following Perl behavior is observed: "leading empty * fields are preserved, and empty trailing ones are deleted." This * has the effect that a split on a zero length string returns an empty * list. * The {@link org.apache.oro.text.regex.Util#split Util.split()} method * does NOT implement these behaviors because it is intended to * be a general self-consistent and predictable split function usable * with Pattern instances other than Perl5Pattern. *
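The behaviors described above (subgroups become fields, leading empty fields are kept, trailing empty fields are dropped) can be sketched with the JDK's java.util.regex standing in for the ORO matcher. The class PerlSplitSketch is hypothetical, a limit of 0 is assumed to mean "split on all occurrences", and zero-length matches may be handled differently than by Perl5Matcher.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of Perl-style splitting; not part of Jakarta-ORO.
final class PerlSplitSketch {
    static List<String> split(Pattern delimiter, String input, int limit) {
        List<String> fields = new ArrayList<String>();
        Matcher matcher = delimiter.matcher(input);
        int begin = 0;

        // Only the first (limit - 1) matches are used when limit is positive.
        while (--limit != 0 && matcher.find()) {
            fields.add(input.substring(begin, matcher.start()));
            // Non-empty parenthesized subgroups become fields of their own.
            for (int g = 1; g <= matcher.groupCount(); g++) {
                String group = matcher.group(g);
                if (group != null && group.length() > 0) {
                    fields.add(group);
                }
            }
            begin = matcher.end();
        }
        fields.add(input.substring(begin));

        // Remove trailing empty fields; leading empty fields are preserved.
        for (int i = fields.size() - 1; i >= 0 && fields.get(i).length() == 0; i--) {
            fields.remove(i);
        }
        return fields;
    }
}
```

Splitting an empty string with this sketch produces one empty field, which is then removed as a trailing empty field, leaving an empty list.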

* @param results * A Collection to which the substrings of the input * that occur between the regular expression delimiter occurrences * are appended. The input will not be split into any more substrings * than the specified * limit. A way of thinking of this is that only the first * limit - 1 * matches of the delimiting regular expression will be used to split the * input. The Collection must support the * addAll(Collection) operation. * @param pattern The regular expression to use as a split delimiter. * @param input The String to split. * @param limit The limit on the number of substrings appended to results. * Values <= 0 produce the same behavior as the SPLIT_ALL constant which * causes the limit to be ignored and splits to be performed on all * occurrences of the pattern. You should use the SPLIT_ALL constant * to achieve this behavior instead of relying on the default behavior * associated with non-positive limit values. * @exception MalformedPerl5PatternException If there is an error in * the expression. You are not forced to catch this exception * because it is derived from RuntimeException. */ public synchronized void split(Collection results, String pattern, String input, int limit) throws MalformedPerl5PatternException { int beginOffset, groups, index; String group; MatchResult currentResult = null; PatternMatcherInput pinput; Pattern compiledPattern; compiledPattern = __parseMatchExpression(pattern); pinput = new PatternMatcherInput(input); beginOffset = 0; while(--limit != 0 && __matcher.contains(pinput, compiledPattern)) { currentResult = __matcher.getMatch(); __splitList.add(input.substring(beginOffset, currentResult.beginOffset(0))); if((groups = currentResult.groups()) > 1) { for(index = 1; index < groups; ++index) { group = currentResult.group(index); if(group != null && group.length() > 0) __splitList.add(group); } } beginOffset = currentResult.endOffset(0); } __splitList.add(input.substring(beginOffset, input.length())); // Remove all trailing empty fields.
for(int i = __splitList.size() - 1; i >= 0; --i) { String str; str = (String)__splitList.get(i); if(str.length() == 0) __splitList.remove(i); else break; } results.addAll(__splitList); __splitList.clear(); // Just for the sake of completeness __lastMatch = currentResult; } /** * This method is identical to calling: *

   * split(results, pattern, input, SPLIT_ALL);
   * 
*/ public synchronized void split(Collection results, String pattern, String input) throws MalformedPerl5PatternException { split(results, pattern, input, SPLIT_ALL); } /** * Splits input in the default Perl manner, splitting on all whitespace. * This method is identical to calling: *
   * split(results, "/\\s+/", input);
   * 
*/ public synchronized void split(Collection results, String input) throws MalformedPerl5PatternException { split(results, "/\\s+/", input); } /** * Splits a String into strings contained in a Vector of size no greater * than a specified limit. The String is split using a regular expression * as the delimiter. The regular expression is a pattern specified * in Perl5 native format: *
   * [m]/pattern/[i][m][s][x]
   * 
* The m prefix is optional and the meaning of the optional * trailing options are: *
*
i
case insensitive match *
m
treat the input as consisting of multiple lines *
s
treat the input as consisting of a single line *
x
enable extended expression syntax incorporating whitespace * and comments *
* As with Perl, any non-alphanumeric character can be used in lieu of * the slashes. *

* The limit parameter causes the string to be split on at most the first * limit - 1 occurrences of the pattern. *

* Of special note is that this split method performs EXACTLY the same * as the Perl split() function. In other words, if the split pattern * contains parentheses, additional Vector elements are created from * each of the matching subgroups in the pattern. Using an example * similar to the one from the Camel book: *

   * split("/([,-])/", "8-12,15,18")
   * 
* produces the Vector containing: *
   * { "8", "-", "12", ",", "15", ",", "18" }
   * 
* The {@link org.apache.oro.text.regex.Util#split Util.split()} method * does NOT implement this particular behavior because it is intended to * be usable with Pattern instances other than Perl5Pattern. *

* @deprecated Use * {@link #split(Collection results, String pattern, String input, int limit)} * instead. * @param pattern The regular expression to use as a split delimiter. * @param input The String to split. * @param limit The limit on the size of the returned Vector. * Values <= 0 produce the same behavior as the SPLIT_ALL constant which * causes the limit to be ignored and splits to be performed on all * occurrences of the pattern. You should use the SPLIT_ALL constant * to achieve this behavior instead of relying on the default behavior * associated with non-positive limit values. * @return A Vector containing the substrings of the input * that occur between the regular expression delimiter occurrences. The * input will not be split into any more substrings than the specified * limit. A way of thinking of this is that only the first * limit - 1 * matches of the delimiting regular expression will be used to split the * input. * @exception MalformedPerl5PatternException If there is an error in * the expression. You are not forced to catch this exception * because it is derived from RuntimeException. */ public synchronized Vector split(String pattern, String input, int limit) throws MalformedPerl5PatternException { Vector results = new Vector(20); split(results, pattern, input, limit); return results; } /** * This method is identical to calling: *

   * split(pattern, input, SPLIT_ALL);
   * 
* @deprecated Use * {@link #split(Collection results, String pattern, String input)} instead. */ public synchronized Vector split(String pattern, String input) throws MalformedPerl5PatternException { return split(pattern, input, SPLIT_ALL); } /** * Splits input in the default Perl manner, splitting on all whitespace. * This method is identical to calling: *
   * split("/\\s+/", input);
   * 
* @deprecated Use * {@link #split(Collection results, String input)} instead. */ public synchronized Vector split(String input) throws MalformedPerl5PatternException { return split("/\\s+/", input); } // // MatchResult interface methods. // /** * Returns the length of the last match found. *

* @return The length of the last match found. */ public synchronized int length() { return __lastMatch.length(); } /** * @return The number of groups contained in the last match found. * This number includes the 0th group. In other words, the * result refers to the number of parenthesized subgroups plus * the entire match itself. */ public synchronized int groups() { return __lastMatch.groups(); } /** * Returns the contents of the parenthesized subgroups of the last match * found according to the behavior dictated by the MatchResult interface. *

   * @param group The pattern subgroup to return.
   * @return A string containing the indicated pattern subgroup.  Group
   *         0 always refers to the entire match.  If a group was never
   *         matched, it returns null.  This is not to be confused with
   *         a group matching the null string, which will return a String
   *         of length 0.
   */
  public synchronized String group(int group) {
    return __lastMatch.group(group);
  }

  /**
   * Returns the begin offset of the subgroup of the last match found
   * relative to the beginning of the match.
   *

   * @param group The pattern subgroup.
   * @return The offset into group 0 of the first token in the indicated
   *         pattern subgroup.  If a group was never matched or does
   *         not exist, returns -1.  Be aware that a group that matches
   *         the null string at the end of a match will have an offset
   *         equal to the length of the string, so you shouldn't blindly
   *         use the offset to index an array or String.
   */
  public synchronized int begin(int group) {
    return __lastMatch.begin(group);
  }

  /**
   * Returns the end offset of the subgroup of the last match found
   * relative to the beginning of the match.
   *

* @param group The pattern subgroup. * @return Returns one plus the offset into group 0 of the last token in * the indicated pattern subgroup. If a group was never matched * or does not exist, returns -1. A group matching the null * string will return its start offset. */ public synchronized int end(int group) { return __lastMatch.end(group); } /** * Returns an offset marking the beginning of the last pattern match * found relative to the beginning of the input from which the match * was extracted. *

* @param group The pattern subgroup. * @return The offset of the first token in the indicated * pattern subgroup. If a group was never matched or does * not exist, returns -1. */ public synchronized int beginOffset(int group) { return __lastMatch.beginOffset(group); } /** * Returns an offset marking the end of the last pattern match found * relative to the beginning of the input from which the match was * extracted. *

* @param group The pattern subgroup. * @return Returns one plus the offset of the last token in * the indicated pattern subgroup. If a group was never matched * or does not exist, returns -1. A group matching the null * string will return its start offset. */ public synchronized int endOffset(int group) { return __lastMatch.endOffset(group); } /** * Returns the same as group(0). *

* @return A string containing the entire match. */ public synchronized String toString() { if(__lastMatch == null) return null; return __lastMatch.toString(); } /** * Returns the part of the input preceding the last match found. *

   * @return The part of the input preceding the last match found.
   */
  public synchronized String preMatch() {
    int begin;

    if(__originalInput == null)
      return __nullString;

    begin = __lastMatch.beginOffset(0);

    if(begin <= 0)
      return __nullString;

    if(__originalInput instanceof char[]) {
      char[] input;

      input = (char[])__originalInput;

      // Just in case we make sure begin offset is in bounds.  It should
      // be but we're paranoid.
      if(begin > input.length)
        begin = input.length;

      // The third argument of String(char[], int, int) is a character
      // count, not an end index, so subtract the starting offset.
      return new String(input, __inputBeginOffset,
                        begin - __inputBeginOffset);
    } else if(__originalInput instanceof String) {
      String input;

      input = (String)__originalInput;

      // Just in case we make sure begin offset is in bounds.  It should
      // be but we're paranoid.
      if(begin > input.length())
        begin = input.length();

      return input.substring(__inputBeginOffset, begin);
    }

    return __nullString;
  }

  /**
   * Returns the part of the input following the last match found.
   *

* @return The part of the input following the last match found. */ public synchronized String postMatch() { int end; if(__originalInput == null) return __nullString; end = __lastMatch.endOffset(0); if(end < 0) return __nullString; if(__originalInput instanceof char[]) { char[] input; input = (char[])__originalInput; // Just in case we make sure begin offset is in bounds. It should // be but we're paranoid. if(end >= input.length) return __nullString; return new String(input, end, __inputEndOffset - end); } else if(__originalInput instanceof String) { String input; input = (String)__originalInput; // Just in case we make sure begin offset is in bounds. It should // be but we're paranoid. if(end >= input.length()) return __nullString; return input.substring(end, __inputEndOffset); } return __nullString; } /** * Returns the part of the input preceding the last match found as a * char array. This method eliminates the extra * buffer copying caused by preMatch().toCharArray(). *

   * @return The part of the input preceding the last match found as a char[].
   *         If the result is of zero length, returns null instead of a zero
   *         length array.
   */
  public synchronized char[] preMatchCharArray() {
    int begin;
    char[] result = null;

    if(__originalInput == null)
      return null;

    begin = __lastMatch.beginOffset(0);

    if(begin <= 0)
      return null;

    if(__originalInput instanceof char[]) {
      char[] input;

      input = (char[])__originalInput;

      // Just in case we make sure begin offset is in bounds.  It should
      // be but we're paranoid.
      if(begin >= input.length)
        begin = input.length;

      result = new char[begin - __inputBeginOffset];
      System.arraycopy(input, __inputBeginOffset, result, 0, result.length);
    } else if(__originalInput instanceof String) {
      String input;

      input = (String)__originalInput;

      // Just in case we make sure begin offset is in bounds.  It should
      // be but we're paranoid.
      if(begin >= input.length())
        begin = input.length();

      result = new char[begin - __inputBeginOffset];
      input.getChars(__inputBeginOffset, begin, result, 0);
    }

    return result;
  }

  /**
   * Returns the part of the input following the last match found as a char
   * array.  This method eliminates the extra buffer copying caused by
   * postMatch().toCharArray().
   *

* @return The part of the input following the last match found as a char[]. * If the result is of zero length, returns null instead of a zero * length array. */ public synchronized char[] postMatchCharArray() { int end; char[] result = null; if(__originalInput == null) return null; end = __lastMatch.endOffset(0); if(end < 0) return null; if(__originalInput instanceof char[]) { int length; char[] input; input = (char[])__originalInput; // Just in case we make sure begin offset is in bounds. It should // be but we're paranoid. if(end >= input.length) return null; length = __inputEndOffset - end; result = new char[length]; System.arraycopy(input, end, result, 0, length); } else if(__originalInput instanceof String) { String input; input = (String)__originalInput; // Just in case we make sure begin offset is in bounds. It should // be but we're paranoid. if(end >= __inputEndOffset) return null; result = new char[__inputEndOffset - end]; input.getChars(end, __inputEndOffset, result, 0); } return result; } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/perl/ParsedSubstitutionEntry.java0000644000175000017500000000623607773723336027276 0ustar arnaudarnaud/* * $Id: ParsedSubstitutionEntry.java,v 1.7 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. 
The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . 
*/ package org.apache.oro.text.perl; import org.apache.oro.text.regex.*; /** * @version @version@ * @since 1.0 */ final class ParsedSubstitutionEntry { int _numSubstitutions; Pattern _pattern; Perl5Substitution _substitution; ParsedSubstitutionEntry(Pattern pattern, Perl5Substitution substitution, int numSubstitutions) { _numSubstitutions = numSubstitutions; _substitution = substitution; _pattern = pattern; } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/MatchActionInfo.java0000644000175000017500000001123307773723336024436 0ustar arnaudarnaud/* * $Id: MatchActionInfo.java,v 1.8 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. 
Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text; import java.util.*; import java.io.*; import org.apache.oro.text.regex.*; /** * This class is used to provide information regarding a match found by * MatchActionProcessor to a MatchAction callback implementation. * * @version @version@ * @since 1.0 * @see MatchAction * @see MatchActionProcessor */ public final class MatchActionInfo { /** The line number of the matching line */ public int lineNumber; /** * The String representation of the matching line with the trailing * newline truncated. */ public String line; /** * The char[] representation of the matching line with the trailing * newline truncated. */ public char[] charLine; /** * The field separator used by the MatchActionProcessor. 
This will be * set to null by a MatchActionProcessor instance if no field separator * was specified before match processing began. */ public Pattern fieldSeparator; /** * A List of Strings containing the fields of the line that were * separated out by the fieldSeparator. If no field separator was * specified, this variable will be set to null. */ public List fields; /** The PatternMatcher used to find the match. */ public PatternMatcher matcher; /** * The pattern found in the line of input. If a MatchAction callback * is registered with a null pattern (meaning the callback should be * applied to every line of input), this value will be null. */ public Pattern pattern; /** * The first match found in the line of input. If a MatchAction callback * is registered with a null pattern (meaning the callback should be * applied to every line of input), this value will be null. */ public MatchResult match; /** The output stream passed to the MatchActionProcessor. */ public PrintWriter output; /** * The input stream passed to the MatchActionProcessor from which the * matching line was read. */ public BufferedReader input; } jakarta-oro-2.0.8/src/java/org/apache/oro/text/MatchActionProcessor.java0000644000175000017500000004015407773723336025526 0ustar arnaudarnaud/* * $Id: MatchActionProcessor.java,v 1.10 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. 
For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text; import java.io.*; import java.util.*; import org.apache.oro.text.regex.*; /** * The MatchActionProcessor class provides AWK-like line by line filtering * of a text stream, pattern action pair association, and field splitting * based on a registered separator. However, the class can be used with * any compatible PatternMatcher/PatternCompiler implementations and * need not use the AWK matching classes in org.apache.oro.text.awk. In fact, * the default matcher and compiler used by the class are Perl5Matcher and * Perl5Compiler from org.apache.oro.text.regex. *

 * To understand fully how to use MatchActionProcessor, you should first
 * look at {@link MatchAction} and {@link MatchActionInfo}.
 * A MatchActionProcessor is first initialized with
 * the desired PatternCompiler and PatternMatcher instances used to compile
 * patterns and perform matches.  Then, optionally, a field separator may
 * be registered with {@link #setFieldSeparator setFieldSeparator()}.
 * Finally, as many pattern action pairs as desired are registered with
 * {@link #addAction addAction()} before processing the input
 * with {@link #processMatches processMatches()}.  Pattern action
 * pairs are processed in the order they were registered.
 *

* The look of added actions can closely mirror that of AWK when anonymous * classes are used. Here's an example of how you might use * MatchActionProcessor to extract only the second column of a semicolon * delimited file: *

*

 * import java.io.*;
 *
 * import org.apache.oro.text.*;
 * import org.apache.oro.text.regex.*;
 *
 * public final class semicolon {
 *
 *  public static final void main(String[] args) {
 *    MatchActionProcessor processor = new MatchActionProcessor();
 *
 *    try {
 *      processor.setFieldSeparator(";");
 *      // Using a null pattern means to perform the action for every line.
 *      processor.addAction(null, new MatchAction() {
 *        public void processMatch(MatchActionInfo info) {
 *          // We assume the second column exists.  MatchActionInfo.fields
 *          // is a java.util.List, so use get() rather than elementAt().
 *          info.output.println(info.fields.get(1));
 *        }
 *      });
 *    } catch(MalformedPatternException e) {
 *      e.printStackTrace();
 *      System.exit(1);
 *    }
 *
 *    try {
 *      processor.processMatches(System.in, System.out);
 *    } catch(IOException e) {
 *      e.printStackTrace();
 *      System.exit(1);
 *    }
 *  }
 * }
 * 
* You can redirect the following sample input to stdin to test the code: *
 * 1;Trenton;New Jersey
 * 2;Annapolis;Maryland
 * 3;Austin;Texas
 * 4;Richmond;Virginia
 * 5;Harrisburg;Pennsylvania
 * 6;Honolulu;Hawaii
 * 7;Santa Fe;New Mexico
 * 
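 * The pattern action dispatch described above can be sketched without the
 * ORO classes. The following is a minimal illustration that uses
 * java.util.regex; the names PatternActionSketch, Rule and process are
 * hypothetical and not part of this package. It shows a null pattern
 * firing on every line and rules running in registration order, and it
 * leaves out the stream handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.regex.Pattern;

public class PatternActionSketch {

  // One registered pair.  A null pattern means "run the action on every line".
  static final class Rule {
    final Pattern pattern;
    final BiConsumer<String[], List<String>> action;

    Rule(Pattern pattern, BiConsumer<String[], List<String>> action) {
      this.pattern = pattern;
      this.action = action;
    }
  }

  // Splits each line into fields, then runs every rule whose pattern is
  // null or is found in the line, in the order the rules were added.
  static List<String> process(List<String> lines, String separator,
                              List<Rule> rules) {
    List<String> output = new ArrayList<>();
    Pattern fieldSeparator = Pattern.compile(separator);

    for (String line : lines) {
      String[] fields = fieldSeparator.split(line);
      for (Rule rule : rules) {
        if (rule.pattern == null || rule.pattern.matcher(line).find()) {
          rule.action.accept(fields, output);
        }
      }
    }
    return output;
  }

  public static void main(String[] args) {
    List<String> input = Arrays.asList(
        "1;Trenton;New Jersey", "2;Annapolis;Maryland", "3;Austin;Texas");

    List<Rule> rules = new ArrayList<>();
    // Emit the second column of every line.
    rules.add(new Rule(null, (fields, out) -> out.add(fields[1])));
    // Additionally report the first column of lines that mention Texas.
    rules.add(new Rule(Pattern.compile("Texas"),
                       (fields, out) -> out.add("matched " + fields[0])));

    // prints [Trenton, Annapolis, Austin, matched 3]
    System.out.println(process(input, ";", rules));
  }
}
```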
* * @version @version@ * @since 1.0 * @see MatchAction * @see MatchActionInfo */ public final class MatchActionProcessor { private Pattern __fieldSeparator = null; private PatternCompiler __compiler; private PatternMatcher __matcher; // If a pattern is null, it means to do it for every line. private Vector __patterns = new Vector(); private Vector __actions = new Vector(); private MatchAction __defaultAction = new DefaultMatchAction(); /** * Creates a new MatchActionProcessor instance initialized with the specified * pattern compiler and matcher. The field separator is set to null by * default, which means that matched lines will not be split into separate * fields unless the field separator is set with * {@link #setFieldSeparator setFieldSeparator()}. *

* @param compiler The PatternCompiler to use to compile registered * patterns. * @param matcher The PatternMatcher to use when searching for matches. */ public MatchActionProcessor(PatternCompiler compiler, PatternMatcher matcher) { __compiler = compiler; __matcher = matcher; } /** * Default constructor for MatchActionProcessor. Same as calling *

* MatchActionProcessor(new Perl5Compiler(), new Perl5Matcher()); *
*/ public MatchActionProcessor() { this(new Perl5Compiler(), new Perl5Matcher()); } /** * Registers a pattern action pair, providing options to be used to * compile the pattern. If a pattern is null, the action * is performed for every line of input. *

   * @param pattern  The pattern to bind to an action.
   * @param options  The compilation options to use for the pattern.
   * @param action   The action to associate with the pattern.
   * @exception MalformedPatternException If the pattern cannot be compiled.
   */
  public void addAction(String pattern, int options, MatchAction action)
       throws MalformedPatternException
  {
    if(pattern != null)
      __patterns.addElement(__compiler.compile(pattern, options));
    else
      __patterns.addElement(null);
    __actions.addElement(action);
  }

  /**
   * Binds a pattern to the default action, providing options to be
   * used to compile the pattern.  The default action is to simply print
   * the matched line to the output.  If a pattern is null, the action
   * is performed for every line of input.
   *

   * @param pattern  The pattern to bind to an action.
   * @param options  The compilation options to use for the pattern.
   * @exception MalformedPatternException If the pattern cannot be compiled.
   */
  public void addAction(String pattern, int options)
       throws MalformedPatternException
  {
    addAction(pattern, options, __defaultAction);
  }

  /**
   * Binds a pattern to the default action.  The default action is to simply
   * print the matched line to the output.  If a pattern is null, the action
   * is performed for every line of input.
   *

* @param pattern The pattern to bind to an action. * @exception MalformedPatternException If the pattern cannot be compiled. */ public void addAction(String pattern) throws MalformedPatternException { addAction(pattern, 0); } /** * Registers a pattern action pair. If a pattern is null, the action * is performed for every line of input. *

* @param pattern The pattern to bind to an action. * @param action The action to associate with the pattern. * @exception MalformedPatternException If the pattern cannot be compiled. */ public void addAction(String pattern, MatchAction action) throws MalformedPatternException { addAction(pattern, 0, action); } /** * Sets the field separator to use when splitting a line into fields. * If the field separator is never set, or set to null, matched input * lines are not split into fields. *

* @param separator A regular expression defining the field separator. * @param options The options to use when compiling the separator. * @exception MalformedPatternException If the separator cannot be compiled. */ public void setFieldSeparator(String separator, int options) throws MalformedPatternException { if(separator == null) { __fieldSeparator = null; return; } __fieldSeparator = __compiler.compile(separator, options); } /** * Sets the field separator to use when splitting a line into fields. * If the field separator is never set, or set to null, matched input * lines are not split into fields. *

   * @param separator  A regular expression defining the field separator.
   * @exception MalformedPatternException If the separator cannot be compiled.
   */
  public void setFieldSeparator(String separator)
       throws MalformedPatternException
  {
    setFieldSeparator(separator, 0);
  }

  /**
   * This method reads the provided input one line at a time and for
   * every registered pattern that is contained in the line it executes
   * the associated MatchAction's processMatch() method.  If a field
   * separator has been defined with
   * {@link #setFieldSeparator setFieldSeparator()}, the
   * fields member of the MatchActionInfo instance passed to the
   * processMatch() method is set to a List of Strings containing
   * the split fields of the line.  Otherwise the fields member is set
   * to null.  If no match was performed to invoke the action (i.e.,
   * a null pattern was registered), then the match member is set
   * to null.  Otherwise, the match member will contain the result of
   * the match.
   *

* The input stream, having been exhausted, is closed right before the * method terminates and the output stream is flushed. *

   * @see MatchActionInfo
   * @param input  The input stream from which to read lines.
   * @param output Where to send output.
   * @param encoding The character encoding of the InputStream source.
   *          If you also want to define an output character encoding,
   *          you should use {@link #processMatches(Reader, Writer)}
   *          and specify the encodings when creating the Reader and
   *          Writer sources and sinks.
   * @exception IOException  If an error occurs while reading input
   *            or writing output.
   */
  public void processMatches(InputStream input, OutputStream output,
                             String encoding)
    throws IOException
  {
    processMatches(new InputStreamReader(input, encoding),
                   new OutputStreamWriter(output));
  }

  /**
   * This method reads the provided input one line at a time using the
   * platform standard character encoding and for every registered
   * pattern that is contained in the line it executes the associated
   * MatchAction's processMatch() method.  If a field separator has been
   * defined with {@link #setFieldSeparator setFieldSeparator()}, the
   * fields member of the MatchActionInfo instance passed to the
   * processMatch() method is set to a List of Strings containing
   * the split fields of the line.  Otherwise the fields member is set
   * to null.  If no match was performed to invoke the action (i.e.,
   * a null pattern was registered), then the match member is set
   * to null.  Otherwise, the match member will contain the result of
   * the match.
   *
   *

* The input stream, having been exhausted, is closed right before the * method terminates and the output stream is flushed. *

   *
   * @see MatchActionInfo
   * @param input  The input stream from which to read lines.
   * @param output Where to send output.
   * @exception IOException  If an error occurs while reading input
   *            or writing output.
   */
  public void processMatches(InputStream input, OutputStream output)
    throws IOException
  {
    processMatches(new InputStreamReader(input),
                   new OutputStreamWriter(output));
  }

  /**
   * This method reads the provided input one line at a time and for
   * every registered pattern that is contained in the line it executes
   * the associated MatchAction's processMatch() method.  If a field
   * separator has been defined with
   * {@link #setFieldSeparator setFieldSeparator()}, the
   * fields member of the MatchActionInfo instance passed to the
   * processMatch() method is set to a List of Strings containing
   * the split fields of the line.  Otherwise the fields member is set
   * to null.  If no match was performed to invoke the action (i.e.,
   * a null pattern was registered), then the match member is set
   * to null.  Otherwise, the match member will contain the result of
   * the match.
   *

* The input stream, having been exhausted, is closed right before the * method terminates and the output stream is flushed. *

* @see MatchActionInfo * @param input The input stream from which to read lines. * @param output Where to send output. * @exception IOException If an error occurs while reading input * or writing output. */ public void processMatches(Reader input, Writer output) throws IOException { int patternCount, current; LineNumberReader reader = new LineNumberReader(input); PrintWriter writer = new PrintWriter(output); MatchActionInfo info = new MatchActionInfo(); Object obj; Pattern pattern; MatchAction action; List fields = new ArrayList(); // Set those things that will not change. info.matcher = __matcher; info.fieldSeparator = __fieldSeparator; info.input = reader; info.output = writer; info.fields = null; patternCount = __patterns.size(); info.lineNumber = 0; while((info.line = reader.readLine()) != null) { info.charLine = info.line.toCharArray(); for(current=0; current < patternCount; current++) { obj = __patterns.elementAt(current); // If a pattern is null, it means to do it for every line. if(obj != null) { pattern = (Pattern)__patterns.elementAt(current); if(__matcher.contains(info.charLine, pattern)) { info.match = __matcher.getMatch(); info.lineNumber = reader.getLineNumber(); info.pattern = pattern; if(__fieldSeparator != null) { fields.clear(); Util.split(fields, __matcher, __fieldSeparator, info.line); info.fields = fields; } else info.fields = null; action = (MatchAction)__actions.elementAt(current); action.processMatch(info); } } else { info.match = null; info.lineNumber = reader.getLineNumber(); if(__fieldSeparator != null) { fields.clear(); Util.split(fields, __matcher, __fieldSeparator, info.line); info.fields = fields; } else info.fields = null; action = (MatchAction)__actions.elementAt(current); action.processMatch(info); } } } // Flush output but don't close, close input since we reached end. 
writer.flush(); reader.close(); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/GlobCompiler.java0000644000175000017500000003506507773723336024017 0ustar arnaudarnaud/* * $Id: GlobCompiler.java,v 1.8 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. 
IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text; import org.apache.oro.text.regex.*; /** * The GlobCompiler class will compile a glob expression into a Perl5Pattern * that may be used to match patterns in conjunction with Perl5Matcher. * Rather than create extra GlobMatcher and GlobPattern classes tailored * to the task of matching glob expressions, we have simply reused the * Perl5 regular expression classes from org.apache.oro.text.regex by * making GlobCompiler translate a glob expression into a Perl5 expression * that is compiled by a Perl5Compiler instance internal to the GlobCompiler. *

* Because there are various similar glob expression syntaxes, GlobCompiler * tries to provide a small amount of customization by providing the * {@link #STAR_CANNOT_MATCH_NULL_MASK} * and {@link #QUESTION_MATCHES_ZERO_OR_ONE_MASK} compilation options. *

* The GlobCompiler expression syntax is based on Unix shell glob expressions * but should be usable to simulate Win32 wildcards. The following syntax * is supported: *

    *
  • * - Matches zero or more instances of any character. If the * STAR_CANNOT_MATCH_NULL_MASK option is used, * matches * one or more instances of any character. *
  • ? - Matches one instance of any character. If the * QUESTION_MATCHES_ZERO_OR_ONE_MASK option is used, ? * matches zero or one instances of any character. *
  • [...] - Matches any of the characters enclosed by the brackets. * * and ? lose their special meanings within a * character class. Additionally, if the first character following * the opening bracket is a ! or a ^, then any * character not in the character class is matched. A - * between two characters can be used to denote a range. A * - at the beginning or end of the character class matches * itself rather than referring to a range. A ] immediately * following the opening [ matches itself rather than * indicating the end of the character class; otherwise it must be * escaped with a backslash to refer to itself. *
  • \ - A backslash matches itself in most situations. But * when a special character such as a * follows it, a * backslash escapes the character, indicating that * the special character should be interpreted as a normal character * rather than with its special meaning. *
  • All other characters match themselves. *
*
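The translation rules listed above can be sketched in a few lines. This is a minimal, simplified sketch and not the GlobCompiler implementation: it assumes java.util.regex as the target engine instead of ORO's Perl5 classes, and the names GlobSketch, globToRegex, STAR_CANNOT_MATCH_NULL and QUESTION_MATCHES_ZERO_OR_ONE are hypothetical stand-ins chosen for illustration. Unlike the rules above, it escapes a literal ] or [ inside a character class, because that is the safe spelling for java.util.regex.

```java
import java.util.regex.Pattern;

public class GlobSketch {
    // Hypothetical option bits, mirroring the two glob-specific masks.
    static final int STAR_CANNOT_MATCH_NULL = 0x0002;
    static final int QUESTION_MATCHES_ZERO_OR_ONE = 0x0004;

    private static boolean isGlobMeta(char c) {
        return c == '*' || c == '?' || c == '[' || c == ']';
    }

    private static boolean isRegexMeta(char c) {
        return "+()|^$.{}".indexOf(c) >= 0;
    }

    // Translate a glob expression into regex source text.
    static String globToRegex(String glob, int options) {
        boolean starNeedsOne = (options & STAR_CANNOT_MATCH_NULL) != 0;
        boolean questionOptional = (options & QUESTION_MATCHES_ZERO_OR_ONE) != 0;
        boolean inCharSet = false;
        StringBuilder out = new StringBuilder(2 * glob.length());

        for (int i = 0; i < glob.length(); i++) {
            char c = glob.charAt(i);
            switch (c) {
            case '*':   // zero (or one) or more of any character
                if (inCharSet) out.append('*');
                else out.append(starNeedsOne ? ".+" : ".*");
                break;
            case '?':   // exactly one, or optionally zero or one
                if (inCharSet) out.append('?');
                else out.append(questionOptional ? ".?" : ".");
                break;
            case '[':
                if (inCharSet) { out.append("\\["); break; }
                inCharSet = true;
                out.append('[');
                if (i + 1 < glob.length()) {
                    char next = glob.charAt(i + 1);
                    if (next == '!' || next == '^') { out.append('^'); i++; }
                    else if (next == ']') { out.append("\\]"); i++; }
                }
                break;
            case ']':
                inCharSet = false;
                out.append(']');
                break;
            case '\\':  // escapes a glob metacharacter, otherwise literal
                out.append('\\');
                if (i == glob.length() - 1) out.append('\\');
                else if (isGlobMeta(glob.charAt(i + 1))) out.append(glob.charAt(++i));
                else out.append('\\');
                break;
            default:    // everything else matches itself
                if (!inCharSet && isRegexMeta(c)) out.append('\\');
                out.append(c);
                break;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String regex = globToRegex("*.txt", 0);
        System.out.println(regex + " matches notes.txt: "
                           + Pattern.matches(regex, "notes.txt"));
    }
}
```

Keeping the translation separate from compilation means the result can be inspected or logged before it is handed to a regular expression engine.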

* Please remember that when you construct a Java string in Java code, * the backslash character is itself a special Java character, and it must * be doubled to represent a single backslash in a glob * expression. * * @version @version@ * @since 1.0 * @see org.apache.oro.text.regex.PatternCompiler * @see org.apache.oro.text.regex.Perl5Matcher */ public final class GlobCompiler implements PatternCompiler { /** * The default mask for the {@link #compile compile} methods. * It is equal to 0. The default behavior is for a glob expression to * be case sensitive unless it is compiled with the CASE_INSENSITIVE_MASK * option. */ public static final int DEFAULT_MASK = 0; /** * A mask passed as an option to the {@link #compile compile} methods * to indicate a compiled glob expression should be case insensitive. */ public static final int CASE_INSENSITIVE_MASK = 0x0001; /** * A mask passed as an option to the {@link #compile compile} methods * to indicate that a * should not be allowed to match the null string. * The normal behavior of the * metacharacter is that it may match any * 0 or more characters. This mask causes it to match 1 or more * characters of anything. */ public static final int STAR_CANNOT_MATCH_NULL_MASK = 0x0002; /** * A mask passed as an option to the {@link #compile compile} methods * to indicate that a ? should be allowed to match the null string. * The normal behavior of the ? metacharacter is that it matches exactly 1 * character. This mask causes it to match 0 or 1 characters. */ public static final int QUESTION_MATCHES_ZERO_OR_ONE_MASK = 0x0004; /** * A mask passed as an option to the {@link #compile compile} methods * to indicate that the resulting Perl5Pattern should be treated as a * read only data structure by Perl5Matcher, making it safe to share * a single Perl5Pattern instance among multiple threads without needing * synchronization. 
Without this option, Perl5Matcher reserves the right * to store heuristic or other information in Perl5Pattern that might * accelerate future matches. When you use this option, Perl5Matcher will * not store or modify any information in a Perl5Pattern. Use this option * when you want to share a Perl5Pattern instance among multiple threads * using different Perl5Matcher instances. */ public static final int READ_ONLY_MASK = 0x0008; private Perl5Compiler __perl5Compiler; private static boolean __isPerl5MetaCharacter(char ch) { return (ch == '*' || ch == '?' || ch == '+' || ch == '[' || ch == ']' || ch == '(' || ch == ')' || ch == '|' || ch == '^' || ch == '$' || ch == '.' || ch == '{' || ch == '}' || ch == '\\'); } private static boolean __isGlobMetaCharacter(char ch) { return (ch == '*' || ch == '?' || ch == '[' || ch == ']'); } /** * This static method is the basic engine of the Glob PatternCompiler * implementation. It takes a glob expression in the form of a character * array and converts it into a String representation of a Perl5 pattern. * The method is made public so that programmers may use it for their * own purposes. However, the GlobCompiler compile methods work by * converting the glob pattern to a Perl5 pattern using this method, and * then invoking the compile() method of an internally stored Perl5Compiler * instance. *

* @param pattern A character array representation of a Glob pattern. * @return A String representation of a Perl5 pattern equivalent to the * Glob pattern. */ public static String globToPerl5(char[] pattern, int options) { boolean inCharSet, starCannotMatchNull = false, questionMatchesZero; int ch; StringBuffer buffer; buffer = new StringBuffer(2*pattern.length); inCharSet = false; questionMatchesZero = ((options & QUESTION_MATCHES_ZERO_OR_ONE_MASK) != 0); starCannotMatchNull = ((options & STAR_CANNOT_MATCH_NULL_MASK) != 0); for(ch=0; ch < pattern.length; ch++) { switch(pattern[ch]) { case '*': if(inCharSet) buffer.append('*'); else { if(starCannotMatchNull) buffer.append(".+"); else buffer.append(".*"); } break; case '?': if(inCharSet) buffer.append('?'); else { if(questionMatchesZero) buffer.append(".?"); else buffer.append('.'); } break; case '[': inCharSet = true; buffer.append(pattern[ch]); if(ch + 1 < pattern.length) { switch(pattern[ch + 1]) { case '!': case '^': buffer.append('^'); ++ch; continue; case ']': buffer.append(']'); ++ch; continue; } } break; case ']': inCharSet = false; buffer.append(pattern[ch]); break; case '\\': buffer.append('\\'); if(ch == pattern.length - 1) { buffer.append('\\'); } else if(__isGlobMetaCharacter(pattern[ch + 1])) buffer.append(pattern[++ch]); else buffer.append('\\'); break; default: if(!inCharSet && __isPerl5MetaCharacter(pattern[ch])) buffer.append('\\'); buffer.append(pattern[ch]); break; } } return buffer.toString(); } /** * The default GlobCompiler constructor. It initializes an internal * Perl5Compiler instance to compile translated glob expressions. */ public GlobCompiler() { __perl5Compiler = new Perl5Compiler(); } /** * Compiles a Glob expression into a Perl5Pattern instance that * can be used by a Perl5Matcher object to perform pattern matching. *

* @param pattern A Glob expression to compile. * @param options A set of flags giving the compiler instructions on * how to treat the glob expression. The flags * are a logical OR of any number of the 3 MASK * constants. For example: *

   * regex =
   *   compiler.compile(pattern, GlobCompiler.
   *                    CASE_INSENSITIVE_MASK |
   *                    GlobCompiler.STAR_CANNOT_MATCH_NULL_MASK);
   *                 
* This says to compile the pattern so that * * cannot match the null string and to perform * matches in a case insensitive manner. * @return A Pattern instance constituting the compiled expression. * This instance will always be a Perl5Pattern and can be reliably * casted to a Perl5Pattern. * @exception MalformedPatternException If the compiled expression * is not a valid Glob expression. */ public Pattern compile(char[] pattern, int options) throws MalformedPatternException { int perlOptions = 0; if((options & CASE_INSENSITIVE_MASK) != 0) perlOptions |= Perl5Compiler.CASE_INSENSITIVE_MASK; if((options & READ_ONLY_MASK) != 0) perlOptions |= Perl5Compiler.READ_ONLY_MASK; return __perl5Compiler.compile(globToPerl5(pattern, options), perlOptions); } /** * Same as calling compile(pattern, GlobCompiler.DEFAULT_MASK); *

* @param pattern A regular expression to compile. * @return A Pattern instance constituting the compiled regular expression. * This instance will always be a Perl5Pattern and can be reliably * casted to a Perl5Pattern. * @exception MalformedPatternException If the compiled expression * is not a valid Glob expression. */ public Pattern compile(char[] pattern) throws MalformedPatternException { return compile(pattern, DEFAULT_MASK); } /** * Same as calling compile(pattern, GlobCompiler.DEFAULT_MASK); *

* @param pattern A regular expression to compile. * @return A Pattern instance constituting the compiled regular expression. * This instance will always be a Perl5Pattern and can be reliably * casted to a Perl5Pattern. * @exception MalformedPatternException If the compiled expression * is not a valid Glob expression. */ public Pattern compile(String pattern) throws MalformedPatternException { return compile(pattern.toCharArray(), DEFAULT_MASK); } /** * Compiles a Glob expression into a Perl5Pattern instance that * can be used by a Perl5Matcher object to perform pattern matching. *

* @param pattern A Glob expression to compile. * @param options A set of flags giving the compiler instructions on * how to treat the glob expression. The flags * are a logical OR of any number of the 3 MASK * constants. For example: *

   * regex =
   *   compiler.compile("*.*", GlobCompiler.
   *                    CASE_INSENSITIVE_MASK |
   *                    GlobCompiler.STAR_CANNOT_MATCH_NULL_MASK);
   *                 
* This says to compile the pattern so that * * cannot match the null string and to perform * matches in a case insensitive manner. * @return A Pattern instance constituting the compiled expression. * This instance will always be a Perl5Pattern and can be reliably * casted to a Perl5Pattern. * @exception MalformedPatternException If the compiled expression * is not a valid Glob expression. */ public Pattern compile(String pattern, int options) throws MalformedPatternException { return compile(pattern.toCharArray(), options); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/PatternCacheRandom.java0000644000175000017500000001055407773723336025137 0ustar arnaudarnaud/* * $Id: PatternCacheRandom.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. 
For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text; import java.util.*; import org.apache.oro.text.regex.*; import org.apache.oro.util.*; /** * This class is a GenericPatternCache subclass implementing a random * cache replacement policy. In other words, * patterns are added to the cache until it becomes full. Once the * cache is full, when a new pattern is added to the cache, it replaces * a randomly selected pattern in the cache. 
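The random replacement policy described above can be illustrated with a small stand-alone sketch. It is an assumption-laden illustration and not the CacheRandom class that PatternCacheRandom delegates to: the class name RandomCacheSketch and its methods are hypothetical, and the victim slot is chosen with java.util.Random.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class RandomCacheSketch<K, V> {
    private final Object[] keys;                 // slot -> key, used to pick a victim
    private final Map<K, V> values = new HashMap<>();
    private final Random random;
    private int size = 0;

    public RandomCacheSketch(int capacity, Random random) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.keys = new Object[capacity];
        this.random = random;
    }

    public V get(K key) {
        return values.get(key);
    }

    public void put(K key, V value) {
        if (values.containsKey(key)) {           // already cached: just update
            values.put(key, value);
            return;
        }
        int slot;
        if (size < keys.length) {
            slot = size++;                       // cache not yet full: fill next slot
        } else {
            slot = random.nextInt(keys.length);  // cache full: evict a random slot
            values.remove(keys[slot]);
        }
        keys[slot] = key;
        values.put(key, value);
    }

    public int size() { return size; }

    public int capacity() { return keys.length; }

    public static void main(String[] args) {
        RandomCacheSketch<String, Integer> cache =
            new RandomCacheSketch<>(2, new Random());
        cache.put("a+", 1);
        cache.put("b*", 2);
        cache.put("c?", 3);                      // evicts "a+" or "b*" at random
        System.out.println("size = " + cache.size());
    }
}
```

Random replacement needs no bookkeeping on lookups, which keeps reads cheap at the cost of occasionally evicting a frequently used entry.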
* * @version @version@ * @since 1.0 * @see GenericPatternCache */ public final class PatternCacheRandom extends GenericPatternCache { /** * Creates a PatternCacheRandom instance with a given cache capacity * and initialized to use a given PatternCompiler instance as a pattern * compiler. *

* @param capacity The capacity of the cache. * @param compiler The PatternCompiler to use to compile patterns. */ public PatternCacheRandom(int capacity, PatternCompiler compiler) { super(new CacheRandom(capacity), compiler); } /** * Same as: *

   * PatternCacheRandom(GenericPatternCache.DEFAULT_CAPACITY, compiler);
   * 
*/ public PatternCacheRandom(PatternCompiler compiler) { this(GenericPatternCache.DEFAULT_CAPACITY, compiler); } /** * Same as: *
   * PatternCacheRandom(capacity, new Perl5Compiler());
   * 
*/ public PatternCacheRandom(int capacity) { this(capacity, new Perl5Compiler()); } /** * Same as: *
   * PatternCacheRandom(GenericPatternCache.DEFAULT_CAPACITY);
   * 
*/ public PatternCacheRandom() { this(GenericPatternCache.DEFAULT_CAPACITY); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/GenericPatternCache.java0000644000175000017500000002063507773723336025274 0ustar arnaudarnaud/* * $Id: GenericPatternCache.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. 
 * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text; import java.util.*; import org.apache.oro.text.regex.*; import org.apache.oro.util.*; /** * This is the base class for all cache implementations provided in the * org.apache.oro.text package. * Although 4 subclasses of GenericPatternCache are provided with this * package, users may not derive subclasses from this class. * Rather, users should create their own implementations of the * {@link PatternCache} interface. * * @version @version@ * @since 1.0 * @see PatternCache * @see PatternCacheLRU * @see PatternCacheFIFO * @see PatternCacheFIFO2 * @see PatternCacheRandom */ public abstract class GenericPatternCache implements PatternCache { PatternCompiler _compiler; Cache _cache; /** * The default capacity to be used by the GenericPatternCache subclasses * provided with this package. Its value is 20. */ public static final int DEFAULT_CAPACITY = 20; /** * The primary constructor for GenericPatternCache. 
It has default * access so it will only be used within the package. It initializes * _cache and _compiler to the arguments provided. *

* @param cache The cache with which to store patterns. * @param compiler The PatternCompiler that should be used to compile * patterns. */ GenericPatternCache(Cache cache, PatternCompiler compiler) { _cache = cache; _compiler = compiler; } /** * Adds a pattern to the cache and returns the compiled pattern. This * method is in principle almost identical to * {@link #getPattern getPattern()} except for the fact that * it throws a MalformedPatternException if an expression cannot be * compiled. *

* addPattern() is meant to be used when you expressly intend to add * an expression to the cache and is useful for front-loading a cache * with expressions before use. If the expression added does not * already exist in the cache, it is compiled, added to the cache, * and returned. If the compiled expression is already in the cache, it * is simply returned. *

* The expected behavior of this method should be to start replacing * patterns in the cache only after the cache has been filled to capacity. *

* @param expression The regular expression to add to the cache. * @param options The compilation options to use when compiling the * expression. * @return The Pattern corresponding to the String representation of the * regular expression. * @exception MalformedPatternException If there is an error in compiling * the regular expression. */ public final synchronized Pattern addPattern(String expression, int options) throws MalformedPatternException { Object obj; Pattern pattern; obj = _cache.getElement(expression); if(obj != null) { pattern = (Pattern)obj; if(pattern.getOptions() == options) return pattern; } pattern = _compiler.compile(expression, options); _cache.addElement(expression, pattern); return pattern; } /** * Same as calling *

   * addPattern(expression, 0);
   * 
* @exception MalformedPatternException If there is an error in compiling * the regular expression. */ public final synchronized Pattern addPattern(String expression) throws MalformedPatternException { return addPattern(expression, 0); } /** * This method fetches a pattern from the cache. It is nearly identical * to {@link #addPattern addPattern()} except that it doesn't * throw a MalformedPatternException. If the pattern is not in the * cache, it is compiled, placed in the cache, and returned. If * the pattern cannot be compiled successfully, it * throws a MalformedCachePatternException. * Note that this exception is derived from RuntimeException, which means * you are NOT forced to catch it by the compiler. Please refer to * {@link MalformedCachePatternException} for a discussion of * when you should and shouldn't catch this exception. *

* @param expression The regular expression to fetch from the cache in * compiled form. * @param options The compilation options to use when compiling the * expression. * @return The Pattern corresponding to the String representation of the * regular expression. * @exception MalformedCachePatternException If there is an error in * compiling the regular expression. */ public final synchronized Pattern getPattern(String expression, int options) throws MalformedCachePatternException { Pattern result = null; try { result = addPattern(expression, options); } catch(MalformedPatternException e) { throw new MalformedCachePatternException("Invalid expression: " + expression + "\n" + e.getMessage()); } return result; } /** * Same as calling *

   * getPattern(expression, 0)
   * 
*/ public final synchronized Pattern getPattern(String expression) throws MalformedCachePatternException { return getPattern(expression, 0); } /** * Returns the number of elements in the cache, not to be confused with * the {@link #capacity()} which returns the number * of elements that can be held in the cache at one time. *

* @return The current size of the cache (i.e., the number of elements * currently cached). */ public final int size() { return _cache.size(); } /** * Returns the maximum number of patterns that can be cached at one time. *

* @return The maximum number of patterns that can be cached at one time. */ public final int capacity() { return _cache.capacity(); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/PatternCacheFIFO2.java0000644000175000017500000001143707773723336024525 0ustar arnaudarnaud/* * $Id: PatternCacheFIFO2.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. 
* * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text; import java.util.*; import org.apache.oro.text.regex.*; import org.apache.oro.util.*; /** * This class is a GenericPatternCache subclass implementing a second * chance FIFO (First In First Out) cache replacement policy. In other * words, patterns are added to the cache until the cache becomes full. * Once the cache is full, when a new pattern is added to the cache, it * replaces the first of the current patterns in the cache to have been * added, unless that pattern has been used recently (generally * between the last cache replacement and now). * If the pattern to be replaced has been used, it is given * a second chance, and the next pattern in the cache is tested for * replacement in the same manner. If all the patterns are given a * second chance, then the original pattern selected for replacement is * replaced. 
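The second chance policy described above can be sketched as a circular scan over fixed slots. This is a minimal sketch under stated assumptions, not the CacheFIFO2 class used by PatternCacheFIFO2: the class name SecondChanceFifoSketch is hypothetical, and "used recently" is modeled as a per-slot flag that is set on lookup and cleared when the entry is passed over for replacement.

```java
import java.util.HashMap;
import java.util.Map;

public class SecondChanceFifoSketch<K, V> {
    private final Object[] keys;
    private final Object[] vals;
    private final boolean[] used;                // set on lookup, cleared on a pass
    private final Map<K, Integer> index = new HashMap<>();
    private int size = 0;
    private int hand = 0;                        // next replacement candidate

    public SecondChanceFifoSketch(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        keys = new Object[capacity];
        vals = new Object[capacity];
        used = new boolean[capacity];
    }

    @SuppressWarnings("unchecked")
    public V get(K key) {
        Integer slot = index.get(key);
        if (slot == null) return null;
        used[slot] = true;                       // remember the entry was used
        return (V) vals[slot];
    }

    public void put(K key, V value) {
        Integer existing = index.get(key);
        if (existing != null) {                  // already cached: just update
            vals[existing] = value;
            return;
        }
        int slot;
        if (size < keys.length) {
            slot = size++;                       // cache not yet full
        } else {
            // Skip entries used since the last pass, clearing their flag.
            // After one full circle the first candidate is unflagged again,
            // so the loop always terminates and replaces it.
            while (used[hand]) {
                used[hand] = false;
                hand = (hand + 1) % keys.length;
            }
            slot = hand;
            hand = (hand + 1) % keys.length;
            index.remove(keys[slot]);
        }
        keys[slot] = key;
        vals[slot] = value;
        used[slot] = false;
        index.put(key, slot);
    }

    public int size() { return size; }

    public static void main(String[] args) {
        SecondChanceFifoSketch<String, Integer> cache = new SecondChanceFifoSketch<>(3);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.put("c", 3);
        cache.get("a");                          // "a" earns a second chance
        cache.put("d", 4);                       // so "b" is replaced instead
        System.out.println("b cached: " + (cache.get("b") != null));
    }
}
```

Compared with plain FIFO, the only extra cost is one flag per entry, which is why this policy is often used as a cheap approximation of LRU.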
* * @version @version@ * @since 1.0 * @see GenericPatternCache */ public final class PatternCacheFIFO2 extends GenericPatternCache { /** * Creates a PatternCacheFIFO2 instance with a given cache capacity, * initialized to use a given PatternCompiler instance as a pattern compiler. *

* @param capacity The capacity of the cache. * @param compiler The PatternCompiler to use to compile patterns. */ public PatternCacheFIFO2(int capacity, PatternCompiler compiler) { super(new CacheFIFO2(capacity), compiler); } /** * Same as: *

   * PatternCacheFIFO2(GenericPatternCache.DEFAULT_CAPACITY, compiler);
   * 
*/ public PatternCacheFIFO2(PatternCompiler compiler) { this(GenericPatternCache.DEFAULT_CAPACITY, compiler); } /** * Same as: *
   * PatternCacheFIFO2(capacity, new Perl5Compiler());
   * 
*/ public PatternCacheFIFO2(int capacity) { this(capacity, new Perl5Compiler()); } /** * Same as: *
   * PatternCacheFIFO2(GenericPatternCache.DEFAULT_CAPACITY);
   * 
*/ public PatternCacheFIFO2() { this(GenericPatternCache.DEFAULT_CAPACITY); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/package.html0000644000175000017500000000041607773723336023047 0ustar arnaudarnaud This package used to be the TextTools library and provides general text processing support, including a glob regular expression class, pattern caching and line-by-line processing classes. jakarta-oro-2.0.8/src/java/org/apache/oro/text/MalformedCachePatternException.java0000644000175000017500000001021207773723336027473 0ustar arnaudarnaud/* * $Id: MalformedCachePatternException.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. 
Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text; /** * An exception used to indicate errors in a regular expression fetched * from a PatternCache. * It is derived from RuntimeException, and therefore does not have to be * caught. You should generally make an effort to catch * MalformedCachePatternException whenever you use dynamically generated * patterns (from user input or some other source). Static expressions * represented as strings in your source code don't require exception * handling because as you write and test run your program you will * correct any errors in those expressions when you run into an uncaught * MalformedCachePatternException. By the time you complete your * project, those static expressions will be guaranteed to be correct. 
* However, pieces of code with expressions that you cannot guarantee to * be correct should catch MalformedCachePatternException to ensure * reliability. * * @version @version@ * @since 1.0 * @see PatternCache */ public class MalformedCachePatternException extends RuntimeException { /** * Simply calls the corresponding constructor of its superclass. */ public MalformedCachePatternException() { super(); } /** * Simply calls the corresponding constructor of its superclass. *
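The catch-or-not guidance in the class description can be sketched as follows. This is a minimal illustration that uses only java.util.regex as a stand-in for a pattern cache: DynamicPatternSketch and compileOrNull are hypothetical names, and PatternSyntaxException (also unchecked) plays the role of MalformedCachePatternException.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Stand-in sketch: java.util.regex replaces the ORO cache here, and
// PatternSyntaxException plays the role of MalformedCachePatternException.
final class DynamicPatternSketch {
  // A hard-coded expression: mistakes surface while the program is being
  // written and tested, so no handler is written for it.
  static final Pattern DIGITS = Pattern.compile("[0-9]+");

  // An expression supplied at runtime: the unchecked exception is caught so
  // that one bad pattern cannot take down the caller.
  static Pattern compileOrNull(String userExpression) {
    try {
      return Pattern.compile(userExpression);
    } catch (PatternSyntaxException e) {
      return null;
    }
  }
}
```

Returning null is only one possible policy; reporting the error to the user or falling back to a default pattern would serve equally well.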

* @param message A message indicating the nature of the error. */ public MalformedCachePatternException(String message) { super(message); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/PatternCacheLRU.java0000644000175000017500000001067707773723336024367 0ustar arnaudarnaud/* * $Id: PatternCacheLRU.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. 
* * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text; import java.util.*; import org.apache.oro.text.regex.*; import org.apache.oro.util.*; /** * This class is a GenericPatternCache subclass implementing an LRU * (Least Recently Used) cache replacement policy. In other words, * patterns are added to the cache until it becomes full. Once the * cache is full, when a new pattern is added to the cache, it replaces * the least recently used pattern currently in the cache. This is probably * the best general purpose pattern cache replacement policy. * * @version @version@ * @since 1.0 * @see GenericPatternCache */ public final class PatternCacheLRU extends GenericPatternCache { /** * Creates a PatternCacheLRU instance with a given cache capacity, * and initialized to use a given PatternCompiler instance as a pattern * compiler. *
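The LRU replacement policy described for this class can be illustrated with a small self-contained sketch. It is not the ORO implementation: a java.util.LinkedHashMap in access order stands in for CacheLRU, java.util.regex.Pattern stands in for the ORO Pattern, and LruPatternCacheSketch is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Sketch only: not thread-safe, and built on JDK classes rather than ORO ones.
final class LruPatternCacheSketch {
  private final Map<String, Pattern> cache;

  LruPatternCacheSketch(final int capacity) {
    // accessOrder = true moves an entry to the end on every get(), so the
    // eldest entry is always the least recently used one.
    this.cache = new LinkedHashMap<String, Pattern>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, Pattern> eldest) {
        return size() > capacity;
      }
    };
  }

  // Returns the cached pattern, compiling and storing it on a miss.
  Pattern getPattern(String expression) {
    Pattern pattern = cache.get(expression);
    if (pattern == null) {
      pattern = Pattern.compile(expression);
      cache.put(expression, pattern);
    }
    return pattern;
  }

  // containsKey() does not count as an access, so it leaves the order alone.
  boolean isCached(String expression) {
    return cache.containsKey(expression);
  }

  int size() {
    return cache.size();
  }
}
```

With a capacity of 2, fetching "a+", "b+", "a+" and then "c+" evicts "b+", because "a+" was used more recently.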

* @param capacity The capacity of the cache. * @param compiler The PatternCompiler to use to compile patterns. */ public PatternCacheLRU(int capacity, PatternCompiler compiler) { super(new CacheLRU(capacity), compiler); } /** * Same as: *

   * PatternCacheLRU(GenericPatternCache.DEFAULT_CAPACITY, compiler);
   * 
*/ public PatternCacheLRU(PatternCompiler compiler) { this(GenericPatternCache.DEFAULT_CAPACITY, compiler); } /** * Same as: *
   * PatternCacheLRU(capacity, new Perl5Compiler());
   * 
*/ public PatternCacheLRU(int capacity) { this(capacity, new Perl5Compiler()); } /** * Same as: *
   * PatternCacheLRU(GenericPatternCache.DEFAULT_CAPACITY);
   * 
*/ public PatternCacheLRU() { this(GenericPatternCache.DEFAULT_CAPACITY); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/PatternCacheFIFO.java0000644000175000017500000001057607773723336024446 0ustar arnaudarnaud/* * $Id: PatternCacheFIFO.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. 
* * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text; import java.util.*; import org.apache.oro.text.regex.*; import org.apache.oro.util.*; /** * This class is a GenericPatternCache subclass implementing a FIFO (First * In First Out) cache replacement policy. In other words, patterns are * added to the cache until the cache becomes full. Once the cache is full, * if a new pattern is added to the cache, it replaces the first of * the current patterns in the cache to have been added. * * @version @version@ * @since 1.0 * @see GenericPatternCache */ public final class PatternCacheFIFO extends GenericPatternCache { /** * Creates a PatternCacheFIFO instance with a given cache capacity, * initialized to use a given PatternCompiler instance as a pattern compiler. *
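The FIFO replacement policy described for this class can be sketched in the same spirit. This is an illustration under assumptions, not the ORO implementation: a java.util.LinkedHashMap in its default insertion order stands in for CacheFIFO, and FifoPatternCacheSketch is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Sketch only: not thread-safe, and built on JDK classes rather than ORO ones.
final class FifoPatternCacheSketch {
  private final Map<String, Pattern> cache;

  FifoPatternCacheSketch(final int capacity) {
    // Default insertion order: lookups never reorder entries, so the eldest
    // entry is always the one that was added first.
    this.cache = new LinkedHashMap<String, Pattern>() {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, Pattern> eldest) {
        return size() > capacity;
      }
    };
  }

  // Returns the cached pattern, compiling and storing it on a miss.
  Pattern getPattern(String expression) {
    Pattern pattern = cache.get(expression);
    if (pattern == null) {
      pattern = Pattern.compile(expression);
      cache.put(expression, pattern);
    }
    return pattern;
  }

  boolean isCached(String expression) {
    return cache.containsKey(expression);
  }
}
```

With a capacity of 2, fetching "a+", "b+", "a+" and then "c+" evicts "a+": unlike an LRU policy, the repeated use of "a+" does not protect it.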

* @param capacity The capacity of the cache. * @param compiler The PatternCompiler to use to compile patterns. */ public PatternCacheFIFO(int capacity, PatternCompiler compiler) { super(new CacheFIFO(capacity), compiler); } /** * Same as: *

   * PatternCacheFIFO(GenericPatternCache.DEFAULT_CAPACITY, compiler);
   * 
*/ public PatternCacheFIFO(PatternCompiler compiler) { this(GenericPatternCache.DEFAULT_CAPACITY, compiler); } /** * Same as: *
   * PatternCacheFIFO(capacity, new Perl5Compiler());
   * 
*/ public PatternCacheFIFO(int capacity) { this(capacity, new Perl5Compiler()); } /** * Same as: *
   * PatternCacheFIFO(GenericPatternCache.DEFAULT_CAPACITY);
   * 
*/ public PatternCacheFIFO() { this(GenericPatternCache.DEFAULT_CAPACITY); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/DefaultMatchAction.java0000644000175000017500000000610107773723336025125 0ustar arnaudarnaud/* * $Id: DefaultMatchAction.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. 
* * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text; /** * DefaultMatchAction is a support class for MatchActionProcessor, * providing a default match action. * * @version @version@ * @since 1.0 * @see MatchActionProcessor */ final class DefaultMatchAction implements MatchAction { public void processMatch(MatchActionInfo matchInfo) { matchInfo.output.println(matchInfo.line); } } jakarta-oro-2.0.8/src/java/org/apache/oro/text/PatternCache.java0000644000175000017500000002104007773723336023766 0ustar arnaudarnaud/* * $Id: PatternCache.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. 
Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * <http://www.apache.org/>. */ package org.apache.oro.text; import org.apache.oro.text.regex.*; /** * An interface defining the basic functions of a regular expression * cache. *

* A PatternCache is an object that takes care of compiling, storing, and * retrieving regular expressions so that the programmer does not have to * manage these operations explicitly. The main benefit * is ease of use: regular expressions need only be expressed * by their String representations. * * @version @version@ * @since 1.0 * @see MalformedCachePatternException */ public interface PatternCache { /** * Adds a pattern to the cache and returns the compiled pattern. This * method is in principle almost identical to * {@link #getPattern(String)} except that * it throws a MalformedPatternException if an expression cannot be * compiled. *

* addPattern() is meant to be used when you expressly intend to add * an expression to a cache and is useful for front-loading a cache * with expressions before use. If the expression added does not * already exist in the cache, it is compiled, added to the cache, * and returned. If the compiled expression is already in the cache, it * is simply returned. *

* Implementations are expected to start replacing * patterns in the cache only after the cache has been filled to capacity. *

* @param expression The regular expression to add to the cache. * @return The Pattern corresponding to the String representation of the * regular expression. * @exception MalformedPatternException If there is an error in compiling * the regular expression. */ public Pattern addPattern(String expression) throws MalformedPatternException; /** * Adds a pattern to the cache and returns the compiled pattern. This * method is in principle almost identical to * {@link #getPattern(String, int)} except that * it throws a MalformedPatternException if an expression cannot be * compiled. *

* addPattern() is meant to be used when you expressly intend to add * an expression to the cache and is useful for front-loading a cache * with expressions before use. If the expression added does not * already exist in the cache, it is compiled, added to the cache, * and returned. If the compiled expression is already in the cache, it * is simply returned. *

* Implementations are expected to start replacing * patterns in the cache only after the cache has been filled to capacity. *

* @param expression The regular expression to add to the cache. * @param options The compilation options to use when compiling the * expression. * @return The Pattern corresponding to the String representation of the * regular expression. * @exception MalformedPatternException If there is an error in compiling * the regular expression. */ public Pattern addPattern(String expression, int options) throws MalformedPatternException; /** * This method fetches a pattern from the cache. It is nearly identical * to {@link #addPattern addPattern()} except that it doesn't * throw a MalformedPatternException. If the pattern is not in the * cache, it is compiled, placed in the cache, and returned. If * the pattern cannot be compiled successfully, the implementation must * throw an exception derived from MalformedCachePatternException. * Note that this exception is derived from RuntimeException, which means * the compiler does NOT force you to catch it. Please refer to * {@link MalformedCachePatternException} for a discussion of when you * should and shouldn't catch this exception. *
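The difference between the two entry points, a checked exception from addPattern() and an unchecked one from getPattern(), can be sketched as follows. This is one plausible arrangement and is not claimed to be how any ORO class is written: CheckedPatternError, UncheckedPatternError and TwoEntryPointCacheSketch are hypothetical stand-ins, and java.util.regex replaces the ORO compiler.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical stand-ins for the checked and unchecked ORO exception types.
class CheckedPatternError extends Exception {
  CheckedPatternError(String message) { super(message); }
}

class UncheckedPatternError extends RuntimeException {
  UncheckedPatternError(String message) { super(message); }
}

final class TwoEntryPointCacheSketch {
  private final Map<String, Pattern> cache = new HashMap<String, Pattern>();

  // addPattern-style entry point: failure is a checked exception, so the
  // compiler forces callers to handle or declare it.
  Pattern addPattern(String expression) throws CheckedPatternError {
    Pattern pattern = cache.get(expression);
    if (pattern == null) {
      try {
        pattern = Pattern.compile(expression);
      } catch (PatternSyntaxException e) {
        throw new CheckedPatternError(e.getMessage());
      }
      cache.put(expression, pattern);
    }
    return pattern;
  }

  // getPattern-style entry point: same lookup, but failure is rethrown as an
  // unchecked exception that callers may ignore for hard-coded expressions.
  Pattern getPattern(String expression) {
    try {
      return addPattern(expression);
    } catch (CheckedPatternError e) {
      throw new UncheckedPatternError(e.getMessage());
    }
  }

  int size() {
    return cache.size();
  }
}
```

In this sketch both methods return the same cached instance for a repeated expression, and a malformed expression is never stored.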

* @param expression The regular expression to fetch from the cache in * compiled form. * @return The Pattern corresponding to the String representation of the * regular expression. * @exception MalformedCachePatternException If there is an error in * compiling the regular expression. */ public Pattern getPattern(String expression) throws MalformedCachePatternException; /** * This method fetches a pattern from the cache. It is nearly identical * to {@link #addPattern addPattern()} except that it doesn't * throw a MalformedPatternException. If the pattern is not in the * cache, it is compiled, placed in the cache, and returned. If * the pattern cannot be compiled successfully, it * throws a MalformedCachePatternException. * Note that this exception is derived from RuntimeException, which means * the compiler does NOT force you to catch it. Please refer to * {@link MalformedCachePatternException} for a discussion of when you * should and shouldn't catch this exception. *

* @param expression The regular expression to fetch from the cache in * compiled form. * @param options The compilation options to use when compiling the * expression. * @return The Pattern corresponding to the String representation of the * regular expression. * @exception MalformedCachePatternException If there is an error in * compiling the regular expression. */ public Pattern getPattern(String expression, int options) throws MalformedCachePatternException; /** * Returns the number of elements in the cache, not to be confused with * {@link #capacity()}, which returns the maximum number * of elements that can be held in the cache at one time. *

* @return The current size of the cache (i.e., the number of elements * currently cached). */ public int size(); /** * Returns the maximum number of patterns that can be cached at one time. *

* @return The maximum number of patterns that can be cached at one time. */ public int capacity(); } jakarta-oro-2.0.8/src/java/org/apache/oro/text/MatchAction.java0000644000175000017500000000713607773723336023631 0ustar arnaudarnaud/* * $Id: MatchAction.java,v 1.7 2003/11/07 20:16:24 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. 
* * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.text; /** * The MatchAction interface provides the callback interface for actions * bound to patterns in * {@link MatchActionProcessor}. More often than not, you will want to * create MatchAction instances as anonymous classes when adding pattern * action pairs to a MatchActionProcessor instance. * * @version @version@ * @since 1.0 * @see MatchActionProcessor * @see MatchActionInfo */ public interface MatchAction { /** * This method is called by MatchActionProcessor when it finds an associated * pattern in a line of input. Information pertaining to the matched * line is included in the MatchActionInfo parameter. *
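The callback arrangement described for this interface, with actions supplied as anonymous classes, might look like the sketch below. It is a simplified illustration and not the ORO API: LineActionSketch and ActionProcessorSketch are hypothetical stand-ins for MatchAction and MatchActionProcessor, the callback receives a bare String instead of a MatchActionInfo, and java.util.regex does the matching.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for the MatchAction callback interface.
interface LineActionSketch {
  void processMatch(String line);
}

// Hypothetical stand-in for a processor holding pattern/action pairs.
final class ActionProcessorSketch {
  private final List<Pattern> patterns = new ArrayList<Pattern>();
  private final List<LineActionSketch> actions = new ArrayList<LineActionSketch>();

  void addAction(String expression, LineActionSketch action) {
    patterns.add(Pattern.compile(expression));
    actions.add(action);
  }

  // Invokes every action whose pattern occurs somewhere in the line.
  void processLine(String line) {
    for (int i = 0; i < patterns.size(); i++) {
      if (patterns.get(i).matcher(line).find()) {
        actions.get(i).processMatch(line);
      }
    }
  }

  // Usage: the action is supplied as an anonymous class that collects
  // every line containing the word ERROR.
  static List<String> collectErrors(String[] lines) {
    final List<String> hits = new ArrayList<String>();
    ActionProcessorSketch processor = new ActionProcessorSketch();
    processor.addAction("ERROR", new LineActionSketch() {
      public void processMatch(String line) {
        hits.add(line);
      }
    });
    for (String line : lines) {
      processor.processLine(line);
    }
    return hits;
  }
}
```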

* @see MatchActionProcessor * @see MatchActionInfo * @param matchInfo The match information associated with the line * matched by MatchActionProcessor. */ public void processMatch(MatchActionInfo matchInfo); } jakarta-oro-2.0.8/src/java/org/apache/oro/io/0000755000175000017500000000000010423237774020200 5ustar arnaudarnaudjakarta-oro-2.0.8/src/java/org/apache/oro/io/RegexFilenameFilter.java0000644000175000017500000001421307773723336024735 0ustar arnaudarnaud/* * $Id: RegexFilenameFilter.java,v 1.9 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. 
Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.io; import java.io.*; import org.apache.oro.text.regex.*; import org.apache.oro.text.*; /** * RegexFilenameFilter is the base class for a set of FilenameFilter * implementations that filter based on a regular expression. 
* * @version @version@ * @since 1.0 * @see Perl5FilenameFilter * @see AwkFilenameFilter * @see GlobFilenameFilter */ public abstract class RegexFilenameFilter implements FilenameFilter, FileFilter { PatternCache _cache; PatternMatcher _matcher; Pattern _pattern; RegexFilenameFilter(PatternCache cache, PatternMatcher matcher, String regex) { _cache = cache; _matcher = matcher; setFilterExpression(regex); } RegexFilenameFilter(PatternCache cache, PatternMatcher matcher, String regex, int options) { _cache = cache; _matcher = matcher; setFilterExpression(regex, options); } RegexFilenameFilter(PatternCache cache, PatternMatcher matcher) { this(cache, matcher, ""); } /** * Set the regular expression on which to filter. *

* @param regex The regular expression on which to filter. * @exception MalformedCachePatternException If there is an error in * compiling the regular expression. This need not be caught if * you are using a hard-coded expression that you know is correct. * But for robustness and reliability you should catch this exception * for dynamically entered expressions determined at runtime. */ public void setFilterExpression(String regex) throws MalformedCachePatternException { _pattern = _cache.getPattern(regex); } /** * Set the regular expression on which to filter along with any * special options to use when compiling the expression. *

* @param regex The regular expression on which to filter. * @param options A set of compilation options specific to the regular * expression grammar being used. * @exception MalformedCachePatternException If there is an error in * compiling the regular expression. This need not be caught if * you are using a hard-coded expression that you know is correct. * But for robustness and reliability you should catch this exception * for dynamically entered expressions determined at runtime. */ public void setFilterExpression(String regex, int options) throws MalformedCachePatternException { _pattern = _cache.getPattern(regex, options); } /** * Filters a filename. Tests if the filename EXACTLY matches the pattern * contained by the filter. The directory argument is not examined. * Conforms to the java.io.FilenameFilter interface. *
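The EXACT-match rule can be illustrated with a self-contained filter. This is a sketch built on java.util.regex, not on the ORO matcher: ExactMatchFilterSketch is a hypothetical name, and Matcher.matches() is used because it succeeds only when the entire filename matches the expression.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.regex.Pattern;

// Sketch only: java.util.regex stands in for the ORO cache and matcher.
final class ExactMatchFilterSketch implements FilenameFilter {
  private final Pattern pattern;

  ExactMatchFilterSketch(String regex) {
    this.pattern = Pattern.compile(regex);
  }

  // Accepts a file only if the whole filename matches the expression;
  // the directory argument is not examined.
  public boolean accept(File dir, String filename) {
    return pattern.matcher(filename).matches();
  }
}
```

A filter built from ".*\\.java" accepts "Foo.java" but rejects "Foo.java.bak", and a filter built from "java" rejects "Foo.java" because a partial match is not enough.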

* @param dir The directory containing the file. * @param filename The name of the file. * @return True if the filename EXACTLY matches the pattern, false if not. */ public boolean accept(File dir, String filename) { synchronized(_matcher) { return _matcher.matches(filename, _pattern); } } /** * Filters a filename. Tests if the filename EXACTLY matches the pattern * contained by the filter. The filename is defined as pathname.getName(). * Conforms to the java.io.FileFilter interface. *

* @param pathname The file pathname. * @return True if the filename EXACTLY matches the pattern, false if not. */ public boolean accept(File pathname) { synchronized(_matcher) { return _matcher.matches(pathname.getName(), _pattern); } } } jakarta-oro-2.0.8/src/java/org/apache/oro/io/package.html0000644000175000017500000000037207773723336022473 0ustar arnaudarnaud This package provides FilenameFilters that filter based on a regular expression and other I/O-related classes that derive their functionality from regular expressions. jakarta-oro-2.0.8/src/java/org/apache/oro/io/AwkFilenameFilter.java0000644000175000017500000001067407773723336024414 0ustar arnaudarnaud/* * $Id: AwkFilenameFilter.java,v 1.7 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. 
For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.io; import java.io.*; import org.apache.oro.text.regex.*; import org.apache.oro.text.awk.*; import org.apache.oro.text.*; /** * AwkFilenameFilter is a RegexFilenameFilter subclass that filters on Awk * regular expressions as implemented by the org.apache.oro.text.awk package, * which is required to use this class. 
* * @version @version@ * @since 1.0 * @see RegexFilenameFilter * @see Perl5FilenameFilter * @see GlobFilenameFilter */ public class AwkFilenameFilter extends RegexFilenameFilter { private static final PatternMatcher __MATCHER = new AwkMatcher(); private static final PatternCache __CACHE = new PatternCacheLRU(new AwkCompiler()); /** * Construct a filter initialized with the indicated regular expression * and accompanying compilation options conforming to those used by * org.apache.oro.text.awk.AwkCompiler *

* @param regex The regular expression on which to filter. * @param options A set of compilation options. * @exception MalformedCachePatternException If there is an error in * compiling the regular expression. This need not be caught if * you are using a hard-coded expression that you know is correct. * But for robustness and reliability you should catch this exception * for dynamically entered expressions determined at runtime. */ public AwkFilenameFilter(String regex, int options) { super(__CACHE, __MATCHER, regex, options); } /** Same as AwkFilenameFilter(regex, AwkCompiler.DEFAULT_MASK); */ public AwkFilenameFilter(String regex) { super(__CACHE, __MATCHER, regex); } /** Same as AwkFilenameFilter(""); */ public AwkFilenameFilter() { super(__CACHE, __MATCHER); } } jakarta-oro-2.0.8/src/java/org/apache/oro/io/GlobFilenameFilter.java0000644000175000017500000001063707773723336024554 0ustar arnaudarnaud/* * $Id: GlobFilenameFilter.java,v 1.7 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." 
* Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.io; import java.io.*; import org.apache.oro.text.regex.*; import org.apache.oro.text.*; /** * GlobFilenameFilter is a RegexFilenameFilter subclass that filters on Glob * regular expressions as implemented by the org.apache.oro.text package, * which is required to use this class. 
* * @version @version@ * @since 1.0 * @see RegexFilenameFilter * @see AwkFilenameFilter * @see Perl5FilenameFilter */ public class GlobFilenameFilter extends RegexFilenameFilter { private static final PatternMatcher __MATCHER = new Perl5Matcher(); private static final PatternCache __CACHE = new PatternCacheLRU(new GlobCompiler()); /** * Construct a filter initialized with the indicated regular expression * and accompanying compilation options conforming to those used by * org.apache.oro.text.GlobCompiler *

* @param regex The regular expression on which to filter. * @param options A set of compilation options. * @exception MalformedCachePatternException If there is an error in * compiling the regular expression. This need not be caught if * you are using a hard-coded expression that you know is correct. * But for robustness and reliability you should catch this exception * for dynamically entered expressions determined at runtime. */ public GlobFilenameFilter(String regex, int options) { super(__CACHE, __MATCHER, regex, options); } /** Same as GlobFilenameFilter(regex, GlobCompiler.DEFAULT_MASK); */ public GlobFilenameFilter(String regex) { super(__CACHE, __MATCHER, regex); } /** Same as GlobFilenameFilter(""); */ public GlobFilenameFilter() { super(__CACHE, __MATCHER); } } jakarta-oro-2.0.8/src/java/org/apache/oro/io/Perl5FilenameFilter.java0000644000175000017500000001061307773723336024652 0ustar arnaudarnaud/* * $Id: Perl5FilenameFilter.java,v 1.7 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." 
* Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.io; import java.io.*; import org.apache.oro.text.regex.*; import org.apache.oro.text.*; /** * Perl5FilenameFilter is a RegexFilenameFilter subclass that filters on Perl5 * regular expressions as implemented by the org.apache.oro.text.regex package, * which is required to use this class. 
* * @version @version@ * @since 1.0 * @see RegexFilenameFilter * @see AwkFilenameFilter * @see GlobFilenameFilter */ public class Perl5FilenameFilter extends RegexFilenameFilter { private static final PatternMatcher __MATCHER = new Perl5Matcher(); private static final PatternCache __CACHE = new PatternCacheLRU(); /** * Construct a filter initialized with the indicated regular expression * and accompanying compilation options conforming to those used by * org.apache.oro.text.regex.Perl5Compiler *

* @param regex The regular expression on which to filter. * @param options A set of compilation options. * @exception MalformedCachePatternException If there is an error in * compiling the regular expression. This need not be caught if * you are using a hard-coded expression that you know is correct. * But for robustness and reliability you should catch this exception * for dynamically entered expressions determined at runtime. */ public Perl5FilenameFilter(String regex, int options) { super(__CACHE, __MATCHER, regex, options); } /** Same as Perl5FilenameFilter(regex, Perl5Compiler.DEFAULT_MASK); */ public Perl5FilenameFilter(String regex) { super(__CACHE, __MATCHER, regex); } /** Same as Perl5FilenameFilter(""); */ public Perl5FilenameFilter() { super(__CACHE, __MATCHER); } } jakarta-oro-2.0.8/src/java/org/apache/oro/util/0000755000175000017500000000000010423237774020546 5ustar arnaudarnaudjakarta-oro-2.0.8/src/java/org/apache/oro/util/CacheFIFO2.java0000644000175000017500000001337007773723336023156 0ustar arnaudarnaud/* * $Id: CacheFIFO2.java,v 1.7 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. 
The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . 
*/ package org.apache.oro.util; import java.util.*; /** * This class is a GenericCache subclass implementing a second * chance FIFO (First In First Out) cache replacement policy. In other * words, values are added to the cache until the cache becomes full. * Once the cache is full, when a new value is added to the cache, it * replaces the first of the current values in the cache to have been * added, unless that value has been used recently (generally * between the last cache replacement and now). * If the value to be replaced has been used, it is given * a second chance, and the next value in the cache is tested for * replacement in the same manner. If all the values are given a * second chance, then the original value selected for replacement is * replaced. * * @version @version@ * @since 1.0 * @see GenericCache */ public final class CacheFIFO2 extends GenericCache { private int __current = 0; private boolean[] __tryAgain; /** * Creates a CacheFIFO2 instance with a given cache capacity. *

* @param capacity The capacity of the cache. */ public CacheFIFO2(int capacity) { super(capacity); __tryAgain = new boolean[_cache.length]; } /** * Same as: *

   * CacheFIFO2(GenericCache.DEFAULT_CAPACITY);
   * 
*/ public CacheFIFO2(){ this(GenericCache.DEFAULT_CAPACITY); } public synchronized Object getElement(Object key) { Object obj; obj = _table.get(key); if(obj != null) { GenericCacheEntry entry; entry = (GenericCacheEntry)obj; __tryAgain[entry._index] = true; return entry._value; } return null; } /** * Adds a value to the cache. If the cache is full, the new value * replaces the first of the current values in the cache to have been * added, skipping any value that has been used since it was last * considered for replacement (i.e., second chance FIFO). *

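The replacement rule described above can be sketched in isolation. What follows is a minimal, self-contained illustration only, not this class's implementation: the class name SecondChanceFifoSketch and its get and put methods are hypothetical, lookups go through java.util.HashMap, and the capacity is assumed to be at least 1.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; not part of org.apache.oro.util.
final class SecondChanceFifoSketch {
  private final Object[] keys;
  private final Object[] values;
  // true means the slot was used since the replacement pointer last passed it.
  private final boolean[] tryAgain;
  private final Map<Object, Integer> slots = new HashMap<>();
  private int size = 0;
  private int current = 0; // next candidate slot for replacement

  // Assumption: capacity >= 1.
  SecondChanceFifoSketch(int capacity) {
    keys = new Object[capacity];
    values = new Object[capacity];
    tryAgain = new boolean[capacity];
  }

  Object get(Object key) {
    Integer slot = slots.get(key);
    if (slot == null) {
      return null;
    }
    tryAgain[slot] = true; // a hit earns the entry a second chance
    return values[slot];
  }

  void put(Object key, Object value) {
    Integer slot = slots.get(key);
    if (slot != null) {
      // Existing key: overwrite in place and mark the slot as used.
      values[slot] = value;
      tryAgain[slot] = true;
      return;
    }
    int index;
    if (size < keys.length) {
      index = size++; // cache not yet full: append
    } else {
      // Skip (and clear) every slot that still holds a second chance.
      index = current;
      while (tryAgain[index]) {
        tryAgain[index] = false;
        index = (index + 1) % keys.length;
      }
      current = (index + 1) % keys.length;
      slots.remove(keys[index]);
    }
    keys[index] = key;
    values[index] = value;
    slots.put(key, index);
  }
}
```

With a capacity of 3, adding a, b and c, reading a, and then adding d would evict b rather than a in this sketch, because the read gave a a second chance.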
* @param key The key referencing the value added to the cache. * @param value The value to add to the cache. */ public final synchronized void addElement(Object key, Object value) { int index; Object obj; obj = _table.get(key); if(obj != null) { GenericCacheEntry entry; // Just replace the value. Technically this upsets the FIFO2 ordering, // but it's expedient. entry = (GenericCacheEntry)obj; entry._value = value; entry._key = key; // Set the try again value to compensate. __tryAgain[entry._index] = true; return; } // If we haven't filled the cache yet, put it at the end. if(!isFull()) { index = _numEntries; ++_numEntries; } else { // Otherwise, find the next slot that doesn't have a second chance. index = __current; while(__tryAgain[index]) { __tryAgain[index] = false; if(++index >= __tryAgain.length) index = 0; } __current = index + 1; if(__current >= _cache.length) __current = 0; _table.remove(_cache[index]._key); } _cache[index]._value = value; _cache[index]._key = key; _table.put(key, _cache[index]); } } jakarta-oro-2.0.8/src/java/org/apache/oro/util/CacheFIFO.java0000644000175000017500000001145007773723336023071 0ustar arnaudarnaud/* * $Id: CacheFIFO.java,v 1.7 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. 
The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . 
*/ package org.apache.oro.util; import java.util.*; /** * This class is a GenericCache subclass implementing a FIFO (First * In First Out) cache replacement policy. In other words, values are * added to the cache until the cache becomes full. Once the cache is full, * if a new value is added to the cache, it replaces the first of * the current values in the cache to have been added. * * @version @version@ * @since 1.0 * @see GenericCache */ public final class CacheFIFO extends GenericCache { private int __current = 0; /** * Creates a CacheFIFO instance with a given cache capacity. *

* @param capacity The capacity of the cache. */ public CacheFIFO(int capacity) { super(capacity); } /** * Same as: *

   * CacheFIFO(GenericCache.DEFAULT_CAPACITY);
   * 
*/ public CacheFIFO(){ this(GenericCache.DEFAULT_CAPACITY); } /** * Adds a value to the cache. If the cache is full, when a new value * is added to the cache, it replaces the first of the current values * in the cache to have been added (i.e., FIFO). *

* @param key The key referencing the value added to the cache. * @param value The value to add to the cache. */ public final synchronized void addElement(Object key, Object value) { int index; Object obj; obj = _table.get(key); if(obj != null) { GenericCacheEntry entry; // Just replace the value. Technically this upsets the FIFO ordering, // but it's expedient. entry = (GenericCacheEntry)obj; entry._value = value; entry._key = key; return; } // If we haven't filled the cache yet, put it at the end. if(!isFull()) { index = _numEntries; ++_numEntries; } else { // Otherwise, replace the current pointer, which takes care of // FIFO in a circular fashion. index = __current; if(++__current >= _cache.length) __current = 0; _table.remove(_cache[index]._key); } _cache[index]._value = value; _cache[index]._key = key; _table.put(key, _cache[index]); } } jakarta-oro-2.0.8/src/java/org/apache/oro/util/Cache.java0000644000175000017500000000672707773723336022440 0ustar arnaudarnaud/* * $Id: Cache.java,v 1.7 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)."
* Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.util; /** * An interface defining the basic functions of a cache. 
* * @version @version@ * @since 1.0 */ public interface Cache { public void addElement(Object key, Object value); public Object getElement(Object key); /** * Returns the number of elements in the cache, not to be confused with * the {@link #capacity()} which returns the number * of elements that can be held in the cache at one time. *

* @return The current size of the cache (i.e., the number of elements * currently cached). */ public int size(); /** * Returns the maximum number of elements that can be cached at one time. *

* @return The maximum number of elements that can be cached at one time. */ public int capacity(); } jakarta-oro-2.0.8/src/java/org/apache/oro/util/package.html0000644000175000017500000000035107773723336023036 0ustar arnaudarnaud This package includes general classes required by {@link org.apache.oro.text} and related packages, but that can also be applied to more general uses. jakarta-oro-2.0.8/src/java/org/apache/oro/util/GenericCacheEntry.java0000644000175000017500000000636307773723336024753 0ustar arnaudarnaud/* * $Id: GenericCacheEntry.java,v 1.7 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. 
Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.util; /** * A structure used to store values in a GenericCache. It * is declared with default access to limit it to use only within the * package. * * @version @version@ * @since 1.0 */ final class GenericCacheEntry implements java.io.Serializable { /** The cache array index of the entry. */ int _index; /** The value stored at this entry. */ Object _value; /** The key used to store the value. 
*/ Object _key; GenericCacheEntry(int index) { _index = index; _value = null; _key = null; } } jakarta-oro-2.0.8/src/java/org/apache/oro/util/CacheLRU.java0000644000175000017500000001357207773723336023017 0ustar arnaudarnaud/* * $Id: CacheLRU.java,v 1.10 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. 
* * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.util; import java.util.*; /** * This class is a GenericCache subclass implementing an LRU * (Least Recently Used) cache replacement policy. In other words, * values are added to the cache until it becomes full. Once the * cache is full, when a new value is added to the cache, it replaces * the least recently used value currently in the cache. This is probably * the best general purpose cache replacement policy. * * @version @version@ * @since 1.0 * @see GenericCache */ public final class CacheLRU extends GenericCache { private int __head = 0, __tail = 0; private int[] __next, __prev; /** * Creates a CacheLRU instance with a given cache capacity. *

* @param capacity The capacity of the cache. */ public CacheLRU(int capacity) { super(capacity); int i; __next = new int[_cache.length]; __prev = new int[_cache.length]; for(i=0; i < __next.length; i++) __next[i] = __prev[i] = -1; } /** * Same as: *

   * CacheLRU(GenericCache.DEFAULT_CAPACITY);
   * 
*/ public CacheLRU(){ this(GenericCache.DEFAULT_CAPACITY); } private void __moveToFront(int index) { int next, prev; if(__head != index) { next = __next[index]; prev = __prev[index]; // Only the head has a prev entry that is an invalid index so // we don't check. __next[prev] = next; // Make sure index is valid. If it isn't, we're at the tail // and don't set __prev[next]. if(next >= 0) __prev[next] = prev; else __tail = prev; __prev[index] = -1; __next[index] = __head; __prev[__head] = index; __head = index; } } public synchronized Object getElement(Object key) { Object obj; obj = _table.get(key); if(obj != null) { GenericCacheEntry entry; entry = (GenericCacheEntry)obj; // Maintain LRU property __moveToFront(entry._index); return entry._value; } return null; } /** * Adds a value to the cache. If the cache is full, when a new value * is added to the cache, it replaces the least recently used value * in the cache (i.e., LRU). *

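For comparison, the same LRU policy can be sketched with the standard library alone. This is an illustration under stated assumptions, not how this class works: it relies on java.util.LinkedHashMap in access order instead of the index-linked arrays used here, the class name LruCacheSketch is hypothetical, and unlike this class it is not synchronized.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; not part of org.apache.oro.util.
final class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
  private final int capacity;

  LruCacheSketch(int capacity) {
    // accessOrder = true: iteration order runs from least to most recently used.
    super(16, 0.75f, true);
    this.capacity = capacity;
  }

  // Called by LinkedHashMap after each insertion; returning true drops the
  // eldest (least recently used) entry once the capacity is exceeded.
  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    return size() > capacity;
  }
}
```

With a capacity of 2, putting a and b, reading a, and then putting c would leave a and c in the map, since b is the least recently used entry at the time of the third insertion.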
* @param key The key referencing the value added to the cache. * @param value The value to add to the cache. */ public final synchronized void addElement(Object key, Object value) { Object obj; obj = _table.get(key); if(obj != null) { GenericCacheEntry entry; // Just replace the value, but move it to the front. entry = (GenericCacheEntry)obj; entry._value = value; entry._key = key; __moveToFront(entry._index); return; } // If we haven't filled the cache yet, place in next available spot // and move to front. if(!isFull()) { if(_numEntries > 0) { __prev[_numEntries] = __tail; __next[_numEntries] = -1; __moveToFront(_numEntries); } ++_numEntries; } else { // We replace the tail of the list. _table.remove(_cache[__tail]._key); __moveToFront(__tail); } _cache[__head]._value = value; _cache[__head]._key = key; _table.put(key, _cache[__head]); } } jakarta-oro-2.0.8/src/java/org/apache/oro/util/CacheRandom.java0000644000175000017500000001127307773723336023571 0ustar arnaudarnaud/* * $Id: CacheRandom.java,v 1.7 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." 
* Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.util; import java.util.*; /** * This class is a GenericCache subclass implementing a random * cache replacement policy. In other words, * values are added to the cache until it becomes full. Once the * cache is full, when a new value is added to the cache, it replaces * a randomly selected value in the cache. 
* * @version @version@ * @since 1.0 * @see GenericCache */ public final class CacheRandom extends GenericCache { private Random __random; /** * Creates a CacheRandom instance with a given cache capacity. *

* @param capacity The capacity of the cache. */ public CacheRandom(int capacity) { super(capacity); __random = new Random(System.currentTimeMillis()); } /** * Same as: *

   * CacheRandom(GenericCache.DEFAULT_CAPACITY);
   * 
*/ public CacheRandom(){ this(GenericCache.DEFAULT_CAPACITY); } /** * Adds a value to the cache. If the cache is full, when a new value * is added to the cache, it replaces a randomly selected value * in the cache (i.e., random replacement). *

* @param key The key referencing the value added to the cache. * @param value The value to add to the cache. */ public final synchronized void addElement(Object key, Object value) { int index; Object obj; obj = _table.get(key); if(obj != null) { GenericCacheEntry entry; // Just replace the value. entry = (GenericCacheEntry)obj; entry._value = value; entry._key = key; return; } // Expression is not in cache. // If we haven't filled the cache yet, put it at the end. if(!isFull()) { index = _numEntries; ++_numEntries; } else { // Otherwise, replace a random entry. index = (int)(_cache.length*__random.nextFloat()); _table.remove(_cache[index]._key); } _cache[index]._value = value; _cache[index]._key = key; _table.put(key, _cache[index]); } } jakarta-oro-2.0.8/src/java/org/apache/oro/util/GenericCache.java0000644000175000017500000001224007773723336023720 0ustar arnaudarnaud/* * $Id: GenericCache.java,v 1.8 2003/11/07 20:16:25 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." 
* Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package org.apache.oro.util; import java.util.*; /** * This is the base class for all cache implementations provided in the * org.apache.oro.util package. To derive a subclass from GenericCache * only the ... methods * need be overridden. * Although 4 subclasses of GenericCache are provided with this * package, users may not derive subclasses from this class. 
* Rather, users should create their own implementations of the * {@link Cache} interface. * * @version @version@ * @since 1.0 * @see Cache * @see CacheLRU * @see CacheFIFO * @see CacheFIFO2 * @see CacheRandom */ public abstract class GenericCache implements Cache, java.io.Serializable { /** * The default capacity to be used by the GenericCache subclasses * provided with this package. Its value is 20. */ public static final int DEFAULT_CAPACITY = 20; int _numEntries; GenericCacheEntry[] _cache; HashMap _table; /** * The primary constructor for GenericCache. It has default * access so it will only be used within the package. It initializes * _table to a HashMap of initial capacity equal to the capacity argument, * _cache to an array of size equal to the capacity argument, and * _numEntries to 0. *

* @param capacity The maximum capacity of the cache. */ GenericCache(int capacity) { _numEntries = 0; _table = new HashMap(capacity); _cache = new GenericCacheEntry[capacity]; while(--capacity >= 0) _cache[capacity] = new GenericCacheEntry(capacity); } public abstract void addElement(Object key, Object value); public synchronized Object getElement(Object key) { Object obj; obj = _table.get(key); if(obj != null) return ((GenericCacheEntry)obj)._value; return null; } public final Iterator keys() { return _table.keySet().iterator(); } /** * Returns the number of elements in the cache, not to be confused with * the {@link #capacity()} which returns the number * of elements that can be held in the cache at one time. *

* @return The current size of the cache (i.e., the number of elements * currently cached). */ public final int size() { return _numEntries; } /** * Returns the maximum number of elements that can be cached at one time. *

* @return The maximum number of elements that can be cached at one time. */ public final int capacity() { return _cache.length; } public final boolean isFull() { return (_numEntries >= _cache.length); } } jakarta-oro-2.0.8/src/java/org/apache/oro/overview.html0000644000175000017500000000077307773723336022344 0ustar arnaudarnaud The Jakarta-ORO library contains packages for performing general text processing in Java, with an aim to support, though not specifically limited to, servlet development. The core package is {@link org.apache.oro.text.regex}, which defines abstract interfaces for manipulating regular expressions, as well as a set of Perl5 compatible regular expression classes. Developers will mostly be interested only in that package. jakarta-oro-2.0.8/src/java/examples/0000755000175000017500000000000010423237774016600 5ustar arnaudarnaudjakarta-oro-2.0.8/src/java/examples/awk/0000755000175000017500000000000010423237774017362 5ustar arnaudarnaudjakarta-oro-2.0.8/src/java/examples/awk/streamInputExample.txt0000644000175000017500000000145407773723336023766 0ustar arnaudarnaud Many programmers believe C++ is too complicated for its own good and prefer to avoid its more obscure and confusing features. In fact, some programmers are so fed up with the language that they will only program in Java, even though Java is still very immature and dog-slow. That is not to say that Java is necessarily a better language than C++, but rather that Java simply has a stronger appeal to the tired C++ programmer. C++ is an object-oriented descendent of C. Being derived from C gave it one marvelous feature that Java lacks: the C preprocessor. C++ programmers that have converted to Java are banging their heads against their keyboards because they do not have a true conditional compilation mechanism. Of course, the lack of enumerations is also a great pain, although tolerable to some. 
jakarta-oro-2.0.8/src/java/examples/awk/splitExample.java0000644000175000017500000001232507773723336022707 0ustar arnaudarnaud/* * $Id: splitExample.java,v 1.8 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. 
IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package examples.awk; import java.util.*; import org.apache.oro.text.regex.*; import org.apache.oro.text.awk.*; /** * This is a test program demonstrating the use of the Util.split() method. * It is the same as the version in the OROMatcher distribution except that * it uses Awk classes instead of Perl classes. * * @version @version@ */ public final class splitExample { /** * A good way for you to understand the split() method is to play around * with it by using this test program. The program takes 2 or 3 arguments * as follows: * java splitExample regex input [split limit] * regex - A regular expression used to split the input. * input - A string to be used as input for split(). * split limit - An optional argument limiting the size of the list returned * by split(). If no limit is given, the limit used is * Util.SPLIT_ALL. Setting the limit to 1 generally doesn't * make any sense. 
* * Try the following two command lines to see how split limit works: * java splitExample '[:|]' '1:2|3:4' * java splitExample '[:|]' '1:2|3:4' 3 * */ public static final void main(String args[]) { int limit, i; String regularExpression, input; List results = new ArrayList(); Pattern pattern = null; PatternMatcher matcher; PatternCompiler compiler; Iterator elements; // Make sure there are sufficient arguments if(args.length < 2) { System.err.println("Usage: splitExample regex input [split limit]"); System.exit(1); } regularExpression = args[0]; input = args[1]; if(args.length > 2) limit = Integer.parseInt(args[2]); else limit = Util.SPLIT_ALL; // Create AwkCompiler and AwkMatcher instances. compiler = new AwkCompiler(); matcher = new AwkMatcher(); // Attempt to compile the pattern. If the pattern is not valid, // report the error and exit. try { pattern = compiler.compile(regularExpression); System.out.println("split regex: " + regularExpression); } catch(MalformedPatternException e){ System.err.println("Bad pattern."); System.err.println(e.getMessage()); System.exit(1); } // Split the input and print the resulting list. System.out.println("split results: "); Util.split(results, matcher, pattern, input, limit); elements = results.iterator(); i = 0; while(elements.hasNext()) System.out.println("item " + i++ + ": " + (String)elements.next()); } } jakarta-oro-2.0.8/src/java/examples/awk/substituteExample.java0000644000175000017500000001235207773723336023767 0ustar arnaudarnaud/* * $Id: substituteExample.java,v 1.8 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. 
Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package examples.awk; import org.apache.oro.text.regex.*; import org.apache.oro.text.awk.*; /** * This is a test program demonstrating the use of the Util.substitute() * method. It is the same as the version in the OROMatcher distribution * except that it uses Awk classes instead of Perl classes. * * @version @version@ */ public final class substituteExample { /** * A good way for you to understand the substitute() method is to play around * with it by using this test program. The program takes 3 or 4 arguments * as follows: * java substituteExample regex substitution input [sub limit] * regex - A regular expression used to find parts of the input to be * substituted. * substitution - The literal string that replaces each match. * input - A string to be used as input for substitute(). * sub limit - An optional argument limiting the number of substitutions. * If no limit is given, the limit used is Util.SUBSTITUTE_ALL. * * Try the following command line for a simple example of substitute(). * It changes (2,3) to (3,2) in the input. * java substituteExample '\(2,3\)' '(3,2)' '(1,2) (2,3) (4,5)' * * The following command line shows the substitute limit at work. It * changes the first four 1's in the input to 4's.
* java substituteExample '1' '4' '381298175 1111' 4 */ public static final void main(String args[]) { int limit; PatternMatcher matcher = new AwkMatcher(); Pattern pattern = null; PatternCompiler compiler = new AwkCompiler(); String regularExpression, input, result; Substitution sub; // Make sure there are sufficient arguments if(args.length < 3) { System.err.println("Usage: substituteExample regex substitution " + "input [sub limit]"); System.exit(1); } limit = Util.SUBSTITUTE_ALL; regularExpression = args[0]; sub = new StringSubstitution(args[1]); input = args[2]; if(args.length > 3) limit = Integer.parseInt(args[3]); try { pattern = compiler.compile(regularExpression); System.out.println("substitute regex: " + regularExpression); } catch(MalformedPatternException e){ System.err.println("Bad pattern."); System.err.println(e.getMessage()); System.exit(1); } // Perform substitution and print result. result = Util.substitute(matcher, pattern, sub, input, limit); System.out.println("result: " + result); } } jakarta-oro-2.0.8/src/java/examples/awk/prefixExample.java0000644000175000017500000001320307773723336023045 0ustar arnaudarnaud/* * $Id: prefixExample.java,v 1.7 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3.
The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . 
*/ package examples.awk; import org.apache.oro.text.regex.*; import org.apache.oro.text.awk.*; /** * This is a test program demonstrating an application of the matchesPrefix() * methods. This example program shows how you might tokenize a stream of * input using whitespace as a token separator. Don't forget to use quotes * around the input on the command line, e.g. * java prefixExample "Test to see if 1.0 is real and 2 is an integer" * * If you don't need the power of a full-blown lexer generator, you can * easily use regular expressions to create your own tokenization and * simple parsing classes using similar approaches. This example is * rather sloppy. If you look at the equivalent example in the OROMatcher * distribution, you'll see how Perl's zero-width lookahead assertion * makes correctness easier to achieve. * * @version @version@ */ public final class prefixExample { public static final int REAL = 0; public static final int INTEGER = 1; public static final int STRING = 2; public static final String[] types = { "Real", "Integer", "String" }; public static final String whitespace = "[ \t\n\r]+"; public static final String[] tokens = { "-?[0-9]*\\.[0-9]+([eE]-?[0-9]+)?", "-?[0-9]+", "[^ \t\n\r]+" }; public static final void main(String args[]) { int token; PatternMatcherInput input; PatternMatcher matcher; PatternCompiler compiler; Pattern[] patterns; Pattern tokenSeparator = null; MatchResult result; if(args.length < 1) { System.err.println("Usage: prefixExample "); System.exit(1); } input = new PatternMatcherInput(args[0]); compiler = new AwkCompiler(); patterns = new Pattern[tokens.length]; try { tokenSeparator = compiler.compile(whitespace); for(token=0; token < tokens.length; token++) patterns[token] = compiler.compile(tokens[token]); } catch(MalformedPatternException e) { System.err.println("Bad pattern."); e.printStackTrace(); System.exit(1); } matcher = new AwkMatcher(); _whileLoop: while(!input.endOfInput()) { for(token = 0; token < 
tokens.length; token++) if(matcher.matchesPrefix(input, patterns[token])) { int offset; result = matcher.getMatch(); offset = input.getCurrentOffset(); input.setCurrentOffset(result.endOffset(0)); if(matcher.matchesPrefix(input, tokenSeparator)) { input.setCurrentOffset(matcher.getMatch().endOffset(0)); System.out.println(types[token] + ": " + result); continue _whileLoop; } else if(input.endOfInput()) { System.out.println(types[token] + ": " + result); break _whileLoop; } input.setCurrentOffset(offset); } if(matcher.matchesPrefix(input, tokenSeparator)) input.setCurrentOffset(matcher.getMatch().endOffset(0)); else { System.err.println("Unrecognized token starting at offset: " + input.getCurrentOffset()); break; } } } } jakarta-oro-2.0.8/src/java/examples/awk/matchesContainsExample.java0000644000175000017500000001516307773723336024702 0ustar arnaudarnaud/* * $Id: matchesContainsExample.java,v 1.7 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." 
* Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package examples.awk; import org.apache.oro.text.regex.*; import org.apache.oro.text.awk.*; /** * This is a test program demonstrating the difference between the * OROMatcher matches() and contains() methods. 
* * @version @version@ */ public final class matchesContainsExample { /** * A common mistake is to confuse the behavior of the matches() and * contains() methods. matches() tests to see if a string exactly * matches a pattern whereas contains() searches for the first pattern * match contained somewhere within the string. When used with a * PatternMatcherInput instance, the contains() method allows you to * search for every pattern match within a string by using a while loop. */ public static final void main(String args[]) { int matches = 0; String numberExpression = "\\d+"; String exactMatch = "2010"; String containsMatches = " 2001 was the movie before 2010, which takes place before 2069 the book "; Pattern pattern = null; PatternMatcherInput input; PatternCompiler compiler; PatternMatcher matcher; MatchResult result; // Create AwkCompiler and AwkMatcher instances. compiler = new AwkCompiler(); matcher = new AwkMatcher(); // Attempt to compile the pattern. If the pattern is not valid, // report the error and exit. try { pattern = compiler.compile(numberExpression); } catch(MalformedPatternException e) { System.err.println("Bad pattern."); System.err.println(e.getMessage()); System.exit(1); } // Here we show the difference between the matches() and contains() // methods. Compile the program and study the output to reinforce // in your mind what the methods do. System.out.println("Input: " + exactMatch); // The following should return true because exactMatch exactly matches // numberExpression. if(matcher.matches(exactMatch, pattern)) System.out.println("matches() Result: TRUE, EXACT MATCH"); else System.out.println("matches() Result: FALSE, NOT EXACT MATCH"); System.out.println("\nInput: " + containsMatches); // The following should return false because containsMatches does not // exactly match numberExpression even though its subparts do. 
if(matcher.matches(containsMatches, pattern)) System.out.println("matches() Result: TRUE, EXACT MATCH"); else System.out.println("matches() Result: FALSE, NOT EXACT MATCH"); // Now we call the contains() method. contains() should return true // for both strings. System.out.println("\nInput: " + exactMatch); if(matcher.contains(exactMatch, pattern)) { System.out.println("contains() Result: TRUE"); // Fetch match and print. result = matcher.getMatch(); System.out.println("Match: " + result); } else System.out.println("contains() Result: FALSE"); System.out.println("\nInput: " + containsMatches); if(matcher.contains(containsMatches, pattern)) { System.out.println("contains() Result: TRUE"); // Fetch match and print. result = matcher.getMatch(); System.out.println("Match: " + result); } else System.out.println("contains() Result: FALSE"); // In the previous example, notice how contains() will fetch only first // match in a string. If you want to search a string for all of the // matches it contains, you must create a PatternMatcherInput object // to keep track of the position of the last match, so you can pick // up a search where the last one left off. input = new PatternMatcherInput(containsMatches); System.out.println("\nPatternMatcherInput: " + input); // Loop until there are no more matches left. while(matcher.contains(input, pattern)) { // Since we're still in the loop, fetch match that was found. result = matcher.getMatch(); ++matches; System.out.println("Match " + matches + ": " + result); } } } jakarta-oro-2.0.8/src/java/examples/awk/streamInputExample.java0000644000175000017500000001313307773723336024065 0ustar arnaudarnaud/* * $Id: streamInputExample.java,v 1.7 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. 
IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package examples.awk; import java.io.*; import org.apache.oro.text.regex.*; import org.apache.oro.text.awk.*; /** * This is a test program demonstrating how to search an input stream * with the AwkTools regular expression classes. * * @version @version@ */ public final class streamInputExample { /** * This program extracts sentences containing the word C++ from * the sample file streamInputExample.txt The regular expression * used is not perfect, so focus on AwkStreamInput and not the * ability of the regular expression to handle all normal sentences. * For those not familiar with the OROMatcher Util class, a use of * the Util.substitute method is included. */ public static final void main(String args[]) { // A regular expression to extract sentences containing the word C++. // We assume sentences can only end in . ! ? and start with a word // character \w String regex = "(\\w[^\\.?!]*C\\+\\+|C\\+\\+)[^\\.?!]*[\\.?!]"; String sentence; AwkMatcher matcher; AwkCompiler compiler; Pattern pattern = null, newline = null; AwkStreamInput input; MatchResult result; Reader file = null; // Create AwkCompiler and AwkMatcher instances. 
compiler = new AwkCompiler(); matcher = new AwkMatcher(); // Attempt to compile the pattern. If the pattern is not valid, // report the error and exit. try { pattern = compiler.compile(regex, AwkCompiler.CASE_INSENSITIVE_MASK); // Compile a pattern representing a string of newlines with other // whitespace stuck around the newlines newline = compiler.compile("(\\s*[\n\r]\\s*)+"); } catch(MalformedPatternException e) { System.err.println("Bad pattern."); System.err.println(e.getMessage()); System.exit(1); } // Open input file. try { file = new FileReader("streamInputExample.txt"); } catch(IOException e) { System.err.println("Error opening streamInputExample.txt."); System.err.println(e.getMessage()); System.exit(1); } // Create an AwkStreamInput instance to search the input stream. input = new AwkStreamInput(file); // We need to put the search loop in a try block because when searching // an AwkStreamInput instance, an IOException may occur, and it must be // caught. try { // Loop until there are no more matches left. while(matcher.contains(input, pattern)) { // Since we're still in the loop, fetch match that was found. result = matcher.getMatch(); // Substitute all newlines in the match with spaces. sentence = Util.substitute(matcher, newline, new StringSubstitution(" "), result.toString(), Util.SUBSTITUTE_ALL); System.out.println("\nMatch:\n" + sentence); } } catch(IOException e) { System.err.println("Error occurred while reading file."); System.err.println(e.getMessage()); System.exit(1); } } } jakarta-oro-2.0.8/src/java/examples/awk/strings.java0000644000175000017500000001444707773723336021740 0ustar arnaudarnaud/* * $Id: strings.java,v 1.3 2003/08/12 18:11:30 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2002 The Apache Software Foundation. All rights * reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. 
IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package examples.awk; import java.io.*; import org.apache.oro.text.regex.*; import org.apache.oro.text.awk.*; /** * This is a test program demonstrating how to search an input stream * with the jakarta-oro awk package regular expression classes. It * performs a function similar to the Unix strings command, * but is intended to show how matching on a stream is affected by its * character encoding. The most important thing to remember is that * AwkMatcher only matches on 8-bit values. If your input contains * Java characters containing values greater than 255, the pattern * matching process will result in an ArrayIndexOutOfBoundsException. * Therefore, if you want to search a binary file containing arbitrary * bytes, you have to make sure you use an 8-bit character encoding * like ISO-8859-1, so that the mapping between byte-values and character * values will be one to one. Otherwise, the file will be interpreted * as UTF-8 by default, and you will probably wind up with character * values outside of the 8-bit range. * * @version @version@ */ public final class strings { public static final class StringFinder { /** * Default string expression. 
Looks for at least 4 contiguous * printable characters. Differs slightly from GNU strings command * in that any printable character may start a string. */ public static final String DEFAULT_PATTERN = "[\\x20-\\x7E]{3}[\\x20-\\x7E]+"; Pattern pattern; AwkMatcher matcher; public StringFinder(String regex) throws MalformedPatternException { AwkCompiler compiler = new AwkCompiler(); pattern = compiler.compile(regex, AwkCompiler.CASE_INSENSITIVE_MASK); matcher = new AwkMatcher(); } public StringFinder() throws MalformedPatternException { this(DEFAULT_PATTERN); } public void search(Reader input, PrintWriter output) throws IOException { MatchResult result; AwkStreamInput in = new AwkStreamInput(input); while(matcher.contains(in, pattern)) { result = matcher.getMatch(); output.println(result); } output.flush(); } } public static final String DEFAULT_ENCODING = "ISO-8859-1"; public static final void main(String args[]) { String regex = StringFinder.DEFAULT_PATTERN; String filename, encoding = DEFAULT_ENCODING; StringFinder finder; Reader file = null; // Some users thought it would be useful to use the default pattern // and just pass the encoding as the second parameter. Therefore, // when two arguments are given and the second argument is not a valid // encoding, it is interpreted as a pattern. This means you can't // use a valid encoding name as a pattern without also specifying // an encoding as a third argument. 
if(args.length < 1) { System.err.println("usage: strings file [pattern|encoding] [encoding]"); return; } else if(args.length > 2) { regex = args[1]; encoding = args[2]; } else if(args.length > 1) encoding = args[1]; filename = args[0]; try { InputStream fin = new FileInputStream(filename); try { file = new InputStreamReader(fin, encoding); } catch(UnsupportedEncodingException uee) { if(args.length == 2) { regex = encoding; encoding = DEFAULT_ENCODING; file = new InputStreamReader(fin, encoding); } else throw uee; } finder = new StringFinder(regex); finder.search(file, new PrintWriter(new OutputStreamWriter(System.out))); file.close(); } catch(Exception e) { e.printStackTrace(); return; } } } jakarta-oro-2.0.8/src/java/examples/matchResultExample.java0000644000175000017500000001475607773723336023277 0ustar arnaudarnaud/* * $Id: matchResultExample.java,v 1.7 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. 
The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package examples; import org.apache.oro.text.regex.*; /** * This is a test program demonstrating the methods of the OROMatcher * MatchResult class. * * @version @version@ */ public final class matchResultExample { /** * Takes a regular expression and string as input and reports all the * pattern matches in the string. *

* @param args[] The array of arguments to the program. The first * argument should be a Perl5 regular expression, and the second * should be an input string. */ public static final void main(String args[]) { int groups; PatternMatcher matcher; PatternCompiler compiler; Pattern pattern = null; PatternMatcherInput input; MatchResult result; // Must have at least two arguments, else exit. if(args.length < 2) { System.err.println("Usage: matchResult pattern input"); return; } // Create Perl5Compiler and Perl5Matcher instances. compiler = new Perl5Compiler(); matcher = new Perl5Matcher(); // Attempt to compile the pattern. If the pattern is not valid, // report the error and exit. try { pattern = compiler.compile(args[0]); } catch(MalformedPatternException e) { System.err.println("Bad pattern."); System.err.println(e.getMessage()); return; } // Create a PatternMatcherInput instance to keep track of the position // where the last match finished, so that the next match search will // start from there. You always create a PatternMatcherInput instance // when you want to search a string for all of the matches it contains, // and not just the first one. input = new PatternMatcherInput(args[1]); // Loop until there are no more matches left. while(matcher.contains(input, pattern)) { // Since we're still in the loop, fetch match that was found. result = matcher.getMatch(); // Perform whatever processing on the result you want. // Here we just print out all its elements to show how the // MatchResult methods are used. // The toString() method is provided as a convenience method. // It returns the entire match. The following are all equivalent: // System.out.println("Match: " + result); // System.out.println("Match: " + result.toString()); // System.out.println("Match: " + result.group(0)); System.out.println("Match: " + result.toString()); // Print the length of the match. The length() method is another // convenience method. 
      // The lengths of subgroups can be obtained
      // by first retrieving the subgroup and then calling the string's
      // length() method.
      System.out.println("Length: " + result.length());

      // Retrieve the number of matched groups. A group corresponds to
      // a parenthesized set in a pattern.
      groups = result.groups();
      System.out.println("Groups: " + groups);

      // Print the offset into the input of the beginning and end of the
      // match. The beginOffset() and endOffset() methods return the
      // offsets of a group relative to the beginning of the input. The
      // begin() and end() methods return the offsets of a group relative
      // to the beginning of a match.
      System.out.println("Begin offset: " + result.beginOffset(0));
      System.out.println("End offset: " + result.endOffset(0));
      System.out.println("Groups: ");

      // Print the contents of each matched subgroup along with their
      // offsets relative to the beginning of the entire match.

      // Start at 1 because we just printed out group 0
      for(int group = 1; group < groups; group++) {
        System.out.println(group + ": " + result.group(group));
        System.out.println("Begin: " + result.begin(group));
        System.out.println("End: " + result.end(group));
      }
    }
  }
}

jakarta-oro-2.0.8/src/java/examples/filter.java

/*
 * $Id: filter.java,v 1.8 2003/11/07 20:16:23 dfs Exp $
 *
 * ====================================================================
 * The Apache Software License, Version 1.1
 *
 * Copyright (c) 2000 The Apache Software Foundation. All rights
 * reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 * notice, this list of conditions and the following disclaimer.
 *
 * 2.
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. 
For more * information on the Apache Software Foundation, please see * . */ package examples; import java.io.*; import java.util.*; import org.apache.oro.io.*; import org.apache.oro.text.*; /** * This is a sample program demonstrating how to use the regular expression * filename filter classes. * * @version @version@ */ public final class filter { public static void printList(String[] list) { System.out.println(); for(int i=0; i < list.length; i++) System.out.println(list[i]); System.out.println(); } public static final void main(String[] args) { File dir; FilenameFilter filter; dir = new File(System.getProperty("user.dir")); // List all files ending in .java filter = new GlobFilenameFilter("*.java", GlobCompiler.STAR_CANNOT_MATCH_NULL_MASK); System.out.println("Glob: *.java"); printList(dir.list(filter)); // List all files ending in .class filter = new AwkFilenameFilter(".+\\.class"); System.out.println("Awk: .+\\.class"); printList(dir.list(filter)); // List all files ending in .java or .class filter = new Perl5FilenameFilter(".+\\.(?:java|class)"); System.out.println("Perl5: .+\\.(?:java|class)"); printList(dir.list(filter)); } } jakarta-oro-2.0.8/src/java/examples/semicolon.java0000644000175000017500000000743007773723336021447 0ustar arnaudarnaud/* * $Id: semicolon.java,v 1.8 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. 
For more * information on the Apache Software Foundation, please see * . */ package examples; import java.io.*; import org.apache.oro.text.*; import org.apache.oro.text.regex.*; /** * This is a simple example program showing how to use the MatchActionProcessor * class. It reads the provided semi-colon delimited file semicolon.txt and * outputs only the second column to standard output. * * @version @version@ */ public final class semicolon { public static final void main(String[] args) { MatchActionProcessor processor = new MatchActionProcessor(); try { processor.setFieldSeparator(";"); // Using a null pattern means to perform the action for every line. processor.addAction(null, new MatchAction() { public void processMatch(MatchActionInfo info) { // We assume the second column exists info.output.println(info.fields.get(1)); } }); } catch(MalformedPatternException e) { e.printStackTrace(); System.exit(1); } try { processor.processMatches(new FileInputStream("semicolon.txt"), System.out); } catch(IOException e) { e.printStackTrace(); System.exit(1); } } } jakarta-oro-2.0.8/src/java/examples/semicolon.txt0000644000175000017500000000021707773723336021341 0ustar arnaudarnaud1;Trenton;New Jersey 2;Annapolis;Maryland 3;Austin;Texas 4;Richmond;Virginia 5;Harrisburg;Pennsylvania 6;Honolulu;Hawaii 7;Santa Fe;New Mexico jakarta-oro-2.0.8/src/java/examples/MatcherDemoApplet.java0000644000175000017500000002305107773723336023012 0ustar arnaudarnaud/* * $Id: MatcherDemoApplet.java,v 1.5 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. 
Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package examples; import java.applet.*; import java.awt.*; import java.io.*; import java.net.*; import org.apache.oro.text.*; import org.apache.oro.text.awk.*; import org.apache.oro.text.regex.*; /** * This is a quickly hacked together demo of regular expression * matching with three different regular expression syntaxes. * It was originally written in JDK 1.0.2 days and hasn't changed * much. It should be refactored into classes for a general purpose * interactive testing interface that can be run as a standalone * AWT application or embedded in an applet. * * @version @version@ */ public final class MatcherDemoApplet extends Applet { static int CONTAINS_SEARCH = 0, MATCHES_SEARCH = 1; static int CASE_SENSITIVE = 0, CASE_INSENSITIVE = 1; static int PERL5_EXPRESSION = 0; static int AWK_EXPRESSION = 1; static int GLOB_EXPRESSION = 2; static String[] expressionType = { "Perl5 Expression:", "AWK Expression:", "Glob Expression:" }; static int[] CASE_MASK[] = { { Perl5Compiler.DEFAULT_MASK, Perl5Compiler.CASE_INSENSITIVE_MASK }, { AwkCompiler.DEFAULT_MASK, AwkCompiler.CASE_INSENSITIVE_MASK }, { GlobCompiler.DEFAULT_MASK, GlobCompiler.CASE_INSENSITIVE_MASK } }; TextField expressionField; Label resultLabel, inputLabel; TextArea resultArea, inputArea; Choice expressionChoice, searchChoice, caseChoice; Button searchButton, resetButton; PatternCompiler compiler[]; PatternMatcher matcher[]; public MatcherDemoApplet() { setFont(new Font("Helvetica", Font.PLAIN, 14)); setBackground(new Color(210, 180, 140)); expressionChoice = new Choice(); for(int i = 0; i < expressionType.length; ++i) expressionChoice.addItem(expressionType[i]); compiler = new PatternCompiler[expressionType.length]; matcher = new 
PatternMatcher[expressionType.length]; compiler[PERL5_EXPRESSION] = new Perl5Compiler(); matcher[PERL5_EXPRESSION] = new Perl5Matcher(); compiler[AWK_EXPRESSION] = new AwkCompiler(); matcher[AWK_EXPRESSION] = new AwkMatcher(); compiler[GLOB_EXPRESSION] = new GlobCompiler(); matcher[GLOB_EXPRESSION] = matcher[PERL5_EXPRESSION]; expressionField = new TextField(10); searchChoice = new Choice(); searchChoice.addItem("contains()"); searchChoice.addItem("matches()"); caseChoice = new Choice(); caseChoice.addItem("Case Sensitive"); caseChoice.addItem("Case Insensitive"); searchButton = new Button("Search"); resetButton = new Button("Reset"); resultArea = new TextArea(20, 80); inputArea = new TextArea(5, 80); inputLabel = new Label("Search Input", Label.CENTER); resultLabel = new Label("Search Results", Label.CENTER); resultArea.setEditable(false); } public void init(){ String param; GridBagLayout layout; GridBagConstraints constraints; if((param = getParameter("background")) != null) { try { setBackground(new Color(Integer.parseInt(param, 16))); } catch(NumberFormatException e) { // do nothing, don't set color } } if((param = getParameter("fontSize")) != null) { Font font; font = getFont(); try { setFont(new Font(font.getFamily(), font.getStyle(), Integer.parseInt(param))); } catch(NumberFormatException e) { // do nothing, don't set font size } } setLayout(layout = new GridBagLayout()); constraints = new GridBagConstraints(); constraints.fill = GridBagConstraints.HORIZONTAL; constraints.anchor = GridBagConstraints.EAST; layout.setConstraints(expressionChoice, constraints); add(expressionChoice); constraints.weightx = 1.0; constraints.anchor = GridBagConstraints.WEST; constraints.gridwidth = GridBagConstraints.REMAINDER; layout.setConstraints(expressionField, constraints); add(expressionField); constraints.gridwidth = 1; layout.setConstraints(searchChoice, constraints); add(searchChoice); layout.setConstraints(caseChoice, constraints); add(caseChoice); 
layout.setConstraints(searchButton, constraints); add(searchButton); constraints.gridwidth = GridBagConstraints.REMAINDER; layout.setConstraints(resetButton, constraints); add(resetButton); constraints.gridwidth = GridBagConstraints.REMAINDER; layout.setConstraints(inputLabel, constraints); add(inputLabel); constraints.gridwidth = GridBagConstraints.REMAINDER; constraints.fill = GridBagConstraints.BOTH; constraints.weighty = 0.25; layout.setConstraints(inputArea, constraints); add(inputArea); constraints.weighty = 0.0; constraints.fill = GridBagConstraints.HORIZONTAL; layout.setConstraints(resultLabel, constraints); add(resultLabel); constraints.weighty = 1.0; constraints.fill = GridBagConstraints.BOTH; constraints.gridheight = GridBagConstraints.REMAINDER; layout.setConstraints(resultArea, constraints); add(resultArea); } public void search(){ int matchNum, group, caseMask, exprChoice, search; String text; MatchResult result; Pattern pattern; PatternMatcherInput input; resultArea.setText(""); text = expressionField.getText(); exprChoice = expressionChoice.getSelectedIndex(); caseMask = CASE_MASK[exprChoice][caseChoice.getSelectedIndex()]; resultArea.appendText("Compiling regular expression.\n"); try { pattern = compiler[exprChoice].compile(text, caseMask); } catch(MalformedPatternException e){ resultArea.appendText("\nMalformed Regular Expression:\n" + e.getMessage()); return; } search = searchChoice.getSelectedIndex(); text = inputArea.getText(); matchNum = 0; resultArea.appendText("\nSearching\n\n"); if(search == MATCHES_SEARCH) { if(matcher[exprChoice].matches(text, pattern)) resultArea.appendText("The input IS an EXACT match.\n"); else resultArea.appendText("The input IS NOT an EXACT match.\n"); } else { input = new PatternMatcherInput(text); while(matcher[exprChoice].contains(input, pattern)) { int groups; result = matcher[exprChoice].getMatch(); ++matchNum; resultArea.appendText("Match " + matchNum + ": " + result.group(0)+ "\n"); groups = result.groups(); 
if(groups > 1){ resultArea.appendText(" Subgroups:\n"); for(group=1; group < groups; group++){ resultArea.appendText(" " + group + ": " + result.group(group) + "\n"); } } } resultArea.appendText("\nThe input contained " + matchNum + " matches."); } } public boolean action(Event event, Object arg) { if(event.target == searchButton){ search(); return true; } else if(event.target == resetButton) { resultArea.setText(""); inputArea.setText(""); expressionField.setText(""); return true; } return false; } } jakarta-oro-2.0.8/src/java/examples/groups.java0000644000175000017500000001133507773723336020775 0ustar arnaudarnaud/* * $Id: groups.java,v 1.8 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. 
Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package examples; import java.io.*; import java.util.*; import org.apache.oro.text.*; import org.apache.oro.text.regex.*; /** * This is a sample program mimicking the Unix groups command. It assumes * the /etc/group file exists. 
* * @version @version@ */ public final class groups { public static final void main(String[] args) { int user; MatchActionProcessor processor = new MatchActionProcessor(); final Hashtable groups = new Hashtable(); Vector users = new Vector(); Enumeration usersElements; MatchAction action = new MatchAction() { public void processMatch(MatchActionInfo info) { // Add group name to hashtable entry ((Vector)groups.get(info.match.toString())).addElement( info.fields.get(0)); } }; if(args.length == 0) { // No arguments assumes calling user args = new String[1]; args[0] = System.getProperty("user.name"); } try { processor.setFieldSeparator(":"); for(user = 0; user < args.length; user++) { // Screen out duplicates if(!groups.containsKey(args[user])) { groups.put(args[user], new Vector()); // We assume usernames contain no special characters processor.addAction(args[user], action); // Add username to Vector to preserve argument order when printing users.addElement(args[user]); } } } catch(MalformedPatternException e) { e.printStackTrace(); System.exit(1); } try { processor.processMatches(new FileInputStream("/etc/group"), System.out); } catch(IOException e) { e.printStackTrace(); System.exit(1); } usersElements = users.elements(); while(usersElements.hasMoreElements()) { String username; Enumeration values; username = (String)usersElements.nextElement(); values = ((Vector)groups.get(username)).elements(); System.out.print(username + " :"); while(values.hasMoreElements()) { System.out.print(" " + values.nextElement()); } System.out.println(); } System.out.flush(); } } jakarta-oro-2.0.8/src/java/examples/jdfix.java0000644000175000017500000001256607773723336020571 0ustar arnaudarnaud/* * $Id: jdfix.java,v 1.8 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. 
IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package examples; import java.io.*; import org.apache.oro.text.perl.*; /** * This is an example program demonstrating how to use the PerlTools * match and substitute methods. * * @version @version@ */ public final class jdfix { /** * This program performs the exact same function as this Perl script. * Notice that the Java program is only so much longer because of all * of the I/O exception handling and InputStream creation. The core * while loop is EXACTLY the same length as the while loop in the Perl * script. The number of substitutions performed is printed to standard * output as additional information. Note, this is not an efficient way * to do this job; it is better to first read the entire file into a * character array. *

* This is a simple program that takes a javadoc-generated HTML file as * input and produces as output the same HTML file, except with a white * background color for the body. *

*

   * #!/usr/bin/perl
   *
   * $#ARGV >= 1 || die "Usage: jdfix input output\n";
   *
   * open(INPUT, $ARGV[0]) || warn "Couldn't open $ARGV[0]\n";
   * open(OUTPUT, ">$ARGV[1]") || warn "Couldn't open $ARGV[1]\n";
   *
   * while(<INPUT>){
   *     s///;
   *     print OUTPUT;
   * }
   * 
   * close(INPUT);
   * close(OUTPUT);
   * 
*/ public static final void main(String args[]) { String line; BufferedReader input = null; PrintWriter output = null; Perl5Util perl; StringBuffer result = new StringBuffer(); int numSubs = 0; if(args.length < 2) { System.err.println("Usage: jdfix input output"); return; } try { input = new BufferedReader(new FileReader(args[0])); } catch(IOException e) { System.err.println("Error opening input file: " + args[0]); e.printStackTrace(); return; } try { output = new PrintWriter(new FileWriter(args[1])); } catch(IOException e) { System.err.println("Error opening output file: " + args[1]); e.printStackTrace(); return; } perl = new Perl5Util(); try { while((line = input.readLine()) != null) { numSubs+=perl.substitute(result, "s///", line); result.append('\n'); } output.print(result.toString()); System.out.println("Substitutions made: " + numSubs); } catch(IOException e) { System.err.println("Error reading from input: " + args[0]); e.printStackTrace(); return; } finally { try { input.close(); output.close(); } catch(IOException e) { System.err.println("Error closing files."); e.printStackTrace(); return; } } } } jakarta-oro-2.0.8/src/java/examples/didNotMatch.java0000644000175000017500000000764007773723336021660 0ustar arnaudarnaud/* * $Id: didNotMatch.java,v 1.7 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2.
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. 
For more * information on the Apache Software Foundation, please see * . */ package examples; import org.apache.oro.text.regex.*; import org.apache.oro.text.perl.*; /** * This is a trivial example program demonstrating the preMatch() * and postMatch() methods of Perl5Util. * * @version @version@ */ public final class didNotMatch { /** * This program takes a Perl5 pattern and an input string as arguments. * It prints the parts of the input surrounding the first occurrence * of the pattern in the input. */ public static final void main(String args[]) { String pattern, input; Perl5Util perl; if(args.length < 2) { System.err.println("Usage: didNotMatch pattern input"); System.exit(1); } pattern = args[0]; input = args[1]; perl = new Perl5Util(); // Use a try block because we have no idea if the user will enter a valid // pattern. try { if(perl.match(pattern, input)) { System.out.println("Pre : " + perl.preMatch()); System.out.println("Post: " + perl.postMatch()); } else System.err.println("There was no match."); } catch(MalformedPerl5PatternException e) { System.err.println("You entered an invalid pattern."); System.err.println("Error: " + e.getMessage()); System.exit(1); } } } jakarta-oro-2.0.8/src/java/examples/substituteExample.java0000644000175000017500000001310407773723336023201 0ustar arnaudarnaud/* * $Id: substituteExample.java,v 1.9 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. 
For more * information on the Apache Software Foundation, please see * . */ package examples; import org.apache.oro.text.regex.*; /** * This is a test program demonstrating the use of the Util.substitute() * method. * * @version @version@ */ public final class substituteExample { /** * A good way for you to understand the substitute() method is to play around * with it by using this test program. The program takes 3 to 5 arguments * as follows: * java substituteExample * regex substitution input [sub limit] [interpolation limit] * regex - A regular expression used to find parts of the input to be * substituted. * substitution - The replacement string, which may reference saved * groups of the match (e.g. $1). * input - A string to be used as input for substitute(). * sub limit - An optional argument limiting the number of substitutions. * If no limit is given, the limit used is Util.SUBSTITUTE_ALL. * interpolation limit - An optional argument limiting the number of * interpolations performed. * * Try the following command line for a simple example of substitute(). * It changes (2,3) to (3,2) in the input. * java substituteExample '\(2,3\)' '(3, 2)' '(1,2) (2,3) (4,5)' * * The following command line shows the substitute limit at work. It * changes the first four 1's in the input to 4's. * java substituteExample "1" "4" "381298175 1111" "4" * * The next command line shows how to use interpolations.
Suppose we * want to reverse the coordinates of the first 3 entries in the input * and then have all the rest of the coordinates be equal to the new 3rd * entry: java substituteExample '\((\d+),(\d+)\)' '($2,$1)' '(1,2) (2,3) (4,5) (8,8) (9,2)' 5 3 * */ public static final void main(String args[]) { int limit, interps; PatternMatcher matcher = new Perl5Matcher(); Pattern pattern = null; PatternCompiler compiler = new Perl5Compiler(); String regularExpression, input, sub, result; // Make sure there are sufficient arguments if(args.length < 3) { System.err.println("Usage: substituteExample regex substitution " + "input [sub limit] [interpolation limit]"); System.exit(1); } limit = Util.SUBSTITUTE_ALL; interps = Perl5Substitution.INTERPOLATE_ALL; regularExpression = args[0]; sub = args[1]; input = args[2]; if(args.length > 3) limit = Integer.parseInt(args[3]); if(args.length > 4) interps = Integer.parseInt(args[4]); try { pattern = compiler.compile(regularExpression); System.out.println("substitute regex: " + regularExpression); } catch(MalformedPatternException e){ System.err.println("Bad pattern."); System.err.println(e.getMessage()); System.exit(1); } // Perform substitution and print result. result = Util.substitute(matcher, pattern, new Perl5Substitution(sub, interps), input, limit); System.out.println("result: " + result); } } jakarta-oro-2.0.8/src/java/examples/grep.java0000644000175000017500000001050707773723336020413 0ustar arnaudarnaud/* * $Id: grep.java,v 1.7 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. 
Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package examples; import java.io.*; import java.util.*; import org.apache.oro.text.*; import org.apache.oro.text.regex.*; /** * This is a no-frills implementation of grep using Perl regular expressions. * You can easily add most of the options present in most grep versions by * creating a MatchAction class or classes whose behavior varies based on * the provided flags. * * @version @version@ */ public final class grep { static int _file = 0; // args[] is declared final so that Inner Class may reference it. public static final void main(final String[] args) { MatchActionProcessor processor = new MatchActionProcessor(); if(args.length < 2) { System.err.println("Usage: grep "); System.exit(1); } try { if(args.length > 2) { // Print filename before line if more than one file is specified. // Rely on _file to point to current file being processed. 
processor.addAction(args[0], new MatchAction() { public void processMatch(MatchActionInfo info) { info.output.println(args[_file] + ":" + info.line); } }); } else { // We rely on the default action of printing matched // lines to the given OutputStream processor.addAction(args[0]); } } catch(MalformedPatternException e) { System.err.println("Bad pattern."); e.printStackTrace(); System.exit(1); } for(_file = 1; _file < args.length; _file++) { try { processor.processMatches(new FileInputStream(args[_file]), System.out); } catch(IOException e) { System.err.println("Error opening or reading " + args[_file]); e.printStackTrace(); System.exit(1); } } } } jakarta-oro-2.0.8/src/java/examples/prefixExample.java0000644000175000017500000001321407773723336022265 0ustar arnaudarnaud/* * $Id: prefixExample.java,v 1.7 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. 
The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package examples; import org.apache.oro.text.regex.*; /** * This is a test program demonstrating an application of the matchesPrefix() * methods introduced in OROMatcher v1.0.6. This example program shows how * you might tokenize a stream of input using whitespace as a token * separator. Don't forget to use quotes around the input on the command * line, e.g. 
* java prefixExample "Test to see if 1.0 is real and 2 is an integer" * * If you don't need the power of a full blown lexer generator, you can * easily use regular expressions to create your own tokenization and * simple parsing classes using similar approaches. * * @version @version@ */ public final class prefixExample { public static final int REAL = 0; public static final int INTEGER = 1; public static final int STRING = 2; public static final String[] types = { "Real", "Integer", "String" }; public static final String whitespace = "\\s+"; public static final String[] tokens = { "-?\\d*\\.\\d+(?:[eE][-+]-?\\d+)?(?=\\s|$)", "-?\\d+(?=\\s|$)", "\\S+" }; public static final String tokens2 = "(-?\\d*\\.\\d+(?:[eE][-+]-?\\d+)?(?=\\s|$))|(-?\\d+(?=\\s|$))|(\\S+)"; public static final void main(String args[]) { int token; PatternMatcherInput input; PatternMatcher matcher; PatternCompiler compiler; Pattern[] patterns; Pattern tokenSeparator = null, patterns2 = null; if(args.length < 1) { System.err.println("Usage: prefixExample "); System.exit(1); } input = new PatternMatcherInput(args[0]); compiler = new Perl5Compiler(); patterns = new Pattern[tokens.length]; try { tokenSeparator = compiler.compile(whitespace); patterns2 = compiler.compile(tokens2); for(token=0; token < tokens.length; token++) patterns[token] = compiler.compile(tokens[token]); } catch(MalformedPatternException e) { System.err.println("Bad pattern."); e.printStackTrace(); System.exit(1); } matcher = new Perl5Matcher(); System.out.println("\nOne approach.\n"); do { for(token = 0; token < tokens.length; token++) if(matcher.matchesPrefix(input, patterns[token])) { System.out.println(types[token] + ": " + matcher.getMatch()); break; } } while(matcher.contains(input, tokenSeparator)); // An alternative approach using the tokens2 expression which // packs all the token patterns into one regular expression. // As in Perl, there's more than one way to do something in Java. 
System.out.println("\nAn equivalent alternative.\n"); input.setCurrentOffset(input.getBeginOffset()); do { if(matcher.matchesPrefix(input, patterns2)) { MatchResult result = matcher.getMatch(); for(token = 1; token <= tokens.length; token++) { if(result.group(token) != null) { System.out.println(types[token - 1] + ": " + result); break; } } } } while(matcher.contains(input, tokenSeparator)); } } jakarta-oro-2.0.8/src/java/examples/splitExample.java0000644000175000017500000001207307773723336022125 0ustar arnaudarnaud/* * $Id: splitExample.java,v 1.8 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. 
Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package examples; import java.util.*; import org.apache.oro.text.regex.*; /** * This is a test program demonstrating the use of the Util.split() method. * * @version @version@ */ public final class splitExample { /** * A good way for you to understand the split() method is to play around * with it by using this test program. The program takes 2 or 3 arguments * as follows: * java splitExample regex input [split limit] * regex - A regular expression used to split the input. * input - A string to be used as input for split(). * split limit - An optional argument limiting the size of the list returned * by split(). If no limit is given, the limit used is * Util.SPLIT_ALL. Setting the limit to 1 generally doesn't * make any sense. 
* * Try the following two command lines to see how split limit works: * java splitExample '[:|]' '1:2|3:4' * java splitExample '[:|]' '1:2|3:4' 3 * */ public static final void main(String args[]) { int limit, i; String regularExpression, input; List results = new ArrayList(); Pattern pattern = null; PatternMatcher matcher; PatternCompiler compiler; Iterator elements; // Make sure there are sufficient arguments if(args.length < 2) { System.err.println("Usage: splitExample regex input [split limit]"); System.exit(1); } regularExpression = args[0]; input = args[1]; if(args.length > 2) limit = Integer.parseInt(args[2]); else limit = Util.SPLIT_ALL; // Create Perl5Compiler and Perl5Matcher instances. compiler = new Perl5Compiler(); matcher = new Perl5Matcher(); // Attempt to compile the pattern. If the pattern is not valid, // report the error and exit. try { pattern = compiler.compile(regularExpression); System.out.println("split regex: " + regularExpression); } catch(MalformedPatternException e){ System.err.println("Bad pattern."); System.err.println(e.getMessage()); System.exit(1); } // Split the input and print the resulting list. System.out.println("split results: "); Util.split(results, matcher, pattern, input, limit); elements = results.iterator(); i = 0; while(elements.hasNext()) System.out.println("item " + i++ + ": " + (String)elements.next()); } } jakarta-oro-2.0.8/src/java/examples/addCommas.java0000644000175000017500000000710607773723336021347 0ustar arnaudarnaud/* * $Id: addCommas.java,v 1.7 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. 
Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
 * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package examples; import org.apache.oro.text.perl.*; /** * * This is an example program based on a short example from the Camel book. * It demonstrates substitutions by adding commas to the string * representation of an integer. * * @version @version@ */ public final class addCommas { /** * This program takes a string as an argument and adds commas to all the * integers in the string exceeding 3 digits in length, placing each comma * three digits apart. */ public static final void main(String args[]) { String number; Perl5Util perl; if(args.length < 1) { System.err.println("Usage: addCommas integer"); System.exit(1); } number = args[0]; perl = new Perl5Util(); while(perl.match("/[+-]?\\d*\\d{4}/", number)) number = perl.substitute("s/([+-]?\\d*\\d)(\\d{3})/$1,$2/", number); System.out.println(number); } } jakarta-oro-2.0.8/src/java/examples/printPasswd.java0000644000175000017500000001050607773723336021773 0ustar arnaudarnaud/* * $Id: printPasswd.java,v 1.8 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3.
The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . 
*/ package examples; import java.io.*; import java.util.*; import org.apache.oro.text.perl.*; /** * This is an example program based on a short example from the Camel book. * It demonstrates splits by reading the /etc/passwd file (assuming you're * on a Unix system) and printing out the formatted entries. * * @version @version@ */ public final class printPasswd { public static final String[] fieldNames = { "Login: ", "Encrypted password: ", "UID: ", "GID: ", "Name: ", "Home: ", "Shell: " }; public static final void main(String args[]) { BufferedReader input = null; int field, record; String line; Perl5Util perl; ArrayList fields; Iterator it; try { input = new BufferedReader(new FileReader("/etc/passwd")); } catch(IOException e) { System.err.println("Could not open /etc/passwd."); e.printStackTrace(); System.exit(1); } perl = new Perl5Util(); record = 0; try { fields = new ArrayList(); while((line = input.readLine()) != null) { fields.clear(); perl.split(fields, "/:/", line); it = fields.iterator(); field = 0; System.out.println("Record " + record++); while(it.hasNext() && field < fieldNames.length) System.out.println(fieldNames[field++] + (String)it.next()); System.out.print("\n\n"); } } catch(IOException e) { System.err.println("Error reading /etc/passwd."); e.printStackTrace(); System.exit(1); } finally { try { input.close(); } catch(IOException e) { System.err.println("Could not close /etc/passwd."); e.printStackTrace(); System.exit(1); } } } } jakarta-oro-2.0.8/src/java/examples/matchesContainsExample.java0000644000175000017500000001511207773723336024112 0ustar arnaudarnaud/* * $Id: matchesContainsExample.java,v 1.7 2003/11/07 20:16:23 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. 
IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package examples; import org.apache.oro.text.regex.*; /** * This is a test program demonstrating the difference between the * matches() and contains() methods. * * @version @version@ */ public final class matchesContainsExample { /** * A common mistake is to confuse the behavior of the matches() and * contains() methods. matches() tests to see if a string exactly * matches a pattern whereas contains() searches for the first pattern * match contained somewhere within the string. When used with a * PatternMatcherInput instance, the contains() method allows you to * search for every pattern match within a string by using a while loop. */ public static final void main(String args[]) { int matches = 0; String numberExpression = "\\d+"; String exactMatch = "2010"; String containsMatches = " 2001 was the movie before 2010, which takes place before 2069 the book "; Pattern pattern = null; PatternMatcherInput input; PatternCompiler compiler; PatternMatcher matcher; MatchResult result; // Create Perl5Compiler and Perl5Matcher instances. compiler = new Perl5Compiler(); matcher = new Perl5Matcher(); // Attempt to compile the pattern. 
If the pattern is not valid, // report the error and exit. try { pattern = compiler.compile(numberExpression); } catch(MalformedPatternException e) { System.err.println("Bad pattern."); System.err.println(e.getMessage()); System.exit(1); } // Here we show the difference between the matches() and contains() // methods. Compile the program and study the output to reinforce // in your mind what the methods do. System.out.println("Input: " + exactMatch); // The following should return true because exactMatch exactly matches // numberExpression. if(matcher.matches(exactMatch, pattern)) System.out.println("matches() Result: TRUE, EXACT MATCH"); else System.out.println("matches() Result: FALSE, NOT EXACT MATCH"); System.out.println("\nInput: " + containsMatches); // The following should return false because containsMatches does not // exactly match numberExpression even though its subparts do. if(matcher.matches(containsMatches, pattern)) System.out.println("matches() Result: TRUE, EXACT MATCH"); else System.out.println("matches() Result: FALSE, NOT EXACT MATCH"); // Now we call the contains() method. contains() should return true // for both strings. System.out.println("\nInput: " + exactMatch); if(matcher.contains(exactMatch, pattern)) { System.out.println("contains() Result: TRUE"); // Fetch match and print. result = matcher.getMatch(); System.out.println("Match: " + result); } else System.out.println("contains() Result: FALSE"); System.out.println("\nInput: " + containsMatches); if(matcher.contains(containsMatches, pattern)) { System.out.println("contains() Result: TRUE"); // Fetch match and print. result = matcher.getMatch(); System.out.println("Match: " + result); } else System.out.println("contains() Result: FALSE"); // In the previous example, notice how contains() will fetch only the first // match in a string.
If you want to search a string for all of the // matches it contains, you must create a PatternMatcherInput object // to keep track of the position of the last match, so you can pick // up a search where the last one left off. input = new PatternMatcherInput(containsMatches); System.out.println("\nPatternMatcherInput: " + input); // Loop until there are no more matches left. while(matcher.contains(input, pattern)) { // Since we're still in the loop, fetch match that was found. result = matcher.getMatch(); ++matches; System.out.println("Match " + matches + ": " + result); } } } jakarta-oro-2.0.8/src/java/tools/0000755000175000017500000000000010423237774016122 5ustar arnaudarnaudjakarta-oro-2.0.8/src/java/tools/oroToApache.java0000644000175000017500000001616107773723336021206 0ustar arnaudarnaud/* * $Id: oroToApache.java,v 1.7 2003/11/07 20:16:26 dfs Exp $ * * ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation. All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * 3. The end-user documentation included with the redistribution, * if any, must include the following acknowledgment: * "This product includes software developed by the * Apache Software Foundation (http://www.apache.org/)." * Alternately, this acknowledgment may appear in the software itself, * if and wherever such third-party acknowledgments normally appear. * * 4. 
The names "Apache" and "Apache Software Foundation", "Jakarta-Oro" * must not be used to endorse or promote products derived from this * software without prior written permission. For written * permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache" * or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their * name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation. For more * information on the Apache Software Foundation, please see * . */ package tools; import java.io.*; import org.apache.oro.text.regex.*; /** * This is a program you can use to convert older source code that uses * the com.oroinc prefixes for the ORO text processing Java classes * to org.apache. It assumes source files are small enough to store in * memory and perform the substitutions. A small effort is made to not * blindly substitute com.oroinc so that code using NetComponents or other * ORO software will not have packages like com.oroinc.net become * org.apache.net. 
However, you will still have to manually fix some * code if you use the com.oroinc.io classes from NetComponents. * * @version @version@ * @since 2.0 */ public final class oroToApache { public static final String PACKAGE_PATTERN = "com\\.oroinc\\.(io|text|util)"; public static final String PACKAGE_SUBSTITUTION = "org.apache.oro.$1"; public static final String OLD_FILE_EXTENSION = "_old"; public static final class RenameException extends IOException { public RenameException() { } public RenameException(String message) { super(message); } } public static final class Converter { Pattern _sourcePattern; Perl5Matcher _matcher; Perl5Substitution _substitution; public static final int readFully(Reader reader, char[] buffer) throws IOException { int offset, length, charsRead; offset = 0; length = buffer.length; while(offset < buffer.length) { charsRead = reader.read(buffer, offset, length); if(charsRead == -1) break; offset+=charsRead; length-=charsRead; } return offset; } public Converter(String patternString) throws MalformedPatternException { Perl5Compiler compiler; _matcher = new Perl5Matcher(); compiler = new Perl5Compiler(); _sourcePattern = compiler.compile(patternString); _substitution = new Perl5Substitution(PACKAGE_SUBSTITUTION); } public void convertFile(String filename, String oldExtension) throws FileNotFoundException, RenameException, SecurityException, IOException { char[] inputBuffer; int inputLength; File srcFile, outputFile; FileReader input; FileWriter output; String outputData; srcFile = new File(filename); input = new FileReader(srcFile); outputFile = File.createTempFile(srcFile.getName(), null, srcFile.getAbsoluteFile().getParentFile()); output = new FileWriter(outputFile); inputBuffer = new char[(int)srcFile.length()]; inputLength = readFully(input, inputBuffer); input.close(); // new String(inputBuffer) is terribly inefficient because the // string ultimately gets converted back to a char[], but if we've // got the memory it's expedient. 
outputData = Util.substitute(_matcher, _sourcePattern, _substitution, new String(inputBuffer, 0, inputLength), Util.SUBSTITUTE_ALL); output.write(outputData); output.close(); if(!srcFile.renameTo(new File(srcFile.getAbsolutePath() + oldExtension))) throw new RenameException("Could not rename " + srcFile.getPath() + "."); if(!outputFile.renameTo(srcFile)) throw new RenameException("Could not rename temporary output file. " + "Original file is in " + srcFile.getAbsolutePath() + oldExtension); } } public static final void main(String[] args) { int file; Converter converter; if(args.length < 1) { System.err.println("usage: oroToApache [file ...]"); return; } try { converter = new Converter(PACKAGE_PATTERN); } catch(MalformedPatternException mpe) { // Shouldn't happen mpe.printStackTrace(); return; } for(file = 0; file < args.length; file++) { try { System.out.println("Converting " + args[file]); converter.convertFile(args[file], OLD_FILE_EXTENSION); } catch(FileNotFoundException fnfe) { System.err.println("Error: Could not open file. Skipping " + args[file]); } catch(RenameException re) { System.err.println("Error: " + re.getMessage()); } catch(SecurityException se) { System.err.println("Error: Could not rename a file while processing " + args[file] + ". Insufficient permission. " + "File may not have been converted."); } catch(IOException ioe) { ioe.printStackTrace(); System.err.println("Error: I/O exception while converting " + args[file] + ". 
File not converted."); } } } }
" #$ %& ' ( )* + , )- ./0 _positionsI_root$Lorg/apache/oro/text/awk/SyntaxNode;_nodes#[Lorg/apache/oro/text/awk/LeafNode; _followSet[Ljava/util/BitSet;((Lorg/apache/oro/text/awk/SyntaxNode;I)VCode_computeFollowPositions()V__addToFastMap(Ljava/util/BitSet;[Z[Z)V createFastMap()[Z   java/util/BitSet  org/apache/oro/text/awk/LeafNode  12 34 56 78 9: "org/apache/oro/text/awk/SyntaxTreejava/lang/Object(I)V"org/apache/oro/text/awk/SyntaxNode_followPosition;([Ljava/util/BitSet;[Lorg/apache/oro/text/awk/SyntaxNode;)Vget(I)Z_matches(C)Z_firstPosition()Ljava/util/BitSet;0*·*+µ*µ±TH**´½µ**´½µ*´<Yd<¢*´»Y*´·S§ÿæ*´*´*´¶ ±]Q6*´¢G+¶ ™8-3š1-T6¢!,3š,*´2’¶ T„§ÿÝ„§ÿ¶± (¼L*´¼M**´¶ +,· +°jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/awk/AwkStreamInput.class0000644000175000017500000000233207773723336026171 0ustar arnaudarnaudÊþº¾.E + , - . / 0 1 2 3 4567 8 9: 4;<=_DEFAULT_BUFFER_INCREMENTI ConstantValue__searchStreamLjava/io/Reader;__bufferIncrementUnit_endOfStreamReachedZ _bufferSize _bufferOffset_currentOffset_buffer[C()VCode(Ljava/io/Reader;I)V(Ljava/io/Reader;)V _reallocate(I)I Exceptionsread()Z endOfStream !        #> (?java/io/IOException(read from input stream returned 0 bytes. @A BC (D&org/apache/oro/text/awk/AwkStreamInputjava/lang/Objectjava/io/Reader([CII)I(Ljava/lang/String;)Vjava/lang/System arraycopy*(Ljava/lang/Object;ILjava/lang/Object;II)V([C)I1 !" *·*µ± #"6**·*+µ*µ*¼µ***ZµZµµ*µ± $" *+· ±%&"ym*´™*´¬*´d=*´`¼:*´*´¶ >*µš » Y · ¿*´¬*Y´`µ*`µ*´¸*µ¬' ()"F:*Y´*´`µ**´*´¶µ**´ §µ*´š§¬' *)"*´¬jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/awk/SyntaxNode.class0000644000175000017500000000061307773723336025347 0ustar arnaudarnaudÊþº¾. 
()VCode _nullable()Z_firstPosition()Ljava/util/BitSet; _lastPosition_followPosition;([Ljava/util/BitSet;[Lorg/apache/oro/text/awk/SyntaxNode;)V_clone(([I)Lorg/apache/oro/text/awk/SyntaxNode; "org/apache/oro/text/awk/SyntaxNodejava/lang/Object *·±   jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/awk/QuestionNode.class0000644000175000017500000000124207773723336025667 0ustar arnaudarnaudÊþº¾."      _epsilon$Lorg/apache/oro/text/awk/SyntaxNode;'(Lorg/apache/oro/text/awk/SyntaxNode;)VCode _nullable()Z_clone(([I)Lorg/apache/oro/text/awk/SyntaxNode;()V $org/apache/oro/text/awk/QuestionNode !  #org/apache/oro/text/awk/EpsilonNode org/apache/oro/text/awk/OrNodeK(Lorg/apache/oro/text/awk/SyntaxNode;Lorg/apache/oro/text/awk/SyntaxNode;)V_left"org/apache/oro/text/awk/SyntaxNode0    *+²·±¬»Y*´+¶·° »Y·³±jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/awk/AwkMatcher.class0000644000175000017500000000777407773723336025320 0ustar arnaudarnaudÊþº¾.® 2Q 1R 1ST Q 1U VW 1X Y Z 1[ \ 1] 1^_`a b c 1d e fg fh fi fj 1k l m 1n 1o 1p fq r s ft u v w x y z { |} ~  € Q‚ƒ__lastMatchedBufferOffsetI__lastMatchResult(Lorg/apache/oro/text/awk/AwkMatchResult;__scratchBuffer(Lorg/apache/oro/text/awk/AwkStreamInput;__streamSearchBuffer __awkPattern$Lorg/apache/oro/text/awk/AwkPattern; __offsets[I __beginOffset()VCode matchesPrefix)([CLorg/apache/oro/text/regex/Pattern;I)Z(([CLorg/apache/oro/text/regex/Pattern;)Z8(Ljava/lang/String;Lorg/apache/oro/text/regex/Pattern;)ZU(Lorg/apache/oro/text/regex/PatternMatcherInput;Lorg/apache/oro/text/regex/Pattern;)ZmatchescontainsN(Lorg/apache/oro/text/awk/AwkStreamInput;Lorg/apache/oro/text/regex/Pattern;)Z Exceptions__streamMatchPrefix()I_searchgetMatch)()Lorg/apache/oro/text/regex/MatchResult; @A 67 =>&org/apache/oro/text/awk/AwkStreamInput 89 „…"org/apache/oro/text/awk/AwkPattern ;< †‡ ˆ5 ?5 ‰5 :9 LMjava/io/IOException&org/apache/oro/text/awk/AwkMatchResultjava/lang/String @Š @‹ CD ŒŽ M ‘M ’M HE “… ”• 45 NA IE –— ˜™ š™ ›œ ž Ÿ5  — ¡5 ¢£ ¤¥ ¦§¨ ©ª «™ ¬… 
­…"org/apache/oro/text/awk/AwkMatcherjava/lang/Object(org/apache/oro/text/regex/PatternMatcher_endOfStreamReachedZ_buffer[C _bufferSize _bufferOffset([CII)V(Ljava/lang/String;I)V toCharArray()[C-org/apache/oro/text/regex/PatternMatcherInput getBuffergetBeginOffsetgetCurrentOffsetlength_hasBeginAnchor_fastMap[ZsetCurrentOffset(I)V beginOffset(I)I endOffsetsetMatchOffsets(II)Vread()Z_currentOffset_incrementMatchBeginOffset _numStates_getStateArray(I)[I_createNewState(II[I)V _endStatesLjava/util/BitSet;java/util/BitSetget(I)Z _reallocate_matchesNullString _hasEndAnchor1123456789:9;<=>?5@AB0$*·*µ*¼ µ*»Y·µ*´µ±CDB‡s6*,Àµ *´+µ *´+¾µ *´*Zµ µ *´µ**´µ*´O*·6§:6œ *µ¬*»Y»Y+··µ¬@FICEB*+,¶¬CFB *+¶,¶¬CGB‰>*,Àµ *´+¶µ *´*+¶Zµ µ *´+¶O*´+¶µ *´µ**´µ*·>§:>œ *µ¬*»Y»Y*´´ *´.·*´.·µ¬JORHEB„p>*,Àµ *´+µ *´+¾µ *´*Zµ µ *´µ**´µ*´O*·>§:>+¾Ÿ *µ¬*»Y»Y+··µ¬?DGHFB *+¶,¶¬HGBª–>*,Àµ *´+¶µ *´+¶µ *´*+¶Zµ µ *´+¶O*´µ**´µ*·>§:>*´´ Ÿ *µ¬*»Y»Y*´´ *´.*´´ ·*´.·µ¬JORIEBƒo*,Àµ *´ ´™*´ ´+43š *µ¬*´+µ *´+¾µ *´*Zµ µ *´µ**´µ*µ*¶§N*´Æ§¬Z^aIFB *+¶,¶ ¬IGB¼¨*,Àµ *´+¶µ *´*+¶Zµ µ *+¶µ*´ ´™,*´ *´ *´ ´*´´ *´ 43š *µ¬*´+¶µ *´µ**´µ*¶§N+*´¶!*´Ç¬+*´¶"*´¶#¶$¬y}€IJBs*,Àµ *´ ´™0+´ š"+¶%™"*´ ´+´ 43š*µ¬*µ¬*+´&µ*+µ*µ *¶+*´µ&*´Æ*´+´ ¶'¬¬KLMB  =6*´.Y66*´´ *´ `6¢š*´´ „4<*´ ´(¢>*´ ¶):.=š*´ ¶*.= §T*´ ´+¶,™6 ÿ*´¶-*´ `6*´´ *´ `6ŸÿxŸ d66§ÿe*´O*´dO *´ ´.™¬*´ ´/™ *´´™*´´ *´ `¢¬d¬KNABùí*µ*´*´´ *´ `¡#*´´™ *µ±*´¶%š±*µ*´<*´´ *´ `¢›*´O*´ ´*´´ 43™O*·Y=¤E*»Y»Y*´´ *´.·*´.·µ*ž*´.`§ *´.`µ±*´ ´.™*»Y»Y·0·µ*`µ±*´.`<§ÿ[*µ§ÿKOPB*´°jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/awk/AwkMatchResult.class0000644000175000017500000000160607773723336026154 0ustar arnaudarnaudÊþº¾.,  ! 
"# $ % &'()__matchBeginOffsetI__length__matchLjava/lang/String;(Ljava/lang/String;I)VCode_incrementMatchBeginOffset(I)Vlength()Igroupsgroup(I)Ljava/lang/String;begin(I)Iend beginOffset endOffsettoString()Ljava/lang/String; * +  &org/apache/oro/text/awk/AwkMatchResultjava/lang/Object%org/apache/oro/text/regex/MatchResult()Vjava/lang/String0     #*·*+µ*+¶µ*µ± *Y´`µ±*´¬¬ š *´§° 𧬠š *´§¬ š *´§¬š*´*´`§¬*¶°jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/0000755000175000017500000000000010423237774022544 5ustar arnaudarnaudjakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/CharStringPointer.class0000644000175000017500000000244507773723336027215 0ustar arnaudarnaudÊþº¾.B 0 1 2 3 4ÿÿ 5 6 7 8 9 :; < =>?_END_OF_STRINGC ConstantValue_offsetI_array[C([CI)VCode([C)V _getValue()C(I)C_getValueRelative _getLength()I _getOffset _setOffset(I)V_isAtEnd()Z _increment _decrement_postIncrement_postDecrement _toString(I)Ljava/lang/String;toString()Ljava/lang/String; @     &' ( )  ( )java/lang/String A ,-+org/apache/oro/text/regex/CharStringPointerjava/lang/Object()V([CII)V0*·*+µ*µ±*+·± **´¶¬#*´¾¢› *´4¬¬  **´`¶¬!"*´¾¬#"*´¬$%*µ±&'*´*´¾¡§¬(3'*Y´`µ*¶™**´¾µ¬*´*´4¬(*¶¬), *Y´dµ*´œ*µ*´*´4¬)*¶ ¬* *¶ <*¶ W¬+ *¶ <*¶ W¬,- » Y*´*´¾d·°./*¶°jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/Perl5Repetition.class0000644000175000017500000000060207773723336026633 0ustar arnaudarnaudÊþº¾.  _parenFloorI _numInstances_min_max_minModZ_scan_next _lastLocation_lastRepetition+Lorg/apache/oro/text/regex/Perl5Repetition;()VCode )org/apache/oro/text/regex/Perl5Repetitionjava/lang/Object0    *·±jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/StringSubstitution.class0000644000175000017500000000170507773723336027511 0ustar arnaudarnaudÊþº¾./     ! 
" # $%&'( _subLengthI _substitutionLjava/lang/String;()VCode(Ljava/lang/String;)VsetSubstitutiongetSubstitution()Ljava/lang/String;toStringappendSubstitution¿(Ljava/lang/StringBuffer;Lorg/apache/oro/text/regex/MatchResult;ILorg/apache/oro/text/regex/PatternMatcherInput;Lorg/apache/oro/text/regex/PatternMatcher;Lorg/apache/oro/text/regex/Pattern;)V    ) *+  , -.,org/apache/oro/text/regex/StringSubstitutionjava/lang/Object&org/apache/oro/text/regex/Substitutionjava/lang/Stringlength()Ijava/lang/StringBufferappend,(Ljava/lang/String;)Ljava/lang/StringBuffer;!   *·± *·*+¶±*+µ*+¶µ±*´°*¶°*´š±+*´¶ W±jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/PatternMatcher.class0000644000175000017500000000077207773723336026532 0ustar arnaudarnaudÊþº¾.   matchesPrefix)([CLorg/apache/oro/text/regex/Pattern;I)Z8(Ljava/lang/String;Lorg/apache/oro/text/regex/Pattern;)Z(([CLorg/apache/oro/text/regex/Pattern;)ZU(Lorg/apache/oro/text/regex/PatternMatcherInput;Lorg/apache/oro/text/regex/Pattern;)ZmatchescontainsgetMatch)()Lorg/apache/oro/text/regex/MatchResult;(org/apache/oro/text/regex/PatternMatcherjava/lang/Object     jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/Perl5Substitution.class0000644000175000017500000000666607773723336027245 0ustar arnaudarnaudÊþº¾.˜ ST $U $V WX YZ $[ S\ÿÿ $] $^_ ` $a b cd ce cf cg ch Si j k Slm $n %o $p %q $r Ys $t $u %v $w xyzINTERPOLATE_ALLI ConstantValueINTERPOLATE_NONEÿÿÿÿ__OPCODE_STORAGE_SIZE  __MAX_GROUPS _OPCODE_COPY_OPCODE_LOWERCASE_CHARÿÿÿþ_OPCODE_UPPERCASE_CHARÿÿÿý_OPCODE_LOWERCASE_MODEÿÿÿü_OPCODE_UPPERCASE_MODEÿÿÿû_OPCODE_ENDCASE_MODEÿÿÿú_numInterpolations _subOpcodes[I_subOpcodesCount_substitutionChars[C_lastInterpolationLjava/lang/String;__isInterpolationCharacter(C)ZCode __addElement(I)V 
__parseSubs(Ljava/lang/String;)V_finalInterpolatedSub;(Lorg/apache/oro/text/regex/MatchResult;)Ljava/lang/String;_calcSubB(Ljava/lang/StringBuffer;Lorg/apache/oro/text/regex/MatchResult;)V()V(Ljava/lang/String;I)VsetSubstitutionappendSubstitution¿(Ljava/lang/StringBuffer;Lorg/apache/oro/text/regex/MatchResult;ILorg/apache/oro/text/regex/PatternMatcherInput;Lorg/apache/oro/text/regex/PatternMatcher;Lorg/apache/oro/text/regex/Pattern;)V{ |C ;< ='} ~€ ‚ >? ƒ„ EF BCjava/lang/StringBuffer MF KL …†‡ ˆ‰ Š‹ Œ Ž ‹ ‘ ’“ ’” •‘ MO MN PO PH :' – GH @A QR IJ ’—+org/apache/oro/text/regex/Perl5Substitution,org/apache/oro/text/regex/StringSubstitutionjava/lang/CharacterisDigitjava/lang/System arraycopy*(Ljava/lang/Object;ILjava/lang/Object;II)Vjava/lang/String toCharArray()[Cdigit(CI)ItoString()Ljava/lang/String;%org/apache/oro/text/regex/MatchResultgroup(I)Ljava/lang/String;groups()Ibegin(I)Iendlength toLowerCase(C)Cappend(C)Ljava/lang/StringBuffer;([CII)Ljava/lang/StringBuffer; toUpperCaseindexOf,(Ljava/lang/String;)Ljava/lang/StringBuffer;!$%&'()*'(+,'(-.'(/'(+0'(12'(34'(56'(78'(9:';<='>?€@A BCD¸š & §¬EFDC7*´¾=*´  `¼ N*´-¸*-µ*´*Y´Z`µO±GHDŸ“*+¶Zµ:¾6* ¼ µ*µ66=>66  ¢a 46  `6 ™b  ¸6  ¤(£ h6 `6  *· § &  d4$ *· 6=§ö*· 6= $Ÿ  \ ™,>œ 6*· *·   ¼* d· §°›* d· 6  §• 46  $   ¸ =§~ \ w l šk*þ· „ §_ u šS*ý· „ §G L *ü· „ 6§1 U *û· „ 6§ E *ú· „ 6§>„ §þž±IJD!» Y · M*,+¶ ,¶°KLDÎÂ*´: 6*´:,¹¶: *´>6  ¢›  .6  ›` ,¹¢U, ¹6œ§l, ¹6  œ§Z,¹6¢K £D ¡§: d6 :§{  0„  ¡§  .6„  ¡§  .6:§H þŸ  ý üŸäûŸÝ 6§Ö üŸ  û   6§Á ú º6§´þ '+„4¸¶W+„ÿ¶W6§‰ý '+„4¸¶W+„ÿ¶W6§^ü $`6  ¢I+„4¸¶W§ÿéû $`6  ¢!+„4¸¶W§ÿé+¶W„ §þe±MND*·±MHD*+·±MOD *·*+¶±PHD*+¶±PODB6*+·*µŸ+$¶  +\¶Ÿ *+·§*µ*µ ±QRDUI*´Ç*+,·!±*´¡ *´¢ *+,¶ §*´  **,¶"µ +*´ ¶#W±jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/OpCode.class0000644000175000017500000000640207773723336024756 0ustar arnaudarnaudÊþº¾.œ ‘ ’“ ” • – —˜™_ENDC ConstantValue_BOL_MBOL_SBOL_EOL_MEOL_SEOL_ANY_SANY_ANYOF _CURLY _CURLYX _BRANCH _BACK 
_EXACTLY_NOTHING_STAR_PLUS_ALNUM_NALNUM_BOUND_NBOUND_SPACE_NSPACE_DIGIT_NDIGIT_REF_OPEN_CLOSE_MINMOD_GBOL_IFMATCH_UNLESSM _SUCCEED!_WHILEM"_ANYOFUN# _NANYOFUN$_RANGE%_ALPHA&_BLANK'_CNTRL(_GRAPH)_LOWER*_PRINT+_PUNCT,_UPPER-_XDIGIT._OPCODE/_NOPCODE0_ONECHAR1_ALNUMC2_ASCII3_operandLength[I_opType[C_opLengthVaries _opLengthOne _NULL_OFFSETIÿÿÿÿ _NULL_POINTER()VCode_getNextOffset([CI)I_getArg1([CI)C_getArg2 _getOperand(I)I _isInArray(C[CI)Z_getNextOperator_getPrevOperator_getNext_isWordCharacter(C)Z ~ ‚š ›Ž tu vw xw yw org/apache/oro/text/regex/OpCodejava/lang/Objectjava/lang/CharacterisLetterOrDigit0 :                       !"  #$  %&  '(  )*  +,  -.  /0  12  34  56  78  9:  ;<  =>  ?@  AB  CD  EF  GH  IJ  KL  MN  OP  QR  ST  UV  WX  YZ  [\  ]^  _`  ab  cd  ef  gh  ij  kl  mn  op  qr  stuvwxwywz{ |}   ~€*·±‚€*`4¬ƒ„€*`4¬…„€*`4¬†‡€`¬ˆ‰€ +¾¢+„4 ÿó¬¬Ї€`¬‹‡€d¬Œ‚€."*Ǭ*¸=š¬*4  d¬`¬Ž€¸š _ §¬€ þ4¼ YOYOYOYOYOYOYOYOYOY OY OY OY OY OYOYOYOYOYOYOYOYOYOYOYOYOYOYOYOYOYOYOY OY!OY"OY#OY$OY%OY&OY'OY(OY)OY*OY+OY,OY-OY.OY/OY0OY1OY2OY3O³4¼YUYUYUYUYUYUYUYUYUY  UY  UY  UY  UY  UYUYUYUYUYUYUYUYUYUYUYUYUYUYUYUYUYUY UY  UY!UY""UY##UY$$UY%%UY&&UY''UY((UY))UY**UY++UY,,UY--UY..UY//UY00UY11UY22UY33U³¼Y UY UYUYUY UY UYUY"U³¼YUYUY UYUYUYUYUYUYUY #UY $UY &UY 'UY (UY)UY*UY+UY,UY-UY.UY/UY0UY1UY2UY3U³±jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/MalformedPatternException.class0000644000175000017500000000035307773723336030727 0ustar arnaudarnaudÊþº¾.     ()VCode(Ljava/lang/String;)V  3org/apache/oro/text/regex/MalformedPatternExceptionjava/lang/Exception!*·±*+·±jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/Pattern.class0000644000175000017500000000023407773723336025217 0ustar arnaudarnaudÊþº¾.  
getPattern()Ljava/lang/String; getOptions()I!org/apache/oro/text/regex/Patternjava/lang/Objectjakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/Perl5Matcher.class0000644000175000017500000002671707773723336026113 0ustar arnaudarnaudÊþº¾.m m· l¸ l¹ lº» · l¼ l½ l¾ l¿ lÀ lÁ l là Ä Å‰ lÆ [Ç lÈ lÉÊ · lË Ì Í [Î lÏ Ð lÑ [Ò lÓ lÔ lÕÖ #× lØÙ &· #Ú #Û #ÜÝ +Þ #ß là [á [â [ã lä [å€ [æ [ç lè [é [ê ëì lí ëî ïð ïñ ïò ïó ïô ïõ ïö ï÷ ïø lùÿÿ ëú lû ëü ý þ ëÿ  ë     ë l l l ë   ï  l l +      l  l l    l !"__EOSC ConstantValue__INITIAL_NUM_OFFSETSI __multilineZ __lastSuccess__caseInsensitive__previousChar__input[C__originalInput __currentRep+Lorg/apache/oro/text/regex/Perl5Repetition;__numParentheses__bol__eol__currentOffset __endOffset __program __expSize __inputOffset __lastParen__beginMatchOffsets[I__endMatchOffsets__stackLjava/util/Stack;__lastMatchResult,Lorg/apache/oro/text/regex/Perl5MatchResult;__DEFAULT_LAST_MATCH_END_OFFSETÿÿÿœ__lastMatchInputEndOffset()VCode __compare ([CI[CII)Z __findFirst ([CII[C)I __pushState(I)V __popState__initInterpreterGlobals0(Lorg/apache/oro/text/regex/Perl5Pattern;[CIII)V__setLastMatchResult __interpret0(Lorg/apache/oro/text/regex/Perl5Pattern;[CIII)Z__matchUnicodeClass(C[CIC)Z__tryExpression(I)Z__repeat(II)I__match setMultiline(Z)V isMultiline()Z_toLower([C)[C matchesPrefix)([CLorg/apache/oro/text/regex/Pattern;I)Z(([CLorg/apache/oro/text/regex/Pattern;)Z8(Ljava/lang/String;Lorg/apache/oro/text/regex/Pattern;)ZU(Lorg/apache/oro/text/regex/PatternMatcherInput;Lorg/apache/oro/text/regex/Pattern;)ZmatchescontainsgetMatch)()Lorg/apache/oro/text/regex/MatchResult; ’“ uv wv xvjava/util/Stack ‹Œ Ž ‘s …s ‡s †s Љ ˆ‰ #$ %& s 'v z{ ƒs)org/apache/oro/text/regex/Perl5Repetition }~ (s )~ *{ „{ +š yp ,s ‚s €s s*org/apache/oro/text/regex/Perl5MatchResult ’š |{(java/lang/ArrayIndexOutOfBoundsException -s .‰ /‰java/lang/String ’0 12 œ 3{ 4s 5s —˜ 6s 7s 8v £¤ 9{ :s; <= ¡¢ >?@ A? B? C? D? E? F? G? H? 
IJ §¤ KL •– MN Os Ps QN Rs S= Ts Us Vv Ws X= ™š ›“ ¥¦ Y{Z [\ ]^&org/apache/oro/text/regex/Perl5Pattern ¬­ ®¯ _`a b{ c{ ds es fs ³° gh ´° Ÿ  ih jš kl ž“&org/apache/oro/text/regex/Perl5Matcherjava/lang/Object(org/apache/oro/text/regex/PatternMatcherpush&(Ljava/lang/Object;)Ljava/lang/Object;pop()Ljava/lang/Object;_isCaseInsensitive _numInstances_lastRepetition_programsetSize_numParentheses_matchBeginOffset_beginGroupOffset_endGroupOffset([CII)V_matchLjava/lang/String; _mustString_anchor_back_options _mustUtility _isExpensive _startString_startClassOffset org/apache/oro/text/regex/OpCode _getOperand(I)I_isWordCharacter(C)Zjava/lang/Character isWhitespaceisDigitisLetterOrDigitisLetter isSpaceChar isISOControl isLowerCase isUpperCasegetType(C)I_getNext([CI)I_getArg1([CI)C _parenFloor_min_getArg2_max_getNextOperator_scan_next_minMod _lastLocation_getPrevOperator_opTypejava/lang/System arraycopy*(Ljava/lang/Object;ILjava/lang/Object;II)V toLowerCase(C)C toCharArray()[C-org/apache/oro/text/regex/PatternMatcherInput_originalBuffer_toLowerBuffer _beginOffset _endOffset_currentOffsetlength()IgetMatchEndOffsetsetCurrentOffsetsetMatchOffsets(II)V1lmnopqGrsqtuvwvxvypz{|{}~s€ss‚sƒs„{…s†s‡sˆ‰Љ‹ŒŽsq‘s’“”6**·*µ*µ*µ*»Y·µ*µ*œµ ± •–”?36¢**¾¡¬,¾¡¬*4,4Ÿ¬„„„§ÿÕ¬ —˜”bV*¾š¬-46¢F*4 866¢ -¾¢-4*4Ÿ§ „„§ÿá<-¾¡§ „§ÿ»¬™š”wk*´ dh> ¼ M§ `¼ M,*´ O,*´ O,*´ O*´ 6¤),*´ .O,`*´.O,`O„ÿ„ý§ÿ×*´,¶W±›“”ˆ|*´¶ÀL*+.µ *+.µ *+.µ =+¾¢++`.>*´+`.O*´ £ *´ +.O„§ÿÕ*´ `>*´£*´ ¤ *´O*´ O„§ÿß±œ”Ë¿*+´µ*,µ*µ*»Y·µ*´µ*´µ*+´µ*´¶Ÿ * µ§"*,d4µ*´š*´  *µ*+´µ*µ *µ!*µ"*´`6*´Æ *´¾¤¢6*¼ µ*¼ µ ±ž“”òæ=*»#Y*´`·$µ*´ .*´%¾¤ »&Y·'¿*´*´.µ(*´›†*´*´.<›*´´)*´*´´(dO§*´´)*´O*´ *´.<›+*´´**´*´´(dO¤*´%¾£=§*´´**´O*Y´dµ§ÿy*´»+Y*´%*´.*´.d·,µ-*µ%±Ÿ ”ɽ66*+,·.6+´/:  ÆÄ+´0~™*´š +´0~™«+´1›¤**´*´  ¸2µ *´ ¡+´34~š +Y´5`µ56§=+´1›,*Y´ +´1dµ *´ ¢ *µ +´1 ¾`6§>+´6š,+´34~š"+Y´5dZµ5œ+Zµ/: *µ §*µ  ¾6+´0~™*´  *·7™ 6§³*´š+´0~š +´0~™™ž d6d6*´ ¤ *Y´ dµ *´ ¢k*´*Y´ Z`µ 4  ÿå*´ ¢ÿÜ**´ ·7™ÿÑ6§<+´8Ʋ+´8: +´0~™l 46 *´ ¢ *´*´ 4 B**´ ·7™ 6§ø*Y´ `µ *´ ¢*´*´ 4  *Y´ `µ §ÿß*Y´ `µ §ÿ **´*´ 
 ¸2Zµ ¢¤**´ ·7™ 6§“*Y´ `µ §ÿÌ+´9Y6 ŸH+´0~š§6 ž d6d66 *´ 4Y6ª $~m¶>ÔÿD‰Îåå ¸:6 *´ ¢ˆ*´*´ 46  ¢7*´  z`4 ~x~š  ™**´ ·7™ 6§~ 6 §6 *Y´ `µ §ÿ£ ¸:6 *´ ¢!*´*´ 46 * *´ ·;™  ™**´ ·7™ 6§% 6 §6 *Y´ `µ §ÿ±ž „„ÿ*´ Ÿ*´*´ d46  ¸<6 § *´¸<6 *´ ¢B*´*´ 46   ¸<Ÿ  š§6 **´ ·7™ 6§ *Y´ `µ §ÿ» ™S**´ ·7™H6§xž „„ÿ*´ Ÿ*´*´ d46  ¸<6 § *´¸<6 *´ ¢E*´*´ 46   ¸<Ÿ š§6 §**´ ·7™ 6§*Y´ `µ §ÿ¸ šº**´ ·7™¯6§ß*´ ¢ *´*´ 46  ¸<™  ™**´ ·7™ 6§­ 6 §6 *Y´ `µ §ÿº*´ ¢W*´*´ 46  ¸<š  ™**´ ·7™ 6§d 6 §6 *Y´ `µ §ÿº*´ ¢*´*´ 4¸=™  ™**´ ·7™ 6§ 6 §6 *Y´ `µ §ÿ¾*´ ¢É*´*´ 4¸=š  ™**´ ·7™ 6§Ú 6 §6 *Y´ `µ §ÿ¾*´ ¢„*´*´ 4¸>™  ™**´ ·7™ 6§• 6 §6 *Y´ `µ §ÿ¾*´ ¢?*´*´ 4¸>š  ™**´ ·7™ 6§P 6 §6 *Y´ `µ §ÿ¾§6ž d6d6**´ ·7™ 6§*Y´ Z`µ ¡ÿâ*µ*µ¬¡¢”9-# §6,4™,4% „,4¡,`4£¬„§ÿÙ,41 „,„4 ÿĬ,4/ § š§6„,„4ª­3– ­­ª´¾È­­­­­­­­­­­­Üæð:ú0D|­­­Ò£¸<™¬¸<𠬏=™ÿ¬¸=šõ¬¸>™ë¬¸>šá¬¸?™×¬¸@™Í¬¸A™Ã¬¸B™¹¬¸C™¬*´™¥¸D™ž¬¸D™¬*´™Š¸C™ƒ¬¸A™¬¸?™¬¸Eª1........¬§40¡ 9¤a¡ f¤A¡F£¬€¢¬§ýïš§¬£¤”]Q*µ *µ *µ *´ž!=*´£*´O*´ O„§ÿç*·F™*´O*´ *´ O¬¬¥¦”  *´ >*´"6GŸd¢`6¸:6*´4Y6ªÎ$‡žÃÎÎÎΤÎÎÎ>VÎÎn†ž¶ÎÎÎÎÎÎÎÎÎ  ¢D*´4 Ÿ9„§ÿì>§-„¢$*´4*´4 „§ÿç¢*´4Y6¢ù¢ñ*´z`4~x~šÚ„¢Ñ*´46§ÿТÀ*´46**´·;™§„¢ž*´46§ÿÞ¢*´4¸<™„§ÿë¢u*´4¸<ši„§ÿë¢]*´4¸=™Q„§ÿë¢E*´4¸=š9„§ÿë¢-*´4¸>™!„§ÿë¢*´4¸>š „§ÿë*´ d6*µ ¬§¤” … y6 6 *´ 6*´¢§6  ™ *´4§G=6*´¾6¢ ;*´¸H6*´4Y>ª $ Ø¡Þ9cyÔœˆ[å´±[[[“ÔÔ`˜É ·ÝU+ í  Øƒøø*´! *´  ,§ j*´™" š *´"¢*´d4  § D¬*´! *´  %§ - š *´"¢*´d4  § ¬*´! *´  §÷¬*´! §é¬ š *´"¢  Ÿ¬*´šÊ*´"d¤¿¬ š *´"¢¯ Ÿ©¬ š *´"¢  Ÿ¬*´"d¤†¬ š*´"¡¬„*´¢§6  ™ *´4§G=§L š *´"¢   ¬„*´¢§6  ™ *´4§G=§¸:6*´„46 *´4Ÿ¬*´"d ¢¬ ¤*´*´ ¸Iš¬ `6*´¢§6  ™ *´4§G=§˜¸:6G  ™ *´4=¢*´z`4~x~™¬ š*´"¡¬„*´¢§6  ™ *´4§G=§(¸:6G  ™ *´4=**´·;š¬ š*´"¡¬„*´¢§6  ™ *´4§G=§Å š¬¸<š¬„*´¢§6  ™ *´4§G=§ š*´"¡¬¸<™¬„*´¢§6  ™ *´4§G=§L*´! *´¸<6§*´d4¸<6¸<6 §*´4 § ¬ š*´"¡¬¸=š¬„*´¢§6  ™ *´4§G=§À š¬¸=™¬„*´¢§6  ™ *´4§G=§ˆ¸>š¬„*´¢§6  ™ *´4§G=§W š*´"¡¬¸>™¬„*´¢§6  ™ *´4§G=§*´¸J6 *´ .6 ¬*´  . ¬*´  . 
§Þ*´4Ÿ¬*´  .d6  `*´"¤¬ ¤*´*´ ¸Iš¬ `6*´¢§6  ™ *´4§G=§o§l§i*´¸J6 *´ O *´ ¤L* µ §C*´¸J6 *´  O *´ ¤&* µ §»Y·:  *´µ* µ *´ µK µ *´¸JµL *´¸MµN ¸O`µP µQ  µR µS*µ *¸T·F6 * ´µ ¬*´:  ´`6 *µ  ´S 4* ´µ*´´6 * ´Q·F™¬*´ µ* µ¬  ´L¢*  µ µS* ´P·F™¬  dµ¬ ´R™k* ´µ*´´6 * ´Q·F™¬*´ µ* µ  ´N¡¬*µ   µ µS* ´P·F™¬  dµ¬  ´N¢2* ´K·U  µ µS* ´P·F™¬*·V*µ * ´µ*´´6 * ´Q·F™¬  µ* µ  dµ¬*´4 Ÿ ¸O6§%*´ 6*µ *¸O·F™¬*´ 6  ¤*´  O„ ÿ§ÿî* µ *´¸H6Ÿ*´4 Ÿÿ±¬6 §Å  %*´¸J6 *´¸M6 ¸O`6§( 6 G6 ¸O6§6 G6 ¸O6*´4 *´¸:`4=6§ G=ü6*µ  ™|6  ž* ·W ¢¬  ¢ G Ú žÕüŸ*´ *´¢*´*´ 4 *·F™¬* `µ *·W™„ * `µ §ÿ£¬* ·W6   ¢4²X*´44 %*´š*´4 *´4  6   ¡=üŸ*´ *´¢*´*´ 4 *·F™¬„ ÿ* `µ §ÿ¬*µ *´ *´  ¬¬*µ ¸O6*·Fš¬*µ ¸O6*·F™¬6§õĬ¨©”*µ±ª«”*´¬¬­”>2+¾¼N+-+¾¸Y-L=+¾¢+4¸D™ ++4¸ZU„§ÿå+°®¯”C7,À[:*+µ%´™ *+¶\L*++¾·.**·7µ*µ*´¬®°”*+,¶]¬®±” *+¶^,¶]¬®²”k_,À[:*+´_µ%´™+´`Ç+**´%¶\µ`+´`N§*´%N*-+´a+´b+´c·.**+´c·7µ*µ*´¬³°”SG,À[N*+µ%-´™ *+¶\L*-++¾·.**·7™*´ .+¾ §µ*µ*´¬³±” *+¶^,¶d¬³²”’†,À[:*+´_µ%´™+´`Ç+**´%¶\µ`+´`N§*´%N*-+´a+´b+´a·.*µ*+´a·7™)*´ .+´bŸ+¶e™+´a+´b  *µ¬*µ¬´±” *+¶^,¶f¬´°”.",À[N*+µ%-´™ *+¶\L*-++¾·g¬´²”©+´c+´b¤¬,À[:*+´_µ%*+´_µ%´™+´`Ç+**´%¶\µ`+´`N§*´%N*+¶hµ *-+´a+´b+´c·g6™ +*´ .¶i+*´.*´ .¶j§ ++´b`¶i*œµ ¬µ¶”%*´š°*´Ç*·k*´°jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/Perl5Debug.class0000644000175000017500000000536607773723336025553 0ustar arnaudarnaudÊþº¾.Ï W_ `ab _ c Vd ef egh ij klÿÿ mn `opq rs `tu `vwxy `z{| `}~ `€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ ¡¢£¤ e¥ e¦§¨©ª«¬­®¯°±²³´µ()VCode printProgram<(Lorg/apache/oro/text/regex/Perl5Pattern;)Ljava/lang/String;_printOperator([CILjava/lang/StringBuffer;)V XY¶ ·¸java/lang/StringBuffer ¹º ]^» ¼½ ¾¿( ¹À) Á < ¹Ã> ĸstart `java/lang/String XÅ' ÆÇ stclass ` ÈÇ anchored plus  implicit ɸ must have "" back ÊÇ minlen ËÇ:BOLMBOLSBOLEOLMEOLANYSANYANYOFANYOFUNNANYOFUNBRANCHEXACTLYNOTHINGBACKENDALNUMNALNUMBOUNDNBOUNDSPACENSPACEDIGITNDIGITALPHABLANKCNTRLGRAPHLOWERPRINTPUNCTUPPERXDIGITALNUMCASCIICURLY { ÌÍ ÎÍCURLYX {REFOPENCLOSESTARPLUSMINMODGBOLUNLESSMIFMATCHSUCCEEDWHILEM2Operator is unrecognized. 
Faulty expression code!$org/apache/oro/text/regex/Perl5Debugjava/lang/Object&org/apache/oro/text/regex/Perl5Pattern_program[Cappend(I)Ljava/lang/StringBuffer; org/apache/oro/text/regex/OpCode_getNext([CI)I_operandLength[I,(Ljava/lang/String;)Ljava/lang/StringBuffer;toString()Ljava/lang/String;(C)Ljava/lang/StringBuffer; _startString([C)V_startClassOffsetI_anchor _mustString_back _minLength_getArg1([CI)C_getArg21VWXYZ*·± [\Z¼°=*´N6»Y·L™¿-4=+¶W-+¸-¸6².`6+»Y· ¶ ¶ ¶ ¶ ¶ W„   „§c#Ÿ $ %-4™-4%  „§ÿí„§ÿç„§5 /„+ ¶ W-4Ÿ+-4¶W„§ÿë+¶ W„+ ¶W§ÿC*´Æ*+»Y·¶ »Y*´·¶ ¶ ¶ ¶ W*´Ÿ+¶ W-*´+¸+¶ W*´~™ +¶ W*´~™ +¶ W*´~™ +¶ W*´Æ6+»Y·¶ »Y*´·¶ ¶ *´¶ ¶ ¶ ¶ W+»Y·!¶ *´"¶ ¶¶ ¶ W+¶ °]^ZšŽN,#¶ W*4ªp34àæìòøpþ ¬Ø."(@F:@FLRX^d,LR^Xdjpjpv|‚ˆŽ”šppp ¦$N§”%N§Ž&N§ˆ'N§‚(N§|)N§v*N§p+N§j,N§d-N§^.N§X/N§R0N§L1N§F2N§@3N§:4N§45N§.6N§(7N§"8N§9N§:N§;N§ N§ø?N§ò@N§ìAN§æBN§àCN§ÚDN§ÔEN§Î,F¶ W,*¸G¶W,,¶W,*¸H¶W,}¶W§¢,I¶ W,*¸G¶W,,¶W,*¸H¶W,}¶W§v,J¶ W,*¸G¶W§b,K¶ W,*¸G¶W§N,L¶ W,*¸G¶W§:MN§4NN§.ON§(PN§"QN§RN§SN§TN§ ,U¶ W-Æ ,-¶ W±jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/PatternCompiler.class0000644000175000017500000000067707773723336026725 0ustar arnaudarnaudÊþº¾.   
compile7(Ljava/lang/String;)Lorg/apache/oro/text/regex/Pattern; Exceptions 8(Ljava/lang/String;I)Lorg/apache/oro/text/regex/Pattern;'([C)Lorg/apache/oro/text/regex/Pattern;(([CI)Lorg/apache/oro/text/regex/Pattern;)org/apache/oro/text/regex/PatternCompilerjava/lang/Object3org/apache/oro/text/regex/MalformedPatternException jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/Perl5MatchResult.class0000644000175000017500000000210707773723336026746 0ustar arnaudarnaudÊþº¾.4 # $ % & '( ')* + ,-./_matchBeginOffsetI_beginGroupOffset[I_endGroupOffset_matchLjava/lang/String;(I)VCodelength()Igroupsgroup(I)Ljava/lang/String;begin(I)Iend beginOffset endOffsettoString()Ljava/lang/String; 0   1  23  *org/apache/oro/text/regex/Perl5MatchResultjava/lang/Object%org/apache/oro/text/regex/MatchResult()Vjava/lang/String substring(II)Ljava/lang/String;0    *·*¼ µ*¼ µ±$*´.*´.d<ž§¬*´¾¬YM*´¾¢E*´.=*´.>*´¶6›*›&¢£¤ *´¶°£°°/#*´¾¢*´.=*´.>› ›¬¬/#*´¾¢*´.=*´.>› ›¬¬4(*´¾¢ *´.=*´.>›› *´`¬¬ 4(*´¾¢ *´.=*´.>›› *´`¬¬!"*¶ °jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/MatchResult.class0000644000175000017500000000044207773723336026036 0ustar arnaudarnaudÊþº¾.length()Igroupsgroup(I)Ljava/lang/String;begin(I)Iend beginOffset endOffsettoString()Ljava/lang/String;%org/apache/oro/text/regex/MatchResultjava/lang/Object     jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/Substitution.class0000644000175000017500000000047107773723336026321 0ustar arnaudarnaudÊþº¾.appendSubstitution¿(Ljava/lang/StringBuffer;Lorg/apache/oro/text/regex/MatchResult;ILorg/apache/oro/text/regex/PatternMatcherInput;Lorg/apache/oro/text/regex/PatternMatcher;Lorg/apache/oro/text/regex/Pattern;)V&org/apache/oro/text/regex/Substitutionjava/lang/Objectjakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/PatternMatcherInput.class0000644000175000017500000000434207773723336027547 0ustar arnaudarnaudÊþº¾.c J K L M N O P Q R S T U V W X Y Z [\ Q ] ^ 
N_`_originalStringInputLjava/lang/String;_originalCharInput[C_originalBuffer_toLowerBuffer _beginOffsetI _endOffset_currentOffset_matchBeginOffset_matchEndOffset(Ljava/lang/String;II)VCode(Ljava/lang/String;)V([CII)V([C)Vlength()IsetInputcharAt(I)C substring(II)Ljava/lang/String;(I)Ljava/lang/String;getInput()Ljava/lang/Object; getBuffer()[C endOfInput()ZgetBeginOffset getEndOffsetgetCurrentOffsetsetBeginOffset(I)V setEndOffsetsetCurrentOffsettoString()Ljava/lang/String;preMatch postMatchmatchsetMatchOffsets(II)VgetMatchBeginOffsetgetMatchEndOffset &a $! %! .' ,- &' .* &* "! !    b7  @> => ?>java/lang/String #! FG-org/apache/oro/text/regex/PatternMatcherInputjava/lang/Object()V toCharArray1  !"!#!$!%!&'("*·*µ*µ*+¶±&)( *++¶·±&*("*·*µ*µ*+¶±&+( *++¾·±,-( *´ *´ d¬.'(8,*+µ *µ *µ *+¶µ*¶*¶**´ `¶±.)( *++¶¶±.*(5)*µ *µ **+Zµ µ*¶*¶**´ `¶±.+( *++¾¶±/0( *´*´ `4¬12(!»Y*´*´ `d·°13(&*´ `<»Y*´*´ d·°45(*´ Ç*´ °*´ °67(*´°89(*´*´ ¡§¬:-(*´ ¬;-(*´ ¬<-(*´¬=>(*µ ±?>(*µ ±@>( *µ*¶±AB( »Y*´*´ *¶·°CB(%»Y*´*´ *´*´ d·°DB(%»Y*´*´*´ *´d·°EB(%»Y*´*´*´*´d·°FG( *µ*µ±H-(*´¬I-(*´¬jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/Perl5Compiler.class0000644000175000017500000003403107773723336026266 0ustar arnaudarnaudÊþº¾.¼ Žé êë ì íî ï ð dñ ò ó Jôõ dö ÷ ø íù íú íû ü ý \þ \ÿ \ÿÿ \ J  \ \  \      é  \ $  ,     \  \ \ \! " J# J$ % J& J' () í* + \,-./ 0 {12 J3 4 5 6789:;< =>? 
@ ABCD \EFÿþGH aé aIJ dE aK aL aM aN aO aP aQ aR íS íT dU íV íW  íX Y aZ a[ a\ a] ^_ {é` Ja {bcdefghijklmnopqr __WORSTCASEI ConstantValue __NONNULL__SIMPLE __SPSTART __TRYAGAIN__CASE_INSENSITIVEC__GLOBAL__KEEP __MULTILINE __SINGLELINE __EXTENDED  __READ_ONLY€ __HEX_DIGITLjava/lang/String;__input-Lorg/apache/oro/text/regex/CharStringPointer;__sawBackreferenceZ__modifierFlags[C__numParentheses __programSize__cost __program __hashPOSIXLjava/util/HashMap; DEFAULT_MASKCASE_INSENSITIVE_MASKMULTILINE_MASKSINGLELINE_MASK EXTENDED_MASKREAD_ONLY_MASK()VCode quotemeta([C)Ljava/lang/String;&(Ljava/lang/String;)Ljava/lang/String;__isSimpleRepetitionOp(C)Z__isComplexRepetitionOp([CI)Z__parseRepetition __parseHex ([CII[I)I __parseOctal__setModifierFlag([CC)V __emitCode(C)V __emitNode(C)I __emitArgNode(CC)I__programInsertOperator(CI)V__programAddTail(II)V__programAddOperatorTail __getNextChar()C__parseAlternation([I)I Exceptions __parseAtom__parseUnicodeClass()I __parsePOSIX([Z)C __parseBranch__parseExpression(Z[I)Icompile(([CI)Lorg/apache/oro/text/regex/Pattern;'([C)Lorg/apache/oro/text/regex/Pattern;7(Ljava/lang/String;)Lorg/apache/oro/text/regex/Pattern;8(Ljava/lang/String;I)Lorg/apache/oro/text/regex/Pattern; »¼ ­®java/lang/StringBuffer »st u vw xy z{ ¾¿ ÅÄ |Â!0123456789abcdef0123456789ABCDEFx }~ ²® °‘ ® € ‚~ ÓÔ ©ª ƒ× „× …† ‡× ˆÂ ÍÎ ‰Ý Šs Ö× ‹× àÙ ±‘ ÜÝ áâ3org/apache/oro/text/regex/MalformedPatternExceptionError in expression at vŒ Ž »!?+* follows nothing in expression ‘’java/lang/NumberFormatException[Unexpected number format exception. Please report this bug.NumberFormatException message: “y ¯‘Invalid backreference: \ v” «¬ ÏÐ •–Trailing \ in expression. ËÌ —Ý „† ˜® ÆÇ ™Â š› ÈÇ œÂ › ÃÄ8Unexpected compilation failure. Please report this bug! ž~ Þß ‡†Invalid [] range in expression.Unmatched [] in expression.java/lang/Exception ³´ Ÿ java/lang/Character ¡× ÛÙ Á ÑÒInvalid interval {,}$Nested repetitions *?+ in expressioniogmsx-Sequence (?#... not terminated ÉÊ Sequence (?...) 
not recognized ØÙ ÕÔUnmatched parentheses.CUnreached characters at end of expression. Please report this bug!+org/apache/oro/text/regex/CharStringPointer »¢Unknown compilation error.Expression is too large.&org/apache/oro/text/regex/Perl5Pattern £®java/lang/String ¤¨ ¥¬ ¦‘ §‘ ¨‘ ©‘ ª® «® ¬­ ®¯ »° ±® ²³ ´® µÝ ¶¬ ·‘ ¸‘ ¹‘ ãäjava/util/HashMapalnum »Ì º»wordalphablankcntrldigitgraphlowerprintpunctspaceupperxdigitascii'org/apache/oro/text/regex/Perl5Compilerjava/lang/Object)org/apache/oro/text/regex/PatternCompiler(I)V org/apache/oro/text/regex/OpCode_isWordCharacterappend(C)Ljava/lang/StringBuffer;toString()Ljava/lang/String; toCharArray()[CisDigitindexOf(I)I_opType_getNext([CI)I_getNextOperator_postIncrement _getValue_getValueRelative(I)C _increment isWhitespace _getOffset _setOffset _decrement,(Ljava/lang/String;)Ljava/lang/StringBuffer; _toString(I)Ljava/lang/String;(Ljava/lang/String;)Vjava/lang/IntegerparseInt(Ljava/lang/String;)I getMessage(I)Ljava/lang/StringBuffer;_isAtEnd()Z _getLength_array isLowerCase toUpperCase(C)C isUpperCase toLowerCase _getOperandget&(Ljava/lang/Object;)Ljava/lang/Object; charValue([C)V_program _expression _isExpensive_startClassOffset_anchor_back_options _startString _mustString_getArg1([CI)C_operandLength[I([CII)V _opLengthOne _isInArray(C[CI)Z_opLengthVarieslength_isCaseInsensitive_numParentheses _minLength _mustUtilityput8(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;1Ž‘’“”‘’•–‘’—˜‘’™š‘’›œ’•ž’—Ÿ’™ ’›¡’¢£’¤¥’¦§¨’ ©ª«¬­®¯‘°‘±‘²®³´µ‘’“¶‘’•·‘’›¸‘’¢¹‘’¤º‘’¦»¼½*·*¼YUµ±¾¿½C7»Y*¾h·M<*¾¢!*4¸š ,\¶W,*4¶W„§ÿß,¶°¾À½*¶¸ ° Á½$*Ÿ+Ÿ ? 
§¬ ÃĽF:*¾¢5›1*4*Ÿ#*4+Ÿ*4?Ÿ*4{ *¸ ™§¬¬ ÅĽwk*4{Ÿ¬„*¾¢ *4¸ š¬*¾¢*4¸ ™ „§ÿî*¾¢*4, „*¾¢*4¸ ™ „§ÿî*¾¢ *4}Ÿ¬¬ ÆÇ½PD6-O*¾¢7Yd=ž/ *4¶ Y6Ÿ x6~€6„-\.`O§ÿɬ ÈǽQE6-O*¾¢8ž4*40¡,*47£$x6*40d€6„ÿ„-\.`O§ÿȬ Éʽž’ªgx`Wri|†*\4€’U±*\4€’U±*\4€’U±*\4€’U±*\4€’U±*\4 €’U±±Ë̽(*´Æ *´*´U*Y´`µ±ÍνI=*´=*´Ç*Y´`µ§%*´*Y´Z`µU*´*Y´Z`µU¬ÏнZN*´>*´Ç*Y´`µ§6*´*Y´Z`µU*´*Y´Z`µU*´*Y´Z`µU¬ÑÒ½‘…²4  §6*´Ç*Y´``µ±*´>*Y´``µ*´6¤„ÿ„ÿ*´*´4U§ÿè*´„U*´„UYd6ž*´„U§ÿì±ÓÔ½XL*´Æ ±>*´¸6 § >§ÿê*´4   d6§d6*´`’U±ÕÔ½2&*´ÆŸ²*´44 Ÿ±*¸·±Ö×½¦š*´¶<*´¶=( ?*´¶? 2*´¶# %Ÿ)Ÿ*´¶=§ÿì*´¶W§ÿ¹*´4 ~™=¸™*´¶W§ÿ›# %Ÿ Ÿ*´¶=§ÿì*´¶W§ÿs¬ØÙ½ÕÉ6+O* ·>=*´¶š*´¶*·W§*´¶W*·W*´¶6Ÿw|Ÿp)Ÿi÷~6*+· 6 ~™*´¶6§ÿƬ+\.~€O +\.~€O§*Y´!`µ!*·=*´¶6§ÿˆ  *·W¬Ú$ÛÙ½  e¼ YO:+O=6*´¶>«è #®$¤(0)b*ž+ž.à?ž[\¨^i|b*·W*´4~™ *·6§t*´4~™ *·6§^*·6§T*·W*´4~™ *·6§9*´4~™*·6§"*·6§*·W*´4~™*·6§ *·6*Y´!`µ!+\.€O§à*´¶W*·"6+\.€O§È*·W*·#6 .~™§þ©¬+\..~€O§–.~™ +\.€O¬»$Y»Y·%&¶'*´*´¶¶(¶'¶·)¿»$Y*·)¿*´¶>«øä0123456789ABDéGS»W_Z1abvcdÒefnrs¤twHxÿÿä*·6+\.€O*·W§-*·6+\.€O*·W§*·6+\.€O*·W§ÿ*·6+\.€O*·W§è*·6+\.€O*·W§Ñ*·6+\.€O*·W§º*·6+\.€O*·W§£*·6+\.€O*·W§Œ*·6+\.€O*·W§u*·6+\.€O*·W§^*·6+\.€O*·W§G=§B»Y ·:6*´¶>¸ ™¶W„*´¶>§ÿ嶸+6§$:»$Y»Y·%-¶'¶.¶'¶·)¿ ¤*´/¡=§É*´/¡»$Y»Y·%0¶'¶1¶·)¿*µ2*’·36+\.€O*´¶>¸ ™*´¶>§ÿñ*´¶W*·W§c*´¶4™ »$Y5·)¿=§J*´4 ~™1*´¶4š*´¶ Ÿ*´¶W§ÿâ*´¶4š§û*´¶W=§™Q*·6*·66*´¶d6*´¶76 ¢Ó ¢Ì6 *´¶8>«X C C C C C C#$‚(‚)‚.‚[‚\…^‚|‚§8*´„¶8>«xg0–1–2–3–4–5–6–7–8–9–ABDGSWZa;bckde1f'n rstwxEÿÿg„ÿ§! 
6„§Ã 6„§¹ 6„§¯ 6„§¥6„§›6„§‘¼ : *´´9„ ¸:’6 .`6§k„*´„¶86¸;™ ¸<6@‚’6§@6 *´¶8>0 6 *´`¶8>¸ ™}»Y ·:6 *´ ¶8>¸ ™¶W„ *´ ¶8>§ÿ嶸+6 §$:»$Y»Y·%-¶'¶.¶'¶·)¿ š *´/¡§6  ™&¼ : *´´9 ¸=’6 .`6§u„ÿ§Ã ¡ »$Y5·)¿*´„¶86§M*´4 ~™ ¢*´¶8 Ÿ „§ÿè*´4 ~™ „„ÿ§_*´„¶86*´4~™¸>™ ¸?6 ¢**´´9¸@™ž  6§„*·6§*·6„§ü,*´d¶*·Wœ »$YA·)¿ž +\.€O  +\.€O*´Æ*´¸B’U*·6¬,ßéì,Ú$Üݽ  <6¼ YO:¼YT:*´¶^ *$·6*´¶W§ *#·6*´¶>]Ÿ - =§=*´¶4š*´¶Y>] ™j=6*´¶W\Ÿ [ C\ *´¶>§*·C6  ™ 6 >Ÿª0xììììììììììsY?¤˜Ïfž’€†LŒ2ª6>6§Ù6>6§Ì6>6§¿6>6§²6>6§¥6>6§˜ >§’ >§Œ >§† >§€>§z>§t>§n*´´9*´¶¸:’>*´.¶DW§I*´¶>¸;™¸<>@‚’>§,*´´9*´¶d¸=’>*´.d¶DW§™¤ »$YE·)¿<§D6š<*´¶- 0*´¶`*´¶7¢*´¶]Ÿ*´¶W<§ý4 Y 3š */·6§*0·6§ *1·6*·6*´4~™$¸>™¸>™*Y´dµ*¸?·6¢O*%·6*·6*·6*´4~™-¸>™&¸>™*Y´dµ*¸?·6*¸?·66<6§ü€*´¶]Ÿ »$YF·)¿*·W*·6¬Ú$Þß½È ´*´¶=*´¶7>6*´„¶86:Ÿ¬*´¶8^  +T„§+T»Y·%:*´„¶8Y6:Ÿ¢¶W§ÿÞ§:¬*´„¶8]Ÿ¬²H¶¶I:Ǭ*´¶ÀJ¶K¬OtwGÚ$àÙ½ýá=>¼ YO:66*·L6 .~™ +\.€O¬*´¶6  ( L*´¶? ?*´¶# 2 Ÿ )Ÿ*´¶6 §ÿé Ÿ*·W*´¶6  { °*´´9*´¶¸ ™œ*´¶`6*´¶7Y6 6 *´¶86  ¸ š  , ) ,   Ÿ§6 „*´¶86 §ÿÎ } >»Y ·:   6 *´¶W*´¶6 *´ ¶86  ¸ ™ ¶W„ *´ ¶86 §ÿⶸ+6§$:»$Y»Y·%-¶'¶.¶'¶·)¿*´ ¶86  ,  „ § *´¶6  6 »Y ·:*´ ¶86  ¸ ™ ¶W„ *´ ¶86 §ÿâ  Ÿ ¶¸+6§$:»$Y»Y·%-¶'¶.¶'¶·)¿š*´ ¶80Ÿ6*´¶*·W=>š¢> ¸Mš +.O¬*·W+ +Ÿ§O * !.~™*·N*Y´!`µ!§U *  6>§F + !.~™*·N*Y´!`µ!§! +  6>§ ?  66>™¶.~™*Y´!*´!`l`µ!* ·N§2*Y´!*´!``µ!**"··* ·N**··ž+O™5¢.»$Y»Y·%O¶'¶1P¶'¶1Q¶'¶·)¿*´Æ*´`’U*´`’U*´¶? *·W*·N*`·*´´9*´¶¸@™ »$YR·)¿¬[eh,åöù,Ú$áâ½ ¼YU:¼YU:66 ¼ YO: S: :,O™<6*´¶? *´¶W*´¶Y>6«j!,#/:,=,§í*´¶>Ÿ)Ÿ*´¶>§ÿì)Ÿ »$YT·)¿*·W,O¬*´¶W*´¶>Ÿ+ ¶ Ÿ!-  :§ ¸U*´¶>§ÿÕ*´\44€’U*´\44‚~’U)Ÿ#»$Y»Y·%V¶'¶W¶'¶·)¿*·W,O¬*´/6 *Y´/`µ/* ’·36§6* ·X6   ¬Ÿ* ·§ 6 .~š ,\.þ~O,\. .~€O*´¶| @*·W* ·X6   ¬* · .~š ,\.þ~O,\. .~€O§ÿº«^^=!K:2=K*·6 §+* ’·36 §*!·6 ,\.þ~O§ *·6 * ·6  Ÿ*  ·Y*´ ¸6 §ÿç= *·N**··§! * ·N**··™ *´¶4š *·)Ÿ »$YZ·)¿š-*´¶4š#*´¶)  »$YZ·)¿»$Y[·)¿¬Ú$ã佌€¼ YON6 6 6*»\Y+·]µ~6*´’U*µ2*µ/*µ*µ!*µ*·6*-·#  »$Y^·)¿*´_¡ »$Y`·)¿**´¼µ»aY·b:*´µc»dY+·eµf*´¶*µ/*µ*µ!*·6*-·#  »$Y^·)¿*´4~6*´! ¡§µgµhµiµjµkµlµm::6*´*´¸4š¸Y66 *´ 46  Y6 šB  *´*´ ¸4  (Ÿ!Ÿ²4  :*´ ¸nž.  
6 § ²o.`6  ¸6 *´ 46§ÿ‹6™Ø6*´ 46 $»dY*´ `¸B*´ ¸B4·p:§ÿʲq¸r™  µh§ÿ´Ÿ    µh§ÿœ²4 4  µi§  µi§ µi ¸6 6§ÿa ÿZ²*´ ¸44 ÿG´i~™ÿ= µi ¸6 6§ÿ) ™ ™ *´2šY´i€µi»Y·%: »Y·%: 66666ž *´4Y6™ÿ  ?*´*´¸4  "ŠÐ6*´4  ÿÆ*´¸6§ÿé¸6§ÿ®  ŠÐ6*´¸6§ÿ” Õ6 *´*´¸Y64  6§ÿæ*´ ¸B4`6*´ ¸B46d 9 »dY*´ ¸B`·p¶'W`6`6*´¸6 §›§`¡=6»Y»dY*´ ¸B`·p·s: 6`6*´¸6 §Å`6§»²t¸r™|ŠÐ66 ¶u ¶u¤  : 6»Y·%:  *´¸4²q¸r™ „§h²4  ]*´¸`4²q¸r™G*´¸n`6§6²q¸r™*„„6 ¶u ¶u¤  : 6»Y·%: *´¸6§ýõ ¶u²*´ 44 §` ¶u¤ : 6§ »Y·%:  ¶už!Ç ¶:œ6µj§: ~™§µv*´/dµwµxƶµmdµyÆ ¶µl°Ú$ãå½*+¶z°Ú$ãæ½ *+¶¶z°Ú$ãç½ *+¶¶z°Ú$è¼½»{Y·|³H²H}»JY2·~¶W²H€»JY·~¶W²H»JY&·~¶W²H‚»JY'·~¶W²Hƒ»JY(·~¶W²H„»JY·~¶W²H…»JY)·~¶W²H†»JY*·~¶W²H‡»JY+·~¶W²Hˆ»JY,·~¶W²H‰»JY·~¶W²HŠ»JY-·~¶W²H‹»JY.·~¶W²HŒ»JY3·~¶W±jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/Util.class0000644000175000017500000000607207773723336024525 0ustar arnaudarnaudÊþº¾.o 12 3 45 46 78 9: ;< 7= 9> ?@ A BC A D E F G H I J KL M >NOSUBSTITUTE_ALLI ConstantValueÿÿÿÿ SPLIT_ALL()VCodesplity(Ljava/util/Collection;Lorg/apache/oro/text/regex/PatternMatcher;Lorg/apache/oro/text/regex/Pattern;Ljava/lang/String;I)Vx(Ljava/util/Collection;Lorg/apache/oro/text/regex/PatternMatcher;Lorg/apache/oro/text/regex/Pattern;Ljava/lang/String;)Vt(Lorg/apache/oro/text/regex/PatternMatcher;Lorg/apache/oro/text/regex/Pattern;Ljava/lang/String;I)Ljava/util/Vector; Deprecateds(Lorg/apache/oro/text/regex/PatternMatcher;Lorg/apache/oro/text/regex/Pattern;Ljava/lang/String;)Ljava/util/Vector; 
substituteœ(Lorg/apache/oro/text/regex/PatternMatcher;Lorg/apache/oro/text/regex/Pattern;Lorg/apache/oro/text/regex/Substitution;Ljava/lang/String;I)Ljava/lang/String;›(Lorg/apache/oro/text/regex/PatternMatcher;Lorg/apache/oro/text/regex/Pattern;Lorg/apache/oro/text/regex/Substitution;Ljava/lang/String;)Ljava/lang/String;£(Ljava/lang/StringBuffer;Lorg/apache/oro/text/regex/PatternMatcher;Lorg/apache/oro/text/regex/Pattern;Lorg/apache/oro/text/regex/Substitution;Ljava/lang/String;I)IÀ(Ljava/lang/StringBuffer;Lorg/apache/oro/text/regex/PatternMatcher;Lorg/apache/oro/text/regex/Pattern;Lorg/apache/oro/text/regex/Substitution;Lorg/apache/oro/text/regex/PatternMatcherInput;I)I #$-org/apache/oro/text/regex/PatternMatcherInput #PQ RS TUV WXY Z[\ ]^ _X `a &'java/util/Vector #b &)java/lang/StringBuffer ,0 cd ,- ea fg ha ijk lm naorg/apache/oro/text/regex/Utiljava/lang/Object(Ljava/lang/String;)V(org/apache/oro/text/regex/PatternMatchercontainsU(Lorg/apache/oro/text/regex/PatternMatcherInput;Lorg/apache/oro/text/regex/Pattern;)ZgetMatch)()Lorg/apache/oro/text/regex/MatchResult;%org/apache/oro/text/regex/MatchResult beginOffset(I)Ijava/lang/String substring(II)Ljava/lang/String;java/util/Collectionadd(Ljava/lang/Object;)Z endOffsetlength()I(I)VtoString()Ljava/lang/String;getBeginOffset getBuffer()[CgetMatchBeginOffsetappend([CII)Ljava/lang/StringBuffer;&org/apache/oro/text/regex/SubstitutionappendSubstitution¿(Ljava/lang/StringBuffer;Lorg/apache/oro/text/regex/MatchResult;ILorg/apache/oro/text/regex/PatternMatcherInput;Lorg/apache/oro/text/regex/PatternMatcher;Lorg/apache/oro/text/regex/Pattern;)VgetMatchEndOffset1 !" 
#$%*·± &'%i]»Y-·:6„ÿ™9+,¹™-+¹:*-¹¶¹W¹ 6§ÿÅ*--¶ ¶¹W± &(% *+,-¸ ± &)%#» Y· :*+,¸ °* &+%*+,¸°* ,-%:.»Y-¶ ·:»Y-·:*+,¸™ ¶°-° ,.% *+,-¸° ,/%#»Y·:*+,-¸¬ ,0%v j6¶6¶:™C+,¹™7„ÿ„*¶d¶W-*+¹+,¹¶6§ÿ¾*¶d¶W¬jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/regex/Perl5Pattern.class0000644000175000017500000000160707773723336026134 0ustar arnaudarnaudÊþº¾.4 , - ./0123 _OPT_ANCH_BOLI ConstantValue_OPT_ANCH_MBOL _OPT_SKIP _OPT_IMPLICIT _OPT_ANCH _expressionLjava/lang/String;_program[C _mustUtility_back _minLength_numParentheses_isCaseInsensitiveZ _isExpensive_startClassOffset_anchor_options _mustString _startString()VCode getPattern()Ljava/lang/String; getOptions()I %&  " &org/apache/oro/text/regex/Perl5Patternjava/lang/Object!org/apache/oro/text/regex/Patternjava/io/Serializablejava/lang/Cloneable1                ! " #$%&'*·±()'*´°*+'*´¬jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/perl/0000755000175000017500000000000010423237774022374 5ustar arnaudarnaudjakarta-oro-2.0.8/docs/classes/org/apache/oro/text/perl/Perl5Util.class0000644000175000017500000001721207773723336025263 0ustar arnaudarnaudÊþº¾.B d§¨ § c©ª § c« c¬­ ®¯ ° c± c²³ § c´µ §¶ · c¸¹º » ¼ ½¾¿À ÁÂà §Ä Å Æ ¼ Ç eÈ BÉ BÊË ®Ì ½Í cÎ Ï cÐ cÑ cÒ cÓ BÔ cÕ Ö H× HØ HÙÚ 8Û 8Ü 8Ý Þß àá ° Bâ ãäå Bæç Dè 8é cêë H¼ eì Bí î eï eð ñ ò ó ôõ ö c÷ø cùú W° cû cü eÉ eý eþ dÆÿ  B__matchExpressionLjava/lang/String; ConstantValue__patternCache"Lorg/apache/oro/text/PatternCache;__expressionCacheLorg/apache/oro/util/Cache; __matcher(Lorg/apache/oro/text/regex/Perl5Matcher;__matchPattern#Lorg/apache/oro/text/regex/Pattern; __lastMatch'Lorg/apache/oro/text/regex/MatchResult; __splitListLjava/util/ArrayList;__originalInputLjava/lang/Object;__inputBeginOffsetI__inputEndOffset __nullString SPLIT_ALL%(Lorg/apache/oro/text/PatternCache;)VCode()V__compilePatterns__parseMatchExpression7(Ljava/lang/String;)Lorg/apache/oro/text/regex/Pattern; 
Exceptionsmatch(Ljava/lang/String;[C)Z'(Ljava/lang/String;Ljava/lang/String;)ZD(Ljava/lang/String;Lorg/apache/oro/text/regex/PatternMatcherInput;)ZgetMatch)()Lorg/apache/oro/text/regex/MatchResult; substitute?(Ljava/lang/StringBuffer;Ljava/lang/String;Ljava/lang/String;)I8(Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;split>(Ljava/util/Collection;Ljava/lang/String;Ljava/lang/String;I)V=(Ljava/util/Collection;Ljava/lang/String;Ljava/lang/String;)V+(Ljava/util/Collection;Ljava/lang/String;)V9(Ljava/lang/String;Ljava/lang/String;I)Ljava/util/Vector; Deprecated8(Ljava/lang/String;Ljava/lang/String;)Ljava/util/Vector;&(Ljava/lang/String;)Ljava/util/Vector;length()Igroupsgroup(I)Ljava/lang/String;begin(I)Iend beginOffset endOffsettoString()Ljava/lang/String;preMatch postMatchpreMatchCharArray()[CpostMatchCharArray }€java/util/ArrayList st&org/apache/oro/text/regex/Perl5Matcher mn ijorg/apache/oro/util/CacheLRU — } kl €#org/apache/oro/text/PatternCacheLRU }~'org/apache/oro/text/regex/Perl5Compilerm?(\W)(.*)\1([imsx]*)   op3org/apache/oro/text/regex/MalformedPatternExceptionjava/lang/RuntimeException  ¡ }  !org/apache/oro/text/regex/Patternjava/lang/ClassCastException 7org/apache/oro/text/perl/MalformedPerl5PatternExceptionjava/lang/StringBufferInvalid expression:   ¡ ‰Š ™š –— Invalid options:   ‚ƒ  qr uv wx yx ¥ …†    — !—0org/apache/oro/text/perl/ParsedSubstitutionEntry "p #$ %x& ‹'( )* +, -Invalid option: java/lang/String }.+org/apache/oro/text/regex/Perl5Substitution }/ }0 ‹Œ-org/apache/oro/text/regex/PatternMatcherInput žœ 12 34 ˜— Ÿœ 5— 67 879 :; <€ Ž/\s+/ Žjava/util/Vector Ž’ Ž” ›œ œ[C= >? 
@A"org/apache/oro/text/perl/Perl5Utiljava/lang/Object%org/apache/oro/text/regex/MatchResult org/apache/oro/text/PatternCachecapacity(I)Vcompile8(Ljava/lang/String;I)Lorg/apache/oro/text/regex/Pattern; getMessage(Ljava/lang/String;)Vorg/apache/oro/util/Cache getElement&(Ljava/lang/Object;)Ljava/lang/Object;matches8(Ljava/lang/String;Lorg/apache/oro/text/regex/Pattern;)Zappend,(Ljava/lang/String;)Ljava/lang/StringBuffer;charAt(I)C getPattern addElement'(Ljava/lang/Object;Ljava/lang/Object;)Vcontains(([CLorg/apache/oro/text/regex/Pattern;)Z toCharArrayU(Lorg/apache/oro/text/regex/PatternMatcherInput;Lorg/apache/oro/text/regex/Pattern;)ZgetInput()Ljava/lang/Object;getBeginOffset getEndOffset_pattern _substitution-Lorg/apache/oro/text/regex/Perl5Substitution;_numSubstitutionsorg/apache/oro/text/regex/Util£(Ljava/lang/StringBuffer;Lorg/apache/oro/text/regex/PatternMatcher;Lorg/apache/oro/text/regex/Pattern;Lorg/apache/oro/text/regex/Substitution;Ljava/lang/String;I)Ijava/lang/CharacterisLetterOrDigit(C)Z lastIndexOf(II)I(C)Ljava/lang/StringBuffer;([CII)V(Ljava/lang/String;I)VT(Lorg/apache/oro/text/regex/Pattern;Lorg/apache/oro/text/regex/Perl5Substitution;I)V substring(II)Ljava/lang/String;add(Ljava/lang/Object;)Zsizeget(I)Ljava/lang/Object;removejava/util/CollectionaddAll(Ljava/util/Collection;)Zclearjava/lang/System arraycopy*(Ljava/lang/Object;ILjava/lang/Object;II)VgetChars(II[CI)V1cde fghijklmnopqrstuvwxyxzgh_{xh|}~A5*·*»Y·µ*»Y·µ*+µ*» Y+¹ · µ *· ±}€ *»Y··±€9%»Y·L*+¶µ§M»Y,¶·¿±‚ƒ *´ +¹:Æ À°§: *´+*´¶š»Y»Y· !¶"+¶"¶#·$¿*´¶%:¹&:>¹&:Æ|¶'=Yd=žn¶(«Ii*m1s9xA€>§ÿÄ€>§ÿ¼€>§ÿ´ €>§ÿ¬»Y»Y· )¶"¶"¶#·$¿*´¹*:*´ +¹+° „!…†A5*+·,W*´,*+·,¶->™**´¶%µ.*,µ/*µ0*,¾µ1¬„!…‡ *+,¶2¶3¬„!…ˆC7*´,*+·,¶4>™&**´¶%µ.*,¶5µ/*,¶6µ0*,¶7µ1¬„!‰Š*´.°!‹Œ'*´ ,¹:Æ:À8:§:§++*´´9´:-´;¸<6 **´¶%µ. 
¬,¶2:¾¡4s 4¸=š 4- »Y»Y· !¶",¶"¶#·$¿466 Y6 6 6 6¾¢I4\ š§6§*4 š 6 §™š§6„§ÿµ Ÿ  ¾d »Y»Y· !¶",¶"¶#·$¿66»Y¾ d·>: `6¾¢y4\ Bš§6™M`¾¢C`4 7,¾d¶?`Ÿ%6§*4 ™ 6 §664¶@W„§ÿ…  »Y»Y· !¶",¶"¶#·$¿66'Ÿ 6 §6  `6¾¢¶4ª‰gx}‰V‰‰‰_‰ƒ‰‰‰i‰‰‰‰s€6§L€6§B€6§8 €6§.6§(6 §"»Y»Y· A¶"4¶@¶#·$¿„§ÿH*´»BY   d·C¹*:»DY¶# ·E:»8Y·F:*´ ,¹++*´-¸<6 **´¶%µ. ¬„!‹!»Y· N*-+,¶GW-¶#°„!Žú: *,·,: »HY-·I: 6„ÿ™}*´  ¶4™o*´¶%: *´- ¹J¶K¶LW ¹MY6¤56¢+ ¹&:ƶ'ž *´¶LW„§ÿÔ ¹N6§ÿ*´--¶'¶K¶LW*´¶Od6  ›)*´ ¶PÀB:  ¶'š*´ ¶QW„ ÿ§ÿØ+*´¹RW*´¶S* µ.±„!Ž *+,-¶T±„!Ž‘ *+U,¶V±„!Ž’#»WY·X:*+,¶T°„“!Ž”*+,¶Y°„“!Ž•*U+¶Z°„“!–— *´.¹[¬!˜— *´.¹M¬!™š *´.¹&°!›œ *´.¹\¬!œ *´.¹]¬!žœ *´.¹J¬!Ÿœ *´.¹N¬! ¡*´.ǰ*´.¶^°!¢¡}q*´/Ç_°*´.¹J<_°*´/Á`™"*´/À`M,¾¤,¾<»BY,*´0·C°*´/ÁB™"*´/ÀBM,¶'¤,¶'<,*´0¶K°_°!£¡}q*´/Ç_°*´.¹N<œ_°*´/Á`™$*´/À`M,¾¡_°»BY,*´1d·C°*´/ÁB™ *´/ÀBM,¶'¡_°,*´1¶K°_°!¤¥„M*´/ǰ*´.¹J<°*´/Á`™,*´/À`N-¾¡-¾<*´0d¼M-*´0,,¾¸a§6*´/ÁB™,*´/ÀBN-¶'¡-¶'<*´0d¼M-*´0,¶b,°!¦¥M*´/ǰ*´.¹N<œ°*´/Á`™,*´/À`:¾¡°*´1d>¼M,¸a§3*´/ÁB™)*´/ÀBN*´1¡°*´1d¼M-*´1,¶b,°jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/perl/MalformedPerl5PatternException.class0000644000175000017500000000041607773723336031467 0ustar arnaudarnaudÊþº¾.     ()VCode(Ljava/lang/String;)V  7org/apache/oro/text/perl/MalformedPerl5PatternException2org/apache/oro/text/MalformedCachePatternException1*·±*+·±jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/perl/ParsedSubstitutionEntry.class0000644000175000017500000000071007773723336030326 0ustar arnaudarnaudÊþº¾.    _numSubstitutionsI_pattern#Lorg/apache/oro/text/regex/Pattern; _substitution-Lorg/apache/oro/text/regex/Perl5Substitution;T(Lorg/apache/oro/text/regex/Pattern;Lorg/apache/oro/text/regex/Perl5Substitution;I)VCode   0org/apache/oro/text/perl/ParsedSubstitutionEntryjava/lang/Object()V0   *·*µ*,µ*+µ±jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/GlobCompiler.class0000644000175000017500000000407107773723336025051 0ustar arnaudarnaudÊþº¾.S4 5 67 89: ; < = >? 
> @€ A B C DEFGH DEFAULT_MASKI ConstantValueCASE_INSENSITIVE_MASKSTAR_CANNOT_MATCH_NULL_MASK!QUESTION_MATCHES_ZERO_OR_ONE_MASKREAD_ONLY_MASK__perl5Compiler)Lorg/apache/oro/text/regex/Perl5Compiler;__isPerl5MetaCharacter(C)ZCode__isGlobMetaCharacter globToPerl5([CI)Ljava/lang/String;()Vcompile(([CI)Lorg/apache/oro/text/regex/Pattern; ExceptionsI'([C)Lorg/apache/oro/text/regex/Pattern;7(Ljava/lang/String;)Lorg/apache/oro/text/regex/Pattern;8(Ljava/lang/String;I)Lorg/apache/oro/text/regex/Pattern;java/lang/StringBuffer +J KL.+ KM.*.? (& %& NO +,'org/apache/oro/text/regex/Perl5Compiler #$ )* -3 -.P QR org/apache/oro/text/GlobCompilerjava/lang/Object)org/apache/oro/text/regex/PatternCompiler3org/apache/oro/text/regex/MalformedPatternException(I)Vappend(C)Ljava/lang/StringBuffer;,(Ljava/lang/String;)Ljava/lang/StringBuffer;toString()Ljava/lang/String;java/lang/String toCharArray()[C1 !"#$ %&'fZ*ŸQ?ŸK+ŸE[Ÿ?]Ÿ9(Ÿ3)Ÿ-|Ÿ'^Ÿ!$Ÿ.Ÿ{Ÿ}Ÿ \ §¬ (&'**Ÿ?Ÿ[Ÿ ] §¬ )*'¥™>»Y*¾h·:=~™§6~™§>6*¾¢b*4«5*4?][‡\ò]ã™*¶W§™¶W§¶W§û™?¶W§ì™¶W§Ü.¶W§Ñ=*4¶W`*¾¢¼*`4«>!"]0^"^¶W„§†]¶W„§x§u=*4¶W§f\¶W*¾d \¶W§J*`4¸™*„4¶W§.\¶W§#š*4¸ ™ \¶W*4¶W„§þ¶ °+,'*· *» Y· µ±-.'2&>~™€>~™€>*´+¸¶°/0-1'*+¶°/0-2' *+¶¶°/0-3' *+¶¶°/0jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/PatternCacheLRU.class0000644000175000017500000000111007773723336025406 0ustar arnaudarnaudÊþº¾.     /(ILorg/apache/oro/text/regex/PatternCompiler;)VCode.(Lorg/apache/oro/text/regex/PatternCompiler;)V(I)V()Vorg/apache/oro/util/CacheLRU   'org/apache/oro/text/regex/Perl5Compiler #org/apache/oro/text/PatternCacheLRU'org/apache/oro/text/GenericPatternCacheI(Lorg/apache/oro/util/Cache;Lorg/apache/oro/text/regex/PatternCompiler;)V1   *»Y·,·±  *+·±   *»Y··±  *·±jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/MatchAction.class0000644000175000017500000000022507773723336024662 0ustar arnaudarnaudÊþº¾. 
processMatch((Lorg/apache/oro/text/MatchActionInfo;)Vorg/apache/oro/text/MatchActionjava/lang/Objectjakarta-oro-2.0.8/docs/classes/org/apache/oro/text/PatternCache.class0000644000175000017500000000072107773723336025032 0ustar arnaudarnaudÊþº¾.  addPattern7(Ljava/lang/String;)Lorg/apache/oro/text/regex/Pattern; Exceptions8(Ljava/lang/String;I)Lorg/apache/oro/text/regex/Pattern; getPatternsize()Icapacity org/apache/oro/text/PatternCachejava/lang/Object3org/apache/oro/text/regex/MalformedPatternException2org/apache/oro/text/MalformedCachePatternException    jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/PatternCacheFIFO2.class0000644000175000017500000000111407773723336025555 0ustar arnaudarnaudÊþº¾.     /(ILorg/apache/oro/text/regex/PatternCompiler;)VCode.(Lorg/apache/oro/text/regex/PatternCompiler;)V(I)V()Vorg/apache/oro/util/CacheFIFO2   'org/apache/oro/text/regex/Perl5Compiler %org/apache/oro/text/PatternCacheFIFO2'org/apache/oro/text/GenericPatternCacheI(Lorg/apache/oro/util/Cache;Lorg/apache/oro/text/regex/PatternCompiler;)V1   *»Y·,·±  *+·±   *»Y··±  *·±jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/PatternCacheFIFO.class0000644000175000017500000000111207773723336025471 0ustar arnaudarnaudÊþº¾.     /(ILorg/apache/oro/text/regex/PatternCompiler;)VCode.(Lorg/apache/oro/text/regex/PatternCompiler;)V(I)V()Vorg/apache/oro/util/CacheFIFO   'org/apache/oro/text/regex/Perl5Compiler $org/apache/oro/text/PatternCacheFIFO'org/apache/oro/text/GenericPatternCacheI(Lorg/apache/oro/util/Cache;Lorg/apache/oro/text/regex/PatternCompiler;)V1   *»Y·,·±  *+·±   *»Y··±  *·±jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/PatternCacheRandom.class0000644000175000017500000000111607773723336026172 0ustar arnaudarnaudÊþº¾.     
/(ILorg/apache/oro/text/regex/PatternCompiler;)VCode.(Lorg/apache/oro/text/regex/PatternCompiler;)V(I)V()Vorg/apache/oro/util/CacheRandom   'org/apache/oro/text/regex/Perl5Compiler &org/apache/oro/text/PatternCacheRandom'org/apache/oro/text/GenericPatternCacheI(Lorg/apache/oro/util/Cache;Lorg/apache/oro/text/regex/PatternCompiler;)V1   *»Y·,·±  *+·±   *»Y··±  *·±jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/MatchActionProcessor.class0000644000175000017500000000752607773723336026575 0ustar arnaudarnaudÊþº¾.Ë =Z <[\ Z <] <^_ Z <` <a <bc Zd Z <e fg h <i <j <kl mn o <p qr st uv Zw "Z x y z { | } ~  € ‚ ƒ „… †‡ †ˆ ‰ Š ‹ Œ Ž 8‘ ’ “”•__fieldSeparator#Lorg/apache/oro/text/regex/Pattern; __compiler+Lorg/apache/oro/text/regex/PatternCompiler; __matcher*Lorg/apache/oro/text/regex/PatternMatcher; __patternsLjava/util/Vector; __actions__defaultAction!Lorg/apache/oro/text/MatchAction;X(Lorg/apache/oro/text/regex/PatternCompiler;Lorg/apache/oro/text/regex/PatternMatcher;)VCode()V addAction7(Ljava/lang/String;ILorg/apache/oro/text/MatchAction;)V Exceptions–(Ljava/lang/String;I)V(Ljava/lang/String;)V6(Ljava/lang/String;Lorg/apache/oro/text/MatchAction;)VsetFieldSeparatorprocessMatches@(Ljava/io/InputStream;Ljava/io/OutputStream;Ljava/lang/String;)V—.(Ljava/io/InputStream;Ljava/io/OutputStream;)V#(Ljava/io/Reader;Ljava/io/Writer;)V IL >?java/util/Vector DE FE&org/apache/oro/text/DefaultMatchAction GH @A BC'org/apache/oro/text/regex/Perl5Compiler&org/apache/oro/text/regex/Perl5Matcher IJ˜ ™š ›œ MN MQ TQjava/io/InputStreamReader Ijava/io/OutputStreamWriter Iž UY IŸjava/io/LineNumberReader I java/io/PrintWriter I¡#org/apache/oro/text/MatchActionInfojava/util/ArrayList ¢C £? ¤¥ ¦§ ¨© ª« ¬­ ®¯ °±² ³´ µ¶ ·¸!org/apache/oro/text/regex/Pattern¹ º» ¼½ ¾¿ À« Á? 
ÃLÄ ÅÆorg/apache/oro/text/MatchAction ÇÈ ÉL ÊL(org/apache/oro/text/MatchActionProcessorjava/lang/Object3org/apache/oro/text/regex/MalformedPatternExceptionjava/io/IOException)org/apache/oro/text/regex/PatternCompilercompile8(Ljava/lang/String;I)Lorg/apache/oro/text/regex/Pattern; addElement(Ljava/lang/Object;)V*(Ljava/io/InputStream;Ljava/lang/String;)V(Ljava/io/OutputStream;)V(Ljava/io/InputStream;)V(Ljava/io/Reader;)V(Ljava/io/Writer;)VmatcherfieldSeparatorinputLjava/io/BufferedReader;outputLjava/io/PrintWriter;fieldsLjava/util/List;size()I lineNumberIreadLine()Ljava/lang/String;lineLjava/lang/String;java/lang/String toCharArray()[CcharLine[C elementAt(I)Ljava/lang/Object;(org/apache/oro/text/regex/PatternMatchercontains(([CLorg/apache/oro/text/regex/Pattern;)ZgetMatch)()Lorg/apache/oro/text/regex/MatchResult;match'Lorg/apache/oro/text/regex/MatchResult; getLineNumberpatternjava/util/Listclearorg/apache/oro/text/regex/Utilsplitx(Ljava/util/Collection;Lorg/apache/oro/text/regex/PatternMatcher;Lorg/apache/oro/text/regex/Pattern;Ljava/lang/String;)V processMatch((Lorg/apache/oro/text/MatchActionInfo;)Vflushclose1<=>?@ABCDEFEGH IJKA5*·*µ*»Y·µ*»Y·µ*»Y·µ *+µ *,µ ±ILK*» Y· »Y··±MNK6*+Æ*´*´ +¹¶§ *´¶*´-¶±OPMQK *+*´ ¶±OPMRK*+¶±OPMSK*+,¶±OPTQK&+Ç *µ±**´ +¹µ±OPTRK*+¶±OPUVK"*»Y+-·»Y,·¶±OWUXK!*»Y+·»Y,·¶±OWUYKŒ €»Y+·:»Y,·:» Y·!:»"Y·#: *´ µ$*´µ%µ&µ'µ(*´¶)>µ*¶+Zµ,Æ´,¶-µ.6¢ÿß*´¶/:Æ*´¶/À0: *´ ´. ¹1™Ã*´ ¹2µ3¶4µ* µ5*´Æ& ¹6 *´ *´´,¸7 µ(§ µ(*´¶/À8:  ¹9§Zµ3¶4µ**´Æ& ¹6 *´ *´´,¸7 µ(§ µ(*´¶/À8:  ¹9„§ÿ¶:¶;±OWjakarta-oro-2.0.8/docs/classes/org/apache/oro/text/GenericPatternCache.class0000644000175000017500000000322507773723336026331 0ustar arnaudarnaudÊþº¾.V - . / 012 3 45 06 789: -; <= > ? 
@ A 0B 0CDEF _compiler+Lorg/apache/oro/text/regex/PatternCompiler;_cacheLorg/apache/oro/util/Cache;DEFAULT_CAPACITYI ConstantValueI(Lorg/apache/oro/util/Cache;Lorg/apache/oro/text/regex/PatternCompiler;)VCode addPattern8(Ljava/lang/String;I)Lorg/apache/oro/text/regex/Pattern; Exceptions7(Ljava/lang/String;)Lorg/apache/oro/text/regex/Pattern; getPatternsize()Icapacity "G  H IJ!org/apache/oro/text/regex/Pattern K+L M& NO %&3org/apache/oro/text/regex/MalformedPatternException2org/apache/oro/text/MalformedCachePatternExceptionjava/lang/StringBufferInvalid expression: PQ RS TS "U )& *+ ,+'org/apache/oro/text/GenericPatternCachejava/lang/Object org/apache/oro/text/PatternCache()Vorg/apache/oro/util/Cache getElement&(Ljava/lang/Object;)Ljava/lang/Object; getOptions)org/apache/oro/text/regex/PatternCompilercompile addElement'(Ljava/lang/Object;Ljava/lang/Object;)Vappend,(Ljava/lang/String;)Ljava/lang/StringBuffer; getMessage()Ljava/lang/String;toString(Ljava/lang/String;)V! !"#$*·*+µ*,µ±1%&$K?*´+¹N-Æ-À:¹ °*´+¹:*´+¹°' 1%($*+¶ °' 1)&$L8N*+¶ N§-:» Y» Y· ¶+¶¶¶¶¶·¿-° ' 1)($*+¶°' *+$ *´¹¬,+$ *´¹¬jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/DefaultMatchAction.class0000644000175000017500000000073707773723336026177 0ustar arnaudarnaudÊþº¾.    ()VCode processMatch((Lorg/apache/oro/text/MatchActionInfo;)V     &org/apache/oro/text/DefaultMatchActionjava/lang/Objectorg/apache/oro/text/MatchAction#org/apache/oro/text/MatchActionInfooutputLjava/io/PrintWriter;lineLjava/lang/String;java/io/PrintWriterprintln(Ljava/lang/String;)V0  *·±   +´+´¶±jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/MalformedCachePatternException.class0000644000175000017500000000036107773723336030540 0ustar arnaudarnaudÊþº¾.     ()VCode(Ljava/lang/String;)V  2org/apache/oro/text/MalformedCachePatternExceptionjava/lang/RuntimeException!*·±*+·±jakarta-oro-2.0.8/docs/classes/org/apache/oro/text/MatchActionInfo.class0000644000175000017500000000105507773723336025500 0ustar arnaudarnaudÊþº¾.  
lineNumberIlineLjava/lang/String;charLine[CfieldSeparator#Lorg/apache/oro/text/regex/Pattern;fieldsLjava/util/List;matcher*Lorg/apache/oro/text/regex/PatternMatcher;patternmatch'Lorg/apache/oro/text/regex/MatchResult;outputLjava/io/PrintWriter;inputLjava/io/BufferedReader;()VCode #org/apache/oro/text/MatchActionInfojava/lang/Object1     *·±jakarta-oro-2.0.8/docs/classes/org/apache/oro/util/0000755000175000017500000000000010423237774021423 5ustar arnaudarnaudjakarta-oro-2.0.8/docs/classes/org/apache/oro/util/GenericCache.class0000644000175000017500000000206207773723336024762 0ustar arnaudarnaudÊþº¾.@ ( )* + ,- . + / 0 1 234567DEFAULT_CAPACITYI ConstantValue _numEntries_cache([Lorg/apache/oro/util/GenericCacheEntry;_tableLjava/util/HashMap;(I)VCode addElement'(Ljava/lang/Object;Ljava/lang/Object;)V getElement&(Ljava/lang/Object;)Ljava/lang/Object;keys()Ljava/util/Iterator;size()IcapacityisFull()Z 8 java/util/HashMap  %org/apache/oro/util/GenericCacheEntry  9 :; <=> ?" org/apache/oro/util/GenericCachejava/lang/Objectorg/apache/oro/util/Cachejava/io/Serializable()Vget_valueLjava/lang/Object;keySet()Ljava/util/Set; java/util/Setiterator! B6*·*µ*»Y·µ*½µ„ÿ›*´»Y·S§ÿë±! #*´+¶ M,Æ ,À´ °°!" *´¶ ¹ °#$*´¬%$*´¾¬&'*´*´¾¡§¬jakarta-oro-2.0.8/docs/classes/org/apache/oro/util/CacheRandom.class0000644000175000017500000000201507773723336024624 0ustar arnaudarnaudÊþº¾.G    !  " #$% & ' ( ) * + #, #-./__randomLjava/util/Random;(I)VCode()V addElement'(Ljava/lang/Object;Ljava/lang/Object;)V java/util/Random0 12 3  456 78%org/apache/oro/util/GenericCacheEntry 9: ;: <= >? 
@A BC D8 EForg/apache/oro/util/CacheRandom org/apache/oro/util/GenericCachejava/lang/SystemcurrentTimeMillis()J(J)V_tableLjava/util/HashMap;java/util/HashMapget&(Ljava/lang/Object;)Ljava/lang/Object;_valueLjava/lang/Object;_keyisFull()Z _numEntriesI_cache([Lorg/apache/oro/util/GenericCacheEntry; nextFloat()Fremoveput8(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;1 *·*»Y¸·µ±*·±1*´+¶:ÆÀ :,µ +µ ±*¶ š*´ >*Y´ `µ §$*´¾†*´¶j‹>*´*´2´ ¶W*´2,µ *´2+µ *´+*´2¶W±jakarta-oro-2.0.8/docs/classes/org/apache/oro/util/Cache.class0000644000175000017500000000036207773723336023466 0ustar arnaudarnaudÊþº¾.    addElement'(Ljava/lang/Object;Ljava/lang/Object;)V getElement&(Ljava/lang/Object;)Ljava/lang/Object;size()Icapacityorg/apache/oro/util/Cachejava/lang/Object jakarta-oro-2.0.8/docs/classes/org/apache/oro/util/GenericCacheEntry.class0000644000175000017500000000046207773723336026006 0ustar arnaudarnaudÊþº¾.    _indexI_valueLjava/lang/Object;_key(I)VCode   %org/apache/oro/util/GenericCacheEntryjava/lang/Objectjava/io/Serializable()V0   *·*µ*µ*µ±jakarta-oro-2.0.8/docs/classes/org/apache/oro/util/CacheFIFO.class0000644000175000017500000000157507773723336024141 0ustar arnaudarnaudÊþº¾.7         ! " # $%&__curentI(I)VCode()V addElement'(Ljava/lang/Object;Ljava/lang/Object;)V   '() *+%org/apache/oro/util/GenericCacheEntry ,- .- /0 1 23 4+ 56org/apache/oro/util/CacheFIFO org/apache/oro/util/GenericCache_tableLjava/util/HashMap;java/util/HashMapget&(Ljava/lang/Object;)Ljava/lang/Object;_valueLjava/lang/Object;_keyisFull()Z _numEntries_cache([Lorg/apache/oro/util/GenericCacheEntry;removeput8(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;1 *·*µ±*·±1šŽ*´+¶:ÆÀ:,µ+µ±*¶ š*´ >*Y´ `µ §1*´>*Y´`Zµ*´ ¾¡*µ*´*´ 2´¶ W*´ 2,µ*´ 2+µ*´+*´ 2¶ W±jakarta-oro-2.0.8/docs/classes/org/apache/oro/util/CacheFIFO2.class0000644000175000017500000000206707773723336024220 0ustar arnaudarnaudÊþº¾.?    !  " #$% & ' ( ) * #+ #,-. 
__currentI __tryAgain[Z(I)VCode()V getElement&(Ljava/lang/Object;)Ljava/lang/Object; addElement'(Ljava/lang/Object;Ljava/lang/Object;)V   /0  123 4%org/apache/oro/util/GenericCacheEntry 5 67 87 9: ; < =>org/apache/oro/util/CacheFIFO2 org/apache/oro/util/GenericCache_cache([Lorg/apache/oro/util/GenericCacheEntry;_tableLjava/util/HashMap;java/util/HashMapget_index_valueLjava/lang/Object;_keyisFull()Z _numEntriesremoveput8(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;1"*·*µ**´¾¼µ±*·±!/#*´+¶M,Æ,ÀN*´-´ T-´ °°1ƺ*´+¶:Æ"À:,µ +µ *´´ T±*¶ š*´ >*Y´ `µ §R*´>*´3™*´T„*´¾¡ÿç>§ÿâ*`µ*´*´¾¡*µ*´*´2´ ¶W*´2,µ *´2+µ *´+*´2¶W±jakarta-oro-2.0.8/docs/classes/org/apache/oro/util/CacheLRU.class0000644000175000017500000000241007773723336024045 0ustar arnaudarnaudÊþº¾.H $ % & ' ( ) $ * +,- . / 0 1 2 3 +4 +567__headI__tail__next[I__prev(I)VCode()V __moveToFront getElement&(Ljava/lang/Object;)Ljava/lang/Object; addElement'(Ljava/lang/Object;Ljava/lang/Object;)V    89   :;< =!%org/apache/oro/util/GenericCacheEntry >  ?@ A@ BC D E! FGorg/apache/oro/util/CacheLRU org/apache/oro/util/GenericCache_cache([Lorg/apache/oro/util/GenericCacheEntry;_tableLjava/util/HashMap;java/util/HashMapget_index_valueLjava/lang/Object;_keyisFull()Z _numEntriesremoveput8(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;1QE*·*µ*µ**´¾¼ µ**´¾¼ µ=*´¾¢*´*´[OO„§ÿæ±*·±]Q*´ŸK*´.=*´.>*´O› *´O§*µ*´O*´*´O*´*´O*µ±! 
!-!*´+¶ M,Æ,À N*-´ · -´ °°1"#¸¬*´+¶ N-Æ-À :,µ +µ*´ · ±*¶š6*´ž"*´*´*´O*´*´O**´· *Y´`µ§*´*´*´2´¶W**´· *´*´2,µ *´*´2+µ*´+*´*´2¶W±jakarta-oro-2.0.8/docs/classes/org/apache/oro/io/0000755000175000017500000000000010423237774021055 5ustar arnaudarnaudjakarta-oro-2.0.8/docs/classes/org/apache/oro/io/GlobFilenameFilter.class0000644000175000017500000000176107773723336025613 0ustar arnaudarnaudÊþº¾.(        !"# __MATCHER*Lorg/apache/oro/text/regex/PatternMatcher;__CACHE"Lorg/apache/oro/text/PatternCache;(Ljava/lang/String;I)VCode(Ljava/lang/String;)V()V   $ % &&org/apache/oro/text/regex/Perl5Matcher #org/apache/oro/text/PatternCacheLRU org/apache/oro/text/GlobCompiler '$org/apache/oro/io/GlobFilenameFilter%org/apache/oro/io/RegexFilenameFilterb(Lorg/apache/oro/text/PatternCache;Lorg/apache/oro/text/regex/PatternMatcher;Ljava/lang/String;I)Va(Lorg/apache/oro/text/PatternCache;Lorg/apache/oro/text/regex/PatternMatcher;Ljava/lang/String;)VO(Lorg/apache/oro/text/PatternCache;Lorg/apache/oro/text/regex/PatternMatcher;)V.(Lorg/apache/oro/text/regex/PatternCompiler;)V!  *²²+·± *²²+·± *²²·±(»Y·³»Y» Y· · ³±jakarta-oro-2.0.8/docs/classes/org/apache/oro/io/Perl5FilenameFilter.class0000644000175000017500000000161207773723336025712 0ustar arnaudarnaudÊþº¾.#        __MATCHER*Lorg/apache/oro/text/regex/PatternMatcher;__CACHE"Lorg/apache/oro/text/PatternCache;(Ljava/lang/String;I)VCode(Ljava/lang/String;)V()V   ! "&org/apache/oro/text/regex/Perl5Matcher #org/apache/oro/text/PatternCacheLRU%org/apache/oro/io/Perl5FilenameFilter%org/apache/oro/io/RegexFilenameFilterb(Lorg/apache/oro/text/PatternCache;Lorg/apache/oro/text/regex/PatternMatcher;Ljava/lang/String;I)Va(Lorg/apache/oro/text/PatternCache;Lorg/apache/oro/text/regex/PatternMatcher;Ljava/lang/String;)VO(Lorg/apache/oro/text/PatternCache;Lorg/apache/oro/text/regex/PatternMatcher;)V!   
*²²+·± *²²+·± *²²·±!»Y·³»Y· ³±jakarta-oro-2.0.8/docs/classes/org/apache/oro/io/RegexFilenameFilter.class0000644000175000017500000000317507773723336026003 0ustar arnaudarnaudÊþº¾.C $ % & ' () * +, - +. /0 123456_cache"Lorg/apache/oro/text/PatternCache;_matcher*Lorg/apache/oro/text/regex/PatternMatcher;_pattern#Lorg/apache/oro/text/regex/Pattern;a(Lorg/apache/oro/text/PatternCache;Lorg/apache/oro/text/regex/PatternMatcher;Ljava/lang/String;)VCodeb(Lorg/apache/oro/text/PatternCache;Lorg/apache/oro/text/regex/PatternMatcher;Ljava/lang/String;I)VO(Lorg/apache/oro/text/PatternCache;Lorg/apache/oro/text/regex/PatternMatcher;)VsetFilterExpression(Ljava/lang/String;)V Exceptions7(Ljava/lang/String;I)Vaccept#(Ljava/io/File;Ljava/lang/String;)Z(Ljava/io/File;)Z 8      9 :;  :<= >?@ AB%org/apache/oro/io/RegexFilenameFilterjava/lang/Objectjava/io/FilenameFilterjava/io/FileFilter2org/apache/oro/text/MalformedCachePatternException()V org/apache/oro/text/PatternCache getPattern7(Ljava/lang/String;)Lorg/apache/oro/text/regex/Pattern;8(Ljava/lang/String;I)Lorg/apache/oro/text/regex/Pattern;(org/apache/oro/text/regex/PatternMatchermatches8(Ljava/lang/String;Lorg/apache/oro/text/regex/Pattern;)Z java/io/FilegetName()Ljava/lang/String;!  
*·*+µ*,µ*-¶±"*·*+µ*,µ*-¶± *+,·±**´+¹µ ± **´+¹ µ ±!";*´YNÂ*´,*´ ¹ -ì:-ÿ!#< *´YMÂ*´+¶ *´ ¹ ,ìN,Ã-¿jakarta-oro-2.0.8/docs/classes/org/apache/oro/io/AwkFilenameFilter.class0000644000175000017500000000175707773723336025457 0ustar arnaudarnaudÊþº¾.(        !"# __MATCHER*Lorg/apache/oro/text/regex/PatternMatcher;__CACHE"Lorg/apache/oro/text/PatternCache;(Ljava/lang/String;I)VCode(Ljava/lang/String;)V()V   $ % &"org/apache/oro/text/awk/AwkMatcher #org/apache/oro/text/PatternCacheLRU#org/apache/oro/text/awk/AwkCompiler '#org/apache/oro/io/AwkFilenameFilter%org/apache/oro/io/RegexFilenameFilterb(Lorg/apache/oro/text/PatternCache;Lorg/apache/oro/text/regex/PatternMatcher;Ljava/lang/String;I)Va(Lorg/apache/oro/text/PatternCache;Lorg/apache/oro/text/regex/PatternMatcher;Ljava/lang/String;)VO(Lorg/apache/oro/text/PatternCache;Lorg/apache/oro/text/regex/PatternMatcher;)V.(Lorg/apache/oro/text/regex/PatternCompiler;)V!  *²²+·± *²²+·± *²²·±(»Y·³»Y» Y· · ³±jakarta-oro-2.0.8/docs/classes/examples/0000755000175000017500000000000010423237774017455 5ustar arnaudarnaudjakarta-oro-2.0.8/docs/classes/examples/MatcherDemoApplet.class0000644000175000017500000001257007773723336024057 0ustar arnaudarnaudÊþº¾.\ ©ª« ¬ €­® ¯ €°± © €² €³ ´µ €¶· €¸ €¹º ©» © €¼½ ©¾ © €¿À ©Á  €Ã €ÄÅÆ €ÇÈÉÊË (Ì €ÍÎ €ÏÐ .Ñ €Ò €ÓÔÕ 2Ö €×Ø €Ù .ÚÛ €Ü ÝÞ Âßà €á â ã Ýäå C© €æç F© Fè Fé Cê €ë Fì Fí?Ð Fî Fïð .ñ ò ó €ôõ .ö ÷øù [©ú [û Zü [ý .òþ €ÿ  gÌ   [         € ñ € € €CONTAINS_SEARCHIMATCHES_SEARCHCASE_SENSITIVECASE_INSENSITIVEPERL5_EXPRESSIONAWK_EXPRESSIONGLOB_EXPRESSIONexpressionType[Ljava/lang/String; CASE_MASK[[IexpressionFieldLjava/awt/TextField; resultLabelLjava/awt/Label; inputLabel resultAreaLjava/awt/TextArea; inputAreaexpressionChoiceLjava/awt/Choice; searchChoice caseChoice searchButtonLjava/awt/Button; resetButtoncompiler,[Lorg/apache/oro/text/regex/PatternCompiler;matcher+[Lorg/apache/oro/text/regex/PatternMatcher;()VCodeinitsearchaction%(Ljava/awt/Event;Ljava/lang/Object;)Z ¡¢ java/awt/Font Helvetica ¡  
jakarta-oro-2.0.8/docs/index.html
Jakarta ORO - Jakarta ORO
Jakarta-ORO

About

Software

Community

Documentation

Related Projects

Jakarta ORO

The Jakarta-ORO Java classes are a set of text-processing Java classes that provide Perl5 compatible regular expressions, AWK-like regular expressions, glob expressions, and utility classes for performing substitutions, splits, filtering filenames, etc. This library is the successor to the OROMatcher, AwkTools, PerlTools, and TextTools libraries originally from ORO, Inc.

You can download the latest released version from the distribution directory.

The Javadoc is available online. It is also included in the distribution download.

If you would like to get involved with this project in one way or another (Mailing lists, CVS, Contributions), please see the Getting Involved section of the Jakarta Website.


Contributors

  • Daniel F. Savarese wrote the original code.
  • Jon S. Stevens is responsible for helping prepare it for release.
  • Takashi Okamoto contributed a unicode character class fix and an initial posix character class implementation.
  • Mark Murphy contributed performance improvements to Perl5Substitution as well as adding support for \UuLlE and escaping of $.
  • Michael Davey fixed some documentation and added a missing int substitute(...) method to Perl5Util.
  • Harald Kuhn updated MatchActionProcessor.processMatches() to accommodate character encodings.



Copyright © 1999-2003, Apache Software Foundation
jakarta-oro-2.0.8/docs/bugs.html
Jakarta ORO - Bug Reporting

Bug Reporting Guidelines

Before you decide to report a bug, make sure you have adequately verified that it is not a problem in your code. For example, if you are using the org.apache.oro.text.regex or org.apache.oro.text.perl packages, double check your results against Perl. Right now, compatibility is only guaranteed with Perl 5.003_07. Eventually we will guarantee Perl 5.6 compatibility. It may be that what you think is a problem is just a difference in behavior between Perl 5.003_07 and 5.6. In that case pester the oro-dev mailing list to get on the case and move faster toward 5.6 compatibility.

Before reporting any bug, inquire on either the oro-user or oro-dev mailing lists. You will get a faster response. Always include sample code and input that reproduces your problem and indicate what version of jakarta-oro you are using. If you are not using the latest stable release, download the latest release and see if the problem is still there before submitting a report. If we cannot reproduce a problem, we can't help.

If, after you've followed the guidelines above, it is either confirmed or at least not refuted that your problem is caused by a bug in jakarta-oro, you may submit a bug report through Bugzilla.



Copyright © 1999-2003, Apache Software Foundation
jakarta-oro-2.0.8/docs/users.html
Jakarta ORO - Projects Using Jakarta-ORO

Projects Using Jakarta-ORO

These are just a few of the software projects using Jakarta-ORO. If you would like to have your project added to this list, please contact the oro-dev mailing list.

ArsDigita Community System
An open source Java platform for Web application development.
Compaq Web Language
A scripting language for automating tasks on the World Wide Web.
Cruise Control
A tool for setting up a continuous build process.
E
A secure distributed object platform and P2P scripting language.
FESI
Free ECMAScript Interpreter.
FindIt
A Java GUI application that helps you construct and validate regular expressions.
FormProc
A Java library designed to make web form handling easy.
Jakarta Taglibs
An open source repository for JSP custom taglibs.
Jakarta Velocity
A Java-based template engine providing an alternative to JSP.
JPublish
A Web publishing framework that merges Velocity with a content repository and application control framework.
Joist
Java Open Infrastructure of Servlets and Templates.
jwma
A WebMail implementation in Java.
Jython
An implementation of Python in Java.
Netscape Directory SDK for Java
A Java LDAP library.
S-Link-S
Scholarly Link Specification Framework.
SnipSnap
SnipSnap is a free and easy to install Weblog and Wiki Software written in Java.
SASH
Standalone Analysis System Hierarchy, an object-oriented data reduction and analysis system.
Tambora
An open source Web-based Business to Business (B2B) application designed to facilitate information exchange between business partners in the printing and publishing industries.



Copyright © 1999-2003, Apache Software Foundation
jakarta-oro-2.0.8/docs/status.html
Jakarta ORO - Project Status

Project Status

Version 2.0.8 is the latest stable release. It currently supports Perl 5.003 regular expressions in the org.apache.oro.text.regex and org.apache.oro.text.perl packages. The main development goal is to upgrade these packages to support Perl 5.8 regular expressions. The development plan will lay out the path to achieving this objective and this status page will report the state of our progress. Our current objective is to settle on a development plan sometime in the near future.

Immediate short-term development goals are summarized in this extract from the oro-dev mailing list:

 3. Prioritize this crop of changes.  My bias is:
      1. Conditional compilation supporting a J2ME target
      2. Optional table-based character type lookup
      3. Theoretically inlinable input iteration abstraction,
         using CharSequence for J2SE 1.4
      4. Proper case folding.
      5. Possibly pool Perl5Repetition objects or something else
         to reduce impact of memory allocation.
    This order is based on dependencies that will minimize work as well
    as complexity.  You need 1 before you can do 3 if you're going to
    support multiple JVMs.  You want to do 3 before you do 4 if 4 might
    affect code that iterates through input; also 3 is easier to implement
    and less likely to introduce a bug than 4.  5 we don't know if we need
    to do yet.

Number 4 may not be a quick change to make, but the rest aren't large
time sinks.  If Mark could get us started with just one functionality
unit test and Bob could get us started with some performance tests,
I think there will be sufficient grounds to nominate you both as
committers (Jon and I just have to dig one of the inactive initial
committers to provide a third vote), which will make it easier for
each of you to support your respective company's use of jakarta-oro.
This may just be the thing to kick some life back into development
and keep my time constraints from being such a bottleneck.  As I
recall, Bob, you also were hoping for group-local modifiers.  That's
something we can tackle if we successfully make it through the above.

As a side note, for bug fixes I'm comfortable with just making the
fixes as necessary.  But for changes that impact the overall API
or implementation, I think the httpd group's original review code
first before commit, or at least discuss and agree on the implementation
beforehand, is the best way to go (and very manageable for this
project since it's not a lot of code).  So, even though I've implicitly
assigned myself the implementation of some of these changes, I'm not
going to have at them all without discussion.  For example, I'll propose
a way to reimplement the input traversal to support the use of
CharSequence and the list can criticize it and counter propose.



Copyright © 1999-2003, Apache Software Foundation
jakarta-oro-2.0.8/docs/demo.html
Jakarta ORO - Demonstration Applet

Demonstration Applet

This demonstration requires the Java Plugin.

If you can't see the demo applet, please try enabling Java in your browser or downloading the Java Plugin (http://java.sun.com/products/plugin/index.html).

Jakarta ORO's text processing classes support a wide range of features which are not demonstrated in this applet. Here we allow you to test for yourself the Perl5, AWK, and glob regular expression support from the org.apache.oro.text.regex, org.apache.oro.text.awk, and org.apache.oro.text packages. The Perl5 syntax demonstrated is Perl 5.003 compatible as of version 2.0.2. Remember, Perl5 compatibility means that zero-width lookahead assertions, greed control, backreferences, and other features are supported. This applet only demonstrates the basic functionality of the packages. The split and substitute methods of the Util class and other features are not demonstrated here. To get a better idea of what else you can do with Jakarta ORO, you should look through the API documentation.


Instructions

Select a regular expression syntax in the topmost choice menu. Type a regular expression in the first text field. Then in the Search Input text area, enter text that you want to search. Click the Search button to search the input text. The results will appear in the Search Results text area. The Reset button will clear the regular expression, input, and result text.

There are two choice menus that affect the regular expression search. The contains() item causes the contains() method of the PatternMatcher interface to be used to perform the search. This search is done in a while loop, finding all pattern matches occurring within the input. The matches() item causes the matches() method of the PatternMatcher interface to be used to perform the search. The matches() method only tests if all the input EXACTLY matches the regular expression. It does not check to see if there is a match somewhere inside the input. That is what the contains() method is for. This is sometimes a point of confusion for users who have tried other packages. In Jakarta ORO, matches() is used to find exact matches, and contains() is used to find a match contained in the input.
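
The difference between the two search modes can be sketched with the JDK's own java.util.regex package standing in for the Jakarta ORO classes. This is an assumption made so that the example runs without the library: Matcher.matches() plays the role of PatternMatcher.matches(), and a while loop over Matcher.find() plays the role of the contains() loop. The names MatchModes, exactMatch, and allMatches are illustrative only and are not part of either API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch using java.util.regex, NOT the Jakarta ORO API.
class MatchModes {
    // Analogue of matches(): true only if the ENTIRE input matches the pattern.
    static boolean exactMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // Analogue of the contains() while loop: collect every match found inside the input.
    static List<String> allMatches(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(exactMatch("\\d+", "2003"));       // true
        System.out.println(exactMatch("\\d+", "oro 2.0.8"));  // false
        System.out.println(allMatches("\\d+", "oro 2.0.8"));  // [2, 0, 8]
    }
}
```

The second call returns false even though the input contains digits, which is the point of confusion described above: an exact match test says nothing about matches inside the input.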

The Case Sensitive and Case Insensitive choice items are self-explanatory. Case Sensitive causes the regular expression to be compiled with case sensitivity enabled. Case Insensitive causes the regular expression to treat upper and lower case characters the same.

The Search Results text area will display all the matches found in the input when the contains() choice item is selected. It will also display what the parenthesized subgroups of a regular expression matched. When the matches() choice item is selected, only whether or not the input exactly matched the pattern is indicated.
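
A report of this kind might be produced along the following lines. This is a minimal sketch that uses java.util.regex as a stand-in for the Jakarta ORO matcher (an assumption, so that it runs without the library); the class name SubgroupReport, the method describe, and the exact label text are hypothetical and do not reproduce the applet's output format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch using java.util.regex, NOT the Jakarta ORO API.
class SubgroupReport {
    // Lists each match found in the input, followed by what each
    // parenthesized subgroup of the pattern matched.
    static List<String> describe(String regex, String input) {
        List<String> lines = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        int matchNumber = 0;
        while (m.find()) {
            matchNumber++;
            lines.add("Match " + matchNumber + ": " + m.group());
            // Group 0 is the whole match; subgroups start at 1.
            for (int g = 1; g <= m.groupCount(); g++) {
                lines.add("  Subgroup " + g + ": " + m.group(g));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe("(\\w)=(\\d)", "a=1, b=2")) {
            System.out.println(line);
        }
    }
}
```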

Please note that if you don't enter anything for a regular expression, it will be compiled as an expression matching a zero-length string (the null string), which will match before and after every character in the input.
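
The effect of an empty expression can be illustrated with a short sketch. It uses java.util.regex rather than the Jakarta ORO classes (an assumption, so that it runs without the library), and the names EmptyPatternDemo and matchOffsets are hypothetical. For a three-character input, the empty pattern matches at four offsets: before each character and after the last one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch using java.util.regex, NOT the Jakarta ORO API.
class EmptyPatternDemo {
    // Returns the start offset of every match of the pattern in the input.
    static List<Integer> matchOffsets(String regex, String input) {
        List<Integer> offsets = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            offsets.add(m.start());
        }
        return offsets;
    }

    public static void main(String[] args) {
        // The empty pattern matches the zero-length string at every position.
        System.out.println(matchOffsets("", "abc")); // [0, 1, 2, 3]
    }
}
```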



Copyright © 1999-2003, Apache Software Foundation
jakarta-oro-2.0.8/docs/devplan-2.0.html
Jakarta ORO - Development Plan: 2.0 to 3.0

Development Plan: 2.0 to 3.0

The development plan is not ready yet, but the idea is that we'll go through a series of iterative cycles (develop, test, fix bugs, test again, release, revisit objectives) and incrementally release stable versions of the software. The primary objective of the work taking us from 2.0 through 2.1, 2.2, etc. up to 3.0 is to achieve Perl 5.6 regular expression compatibility in version 3.0 for the org.apache.oro.text.regex and org.apache.oro.text.perl packages. This is the driving force because the Perl expressions are the library's most popular feature. Other development will transpire, but the focus will be on the Perl regular expression support.

  1. Overview
    1. Purpose
    2. Deliverables
  2. Development Work
    1. Regression and Unit Testing
    2. Performance Testing and Optimization
    3. Defect Correction
    4. Functionality Enhancement
  3. User Support
    1. Documentation
    2. Example Code
  4. Code Development and Maintenance
    1. Code Cleanup
    2. Committers
  5. Release Schedule
  6. Beyond 3.0



Copyright © 1999-2003, Apache Software Foundation
jakarta-oro-2.0.8/build.xml
AnakiaTask is not present! Please check to make sure that velocity.jar is in your classpath.
jakarta-oro-2.0.8/build.properties
# ------------------------------------------------------------------------
# $Id: build.properties,v 1.8 2003/12/29 02:22:51 dfs Exp $
#
# This file controls various properties which may be set during a build.
#
# This file is intended to be modified by users to accommodate their own
# working practices, or overridden by one of the property files specified
# in build.xml.
# --------------------------------------------------------------------------

# Name and version information
name=Jakarta-ORO
project=jakarta-oro
version=2.0.8

# Name and version of the project
project.name=${project}-${version}

top.dir=.
year=2000-2003
jakarta-site2.dir=../jakarta-site2
code.src=${top.dir}/src
build.src=${top.dir}/src/java
build.dest=${top.dir}/classes
build.tests=${build.dest}/tests
javadoc.destdir=${top.dir}/docs/api
final.name=${project}-${version}
final.dir=${top.dir}/${final.name}
debug=off
optimize=on
deprecation=off
ant.home=.
docs.src=${top.dir}/xdocs
docs.dest=${top.dir}/docs

#
# The stuff below is for when we take another pass at cleaning up
# build.xml and making the properties more consistent.
#

# Temporary working directory. Specified on a per user basis
#tmp.dir=/tmp/${user.name}
tmp.dir=.
# Build directory
##build.dir=${tmp.dir}/${project.name}/build
#build.dir=${tmp.dir}/build
#src.dir=${top.dir}/src
#src.java.dir=${src.dir}/java
#doc.dir=${top.dir}/docs
#doc.java.dir=${doc.dir}/api
##doc.user.dir=${doc.dir}/user
##doc.printer.dir=${doc.dir}/printer

# Test results directory
#check.dir=${tmp.dir}/${project.name}/check

# Build properties
#build.debug=off
#build.deprecation=off
#build.optimize=on
#build.bin.dir=${build.dir}/bin
#build.lib.dir=${build.dir}/lib
#build.src.dir=${build.dir}/src/java
#build.data.dir=${build.dir}/conf
#build.doc.dir=${build.dir}/docs
#build.doc.java.dir=${build.dir}/docs/api

# Installation properties
#install.dir=./${project.name}
#install.bin.dir=${install.dir}/bin
#install.lib.dir=${install.dir}/lib
#install.src.dir=${install.dir}/src/java
#install.data.dir=${install.dir}/conf
#install.doc.dir=${install.dir}/docs
#install.doc.java.dir=${build.dir}/docs/api

# Document constants
#company.name=Apache Software Foundation
#copyright.date=2000-2003
#copyright.message=Copyright © ${copyright.date} ${company.name}. All Rights Reserved.

# Time stamp patterns
#timestamp.fullTimeDate.pattern=EEEE, d MMMM yyyy hh:mm:ss aa (z)
#timestamp.longTimeDate.pattern=EEEE, d MMMM yyyy hh:mm:ss aa
#timestamp.shortTimeDate.pattern=dd/MM/yyyy HH:mm:ss

jakarta-oro-2.0.8/LICENSE
/* ====================================================================
 * The Apache Software License, Version 1.1
 *
 * Copyright (c) 2000-2002 The Apache Software Foundation.  All rights
 * reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *    distribution.
 *
 * 3. The end-user documentation included with the redistribution,
 *    if any, must include the following acknowledgment:
 *       "This product includes software developed by the
 *        Apache Software Foundation (http://www.apache.org/)."
 *    Alternately, this acknowledgment may appear in the software itself,
 *    if and wherever such third-party acknowledgments normally appear.
 *
 * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro"
 *    must not be used to endorse or promote products derived from this
 *    software without prior written permission. For written
 *    permission, please contact apache@apache.org.
 *
 * 5. Products derived from this software may not be called "Apache"
 *    or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their
 *    name, without prior written permission of the Apache Software Foundation.
 *
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
 * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED.  IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR
 * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
 * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
 * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
 * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 * ====================================================================
 *
 * This software consists of voluntary contributions made by many
 * individuals on behalf of the Apache Software Foundation.  For more
 * information on the Apache Software Foundation, please see
 * .
 */

jakarta-oro-2.0.8/ISSUES
$Id: ISSUES,v 1.1 2003/01/03 20:11:47 dfs Exp $

o It is possible that Perl5Util may not handle backslashing properly
  for certain rare cases.  Need to collect bug reports on this count
  and fix it if this is true.

jakarta-oro-2.0.8/CHANGES
$Id: CHANGES,v 1.37 2003/12/29 02:22:51 dfs Exp $

Version 2.0.8

o examples moved to an examples package and com.oroinc migration tool
  moved to tools package.

o Fixed bug whereby compiling an expression with
  Perl5Compiler.MULTILINE_MASK wasn't always having the proper effect
  with respect to the matching of $ even though
  Perl5Matcher.setMultiline(true) exhibited the proper behavior.
  For example, the following input
    " aaa bbb \n ccc ddd \n eee fff "
  should produce "bbb ", "ddd ", and "fff " as matches for both the
  patterns "\S+\s*$" and "\S+ *$" when compiled with MULTILINE_MASK.
  Perl5Matcher was only producing the correct matches for the second
  pattern, producing only "fff " as a match for the first pattern
  unless setMultiline(true) had been called.  This has now been fixed.

o Fixed embarrassing bug whereby an expression like (A)(B)((C)(D))+
  when matched against input like ABCDE would produce matching groups
  of: "A" "B" "" null "D" instead of "A" "B" "CD" "C" "D".

Version 2.0.7

o Made Perl5Util.toString() return null if no match exists in keeping
  with the MatchResult interface.  Previously, a NullPointerException
  was being thrown.

o Fixed problem whereby an AwkMatchResult resulting from a match on
  an AwkStreamInput source (AwkMatcher.contains(AwkStreamInput, Pattern);)
  would contain offsets relative to the buffered input rather than to
  where the input stream was first read from (usually the beginning of
  the stream).

o Added support for negating modifiers.  For example (?-i)foo.
  Note, we still do not support Perl 5.005+ group-local modifiers
  such as (?imsx-imsx:pattern).

Version 2.0.7-dev-1

o Fixed a problem whereby offset information for captured groups was
  considered out of bounds when capturing parentheses occurred in a
  lookahead assertion that was not part of the actual match (occurring
  immediately after the match).

o Changed processMatches to take Reader and Writer arguments.
  Reimplemented old method in terms of new method.  Added a version of
  old method that allows you to specify input encoding.  Changes
  suggested by Harald Kuhn.

o Changed behavior of Perl5Util.split() to match Perl's behavior, where
  "leading empty fields are preserved, and empty trailing ones are
  deleted."  Util.split() is left unchanged.

Version 2.0.6

o Added $& as a valid interpolation variable for Perl5Substitution.
  The behavior of $0 was changed from undefined to the same as $&,
  but it should be avoided since $0 no longer means the same thing as
  $& in Perl.  Only use $& if possible.

o Removed some leftover references to OROMatcher in the Perl5Util
  javadocs.

o Added an int substitute(...) method to Perl5Util to correspond to
  the similar method added to org.apache.oro.text.regex.Util in v2.0.3

o Removed ant and support jars from distribution and moved build.xml
  to top level directory.  From now on, you must have ant installed on
  your system to build the source.

o Added a strings.java example to the awk examples, better
  demonstrating the character encoding issues associated with
  AwkMatcher's 8-bit character set limitation.

Version 2.0.5

o Fixed [[:upper:]] so it would match lower case characters during
  case insensitive matches.

o Fixed [[:punct:]] (which also affected [[:print:]] and [[:graph:]])
  to conform to Single Unix Specification (some characters had been
  omitted).

o Fixed bug whereby a - in a Perl expression would be ignored when it
  followed a builtin character class like \w.  In other words, [\w-]
  behaved like \w instead of like [-\w].
  The regression was introduced with the unicode character class patch
  from version 2.0.2.

Version 2.0.4

o Deprecated Vector Perl5Util.split(String input) and replaced with
  void Perl5Util.split(Collection results, String input).  This should
  have been done in an earlier release along with the other split
  methods, but this method slipped through the cracks.

o Fixed bug in AwkMatcher that didn't properly handle
  PatternMatcherInput matches when PatternMatcherInput had a non-zero
  begin offset.

o Added code to Perl5Matcher to handle bytecode generated for
  [[:alpha:]].  [[:alpha:]] would compile but not be interpreted.

o Fixed problem whereby MatchActionProcessor would never set
  MatchActionInfo.pattern before calling MatchAction.processAction.
  Also changed MatchActionInfo.fields from Vector to List.

o Added semicolon.txt input file for the semicolon.java example program.

o Fixed bug where RegexFilenameFilter() constructor with options
  argument would not use the options in compiling the expression.

o Added READ_ONLY_MASK compilation option to GlobCompiler

Version 2.0.3

o Changed version information in javadocs to correspond to release
  version rather than CVS version.

o Changed the backing store for GenericCache from Hashtable to HashMap.

o Fixed a problem in Perl5Debug.printProgram where it wouldn't handle
  the printing of the unicode character class bytecode correctly.

o Added links to Jakarta ORO home page in javadocs and converted web
  site documentation to jakarta-site2 xml docs stored in xdocs
  directory.

o Applied Mark Murphy's patch to add case modification support to
  Perl5Substitution.  The \u\U\l\L\E escapes from Perl5 double-quoted
  string handling are now recognized in a substitution expression.

o Made a backwards-incompatible change in the Substitution interface.
  The input parameter is now a PatternMatcherInput instance instead of
  a String.
  A new substitute method was added to Util to allow programmers to
  reduce the number of string copies in the existing substitute()
  implementation.  This required that the Substitution interface be
  changed.  The existing Substitution implementing classes do not use
  the input parameter and it is very easy for anyone implementing the
  Substitution interface in a custom class to update their class to
  the new interface.  A deprecation phase was skipped because it would
  have still required implementors of the interface to implement a
  method with the new signature.

Version 2.0.2

o Fixed default behavior of '.' in awk package.  Previously it
  wouldn't match newlines, which isn't how AWK behaves.  The default
  behavior now matches any character, but a compilation mask
  (MULTILINE_MASK) has been added to AwkCompiler to enable the old
  behavior.

o Replaced the use of Vector with ArrayList in Perl5Substitution.

o Replaced the use of deprecated Perl5Util split method in printPasswd
  example with newer method.  Also updated splitExample to use
  ArrayList instead of Vector.

o Updated RegexFilenameFilter to also implement the Java 1.1
  FileFilter interface.

o Applied a follow up to Takashi Okamoto's unicode/posix patch that
  implemented negative posix character classes (e.g., [[:^digit:]])

Version 2.0.2-dev-2

o Applied a modified version of Takashi Okamoto's unicode/posix patch.
  It adds unicode support to character classes and adds partial
  support for posix classes (it supports things like [:digit:] and
  [:print:], but not [:^digit:] and [:^print:]).  It will be
  improved/optimized later, but gives people the functionality they
  need today.

Version 2.0.2-dev-1

o Removed commented out code and changed OpCode._isWordCharacter() to
  use Character.isLetterOrDigit()

o Some documentation fixes.

o Perl5Matcher was not properly tracking the current offset in
  __interpret so PatternMatcherInput would not have its current offset
  properly updated.
  This bug was introduced when fixing the PatternMatcherInput anchor
  bug.  When the currentOffset parameter was added to __interpret, not
  all of the necessary __currentOffset = beginOffset assignments were
  changed to __currentOffset = currentOffset assignments.  This has
  been fixed by updating the remaining assignments.

o The org.apache.oro.text.regex.Util.split() method was further
  generalized to accept a Collection reference argument to store the
  result.

Version 2.0.1

o Jeff ? (jlb@houseofdistraction.com) and Peter Kronenberg
  (pkronenberg@predictivetechnologies.com) identified a bug in the
  behavior of PatternMatcherInput matching methods with respect to
  anchors (^).  Matches were always being performed interpreting the
  beginning of the string from either a 0 or a current offset rather
  than from the beginOffset.  Essentially, the beginning of the string
  for purposes of matching ^ wasn't being associated with beginOffset.
  This problem has been fixed.

o A problem with multiline matches that was introduced in the
  transition from OROMatcher/Perltools/etc. to Jakarta-ORO was fixed.

o The org.apache.oro.text.regex.Util.split() method was generalized to
  accept a List reference argument to store the result.

Version 2.0

Many small, but important, changes have been made to this software
between its last release as ORO software and its first release as part
of the Apache Jakarta project.  The last versions of the ORO software
were:

  OROMatcher 1.1 (com.oroinc.text.regex)
  PerlTools 1.2 (com.oroinc.text.perl)
  AwkTools 1.0 (com.oroinc.text.awk)
  TextTools 1.1 (com.oroinc.io, com.oroinc.text, com.oroinc.util)

OROMatcher 1.1 was fully compatible with Perl 5.003 regular
expressions.  Many changes have been made to the regular expression
behavior in successive versions of Perl.  The goal is to update
org.apache.oro.text.regex to the latest version of Perl (5.6 at this
time), but only some of this work has been done.

o Guaranteed compatibility with JDK 1.1 is discontinued.
  JDK 1.2 features may be used indiscriminately, even though they may
  not have been used at this time.

o Perl5StreamInput and methods manipulating Perl5StreamInput have been
  removed.  For the technical reasons behind this decision see
  "On the Use of Regular Expressions for Searching Text", Clarke and
  Cormack, ACM Transactions on Programming Languages and Systems,
  Vol 19, No. 3, pp 413-426.

o Default behavior of '^' and '$' as matched by Perl5Matcher has
  changed to match as though in single line mode, which appears to be
  the Perl default.  Previously if you did not specify single line or
  multi-line mode, '^' and '$' would act as though in multi-line mode.
  Now they act as though in single line mode, unless you specify
  otherwise with setMultiline() or by compiling the expression with an
  appropriate mask.  However, '.' acts as in multi-line mode by default
  and its behavior can only be altered by compiling an expression with
  SINGLELINE_MASK.  setMultiline() emulates the Perl $* variable,
  whereas the MASK variables emulate the ismx modifiers.

o All deprecated methods have been removed.  This includes various
  substitute() convenience methods from com.oroinc.text.regex.Util.

o Javadoc comments have been updated to use 1.2 standard doclet
  features.

jakarta-oro-2.0.8/COMPILE0000644000175000017500000000352407773723336014301 0ustar arnaudarnaud
$Id: COMPILE,v 1.5 2003/12/29 03:58:33 dfs Exp $

The Jakarta-ORO library follows the same build procedure as other
Jakarta projects, relying on the Ant build system.  You can learn more
about the Ant build system from http://jakarta.apache.org/.  If you
don't have Ant installed on your system, you must download and install
it to compile the software.

By default, build.xml will build a jar file containing the library.
Optionally, you can pass one of the following build targets as an
argument to Ant:

  lib          - builds the library
  examples     - builds the example programs
  examples-awk - builds the org.apache.oro.text.awk examples
  tools        - builds the utility programs
  jar          - builds a jar file containing the class library
  javadocs     - builds the API documentation
  package      - builds a source distribution package
  package-zip  - builds a distribution package stored as a zip file
  package-tgz  - builds a distribution package stored as a gzipped
                 tar file (.tar.gz)
  clean        - removes all files generated by build targets

All generated class files are stored in a classes/ directory.
All documentation is stored in a docs/ directory.

Examples:

  To build only the library use:
    ant lib

  To build only the javadocs use:
    ant javadocs

NOTE FOR DEVELOPERS
-------------------

As of 2003/12/28, all generated documentation under the docs/ tree is
stored in CVS.  This is done so that the infrastructure team can
regenerate project pages without having to build individual software
distributions.  Therefore, after you make changes in xdocs or update
the demo applet, you must check in any updated or new artifacts in the
docs/ tree.  First do an 'ant docs' and then do the requisite checkins.
Do not directly edit any of the files under the docs/ tree.  They are
to be generated from the source under xdocs.

jakarta-oro-2.0.8/CONTRIBUTORS0000644000175000017500000000165707773723336015153 0ustar arnaudarnaud
$Id: CONTRIBUTORS,v 1.7 2002/06/27 22:43:39 dfs Exp $

Daniel Savarese is the original author of the OROMatcher, PerlTools,
AwkTools, and TextTools packages that became the Jakarta-ORO project.

Jon Stevens helped prepare the first release of jakarta-oro and is a
constant help in keeping the project consistent with the Jakarta
project as a whole.

Takashi Okamoto has contributed a Unicode character class fix and an
initial POSIX character class implementation.
Mark Murphy has contributed performance improvements to
Perl5Substitution as well as adding support for \UuLlE and escaping
of $.

Michael Davey fixed some documentation and added a missing
int substitute(...) method to Perl5Util.

Harald Kuhn updated MatchActionProcessor.processMatches() to
accommodate character encodings.

jakarta-oro-2.0.8/README0000644000175000017500000000615707773723336014153 0ustar arnaudarnaud
$Id: README,v 1.5 2003/12/29 02:27:55 dfs Exp $

Quick Overview
--------------

  CHANGES      - lists recent changes to the source code
  COMPILE      - contains quick instructions for building the library
  CONTRIBUTORS - lists people who have contributed to developing the code
  ISSUES       - contains a list of known bugs
  KEYS         - lists PGP keys used to sign releases
  LICENSE      - the license defining the terms of use of the software
  README       - this file
  STYLE        - a set of guidelines for developers of the code
  TODO         - lists planned or possible changes/additions
  build/       - directory containing the files necessary to build the
                 software
  src/java     - directory containing the library source, example
                 programs, and a utility program for converting old ORO
                 source to Jakarta-ORO
  docs/        - directory where generated documentation is stored
  classes/     - directory created when building the library; contains
                 the class files

Description
-----------

The Jakarta-ORO Java classes are a set of text-processing Java classes
that provide Perl5 compatible regular expressions, AWK-like regular
expressions, glob expressions, and utility classes for performing
substitutions, splits, filtering filenames, etc.  This library is the
successor to the OROMatcher, AwkTools, PerlTools, and TextTools
libraries from ORO, Inc. (www.oroinc.com).  They have been donated to
the Jakarta Project by Daniel Savarese (www.savarese.org), the
copyright holder of the original oroinc.com ORO libraries.  Daniel will
continue to participate in their development under the Jakarta Project.

Building
--------

Build instructions are in the COMPILE file.
For the impatient, execute build.sh in the build directory.

Converting Old Code
-------------------

If you need to migrate old source that uses the com.oroinc package
prefixes, you can use the provided oroToApache program from the
src/tools directory to automate the conversion of your source code.

Brief History (for the curious)
-------------------------------

ORO, Inc. was a Java tools company that started in 1997 and stopped
doing business a year and a half later when the tools market didn't pan
out as well as anticipated.  Other tools companies disappeared at about
the same time, including JScape.  So it goes.  ORO produced several
Java class libraries that became very popular among Java developers and
were licensed by companies such as IBM, Compaq, AOL, Netscape, and you
get the picture.  Daniel Savarese, one of ORO's founders, continued
maintaining these libraries after ORO dissolved but did not have time
to provide adequate support and feature improvements for developers.
Because of licensing restrictions with other companies, the source code
could not be immediately released as an open source project.  That is,
until now (June 2000).  Jakarta-ORO combines all of the former ORO text
processing libraries into one package under the Apache Software
License.  The software was donated to the Apache Software Foundation
because of the great strides they have made for server-side Java.  The
largest group of developers using the ORO text processing libraries are
servlet developers, so it seemed like a perfect fit.

jakarta-oro-2.0.8/STYLE0000644000175000017500000000513607773723336014112 0ustar arnaudarnaud
$Id: STYLE,v 1.1.1.1 2000/07/23 23:08:28 jon Exp $

The two primary design objectives of OROMatcher (what is now
org.apache.oro.text.regex) are:

  1) be as compatible as possible with Perl
  2) be as efficient as possible in meeting objective 1 while
     maintaining clean object-oriented interfaces and avoiding
     extraneous features.

We're lagging behind on 1 right now.
The package is only Perl 5.003 compatible and needs to be updated to
Perl 5.6.  Number 2 has no bearing on how pretty the implementation is.
The external interfaces to the library are what need to be pretty.  The
implementation as it stands is ugly ugly ugly and will likely remain so
for the rest of eternity because of the very nature of Perl regular
expressions.

CODING CONVENTIONS

The source code follows a set of conventions that must be observed by
all new additions and modifications for the sake of maintaining a
consistent style.

o Static final variable names must be all upper case, with word
  boundaries delimited by underscores.  For example:
    public static final int MY_STATIC_FINAL_CONSTANT = 0;

o All method names and variables that are not constants must start
  with a lower case letter, with word boundaries delimited by using an
  upper case letter to start the next word.  For example:
    public void myMethodName() { }

o All class names must start with an upper case letter followed by
  lower case letters, with word boundaries delimited by an upper case
  letter to start the next word.  For example:
    public class MyClassName { }

o Private members (variables and methods) must be prefixed by two
  underscores.  For example:
    private int __myPrivateVariable;

o Protected members (variables and methods) must be prefixed and
  suffixed by one underscore.  For example:
    protected int _myProtectedVariable_;

o Package local members (variables and methods) must be prefixed by
  one underscore.  For example:
    int _myPackageVariable;

o Public members (variables and methods) have no underscore prefixes
  or suffixes.  For example:
    public int myPublicMethod() { return 1; }

o All public and protected members (methods and variables) must be
  fully documented so that someone who does not have access to the
  source code will be able to use the methods, variables, and classes
  without ambiguity.

o The code uses GNU-style indentation.
  The best way to ensure this is to use Emacs java-mode and add the
  following to your .emacs:

    (add-hook 'java-mode-hook
              '(lambda()
                 (c-make-styles-buffer-local)
                 (setq c-basic-offset 2)))

jakarta-oro-2.0.8/TODO0000644000175000017500000000544307773723336013760 0ustar arnaudarnaud
$Id: TODO,v 1.16 2003/12/29 02:28:55 dfs Exp $

o unit tests

o redo/update build.xml file to conform to latest jakarta practices

o distribute separate binary and source releases to cut down on size
  of download for people who just want the libraries.

o Optimize/improve Unicode character classes.

o Fix any pending issues listed in ISSUES file or issue tracking
  system.

o Update org.apache.oro.text.regex and org.apache.oro.text.perl syntax
  to latest version of Perl, currently version 5.8.  This will require
  a lot of work.

o Pattern cache implementations are probably not very efficient.
  Should revisit and reimplement.

o Look for ways to avoid creating unnecessary String instances and
  potential cases of redundant String/char[] conversions.

o The MatchAction, MatchActionInfo, and MatchActionProcessor classes
  need to be updated and improved upon.  Even though they were probably
  a bad idea to have created in the first place, people do use them.

o Reduce the memory overhead of case insensitive matching in
  Perl5Matcher.

o Measure performance of HotSpot iterating through match input via an
  interface's virtual function versus direct character array indexing.
  If HotSpot dynamically inlines the functions and achieves comparable
  performance, provided a clear warning is indicated that performance
  could be reduced on earlier JDK versions, could create a generic
  interface for representing input.  Input array indexing could be
  replaced with the generic interface, PatternMatcherInput could be
  made to implement the interface, and stream matching could be
  reintroduced.
  Reintroduced stream matching could include a callback mechanism in
  the interface to report when a "contains" match has been found to
  allow the input encapsulator to trim its buffer.  Strong warnings
  must go into the documentation referencing the ACM paper and noting
  that for many streams it will be more efficient to read the entire
  stream into a buffer first rather than try to match incrementally
  because many regular expressions will cause the whole stream to be
  read in anyway.  For situations where that is not the case we want to
  be able to trim the buffer (there have been people who used
  OROMatcher to search gigabyte length files!).  Additional methods
  could be added to regulate buffer growth behavior, whether to save
  all of it for reuse in a future pass, etc.

o Make separate src and bin distributions.  Current distribution is
  getting big on account of 1.2 MB of API docs.  src only distribution
  should be half the size of bin distribution for quicker download.

o Write user's guide and FAQ.

o Update javadocs to take advantage of more recent features for using
  the same documentation in multiple places without writing it multiple
  times.  Also get rid of all JDK 1.4 javadoc warnings.