Class RegexLineTokenizer

  • All Implemented Interfaces:
    LineTokenizer

    public class RegexLineTokenizer
    extends AbstractLineTokenizer
    Line tokenizer that uses a regular expression to filter out data by means of capturing and non-capturing groups. Consider the following regex, which picks out only the first and last name (notice the non-capturing group in the middle):
     (.*?)(?: .*)* (.*) 
     
    For the names:
    • "Graham James Edward Miller"
    • "Andrew Gregory Macintyre"
    • "No MiddleName"
    the output, in capturing-group order, will be:
    • "Graham", "Miller"
    • "Andrew", "Macintyre"
    • "No", "MiddleName"
    An empty list is returned if the line does not match.
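    The group behaviour described above can be reproduced with java.util.regex alone. The following is a minimal sketch, not this class's implementation: the class name RegexGroupSketch and the helper tokenize are hypothetical, and the sketch assumes the tokens are simply the capturing groups of the first match, read in order with Matcher.group(int).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexGroupSketch {

    // Hypothetical helper (an assumption, not the library's code): returns the
    // capturing groups of the first match, or an empty list if nothing matches.
    public static List<String> tokenize(Pattern pattern, String line) {
        Matcher matcher = pattern.matcher(line);
        if (!matcher.find()) {
            return Collections.emptyList();
        }
        List<String> tokens = new ArrayList<>();
        // Group 0 is the whole match; capturing groups are numbered from 1.
        // The non-capturing group (?: .*)* consumes text but has no number.
        for (int i = 1; i <= matcher.groupCount(); i++) {
            tokens.add(matcher.group(i));
        }
        return tokens;
    }

    public static void main(String[] args) {
        Pattern names = Pattern.compile("(.*?)(?: .*)* (.*)");
        System.out.println(tokenize(names, "Graham James Edward Miller")); // [Graham, Miller]
        System.out.println(tokenize(names, "Andrew Gregory Macintyre"));   // [Andrew, Macintyre]
        System.out.println(tokenize(names, "No MiddleName"));              // [No, MiddleName]
        System.out.println(tokenize(names, "Single"));                     // []
    }
}
```

    The last call shows the non-match case: the expression requires at least one space, so a single word yields no tokens.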
    Author:
    Costin Leau
    See Also:
    Matcher.group(int)
    • Constructor Detail

      • RegexLineTokenizer

        public RegexLineTokenizer()
    • Method Detail

      • setPattern

        public void setPattern(java.util.regex.Pattern pattern)
        Sets the regex pattern to use.
        Parameters:
        pattern - regular expression pattern (as a compiled Pattern)
      • setRegex

        public void setRegex(java.lang.String regex)
        Sets the regular expression to use.
        Parameters:
        regex - regular expression (as a String)
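        The two setters accept the same expression in different forms. The sketch below uses only java.util.regex to show one practical difference: a compiled Pattern, the type setPattern takes, can carry flags, whereas a String, the type setRegex takes, has to embed them inline. The sample expression and the names PatternVersusRegexSketch, precompiled, and asString are hypothetical, and how the tokenizer compiles the String internally is not stated here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternVersusRegexSketch {

    // Hypothetical sample expression: a key and a value separated by '='.
    static final String REGEX = "([a-z]+)=([a-z0-9]+)";

    // A compiled Pattern can carry flags such as CASE_INSENSITIVE.
    public static Pattern precompiled() {
        return Pattern.compile(REGEX, Pattern.CASE_INSENSITIVE);
    }

    // A plain String has no flags argument, so the flag is embedded as (?i).
    public static String asString() {
        return "(?i)" + REGEX;
    }

    public static void main(String[] args) {
        String line = "Host=Server01";

        Matcher fromPattern = precompiled().matcher(line);
        if (fromPattern.find()) {
            System.out.println(fromPattern.group(1) + " / " + fromPattern.group(2));
        }

        Matcher fromString = Pattern.compile(asString()).matcher(line);
        if (fromString.find()) {
            System.out.println(fromString.group(1) + " / " + fromString.group(2));
        }
    }
}
```

        Both matchers capture the same two groups, so the choice between the setters is mainly one of convenience: a String is easier to supply from configuration, while a Pattern lets the caller control compilation flags directly.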