Class FixedLengthTokenizer

  • All Implemented Interfaces:
    LineTokenizer

    public class FixedLengthTokenizer
    extends AbstractLineTokenizer
    Tokenizer used to process data obtained from files in a fixed-length format. Columns are specified by an array of Range objects (see setColumns(Range...)).
    Author:
    tomas.slanina, peter.zozom, Dave Syer, Lucas Ward, Michael Minella
    • Constructor Detail

      • FixedLengthTokenizer

        public FixedLengthTokenizer()
    • Method Detail

      • setColumns

        public void setColumns(Range... ranges)
        Set the column ranges. When used in conjunction with the RangeArrayPropertyEditor, this property can be set as a String describing the range boundaries, e.g. "1,4,7" or "1-3,4-6,7" or "1-2,4-5,7-10". If the last range is open, the rest of the line is read into that column, irrespective of the strict flag setting.
        Parameters:
        ranges - the column ranges expected in the input
        See Also:
        AbstractLineTokenizer.setStrict(boolean)
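
        The String forms above can be read as 1-based, inclusive column bounds. The following is a minimal sketch in plain Java of one way such a description could be turned into ranges; it is not the RangeArrayPropertyEditor implementation. The class RangeSpecSketch, the OPEN marker, and the rule that an open range ends one column before the next range starts are assumptions made for illustration, and the sketch expects the ranges in ascending order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Spring Batch.
final class RangeSpecSketch {

    /** Marker for a range without an upper bound (an assumption of this sketch). */
    static final int OPEN = Integer.MAX_VALUE;

    /** Parses e.g. "1-3,4-6,7" or "1,4,7" into {min, max} pairs (1-based, inclusive). */
    static List<int[]> parse(String spec) {
        List<int[]> ranges = new ArrayList<>();
        for (String item : spec.split(",")) {
            String[] bounds = item.trim().split("-");
            int min = Integer.parseInt(bounds[0].trim());
            int max = bounds.length > 1 ? Integer.parseInt(bounds[1].trim()) : OPEN;
            ranges.add(new int[] {min, max});
        }
        // Assumed rule: every open range except the last one ends
        // one column before the next range starts.
        for (int i = 0; i < ranges.size() - 1; i++) {
            if (ranges.get(i)[1] == OPEN) {
                ranges.get(i)[1] = ranges.get(i + 1)[0] - 1;
            }
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Prints 1-3, 4-6 and 7- on separate lines.
        for (int[] r : parse("1,4,7")) {
            System.out.println(r[0] + "-" + (r[1] == OPEN ? "" : String.valueOf(r[1])));
        }
    }
}
```

        Under these assumptions "1,4,7" and "1-3,4-6,7" describe the same three columns, with the last one left open.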
      • doTokenize

        protected java.util.List<java.lang.String> doTokenize(java.lang.String line)
        Yields the tokens that result from splitting the supplied line.
        Specified by:
        doTokenize in class AbstractLineTokenizer
        Parameters:
        line - the line to be tokenized (can be null)
        Returns:
        the resulting tokens (empty if the line is null)
        Throws:
        IncorrectLineLengthException - if the line is longer or shorter than the maximum range that was set.
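
        The behaviour described for doTokenize can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the class's actual source: FixedLengthSketch, the OPEN marker and the {min, max} array layout are hypothetical, IllegalArgumentException stands in for IncorrectLineLengthException, and the sketch assumes that the length check only applies when the strict flag is on and that an open range contributes its lower bound to the maximum range.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Spring Batch.
final class FixedLengthSketch {

    /** Marker for a range without an upper bound (an assumption of this sketch). */
    static final int OPEN = Integer.MAX_VALUE;

    /** Splits a line into columns given {min, max} pairs (1-based, inclusive). */
    static List<String> tokenize(String line, int[][] ranges, boolean strict) {
        List<String> tokens = new ArrayList<>();
        if (line == null) {
            return tokens; // a null line yields no tokens
        }
        int maxRange = 0;
        boolean open = false;
        for (int[] r : ranges) {
            boolean isOpen = r[1] == OPEN;
            open |= isOpen;
            maxRange = Math.max(maxRange, isOpen ? r[0] : r[1]);
        }
        int length = line.length();
        // Assumed: the check applies in strict mode only, and an open
        // range lifts the "too long" restriction.
        if (strict && (length < maxRange || (!open && length > maxRange))) {
            throw new IllegalArgumentException(
                    "Expected line length " + maxRange + " but was " + length);
        }
        for (int[] r : ranges) {
            // Clamp both bounds so that short lines yield shorter or empty tokens.
            int start = Math.min(r[0] - 1, length);
            int end = r[1] == OPEN ? length : Math.min(r[1], length);
            tokens.add(line.substring(start, Math.max(start, end)));
        }
        return tokens;
    }

    public static void main(String[] args) {
        int[][] ranges = {{1, 3}, {4, 6}, {7, OPEN}};
        // Prints [ABC, DEF, GHIJ]
        System.out.println(tokenize("ABCDEFGHIJ", ranges, true));
    }
}
```

        With the last range left open, everything from column 7 onwards lands in the third token, whatever the line length.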