Class WordIterator

java.lang.Object
icyllis.modernui.text.method.WordIterator

public class WordIterator extends Object
Walks through cursor positions at word boundaries. Internally uses BreakIterator.getWordInstance() and caches the CharSequence for performance.

Also provides methods to determine word boundaries.
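As a rough illustration of the underlying mechanism, the sketch below walks word boundaries with the JDK's java.text.BreakIterator. This is an assumption made for the sake of a self-contained example: the page does not say which BreakIterator implementation WordIterator wraps, and the class name WordBoundaryWalk and the helper boundaries are hypothetical.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WordBoundaryWalk {
    /** Collects every boundary offset the JDK word BreakIterator reports for the text. */
    static List<Integer> boundaries(String text, Locale locale) {
        // Stand-in for whatever iterator WordIterator uses internally (assumption).
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<Integer> result = new ArrayList<>();
        // first() yields the start of the text; next() returns DONE past the last boundary.
        for (int pos = it.first(); pos != BreakIterator.DONE; pos = it.next()) {
            result.add(pos);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(boundaries("Hello world", Locale.ROOT));
    }
}
```

Note that the boundaries delimit every segment, including whitespace runs, so a caller interested only in words still has to inspect the characters between two adjacent boundaries.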

  • Field Details

    • GC_P_MASK

      public static final int GC_P_MASK
  • Constructor Details

    • WordIterator

      public WordIterator()
      Constructs a WordIterator using the default locale.
    • WordIterator

      public WordIterator(Locale locale)
      Constructs a new WordIterator for the specified locale.
      Parameters:
      locale - The locale to be used for analyzing the text.
  • Method Details

    • setCharSequence

      public void setCharSequence(@Nonnull CharSequence charSequence, int start, int end)
    • preceding

      public int preceding(int offset)
    • following

      public int following(int offset)
    • isBoundary

      public boolean isBoundary(int offset)
    • nextBoundary

      public int nextBoundary(int offset)
      Returns the position of the next boundary after the given offset. Returns DONE if there is no boundary after the given offset.
      Parameters:
      offset - the given start position to search from.
      Returns:
      the position of the first boundary following the given offset.
    • prevBoundary

      public int prevBoundary(int offset)
      Returns the position of the boundary preceding the given offset, or DONE if the given offset specifies the starting position.
      Parameters:
      offset - the given start position to search from.
      Returns:
      the position of the last boundary preceding the given offset.
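The DONE behaviour of nextBoundary and prevBoundary can be sketched with the JDK's java.text.BreakIterator, whose following(int) and preceding(int) have the same contract at the ends of the text. Using the JDK iterator here is an assumption, and BoundarySearch with its two static helpers is a hypothetical name, not part of this class.

```java
import java.text.BreakIterator;
import java.util.Locale;

public class BoundarySearch {
    /** First boundary strictly after offset, or BreakIterator.DONE if there is none. */
    static int nextBoundary(String text, int offset) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        it.setText(text);
        return it.following(offset);
    }

    /** Last boundary strictly before offset, or BreakIterator.DONE if there is none. */
    static int prevBoundary(String text, int offset) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        it.setText(text);
        return it.preceding(offset);
    }

    public static void main(String[] args) {
        String text = "Hello world";
        System.out.println(nextBoundary(text, 0));
        System.out.println(prevBoundary(text, text.length()));
        // Searching past either end of the text yields DONE.
        System.out.println(nextBoundary(text, text.length()) == BreakIterator.DONE);
        System.out.println(prevBoundary(text, 0) == BreakIterator.DONE);
    }
}
```

This sketch builds a fresh iterator on every call for brevity; a cached iterator, as the class description mentions, avoids that cost.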
    • getPunctuationBeginning

      public int getPunctuationBeginning(int offset)
      If offset is within a group of punctuation as defined by isPunctuation(int), returns the index of the first character of that group; otherwise returns BreakIterator.DONE.
      Parameters:
      offset - the offset to search from.
    • getPunctuationEnd

      public int getPunctuationEnd(int offset)
      If offset is within a group of punctuation as defined by isPunctuation(int), returns the index of the last character of that group plus one; otherwise returns BreakIterator.DONE.
      Parameters:
      offset - the offset to search from.
    • isAfterPunctuation

      public boolean isAfterPunctuation(int offset)
      Indicates if the provided offset is after a punctuation character as defined by isPunctuation(int).
      Parameters:
      offset - the offset to check from.
      Returns:
      Whether the offset is after a punctuation character.
    • isOnPunctuation

      public boolean isOnPunctuation(int offset)
      Indicates if the provided offset is at a punctuation character as defined by isPunctuation(int).
      Parameters:
      offset - the offset to check from.
      Returns:
      Whether the offset is at a punctuation character.
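The four punctuation queries above can be approximated by a plain code-point scan. The sketch below is a simplification under stated assumptions: it treats "punctuation" as any Unicode general category in the P group (a guess at what isPunctuation(int) and the undocumented GC_P_MASK field express), it treats "within a group" as "on or directly after a punctuation character", and it scans characters directly rather than stepping between word boundaries. All names in it (PunctuationRuns, PUNCT_MASK, and the helpers) are hypothetical.

```java
public class PunctuationRuns {
    static final int DONE = -1; // same value as java.text.BreakIterator.DONE

    // Assumed definition: one bit per general category in the Unicode P group.
    static final int PUNCT_MASK =
            (1 << Character.CONNECTOR_PUNCTUATION)
            | (1 << Character.DASH_PUNCTUATION)
            | (1 << Character.START_PUNCTUATION)
            | (1 << Character.END_PUNCTUATION)
            | (1 << Character.INITIAL_QUOTE_PUNCTUATION)
            | (1 << Character.FINAL_QUOTE_PUNCTUATION)
            | (1 << Character.OTHER_PUNCTUATION);

    static boolean isPunctuation(int cp) {
        return ((1 << Character.getType(cp)) & PUNCT_MASK) != 0;
    }

    /** True if the code point starting at offset is punctuation. */
    static boolean onPunctuation(CharSequence text, int offset) {
        return offset >= 0 && offset < text.length()
                && isPunctuation(Character.codePointAt(text, offset));
    }

    /** True if the code point ending just before offset is punctuation. */
    static boolean afterPunctuation(CharSequence text, int offset) {
        return offset > 0 && offset <= text.length()
                && isPunctuation(Character.codePointBefore(text, offset));
    }

    /** Index of the first character of the punctuation run around offset, or DONE. */
    static int punctuationBeginning(CharSequence text, int offset) {
        if (!onPunctuation(text, offset) && !afterPunctuation(text, offset)) {
            return DONE;
        }
        int i = offset;
        while (afterPunctuation(text, i)) {
            i -= Character.charCount(Character.codePointBefore(text, i));
        }
        return i;
    }

    /** Index one past the last character of the punctuation run around offset, or DONE. */
    static int punctuationEnd(CharSequence text, int offset) {
        if (!onPunctuation(text, offset) && !afterPunctuation(text, offset)) {
            return DONE;
        }
        int i = offset;
        while (onPunctuation(text, i)) {
            i += Character.charCount(Character.codePointAt(text, i));
        }
        return i;
    }

    public static void main(String[] args) {
        String text = "Wait... what?";
        System.out.println(punctuationBeginning(text, 5) + ".." + punctuationEnd(text, 5));
    }
}
```

Stepping by Character.charCount keeps the scan correct for supplementary code points that occupy two chars.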
    • isMidWordPunctuation

      public static boolean isMidWordPunctuation(Locale locale, int codePoint)
      Indicates if the codepoint is a mid-word-only punctuation character.

      At the moment, this is locale-independent, and includes all the characters in the MidLetter, MidNumLet, and Single_Quote classes of the Unicode word breaking algorithm (see UAX #29 "Unicode Text Segmentation" at http://unicode.org/reports/tr29/). According to rules WB6 and WB7 of UAX #29, these characters prevent word breaks when they are in the middle of a word, but they become word breaks when they occur at the end of a word (according to rule WB999, which breaks words at any place that is not otherwise prohibited).

      Parameters:
      locale - the locale to consider the codepoint in. Presently ignored.
      codePoint - the codepoint to check.
      Returns:
      True if the codepoint is a mid-word punctuation.
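To make the WB6/WB7 behaviour concrete, here is a minimal sketch of the idea rather than the actual lookup. It assumes a small hand-picked subset of the three classes (the real property data covers more characters), and it approximates "letter" with Character.isLetter. The names MidWordSketch, MID_WORD_SUBSET, and joinsWord are hypothetical.

```java
import java.util.Set;

public class MidWordSketch {
    // Illustrative subset only (assumption), not the full Unicode property data.
    static final Set<Integer> MID_WORD_SUBSET = Set.of(
            0x0027, // APOSTROPHE (Single_Quote)
            0x002E, // FULL STOP (MidNumLet)
            0x2019, // RIGHT SINGLE QUOTATION MARK (MidNumLet)
            0x003A, // COLON (MidLetter)
            0x00B7  // MIDDLE DOT (MidLetter)
    );

    static boolean isMidWordPunctuation(int codePoint) {
        return MID_WORD_SUBSET.contains(codePoint);
    }

    /**
     * Approximates WB6/WB7: the character at index suppresses a word break
     * only when it has a letter immediately on both sides.
     */
    static boolean joinsWord(CharSequence text, int index) {
        int cp = Character.codePointAt(text, index);
        if (!isMidWordPunctuation(cp)) {
            return false;
        }
        int after = index + Character.charCount(cp);
        if (index == 0 || after >= text.length()) {
            return false; // at the edge of the text, so it cannot be mid-word
        }
        return Character.isLetter(Character.codePointBefore(text, index))
                && Character.isLetter(Character.codePointAt(text, after));
    }

    public static void main(String[] args) {
        System.out.println(joinsWord("can't", 3));      // apostrophe between letters
        System.out.println(joinsWord("dogs' toys", 4)); // apostrophe followed by a space
    }
}
```

The same character therefore behaves differently depending on context: it stays inside the word when letters surround it and becomes an ordinary break point otherwise.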
    • isPunctuation

      public static boolean isPunctuation(int cp)