Package icyllis.modernui.text.method
Class WordIterator
java.lang.Object
icyllis.modernui.text.method.WordIterator
Walks through cursor positions at word boundaries. Internally uses BreakIterator.getWordInstance(), and caches the CharSequence for performance reasons.
Also provides methods to determine word boundaries.
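The boundary walk described above can be illustrated with a minimal sketch. It assumes the JDK's java.text.BreakIterator as a stand-in for whichever BreakIterator implementation this class wraps, and the class name WordBoundaryWalk is hypothetical; it is not part of this package.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of icyllis.modernui: lists the offsets
// that a word-instance BreakIterator reports as boundaries.
final class WordBoundaryWalk {

    /** Collects every boundary offset reported for the given text. */
    static List<Integer> boundaries(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<Integer> result = new ArrayList<>();
        // first() returns the first boundary; next() returns DONE at the end.
        for (int pos = it.first(); pos != BreakIterator.DONE; pos = it.next()) {
            result.add(pos);
        }
        return result;
    }

    public static void main(String[] args) {
        // Boundaries fall on both sides of the space between the two words.
        System.out.println(boundaries("Hello world", Locale.US));
    }
}
```

Note that whitespace runs are delimited by boundaries as well, so not every pair of adjacent boundaries encloses a word.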
-
Field Summary
-
Constructor Summary
Constructor  Description
WordIterator()
    Constructs a WordIterator using the default locale.
WordIterator(Locale locale)
    Constructs a new WordIterator for the specified locale.
-
Method Summary
Modifier and Type  Method  Description
int  following(int offset)
int  getPunctuationBeginning(int offset)
    If offset is within a group of punctuation as defined by isPunctuation(int), returns the index of the first character of that group, otherwise returns BreakIterator.DONE.
int  getPunctuationEnd(int offset)
    If offset is within a group of punctuation as defined by isPunctuation(int), returns the index of the last character of that group plus one, otherwise returns BreakIterator.DONE.
boolean  isAfterPunctuation(int offset)
    Indicates if the provided offset is after a punctuation character as defined by isPunctuation(int).
boolean  isBoundary(int offset)
static boolean  isMidWordPunctuation(Locale locale, int codePoint)
    Indicates if the codepoint is a mid-word-only punctuation.
boolean  isOnPunctuation(int offset)
    Indicates if the provided offset is at a punctuation character as defined by isPunctuation(int).
static boolean  isPunctuation(int cp)
int  nextBoundary(int offset)
    Returns the position of the next boundary after the given offset.
int  preceding(int offset)
int  prevBoundary(int offset)
    Returns the position of the boundary preceding the given offset, or DONE if the given offset specifies the starting position.
void  setCharSequence(CharSequence charSequence, int start, int end)
-
Field Details
-
GC_P_MASK
public static final int GC_P_MASK
-
-
Constructor Details
-
WordIterator
public WordIterator()
Constructs a WordIterator using the default locale.
-
WordIterator
public WordIterator(Locale locale)
Constructs a new WordIterator for the specified locale.
Parameters:
locale - The locale to be used for analyzing the text.
-
-
Method Details
-
setCharSequence
public void setCharSequence(CharSequence charSequence, int start, int end)
-
preceding
public int preceding(int offset) -
following
public int following(int offset) -
isBoundary
public boolean isBoundary(int offset) -
nextBoundary
public int nextBoundary(int offset)
Returns the position of the next boundary after the given offset. Returns DONE if there is no boundary after the given offset.
Parameters:
offset - the given start position to search from.
Returns:
the position of the next boundary after the given offset, or DONE if there is none.
-
prevBoundary
public int prevBoundary(int offset)
Returns the position of the boundary preceding the given offset, or DONE if the given offset specifies the starting position.
Parameters:
offset - the given start position to search from.
Returns:
the position of the last boundary preceding the given offset.
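The DONE contract of nextBoundary and prevBoundary can be sketched with the JDK's java.text.BreakIterator, whose following(int) and preceding(int) behave analogously at the ends of the text. This is only an illustration under that assumption; the class BoundarySearch and its method names are hypothetical and do not reproduce this class's implementation.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical stand-in, not the WordIterator implementation.
final class BoundarySearch {

    /** First word boundary strictly after offset, or DONE if none exists. */
    static int next(String text, int offset) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.US);
        it.setText(text);
        return it.following(offset);
    }

    /** Last word boundary strictly before offset, or DONE at the start. */
    static int prev(String text, int offset) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.US);
        it.setText(text);
        return it.preceding(offset);
    }

    public static void main(String[] args) {
        String text = "Hello world";
        System.out.println(next(text, 2));             // boundary after "Hello"
        System.out.println(prev(text, text.length())); // boundary before "world"
        System.out.println(next(text, text.length()) == BreakIterator.DONE);
        System.out.println(prev(text, 0) == BreakIterator.DONE);
    }
}
```

Callers should compare the result against DONE before using it as an index, since DONE is a negative sentinel value.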
-
getPunctuationBeginning
public int getPunctuationBeginning(int offset)
If offset is within a group of punctuation as defined by isPunctuation(int), returns the index of the first character of that group, otherwise returns BreakIterator.DONE.
Parameters:
offset - the offset to search from.
-
getPunctuationEnd
public int getPunctuationEnd(int offset)
If offset is within a group of punctuation as defined by isPunctuation(int), returns the index of the last character of that group plus one, otherwise returns BreakIterator.DONE.
Parameters:
offset - the offset to search from.
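The documented contract of getPunctuationBeginning and getPunctuationEnd can be sketched as a plain code point scan. Two assumptions are made here that the page does not state: punctuation is taken to mean the Unicode general category P, and an offset counts as "within" a group when the code point at or immediately before it is punctuation. The class PunctuationRuns is hypothetical, and the real methods may locate the group differently (for example through word boundaries).

```java
// Hypothetical sketch of the documented contract, not the actual implementation.
final class PunctuationRuns {
    static final int DONE = -1; // same value as java.text.BreakIterator.DONE

    // Assumption: "punctuation" means Unicode general category P.
    static boolean isPunctuation(int cp) {
        switch (Character.getType(cp)) {
            case Character.CONNECTOR_PUNCTUATION:
            case Character.DASH_PUNCTUATION:
            case Character.START_PUNCTUATION:
            case Character.END_PUNCTUATION:
            case Character.INITIAL_QUOTE_PUNCTUATION:
            case Character.FINAL_QUOTE_PUNCTUATION:
            case Character.OTHER_PUNCTUATION:
                return true;
            default:
                return false;
        }
    }

    // Assumption: offset is in [0, text.length()].
    private static boolean touchesPunctuation(CharSequence text, int offset) {
        boolean on = offset < text.length()
                && isPunctuation(Character.codePointAt(text, offset));
        boolean after = offset > 0
                && isPunctuation(Character.codePointBefore(text, offset));
        return on || after;
    }

    /** Index of the first character of the punctuation group at offset, or DONE. */
    static int beginning(CharSequence text, int offset) {
        if (!touchesPunctuation(text, offset)) return DONE;
        int start = offset;
        while (start > 0) {
            int cp = Character.codePointBefore(text, start);
            if (!isPunctuation(cp)) break;
            start -= Character.charCount(cp);
        }
        return start;
    }

    /** Index just past the last character of the punctuation group at offset, or DONE. */
    static int end(CharSequence text, int offset) {
        if (!touchesPunctuation(text, offset)) return DONE;
        int end = offset;
        while (end < text.length()) {
            int cp = Character.codePointAt(text, end);
            if (!isPunctuation(cp)) break;
            end += Character.charCount(cp);
        }
        return end;
    }

    public static void main(String[] args) {
        String text = "Wait... what?!";
        // Offset 5 is inside the "..." group spanning indices 4 to 6.
        System.out.println(beginning(text, 5) + ".." + end(text, 5));
    }
}
```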
-
isAfterPunctuation
public boolean isAfterPunctuation(int offset)
Indicates if the provided offset is after a punctuation character as defined by isPunctuation(int).
Parameters:
offset - the offset to check from.
Returns:
Whether the offset is after a punctuation character.
-
isOnPunctuation
public boolean isOnPunctuation(int offset)
Indicates if the provided offset is at a punctuation character as defined by isPunctuation(int).
Parameters:
offset - the offset to check from.
Returns:
Whether the offset is at a punctuation character.
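The difference between "on" and "after" a punctuation character comes down to which neighboring code point is inspected. The sketch below assumes that punctuation means the Unicode general category P and tests it with a bitmask over java.lang.Character type constants. The class PunctuationProbe and the constant P_MASK are hypothetical; P_MASK is only loosely modeled on the idea of a category mask and is not claimed to equal GC_P_MASK.

```java
// Hypothetical sketch, not the WordIterator implementation.
final class PunctuationProbe {

    // Assumption: one bit per punctuation general category (all values < 32).
    static final int P_MASK =
            (1 << Character.CONNECTOR_PUNCTUATION)
          | (1 << Character.DASH_PUNCTUATION)
          | (1 << Character.START_PUNCTUATION)
          | (1 << Character.END_PUNCTUATION)
          | (1 << Character.INITIAL_QUOTE_PUNCTUATION)
          | (1 << Character.FINAL_QUOTE_PUNCTUATION)
          | (1 << Character.OTHER_PUNCTUATION);

    static boolean isPunctuation(int cp) {
        return ((1 << Character.getType(cp)) & P_MASK) != 0;
    }

    /** True if the code point starting at offset is punctuation. */
    static boolean isOn(CharSequence text, int offset) {
        return offset >= 0 && offset < text.length()
                && isPunctuation(Character.codePointAt(text, offset));
    }

    /** True if the code point ending just before offset is punctuation. */
    static boolean isAfter(CharSequence text, int offset) {
        return offset > 0 && offset <= text.length()
                && isPunctuation(Character.codePointBefore(text, offset));
    }

    public static void main(String[] args) {
        String text = "a,b";
        System.out.println(isOn(text, 1));    // the comma starts at offset 1
        System.out.println(isAfter(text, 2)); // the comma ends at offset 2
    }
}
```

Out-of-range offsets simply yield false in this sketch, which avoids index exceptions at the ends of the text.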
-
isMidWordPunctuation
public static boolean isMidWordPunctuation(Locale locale, int codePoint)
Indicates if the codepoint is a mid-word-only punctuation.
At the moment, this is locale-independent, and includes all the characters in the MidLetter, MidNumLet, and Single_Quote classes of the Unicode word breaking algorithm (see UAX #29 "Unicode Text Segmentation" at http://unicode.org/reports/tr29/). According to rules WB6 and WB7 of UAX #29, these characters prevent word breaks when they occur in the middle of a word, but they become word breaks when they occur at the end of a word (according to rule WB999, which breaks words anywhere that is not otherwise prohibited).
Parameters:
locale - the locale to consider the codepoint in. Presently ignored.
codePoint - the codepoint to check.
Returns:
True if the codepoint is a mid-word punctuation.
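As a rough illustration of the classification above, the sketch below hardcodes a small, deliberately incomplete sample of code points from the three Word_Break classes. The class MidWordSample is hypothetical; the authoritative, complete sets live in the Unicode Character Database (WordBreakProperty.txt), and the real method is not claimed to be implemented this way.

```java
import java.util.Set;

// Hypothetical, partial sample; not the actual isMidWordPunctuation logic.
final class MidWordSample {

    // A few well-known members of each class (incomplete on purpose).
    private static final Set<Integer> SAMPLE = Set.of(
            0x0027, // APOSTROPHE                   (Single_Quote)
            0x002E, // FULL STOP                    (MidNumLet)
            0x2019, // RIGHT SINGLE QUOTATION MARK  (MidNumLet)
            0x003A, // COLON                        (MidLetter)
            0x00B7  // MIDDLE DOT                   (MidLetter)
    );

    /** True if the code point is in the sample of mid-word punctuation. */
    static boolean isMidWordPunctuation(int codePoint) {
        return SAMPLE.contains(codePoint);
    }

    public static void main(String[] args) {
        // The apostrophe in "can't" sits between letters, so WB6/WB7 keep the word whole.
        System.out.println(isMidWordPunctuation('\''));
        // COMMA belongs to the MidNum class, which is not one of the three classes listed.
        System.out.println(isMidWordPunctuation(','));
    }
}
```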
-
isPunctuation
public static boolean isPunctuation(int cp)
-