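String is one of the most commonly used data structures in development, and Apache, Sun and others all provide StringUtils-style utility packages for it. The JDK also ships its own set of String methods, which make development much easier, but operations such as join and split are not very convenient to use. For this reason Guava provides four string-processing tools: the joiner (Joiner), the splitter (Splitter), the matcher (CharMatcher) and the formatter (CaseFormat). This article covers the basic usage of each in its own section.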
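To make the inconvenience concrete, here is a small JDK-only sketch (the class and method names are illustrative and do not come from any library). It shows two behaviors of the built-in methods that tend to surprise: String.join writes null elements as the literal text "null", and String.split drops trailing empty strings while keeping inner ones and never trimming.

```java
import java.util.Arrays;
import java.util.List;

public class JdkStringPainPoints {

    // String.join has no option to skip nulls: a null element becomes the text "null".
    public static String joinWithNull() {
        return String.join("-", Arrays.asList("aaa", null, "ccc"));
    }

    // String.split removes trailing empty strings, keeps inner ones, and does not trim.
    public static List<String> splitWithEmpties() {
        return Arrays.asList("a, b,,c,,".split(","));
    }

    public static void main(String[] args) {
        System.out.println(joinWithNull());     // aaa-null-ccc
        System.out.println(splitWithEmpties()); // four elements: "a", " b", "", "c"
    }
}
```

Joiner and Splitter make each of these choices explicit through methods such as skipNulls, useForNull, omitEmptyStrings and trimResults.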
-
Joiner
S.N. | Method | Description |
---|---|---|
1 | static Joiner on(char separator); static Joiner on(String separator) | Creates a Joiner that places separator between consecutive elements |
2 | <A extends Appendable> A appendTo(A appendable, Iterable<?> parts) throws IOException; <A extends Appendable> A appendTo(A appendable, Iterator<?> parts) throws IOException; <A extends Appendable> A appendTo(A appendable, Object[] parts) throws IOException; <A extends Appendable> A appendTo(A appendable, @Nullable Object first, @Nullable Object second, Object... rest) throws IOException | Joins parts with the separator and appends the result to appendable |
3 | StringBuilder appendTo(StringBuilder builder, Iterable<?> parts); StringBuilder appendTo(StringBuilder builder, Iterator<?> parts); StringBuilder appendTo(StringBuilder builder, @Nullable Object first, @Nullable Object second, Object... rest); StringBuilder appendTo(StringBuilder builder, Object[] parts) | Joins parts with the separator, appends the result to builder, and returns builder |
4 | String join(Iterable<?> parts); String join(Iterator<?> parts); String join(@Nullable Object first, @Nullable Object second, Object... rest); String join(Object[] parts) | Joins parts with the separator and returns the resulting String |
5 | Joiner skipNulls() | Returns a Joiner that skips null elements when joining |
6 | Joiner useForNull(final String nullText) | Returns a Joiner that substitutes nullText for null elements when joining |
7 | Joiner.MapJoiner withKeyValueSeparator(char keyValueSeparator); Joiner.MapJoiner withKeyValueSeparator(String keyValueSeparator) | Returns a MapJoiner for joining Map entries; keyValueSeparator is placed between each key and its value |
Sample code:
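// Imports assume JUnit 4; adjust for your test framework.
import static org.junit.Assert.assertEquals;

import com.google.common.base.Joiner;
import com.google.common.collect.Lists;
import com.google.common.collect.Maps;
import java.util.List;
import java.util.Map;
import org.junit.Test;

public class JoinerTest {
    @Test
    public void joinTest() {
        /* skipNulls() leaves null elements out of the result */
        List<String> list = Lists.newArrayList("aaa", "bbb", null, "ccc");
        String joinStr = Joiner.on("-").skipNulls().join(list);
        assertEquals("aaa-bbb-ccc", joinStr);
    }

    @Test
    public void useForNullTest() {
        /* useForNull() substitutes the given text for null elements */
        List<String> list = Lists.newArrayList("aaa", "bbb", null, "ccc");
        String joinStr = Joiner.on("-").useForNull("null").join(list);
        assertEquals("aaa-bbb-null-ccc", joinStr);
    }

    @Test
    public void appendToTest() {
        /* appendTo() appends the joined text to an existing StringBuilder and returns it */
        List<String> list = Lists.newArrayList("aaa", "bbb", null, "ccc");
        StringBuilder sb = new StringBuilder("this is: ");
        StringBuilder result = Joiner.on("-").skipNulls().appendTo(sb, list);
        assertEquals("this is: aaa-bbb-ccc", result.toString());
    }

    @Test
    public void withKeyValueSeparatorTest() {
        /* HashMap does not guarantee iteration order, so the output is printed, not asserted */
        Map<Integer, String> idNameMap = Maps.newHashMap();
        idNameMap.put(1, "Michael");
        idNameMap.put(2, "Mary");
        idNameMap.put(3, "Jane");
        String result = Joiner.on("\n").withKeyValueSeparator(":").join(idNameMap);
        System.out.println(result);
    }
}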
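withKeyValueSeparatorTest only prints its result. As a rough illustration of the string shape a MapJoiner produces, here is a JDK-only sketch built on streams and a LinkedHashMap for a predictable order. The class and method names are hypothetical, and this is a comparison aid rather than Guava's implementation; in particular it makes no attempt to mirror how Guava treats null keys or values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class MapJoinSketch {

    // Joins entries as key + keyValueSeparator + value, with entrySeparator between entries.
    public static <K, V> String join(Map<K, V> map, String entrySeparator, String keyValueSeparator) {
        return map.entrySet().stream()
                .map(e -> e.getKey() + keyValueSeparator + e.getValue())
                .collect(Collectors.joining(entrySeparator));
    }

    // Same data as the test above, but in an insertion-ordered map.
    public static String demo() {
        Map<Integer, String> idNameMap = new LinkedHashMap<>();
        idNameMap.put(1, "Michael");
        idNameMap.put(2, "Mary");
        idNameMap.put(3, "Jane");
        return join(idNameMap, "\n", ":");
    }

    public static void main(String[] args) {
        // Prints three lines: 1:Michael, 2:Mary, 3:Jane
        System.out.println(demo());
    }
}
```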
-
Splitter
S.N. | Method | Description |
---|---|---|
1 | static Splitter on(char separator); static Splitter on(final CharMatcher separatorMatcher); static Splitter on(Pattern separatorPattern); static Splitter on(final String separator); static Splitter onPattern(String separatorPattern) | Creates a Splitter that splits on the given separator (a character, a CharMatcher, a regular expression, or a literal string) |
2 | static Splitter fixedLength(final int length) | Creates a Splitter that cuts the input into pieces of length characters; the last piece may be shorter |
3 | Splitter omitEmptyStrings() | Returns a Splitter that drops empty strings from the results |
4 | Splitter trimResults() | Returns a Splitter that trims leading and trailing whitespace from each piece |
5 | Splitter trimResults(CharMatcher trimmer) | Returns a Splitter that removes leading and trailing characters matched by trimmer from each piece |
6 | Iterable<String> split(final CharSequence sequence) | Splits sequence and returns the pieces as an Iterable<String> |
7 | List<String> splitToList(CharSequence sequence) | Splits sequence and returns the pieces as an immutable List<String> |
8 | Splitter.MapSplitter withKeyValueSeparator(char separator); Splitter.MapSplitter withKeyValueSeparator(Splitter keyValueSplitter); Splitter.MapSplitter withKeyValueSeparator(String separator) | Returns a MapSplitter that splits a String into a Map; separator is the delimiter between each key and its value |
Sample code:
// Imports assume JUnit 4, Hamcrest (assertThat/contains) and AssertJ (assertThatThrownBy).
import static org.assertj.core.api.Assertions.assertThatThrownBy;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.contains;
import static org.junit.Assert.assertEquals;

import com.google.common.base.CharMatcher;
import com.google.common.base.Splitter;
import com.google.common.collect.Lists;
import java.util.List;
import java.util.Map;
import org.junit.Test;

public class SplitterTest {
    @Test
    public void splitStringToIterableWithDelimiter() {
        /* Create the splitter from a char and split the String into an Iterable */
        String str = "this, is , , random , text,";
        List<String> result = Lists.newArrayList(
                Splitter.on(',').omitEmptyStrings().trimResults().split(str));
        assertThat(result, contains("this", "is", "random", "text"));
        /* trimResults(CharMatcher) strips the matched characters instead of whitespace */
        String str1 = "~?~this, is~~ , , random , text,";
        result = Splitter.on(',').omitEmptyStrings()
                .trimResults(CharMatcher.anyOf("~? ")).splitToList(str1);
        System.out.println(result);
        assertThat(result, contains("this", "is", "random", "text"));
    }

    @Test
    public void splitStringToListWithDelimiter() {
        /* Create the splitter from a char and split the String directly into a List */
        String str = "this, is , , random , text,";
        List<String> result = Splitter.on(',').omitEmptyStrings().trimResults().splitToList(str);
        assertThat(result, contains("this", "is", "random", "text"));
        /* The returned list does not support add or remove */
        assertThatThrownBy(() -> result.add("haha"))
                .isInstanceOf(UnsupportedOperationException.class)
                .hasNoCause();
    }

    @Test
    public void splitStringToListWithCharMatcher() {
        /* Create the splitter from a CharMatcher */
        String str = "a,b;c.d,e.f),g,h.i;j.1,2.3;";
        List<String> result = Splitter.on(CharMatcher.anyOf(";,.)"))
                .omitEmptyStrings().trimResults().splitToList(str);
        assertEquals(13, result.size());
    }

    @Test
    public void splitStringToListWithRegularExpression() {
        /* Create the splitter from a regular expression.
           Inside a character class '|' is a literal, so "[.,]" is enough. */
        String str = "apple.banana,,orange,,.";
        List<String> result = Splitter.onPattern("[.,]")
                .omitEmptyStrings().trimResults().splitToList(str);
        assertEquals(3, result.size());
    }

    @Test
    public void splitStringToListWithFixedLength() {
        /* Cut the String into fixed-length pieces; the last piece may be shorter.
           Spaces count as characters, so the second piece is "lo " with a trailing space. */
        String str = "Hello world";
        List<String> result = Splitter.fixedLength(3).splitToList(str);
        assertThat(result, contains("Hel", "lo ", "wor", "ld"));
    }

    @Test
    public void splitStringToMap() {
        /* Split a String into a Map */
        String str = "John=first,Adam=second";
        Map<String, String> result = Splitter.on(",")
                .withKeyValueSeparator("=")
                .split(str);
        assertEquals("first", result.get("John"));
        assertEquals("second", result.get("Adam"));
    }
}
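A detail of fixedLength that is easy to miss: it counts every character, whitespace included, so a piece can begin or end with a space unless trimResults() is also applied. The JDK-only sketch below models just that chunking step. The class and method names are hypothetical, it is not Guava's implementation, and edge cases such as empty input are not modeled.

```java
import java.util.ArrayList;
import java.util.List;

public class FixedLengthSketch {

    // Cuts the input every `length` characters; the last piece may be shorter.
    public static List<String> fixedLength(String input, int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        List<String> pieces = new ArrayList<>();
        for (int start = 0; start < input.length(); start += length) {
            int end = Math.min(start + length, input.length());
            pieces.add(input.substring(start, end));
        }
        return pieces;
    }

    public static void main(String[] args) {
        // "Hello world" in pieces of 3: "Hel", "lo " (note the space), "wor", "ld"
        for (String piece : fixedLength("Hello world", 3)) {
            System.out.println("[" + piece + "]");
        }
    }
}
```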
-
CharMatcher
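CharMatcher is Guava's own matcher abstraction. A CharMatcher instance can be thought of as representing a class of characters: it can match characters in a CharSequence and perform operations on the matched characters, such as trim, collapse, remove and retain. Guava is now at version 25, and many of CharMatcher's methods and static constants have been deprecated. The table below lists the deprecated members and possible alternatives:
Deprecated static constant | Corresponding deprecated static method | Alternative |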
---|---|---|
CharMatcher.ANY | – | CharMatcher.any() |
CharMatcher.ASCII | – | CharMatcher.ascii() |
CharMatcher.BREAKING_WHITESPACE | – | CharMatcher.breakingWhitespace() |
CharMatcher.DIGIT | CharMatcher.digit() | CharMatcher.forPredicate(Character::isDigit) |
CharMatcher.INVISIBLE | CharMatcher.invisible() | – |
CharMatcher.JAVA_DIGIT | CharMatcher.javaDigit() | CharMatcher.forPredicate(Character::isDigit) |
CharMatcher.JAVA_ISO_CONTROL | – | CharMatcher.javaIsoControl() |
CharMatcher.JAVA_LETTER | CharMatcher.javaLetter() | CharMatcher.forPredicate(Character::isLetter) |
CharMatcher.JAVA_LETTER_OR_DIGIT | CharMatcher.javaLetterOrDigit() | CharMatcher.forPredicate(Character::isLetterOrDigit) |
CharMatcher.JAVA_LOWER_CASE | CharMatcher.javaLowerCase() | CharMatcher.forPredicate(Character::isLowerCase) |
CharMatcher.JAVA_UPPER_CASE | CharMatcher.javaUpperCase() | CharMatcher.forPredicate(Character::isUpperCase) |
CharMatcher.NONE | – | CharMatcher.none() |
CharMatcher.SINGLE_WIDTH | CharMatcher.singleWidth() | – |
CharMatcher.WHITESPACE | – | CharMatcher.whitespace() |
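One thing to keep in mind about the alternatives built on Character::isDigit: that method follows Unicode, so a matcher created from it accepts digits outside 0-9. The JDK-only sketch below (class and method names are illustrative) contrasts the Unicode check with a plain ASCII range check, which is roughly what CharMatcher.inRange('0', '9') expresses.

```java
public class DigitCheckSketch {

    // Unicode-aware: true for any character that Character.isDigit classifies as a digit.
    public static boolean isUnicodeDigit(char c) {
        return Character.isDigit(c);
    }

    // ASCII-only: true just for '0' through '9'.
    public static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        char arabicIndicThree = '\u0663'; // ARABIC-INDIC DIGIT THREE
        System.out.println(isUnicodeDigit(arabicIndicThree)); // true
        System.out.println(isAsciiDigit(arabicIndicThree));   // false
        System.out.println(isAsciiDigit('7'));                // true
    }
}
```

If only ASCII digits should match, the range check is the safer choice.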
Common methods:
S.N. | Method | Description |
---|---|---|
1 | static CharMatcher any() | Returns a matcher that matches any character |
2 | static CharMatcher anyOf(CharSequence sequence) | Returns a matcher that matches any character contained in sequence |
3 | static CharMatcher ascii() | Returns a matcher that matches ASCII characters |
4 | static CharMatcher breakingWhitespace() | Returns a matcher that matches whitespace at which a line may be broken (non-breaking whitespace such as "\u00a0" is excluded) |
5 | static CharMatcher forPredicate(Predicate<? super Character> predicate) | Returns a matcher that matches every character for which the predicate's apply method returns true |
6 | static CharMatcher inRange(char startInclusive, char endInclusive) | Returns a matcher that matches every character from startInclusive to endInclusive, both ends included |
7 | static CharMatcher is(char match) | Returns a matcher that matches only the single character match |
8 | static CharMatcher isNot(char match) | Returns a matcher that matches every character except match |
9 | static CharMatcher javaIsoControl() | Returns a matcher that matches ISO control characters, as determined by Character.isISOControl(char) |
10 | static CharMatcher none() | Returns a matcher that matches no character; the opposite of any() |
11 | static CharMatcher noneOf(CharSequence sequence) | Returns a matcher that matches every character not contained in sequence |
12 | static CharMatcher whitespace() | Returns a matcher that matches whitespace characters |
13 | CharMatcher and(CharMatcher other) | Returns a matcher that matches a character only if both this matcher and other match it |
14 | CharMatcher negate() | Returns a matcher that matches exactly the characters this matcher does not match |
15 | CharMatcher or(CharMatcher other) | Returns a matcher that matches a character if either this matcher or other matches it |
16 | CharMatcher precomputed() | Returns a matcher that behaves the same but is faster to query. Precomputation itself takes time, so it is only worthwhile when the matcher will be used thousands of times |
17 | String collapseFrom(CharSequence sequence, char replacement) | Replaces each run of consecutive matched characters in sequence with a single replacement |
18 | int countIn(CharSequence sequence) | Returns the number of matched characters in sequence |
19 | int indexIn(CharSequence sequence); int indexIn(CharSequence sequence, int start) | Returns the index of the first matched character in sequence, optionally starting the search at index start; returns -1 if nothing matches |
20 | int lastIndexIn(CharSequence sequence) | Returns the index of the last matched character in sequence |
21 | boolean matchesAllOf(CharSequence sequence) | Returns true if every character in sequence is matched |
22 | boolean matchesAnyOf(CharSequence sequence) | Returns true if at least one character in sequence is matched |
23 | boolean matchesNoneOf(CharSequence sequence) | Returns true if no character in sequence is matched |
24 | String removeFrom(CharSequence sequence) | Returns sequence with all matched characters removed |
25 | String replaceFrom(CharSequence sequence, char replacement); String replaceFrom(CharSequence sequence, CharSequence replacement) | Returns sequence with each matched character replaced by replacement |
26 | String retainFrom(CharSequence sequence) | Returns only the matched characters of sequence, in their original order |
27 | String trimAndCollapseFrom(CharSequence sequence, char replacement) | Removes matched characters from both ends of sequence, then collapses each remaining run of matched characters into a single replacement |
28 | String trimFrom(CharSequence sequence); String trimLeadingFrom(CharSequence sequence); String trimTrailingFrom(CharSequence sequence) | Removes matched characters from both ends, from the start only, or from the end only, respectively |
Sample code:
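// Imports assume JUnit 4 and the JSR-305 @Nullable annotation; adjust to your setup.
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import com.google.common.base.CharMatcher;
import com.google.common.base.Predicate;
import javax.annotation.Nullable;
import org.junit.Test;

public class CharMatcherTest {
    @Test
    public void retainFromTest() {
        String input = "H*el.lo,}12";
        /* The following static constant and method are deprecated and should not be used:
        CharMatcher matcher = CharMatcher.JAVA_LETTER_OR_DIGIT;
        matcher = CharMatcher.javaLetterOrDigit();*/
        /* Initialize the matcher in one of the following ways instead */
        /*CharMatcher matcher = new CharMatcher() {
            @Override
            public boolean matches(char c) {
                return Character.isLetterOrDigit(c);
            }
        };*/
        /*matcher = CharMatcher.forPredicate(Predicates.compose(Predicates.containsPattern("\\w"), Functions.toStringFunction()));*/
        /*Predicate<Character> isLetterOrDigit = new Predicate<Character>() {
            @Override
            public boolean apply(@Nullable Character character) {
                return Character.isLetterOrDigit(character);
            }
        };
        matcher = CharMatcher.forPredicate(isLetterOrDigit);*/
        CharMatcher matcher = CharMatcher.forPredicate(Character::isLetterOrDigit);
        String result = matcher.retainFrom(input);
        assertEquals("Hello12", result);
    }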

    @Test
    public void andTest() {
        /* Returns a matcher that is the logical AND of two matchers */
        String input = "H*el.lo,}12";
        CharMatcher matcher0 = CharMatcher.forPredicate(Character::isLetter);
        CharMatcher matcher1 = CharMatcher.forPredicate(Character::isLowerCase);
        String result = matcher0.and(matcher1).retainFrom(input);
        assertEquals("ello", result);
    }

    @Test
    public void anyTest() {
        /* Matches any character */
        String input = "H*el.lo,}12";
        CharMatcher matcher = CharMatcher.any();
        String result = matcher.retainFrom(input);
        assertEquals("H*el.lo,}12", result);
    }

    @Test
    public void anyOfTest() {
        /* Matches any character contained in the given CharSequence */
        String input = "H*el.lo,}12";
        CharMatcher matcher = CharMatcher.anyOf("Hel");
        String result = matcher.removeFrom(input);
        assertEquals("*.o,}12", result);
    }

    @Test
    public void asciiTest() {
        /* Matches ASCII characters */
        String input = "あH*el.lo,}12";
        CharMatcher matcher = CharMatcher.ascii();
        String result = matcher.retainFrom(input);
        assertEquals("H*el.lo,}12", result);
    }

    @Test
    public void breakingWhitespaceTest() {
        /* Matches whitespace at which a line may be broken
           (non-breaking whitespace such as "\u00a0" is excluded) */
        String input = " this is test ";
        CharMatcher matcher = CharMatcher.breakingWhitespace();
        String result = matcher.removeFrom(input);
        assertEquals("thisistest", result);
    }

    @Test
    public void collapseTest() {
        /* Replaces each run of consecutive matched characters with a single replacement */
        String input = "   hel    lo  ";
        String result = CharMatcher.is(' ').collapseFrom(input, '-');
        assertEquals("-hel-lo-", result);
        /* First trim (remove matched characters from both ends), then collapse */
        result = CharMatcher.is(' ').trimAndCollapseFrom(input, '-');
        assertEquals("hel-lo", result);
    }

    @Test
    public void countInTest() {
        /* Counts the characters matched by the matcher */
        String input = "H*el.lo,}12";
        CharMatcher matcher = CharMatcher.forPredicate(Character::isLetterOrDigit);
        int count = matcher.countIn(input);
        assertEquals(7, count);
    }

    @Test
    public void forPredicateTest() {
        /* Creates a matcher from a predicate, as a method reference or an anonymous class */
        CharMatcher matcher = CharMatcher.forPredicate(Character::isLetterOrDigit);
        Predicate<Character> isLetterOrDigit = new Predicate<Character>() {
            @Override
            public boolean apply(@Nullable Character character) {
                return Character.isLetterOrDigit(character);
            }
        };
        CharMatcher matcher1 = CharMatcher.forPredicate(isLetterOrDigit);
        /* Both matchers behave the same */
        assertEquals(matcher.retainFrom("a-1"), matcher1.retainFrom("a-1"));
    }

    @Test
    public void indexInTest() {
        /* Returns the index of the first matched character */
        String input = "**el.lo,}12";
        CharMatcher matcher = CharMatcher.forPredicate(Character::isLetterOrDigit);
        int index = matcher.indexIn(input);
        assertEquals(2, index);
        /* Same search, starting at index 4 */
        index = matcher.indexIn(input, 4);
        assertEquals(5, index);
    }

    @Test
    public void inRangeTest() {
        /* Creates a range matcher */
        String input = "a, c, z, 1, 2";
        int result = CharMatcher.inRange('a', 'h').countIn(input);
        assertEquals(2, result);
    }

    @Test
    public void isTest() {
        /* Creates a matcher from a char; it matches that single character */
        String input = "a, c, z, 1, 2";
        int result = CharMatcher.is(',').countIn(input);
        assertEquals(4, result);
    }

    @Test
    public void isNotTest() {
        /* Matches every character except the argument; the opposite of is() */
        String input = "a, c, z, 1, 2";
        String result = CharMatcher.isNot(',').removeFrom(input);
        assertEquals(",,,,", result);
    }

    @Test
    public void javaIsoControlTest() {
        /* Matches ISO control characters such as tab, newline and backspace */
        String input = "ab\tcd\nef\bg";
        String result = CharMatcher.javaIsoControl().removeFrom(input);
        assertEquals("abcdefg", result);
    }

    @Test
    public void lastIndexInTest() {
        /* Returns the index of the last matched character */
        String input = "**e,l.lo,}12";
        CharMatcher matcher = CharMatcher.is(',');
        int index = matcher.lastIndexIn(input);
        assertEquals(8, index);
    }

    @Test
    public void matchesAllOfTest() {
        /* Checks whether every character of the CharSequence is matched */
        String input = "**e,l.lo,}12";
        CharMatcher matcher = CharMatcher.is(',');
        assertFalse(matcher.matchesAllOf(input));
    }

    @Test
    public void matchesAnyOfTest() {
        /* Checks whether at least one character of the CharSequence is matched */
        String input = "**e,l.lo,}12";
        CharMatcher matcher = CharMatcher.is(',');
        assertTrue(matcher.matchesAnyOf(input));
    }

    @Test
    public void matchesNoneOfTest() {
        /* Checks whether no character of the CharSequence is matched */
        String input = "**e,l.lo,}12";
        CharMatcher matcher = CharMatcher.is('?');
        assertTrue(matcher.matchesNoneOf(input));
    }

    @Test
    public void negateTest() {
        /* Returns the matcher that is the opposite of the current one */
        String input = "あH*el.lo,}12";
        /* This matcher matches non-ASCII characters */
        CharMatcher matcher = CharMatcher.ascii().negate();
        String result = matcher.retainFrom(input);
        assertEquals("あ", result);
    }

    @Test
    public void noneTest() {
        /* Matches no character; the opposite of any() */
        String input = "H*el.lo,}12";
        CharMatcher matcher = CharMatcher.none();
        String result = matcher.retainFrom(input);
        assertEquals("", result);
    }

    @Test
    public void noneOfTest() {
        /* Matches every character not in the CharSequence; the opposite of anyOf() */
        String input = "H*el.lo,}12";
        CharMatcher matcher = CharMatcher.noneOf("Hel");
        String result = matcher.removeFrom(input);
        assertEquals("Hell", result);
    }

    @Test
    public void orTest() {
        /* Returns a matcher that is the logical OR of two matchers */
        String input = "H*el.lo,}12";
        CharMatcher matcher0 = CharMatcher.forPredicate(Character::isLetter);
        CharMatcher matcher1 = CharMatcher.forPredicate(Character::isDigit);
        String result = matcher0.or(matcher1).retainFrom(input);
        assertEquals("Hello12", result);
    }

    @Test
    public void trimFromTest() {
        String input = "---hello,,,";
        /* Remove matched characters from the start */
        String result = CharMatcher.is('-').trimLeadingFrom(input);
        assertEquals("hello,,,", result);
        /* Remove matched characters from the end */
        result = CharMatcher.is(',').trimTrailingFrom(input);
        assertEquals("---hello", result);
        /* Remove matched characters from both ends */
        result = CharMatcher.anyOf("-,").trimFrom(input);
        assertEquals("hello", result);
    }

    @Test
    public void whitespaceTest() {
        /* Matches all whitespace characters */
        String input = "   hel    lo  ";
        String result = CharMatcher.whitespace().collapseFrom(input, '-');
        assertEquals("-hel-lo-", result);
    }
}
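The difference between collapsing and trim-then-collapse (see collapseTest above) is easier to see when the matched character is not a space. The JDK-only sketch below models the two operations with an IntPredicate standing in for the matcher. The class and method names are hypothetical, and this is an illustration of the semantics described in the table, not Guava's implementation.

```java
import java.util.function.IntPredicate;

public class CollapseSketch {

    // Replaces each run of matching characters with a single replacement character.
    public static String collapse(String input, IntPredicate matcher, char replacement) {
        StringBuilder out = new StringBuilder();
        boolean inRun = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (matcher.test(c)) {
                if (!inRun) {
                    out.append(replacement); // first character of a run: emit one replacement
                    inRun = true;
                }
            } else {
                out.append(c);
                inRun = false;
            }
        }
        return out.toString();
    }

    // Strips matching characters from both ends, then collapses the remaining runs.
    public static String trimAndCollapse(String input, IntPredicate matcher, char replacement) {
        int start = 0;
        int end = input.length();
        while (start < end && matcher.test(input.charAt(start))) {
            start++;
        }
        while (end > start && matcher.test(input.charAt(end - 1))) {
            end--;
        }
        return collapse(input.substring(start, end), matcher, replacement);
    }

    public static void main(String[] args) {
        System.out.println(collapse("--a---b-", c -> c == '-', '+'));        // +a+b+
        System.out.println(trimAndCollapse("--a---b-", c -> c == '-', '+')); // a+b
    }
}
```

Note that the trimming step removes whatever the matcher matches (hyphens here), not just spaces.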
-
CaseFormat
CaseFormat is a formatter that converts between different ASCII case conventions. It supports the following formats:
Format | Example |
---|---|
LOWER_CAMEL | lowerCamel |
LOWER_HYPHEN | lower-hyphen |
LOWER_UNDERSCORE | lower_underscore |
UPPER_CAMEL | UpperCamel |
UPPER_UNDERSCORE | UPPER_UNDERSCORE |
Common methods:
S.N. | Method | Description |
---|---|---|
1 | Converter<String,String> converterTo(CaseFormat targetFormat) | Returns a Converter that converts Strings from this (source) format to targetFormat |
2 | String to(CaseFormat format, String str) | Converts str from this (source) format to the target format |
Sample code:
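// Imports assume JUnit 4; adjust for your test framework.
import static org.junit.Assert.assertEquals;

import com.google.common.base.CaseFormat;
import com.google.common.base.Converter;
import org.junit.Test;

public class CaseFormatTest {
    @Test
    public void converterToTest() {
        /* Returns a Converter that converts Strings from the source format to targetFormat */
        Converter<String, String> camelConverter = CaseFormat.LOWER_CAMEL.converterTo(CaseFormat.UPPER_UNDERSCORE);
        /* The input must be in the source format, here lowerCamel */
        String input = "inputCamel";
        String result = camelConverter.convert(input);
        assertEquals("INPUT_CAMEL", result);
    }

    @Test
    public void toTest() {
        /* Converts str from the source format to the target format */
        String result = CaseFormat.LOWER_HYPHEN.to(CaseFormat.LOWER_CAMEL, "foo-bar");
        assertEquals("fooBar", result);
    }
}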
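To show what a conversion such as LOWER_HYPHEN to LOWER_CAMEL involves, here is a hand-written JDK-only sketch for that single pair of formats. The class and method names are hypothetical and the code is only an approximation of the idea (split into words at the hyphen, then re-case each word); it is not how CaseFormat is implemented.

```java
public class HyphenToCamelSketch {

    // Converts lower-hyphen text such as "foo-bar" to lowerCamel "fooBar".
    public static String lowerHyphenToLowerCamel(String input) {
        StringBuilder out = new StringBuilder();
        boolean upperNext = false;
        for (char c : input.toCharArray()) {
            if (c == '-') {
                upperNext = true; // drop the hyphen; the next letter starts a new word
            } else if (upperNext) {
                out.append(Character.toUpperCase(c));
                upperNext = false;
            } else {
                out.append(Character.toLowerCase(c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(lowerHyphenToLowerCamel("foo-bar"));     // fooBar
        System.out.println(lowerHyphenToLowerCamel("foo-bar-baz")); // fooBarBaz
    }
}
```

The sketch also makes clear why the source format matters: the word boundaries are found using the source convention, so input written in a different convention will not be split as expected.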