Java Source Code Analysis: String in Detail

Introduction

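The String class is declared final, so it cannot be subclassed. A String is also immutable: once created it never changes, and any operation that appears to modify it actually creates a new String. For that reason, avoid building a string by repeated concatenation inside a loop; use a StringBuilder instead.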
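
To make the cost of loop concatenation concrete, here is a minimal sketch. The class and method names (`ConcatDemo`, `joinWithConcat`, `joinWithBuilder`) are made up for this illustration and are not part of the JDK.

```java
final class ConcatDemo {
    // Each += creates a new String and copies every character built so far.
    static String joinWithConcat(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            if (i > 0) s += ",";
            s += i;
        }
        return s;
    }

    // StringBuilder appends into one growable buffer; only toString() creates a String.
    static String joinWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) sb.append(',');
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "hello";
        String longer = original.concat(" world"); // returns a new String
        System.out.println(original);               // hello (unchanged)
        System.out.println(longer);                 // hello world
        System.out.println(joinWithBuilder(4));     // 0,1,2,3
    }
}
```

Both methods return the same text; the difference is only in how many intermediate objects are created along the way.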

Where Strings live in memory

    String str1 = new String("hello");
    String str2 = new String("hello");
    String str3 = "hello";
    String str4 = "hello";
    String str5 = "he"+"llo";
    String str6 = "he";
    String str7 = "llo";
    System.out.println(str1==str2);         //false
    System.out.println(str1==str3);         //false
    System.out.println(str3==str4);         //true
    System.out.println(str4 == str5);       //true
    System.out.println(str3=="hello");      //true
    System.out.println(str4==(str6+str7));  //false

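Creating a String with new always allocates a fresh object on the heap, so str1 and str2 have different addresses. A literal such as str3 is first looked up in the string constant pool: if an equal string is already there, the existing reference is reused; otherwise the literal is added to the pool. "he"+"llo" is a constant expression, so the compiler folds it into "hello" at compile time and str5 refers to the same pooled object. str6+str7, by contrast, uses non-final variables and is evaluated by the JVM at run time, which creates a new object.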
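
The pool rules above can be checked with a short sketch. `PoolDemo` and its method names are hypothetical; `String.intern()` is the standard JDK method that returns the pooled instance of a string.

```java
final class PoolDemo {
    // new String(...) lives outside the pool; intern() returns the pooled instance.
    static boolean internReturnsPooled() {
        String literal = "hello";
        String heap = new String("hello");
        return literal != heap && literal == heap.intern();
    }

    // final variables initialized with literals are compile-time constants,
    // so he + llo is folded to "hello" by the compiler.
    static boolean finalPartsAreFolded() {
        final String he = "he";
        final String llo = "llo";
        return "hello" == (he + llo);
    }

    // A non-final variable forces the concatenation to happen at run time.
    static boolean runtimeConcatIsNewObject() {
        String part = "he";
        String joined = part + "llo";
        return "hello" != joined && "hello".equals(joined);
    }

    public static void main(String[] args) {
        System.out.println(internReturnsPooled());      // true
        System.out.println(finalPartsAreFolded());      // true
        System.out.println(runtimeConcatIsNewObject()); // true
    }
}
```

In application code, compare string contents with equals; == only tells you whether two references point at the same object.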

Source code

Inheritance

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {}
    

String implements the Serializable, Comparable<String>, and CharSequence interfaces.
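
As a rough illustration of what two of these interfaces buy you in practice: Comparable<String> gives Strings a natural order for sorting, and CharSequence lets one method accept String and StringBuilder alike. `InterfaceDemo`, `countDigits`, and `sortedCopy` are names invented for this sketch.

```java
import java.util.Arrays;

final class InterfaceDemo {
    // Accepts any CharSequence: String, StringBuilder, StringBuffer, ...
    static int countDigits(CharSequence text) {
        int digits = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isDigit(text.charAt(i))) digits++;
        }
        return digits;
    }

    // Arrays.sort uses the natural order defined by String.compareTo.
    static String[] sortedCopy(String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(countDigits("a1b22"));                    // 3
        System.out.println(countDigits(new StringBuilder("x9")));    // 1
        // Uppercase letters sort before lowercase ones in char order.
        System.out.println(Arrays.toString(sortedCopy("banana", "Apple", "cherry")));
    }
}
```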

Fields

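// Character storage. The reference is final and the array is never modified
// after construction. (This is the JDK 8 layout; JDK 9+ stores a byte[] plus a coder field.)
private final char value[];
// Cached hash code; stays 0 until hashCode() is first called
private int hash;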
private static final ObjectStreamField[] serialPersistentFields =
        new ObjectStreamField[0];
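
The hashCode() method itself is not listed in this post. The following is a minimal sketch of the lazy caching idea behind the hash field, using the polynomial documented in the String.hashCode() Javadoc (s[0]*31^(n-1) + ... + s[n-1]). `CachedHash` and `cachedHashCode` are hypothetical names, not JDK code.

```java
final class CachedHash {
    private final char[] value;
    private int hash; // 0 means "not computed yet"

    CachedHash(String s) {
        this.value = s.toCharArray();
    }

    // Computes the hash on first use and reuses the cached value afterwards.
    int cachedHashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            for (char c : value) {
                h = 31 * h + c;
            }
            hash = h;
        }
        return h;
    }

    public static void main(String[] args) {
        CachedHash demo = new CachedHash("hello");
        System.out.println(demo.cachedHashCode() == "hello".hashCode()); // true
    }
}
```

Caching is safe only because the characters never change; a mutable class could not keep a stale hash around like this.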

Constructors

    // Creates an empty string
    public String() {
        this.value = "".value;
    }

    // The new String is a copy of the argument; it shares the same char[], which is safe because Strings are immutable
    public String(String original) {
        this.value = original.value;
        this.hash = original.hash;
    }
    // Takes a char array and copies its contents into value
    public String(char value[]) {
        this.value = Arrays.copyOf(value, value.length);
    }

    // offset is the index of the first character and count is the number of chars (not bytes): takes the subarray of length count starting at offset
    public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            // count == 0: the result is the empty string
            if (offset <= value.length) {
                this.value = "".value;
                return;
            }
        }
        // The requested range runs past the end of the array
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        this.value = Arrays.copyOfRange(value, offset, offset+count);
    }

    // Builds a String from the characters of a subarray of a Unicode code point array
    public String(int[] codePoints, int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= codePoints.length) {
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > codePoints.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }

        final int end = offset + count;

        // Pass 1: Compute precise size of char[]
        int n = count;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                continue;
            else if (Character.isValidCodePoint(c))
                n++;
            else throw new IllegalArgumentException(Integer.toString(c));
        }

        // Pass 2: Allocate and fill in char[]
        final char[] v = new char[n];

        for (int i = offset, j = 0; i < end; i++, j++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                v[j] = (char)c;
            else
                Character.toSurrogates(c, v, j++);
        }

        this.value = v;
    }

    // Deprecated; not recommended
    @Deprecated
    public String(byte ascii[], int hibyte, int offset, int count) {
        checkBounds(ascii, offset, count);
        char value[] = new char[count];

        if (hibyte == 0) {
            for (int i = count; i-- > 0;) {
                value[i] = (char)(ascii[i + offset] & 0xff);
            }
        } else {
            hibyte <<= 8;
            for (int i = count; i-- > 0;) {
                value[i] = (char)(hibyte | (ascii[i + offset] & 0xff));
            }
        }
        this.value = value;
    }

    // Deprecated; not recommended
    @Deprecated
    public String(byte ascii[], int hibyte) {
        this(ascii, hibyte, 0, ascii.length);
    }
    // Takes a byte subarray and decodes it using the charset with the given name
    public String(byte bytes[], int offset, int length, String charsetName)
            throws UnsupportedEncodingException {
        if (charsetName == null)
            throw new NullPointerException("charsetName");
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(charsetName, bytes, offset, length);
    }

    // Takes a byte subarray and decodes it using the given Charset
    public String(byte bytes[], int offset, int length, Charset charset) {
        if (charset == null)
            throw new NullPointerException("charset");
        checkBounds(bytes, offset, length);
        this.value =  StringCoding.decode(charset, bytes, offset, length);
    }

   
    public String(byte bytes[], String charsetName)
            throws UnsupportedEncodingException {
        this(bytes, 0, bytes.length, charsetName);
    }

    
    public String(byte bytes[], Charset charset) {
        this(bytes, 0, bytes.length, charset);
    }

    
    public String(byte bytes[], int offset, int length) {
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(bytes, offset, length);
    }

    
    public String(byte bytes[]) {
        this(bytes, 0, bytes.length);
    }

    
    public String(StringBuffer buffer) {
        synchronized(buffer) {
            this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
        }
    }

    
    public String(StringBuilder builder) {
        this.value = Arrays.copyOf(builder.getValue(), builder.length());
    }

    
    String(char[] value, boolean share) {
        // Package-private: stores the given array directly, without copying it.
        // assert share : "unshared not supported";
        this.value = value;
    }
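
A few of the public constructors above can be exercised with a short sketch. `ConstructorDemo` and its method names are invented for illustration; the constructors and `StandardCharsets.UTF_8` are standard JDK API.

```java
import java.nio.charset.StandardCharsets;

final class ConstructorDemo {
    // The char[] constructor copies the array, so later changes do not leak in.
    static String copyIsDefensive() {
        char[] chars = {'j', 'a', 'v', 'a'};
        String s = new String(chars);
        chars[0] = 'l'; // mutate the source array after construction
        return s;
    }

    // offset = 1, count = 3: takes 'e', 'l', 'l'.
    static String subArray() {
        return new String(new char[] {'h', 'e', 'l', 'l', 'o'}, 1, 3);
    }

    // U+1F600 is outside the BMP, so it is stored as two chars (a surrogate pair).
    static String fromCodePoints() {
        int[] codePoints = {0x41, 0x1F600};
        return new String(codePoints, 0, codePoints.length);
    }

    // Bytes 104 and 105 are 'h' and 'i' in UTF-8.
    static String fromBytes() {
        return new String(new byte[] {104, 105}, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(copyIsDefensive());         // java
        System.out.println(subArray());                // ell
        System.out.println(fromCodePoints().length()); // 3
        System.out.println(fromBytes());               // hi
    }
}
```

Passing an explicit Charset avoids depending on the platform default encoding, which is what the String(byte[]) constructor falls back to.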

Other methods

    // Returns the length (the number of char values)
    public int length() {
        return value.length;
    }

    // Returns true if the string is empty
    public boolean isEmpty() {
        return value.length == 0;
    }

    // Returns the char at the given index
    public char charAt(int index) {
        if ((index < 0) || (index >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return value[index];
    }
    // Returns the character (Unicode code point) at the given index
    public int codePointAt(int index) {
        if ((index < 0) || (index >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return Character.codePointAtImpl(value, index, value.length);
    }
    // Returns the character (Unicode code point) immediately before the given index
    public int codePointBefore(int index) {
        int i = index - 1;
        if ((i < 0) || (i >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return Character.codePointBeforeImpl(value, index, 0);
    }

    // Returns the number of Unicode code points in the given text range of this String
    public int codePointCount(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > value.length || beginIndex > endIndex) {
            throw new IndexOutOfBoundsException();
        }
        return Character.codePointCountImpl(value, beginIndex, endIndex - beginIndex);
    }

    // Returns the index within this String that is offset from the given index by codePointOffset code points
    public int offsetByCodePoints(int index, int codePointOffset) {
        if (index < 0 || index > value.length) {
            throw new IndexOutOfBoundsException();
        }
        return Character.offsetByCodePointsImpl(value, 0, value.length,
                index, codePointOffset);
    }

    // Copies the characters of this string into the destination array; dstBegin is the start offset in the destination
    void getChars(char dst[], int dstBegin) {
        System.arraycopy(value, 0, dst, dstBegin, value.length);
    }

    // Same as above, but copies only the characters from srcBegin through srcEnd - 1
    public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
        if (srcBegin < 0) {
            throw new StringIndexOutOfBoundsException(srcBegin);
        }
        if (srcEnd > value.length) {
            throw new StringIndexOutOfBoundsException(srcEnd);
        }
        if (srcBegin > srcEnd) {
            throw new StringIndexOutOfBoundsException(srcEnd - srcBegin);
        }
        System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
    }
    // Returns the bytes of this string encoded with the named charset
    public byte[] getBytes(String charsetName)
            throws UnsupportedEncodingException {
        if (charsetName == null) throw new NullPointerException();
        return StringCoding.encode(charsetName, value, 0, value.length);
    }

    // Returns the bytes of this string encoded with the given Charset
    public byte[] getBytes(Charset charset) {
        if (charset == null) throw new NullPointerException();
        return StringCoding.encode(charset, value, 0, value.length);
    }

    // Returns the bytes of this string encoded with the platform default charset
    public byte[] getBytes() {
        return StringCoding.encode(value, 0, value.length);
    }

    // Overrides equals
    public boolean equals(Object anObject) {
        // Same reference means same object
        if (this == anObject) {
            return true;
        }
        // First check that the argument is a String
        if (anObject instanceof String) {
            String anotherString = (String)anObject;
            int n = value.length;
            // Strings of different lengths cannot be equal
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                // Compare character by character
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }

    // equals compares against a String; contentEquals compares against any CharSequence implementation
    public boolean contentEquals(StringBuffer sb) {
        return contentEquals((CharSequence)sb);
    }
    
    public boolean contentEquals(CharSequence cs) {
        // Argument is a StringBuffer, StringBuilder
        if (cs instanceof AbstractStringBuilder) {
            if (cs instanceof StringBuffer) {
                // StringBuffer is thread-safe, so lock it with a synchronized block while comparing
                synchronized(cs) {
                   return nonSyncContentEquals((AbstractStringBuilder)cs);
                }
            } else {
                return nonSyncContentEquals((AbstractStringBuilder)cs);
            }
        }
        // If the argument is a String, just use equals
        if (cs instanceof String) {
            return equals(cs);
        }
        // Argument is a generic CharSequence
        char v1[] = value;
        int n = v1.length;
        if (n != cs.length()) {
            return false;
        }
        for (int i = 0; i < n; i++) {
            if (v1[i] != cs.charAt(i)) {
                return false;
            }
        }
        return true;
    }
    
    private boolean nonSyncContentEquals(AbstractStringBuilder sb) {
        char v1[] = value;
        char v2[] = sb.getValue();
        int n = v1.length;
        // Check that the lengths match
        if (n != sb.length()) {
            return false;
        }
        // Compare character by character
        for (int i = 0; i < n; i++) {
            if (v1[i] != v2[i]) {
                return false;
            }
        }
        return true;
    }

    // Compares two Strings, ignoring case
    public boolean equalsIgnoreCase(String anotherString) {
        return (this == anotherString) ? true
                : (anotherString != null)
                && (anotherString.value.length == value.length)
                && regionMatches(true, 0, anotherString, 0, value.length);
    }

    // Returns 0 if this string equals anotherString, a negative value if it is lexicographically smaller, and a positive value if it is larger
    public int compareTo(String anotherString) {
        int len1 = value.length;
        int len2 = anotherString.value.length;
        int lim = Math.min(len1, len2); // only compare up to the shorter length
        char v1[] = value;
        char v2[] = anotherString.value;

        int k = 0;
        while (k < lim) {
            char c1 = v1[k];
            char c2 = v2[k];
            // At the first mismatch, return the difference between the two chars
            if (c1 != c2) {
                return c1 - c2;
            }
            k++;
        }
        // No mismatch within the common prefix: equal lengths mean identical strings (returns 0); otherwise the shorter string sorts first
        return len1 - len2;
    }

    // Case-insensitive comparator
    public static final Comparator<String> CASE_INSENSITIVE_ORDER
                                         = new CaseInsensitiveComparator();
    private static class CaseInsensitiveComparator
            implements Comparator<String>, java.io.Serializable {
        // use serialVersionUID from JDK 1.2.2 for interoperability
        private static final long serialVersionUID = 8575799808933029326L;

        public int compare(String s1, String s2) {
            int n1 = s1.length();
            int n2 = s2.length();
            int min = Math.min(n1, n2);
            for (int i = 0; i < min; i++) {
                char c1 = s1.charAt(i);
                char c2 = s2.charAt(i);
                if (c1 != c2) {
                    c1 = Character.toUpperCase(c1);
                    c2 = Character.toUpperCase(c2);
                    if (c1 != c2) {
                        c1 = Character.toLowerCase(c1);
                        c2 = Character.toLowerCase(c2);
                        if (c1 != c2) {
                            // No overflow because of numeric promotion
                            return c1 - c2;
                        }
                    }
                }
            }
            return n1 - n2;
        }

      
     private Object readResolve() { return CASE_INSENSITIVE_ORDER; }
    }
    // Case-insensitive compareTo
    public int compareToIgnoreCase(String str) {
        return CASE_INSENSITIVE_ORDER.compare(this, str);
    }

    
    public int indexOf(int ch) {
        return indexOf(ch, 0);
    }
    // Returns the index of the first occurrence of the given character, searching from fromIndex; ch is a Unicode code point
    public int indexOf(int ch, int fromIndex) {
        final int max = value.length;
        // A negative start index is treated as 0
        if (fromIndex < 0) {
            fromIndex = 0;
        } else if (fromIndex >= max) {
            // A start index at or past the end means nothing can be found: return -1
            return -1;
        }

        if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
            // handle most cases here (ch is a BMP code point or a
            // negative value (invalid code point))
            final char[] value = this.value;
            // Scan forward and return only the first match
            for (int i = fromIndex; i < max; i++) {
                if (value[i] == ch) {
                    return i;
                }
            }
            return -1;
        } else {
            return indexOfSupplementary(ch, fromIndex);
        }
    }

    // Handles supplementary characters, which are stored as a surrogate pair
    private int indexOfSupplementary(int ch, int fromIndex) {
        if (Character.isValidCodePoint(ch)) {
            final char[] value = this.value;
            final char hi = Character.highSurrogate(ch);
            final char lo = Character.lowSurrogate(ch);
            final int max = value.length - 1;
            for (int i = fromIndex; i < max; i++) {
                if (value[i] == hi && value[i + 1] == lo) {
                    return i;
                }
            }
        }
        return -1;
    }

    // Returns the index of the last occurrence of the given character
    public int lastIndexOf(int ch) {
        return lastIndexOf(ch, value.length - 1);
    }

    
    public int lastIndexOf(int ch, int fromIndex) {
        if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
            // handle most cases here (ch is a BMP code point or a
            // negative value (invalid code point))
            final char[] value = this.value;
            int i = Math.min(fromIndex, value.length - 1);
            for (; i >= 0; i--) {
                if (value[i] == ch) {
                    return i;
                }
            }
            return -1;
        } else {
            return lastIndexOfSupplementary(ch, fromIndex);
        }
    }

  
    private int lastIndexOfSupplementary(int ch, int fromIndex) {
        if (Character.isValidCodePoint(ch)) {
            final char[] value = this.value;
            char hi = Character.highSurrogate(ch);
            char lo = Character.lowSurrogate(ch);
            int i = Math.min(fromIndex, value.length - 2);
            for (; i >= 0; i--) {
                if (value[i] == hi && value[i + 1] == lo) {
                    return i;
                }
            }
        }
        return -1;
    }

    
    public int indexOf(String str) {
        return indexOf(str, 0);
    }

    
    public int indexOf(String str, int fromIndex) {
        return indexOf(value, 0, value.length,
                str.value, 0, str.value.length, fromIndex);
    }
    // Removes whitespace from both ends of the string (any char <= ' ' counts, which includes control characters)
    public String trim() {
        int len = value.length;
        int st = 0;
        char[] val = value;    /* avoid getfield opcode */

        while ((st < len) && (val[st] <= ' ')) {
            st++;
        }
        while ((st < len) && (val[len - 1] <= ' ')) {
            len--;
        }
        // If anything was trimmed at either end, return the substring; otherwise return this
        return ((st > 0) || (len < value.length)) ? substring(st, len) : this;
    }
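
To see several of the methods above in action, here is a small sketch. `MethodDemo`, `MIXED`, and the helper method names are made up for this example; every String call is standard JDK API.

```java
final class MethodDemo {
    // 'a', then U+1F600 (one code point stored as two chars), then 'b'.
    static final String MIXED = "a\uD83D\uDE00b";

    static int charLength()      { return MIXED.length(); }
    static int codePointLength() { return MIXED.codePointCount(0, MIXED.length()); }
    static int indexOfEmoji()    { return MIXED.indexOf(0x1F600); } // takes the supplementary path

    // compareTo returns the char difference at the first mismatch...
    static int firstMismatch()   { return "apple".compareTo("apricot"); } // 'p' - 'r'
    // ...or the length difference when one string is a prefix of the other.
    static int prefixCase()      { return "app".compareTo("apple"); }     // 3 - 5

    // trim() drops every leading/trailing char <= ' ', including tab and newline.
    static String trimmed()      { return "\t hi \n".trim(); }
    // A no-break space (U+00A0) is greater than ' ', so trim() keeps it.
    static int untrimmedLength() { return "\u00A0hi".trim().length(); }

    // equals needs a String; contentEquals accepts any CharSequence.
    static boolean equalsBuilder()        { return "abc".equals(new StringBuilder("abc")); }
    static boolean contentEqualsBuilder() { return "abc".contentEquals(new StringBuilder("abc")); }

    public static void main(String[] args) {
        System.out.println(charLength() + " chars, " + codePointLength() + " code points"); // 4 chars, 3 code points
        System.out.println(indexOfEmoji());                          // 1
        System.out.println(firstMismatch() + " " + prefixCase());    // -2 -2
        System.out.println("[" + trimmed() + "]");                   // [hi]
        System.out.println(equalsBuilder() + " " + contentEqualsBuilder()); // false true
        System.out.println("HELLO".compareToIgnoreCase("hello"));    // 0
    }
}
```

Only the sign of a compareTo result should be relied on in application code; the exact magnitudes shown here are just a consequence of the JDK 8 implementation listed above.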

The remaining methods are not covered here.
