Thursday, July 17, 2008

All about toString

The GSoC midterm evaluation is now behind us, so I think it's a good occasion to sum up my progress. I want to describe what is already working and what I'm going to add. I hope this post will also be a good start for creating user documentation.

  1. Code styles


    A code style determines how the generated method works and what classes it uses. There are several code styles to choose from in the combo box in the generator's dialog.

    • String concatenation


      This style uses plain string concatenation expressions, so it's efficient and relatively easy to read and modify. Here's an example outcome in the simplest case:
      return "FooClass [aFloat=" + aFloat + ", aString=" + aString + ", anInt=" + anInt 
      + ", anObject=" + anObject + "]";

      With "Skip null values" option turned on, the code becomes a little harder to read:
      return "FooClass [aFloat=" + aFloat + ", "
      + (aString != null ? "aString=" + aString + ", " : "")
      + "anInt=" + anInt + ", "
      + (anObject != null ? "anObject=" + anObject : "") + "]";
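To see what the two variants above actually print, here is a small self-contained sketch. The class body and the field values are assumptions made up for illustration; only the two return expressions are taken from the examples above.

```java
/** Illustrative class: field names follow the post, the values are assumptions. */
class FooClass {
    float aFloat = 1.0f;
    String aString = "hello";
    int anInt = 10;
    Object anObject = null;

    /** Plain string concatenation style. */
    @Override
    public String toString() {
        return "FooClass [aFloat=" + aFloat + ", aString=" + aString + ", anInt=" + anInt
                + ", anObject=" + anObject + "]";
    }

    /** Same style with the "Skip null values" pattern applied (hypothetical method name). */
    String toStringSkippingNulls() {
        return "FooClass [aFloat=" + aFloat + ", "
                + (aString != null ? "aString=" + aString + ", " : "")
                + "anInt=" + anInt + ", "
                + (anObject != null ? "anObject=" + anObject : "") + "]";
    }
}
```

With these values the first method prints anObject=null, while the second drops the null field but keeps the separator that precedes it, so its output ends with "anInt=10, ]".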


    • StringBuilder/StringBuffer


      This style uses StringBuilder if the project is compatible with JDK1.5 or later, and StringBuffer otherwise.
      StringBuilder builder = new StringBuilder();
      builder.append("FooClass [aFloat=");
      builder.append(aFloat);
      builder.append(", aString=");
      builder.append(aString);
      builder.append(", anInt=");
      builder.append(anInt);
      builder.append(", anObject=");
      builder.append(anObject);
      builder.append("]");
      return builder.toString();

      The "Skip null values" option doesn't obfuscate the code as much as previously:
      StringBuilder builder = new StringBuilder();
      builder.append("FooClass [aFloat=");
      builder.append(aFloat);
      builder.append(", ");
      if (aString != null) {
          builder.append("aString=");
          builder.append(aString);
          builder.append(", ");
      }
      builder.append("anInt=");
      builder.append(anInt);
      builder.append(", ");
      if (anObject != null) {
          builder.append("anObject=");
          builder.append(anObject);
      }
      builder.append("]");
      return builder.toString();


    • String.format()


      This style is very pleasant with a relatively short list of elements, but with longer ones it becomes hard to see which fields are associated with which variables. Unfortunately, the "Skip null values" option is ignored by this style.
      return String.format("FooClass [aFloat=%s, aString=%s, anInt=%s, anObject=%s]",
      aFloat, aString, anInt, anObject);

      For JDK1.4 and earlier, the code is slightly different:
      UPDATE: As I learned today, there's no String.format() in JDK 1.4, so MessageFormat.format() will be used instead:
      return MessageFormat.format("FooClass [aFloat={0}, aString={1}, anInt={2}, anObject={3}]",
          new Object[] { new Float(aFloat), aString, new Integer(anInt), anObject });
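The two formatting calls can be compared side by side with a small sketch. The class name, method names and field values below are assumptions for illustration. Note that MessageFormat placeholders are numbered from zero, and that the sketch uses Float.valueOf(float) and Integer.valueOf(int), which only exist since JDK 1.5; code that must compile on an older JDK would need new Float(...) and new Integer(...) instead.

```java
import java.text.MessageFormat;

/** Illustrative class: the name and the field values are assumptions. */
class FormatStyleSketch {
    float aFloat = 1.0f;
    String aString = "hello";
    int anInt = 10;
    Object anObject = null;

    /** JDK 1.5+ style: each %s is matched to an argument by position. */
    String withStringFormat() {
        return String.format("FooClass [aFloat=%s, aString=%s, anInt=%s, anObject=%s]",
                aFloat, aString, anInt, anObject);
    }

    /** MessageFormat style: placeholders {0}..{3} index into the argument array. */
    String withMessageFormat() {
        return MessageFormat.format(
                "FooClass [aFloat={0}, aString={1}, anInt={2}, anObject={3}]",
                new Object[] { Float.valueOf(aFloat), aString, Integer.valueOf(anInt), anObject });
    }
}
```

One difference worth knowing: MessageFormat passes numeric arguments through a locale-aware NumberFormat, so aFloat may come out as 1 rather than 1.0, and large integers may get grouping separators.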


    • Apache Commons-Lang ToStringBuilder


      When this style is chosen, the format template is ignored because ToStringBuilder takes care of the output string's format itself. Maybe it's a little less flexible, but the power of this solution is that you can easily change the style of all toStrings within the project without changing any actual object's toString method.
      ToStringBuilder builder = new ToStringBuilder(this);
      builder.append("aFloat", aFloat);
      builder.append("aString", aString);
      builder.append("anInt", anInt);
      builder.append("anObject", anObject);
      return builder.toString();

      Skipping nulls works this way:
      ToStringBuilder builder = new ToStringBuilder(this);
      builder.append("aFloat", aFloat);
      if (aString != null)
          builder.append("aString", aString);
      builder.append("anInt", anInt);
      if (anObject != null)
          builder.append("anObject", anObject);
      return builder.toString();


    • Spring Framework's ToStringCreator


      This style behaves the same as Apache ToStringBuilder except it uses a different class to create the output string.


  2. Format templates


    This is a simple mechanism that allows you to change the format of the generated method's output string: beginning, ending, separator, and so on. There are four tokens to use:




    ${class.name}      inserts the class name as a String
    ${member.name}     inserts a member's name
    ${member.value}    inserts a member's value
    ${otherMembers}    this token must stand between the separator and the ending string

    This is the template used for all examples in the previous part of this post:
    ${class.name} [${member.name}=${member.value}, ${otherMembers}]

    And of course, output string for this template looks like this:
    FooClass [aFloat=1.0, aString=hello, anInt=10, anObject=null]
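How a template maps to an output string can be sketched as follows. This is not the generator's real code (the generator emits Java source rather than formatting values at run time); TemplateSketch and expand are hypothetical names, and the sketch assumes the template contains ${member.name}, ${member.value} and ${otherMembers} in that order.

```java
import java.util.Iterator;
import java.util.Map;

/** Hypothetical helper showing how the template splits into its parts. */
class TemplateSketch {
    static final String MEMBER_NAME = "${member.name}";
    static final String MEMBER_VALUE = "${member.value}";
    static final String OTHER_MEMBERS = "${otherMembers}";

    static String expand(String template, String className, Map<String, Object> members) {
        int memberStart = template.indexOf(MEMBER_NAME);
        int valueEnd = template.indexOf(MEMBER_VALUE) + MEMBER_VALUE.length();
        int othersStart = template.indexOf(OTHER_MEMBERS);

        // Everything before the first member token is the beginning string.
        String beginning = template.substring(0, memberStart).replace("${class.name}", className);
        // The pattern printed for each member, e.g. "${member.name}=${member.value}".
        String memberPattern = template.substring(memberStart, valueEnd);
        // Text between the member pattern and ${otherMembers} is the separator.
        String separator = template.substring(valueEnd, othersStart);
        // Everything after ${otherMembers} is the ending string.
        String ending = template.substring(othersStart + OTHER_MEMBERS.length());

        StringBuilder out = new StringBuilder(beginning);
        Iterator<Map.Entry<String, Object>> it = members.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Object> member = it.next();
            out.append(memberPattern
                    .replace(MEMBER_NAME, member.getKey())
                    .replace(MEMBER_VALUE, String.valueOf(member.getValue())));
            if (it.hasNext())
                out.append(separator);
        }
        return out.append(ending).toString();
    }
}
```

Passing the template above, the class name FooClass and an ordered map of the four example members yields the output string shown above.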


  3. I plan to define more tokens:

    • inserting super.toString() and hashCode() (now they can be printed the same way as other methods, which is not convenient)

    • ${class.getName} to use this.getClass().getName() instead of a plain string

    • two different tokens for printing method names with or without parentheses at the end

    • I'm thinking about more type-specific options, e.g. putting strings between quotation marks, printing integers as hexadecimal, printing the length of arrays and so on. But I don't know yet how they should work...



  4. Arrays handling


    When the "Ignore arrays' default toString()" option is switched on, the generated toString() method lists the items contained in arrays, for example
    intArray = [1, 2, 3, 5, 8, 13]
    instead of
    intArray = [I@9304b1

    This is implemented differently depending on the chosen JDK compatibility. For JDK1.5 and later, java.util.Arrays.toString() is used. For earlier versions, which do not have this method, a helper arrayToString() method is generated:
    private String arrayToString(Object array, int length) {
        StringBuffer stringBuffer = new StringBuffer();
        stringBuffer.append("[");
        for (int i = 0; i < length; i++) {
            if (i > 0)
                stringBuffer.append(", ");
            if (array instanceof Object[])
                stringBuffer.append(((Object[]) array)[i]);
            if (array instanceof float[])
                stringBuffer.append(((float[]) array)[i]);
            if (array instanceof int[])
                stringBuffer.append(((int[]) array)[i]);
        }
        stringBuffer.append("]");
        return stringBuffer.toString();
    }

    It takes an Object as a parameter and then uses instanceof so that one method can work for all kinds of arrays. It checks only those array types that are actually passed to it in the main toString() method, so in case new types are added, arrayToString is regenerated every time the toString generator is run.
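For comparison, here is a different technique from the generated helper: java.lang.reflect.Array can read elements from any array type, so a single method covers Object[] and every primitive array without instanceof checks. This is only a sketch of an alternative, not what the generator produces, and ArrayToStringSketch is a hypothetical name. Reflective access is likely slower per element than the typed casts in the generated code.

```java
import java.lang.reflect.Array;

/** Hypothetical alternative to the generated helper, based on reflection. */
class ArrayToStringSketch {
    /** Formats any array (primitive or object) as "[a, b, c]". */
    static String arrayToString(Object array) {
        // Throws IllegalArgumentException if the argument is not an array.
        int length = Array.getLength(array);
        StringBuffer buffer = new StringBuffer("[");
        for (int i = 0; i < length; i++) {
            if (i > 0)
                buffer.append(", ");
            // Primitive elements are returned wrapped (Integer, Character, ...).
            buffer.append(Array.get(array, i));
        }
        return buffer.append("]").toString();
    }
}
```

For example, arrayToString(new int[] { 1, 2, 3, 5, 8, 13 }) gives [1, 2, 3, 5, 8, 13], the same text that Arrays.toString() produces on JDK1.5 and later.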

  5. Limiting number of items


    This option changes the behavior of the generated toString() for Collections, Maps and arrays (if the "Ignore arrays' default toString()" option is on). Again, the generated code differs for each JDK version.

    In JDK1.6, a collection or map is turned into an array (collection.toArray() or map.entrySet().toArray()), then Arrays.copyOf() is used to make it shorter and Arrays.toString() is called to print it out. Of course, it's usually also necessary to check for nulls. All in all, the code becomes fairly complicated, with something like this in the worst case:
    builder.append(hashMap != null ? Arrays.toString(Arrays.copyOf(hashMap.entrySet().toArray(),
    Math.min(maxItem, hashMap.size()))) : null);

    At least there's no need to generate additional methods.
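The expression above can be tried in isolation with a small sketch. LimitSketch and limited are hypothetical names, and maxItem is passed as a parameter here only to make the example easy to run; in the generated code it is a value known inside toString().

```java
import java.util.Arrays;
import java.util.Map;

/** Hypothetical wrapper around the JDK 1.6 expression, for experimentation. */
class LimitSketch {
    /** Prints at most maxItem entries of the map, or returns null for a null map. */
    static String limited(Map<?, ?> map, int maxItem) {
        return map != null
                ? Arrays.toString(Arrays.copyOf(map.entrySet().toArray(),
                        Math.min(maxItem, map.size())))
                : null;
    }
}
```

With a LinkedHashMap holding a=1, b=2, c=3 and maxItem set to 2, this returns [a=1, b=2]. Note that toArray() copies every entry before the array is shortened, so the cost grows with the size of the map and not with maxItem.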

    Since Arrays.copyOf was introduced in JDK1.6, earlier versions must use helper toString() methods to limit the number of elements. For arrays, the method is similar to the one shown in the previous part, only with one more statement at the beginning: length = Math.min(length, maxItem);. For collections and maps there's another method:
    private String toString(Collection collection) {
        StringBuilder stringBuilder = new StringBuilder();
        final int maxItem = 10;
        stringBuilder.append("[");
        int i = 0;
        for (Iterator iterator = collection.iterator();
                iterator.hasNext() && i < maxItem; i++) {
            if (i > 0)
                stringBuilder.append(", ");
            stringBuilder.append(iterator.next());
        }
        stringBuilder.append("]");
        return stringBuilder.toString();
    }

    This method is not overwritten every time the generator runs, so that it can be changed by the user.

    Another solution would be to convert a collection into an array and then use arrayToString, but this way is more efficient and looks better if there are no arrays.

  6. Plans for the future


    In addition to the things I mentioned earlier, I'm going to create an extension point for new code styles and add toString generation to code assist. This time I won't be able to copy solutions from the hashCode/equals generator though, so the work may not go as smoothly as before. Still, I'm not losing my optimism :)

13 comments:

Anonymous said...

I think you made a mistake, either in this blog article or in the ToStringGenerator itself.
In the section about String.format(), you state 'For JDK1.4 and earlier, the code is slightly different'
There is no String#format() in JDK 1.4.

Mateusz Matela said...

Oops, I wasn't aware of this... Thanks for the info!

Chris Aniszczyk (zx) said...

Overall, good job! I'm happy to see the work get so far within a short amount of time.

Anonymous said...

I am very impressed by the choice you offer and look forward to when this is finally implemented. Great work!!!

Two remarks. One regarding String.format and MessageFormat.format. I as a user would be equally happy if you'd choose, for the sake of simplicity, to always use MessageFormat.format even with JDK1.5 and newer.

The other remark concerns your example under "Format Templates". You write ${otherFields} in one place and ${otherMembers} in another place though I think you mean the same thing in both places.

Other than that... did I already mention you are doing a great job? Thanks for your great work.

Anonymous said...

what about System.arraycopy ??

Mateusz Matela said...

Thank you for your kind words :)

To answer your first remark, I think that String.format() is more convenient than MessageFormat (especially in case of modification) as there's no need to deal with numeric positions of arguments. So I don't want to give it up so easily :)

As for the second remark, you're right, when I type ${otherFields} I really mean ${otherMembers}. The first name stayed in my head from the time when the generator worked only for fields... Fixing the post.

Mateusz Matela said...

System.arraycopy? What about it? I can't see how it may help.

ijuma said...

Hi,

Good job. I was not the person who asked about System.arraycopy, but I believe he/she meant to ask why you did not use that in place of Arrays.copyOf for JDKs where the latter is not available (if you look at Arrays.copyOf in the Sun JDK, it simply calls into System.arraycopy although in reality HotSpot has intrinsics for these operations).

Personally, my suggestion would be _not_ to use any of the array copy operations and limit the size by iterating. If you want to limit the size printed, it's possible that you have a very large collection and in that case you do not want to do a full copy before limiting the size.

Regards,
Ismael

shamaz said...

great work :)

About array handling with jdk 1.4 and lower, a very simple solution is to do like this :
java.util.Arrays.asList(theArray).toString();
but maybe in some extreme case it can use too much memory.

and in arrayToString() you forgot some primitive types : byte, boolean, char, long and short :)

shamaz said...

Another simple version of toString(Collection collection) in one line is:

(new ArrayList(collection)).subList(0, Math.min(maxItem, collection.size())).toString();

Note that this solution can also be used for arrays if you use Arrays.asList(..)

That's a personal opinion, but I don't really like the use of array copying, nor the creation of helper methods (like arrayToString) in every class.

Ah and just a last note, I'd rather put maxItem as a parameter.

Mateusz Matela said...

Well, AFAIK I can't use System.arraycopy without a helper method which would use if-instanceof checking for every primitive type, so it won't look better than my arrayToString. Besides, in JDK1.4 there's not even Arrays.toString, so copying an array is pointless.

shamaz: thanks for your suggestions, I definitely need to consider them. It's hard to choose the best solution as it may depend on project-specific factors, on the other hand I don't want to confuse users with too many options...
BTW, you don't like helper methods but you'd make maxItem a parameter? But toString() cannot take parameters :>

Mateusz Matela said...

Hey, these Arrays.asList and List.subList methods look very promising. Correct me if I'm wrong, but it seems they don't actually copy contents, so memory usage is not an issue. Their only drawback is the lack of support for primitive types. How would you get around this?

shamaz said...

LOL
you're right, Arrays.asList does not work for primitive types.
I'm so used to java5 autoboxing -__- ...

hum... how to get around this...
There is org.apache.commons.lang.ArrayUtils.toObject
but it adds a dependency to commons-lang :\

What I was saying about memory applies to large primitive arrays. Imagine you have a huge char array containing 64 MB of data. Calling Arrays.asList would create and use a lot of memory (it would instantiate a lot of Character objects).
Anyway... as Arrays.asList does not work with primitive types on java 1.4...