Some Other New Features in Java 2 Standard Edition 1.5

by
Dean Wette, Principal Software Engineer
Object Computing, Inc. (OCI)

Introduction

As this article goes to press, Java 2 Standard Edition version 1.5 (J2SE 1.5) is quickly approaching release and is currently in its second public prerelease (beta 2). This new version of Java represents a significant upgrade, and may have as big an impact on developers and users as the transition from Java 1 to Java 2 back in the late 1990s. A simple Google search reveals considerable information about the major feature enhancements. In a nutshell, these include:

Generic Types
a specification for creating type safe collections and definition of classes with generic types, rather than specific types
Autoboxing
supports automatic conversion between primitive data types and the standard wrapper class objects for them (int values to Integer objects and back again, etc.)
Concurrency Utilities
a new API for creating multi-threaded Java applications
Enumerated Types
a natural language construct for defining enumerated constants
Enhanced For Loop
a simplified form of the for loop for traversing collections and arrays
Metadata
a new facility for annotating class definitions, to support automated code generation

Tutorials and articles on these new additions are readily available and easy to find. However, J2SE 1.5 also contains other miscellaneous new features that are less conspicuous, but worth noting since they also improve the Java development experience. Those discussed below are not a complete list of the other new features, but a sampling of some I find worthwhile or otherwise interesting.

StringTokenizer vs. Splitting Strings

This feature actually appeared in J2SE 1.4, but is noteworthy anyway as a new method for splitting strings. The legacy class StringTokenizer (in the java.util package) breaks a given string into distinct String elements, using whitespace or some other delimiter specified explicitly. For example,

    StringTokenizer st = new StringTokenizer("This is a string");
    while (st.hasMoreTokens()) {
        System.out.println(st.nextToken());
    }

results in the output

    This
    is
    a 
    string

However, one problem is that it discards tokens that are empty strings, and is unreliable when meant to be used as a string iterator. For example,

    // note two consecutive commas separating an empty string
    StringTokenizer st = new StringTokenizer("This,is,,string", ","); // specify comma delimiter
    while (st.hasMoreTokens()) {
        System.out.println(st.nextToken());
    }

results in the output

    This
    is
    string

This introduces a problem parsing comma-separated values where empty strings are valid and should be preserved. With StringTokenizer, the only method for detecting them is to use the variant that also returns the delimiters, making the code more complex.
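
To give a sense of that extra complexity, here is a minimal sketch of recovering empty fields with the delimiter-returning variant of StringTokenizer. The class name TokenizerWithEmpties and the method tokenize() are hypothetical names chosen for illustration, and the bookkeeping shown is only one possible approach.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerWithEmpties {

    /** Splits csv on commas, keeping empty fields, using only StringTokenizer. */
    public static List<String> tokenize(String csv) {
        List<String> fields = new ArrayList<String>();
        // third argument true: the delimiters themselves are returned as tokens
        StringTokenizer st = new StringTokenizer(csv, ",", true);
        boolean lastWasDelimiter = true; // treat the start of input like a delimiter
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            if (token.equals(",")) {
                if (lastWasDelimiter) {
                    fields.add(""); // two delimiters in a row: an empty field
                }
                lastWasDelimiter = true;
            } else {
                fields.add(token);
                lastWasDelimiter = false;
            }
        }
        if (lastWasDelimiter) {
            fields.add(""); // input ended on a delimiter (or was empty)
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("This,is,,string")); // [This, is, , string]
    }
}
```

The caller has to track whether the previous token was a delimiter, which is exactly the kind of state a tokenizer should be hiding.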

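A much better solution comes with the split() method provided in the String class, which is supported by the regular expression (regex) facilities introduced in J2SE 1.4. It is an instance method, called on the string to be split, and takes the delimiter as its argument: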

    String[] tokens = "This,is,,string".split(",");
    for (int i = 0; i < tokens.length; ++i) {
        System.out.println(tokens[i]);
    }

results in output preserving the empty string

    This
    is
         
    string

The split() method also has the added advantage that it takes a regular expression for the delimiter argument. For example, if one has comma-separated values with whitespace following commas that should be discarded, it's simple to accomplish.

    // note space after commas, regex for delimiter arg
    String[] tokens = " This , is ,  , csv string ".trim().split("\\s*,\\s*");
    for (int i = 0; i < tokens.length; ++i) {
        System.out.println(tokens[i]);
    }

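The regex consumes the whitespace around each comma, and the trim() call removes it from the ends of the input, so every element is stripped of leading and trailing whitespace while internal word spaces are preserved. The output is: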

    This
    is
         
    csv string

The J2SE API Specification now documents StringTokenizer as a legacy class and discourages its use, but doesn't go so far as to deprecate it. Personally, I would also like to see a counterpart for joining an array of String objects into a delimited string, much like Perl provides.

    // note: this is not in the J2SE API
    /** Joins the specified tokens into a delimited string. */
    public static String join(String delimiter, Object[] tokens);
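
For what it's worth, such a method is easy enough to write by hand. The following is only a sketch of one way to do it. JoinSketch is a hypothetical utility class, not part of the J2SE API, and it simply appends each token's string form (so a null token appears as "null").

```java
public class JoinSketch {

    /** Joins the specified tokens into a delimited string. */
    public static String join(String delimiter, Object[] tokens) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                sb.append(delimiter); // delimiter goes between tokens, not after the last
            }
            sb.append(tokens[i]);     // uses String.valueOf() semantics
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] tokens = "This,is,,string".split(",");
        System.out.println(join(",", tokens)); // This,is,,string
    }
}
```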

Environment Variables

Java prides itself on being platform (OS) neutral, and discourages code that depends on a particular operating system. But reality dictates that we as developers are called on, from time to time, to support features that require operating system environment variables. The original version of Java supported the System.getenv() method, but it was deprecated long ago. Since then, the most practical approaches to accessing environment variables have been the Java properties mechanism, combined with setting properties in a wrapper shell script for the java interpreter, or a complicated technique using the Runtime.exec() method and Process class. Even the widely used build management tool, Apache Ant, goes to great lengths in its Java implementation code to support environment variables.

Thankfully, the getenv() method is now undeprecated in J2SE 1.5, and even enhanced with an overloaded version in the System class. The original method now provides direct access to a named environment variable

    public static String getenv(String name);

and the overloaded version generates a name/value mapping of environment variables

    public static Map<String,String> getenv();    

Access to environment variables is also protected by the standard Java security manager, which checks RuntimePermission: the "getenv.*" permission for the full map, and "getenv." followed by the variable name for a single lookup.
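
A short sketch of both methods in use follows. The class name EnvSketch and the helper getenvOrDefault() are hypothetical, and PATH is only assumed to be a typical variable name. Note that the map returned by getenv() is unmodifiable, so the sketch copies it before sorting.

```java
import java.util.Map;
import java.util.TreeMap;

public class EnvSketch {

    /** Returns the named environment variable, or fallback if it is not set. */
    public static String getenvOrDefault(String name, String fallback) {
        String value = System.getenv(name); // null when the variable is undefined
        return value != null ? value : fallback;
    }

    public static void main(String[] args) {
        System.out.println("PATH = " + getenvOrDefault("PATH", "(not set)"));

        // copy the unmodifiable map into a TreeMap to list variables sorted by name
        Map<String, String> sorted = new TreeMap<String, String>(System.getenv());
        for (Map.Entry<String, String> entry : sorted.entrySet()) {
            System.out.println(entry.getKey() + "=" + entry.getValue());
        }
    }
}
```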

ProcessBuilder

Related to support for environment variables, a new class, java.lang.ProcessBuilder, supports executing OS-specific commands. Previously, one accomplished this using a combination of the Runtime.exec() method and Process class. ProcessBuilder, however, adds functionality for getting and setting the environment variables of the child process, merging the command's standard output and standard error streams, and managing the attributes (command, environment, working directory) used to spawn one or more OS processes. It is now considered the preferred technique for executing commands external to the Java VM.

As an example, the older technique for executing a Perl script from a Java program might look something like this:

    // hypothetical method that sets up the needed environment variables for the command,
    // but there is no easy way to get existing OS environment variables
    String[] envp = createEnvironment();
    String[] command = { "myscript.pl", arg1, arg2 };
    File workingDir = getWorkingDir();
    Process p = Runtime.getRuntime().exec(command, envp, workingDir);
    

The new technique is similar, but makes dealing with environment variables much easier, and allows merging the command's standard output stream with its error stream.

    ProcessBuilder pb = new ProcessBuilder("myscript.pl", arg1, arg2);
    Map<String, String> env = pb.environment();
    // can change environment for process
    File myPerl5Lib = ...
    env.remove("PERL5LIB");
    env.put("PERL5LIB", myPerl5Lib.getPath());
    // merge STD OUT and STD ERR
    pb.redirectErrorStream(true);
    Process p = pb.start();
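
To round out the comparison with Runtime.exec(), here is a sketch that also sets the working directory and reads the merged output. The class ProcessBuilderSketch and its methods configure() and run() are hypothetical helpers, and the command and library path passed in are placeholders supplied by the caller.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Map;

public class ProcessBuilderSketch {

    /** Describes (but does not start) a process with a custom environment. */
    public static ProcessBuilder configure(File workingDir, String libPath, String... command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.directory(workingDir);                   // takes the place of exec()'s workingDir argument
        Map<String, String> env = pb.environment(); // starts as a copy of the current environment
        env.put("PERL5LIB", libPath);               // put() overwrites any existing value
        pb.redirectErrorStream(true);               // standard error is merged into standard output
        return pb;
    }

    /** Starts the process and returns everything it wrote to the merged stream. */
    public static String run(ProcessBuilder pb) throws IOException, InterruptedException {
        Process p = pb.start();
        StringBuilder output = new StringBuilder();
        BufferedReader reader = new BufferedReader(new InputStreamReader(p.getInputStream()));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        } finally {
            reader.close();
        }
        p.waitFor(); // wait for the process to exit before returning
        return output.toString();
    }
}
```

Because the streams are merged, a single reader is enough; with Runtime.exec() the error stream had to be drained separately to keep the child process from blocking.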
    

XPath in the Java API for XML Processing (JAXP) 1.3

The enhancements for JAXP 1.3 included in J2SE 1.5 merit their own Java News Brief article. These include support for XML Schema, Datatypes, and Namespaces. Another new feature, long overdue and of particular interest to me, is explicit support for evaluating XPath expressions. While the standard DOM and SAX APIs are fine for parsing XML, using them often results in unnecessary work when only a subset of the XML document is needed. Using DOM or SAX, the entire document (or at least too much of it in many cases) must be processed to obtain the desired subset of document nodes (elements, attributes, etc.). This results in extra code complexity. Furthermore, processing yields String objects, which must then be converted to appropriate types.

Using XPath expressions to evaluate a document to a specified set of nodes provides a desirable alternative. It requires some knowledge of XPath, which is not terribly difficult to grasp. With prior releases of J2SE, accessing nodes with XPath required a third party XPath API, such as Apache Xalan (which also provides the default implementation for XSL processing in JAXP). With JAXP 1.3, a new subpackage, javax.xml.xpath, provides the standard, implementation neutral, interface for XPath evaluation of XML documents. This is a welcome addition. The following example, taken directly from the J2SE 1.5 API Specification, illustrates using the new XPath API both for evaluating a document to specific nodes without using DOM tree traversal (or SAX events), and evaluating the nodes to explicit data types without an extra conversion step.

    // parse the XML as a W3C Document
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    org.w3c.dom.Document document = builder.parse(new File("/widgets.xml"));
    
    // evaluate the XPath expression against the Document
    XPath xpath = XPathFactory.newInstance().newXPath();
    String expression = "/widgets/widget[@name='a']/@quantity";
    Double quantity = (Double) xpath.evaluate(expression, document, XPathConstants.NUMBER);
    

This new facility is likely to simplify a lot of XML processing code, especially when using XML for configuration or other data that lends itself more typically to ad hoc, rather than sequential, processing.
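
For readers who want to try this without a widgets.xml file, the following sketch evaluates expressions against an inline document. The class XPathSketch, its helper methods, and the sample XML are all made up for illustration; only the javax.xml and org.w3c.dom calls come from the standard API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathSketch {

    // hypothetical sample data standing in for widgets.xml
    public static final String XML =
          "<widgets>"
        + "<widget name='a' quantity='6'/>"
        + "<widget name='b' quantity='11'/>"
        + "</widgets>";

    /** Parses an XML string into a W3C Document. */
    public static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Evaluates the expression directly to a number, with no string conversion step. */
    public static double number(Document doc, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return ((Double) xpath.evaluate(expression, doc, XPathConstants.NUMBER)).doubleValue();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Evaluates the expression to a node set and returns how many nodes matched. */
    public static int count(Document doc, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println(number(doc, "/widgets/widget[@name='a']/@quantity")); // 6.0
        System.out.println(number(doc, "sum(/widgets/widget/@quantity)"));       // 17.0
        System.out.println(count(doc, "/widgets/widget"));                       // 2
    }
}
```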

Variable Arguments (varargs)

With previous versions of Java, creating a method to take a list of arguments of variable size required defining it in terms of a container, like an array or a Java Collection. For example, use of the MessageFormat class illustrates this.

    String pattern = "There are {0, number, integer} days in a {1}.";
    Object[] args = { new Integer(7), "week" };
    String formattedString = MessageFormat.format(pattern, args);
    // produces "There are 7 days in a week."
    

J2SE 1.5 defines a new syntax for specifying array arguments. Instead of defining a method like

    public static String format(String pattern, Object[] args);
    

the method can be defined as

    public static String format(String pattern, Object... args);
    

While the new syntax must still be implemented in terms of an array, it adds more flexibility for the caller. The Object... syntax specifies that the method will be called either with an Object array or as a sequence (arbitrary in number) of Object arguments. As a result, the format() method in the example above may be called as specified, or with the alternate form:

    String pattern = "There are {0, number, integer} days in a {1}.";
    String result = MessageFormat.format(pattern, new Integer(7), "week");
    // produces "There are 7 days in a week."
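
The same syntax works for methods of your own. The following is a minimal sketch; VarargsSketch and sum() are hypothetical names used only to show that the parameter behaves as an ordinary array inside the method.

```java
public class VarargsSketch {

    /** Sums any number of ints; inside the method, values is an ordinary int[]. */
    public static int sum(int... values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum());                   // 0 (an empty array is passed)
        System.out.println(sum(1, 2, 3));            // 6
        System.out.println(sum(new int[] { 4, 5 })); // 9 (an explicit array still works)
    }
}
```

The varargs parameter must be the last one in the parameter list, which is why format() takes its pattern first.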
    

An important benefit to the new syntax – allowing method calls with variable argument lists – is the introduction of C-style printf statements to the Java language.

Formatting Utilities

Users of the C and C++ programming languages know the value of variable argument lists through experience with the printf() utilities in the standard C library for creating formatted output. Java developers, however, have long been limited to the print() methods of the PrintStream class, most commonly utilized with the System.out object. Typically, printing formatted output in Java is accomplished using string concatenation, like this for example:

    double a = 10.3;
    double b = 3.9;
    System.out.println("The result of " + a + "/" + b + " is " + (a/b));
    

This technique has two drawbacks. First, as print statements grow more complex, the code becomes increasingly difficult to read and maintain; even this fairly simple example demonstrates the issue. Second, and more seriously, there is no control over the formatted output of the floating point division, which is driven by the numeric precision of the double type and operations on it. The output of the above print statement looks like this:

    The result of 10.3/3.9 is 2.6410256410256414    

The formatted output with 16-digit precision may not be what's desired, but to avoid that requires using the DecimalFormat facility and additional code complexity:

    // format number output with exactly 4 decimal places
    double a = 10.3;
    double b = 3.9;
    NumberFormat format = new DecimalFormat("0.0000");
    String result = format.format(a/b);
    System.out.println("The result of " + a + "/" + b + " is " + result);
    // The result of 10.3/3.9 is 2.6410
    

No one wants to go to this much trouble to get nicely formatted output. J2SE 1.5 solves this with the introduction of C-like printf utilities. The new functionality utilizes the new varargs syntax and is implemented in a new Formatter class found in the java.util package. However, it's not necessary to use Formatter directly, as PrintStream and PrintWriter are now retrofitted to support it with new printf() methods. Instead of using string concatenation and the various formatting utilities found in the java.text package, the previous example statement might be reimplemented as follows:

    System.out.printf("The result of %.1f/%.1f is %.4f%n", a, b, a/b);    

The printf() method replaces the println() method. Rather than concatenation, a single format string is used, with embedded placeholders to be replaced by actual values. The placeholders include meta-information for specifying format, and the values follow the string argument as a variable sequence of arguments. Note that printf() does not append a newline automatically, but one can be specified with %n. In addition, explicit argument indices are supported for reordering output as in the following example:

    // a/b and a args reversed, indices specified in format string
    System.out.printf("The result of %3$.1f/%2$.1f is %1$.4f%n", a/b, b, a);

These examples barely start to illustrate the power and flexibility of the formatting utilities new to J2SE 1.5, but the API Specification provides extensive, detailed documentation in the Formatter class javadocs, which warrants attention from anyone implementing Java programs that generate formatted output. It would have been nice to see this new facility extended to the logging API with formatted log statements, but sadly, this is not the case. On the other hand, it's understandable, as adding overloaded methods to support formatted log statements is not backward compatible with the logging API interfaces, and doing so would break existing Logger implementations.
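
When the formatted text is needed as a String rather than printed, the same format strings can be used with String.format() or with Formatter directly. The sketch below assumes hypothetical names (FormatSketch, describe(), row()) and passes Locale.ROOT so the decimal separator does not vary with the default locale.

```java
import java.util.Formatter;
import java.util.Locale;

public class FormatSketch {

    /** Same format string as the printf() example, but returning a String. */
    public static String describe(double a, double b) {
        return String.format(Locale.ROOT, "The result of %.1f/%.1f is %.4f", a, b, a / b);
    }

    /** Uses Formatter directly to build a fixed-width table row. */
    public static String row(String name, int count) {
        StringBuilder sb = new StringBuilder();
        Formatter formatter = new Formatter(sb, Locale.ROOT);
        // %-8s left-justifies in 8 columns, %5d right-justifies in 5 columns
        formatter.format("%-8s|%5d|", name, count);
        formatter.close();
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(10.3, 3.9)); // The result of 10.3/3.9 is 2.6410
        System.out.println(row("week", 7));      // week    |    7|
    }
}
```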

Scanning Utilities

As of J2SE 1.4, anyone writing Java code to parse and/or process text without using the Regular Expression (regex) API is probably working too hard and creating overly complex (and error prone) code. The regex API vastly simplifies text processing for Java programs and is worth learning by any Java developer. J2SE 1.5 takes regular expressions a step further with the introduction of a simple text scanning utility provided in the Scanner class of the java.util package. Based on the regular expressions facility, Scanner provides a simple mechanism for parsing primitive types and strings using regular expressions. The following examples, taken from the API Specification for the Scanner class, illustrate some of this functionality.

    // read an int from command line input
    // clearly better than the old way 
    Scanner sc = new Scanner(System.in);
    int i = sc.nextInt();
    
    // read some ints and strings with a 'fish' delimiter 
    String input = "1 fish 2 fish red fish blue fish";
    Scanner s = new Scanner(input).useDelimiter("\\s*fish\\s*");
    System.out.println(s.nextInt());
    System.out.println(s.nextInt());
    System.out.println(s.next());
    System.out.println(s.next());
    s.close();
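
The second example above prints 1, 2, red, and blue on separate lines. Scanner also provides hasNextXxx() methods for testing a token before consuming it, which the following sketch uses to pick the integers out of mixed text. ScannerSketch and sumInts() are hypothetical names, and the default whitespace delimiter is assumed.

```java
import java.util.Scanner;

public class ScannerSketch {

    /** Sums the integers found in text, skipping any token that is not an int. */
    public static int sumInts(String text) {
        Scanner scanner = new Scanner(text); // default delimiter is whitespace
        int total = 0;
        while (scanner.hasNext()) {
            if (scanner.hasNextInt()) {
                total += scanner.nextInt();
            } else {
                scanner.next(); // discard the non-numeric token
            }
        }
        scanner.close();
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumInts("3 apples and 4 pears, 10 total")); // 17
    }
}
```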

Miscellaneous

A few other miscellaneous features worth noting:

java.lang.StringBuilder
represents an unsynchronized mutable sequence of characters useful as a replacement for StringBuffer in single-threaded applications
Bit Manipulation
the numeric wrapper classes (Integer, etc.) now support common bit manipulation operations
Thread State
an enumerated type Thread.State represents the various possible states a thread may have, and is returned by the new Thread.getState() method for purposes of monitoring thread state
System.nanoTime()
a new method for measuring elapsed time with more precise granularity than was previously available, where supported by the underlying operating system
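
The following sketch touches each of these in turn. MiscSketch and its methods are hypothetical names for illustration, and the bit manipulation calls shown are only a few of those added to Integer and Long.

```java
public class MiscSketch {

    /** StringBuilder has the same API as StringBuffer, minus the synchronization. */
    public static String build() {
        StringBuilder sb = new StringBuilder();
        sb.append("J2SE").append(' ').append(1.5);
        return sb.toString();
    }

    /** Times a task with System.nanoTime(); only differences between readings are meaningful. */
    public static long elapsedNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        System.out.println(build());                          // J2SE 1.5

        System.out.println(Integer.bitCount(255));            // 8  (number of one bits)
        System.out.println(Integer.highestOneBit(100));       // 64
        System.out.println(Integer.rotateLeft(1, 4));         // 16
        System.out.println(Long.numberOfTrailingZeros(8L));   // 3

        Thread thread = new Thread();
        System.out.println(thread.getState());                // NEW (not yet started)

        long nanos = elapsedNanos(new Runnable() {
            public void run() {
                build();
            }
        });
        System.out.println("build() took " + nanos + " ns");
    }
}
```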
