Saturday, March 17, 2018

Raw String Literals Coming to Java

It appears likely that "raw string literals" are coming to Java. JEP 326 ("Raw String Literals") started as Issue JDK-8196004 and was announced as a "new JEP candidate" on March 2. The JEP and associated issue point out that "Java remains one of a small group of contemporary programming languages that do not provide language-level support for raw strings." The JEP and associated issue specifically reference programming languages C, C++, C# ("verbatim"), Dart, Go, Groovy, Haskell, JavaScript, Kotlin, Perl, PHP, Python, R, Ruby, Scala and Swift and the "Unix tools" bash, grep, and sed that were "surveyed for their delimiters and use of raw and multi-line strings."

JEP 326's "Summary" provides an overview of the proposed Java raw string literals: "A raw string literal can span multiple lines of source code and does not interpret escape sequences, such as \n, or Unicode escapes, of the form \uXXXX." The "Motivations" section of this JEP adds, "This JEP proposes a new kind of literal, a raw string literal, which sets aside both Java escapes and Java line terminator specifications, to provide character sequences that under many circumstances are more readable and maintainable than the existing traditional string literal." JEP 326 does not introduce interpolation and, in fact, rules it out in its "Non-Goals" section: "Raw string literals do not directly support string interpolation. Interpolation may be considered in a future JEP."

Multi-line String literals have long been desired in Java. JEP 326 ("Raw String Literals") currently lists several examples of how raw string literals would make it easier to implement common things in Java and these example uses include multi-line strings, operating system file paths, regular expressions, relational database SQL statements, and polygot (Java+JavaScript).

The current version of JEP 326 states that Java's raw string literals will be denoted via the use of the "backtick" character (`), which is also described in the JEP as \u0060 (Unicode "Grave Accent"), "backquote", and "accent grave". I don't show any examples of the proposed syntax because the JEP already does a nice job of listing these proposed raw string literal examples alongside examples of traditional Java code needed to implement the same thing. This makes it easy to compare the required current syntax to what would be needed in the future to accomplish the same thing if raw string literals are supported.

Support for raw string literals in Java will provide nice convenience for Java developers wishing to write more readable code to support use cases like those described in the JEP. It will provide similar advantages to libraries and even to the JDK code. The core-lib-devs mailing list post "Raw String Literal Library Support" [JDK-8196005] starts a "discussion with regards to RSL library support." (The context of "library support" in this case is the JDK and RSL stands for Raw String Literal.).

In the referenced post Raw String Literal Library Support, Jim Laskey provides a list of methods to potentially add to String to take advantage of raw string literals. These ideas for kicking off discussion include "line support", enhancements to "trim" methods, "margin management", and "escape management". Some of these are facilitated by RSL while others are necessitated by RSL. The cited post provides multiple examples of each of these.

Issue JDK-8198986 points out that "a new JLS section is needed for raw string literals." This issue links to a currently proposed section to be added to the cited Java Language Specification.

Although JEP 326 is still just a "Candidate" and is not associated with a particular release of Java, recent work on it and recent discussion in mailing list posts seeking input related to it lead me to be cautiously optimistic that we'll see multi-line Java strings and other raw string literals coming to Java in a future release.

Tuesday, March 13, 2018

Updates on JavaFX, Valhalla, Data Classes, and Java's Floating-Point

There have been some interesting posts this week and in recent weeks that provide more insight into the future of Java and the JDK.

JavaFX Removed from JDK with JDK 11

In the blog post "The Future of JavaFX and Other Java Client Roadmap Updates," Oracle's Donald Smith announced that "starting with JDK 11, Oracle is making JavaFX easier to adopt by making the technology available as a separate download, decoupled from the JDK." Smith provides a brief history of JavaFX and discusses other motivations for the decision to decouple JavaFX from the JDK.

The "The Future of JavaFX and Other Java Client Roadmap Updates blog post concludes with a reference to the March 2018 Oracle whitepaper "Java Client Roadmap Update." That referenced whitepaper describes "the Java Client" as "consist[ing] of Java Deployment (Applets and Web Start) and Java UI (Swing, AWT and JavaFX) technologies" and then "provides an overview of the current roadmap and recommendations for each technology." About Swing/AWT, this whitepaper states, "Swing and AWT will continue to be supported on Java SE 8 through at least March 2025, and on Java SE 11 (18.9 LTS) through at least September 2026."


Project Valhalla hosts some of the concepts and ideas I most excitedly anticipate for future versions of Java. I have long wanted reified generics (still only listed on the Valhalla Wiki page as "possibly other related topics" to be considered in the future), but also look forward to "value objects." I have written before, in the post "The Value in Project Valhalla," about what makes value objects so exciting.

In a recent post on the core-libs-dev mailing list, Brian Goetz wrote about how Valhalla might one day address "an obvious gap between primitive and reference types in Java." Goetz describes how Java makes it easy to add libraries, but that "the primitive types we have are fixed and we effectively cannot make more." Goetz states that Valhalla's solution to this gap is the introduction of "value types" that "are like classes" in some characteristics, but "are like primitives" in terms of other characteristics. He sums this up as, "Codes like a class, works like an int." Most of this was not new to me thanks to Goetz's and others' earlier posts and articles on value types. However, Goetz points out in this mailing list post a "now obvious" to me consequence of these value types: "Value types let us write new numeric (and other) types as library classes, and get the performance characteristics of primitives." This concept makes me wish we had value types yesterday!

Goetz provides a very rough idea of the timeframe for value types in a future version of the JDK. He writes, "Project Valhalla has been rolling for a few years, and will likely go for a few more until we reach the point where this is practical -- and more after that before value types can fit cleanly into the generic type system." This statement dampens my hopes a bit, but does seem realistic given the magnitude and broad scope of the changes involved.

The "Valhalla EG notes" dated 28 February 2018 provide insight to the current state of Valhalla.

Floating-Point in Java

The Goetz post on how Valhalla might someday allow Java developers to write "their favorite numeric type" as "ordinary user-written libraries" when not already "built into the platform" exists in the context of a mailing list thread about a request user request and statement that "the java language needs to be changed" in relation to "arithmetic underflow and overflow" and "arithmetic approximation" associated with Java's floating-point types. What caught my particular attention (besides Goetz's reference to Valhalla) in this thread is an excellent overview of Java's handling of float-point types provided by Joe Darcy.

In this mailing list message (which would make an outstanding blog post or article), Darcy concisely and thoroughly (hard to do at the same time) lays out the challenges associated with representation of and calculation using floating-point approximations "with pragmatic compromises to facilitate calculation on computers." This post is much more accessible to most of us developers than the 70+ pages of the classic article "What Every Computer Scientist Should Know About Floating-Point Arithmetic."

Darcy also provides guidance on what to expect in future versions of Java's handling of floating-points. He writes (I added the emphasis):

There are no prospects for a fundamental redefinition of the floating-point semantics of the Java language and VM. It is possible a faster and looser mode will be defined at some point, but altering the default is extremely unlikely. Long-term, decimal-based arithmetic (and other kinds of arithmetic) may get better support as a consequence of the features in Valhalla.

The Darcy post is worth reading for its overview of floating-point computations in binary-based computer systems in general and in Java in particular. Darcy concludes the post with a paragraph on expected etiquette on the OpenJDK mailing lists.

Data Classes

In a February 2018 post titled "Data Classes for Java", Goetz "explores possible directions for data classes in the Java Language." Goetz contrasts data classes versus value types (such as pointing out that data classes have identity while value types sacrifice identity to reduce overhead). He also explains why data classes are preferable to tuples and to the "only syntax-deep" "compact syntactic forms for modeling data-oriented classes" such as those provided by Scala (case) and Kotlin (data). Much of this post is on considerations that must be made when designing and implementing data classes in Java.

Tuesday, February 27, 2018

JDK 11 Early Access Builds and Renaming "Incubating" as "Preview"

Mark Reinhold posted "JDK 11 early-access builds" on the jdk-dev mailing list in which he wrote, "JDK 11 EA builds, under both the GPL and Oracle EA licenses, are now available at" Reinhold adds, "These builds include JEP 320 (Remove the Java EE and CORBA Modules), so they're significantly smaller (nine fewer modules, 22 fewer megabytes on Linux/x64)."

In the post "Rename 'incubating' language/VM features to 'preview'" on the same jdk-dev mailing list, Alex Buckley opens with, "Members of the Expert Group for JSR 384 (Java SE 11) have indicated to the Spec Lead that they support the goal of releasing non-final language/VM features in order to gain developer feedback, but also feel that the "incubating" terminology is too confusing." Buckley differentiates between "incubating APIs" (JEP 11) and "incubating language/VM features" and proposes renaming "incubating language/VM features" as "'preview language/VM features' for a number of reasons." Buckley concludes the message with related proposals to "[drop] the operand to --preview" and to "[allow] preview features only when --release is specified."

Monday, February 26, 2018

Java May Use UTF-8 as Its Default Charset

Because Java-based applications are often used in a wide variety of operating systems and environments, it is not uncommon for Java developers to run into issues related to character-based input and output. Blog posts covering these issues include The Policeman's Horror: Default Locales, Default Charsets, and Default Timezones; Annotating JDK default data; Encoding issues: Solutions for linux and within Java apps; Silly Java Strings; Java: a rough guide to character encoding; and this post with too long of a title to list here.

Several enhancements have been made to Java over the years to reduce these issues, but there are still sometimes issues when the default charset is implicitly used. The book Java Puzzlers features a puzzle (Puzzle #18) describing the quirkiness related to the "vagaries of the default charset" in Java.

With all of these issues related to Java's default charset, the presence of the draft JEP "Use UTF-8 as default Charset" (JDK-8187041) is welcome. In addition to potentially resolving issues related to the default charset, this JEP already provides a nice overview of what these issues are and alternatives to deal with these issues today. The JEP's "Motivation" section currently summarizes why this JEP is significant: "APIs that use the default charset are a hazard for developers that are new to the Java platform" and "are also a bugbear for experienced developers."

The issues with "default" charset are complicated by different uses of charsets and by different approaches currently available in the JDK APIs that lead to more than one "default." Here is a breakdown of the issues to be considered.

The draft JEP "Use UTF-8 as default Charset" will help address the issues related to different types of "default" when it comes to charset used by default for reading and writing file contents. For example, it will remove the potential conflict that could arise from writing a file using a method that uses the platform default and reading that file from a method that always uses UTF-8 regardless of the platform default charset. Of course, this is only a problem in this particular case if the platform default is NOT UTF-8.

The following Java code is a simple class that prints out some of the settings related to charsets.

Displaying Default Charset Details

package dustin.examples.charset;

import java.nio.charset.Charset;
import java.util.Locale;

import static java.lang.System.out;

 * Demonstrate default Charset-related details.
public class CharsetDemo
    * Supplies the default encoding without using Charset.defaultCharset()
    * and without accessing System.getProperty("file.encoding").
    * @return Default encoding (default charset).
   public static String getEncoding()
      final byte [] bytes = {'D'};
      final InputStream inputStream = new ByteArrayInputStream(bytes);
      final InputStreamReader reader = new InputStreamReader(inputStream);
      final String encoding = reader.getEncoding();
      return encoding;

   public static void main(final String[] arguments)
      out.println("Default Locale:   " + Locale.getDefault());
      out.println("Default Charset:  " + Charset.defaultCharset());
      out.println("file.encoding;    " + System.getProperty("file.encoding"));
      out.println("sun.jnu.encoding: " + System.getProperty("sun.jnu.encoding"));
      out.println("Default Encoding: " + getEncoding());

The next screen snapshot shows the results of running this simple class on a Windows 10-based laptop without explicitly specifying either charset-related system property, with specification only of the file.encoding system property, and with specification of both system properties file.encoding and sun.jnu.encoding.

The image just shown demonstrates the ability to control the default charsets via properties. It also demonstrates that, for this Windows environment with a locale of en_US, the default charset for both file contents and file paths is windows-1252 (Cp1252). If the draft JEP discussed in this post is implemented, the default charset for file contents will be changed to UTF-8 even for Windows.

There is the potential for significant breakage in some applications when the default charset is changed to be UTF-8. The draft JEP talks about ways to mitigate this risk including early testing for an application's susceptibility to the change by explicitly setting the system property file.encoding to UTF-8 beforehand. For cases where it is necessary to keep the current behavior (using a system-determined default charset rather than always using UTF-8), the current version of the draft JEP proposes supporting the ability to specify -Dfile.encoding=SYSTEM.

The JEP is in draft currently and is not associated with any particular JDK version. However, based on recent posts on the JDK mailing lists, I am optimistic that we'll see UTF-8 as the default charset in a future version of the JDK in the not too distant future.

Saturday, February 24, 2018

Prefer System.lineSeparator() for Writing System-Dependent Line Separator Strings in Java

JDK 7 introduced a new method on the java.lang.System class called lineSeparator(). This method does not expect any arguments and returns a String that represents "the system-dependent line separator string." The Javadoc documentation for this method also states that System.lineSeparator() "always returns the same value - the initial value of the system property line.separator." It further explains, "On UNIX systems, it returns "\n"; on Microsoft Windows systems it returns "\r\n"."

Given that a Java developer has long been able to use System.getProperty("line.separator") to get this system-dependent line separator value, why would that same Java developer now favor using System.lineSeparator instead? JDK-8198645 ["Use System.lineSeparator() instead of getProperty("line.separator")"] provides a couple reasons for favoring System.lineSeparator() over the System.getProperty(String) approach in its "Description":

A number of classes in the base module use System.getProperty("line.separator") and could use the more efficient System.lineSeparator() to simplify the code and improve performance.

As the "Description" in JDK-8198645 states, use of System.lineSeparator() is simpler to use and more efficient than System.getProperty("line.separator"). A recent message on the core-libs-dev mailing list provides more details and Roger Riggs writes in that message that System.lineSeparator() "uses the line separator from System instead of looking it up in the properties each time."

The performance benefit of using System.lineSeparator() over using System.getProperty("line.separator") is probably not all that significant in many cases. However, given its simplicity, there's no reason not to gain a performance benefit (even if tiny and difficult to measure in many cases) while writing simpler code. One of the drawbacks to the System.getProperty(String) approach is that one has to ensure that the exactly matching property name is provided to that method. With String-based APIs, there's always a risk of spelling the string wrong (I have seen "separator" misspelled numerous times as "seperator"), using the wrong case, or accidentally introducing other typos that prevent exact matches from being made.

The JDK issue that introduced this feature to JDK 7, JDK-6900043 ("Add method to return line.separator property"), also spells out some benefits in its "Description": "Querying the line.separator value is a common occurrence in large systems. Doing this properly is verbose and involves possible security failures; having a method return this value would be beneficial." Duplicate JDK-6264243 ("File.lineSeparator() to retrieve value of commonly used 'line.separator' system property") spells out advantages of this approach with even more detail and lists "correctness", "performance", and "ease of use and cross-platform development" as high-level advantages. Another duplicate issue, JDK-6529790 ("Please add LINE_SEPARATOR constant into System or some other class"), makes the point that there should be a "constant" added to "some standard Java class, as String or System," in a manner similar to that provided for file separators by File.pathSeparator.

One of the messages associated with the JDK 7 introduction of System.lineSeparator() justifies it additions with this description:

Lots of classes need to use System.getProperty("line.separator"). Many don't do it right because you need to use a doPrivileged block whenever you read a system property. Yet it is no secret - you can divine the line separator even if you have no trust with the security manager.

An interesting side note related to the addition of System.lineSeparator() in JDK 7 is that the Javadoc at that time did not indicate that the method was new to JDK 7. JDK-7082231 ("Put a @since 1.7 on System.lineSeparator") addressed this in JDK 8 and two other JDK issues (JDK-8011796 and JDK-7094275) indicate that this was desired by multiple Java developers.

The introduction of System.lineSeparator() was a very small enhancement, but it does improve the safety and readability of a relatively commonly used API while not reducing (and, in fact, while improving) performance.

Wednesday, February 21, 2018

JDK 10: FutureTask Gets a toString()

I've felt for a long time that, for most Java classes that have distinguishing attributes, developers should take the time to override Object.toString(), even if it's just with an IDE-generated implementation or using a library class such as Apache Commons Lang's ToStringBuilder. The overloaded Objects.toString() methods also make this easier than ever if one wants to implement toString by hand. The JDK class FutureTask, introduced with J2SE 5, finally gets its own toString() implementation in JDK 10.

Richard Nichols's 2012 post "How to get the running tasks for a Java Executor..." highlights the omission of a toString() method on the FutureTask class. He wrote:

It seems odd that the API doesn't include any way to gather info about what's happening inside the Executor, and also, there's not even a toString() implementation for wrapping classes like FutureTask which would bubble your Runnable or Callable classes' toString() methods.

Nichols's post is in the context of his observation that "it's quite difficult to actually expose at run-time what ... Java's Executor is actually doing at any point in time."

Issue JDK-8186326 ["Make toString() methods of "task" objects more useful"] talks about aligning FutureTask toString() with that of CompletableFuture, which the issue states "already has a useful toString method, giving the current status." An e-mail thread in late 2017 documents the discussions around the addition of toString() to FutureTask and other "task classes in j.u.c." (java.util.concurrent).

The Javadoc comments for the new FutureTask.toString() method state, "The default implementation returns a string identifying this FutureTask, as well as its completion state. The state, in brackets, contains one of the strings 'Completed Normally', 'Completed Exceptionally', 'Cancelled', or 'Not completed'." Three of these four potential completion states for FutureTask's toString() are also potentially written as part of CompletableFuture's toString() ["Cancelled" is the exception].

The addition of a specific implementation of toString() to the FutureTask class in JDK 10 is a small one. However, for a developer "staring at output of toString for 'task' objects (Runnables, Callables, Futures) when diagnosing app failures" as described in JDK-8186326's "Problem" statement, this "small" addition is likely to be very welcome.

Tuesday, February 20, 2018

JDK 10: Accessing Java Application's Process ID from Java

A popular question on is, "How can a Java program get its own process ID?" There are several answers associated with that question that include parsing the String returned by ManagementFactory.getRuntimeMXBean().getName() [but that can provide an "arbitrary string"], using ProcessHandle.getPid() [JEP 102], using Java Native Access (JNA), using System Information Gatherer And Reporter (SIGAR), using JavaSysMon, using Java Native Runtime - POSIX, parsing the results of jps (or jcmd) via invocation of Runtime.getRuntime().exec(String), and other approaches. JDK 10 introduces perhaps the easiest approach of all for obtaining a JVM process's PID via a new method on the RuntimeMXBean.

JDK-8189091 ("MBean access to the PID") introduces the RuntimeMXBean method getPid() as a default interface method with JDK 10. That issue states the "Problem" as: "The platform MBean does not provide any API to get the process ID of a running JVM. Some JMX tools rely on the hotspot implementation of RuntimeMXBean::getName which returns < pid >@< hostname >." The issue also provides the "Solution": "Introduced new API, so that JMX tools can directly get process ID instead of relying on the implementation detail, RuntimeMXBean#getName().split("@")[0]."

The next code listing is a simple one and it demonstrates use of this new getPid() method on RuntimeMXBean.

Using JDK 10's RuntimeMXBean.getPid()

final RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
final long pid = runtime.getPid();
final Console console = System.console();
out.println("Process ID is '" + pid + "' Press <ENTER> to continue.");

When the code above is contained within an executable main(String[]) function and that function is executed from the command line, the output is as shown in the next screen snapshot (which also includes a separate terminal used to verify the PID is correct via jcmd).

The process ID is provided as a long and no parsing of an "arbitrary string" is necessary. This approach also does not require a third-party library or elaborate code to determine the current Java process's identifier.

This post has provided a brief introduction to what will perhaps be the easiest approach for a Java application (written with JDK 10 or later) to determine its own underlying process ID.