Just a few minutes for a simple exercise: showing how a Java enum can be used to implement a simple state machine. It is also a chance to give my latest HTML editor for the blog a quick run...
Consider the problem of parsing an XHTML document to split it into three parts:
- a prolog, that is the portion from the beginning up to and including the <body> tag;
- a body, that is the portion between the <body> and </body> tags;
- an epilog, that is the portion from the </body> tag (included) to the end.
For the sake of simplicity, I assume that the XHTML document is already well formatted, so that the <body> and </body> tags each sit on a line of their own and the file can be analysed with simple string manipulation instead of an XML API.
The above requirements can be described by means of this unit test:
    public class HtmlDocumentTest {
        private static final String ORIGINAL_PROLOG = "<html>\n"
                + "<head><meta name=\"prolog\"/></head>\n"
                + "<body>\n";

        private static final String ORIGINAL_BODY = "body\n";

        private static final String ORIGINAL_EPILOG = "</body>\n"
                + "</html>\n";

        @Test
        public void must_properly_create_from_text() {
            // given
            final String text = ORIGINAL_PROLOG + ORIGINAL_BODY + ORIGINAL_EPILOG;
            // when
            final HtmlDocument result = HtmlDocument.createFromText(text);
            // then
            assertThat(result.getProlog(), is(ORIGINAL_PROLOG));
            assertThat(result.getBody(), is(ORIGINAL_BODY));
            assertThat(result.getEpilog(), is(ORIGINAL_EPILOG));
        }
    }
Seen as a state machine, the problem has three states: PROLOG, BODY and EPILOG. The transition from the first to the second is triggered when <body> is seen, and the transition from the second to the third when </body> is seen. The data part of the state can be modelled by three StringBuilders, in which the three portions of the text are accumulated.
The states can be implemented by a polymorphic enum with a single method, which receives the input line and the data part of the state (the three StringBuilders) and returns the next state.
    enum State {
        PROLOG {
            @Override
            State process(final @Nonnull String line,
                          final @Nonnull StringBuilder prologBuilder,
                          final @Nonnull StringBuilder bodyBuilder,
                          final @Nonnull StringBuilder epilogBuilder) {
                // The line with the opening tag still belongs to the prolog.
                prologBuilder.append(line).append("\n");
                return line.contains("<body") ? BODY : PROLOG;
            }
        },
        BODY {
            @Override
            State process(final @Nonnull String line,
                          final @Nonnull StringBuilder prologBuilder,
                          final @Nonnull StringBuilder bodyBuilder,
                          final @Nonnull StringBuilder epilogBuilder) {
                // The line with the closing tag already belongs to the epilog.
                final boolean containsEndBody = line.contains("</body");
                (containsEndBody ? epilogBuilder : bodyBuilder).append(line).append("\n");
                return containsEndBody ? EPILOG : BODY;
            }
        },
        EPILOG {
            @Override
            State process(final @Nonnull String line,
                          final @Nonnull StringBuilder prologBuilder,
                          final @Nonnull StringBuilder bodyBuilder,
                          final @Nonnull StringBuilder epilogBuilder) {
                // Final state: everything else goes to the epilog.
                epilogBuilder.append(line).append("\n");
                return EPILOG;
            }
        };

        abstract State process(@Nonnull String line,
                               @Nonnull StringBuilder prologBuilder,
                               @Nonnull StringBuilder bodyBuilder,
                               @Nonnull StringBuilder epilogBuilder);
    }
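To look at the transitions in isolation, here is a minimal sketch that drops the StringBuilders and only records the state returned after each line. The names ParseState, next() and TransitionTrace are hypothetical and exist only for this illustration; the trigger strings are the same ones used in the enum above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, data-free variant of the state enum: transitions only.
enum ParseState {
    PROLOG {
        @Override
        ParseState next(final String line) {
            return line.contains("<body") ? BODY : PROLOG;
        }
    },
    BODY {
        @Override
        ParseState next(final String line) {
            return line.contains("</body") ? EPILOG : BODY;
        }
    },
    EPILOG {
        @Override
        ParseState next(final String line) {
            return EPILOG; // final state, never left
        }
    };

    abstract ParseState next(String line);
}

final class TransitionTrace {
    // Returns the state the machine is in after consuming each line.
    static List<ParseState> trace(final String text) {
        final List<ParseState> states = new ArrayList<>();
        ParseState state = ParseState.PROLOG;
        for (final String line : text.split("\\n")) {
            state = state.next(line);
            states.add(state);
        }
        return states;
    }

    public static void main(final String[] args) {
        final String text = "<html>\n<head><meta name=\"prolog\"/></head>\n<body>\n"
                + "body\n"
                + "</body>\n</html>\n";
        // prints [PROLOG, PROLOG, BODY, BODY, EPILOG, EPILOG]
        System.out.println(trace(text));
    }
}
```

The trace makes the asymmetry visible: the machine is already in BODY after the <body> line, but that line was consumed while still in PROLOG, which is why it ends up in the prolog, while the </body> line is consumed in BODY and routed to the epilog.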
The state machine itself, on these premises, can be implemented with this simple for loop:
    @Nonnull
    public static HtmlDocument createFromText(final @Nonnull String text) {
        final StringBuilder prologBuilder = new StringBuilder();
        final StringBuilder bodyBuilder = new StringBuilder();
        final StringBuilder epilogBuilder = new StringBuilder();
        State state = State.PROLOG;

        for (final String line : text.split("\\n")) {
            state = state.process(line, prologBuilder, bodyBuilder, epilogBuilder);
        }

        return new HtmlDocument(prologBuilder.toString(),
                                bodyBuilder.toString(),
                                epilogBuilder.toString());
    }
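One detail the loop relies on is how String.split() treats the end of the text: trailing empty strings are dropped, and each state appends a "\n" after every line. The sketch below isolates that split-and-rejoin step; SplitDemo and rejoin() are hypothetical names used only here. It suggests that a text ending with a single newline survives the round trip unchanged, while other endings are normalised.

```java
import java.util.Arrays;

final class SplitDemo {
    // Splits on newlines and re-appends "\n" after each line, as the states do.
    static String rejoin(final String text) {
        final StringBuilder builder = new StringBuilder();
        for (final String line : text.split("\\n")) {
            builder.append(line).append("\n");
        }
        return builder.toString();
    }

    public static void main(final String[] args) {
        // Trailing empty strings are removed by split(): prints [a, b]
        System.out.println(Arrays.toString("a\nb\n".split("\\n")));
        // A single trailing newline is preserved: prints true
        System.out.println(rejoin("a\nb\n").equals("a\nb\n"));
        // A missing trailing newline is added: prints false
        System.out.println(rejoin("a\nb").equals("a\nb"));
        // Trailing blank lines collapse into one newline: prints true
        System.out.println(rejoin("a\n\n").equals("a\n"));
    }
}
```

For the text used in the unit test, which ends with exactly one newline, the three parts therefore concatenate back to the original text.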
The whole working source code can be found in the classes HtmlDocument and HtmlDocumentTest at BitBucket.