Quick regex tips

This is a quick note to myself: I use regular expressions frequently, but I often hesitate because I'm unsure of the correct syntax.

I've recently picked up again some code that had lain substantially untouched for years, and I'm refactoring it. In many cases I'm changing the syntax of a core library, which requires a corresponding update in the other projects that use it. Regular expressions can perform the required changes in batch mode.

Let's start with a simple example. I have a utility class, Key, an object associated with a name and a type. The old syntax was:

Key<String> property = new Key<String>("property") {};

The trick was using the curly braces to create an anonymous subclass, from which the generic type information can be retrieved by reflection. I've since changed my mind, and the new syntax is the more conventional:

Key<String> property = Key.of("property", String.class);

The search regular expression and the replacement string for the job are:

new Key<([A-Z][A-Za-z]+)>\((.*)\) \{\};

and

Key.of($2, $1.class);
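As a quick sanity check, the same pair can be tried out in plain Java before running it in batch over a project. This is only a minimal sketch: the class name KeyRewrite and the method rewrite are made up for illustration, and it assumes the old declaration spells out the type argument and ends with a semicolon, which is what the search pattern requires.

```java
import java.util.regex.Pattern;

public class KeyRewrite {
    // Search pattern from the text, with backslashes doubled for a Java string literal.
    // Group 1 captures the type name, group 2 the constructor argument(s).
    private static final Pattern OLD_KEY =
            Pattern.compile("new Key<([A-Z][A-Za-z]+)>\\((.*)\\) \\{\\};");

    // Hypothetical helper: rewrites every old-style Key creation found in the given source text.
    public static String rewrite(String source) {
        return OLD_KEY.matcher(source).replaceAll("Key.of($2, $1.class);");
    }

    public static void main(String[] args) {
        String oldLine = "Key<String> property = new Key<String>(\"property\") {};";
        // prints: Key<String> property = Key.of("property", String.class);
        System.out.println(rewrite(oldLine));
    }
}
```

Note that the type pattern [A-Z][A-Za-z]+ only matches simple class names, so a key with a nested generic type or with digits in the type name would be left untouched and would need manual handling.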

Now a slightly more complex thing. For reasons that I'm not going to explain here, in certain cases I define a "shortcut" for referring to some interfaces, such as:

public interface Foo
  {
    public static final Class<Foo> Foo = Foo.class;
    // ...
  }

In short, this lets me write expressions such as as(Foo) in place of as(Foo.class). When Java 8 introduced static methods in interfaces, static imports for my idiom often became ambiguous (are we importing the shortcut or a class name?), so I decided to change the pattern to:

public static final Class<Foo> _Foo_ = Foo.class;

The search regular expression and the replacement string for changing shortcuts are:

public static final Class\<([A-Z][A-Za-z]+)\> \1 \= \1\.class;

and

public static final Class<$1> _$1_ = $1.class;

Here the trick is using a backreference (\1) inside the search regex itself: the field name and the Foo after the equals sign must both be identical to the Foo captured inside the angle brackets.
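To see the backreference at work, here is a small Java sketch. The class name ShortcutRewrite and the method rewrite are hypothetical, and the sketch assumes the declaration is written in the public static final Class<Foo> form that the search pattern expects. A line where the field name differs from the type is deliberately left alone.

```java
import java.util.regex.Pattern;

public class ShortcutRewrite {
    // Search pattern from the text. \1 is a backreference: it matches only
    // the exact text previously captured by group 1 (the type name).
    private static final Pattern OLD_SHORTCUT = Pattern.compile(
            "public static final Class\\<([A-Z][A-Za-z]+)\\> \\1 \\= \\1\\.class;");

    // Hypothetical helper: renames the shortcut field from Foo to _Foo_.
    public static String rewrite(String source) {
        return OLD_SHORTCUT.matcher(source)
                .replaceAll("public static final Class<$1> _$1_ = $1.class;");
    }

    public static void main(String[] args) {
        // prints: public static final Class<Foo> _Foo_ = Foo.class;
        System.out.println(rewrite("public static final Class<Foo> Foo = Foo.class;"));
        // The field name Bar differs from the captured Foo, so nothing matches and
        // this prints the line unchanged: public static final Class<Foo> Bar = Foo.class;
        System.out.println(rewrite("public static final Class<Foo> Bar = Foo.class;"));
    }
}
```

The backslashes before <, > and = are harmless in java.util.regex, which treats an escaped non-alphabetic character as a literal; other regex engines may interpret \< and \> as word boundaries, so the escapes are best dropped there.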

To change the import the same trick is used:

(import static .*\.)([A-Z][A-Za-z]+)\.\2;

and

$1$2._$2_;
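The import rewrite can be checked in the same way. In this sketch the class ImportRewrite, the method rewrite and the package com.example are invented for illustration; only the pattern and the replacement string come from the text above.

```java
import java.util.regex.Pattern;

public class ImportRewrite {
    // Group 1 captures "import static <package>.", group 2 the interface name;
    // \2 then requires the imported member to have exactly the same name.
    private static final Pattern OLD_IMPORT =
            Pattern.compile("(import static .*\\.)([A-Z][A-Za-z]+)\\.\\2;");

    // Hypothetical helper: points the static import at the renamed _Foo_ field.
    public static String rewrite(String source) {
        return OLD_IMPORT.matcher(source).replaceAll("$1$2._$2_;");
    }

    public static void main(String[] args) {
        // prints: import static com.example.Foo._Foo_;
        System.out.println(rewrite("import static com.example.Foo.Foo;"));
        // An ordinary static import does not repeat the class name, so it is
        // printed unchanged: import static java.util.Objects.requireNonNull;
        System.out.println(rewrite("import static java.util.Objects.requireNonNull;"));
    }
}
```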
