
Java Regular Expressions 181

Simon P. Chappell writes "Regular expressions (regex to their friends) are an incredibly powerful addition to most programmers' personal toolkit of techniques. Programming in a language that doesn't support them can be frustrating if you need to do any amount of non-trivial string handling. Java was just such a language until the release of the 1.4.x series. Sure, there were libraries like ORO that would provide regex support, but it wasn't built in and not many companies allow the use of 3rd party libraries. With version 1.4.x, the corporate Java developer in the trenches received the power of regular expression pattern matching." Read the rest of Simon's review.
Java Regular Expressions
Author: Mehran Habibi
Pages: 255 (7-page index)
Publisher: Apress
Rating: 8/10
Reviewer: Simon P. Chappell
ISBN: 1590591070
Summary: A great starter for using regular expressions in Java


The book seems targeted at those who have solid Java programming skills but who have not yet used the java.util.regex package. I see two types of Java programmers who might not have used the regex package: those who do not know about regular expressions, and those who know them but have not yet used them within Java. This book should satisfy both sets of users. The first group will benefit from the general introduction to regular expressions and the gentle introduction to using them within Java. The latter group will benefit from the more advanced material in the book.

The book is nicely structured and progresses easily through its subject matter. The first chapter is an introduction to regular expressions. While this is most obviously for the readers new to the subject, it will be useful for those more experienced, because not all regex engines are created equal and this chapter lays out the particular dialect of regular expressions used by the Java 1.4.x regex engine. The second chapter introduces the object model used by java.util.regex. This gives detailed explanations of the Pattern and Matcher objects as well as the new regular expression methods added to the standard String class.
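
For readers who have not met the java.util.regex object model yet, here is a minimal sketch of the pieces named above. It is my own illustration, not code from the book; the class name RegexBasics and the sample strings are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexBasics {
    // A Pattern is the compiled form of a regex; it is immutable and reusable.
    private static final Pattern WORD = Pattern.compile("\\w+");

    // A Matcher walks one input string; find() advances to the next match.
    static List<String> words(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(words("regex to their friends"));
        // The regex methods on String take the pattern as a plain string.
        System.out.println("2006-08-02".matches("\\d{4}-\\d{2}-\\d{2}"));
        System.out.println("a  b   c".replaceAll("\\s+", " "));
        System.out.println("a, b,c".split("\\s*,\\s*").length);
    }
}
```

The String methods are convenient for one-off use; the explicit Pattern and Matcher pair is the better fit when the same regex is applied many times.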

The third chapter takes the reader into advanced regular expressions. While there is much that can be done using just the Pattern and Matcher objects, the path to the full power of regex travels through an understanding of groups (and subgroups) and qualifiers. Regex groups are hard to explain until you've seen them in action, whereupon you may find yourself wondering how you ever managed without them. Mr. Habibi does an excellent job, both explaining them and introducing us to the unusual noncapturing subgroups. (I'd never heard of these before.) Qualifiers are the other side of the same coin. While it's one thing to define a group and say whether it's expected and whether it should be captured, it's equally important to be able to describe how often that group is expected to occur, and that is what qualifiers do.
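
To make groups and qualifiers a little more concrete, here is a small sketch under my own assumptions. The class GroupDemo, the "key = value" input format, and the tag example are invented for illustration and are not taken from the book.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupDemo {
    // Group 1 captures the key and group 2 the value.
    // The middle group starts with "?:", so it groups without capturing.
    // {1,3} is a qualifier: between one and three digits.
    private static final Pattern PAIR =
            Pattern.compile("(\\w+)(?:\\s*=\\s*)(\\d{1,3})");

    // Returns {key, value} for input such as "pages = 255", or null on no match.
    static String[] parse(String input) {
        Matcher m = PAIR.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    // Greedy ".*" takes as much as it can; reluctant ".*?" takes as little.
    static String firstTag(String text, boolean greedy) {
        Pattern p = Pattern.compile(greedy ? "<.*>" : "<.*?>");
        Matcher m = p.matcher(text);
        return m.find() ? m.group() : null;
    }
}
```

Because the separator group is noncapturing, the value is group 2 rather than group 3, which keeps the group numbering tied to the data you actually want.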

Chapter four tackles the interesting challenges of using regex in an object-oriented language. Mr. Habibi describes the general principles of regex use as similar to those used with SQL through the JDBC interface. These principles are optimising connections, batching reads and writes, storing patterns externally, just-in-time compilation of patterns, and remembering that not every piece of String handling code needs to be written as a regex. All very useful advice.
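
One way to read the "compile late, compile once" advice is sketched below. This is my interpretation of the principle, not the book's code: the PatternCache class is hypothetical, and it uses ConcurrentHashMap.computeIfAbsent, which requires a much newer Java than the 1.4 release under review.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

final class PatternCache {
    private static final Map<String, Pattern> CACHE = new ConcurrentHashMap<>();

    // Compile a regex the first time it is asked for, then hand back
    // the same Pattern instance on every later request.
    static Pattern get(String regex) {
        return CACHE.computeIfAbsent(regex, Pattern::compile);
    }

    // Not everything needs a regex: a plain String method is enough here.
    static boolean isComment(String line) {
        return line.startsWith("#");
    }
}
```

The regex string passed to get could just as well come from a properties file, which would cover the "storing patterns externally" point.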

Chapter five is the big examples chapter. All of the examples are intended to be practical: the kind of thing you might have to address at the day job. With examples covering Zip codes, telephone numbers, dates, searching text files and even validating an EDI document, he seems to have delivered on that assertion. There are further examples in Appendix C, if the aforementioned patterns aren't enough.
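
To give a flavour of this kind of day-job pattern, here is a hedged sketch of a Zip code check and a date parser. The patterns and the BusinessPatterns class are my own simplified assumptions, not the versions developed in the book, and they check only the shape of the input (the date pattern would happily accept month 13).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BusinessPatterns {
    // Five digits, optionally followed by a hyphen and four more (ZIP+4).
    private static final Pattern US_ZIP = Pattern.compile("\\d{5}(?:-\\d{4})?");

    // yyyy-mm-dd with one capturing group per field.
    private static final Pattern ISO_DATE =
            Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    static boolean isZip(String s) {
        return US_ZIP.matcher(s).matches();
    }

    // Returns {year, month, day}, or null if the text is not shaped like a date.
    static int[] parseDate(String s) {
        Matcher m = ISO_DATE.matcher(s);
        if (!m.matches()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }
}
```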

The writing and progression of material are good. The examples are very well thought out and explained. Many of the examples are built from first principles. Mr. Habibi seems to want to not only teach you how to use regular expressions, but also how to design them. He does this by working up from an understanding of the data until he has a working regex.

While it doesn't make any promises about being an encyclopedia of regex patterns, this book does contain enough of the normal business patterns to be a useful initial reference work, before turning to the Internet to search for patterns.

If you want an encyclopedic reference work on regex, then buy Jeffrey Friedl's Mastering Regular Expressions, which is published by O'Reilly. This is not that book, preferring to stick with the practical usage of regex.

This is a great starter book for developers who are new to using regular expressions in Java.


You can purchase Java Regular Expressions from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
  • by happyfrogcow ( 708359 ) on Wednesday August 02, 2006 @04:49PM (#15834894)
    two slashes "\\" is nothing. the real PITA begins when you need to do "\\\\"

    effing java.
  • Re:Wha-wha-what? (Score:2, Insightful)

    by JoshDM ( 741866 ) on Wednesday August 02, 2006 @05:13PM (#15835073) Homepage Journal

    ...and not many companies allow the use of 3rd party libraries.
    Who are these companies and what can possibly be their justification for such a blanket policy?

    Actually there are a number of firms that contain multitudes of red tape that disable their employees from getting anything done without the barest of tools. I have witnessed major separations of "church and state" with these larger companies. This includes the company that did not allow the developers access to the servers, resulting in a system administrator who refused to allow a Java web server more powerful than JServ because he didn't know how to properly install Apache/Tomcat/JBoss/Whatever on Linux.

    More recently, it's a concern with larger companies that want "someone to blame" and "someone to call for support." These places use "Websphere" instead of "Eclipse and Tomcat" or "Oracle JDeveloper" instead of "Borland JBuilder". Wherever there is a "free" version of something that is supported by a community effort, there is a "pay" edition of that same item (usually 1-2 versions behind the curve) hosted by a company that sells support and takes the blame.

  • Re:Wrong way round (Score:4, Insightful)

    by smittyoneeach ( 243267 ) * on Wednesday August 02, 2006 @05:16PM (#15835096) Homepage Journal
    I would assert that if your input data are sufficiently irregular that you require a parser/lexical analyzer, you may have exceeded the bounds of "regular" expressions.
  • by Ryan Amos ( 16972 ) on Wednesday August 02, 2006 @06:21PM (#15835549)
    When speed matters

    ...you don't use Java.

    (I know, let the flames commence! :)

  • by mongus ( 131392 ) <aaron@mongus.com> on Wednesday August 02, 2006 @06:44PM (#15835687)
    I used to think the same thing. Back in '99 a guy I was working with would produce a regex and I had no idea what that strange looking thing did. I got a book on Perl and spent quite a bit of time wrapping my head around regular expressions. That's probably the only thing I retained from Perl because I really don't like the language. I started using the ORO package in Java to do regular expressions and switched to the standard library when it was introduced in 1.4. Java's syntax is nearly identical to Perl's.

    If you'll take the time to understand them you'll never go back to parsing strings yourself. They can make your code MUCH easier to maintain. There is a steep learning curve but they're well worth learning. Your code will be much more readable with a regular expression instead of lines and lines of code. Debugging is much easier too.

    Maybe you should give the reviewed book a shot. I can't comment on it as I've never read it but I do highly recommend learning regular expressions.
  • by Abcd1234 ( 188840 ) on Wednesday August 02, 2006 @06:58PM (#15835796) Homepage
    Am I the only one that finds it quite easy to get regexs right just by, you know, typing them in? If a regex fails for me, 99% of the time, it's because my input data is in a different format from what I expected. But I've almost never needed any kind of "explorer" tool... that smacks of "tweak it until it works", which is never a good idea, IMHO...
  • Re:Java sucks (Score:2, Insightful)

    by cowboy76Spain ( 815442 ) on Wednesday August 02, 2006 @07:00PM (#15835806)
    Apart from the fact that your code is the worst that you can write when using RegEx in Java (as pointed out by another post, RTFApi doc if you want to use Java properly), it amuses me that you are complaining that Java (a language designed for using strong OO and being multiplatform) is slower than Perl (a language designed for processing regular expressions).

    You could also have said that the Fire Department sucks because they are not good at catching burglars, or that the Police Department is full of losers because they cannot put out a fire. Myself, I will keep using the FD to deal with fire and the PD to deal with crimes.
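
On the "\\" versus "\\\\" complaint at the top of the thread: the doubling happens because the Java compiler and the regex engine each consume one level of backslash escaping. A minimal sketch follows; the BackslashDemo class and its method names are made up, and Pattern.quote arrived in Java 5, after the 1.4 release discussed in the review.

```java
import java.util.regex.Pattern;

final class BackslashDemo {
    // The regex \d+ has one backslash, so the Java literal is "\\d+".
    static boolean isDigits(String s) {
        return s.matches("\\d+");
    }

    // Matching one literal backslash needs the regex \\,
    // which means four backslashes in Java source.
    static String toForwardSlashes(String path) {
        return path.replaceAll("\\\\", "/");
    }

    // Pattern.quote builds the escaped regex for you.
    static String[] splitOnBackslash(String s) {
        return s.split(Pattern.quote("\\"));
    }
}
```

Letting Pattern.quote do the escaping tends to be easier to read than counting backslashes by hand.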