Archive for February, 2007
Framework Design Guidelines book
Posted by claudiolassala in Software Development on February 21, 2007
Good free book on Code Review
Posted by claudiolassala in Software Development on February 19, 2007
Studying for the 70-536 exam: the shortage of good books
Posted by claudiolassala in Software Development on February 15, 2007
Those Regular Expressions again…
Posted by claudiolassala in Software Development on February 13, 2007
I’ve been growing fonder every day of the following principle: Code Talks! The main idea is that well-written code is self-documenting and should not need a lot of inline comments to be understood.
That said, when I see code like the one below, I think to myself: “Well, if code talks, this one swears…” 🙂
public bool IsValidPhoneNumber(string number)
{
    return Regex.IsMatch(number, @"^\(?(\d{3})\)?[\s\-]?(\d{3})\-?(\d{4})$");
}
A while back I posted something about putting comments in regular expressions:
http://claudiolassala.spaces.live.com/Blog/cns!E2A4B22308B39CD2!117.entry
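For reference, .NET lets you embed comments in the pattern itself through `RegexOptions.IgnorePatternWhitespace`. Below is a minimal sketch of that idea applied to the phone number pattern above. The class name `CommentedPhoneRegex` is made up for illustration, and this is not necessarily what the linked post shows.

```csharp
using System.Text.RegularExpressions;

public static class CommentedPhoneRegex
{
    // With IgnorePatternWhitespace, unescaped whitespace in the pattern is ignored
    // and everything from an unescaped # to the end of the line is a comment.
    private static readonly Regex ValidPhoneNumber = new Regex(@"
        ^           # start of the string
        \(?         # optional opening parenthesis
        \d{3}       # exactly three digits
        \)?         # optional closing parenthesis
        [\s\-]?     # optional space or hyphen
        \d{3}       # exactly three digits
        \-?         # optional hyphen
        \d{4}       # exactly four digits
        $           # end of the string",
        RegexOptions.IgnorePatternWhitespace);

    public static bool IsValidPhoneNumber(string number)
    {
        return ValidPhoneNumber.IsMatch(number);
    }
}
```

The comments sit next to the pieces they describe, but the pattern is still one string that has to be read symbol by symbol.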
When I look back, that still seems a bit cryptic, though. Last week I was thinking: “How the heck can a developer do a code review when there’s a regular expression involved?” Given the sample code above, I’ve been thinking about splitting the RegEx into something that’s easier to read and, therefore, easier to review. The code would look something like this:
/// <summary>
/// Checks whether a given number is a valid phone number (according to the common format).
/// </summary>
/// <param name="number">The phone number.</param>
/// <returns>True if the number is valid, or false if it is invalid.</returns>
/// <remarks>
/// Examples of valid phone numbers:
/// (123)456-7890
/// (123) 456-7890
/// 123-456-7890
/// 1234567890
/// </remarks>
public bool IsValidPhoneNumber(string number)
{
    return Regex.IsMatch(number, REGEX_VALID_PHONE_NUMBER);
}
Notice that I’ve replaced the RegEx with a constant that is easier to read. That constant is defined as follows:
private const string REGEX_VALID_PHONE_NUMBER =
    MATCHES_BEGINNING +
    MATCHES_OPTIONAL_OPENING_PARENTHESIS +
    MATCHES_EXACTLY_THREE_NUMERIC_DIGITS +
    MATCHES_OPTIONAL_CLOSING_PARENTHESIS +
    MATCHES_OPTIONAL_SPACE_OR_HYPHEN +
    MATCHES_EXACTLY_THREE_NUMERIC_DIGITS +
    MATCHES_OPTIONAL_HYPHEN +
    MATCHES_STRING_ENDS_WITH_FOUR_NUMERIC_DIGITS_REQUIREMENT;
That’s a lot more verbose, but in this case I’d rather have something verbose than the cryptic RegEx. The other constants are defined like so:
private const string MATCHES_BEGINNING = "^";
private const string MATCHES_OPTIONAL_OPENING_PARENTHESIS = @"\(?";
private const string MATCHES_EXACTLY_THREE_NUMERIC_DIGITS = @"\d{3}";
private const string MATCHES_OPTIONAL_CLOSING_PARENTHESIS = @"\)?";
private const string MATCHES_OPTIONAL_SPACE_OR_HYPHEN = @"[\s\-]?";
private const string MATCHES_OPTIONAL_HYPHEN = @"\-?";
private const string MATCHES_STRING_ENDS_WITH_FOUR_NUMERIC_DIGITS_REQUIREMENT = @"\d{4}$";
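Splitting a pattern by hand makes it easy to drop a symbol along the way, so it may be worth checking the composed constant against the original literal. Here is a rough sketch of such a check; the `RegexComparison` class and its `FindDisagreements` method are hypothetical helpers, not part of any library.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexComparison
{
    // Returns the sample inputs on which the two patterns disagree.
    // An empty result only means the patterns behave the same on these samples;
    // it does not prove they are equivalent in general.
    public static List<string> FindDisagreements(
        string firstPattern, string secondPattern, IEnumerable<string> samples)
    {
        var disagreements = new List<string>();
        foreach (string sample in samples)
        {
            bool first = Regex.IsMatch(sample, firstPattern);
            bool second = Regex.IsMatch(sample, secondPattern);
            if (first != second)
            {
                disagreements.Add(sample);
            }
        }
        return disagreements;
    }
}
```

Called with the original literal, REGEX_VALID_PHONE_NUMBER, and the four examples from the remarks (plus a few invalid strings), it should come back empty if the split preserved the behavior. A `?` lost from the space-or-hyphen piece, for instance, would show up as a disagreement on 1234567890.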
This does seem a lot easier to review, but there’s one part I’m not sure would work: when we’re building and testing a RegEx, we normally use a tool such as Regulator or RegexBuddy. I’m thinking I need some little tool where I can select the pieces of a RegEx and then create the declarations for the constants out of them; otherwise it’d be painful to do for a long and complex expression.
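The code-generation half of such a tool could be quite small. The sketch below assumes the pieces have already been picked out and named by hand (the selection step is left out entirely), and `RegexConstantGenerator` is a hypothetical name. It also assumes a name that appears twice always carries the same fragment.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class RegexConstantGenerator
{
    // Emits one C# constant declaration per named fragment, followed by a
    // constant that concatenates the fragments in the order given.
    public static string Generate(
        string composedName, IList<KeyValuePair<string, string>> fragments)
    {
        if (fragments.Count == 0)
        {
            throw new ArgumentException("At least one fragment is required.");
        }

        var code = new StringBuilder();
        var emitted = new HashSet<string>();
        foreach (var fragment in fragments)
        {
            // A fragment used twice (such as the three-digit group) is declared once.
            if (emitted.Add(fragment.Key))
            {
                // Double quotes must be doubled inside a verbatim string literal.
                string escaped = fragment.Value.Replace("\"", "\"\"");
                code.AppendLine(
                    "private const string " + fragment.Key + " = @\"" + escaped + "\";");
            }
        }

        code.AppendLine("private const string " + composedName + " =");
        for (int i = 0; i < fragments.Count; i++)
        {
            string terminator = i == fragments.Count - 1 ? ";" : " +";
            code.AppendLine("    " + fragments[i].Key + terminator);
        }
        return code.ToString();
    }
}
```

Feeding it the name/fragment pairs for the phone number pattern would produce declarations in the same shape as the ones above, ready to paste into a class.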
I’m wondering what other developers are doing out there. Any thoughts?
Even though some RegEx developers out there may think this is silly, most of the developers I’ve encountered aren’t that familiar with even the simplest expressions, so I don’t think I’m alone in the frustration of trying to understand those cartoon swear expressions. 🙂