Can you please put some comments on that Regular Expression?!

Regular expressions are one of those things in the software development world that give me nausea when I come across them. Reading something like this in code just makes me shiver:
 
^(ht|f)tp(s?)\:\/\/[0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*(:(0-9)*)*(\/?)([a-zA-Z0-9\-\.\?\,\'\/\\\+&%\$#_]*)?$
 
It seems to me like cartoon characters swearing: &$^%#&%@&$#**^&$. 
 
I just happened to run across a little article that taught me something I didn’t know about regular expressions: one can put comments in them!
 
So, instead of having some cryptic code like this one:
Regex regex = new Regex(@"^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,10}\s$");
One could do the world a favor and rewrite that code like so:
Regex regex = new Regex(@"
                ^            # anchor at the start
                (?=.*\d)     # must contain at least one numeric character
                (?=.*[a-z])  # must contain at least one lowercase character
                (?=.*[A-Z])  # must contain at least one uppercase character
                .{8,10}      # from 8 to 10 characters in length
                \s           # followed by a single whitespace character
                $            # anchor at the end",
                RegexOptions.IgnorePatternWhitespace);
 
That way, even my little brain can understand a freakin’ regular expression.
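
The same trick exists outside of .NET. As a rough sketch, here is how the pattern above might look in Python, where the `re.VERBOSE` flag plays the role of `RegexOptions.IgnorePatternWhitespace`. The names `PASSWORD_RE` and `is_valid` are made up for this illustration, and the sample inputs are just guesses at what the expression is meant to accept.

```python
import re

# Hypothetical name; mirrors the commented C# pattern from the post.
PASSWORD_RE = re.compile(r"""
    ^              # anchor at the start
    (?=.*\d)       # must contain at least one digit
    (?=.*[a-z])    # must contain at least one lowercase letter
    (?=.*[A-Z])    # must contain at least one uppercase letter
    .{8,10}        # from 8 to 10 characters of any kind
    \s             # followed by a single whitespace character
    $              # anchor at the end
""", re.VERBOSE)


def is_valid(candidate: str) -> bool:
    """Return True when the candidate satisfies every rule in PASSWORD_RE."""
    return PASSWORD_RE.match(candidate) is not None


if __name__ == "__main__":
    print(is_valid("Passw0rd1 "))  # mixed case, a digit, trailing space
    print(is_valid("password1 "))  # no uppercase letter
```

In verbose mode the whitespace and everything after a `#` are ignored by the regex engine, so the comments cost nothing at match time. A literal space or `#` would have to be escaped or placed inside a character class.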
 

  1. #1 by Bob Barker on May 14, 2020 - 11:12 am

    Another idea: regexes are just text, so you can save the pieces as variables and build the pattern from them.

    It’s also possible to use them with string interpolation:

    ```
    import re

    text_to_search = '''
    321-555-4321
    123.555.1234
    123*555*1234
    800-555-1234
    900-555-1234
    '''

    sentence = 'Start a sentence and then bring it to an end'

    three_digits = r'\d\d\d'
    four_digits = r'\d\d\d\d'
    dot_or_dash = r'(\.|-)'

    pattern = re.compile(three_digits + dot_or_dash + three_digits + dot_or_dash + four_digits)

    # string interpolation
    matches = pattern.finditer(text_to_search)

    print('starting…')

    for match in matches:
        print(match)
    ```

    • #2 by Bob Barker on May 14, 2020 - 11:14 am

      Edit: I hit something and it submitted my post before I was ready. It was supposed to say:

      ```
      # string interpolation
      pattern = re.compile(rf'{three_digits}(\.|-){three_digits}(\.|-){four_digits}')
      matches = pattern.finditer(text_to_search)
      ```

      • #3 by claudiolassala on May 18, 2020 - 2:46 pm

        Woah, that’s pretty cool. I didn’t know it could be done that way. Thanks for sharing, Bob!

      • #4 by claudiolassala on May 18, 2020 - 2:48 pm

        Come to think of it, I do remember using string interpolation but couldn’t remember about the “compile” method. It’s been a while. Anyway, anything to make those expressions more readable. 🙂
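
The two ideas in this thread, named fragments and commented patterns, can be combined. The following is only a sketch: `separator`, `PHONE_RE`, and `find_phone_numbers` are names invented for the example, and `[.-]` is swapped in for Bob's `(\.|-)` so the separator is matched without being captured.

```python
import re

# Reusable fragments, kept as plain raw strings.
three_digits = r"\d{3}"
four_digits = r"\d{4}"
separator = r"[.-]"  # a dot or a dash (non-capturing stand-in for (\.|-))

# An rf-string interpolates the fragments; re.VERBOSE allows the comments.
PHONE_RE = re.compile(rf"""
    {three_digits}   # area code
    {separator}      # dot or dash
    {three_digits}   # exchange
    {separator}      # dot or dash
    {four_digits}    # line number
""", re.VERBOSE)


def find_phone_numbers(text: str) -> list[str]:
    """Return every substring of text that matches PHONE_RE."""
    return [match.group(0) for match in PHONE_RE.finditer(text)]


text_to_search = """
321-555-4321
123.555.1234
123*555*1234
800-555-1234
900-555-1234
"""

if __name__ == "__main__":
    for number in find_phone_numbers(text_to_search):
        print(number)
```

Run against Bob's sample text, this should pick up every line except `123*555*1234`, since `*` is neither a dot nor a dash.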
