Many programmers and users will write a regular
expression on one line and then either:
-write paragraphs and paragraphs
trying to explain what the regular
expression does, or
-not write any description of the
regex at all and assume it's fine for
every other programmer to spend several
hours deciphering it.
-Or maybe they should just delete it all and
never even offer it in the first
place. That might be more efficient.
Typically this occurs on a web forum when someone
suggests a regex to another programmer as a fix for a
problem. They give the regex, and then they explain
it paragraph by paragraph instead of just breaking
it into pieces and explaining each piece. Or they
just give the regex and don't offer any breakdown.
Break it down into pieces like this:
(.*)     // this piece does..
[a-z]*   // this piece does..
[1-5]*   // this piece does..
Then combine it:
(.*)[a-z]*[1-5]*   // this together does..
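
The same breakdown can be kept in real source code. Below is a
minimal sketch in Python. The names PREFIX, LETTERS, DIGITS,
FULL_PATTERN and first_group are made up for illustration, and
the comment on each piece is only one plausible reading, since
the placeholders above leave the descriptions open.

```python
import re

# Each piece sits on its own line with its own comment.
# (Descriptions are illustrative assumptions, not the original author's.)
PREFIX = r"(.*)"     # capture any run of characters except newline (greedy, may be empty)
LETTERS = r"[a-z]*"  # zero or more lowercase ASCII letters
DIGITS = r"[1-5]*"   # zero or more digits in the range 1 to 5

# Then combine the pieces into the full one-line expression.
FULL_PATTERN = PREFIX + LETTERS + DIGITS


def first_group(text):
    """Return what the capturing group matched, or None if text does not match."""
    match = re.fullmatch(FULL_PATTERN, text)
    return match.group(1) if match else None
```

Writing the pieces out this way also exposes something the
one-liner hides: because (.*) is greedy, it swallows the whole
input and the two pieces after it end up matching nothing.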
Regular expressions should be broken down into pieces,
with each piece explained in a source code comment.
Stop writing one-line regular expressions and then trying
to explain them in essay format to prove that your
regular expression even does something.
Break them down into sections, source code comment them,
and then give the full regular expression on one line.
The problem with regular expressions is that they are
quick and dirty. Just because you had full
concentration and were able to write the regex on one
line at that moment, it does not mean you will actually
understand it 6 minutes later. Instead, you could have
broken the regular expression into pieces and
commented each one, and it would then be maintainable.
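
As one way to keep the comments attached to the pieces, here is
a sketch using Python's re.VERBOSE flag, which ignores
whitespace in the pattern and allows # comments inside it. The
date pattern and the names DATE_PATTERN and parse_date are
hypothetical, chosen only to have something small to comment.

```python
import re

# Hypothetical example pattern: an ISO-style date such as 2024-01-31.
DATE_PATTERN = re.compile(
    r"""
    (?P<year>\d{4})    # four-digit year
    -                  # literal separator
    (?P<month>\d{2})   # two-digit month
    -                  # literal separator
    (?P<day>\d{2})     # two-digit day
    """,
    re.VERBOSE,
)


def parse_date(text):
    """Return (year, month, day) as integers, or None if text is not a date."""
    match = DATE_PATTERN.fullmatch(text)
    if match is None:
        return None
    return int(match["year"]), int(match["month"]), int(match["day"])
```

Note that this only checks the shape of the text, not whether
the date exists; parse_date("2024-99-99") would still return a
tuple.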
-The amount of time spent describing the regular
expression as a whole is the same as, or more than, the
time it would take to skip the regular expression in the
first place and stick to longer, drawn-out system
and parsing functions. This is the case if you really
know your programming language well.
-Beginners and intermediates tend to use regexes often
(Perl and PHP people especially), while true hackers write
parsers that are more maintainable.
-The amount of time it takes to decipher a regular
expression keeps most people from reusing regular
expressions. People tend to create their own, since
regexes are like a write-once technology.
-Modifying a regex may turn out to be a nightmare,
and it may screw up an entire program. Modifying
a true parser that doesn't use regexes is usually much
easier. Imagine, for a moment, if Free Pascal,
Visual C++, or the GNU C Compiler were based on regex
parsing.. good luck modifying and improving the
compiler.
-One little mistake in a regular expression can bring
a whole system down, such as in an .htaccess file in the
root directory. This is not worth leaving un-commented.
At least carry a descriptor file around if you don't
want to make the .htaccess file too cluttered with
comment noise.
By the time people have written paragraphs of explanation
about the regular expression (or, even more typically, have
not explained what the regular expression does anywhere),
they could have written a routine programmatically
with normal system and string functions and commented it
properly.
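
To make that concrete, here is a small sketch of such a routine
in Python, using only built-in string methods. The task
(splitting a "key = value" line and ignoring a trailing #
comment) and the name parse_setting are hypothetical, picked
only as an example of something a regex is often used for.

```python
def parse_setting(line):
    """Split a 'key = value' line into a (key, value) tuple, or return None."""
    # Drop anything after a '#' so trailing comments are ignored.
    content, _, _ = line.partition("#")
    # Split on the first '=' only; the value may contain more of them.
    key, separator, value = content.partition("=")
    key = key.strip()
    if not separator or not key:
        return None  # no '=' found, or nothing before it
    return key, value.strip()
```

Each step is one ordinary string call with one comment, so a
later change (say, a different comment character) touches a
single line.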
Regular expressions are a great tool, but don't think you
are elite for leaving that .htaccess mod_rewrite file
un-commented, with regular expressions as the only
explanation for what the .htaccess file does.
What's worse: some languages, e.g. Perl and PHP, have you
write more than just the regular expression on one line.
They place the modifiers on the same line as the regular
expression. As if this little time saver is going to make
life easier in the long term when you are maintaining
the website or program in the future.
Overall regular expressions and some languages remind me of
selfishness and write-once technology.
Make no mistake, regular expressions are quick and
dirty.. especially if you can read and remember your own.
But you've got to start thinking for the long term.