[ Date Index ][
Thread Index ]
[ <= Previous by date /
thread ]
[ Next by date /
thread => ]
Adrian Midgley wrote:
NHS Numbers are 10 digits. The canonical formatting of them is 999 999 9999 However often they are present as 9999999999 I am always amazed by the cleverness of people who can produce regexps such as one that would enable me to grep all lines out of a big text (csv, values in "") file which contain a specific NHS number regardless of which format it is presented in...
Sometimes it is easier to process the number format before you write the regular expression. What did our Perl guru say about write once code ;) #!/bin/bash # # Program expectsd 10 digit number and greps for all occurances of # the same 10 digit number in double quotes. # # Both input and found format may be 123 456 7890 or 1234567890 # # Simon Waters 2002 # if [ $# -ne 2 -a $# -ne 4 ] then echo "Usage $0 123 456 7890 myfile" echo " or $0 1234567890 myfile" if [ $# -eq 2 ] then FIRST=`expr $1 : "\(...\)"` SECOND=`expr $1 : "...\(...\)"` THIRD=`expr $1 : "......\(....\)"` FILE=$2 fi if [ $# -eq 4 ] then FIRST=$1 SECOND=$2 THIRD=$3 FILE=$4 fi grep -E "\"$FIRST $SECOND $THIRD\"|\"$FIRST$SECOND$THIRD\"" $FILE -- The Mailing List for the Devon & Cornwall LUG Mail majordomo@xxxxxxxxxxxx with "unsubscribe list" in the message body to unsubscribe.