Regex for UK Telephone Numbers

BeebMaster
Posts: 2977
Joined: Sun Aug 02, 2009 4:59 pm
Location: Lost in the BeebVault!

Regex for UK Telephone Numbers

Post by BeebMaster » Fri Feb 14, 2020 10:25 am

Can anyone suggest a regular expression to match UK telephone numbers? I have 600-odd plain text files to process and I need to match 'phone numbers in the text. I've tried several expressions from websites, and none has found a single number.

It's made me a bit suspicious that there may be hidden characters in the files, so I think I will have to check some of the files with a hex editor to make sure the 'phone numbers are only ASCII numeric characters plus brackets and space. Oh, and the plus sign.
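As a rough alternative to eyeballing a hex dump, the check described above could be sketched in Python. This is only an illustration under assumptions: the function name `hidden_chars_near_digits` is made up for the example, and it simply flags any character outside printable ASCII (and common whitespace) that sits directly next to an ASCII digit.

```python
DIGITS = "0123456789"

def hidden_chars_near_digits(text):
    """Return (offset, code point) for each non-printable-ASCII character
    that is immediately before or after an ASCII digit."""
    hits = []
    for i, ch in enumerate(text):
        if 32 <= ord(ch) < 127 or ch in "\r\n\t":
            continue  # ordinary printable ASCII or common whitespace
        before = text[i - 1] if i > 0 else ""
        after = text[i + 1] if i + 1 < len(text) else ""
        # Guard against "" because the empty string is "in" every string.
        if (before and before in DIGITS) or (after and after in DIGITS):
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits

# A no-break space (U+00A0) hiding inside a number looks like a normal space.
sample = "Call 01234\u00a0567890 today"
print(hidden_chars_near_digits(sample))  # [(10, 'U+00A0')]
```

To run it over real files, each one could be read with something like `open(path, encoding="utf-8", errors="replace")` first; the right encoding for these particular files is not known, so that is a guess.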


Re: Regex for UK Telephone Numbers

Post by BeebMaster » Fri Feb 14, 2020 12:15 pm

Well well, what a difference a day makes! This works:

Code:

((\+44\s?\(0\)\s?\d{2,4})|(\+44\s?(01|02|03|07|08)\d{2,3})|(\+44\s?(1|2|3|7|8)\d{2,3})|(\(\+44\)\s?\d{3,4})|(\(\d{5}\))|((01|02|03|07|08)\d{2,3})|(\d{5}))(\s|-|.)(((\d{3,4})(\s|-)(\d{3,4}))|((\d{6,7})))
I did notice it pulls in some strings of 10/11 numbers which aren't telephone numbers (eg. "0123456789.pdf" etc), but I think I can live with that.
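For anyone wanting to try the expression, here is a minimal sketch of applying it with Python's `re` module. The pattern is copied unchanged from above; the helper name `find_numbers` and the sample strings are invented for illustration. One thing worth knowing when reading the results: the separator group `(\s|-|.)` contains an unescaped `.`, which matches any character, so in an unbroken run of digits one digit is consumed as the "separator". That is why unseparated numbers match at all, and also why an 11-digit filename is picked up.

```python
import re

# The expression from the post, split across lines but otherwise unchanged.
UK_PHONE = re.compile(
    r"((\+44\s?\(0\)\s?\d{2,4})|(\+44\s?(01|02|03|07|08)\d{2,3})"
    r"|(\+44\s?(1|2|3|7|8)\d{2,3})|(\(\+44\)\s?\d{3,4})|(\(\d{5}\))"
    r"|((01|02|03|07|08)\d{2,3})|(\d{5}))"
    r"(\s|-|.)"
    r"(((\d{3,4})(\s|-)(\d{3,4}))|((\d{6,7})))"
)

def find_numbers(text):
    """Return the full text of every match (group 0), ignoring sub-groups."""
    return [m.group(0) for m in UK_PHONE.finditer(text)]

print(find_numbers("Ring 01234 567890 or +44 (0)1234 567890."))
# ['01234 567890', '+44 (0)1234 567890']

print(find_numbers("See 01234567890.pdf"))
# ['01234567890']  (the false positive on a numeric filename)
```

`finditer` with `group(0)` is used rather than `findall`, because with this many capturing groups `findall` would return a tuple of sub-groups for each match instead of the matched text.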

RobC
Posts: 2768
Joined: Sat Sep 01, 2007 9:41 pm

Re: Regex for UK Telephone Numbers

Post by RobC » Fri Feb 14, 2020 12:50 pm

BeebMaster wrote:
Fri Feb 14, 2020 12:15 pm
I did notice it pulls in some strings of 10/11 numbers which aren't telephone numbers (eg. "0123456789.pdf" etc)
I worked on a similar problem once and used probabilities derived from order statistics to flag up cases like that.

1024MAK
Posts: 9610
Joined: Mon Apr 18, 2011 4:46 pm
Location: Looking forward to summer in Somerset, UK...

Re: Regex for UK Telephone Numbers

Post by 1024MAK » Fri Feb 14, 2020 12:51 pm

Hi Ian

Given the number ( :P ) of times there has been a change to the format and length of mainland U.K. telephone numbers, good luck with that!

Mark


Re: Regex for UK Telephone Numbers

Post by BeebMaster » Fri Feb 14, 2020 12:59 pm

Indeed. I remember the Current Bun or somesuch showing all the number changes since the original "Whitehall 1212" a few years back!

I'm pretty sure I'm not dealing with any old area codes before the extra 1 was introduced though.

Despite websites with sample expressions claiming infallibility, it probably isn't possible to catch everything. That string won't capture local-only numbers ("222 3333"), for example.

However, it looks like if I add \s to the end of that string, it omits "string-of-numbers.filetype" whilst still matching "call me on 0123456789 or 07772345678".

But now it doesn't match "my number is 01234567890." because of the full stop!

I think I will process the files using the string ending in \s and then manually look at matches with the original string.
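One possible way round the full-stop problem, offered only as a sketch, is to replace the trailing `\s` with a negative lookahead `(?!\.\w)`. This rejects a match that is immediately followed by a dot and then a letter, digit or underscore (as in a file extension), but allows a full stop that ends a sentence. The suffix is a suggestion rather than anything tested against the real files, and the names `ORIGINAL`, `STRICT` and `matches` are made up for the example.

```python
import re

# The original expression, unchanged.
ORIGINAL = (
    r"((\+44\s?\(0\)\s?\d{2,4})|(\+44\s?(01|02|03|07|08)\d{2,3})"
    r"|(\+44\s?(1|2|3|7|8)\d{2,3})|(\(\+44\)\s?\d{3,4})|(\(\d{5}\))"
    r"|((01|02|03|07|08)\d{2,3})|(\d{5}))"
    r"(\s|-|.)"
    r"(((\d{3,4})(\s|-)(\d{3,4}))|((\d{6,7})))"
)

# Assumed tweak: refuse a match that runs straight into ".something".
STRICT = re.compile(ORIGINAL + r"(?!\.\w)")

def matches(text):
    """Return the full text of every match of the stricter pattern."""
    return [m.group(0) for m in STRICT.finditer(text)]

print(matches("my number is 01234567890."))        # ['01234567890']
print(matches("see 01234567890.pdf"))              # []
print(matches("call me on 07772345678 or email"))  # ['07772345678']
```

The trade-off is that a number followed by a full stop and no space before the next word ("01234567890.Then") would also be skipped, so a manual pass over the original expression's matches may still be worthwhile.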

jgharston
Posts: 3802
Joined: Thu Sep 24, 2009 11:22 am
Location: Whitby/Sheffield

Re: Regex for UK Telephone Numbers

Post by jgharston » Fri Feb 14, 2020 3:58 pm

Not a regular expression, but you could pass each number into and then out of the Phone library functions:

num$=FNPhone_ToStrF(FNPhone_FromStr(numb$))

will take a mangled number and return it as a properly-formatted number.

Code:

$ bbcbasic
PDP11 BBC BASIC IV Version 0.25
(C) Copyright J.G.Harston 1989,2005-2015
>_

scruss
Posts: 159
Joined: Sun Jul 01, 2018 3:12 pm
Location: Toronto

Re: Regex for UK Telephone Numbers

Post by scruss » Sat Feb 15, 2020 4:52 am

BeebMaster wrote:
Fri Feb 14, 2020 12:59 pm
I'm pretty sure I'm not dealing with any old area codes before the extra 1 was introduced though.
That would likely be impossible to regex. Some cities had special short codes that allowed nearby but not quite local towns to be dialled at local rates. From Glasgow, "32" would get you East Kilbride and "36" would get you Killearn/Balfron.
