Tuesday, October 2, 2012

online python regular expression tester

http://www.pythonregex.com/

Python Regular Expression Testing Tool


Regex: ^[a-zA-Z0-9_-]{0,50}$
Test string: C_1234-5678


Code:

>>> import re
>>> string = u"C_1234-5678"
>>> regex = re.compile("^[a-zA-Z0-9_-]{0,50}$")
>>> r = regex.search(string)
>>> r
<_sre.SRE_Match object at 0x51150ec3caa25ba0>
>>> regex.match(string)
<_sre.SRE_Match object at 0x51150ec3caa25f90>

# List the groups found
>>> r.groups()
()

# List the named dictionary objects found
>>> r.groupdict()
{}

# Run findall
>>> regex.findall(string)
[u'C_1234-5678']
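
The transcript above looks like Python 2 output (the u'' prefix and the _sre.SRE_Match repr). Here is a minimal sketch of the same check as a yes/no test that also runs on Python 3. The helper name is_valid is my own and does not come from the tester site.

```python
import re

# Pattern and sample input copied from the tester session above.
pattern = re.compile(r"^[a-zA-Z0-9_-]{0,50}$")
sample = "C_1234-5678"


def is_valid(value):
    """Return True when the value fits the pattern (hypothetical helper)."""
    return pattern.match(value) is not None


print(is_valid(sample))        # letters, digits, underscore and dash only
print(is_valid("has space"))   # a space is outside the character class
print(is_valid(""))            # {0,50} also accepts the empty string
print(is_valid("a" * 51))      # longer than 50 characters
```

Note that {0,50} accepts an empty string. If empty input should be rejected, {1,50} would be the variant to try.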




2 regex for a-zA-Z0-9 with dashes allowed in between
http://stackoverflow.com/questions/2525327/regex-for-a-za-z0-9-with-dashes-allowed-in-between-but-not-at-the-start-or-e


Update:

This question was an epic failure, but here's the working solution. It's based on Gumbo's answer (Gumbo's was close to working so I chose it as the accepted answer):

Solution:

r'(?=[a-zA-Z0-9\-]{4,25}$)^[a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*$'

Original Question (albeit, after 3 edits)

I'm using Python and I'm not trying to extract the value, but rather test to make sure it fits the pattern.

allowed values:

spam123-spam-eggs-eggs1
spam123-eggs123
spam1234
eggs123

Not allowed values:

eggs1-
-spam123
spam--spam

I just can't have a dash at the start or the end. There is a question on here that works in the opposite direction by getting the string value after the fact, but I simply need to test the value so that I can disallow it. It must also be at least 4 and at most 25 chars long, and no 2 dashes can touch each other.
Here's what I've come up with after some experimentation with lookbehind, etc:
# Nothing here
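
To check the solution against the examples in the question, here is a small sketch that runs the accepted pattern over the allowed and not-allowed values. The names SOLUTION and fits are mine, chosen for illustration.

```python
import re

# Solution pattern copied verbatim from the update above.
SOLUTION = re.compile(r'(?=[a-zA-Z0-9\-]{4,25}$)^[a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*$')

# Example values copied from the question.
allowed = ["spam123-spam-eggs-eggs1", "spam123-eggs123", "spam1234", "eggs123"]
not_allowed = ["eggs1-", "-spam123", "spam--spam"]


def fits(value):
    """Return True when the value fits the pattern (hypothetical helper)."""
    return SOLUTION.match(value) is not None


for value in allowed + not_allowed:
    print(value, fits(value))
```

The lookahead (?=...{4,25}$) enforces the overall length. The rest of the pattern requires an alphanumeric run, optionally followed by groups of one dash plus an alphanumeric run. That rules out leading, trailing and doubled dashes.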





3 matching a hyphen in python re
http://stackoverflow.com/questions/8383213/python-regex-for-hyphenated-words



Try this:
re.findall(r'\w+(?:-\w+)+',text)
Here we consider a hyphenated word to be:
  • a number of word chars
  • followed by any number of:
    • a single hyphen
    • followed by word chars


The question was:

I'm looking for a regex to match hyphenated words in python.
The closest I've managed to get is: '\w+-\w+[-\w+]*'
text = "one-hundered-and-three- some text foo-bar some--text"
hyphenated = re.findall(r'\w+-\w+[-\w+]*',text)
which returns list ['one-hundered-and-three-', 'foo-bar'].
This is almost perfect except for the trailing hyphen after 'three'. I only want the additional hyphen if it is followed by a 'word', i.e. instead of '[-\w+]*' I need something like '(-\w+)*', which I thought would work but doesn't (it returns ['-three', '']). In other words, I need something that matches |word followed by hyphen followed by word, followed by hyphen_word zero or more times|.
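
The two attempts in the question behave differently for two separate reasons, and a side-by-side sketch makes that easier to see. The variable names are mine. Two facts about re are doing the work here. Inside [...], the characters -, \w and + form a character class rather than a sequence. And when a pattern contains a capturing group, re.findall returns the group's contents instead of the whole match.

```python
import re

text = "one-hundered-and-three- some text foo-bar some--text"

# Attempt 1: [-\w+]* is a character class (hyphen, word char, or plus),
# so it happily swallows the trailing hyphen.
with_class = re.findall(r'\w+-\w+[-\w+]*', text)
print(with_class)      # ['one-hundered-and-three-', 'foo-bar']

# Attempt 2: (-\w+)* is a capturing group, so findall returns only the
# last thing the group captured ('' when the group matched nothing).
with_capture = re.findall(r'\w+-\w+(-\w+)*', text)
print(with_capture)    # ['-three', '']

# Same pattern with a non-capturing group: findall returns whole matches.
with_noncapture = re.findall(r'\w+-\w+(?:-\w+)*', text)
print(with_noncapture) # ['one-hundered-and-three', 'foo-bar']
```

So the '(-\w+)*' idea was right about what to match. Only the capturing group changed what findall reported.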


I don't know what you plan to use this for, but have you considered cases where a trailing or prefixed hyphen is valid, like "nineteenth- and twentieth-century" or "investor-owned and -operated"? – lazyr Dec 5 '11 at 9:38
The main problem in your own expression is the square brackets. They don't group the content together; they create a character class, which is something completely different. – stema Dec 5 '11 at 9:46
Thanks for the input, lazyr. I have considered the cases you point out, and they will not pose a problem. Thanks for the clarification, stema. I realised that the square brackets did not group the content, but they resulted in the closest match for what I was attempting to do. – user1081231 Dec 5 '11 at 11:55






