python - Modifying a group within a Regular Expression Match


So I have a function that is part of a Django (v1.5) model. It takes a body of text, finds all of the tags, such as <note nickname>...</note>, converts the correct ones for the user, and removes all of the others.

The function below works, but it requires me to use note_tags = '<note .*?>.*?</span>\r\n', because tag.group(0) finds all of the tags regardless of whether the user's nickname is in there. I am curious how I could use the groups so that I can remove all of the un-useful tags without having to modify the regex.

    import re

    def format_for_user(self, user):
        body = self.body
        # Default pattern: every note, including its trailing line break.
        note_tags = '<note .*?>.*?</note>\r\n'
        user_msg = False
        if user is not None:
            user_tags = '(<note %s>).*?</note>' % user.nickname
            user_tags = re.compile(user_tags)
            for tag in user_tags.finditer(body):
                if tag.groups(1):
                    # Swap this user's opening tag for <span>.
                    replacement = str(tag.groups(1)[0])
                    body = body.replace(replacement, '<span>')
                    # The last 7 characters are '</note>'; this replaces
                    # every closing tag in the body, not just this one.
                    replacement = str(tag.group(0)[-7:])
                    body = body.replace(replacement, '</span>')
                    user_msg = True
                    # Hence the modified pattern for the remaining notes.
                    note_tags = '<note .*?>.*?</span>\r\n'
        note_tags = re.compile(note_tags)
        for tag in note_tags.finditer(body):
            body = body.replace(tag.group(0), '')
        return (body, user_msg)
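One way the groups could do this work in a single pass is to capture the name and the note text separately and let re.sub call a function for each match. This is only a minimal sketch under assumptions: format_notes, NOTE_RE and swap are hypothetical names, not part of the original model, and it assumes notes are never nested and always sit on one line.

```python
import re

# Group 1: the name in the opening tag. Group 2: the note text.
# Group 3: the optional line break that follows the note.
NOTE_RE = re.compile(r'<note (.*?)>(.*?)</note>(\r\n)?')


def format_notes(body, nickname=None):
    """Hypothetical helper: return (new_body, user_msg) for one nickname."""
    found = []  # gets an entry whenever a note for this nickname is seen

    def swap(match):
        if nickname is not None and match.group(1) == nickname:
            found.append(True)
            # Keep this user's note as a span and preserve its line break.
            return '<span>%s</span>%s' % (match.group(2), match.group(3) or '')
        # Any other note is removed together with its line break.
        return ''

    return NOTE_RE.sub(swap, body), bool(found)
```

Because the same pattern handles both cases, there is no second regex to adjust after the user's tags have been rewritten.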

So abarnert is correct: I shouldn't be using regex to parse HTML, and instead should use something along the lines of BeautifulSoup.
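As a small illustration of why regex is fragile here, the sketch below runs the "match every note" pattern from the question (without the trailing line break) against two inputs it does not handle well. The sample strings are made up for the example.

```python
import re

# The general note pattern from the question, minus the trailing '\r\n'.
NOTE_RE = re.compile(r'<note .*?>.*?</note>')

one_line = '<note bob>call me</note>'
two_lines = '<note bob>call\nme</note>'  # note text wraps onto a second line
nested = '<note bob>a <note alice>b</note> c</note>'

# '.' does not match '\n' unless re.DOTALL is set, so the wrapped note is missed.
matched_one_line = NOTE_RE.search(one_line) is not None
matched_two_lines = NOTE_RE.search(two_lines) is not None

# With nesting, the non-greedy match stops at the first '</note>' it finds,
# leaving ' c</note>' behind in the text.
nested_match = NOTE_RE.search(nested).group(0)
```

An HTML parser works on the tag structure instead of the raw characters, so it avoids both problems.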

So I used BeautifulSoup, and this is the resulting code, which solves a lot of the problems I was having with regex.

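    # Assumes BeautifulSoup 4; the original post does not show its import.
    from bs4 import BeautifulSoup

    def format_for_user(self, user):
        body = self.body
        soup = BeautifulSoup(body)
        user_msg = False
        if user is not None:
            # Notes for this user carry the nickname as their class attribute.
            user_tags = soup.findAll('note', {"class": "%s" % user.nickname})
            for tag in user_tags:
                tag.name = 'span'
                # Assumption: set the flag as the regex version did; the
                # original post never sets it in this version.
                user_msg = True
        # Whatever is still a <note> belongs to someone else, so remove it.
        all_tags = soup.findAll('note')
        for tag in all_tags:
            tag.decompose()
        soup = soup.prettify()
        return (soup, user_msg)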
