Find matching phrases and words in a string in Python
Using Python, what is an efficient way one could extract common phrases or words from two given strings?
For example,
string1="once upon a time there was a very large giant called jack"
string2="a long time ago was a very brave young man called jack"
would return:
["a","time","there","was very","called jack"]
How would one go about doing this efficiently (in my case I need to do this on thousands of 1000-word documents)?
You can split each string, then intersect the resulting sets.
string1="once upon a time there was a very large giant called jack"
string2="a long time ago was a very brave young man called jack"
set(string1.split()).intersection(set(string2.split()))
Result:
set(['a', 'very', 'jack', 'time', 'was', 'called'])
Note that this only matches individual words. You would have to be more specific about what you consider a "phrase". The longest consecutive matching substring? That could get more complicated.
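If "phrase" is taken to mean a run of consecutive words that appears in both strings in the same relative order, one possible sketch uses difflib.SequenceMatcher from the standard library on the word lists rather than on characters. The function name common_phrases is made up for this illustration, and this is only one reading of the question: it aligns the two texts, so a phrase that appears in a different position relative to the other matches may be missed.

```python
from difflib import SequenceMatcher


def common_phrases(text1, text2):
    """Return runs of consecutive words shared by both texts, in order.

    Hypothetical helper: it aligns the two word lists with SequenceMatcher,
    so it reports one alignment, not every shared n-gram.
    """
    words1 = text1.split()
    words2 = text2.split()
    # autojunk=False disables the "popular element" heuristic, which would
    # otherwise ignore very frequent words in sequences of 200+ items.
    matcher = SequenceMatcher(None, words1, words2, autojunk=False)
    phrases = []
    for block in matcher.get_matching_blocks():
        if block.size > 0:  # the final block is a zero-length sentinel
            phrases.append(" ".join(words1[block.a:block.a + block.size]))
    return phrases


string1 = "once upon a time there was a very large giant called jack"
string2 = "a long time ago was a very brave young man called jack"
print(common_phrases(string1, string2))
```

For the two example strings this prints ['a', 'time', 'was a very', 'called jack']. SequenceMatcher can be quadratic in the worst case, so for thousands of documents the plain set intersection above is likely the cheaper first pass.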