Find matching phrases and words in a string in Python
Using Python, what is the most efficient way for one to extract common phrases or words from two given strings?
For example,
string1 = "once upon a time there was a very large giant called jack"
string2 = "a long time ago was a very brave young man called jack"
would return:
["a", "time", "there", "was a very", "called jack"]
How would one go about doing this efficiently (in my case I need to do this on thousands of 1000-word documents)?
You can split each string into words, then intersect the resulting sets.
string1 = "once upon a time there was a very large giant called jack"
string2 = "a long time ago was a very brave young man called jack"
set(string1.split()).intersection(set(string2.split()))
Result:
set(['a', 'very', 'jack', 'time', 'was', 'called'])
Note that this only matches individual words. You would have to be more specific about what you consider a "phrase". The longest consecutive matching substring? That gets more complicated.
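If a "phrase" is taken to mean a run of consecutive words that appears in both strings in the same relative order, one possible sketch uses difflib.SequenceMatcher from the standard library on the word lists instead of on characters. The helper name common_phrases is made up for illustration, and the two sentences are assumed to read as in the question's example. This is only one interpretation of "phrase", not a definitive answer: the matcher aligns the two sequences, so a phrase that occurs in a different order in the second string can be missed.

```python
from difflib import SequenceMatcher


def common_phrases(text1, text2):
    """Return runs of consecutive words shared by both texts, in order.

    Hypothetical helper: treats a "phrase" as a matching block of words
    found by difflib's sequence alignment.
    """
    words1 = text1.split()
    words2 = text2.split()
    # Compare word lists rather than characters. autojunk=False turns off
    # the heuristic that ignores very frequent items in long sequences.
    matcher = SequenceMatcher(None, words1, words2, autojunk=False)
    phrases = []
    for block in matcher.get_matching_blocks():
        if block.size:  # the final block is a zero-length sentinel
            phrases.append(" ".join(words1[block.a:block.a + block.size]))
    return phrases


string1 = "once upon a time there was a very large giant called jack"
string2 = "a long time ago was a very brave young man called jack"
print(common_phrases(string1, string2))
# ['a', 'time', 'was a very', 'called jack']
```

For thousands of documents the set intersection is much cheaper, since SequenceMatcher can take quadratic time in the worst case. It may be worth running the set intersection first and only aligning pairs that share enough words.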