Find matching phrases and words in a string (Python)


Using Python, what is an efficient way to extract the common phrases or words from two given strings?

For example,

string1 = "once upon a time there was a very large giant called jack"
string2 = "a long time ago was a very brave young man called jack"

Would return:

["a","time","there","was very","called jack"]  

How would one go about doing this efficiently (in my case I need to do this on thousands of 1000-word documents)?

You can split each string into words, then intersect the resulting sets.

string1 = "once upon a time there was a very large giant called jack"
string2 = "a long time ago was a very brave young man called jack"
set(string1.split()).intersection(set(string2.split()))

Result:

set(['a', 'very', 'jack', 'time', 'was', 'called']) 

Note that this only matches individual words. You would have to be more specific about what you consider a "phrase". The longest consecutive matching substring? That could get more complicated.
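One possible reading of "phrase" is a run of consecutive whitespace-separated words that appears in both strings. Under that assumption, the same set-intersection idea can be applied to word n-grams instead of single words. The sketch below is only an illustration: the helper names `word_ngrams` and `common_phrases` and the `max_len` cutoff are made up for this example, not part of any library.

```python
def word_ngrams(words, n):
    """Return the set of all runs of n consecutive words, as tuples."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def common_phrases(text1, text2, max_len=3):
    """Map each phrase length n to the n-word phrases found in both texts.

    Assumption: a "phrase" is any run of consecutive words, split on whitespace.
    """
    words1, words2 = text1.split(), text2.split()
    result = {}
    for n in range(1, max_len + 1):
        # Same idea as the single-word answer: intersect two sets.
        shared = word_ngrams(words1, n) & word_ngrams(words2, n)
        if shared:
            result[n] = {" ".join(gram) for gram in shared}
    return result


string1 = "once upon a time there was a very large giant called jack"
string2 = "a long time ago was a very brave young man called jack"
phrases = common_phrases(string1, string2)
# phrases[2] holds the shared two-word phrases: "was a", "a very", "called jack"
```

Shorter phrases that are part of a longer shared phrase are reported too (for example "was a" alongside "was a very"), so you may want to filter those out depending on what you need. For thousands of documents, the n-gram sets for each document could be built once and reused across comparisons.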

