multiprocessing - Fastest way to extract tar files using Python


I have to extract hundreds of tar.bz files, each about 5 GB in size. I tried the following code:

import glob
import tarfile
from multiprocessing import Pool

files = glob.glob('D:\\*.tar.bz')  # all my files are in D:
for f in files:
    tar = tarfile.open(f, 'r:bz2')
    pool = Pool(processes=5)
    pool.map(tar.extractall('E:\\'))  # I want to extract them to E:
    tar.close()

But the code raises a type error: TypeError: map() takes at least 3 arguments (2 given)

How can I solve it? Are there any further ideas to accelerate the extraction?

You need to change pool.map(tar.extractall('E:\\')) to something of the form pool.map(function, list_of_all_files). The first argument must be the function itself, not the result of calling it.

Note that map() takes two arguments: the first is a function and the second is an iterable. It applies the function to every item of the iterable and returns a list of the results.
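A minimal sketch of that calling convention, assuming Python 3. The names square and parallel_squares are made up for illustration and do not come from the question. Pool.map is given the function object plus an iterable, and the result matches what the built-in map would produce serially.

```python
from multiprocessing import Pool


def square(x):
    # Worker function: defined at module level so worker processes can import it.
    return x * x


def parallel_squares(values, processes=2):
    # Pass the function itself (square), not the result of calling it (square(...)).
    with Pool(processes=processes) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this call on import.
    print(parallel_squares([1, 2, 3]))
```

Writing pool.map(square(2)) instead would hand map() a single argument, which is the mistake behind the TypeError in the question.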

Edit: each worker process has to open its own archive, so pass the file names to the pool rather than an open tarfile object:

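import glob
import tarfile
from multiprocessing import Pool

def read_files(name):
    # Runs in a worker process: open one archive and extract it to E:
    t = tarfile.open(name, 'r:bz2')
    t.extractall('E:\\')
    t.close()

def test_multiproc():
    files = glob.glob('D:\\*.tar.bz2')
    pool = Pool(processes=5)
    result = pool.map(read_files, files)

if __name__ == '__main__':  # needed on Windows, where workers re-import this module
    test_multiproc()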
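The same idea can be sketched with the destination passed as a parameter instead of being hard-coded. This is only one possible arrangement, assuming Python 3. The names extract_one and extract_many are hypothetical, and functools.partial is used here simply to bind the destination directory so that Pool.map still receives a one-argument function.

```python
import glob
import tarfile
from functools import partial
from multiprocessing import Pool


def extract_one(archive_path, dest_dir):
    """Open one .tar.bz2 archive inside the worker and extract it to dest_dir."""
    # Only extract archives you trust: extractall() writes the paths stored in the archive.
    with tarfile.open(archive_path, "r:bz2") as archive:
        archive.extractall(dest_dir)
    return archive_path


def extract_many(pattern, dest_dir, processes=5):
    """Extract every archive matching pattern, one archive per worker task."""
    archives = glob.glob(pattern)
    # Bind dest_dir so each task only needs the archive path.
    worker = partial(extract_one, dest_dir=dest_dir)
    with Pool(processes=processes) as pool:
        return pool.map(worker, archives)


if __name__ == "__main__":
    # Paths taken from the question; adjust them to your own drives.
    extract_many("D:\\*.tar.bz2", "E:\\")
```

Each task handles a whole archive, so the parallelism is across files, not within one. Whether five workers actually speed things up depends on how much of the time goes to bz2 decompression (CPU) versus disk reads and writes.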
