Python: load a CSV file with quoted fields where commas are used as the thousands separator
Is there a simple way in Python to load a CSV file that may contain lines like the ones listed below into a dataframe?

    1.0, 2.0, 3.0, "123,456,789.999"
    1000.0, 2000.0, 3000.0, "123,456,789.123"

Obviously the type of the columns should be numeric (float64, int64, etc.).

Additionally, some countries use a space " " as the thousands separator rather than a comma. Is there a way to specify that?
pandas.io.parsers.read_table can handle comma-separated numbers provided you give it a converters argument with functions that handle the commas:

    converters: dict, optional. Dict of functions for converting values in certain columns. Keys can either be integers or column labels.
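For example, here is a minimal sketch of that approach, assuming the four unnamed columns from the sample above, a file named file.csv, and read_csv (read_table is the older entry point). The same converter also covers the space-as-thousands-separator case raised in the question.

    import pandas as pd

    def strip_separators(value):
        # Drop comma (or space) thousands separators before converting to float.
        return float(value.replace(',', '').replace(' ', ''))

    # The quoted numbers sit in the fourth column (index 3) of the sample data.
    df = pd.read_csv('file.csv', header=None, skipinitialspace=True,
                     converters={3: strip_separators})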
Here is a solution in vanilla Python:
    import csv

    def try_convert_number(s):
        # Strip the thousands separators, then try int, then float,
        # and finally fall back to returning the original string.
        val = s.replace(',', '')
        try:
            return int(val)
        except ValueError:
            try:
                return float(val)
            except ValueError:
                return s

    result = []
    # In Python 2 use: open('file.csv', 'rb')
    with open('file.csv', newline='') as f:
        reader = csv.reader(f)
        if you_have_a_header_row:
            next(reader)
        for row in reader:
            result.append(list(map(try_convert_number, row)))
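Since the question asks for a dataframe, the resulting list of converted rows can be handed to pandas afterwards (a small sketch, assuming pandas is installed):

    import pandas as pd

    # 'result' is the list of converted rows built above.
    df = pd.DataFrame(result)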
Another option is to create a new CSV file that lacks the superfluous commas:
    import csv

    def replace_commas(s):
        return s.replace(',', '')

    # Rewrite each row with the commas stripped from every field.
    with open('orig.csv', newline='') as fin, open('new.csv', 'w', newline='') as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout)
        for row in reader:
            writer.writerow(map(replace_commas, row))
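Once the cleaned file exists it can be read straight into a dataframe with normal dtype inference (the file name and the absence of a header row are assumptions carried over from the sample above):

    import pandas as pd

    df = pd.read_csv('new.csv', header=None)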