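Requirement:
There are multiple files, each in the format shown below. The data from all of the files must be merged and sorted by the first column; fields are separated by spaces. The data set is too large to handle in Excel.
The data format is as follows: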
0.01 0 50661888 8 1
0.01 0 50661896 8 1
21.27 0 50661904 8 1
4616.62 0 92880896 8 1
4616.64 0 92880904 8 1
4616.64 0 92880912 8 1
4616.64 0 92880920 8 1
4616.65 0 92880928 8 1
4616.65 0 92880936 8 1
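Since the rows have to be ordered by the first column as a number, the sort key matters: comparing the fields as text would put "4616.62" before "5.00". The short sketch below illustrates the difference. The first three rows are taken from the sample above; the "5.00" row is a hypothetical value added only to make the difference visible.

```python
rows = [
    "4616.62 0 92880896 8 1",
    "21.27 0 50661904 8 1",
    "0.01 0 50661888 8 1",
    "5.00 0 0 8 1",  # hypothetical row, not from the real data
]

# Split each line on whitespace into a tuple of string fields.
parsed = [tuple(line.split()) for line in rows]

# Default tuple comparison treats the first field as text.
by_text = sorted(parsed)

# Converting the first field to float gives the numeric order that is wanted.
by_number = sorted(parsed, key=lambda row: float(row[0]))

print([row[0] for row in by_text])
print([row[0] for row in by_number])
```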
python:
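#!/usr/bin/env python3
import os

# Directory that holds the input files.
DATA_DIR = "E:\\Dataspace\\zhou"
# The result is written into the same directory, so it is skipped on input.
OUT_NAME = "out.txt"


def proc_file_one(filename, lst):
    """Append every data row of filename to lst as a tuple of fields."""
    with open(filename, "r") as fh:
        for line in fh:
            tp = tuple(line.split())
            if len(tp) > 1:
                lst.append(tp)
    return lst


def sort_key(row):
    """Sort numerically by the first column."""
    return float(row[0])


lst = []
for name in os.listdir(DATA_DIR):
    path = os.path.join(DATA_DIR, name)
    # Skip subdirectories and the output of a previous run.
    if name == OUT_NAME or not os.path.isfile(path):
        continue
    proc_file_one(path, lst)

# Write row by row instead of building one huge string first.
with open(os.path.join(DATA_DIR, OUT_NAME), "w") as out:
    for row in sorted(lst, key=sort_key):
        out.write(" ".join(row) + "\n")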
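The script above keeps every row in memory before sorting. If the combined data does not fit in memory, one possible alternative is an external merge sort: sort fixed-size chunks, write each chunk to a temporary file, then merge the sorted chunks lazily with heapq.merge. The following is only a sketch under that assumption and not part of the original post. The names merge_sorted, first_column, _write_run and the chunk_lines parameter are hypothetical and chosen for illustration.

```python
import heapq
import os
import tempfile


def first_column(line):
    """Numeric sort key: the first whitespace-separated field of a line."""
    return float(line.split(None, 1)[0])


def _write_run(lines, run_dir):
    """Sort one chunk in memory and write it to a temporary run file."""
    lines.sort(key=first_column)
    fd, path = tempfile.mkstemp(dir=run_dir, suffix=".run", text=True)
    with os.fdopen(fd, "w") as fh:
        fh.writelines(lines)
    return path


def merge_sorted(input_paths, output_path, chunk_lines=1_000_000):
    """Merge all input files into output_path, sorted by the first column.

    chunk_lines (an assumed tuning knob) bounds how many rows are held
    in memory at once.
    """
    with tempfile.TemporaryDirectory() as run_dir:
        runs, chunk = [], []
        for path in input_paths:
            with open(path, "r") as fh:
                for line in fh:
                    fields = line.split()
                    if len(fields) <= 1:
                        continue  # skip blank or single-field lines
                    chunk.append(" ".join(fields) + "\n")
                    if len(chunk) >= chunk_lines:
                        runs.append(_write_run(chunk, run_dir))
                        chunk = []
        if chunk:
            runs.append(_write_run(chunk, run_dir))

        # heapq.merge pulls from the sorted runs lazily, line by line.
        handles = [open(run, "r") for run in runs]
        try:
            with open(output_path, "w") as out:
                out.writelines(heapq.merge(*handles, key=first_column))
        finally:
            for fh in handles:
                fh.close()
```

A call might look like merge_sorted(list_of_input_files, "out.txt"), where the output path is not one of the inputs. Every run file stays open during the merge, so a very small chunk_lines on a very large data set could run into the operating system's limit on open files.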
Reposted from: https://www.cnblogs.com/nohadoop/p/4424469.html