Peeking at Large Files in Python

July 14, 2008 at 3:07 pm

I have been parsing files in the multi-gigabyte range. Python handles them pretty well, but it can still take a while to chug through them. To be honest, I don’t know of any great tricks to speed this up. However, one thing that helps when parsing a large file is to read just the first few lines, so you can see the format without reading through all of it. The following code copies the first 100 lines of a text file into a new file whose name starts with peek_. To process the entire file instead, you would just take out the if/else check and write every line.

inFileName = "associations.txt"
inFile = open(inFileName, 'r')
outFile = open("peek_%s" % inFileName, 'w')

count = 0

# Copy lines until 100 have been written, then stop reading.
for line in inFile:
    count += 1
    if count <= 100:
        outFile.write(line)
    else:
        break

outFile.close()
inFile.close()
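
If you would rather look at the lines directly instead of writing them to a second file, the same idea can be sketched with itertools.islice from the standard library. This is only a minimal sketch, not the code I originally used: the function name peek and its parameter n are made up for illustration.

```python
from itertools import islice


def peek(path, n=100):
    """Return up to the first n lines of the text file at path (hypothetical helper)."""
    # The with block closes the file even if an error occurs while reading.
    with open(path, "r") as handle:
        # islice stops iterating after n lines, so the file is not read to the end.
        return list(islice(handle, n))
```

Calling peek("associations.txt", 5) would return the first five lines as a list of strings, each still ending in its newline. If the file has fewer than n lines, you simply get all of them.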

Entry filed under: Python.
