Reading really large files in Python

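There are many ways to read files in Python, which is somewhat unintuitive and runs against the "one obvious way to do it" principle from the Zen of Python (PEP 20). However, the best approach for large files is to iterate over the file object directly inside a with block.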

The file object behaves like a generator: it reads only one line at a time into memory (plus a small internal buffer). I've used this pattern to process multi-gigabyte files with ease. Using with is useful since it automatically closes the file once the indented block beneath it has finished, even if an error is raised partway through.
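
A minimal sketch of the approach described above: iterating over the file object inside a with block. The function name count_matching_lines, its parameters, and the UTF-8 encoding are assumptions made for illustration, not part of the original post.

```python
def count_matching_lines(path, needle, encoding="utf-8"):
    """Count the lines containing `needle` without loading the whole file.

    `path`, `needle` and `encoding` are hypothetical parameters chosen
    for this example.
    """
    count = 0
    # `with` closes the file when the block exits, even on an exception.
    with open(path, encoding=encoding) as f:
        # Iterating over the file object yields one line at a time,
        # so memory use stays roughly constant regardless of file size.
        for line in f:
            if needle in line:
                count += 1
    return count
```

Any per-line processing can replace the body of the loop; the important part is that the loop runs over f itself rather than over f.read() or f.readlines().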

Often people make the mistake of using:

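  • for line in f.read()  – this loads the entire file into memory as one string, and the loop then iterates over it character by character, not line by line
  • for line in f.readlines()  – this loads the entire file into memory as a list of lines before the loop starts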
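
The difference between these two mistakes and plain iteration can be seen in a small sketch. Here io.StringIO stands in for a real file so the example runs without touching disk; that substitution and the function name iterate_three_ways are assumptions for illustration only.

```python
import io


def iterate_three_ways(text):
    """Return what each loop style would iterate over for the given text."""
    # f.read() returns one big string, so the loop sees single characters.
    chars = [item for item in io.StringIO(text).read()]
    # f.readlines() builds the complete list of lines up front.
    all_lines = io.StringIO(text).readlines()
    # Iterating the file object yields the same lines, one at a time.
    lazy_lines = [line for line in io.StringIO(text)]
    return chars, all_lines, lazy_lines
```

For the input "ab\ncd\n", the first loop style walks over six single characters, while the other two see the same two lines. Only direct iteration avoids holding the whole file in memory at once.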
