Python makes downloading files easy. Parsing data directly from memory is often more convenient than managing temporary files, directories, and the cleanup that follows. Fortunately, parsing data in memory is just as easy as (if not easier than) saving the file to local storage first.
Parsing a CSV File from Memory
In this article, we’ll cover how to use the requests library to download a remote CSV file directly into memory and parse it using the csv module. This doesn’t involve much code, but it is helpful to outline the steps first. Here is what will happen:
- Define a URL
- Send HTTP GET request via the requests library
- Convert response data into iterator of lines
- Parse response lines as CSV reader object
- Iterate, save, or manipulate as desired
Listing these steps takes almost as many characters as the actual code. The implementation below iterates over a csv.reader object and prints each row to standard output:
    import csv
    import requests

    # Define the remote URL
    url = "https://query1.finance.yahoo.com/v7/finance/download/SPY"

    # Send HTTP GET request via requests
    data = requests.get(url)

    # Convert the response text to a list of lines
    lines = data.text.splitlines()

    # Parse as CSV reader object
    reader = csv.reader(lines)

    # View result
    for row in reader:
        print(row)

Output:

    ['Date', 'Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume']
    ['2021-08-27', '447.119995', '450.649994', '447.070007', '450.100006', '450.100006', '58642636']
Here we see two rows of data printed: the first is the header and the second is the most recent OHLC quote for $SPY. Converting the response text into an iterable of lines is as simple as calling the splitlines() method. This is similar to calling split('\n'), except that splitlines() also handles other line endings such as \r\n and does not leave a trailing empty string. In either case, the result is a list of strings, which satisfies the csv.reader function's requirement for an iterable that yields one line at a time.
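To see how splitlines() and split('\n') behave on the same input, here is a minimal sketch. The sample text is made up for illustration (it is not real quote data from the URL above) and uses Windows-style line endings with a trailing newline, which is where the two calls differ.

```python
import csv

# Hypothetical sample payload: \r\n line endings and a trailing newline
text = "Date,Close\r\n2021-08-27,450.100006\r\n"

# splitlines() recognizes \n, \r\n, and \r, and drops the terminators
clean_lines = text.splitlines()

# split("\n") keeps the \r on each line and adds a trailing empty string
raw_lines = text.split("\n")

print(clean_lines)  # ['Date,Close', '2021-08-27,450.100006']
print(raw_lines)    # ['Date,Close\r', '2021-08-27,450.100006\r', '']

# csv.reader accepts any iterable that yields one line (string) at a time
rows = list(csv.reader(clean_lines))
print(rows)  # [['Date', 'Close'], ['2021-08-27', '450.100006']]
```

For data that may come from servers using different line endings, splitlines() is usually the safer of the two choices.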
Final Thoughts
Python offers several ways to download a file. In any case, the resulting text can be parsed directly from memory rather than first being saved to local storage. This is often a more efficient approach when temporary files are undesirable and the remote files are not excessively large. Readers interested in retrieving financial data using Python should check out the article 3 Easy Ways to Get Financial Data Using Python.
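As one illustration of another way to download a file, the sketch below uses only the standard library: urllib.request for the download, io.StringIO to wrap the text as an in-memory file, and csv.DictReader to key each row by its header. The helper names fetch_text and parse_csv_text are hypothetical, and the sample text is invented so the parsing step can run without a network connection.

```python
import csv
import io
import urllib.request


def fetch_text(url, timeout=10):
    """Download a URL and return its body as text (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 if the server does not declare a charset
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


def parse_csv_text(text):
    """Parse CSV text held in memory into a list of dicts keyed by header."""
    buffer = io.StringIO(text, newline="")  # in-memory file-like object
    return list(csv.DictReader(buffer))


# Made-up sample standing in for a downloaded response body
sample = "Date,Open,Close\n2021-08-27,447.119995,450.100006\n"
records = parse_csv_text(sample)
print(records[0]["Close"])  # 450.100006
```

With a reachable URL, the two helpers would be combined as parse_csv_text(fetch_text(url)); no file is written to disk at any point.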