The main reasons for using raw streams are to improve performance and save memory, especially when dealing with large files. With a streamed download, the response body is read chunk by chunk instead of being loaded into memory in its entirety.
Original raw stream
If you want the raw data stream returned by the server, use the response.raw property. It returns a urllib3 HTTPResponse object, and you can call its read() method to read the content. For this to work, you must set stream=True in the initial request:
import requests

url = "http://example.com/largefile.zip"
response = requests.get(url, stream=True)
data = response.raw.read()
You can specify the number of bytes to read:
response.raw.read(1024)
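Note that read(n) may return fewer than n bytes, and returns an empty bytes object once the stream is exhausted, so fixed-size reads are usually wrapped in a loop. The sketch below shows that loop; io.BytesIO stands in for response.raw here so the example runs without a network connection, but the pattern is the same for a streamed response.

```python
import io

def read_in_chunks(raw, chunk_size=1024):
    """Read a file-like object chunk by chunk, yielding each block of bytes."""
    while True:
        chunk = raw.read(chunk_size)
        if not chunk:  # b'' signals end of stream
            break
        yield chunk

# io.BytesIO stands in for response.raw in this offline sketch
fake_raw = io.BytesIO(b"x" * 3000)
chunks = list(read_in_chunks(fake_raw, chunk_size=1024))
print([len(c) for c in chunks])  # [1024, 1024, 952]
```

Only the final block is short; every earlier call fills the full 1024-byte chunk.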
iter_content method
When the response is large, the response.iter_content() method is the preferred way to process it. It returns an iterator that yields the response content block by block.
Example: Downloading a large file
In this example, the file is downloaded piece by piece, so the entire file never has to sit in memory at once.
import requests
url = 'http://example.com/largefile.zip'
response = requests.get(url, stream=True)
response.raise_for_status()
with open('largefile.zip', 'wb') as f:
    for chunk in response.iter_content(chunk_size=8192):  # read 8 KB at a time
        f.write(chunk)
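The write loop itself does not depend on requests. Below is a minimal offline sketch in which an in-memory list of blocks stands in for response.iter_content(chunk_size=8192) and an io.BytesIO buffer stands in for the output file; the guard also skips the empty keep-alive chunks that iter_content may yield.

```python
import io

def save_stream(chunks, dest):
    """Write an iterable of byte blocks to dest, returning the bytes written."""
    total = 0
    for chunk in chunks:
        if chunk:  # skip empty keep-alive chunks
            dest.write(chunk)
            total += len(chunk)
    return total

# Simulated blocks standing in for response.iter_content(chunk_size=8192)
fake_chunks = [b"a" * 8192, b"b" * 8192, b"", b"c" * 100]
buf = io.BytesIO()
written = save_stream(fake_chunks, buf)
print(written)  # 16484
```

Swapping fake_chunks for a real response.iter_content(chunk_size=8192) iterator and buf for an open file gives exactly the download loop shown above.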