If you're yolo-ing on the web and downloading a lot of content, especially arbitrary media files with a crawler, it's useful to check each file's MIME type and size before downloading it.
To do this with Python's requests module, set stream=True so that only the response headers are fetched up front, then examine them for the size and MIME type. After that, you can retrieve the content.
'Content-Length' gives the file size in bytes, while 'Content-Type' gives the MIME type. Neither is fully reliable: the server sets both, and 'Content-Length' can be missing altogether.
Here's a quick example.
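import requests

MAX_SIZE = 2**20  # 1 MiB
url = "https://i.imgur.com/AD3MbBi.jpeg"

# stream=True defers the body download until resp.content is accessed
with requests.get(url, stream=True, timeout=10) as resp:
    # Drop parameters such as "; charset=..." before comparing
    mime_type = resp.headers.get("Content-Type", "").split(";")[0].strip()
    # A missing Content-Length is treated as too large, so the file is skipped
    size = int(resp.headers.get("Content-Length", MAX_SIZE))
    if mime_type == "image/jpeg" and size < MAX_SIZE:
        with open("image.jpg", "wb") as f:
            f.write(resp.content)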
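Since the headers come from the server, a crawler may want a second line of defense: read the body in chunks and give up once the size limit is hit, whatever 'Content-Length' claimed. Below is a rough sketch of that idea, not a definitive implementation. The helper names (headers_ok, read_capped, fetch), the 8192-byte chunk size and the 10-second timeout are all assumptions made for illustration. The sketch also lets a response with no 'Content-Length' through and relies on the streaming cap instead.

```python
import requests

MAX_SIZE = 2**20  # 1 MiB, same limit as the example above


def headers_ok(headers, allowed_types=("image/jpeg",), max_size=MAX_SIZE):
    """Return True if the declared MIME type and size look acceptable."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    mime_type = headers.get("Content-Type", "").split(";")[0].strip().lower()
    if mime_type not in allowed_types:
        return False
    length = headers.get("Content-Length")
    if length is None:
        return True  # size unknown; the streaming cap below still applies
    return length.isdigit() and int(length) < max_size


def read_capped(chunks, max_size=MAX_SIZE):
    """Join byte chunks, returning None once max_size bytes are reached."""
    data = bytearray()
    for chunk in chunks:
        data.extend(chunk)
        if len(data) >= max_size:
            return None
    return bytes(data)


def fetch(url, timeout=10):
    """Download url if it passes the header checks and stays under the cap."""
    with requests.get(url, stream=True, timeout=timeout) as resp:
        resp.raise_for_status()
        if not headers_ok(resp.headers):
            return None
        # iter_content yields the body piece by piece instead of all at once
        return read_capped(resp.iter_content(chunk_size=8192))
```

Calling fetch("https://i.imgur.com/AD3MbBi.jpeg") would return the image bytes, or None if the type is wrong or the body turns out to be too big, so you only write the file when you get bytes back.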