Today, since I needed to store a large dataset and access it as if it were on my local machine, I explored AWS S3. After creating a bucket, I had a look at the official boto3 guide. The guide suggests downloading the files to your local disk and then processing them, which leads to a continuous cycle of downloading and deleting files and wears out your hard drive. So I found a way to load the images into a NumPy array through OpenCV directly, without ever writing them to disk. First of all, we have to install the following library (I assume you already have OpenCV and NumPy installed):
pip install boto3
Boto3 is the library responsible for accessing and exploring your AWS bucket. Now I'll show you my solution:
import boto3
import cv2
import numpy as np

# AWS CREDENTIALS
s3 = boto3.resource(
    service_name='s3',
    region_name='<BUCKET-REGION-NAME>',
    aws_access_key_id='<AWS-ACCESS-KEY-ID>',
    aws_secret_access_key='<AWS-SECRET-ACCESS-KEY>'
)

# BUCKET NAME
bucket_name = "<BUCKET-NAME>"
bucket = s3.Bucket(bucket_name)

for obj in bucket.objects.all():
    # RAW BYTES OF THE IMAGE, KEPT ONLY IN MEMORY
    img = obj.get()['Body'].read()
    # DECODING THE BYTES INTO A NUMPY ARRAY (BGR IMAGE)
    nparray = cv2.imdecode(np.frombuffer(img, np.uint8), cv2.IMREAD_COLOR)
    # SHOWING IT
    cv2.imshow("image", nparray)
    cv2.waitKey(0)
As you can see, it is very simple: you store the bytes of your image in a temporary variable, then decode them with the OpenCV function cv2.imdecode.