Basics of the Python Imaging Library (PIL)
Python Imaging Library (PIL)¶
- Date: 2020-03-01 (Sun)
from matplotlib import pyplot as plt
%matplotlib inline
import numpy as np
from PIL import Image

def print_info(arr):
    print("dtype: ", arr.dtype)
    print("range: ", f'({arr.min()}, {arr.max()})')
    print("shape: ", arr.shape)
fn = '/data/hayley/maptiles/paris/EsriImagery/16617_11283_15.png'
im = Image.open(fn)
plt.imshow(im);
PIL.Image <-> np.ndarray conversion¶
- PIL.Image to np.ndarray:

im = Image.open(img_fn)
np_im = np.asarray(im)

- np.ndarray to PIL.Image:

pil_im = Image.fromarray(np_im)
print(im.mode)
np_im = np.asarray(im)
print(np_im.shape)
plt.imshow(np_im[...,:3]);
im_3d = Image.fromarray(np_im)
plt.imshow(im_3d);
Basic Image Processing¶
plt.imshow(im_3d.crop((150, 50, 250, 200)))
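Besides crop, PIL.Image offers similarly simple methods for other common operations. A minimal sketch on a small synthetic image (the rectangle coordinates and target sizes here are arbitrary examples, not taken from the tile above):

```python
from PIL import Image
import numpy as np

# Synthetic RGB image: 200 rows x 300 columns, with a red rectangle
arr = np.zeros((200, 300, 3), dtype=np.uint8)
arr[50:150, 100:200] = [255, 0, 0]
im = Image.fromarray(arr)

cropped = im.crop((100, 50, 200, 150))  # box is (left, upper, right, lower)
resized = im.resize((150, 100))         # target is (width, height) -- note the order
rotated = im.rotate(90, expand=True)    # 90 deg CCW; expand grows the canvas to fit

print(cropped.size)  # (100, 100)
print(resized.size)  # (150, 100)
print(rotated.size)  # (200, 300)
```

Note that PIL's size and box conventions are (width, height), the opposite of numpy's (rows, columns) shape convention.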
PIL Modes¶
A mode of a PIL.Image object defines the type and depth of a pixel in the image. Each pixel uses the full range of the bit depth:
- a 1-bit pixel can take either 0 or 1
- an 8-bit pixel can take an integer value in the range 0, 1, ..., 255
Resource:
- Stack Overflow
Here's the kicker... if you want and expect an RGB image, you should just convert to RGB on opening:
im = Image.open(fn).convert('RGB')
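As a quick sanity check of what convert('RGB') does, here is a sketch on a tiny synthetic RGBA image (a stand-in for the PNG tile loaded above): the alpha channel is dropped and the mode changes.

```python
from PIL import Image
import numpy as np

# A tiny synthetic RGBA image standing in for the PNG tile loaded above
rgba = Image.fromarray(np.zeros((4, 4, 4), dtype=np.uint8), mode='RGBA')
rgb = rgba.convert('RGB')

print(rgba.mode, np.asarray(rgba).shape)  # RGBA (4, 4, 4)
print(rgb.mode, np.asarray(rgb).shape)    # RGB (4, 4, 3)
```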
List of standard modes:
- Single-channel images
  - 1: 1-bit pixels, binary black and white
  - L: 8-bit pixels, greyscale in 256 levels. "L" for "Luminance" (not color)
  - I: 32-bit signed integer pixels
  - F: 32-bit floating point pixels
- Multi-channel images
  - RGB: 3x8-bit pixels, true color
  - RGBA: 4x8-bit pixels, true color with transparency mask
  - CMYK: 4x8-bit pixels, color separation
  - LAB: 3x8-bit pixels, Lab color space
  - YCbCr: 3x8-bit pixels, color video format
  - HSV: 3x8-bit pixels, Hue-Saturation-Value
im_3d = im.convert('RGB')
print(im_3d.mode)
plt.imshow(im_3d);
Data type of PIL.Image objects¶
The underlying data of most PIL.Image objects has uint8 dtype, except:
- I: 32-bit signed integer pixels (4x8-bit signed integer)
- F: 32-bit floating point pixels (4x8-bit floating point)
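These dtypes can be verified directly by round-tripping through np.asarray. A small sketch with a synthetic RGBA image (the expected dtypes follow Pillow's documented modes):

```python
from PIL import Image
import numpy as np

# Synthetic RGBA source image (stand-in for the map tile used above)
im = Image.fromarray(np.zeros((4, 4, 4), dtype=np.uint8), mode='RGBA')

# Most modes are backed by uint8; I and F are the 32-bit exceptions
for mode, expected in [('RGB', np.uint8), ('L', np.uint8),
                       ('I', np.int32), ('F', np.float32)]:
    dtype = np.asarray(im.convert(mode)).dtype
    print(mode, dtype)
    assert dtype == expected
```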
# im is PIL.Image of mode `RGBA`
im_3d = im.convert('RGB')
np_im_3d = np.asarray(im_3d)
print_info(np_im_3d)
plt.imshow(np_im_3d);
im_L = im.convert('L')
np_im_L = np.asarray(im_L)
print_info(np_im_L)
plt.imshow(np_im_L);
im_I = im.convert('I') #single-channel, 32bit signed integer (not uint8)
np_im_I = np.asarray(im_I)
print_info(np_im_I)
plt.imshow(np_im_I);
im_F = im.convert('F') #single-channel, 32bit floats (not uint8) in range of [0.0, 256.0)
np_im_F = np.asarray(im_F)
print_info(np_im_F)
plt.imshow(np_im_F);
Conversions between PIL.Image and torch.Tensor¶
Use torchvision.transforms:
- tvts.ToTensor()(pil_im) returns a torch.FloatTensor of shape (C,H,W)
  - this transform scales the values to the range [0.0, 1.0] if the PIL.Image belongs to one of the modes (1, L, LA, P, RGB, YCbCr, RGBA, CMYK, I, F). Note this list of modes covers most of the cases.
- this function can also take an np.ndarray as input and return a torch.FloatTensor
  - it assumes the np.ndarray follows the (H,W,C) numpy image convention
  - scaling to the range [0.0, 1.0] happens if the np.ndarray has dtype np.uint8

Let's check whether the scaling effect of tvts.ToTensor() actually happens only when its input's dtype is np.uint8. Note that a PIL.Image object with mode F stores each pixel as a 32-bit float in the range [0.0, 256.0).
im_F = im.convert('F')
print('PIL Image mode: ', im_F.mode)
np_im_F = np.asarray(im_F)
print_info(np_im_F)
Now, let's try to convert this 32-bit floating point np.ndarray to a torch.Tensor object using the torchvision.transforms.ToTensor() class.
import torchvision.transforms as tvts
t_from_pil = tvts.ToTensor()(im_F)
print('=== t_from_pil ===')
print_info(t_from_pil)
t_from_np = tvts.ToTensor()(np_im_F)
print('=== t_from_np ===')
print_info(t_from_np)
We can see that the output tensors are still torch.FloatTensor (i.e. torch.Tensor of dtype float32), but the values are not scaled to the range [0.0, 1.0]. This confirms that tvts.ToTensor() scales its input PIL.Image or np.ndarray to the range [0.0, 1.0] only when the input's dtype is uint8. Note that the returned torch.Tensor always has 3 dims.
Now, let's check that it does the scaling when the input np.ndarray or PIL.Image indeed has dtype of unsigned 8-bit integer (i.e. uint8). We are going to use a PIL.Image with mode RGB as an example.
im_rgb = im.convert('RGB')
print(im_rgb.mode)
np_rgb = np.asarray(im_rgb)
print_info(np_rgb)
# to tensor
t_from_pil_rgb = tvts.ToTensor()(im_rgb)
print(' === t_from_pil_rgb === ')
print_info(t_from_pil_rgb)
t_from_np_rgb = tvts.ToTensor()(np_rgb)
print(' === t_from_np_rgb === ')
print_info(t_from_np_rgb)
Notice the range of the returned torch.Tensors: both are in the range [0.0, 1.0].
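To summarize, the scaling rule can be sketched in plain NumPy without torch. The to_tensor_like helper below is a hypothetical illustration of ToTensor's observed behavior (HWC to CHW, divide by 255 only for uint8 input), not the real torchvision implementation:

```python
import numpy as np

def to_tensor_like(np_im):
    """Sketch of torchvision ToTensor's scaling rule (illustration only):
    HWC -> CHW, and divide by 255 only when the input dtype is uint8."""
    arr = np_im.astype(np.float32)
    if np_im.dtype == np.uint8:
        arr = arr / 255.0
    if arr.ndim == 2:                 # greyscale: add a channel axis
        arr = arr[None, :, :]
    else:                             # HWC -> CHW
        arr = arr.transpose(2, 0, 1)
    return arr

rgb_uint8 = np.full((4, 4, 3), 255, dtype=np.uint8)
f32 = np.full((4, 4), 200.0, dtype=np.float32)

print(to_tensor_like(rgb_uint8).max())  # 1.0   (uint8 input: scaled)
print(to_tensor_like(f32).max())        # 200.0 (float input: not scaled)
print(to_tensor_like(rgb_uint8).shape)  # (3, 4, 4)
```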