This is a feature request for the `imgproc` module.
Motivation: people (on Stack Overflow and elsewhere) frequently want to overlay one image onto another using an actual alpha channel. That is more than blending two pictures, or fading between two pictures with one global alpha value. The top image is frequently not the same size as the bottom one, and it should land at a particular position.
Such a function would:
- take a destination image
- take an overlay image that should have an alpha channel (but need not)
- take a position
- if the destination image is an empty Mat, resize it to the overlay's size and fill it with the stereotypical gray checkerboard background instead
- overlay the top image at the given position (the overlay may be completely or partially off-screen)
- be sensitive to the alpha channel of the *bottom* image, and update it
- use the alpha channel of the top and bottom image to blend each pixel accordingly
- optionally erase the color data of pixels that are completely transparent (GIMP doesn't do it...)
- either alter the `dst` image or leave it alone (that's a design decision)
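
The alpha-related bullets above amount to the Porter-Duff "over" operator on straight (non-premultiplied) alpha. Here is a minimal NumPy sketch of that, assuming float images in the 0..1 range with the alpha in the fourth channel. The name `over_straight` and the `erase_transparent` flag are made up for illustration; they are not an existing OpenCV API.

```python
import numpy as np

def over_straight(src, dst, erase_transparent=False):
    """Composite src over dst; both are float arrays of shape (h, w, 4) in 0..1."""
    src_a = src[..., 3:4]
    dst_a = dst[..., 3:4]
    # resulting coverage: the top layer plus whatever of the bottom shows through
    out_a = src_a + dst_a * (1.0 - src_a)
    # color contributions, weighted by how visible each layer is
    out_c = src[..., :3] * src_a + dst[..., :3] * dst_a * (1.0 - src_a)
    # un-premultiply; avoid dividing by zero where the result is fully transparent
    visible = out_a > 0
    out_c = out_c / np.where(visible, out_a, 1.0)
    if not erase_transparent:
        # keep the bottom layer's (invisible) color data where nothing is visible
        out_c = np.where(visible, out_c, dst[..., :3])
    return np.concatenate([out_c, out_a], axis=-1)
```

With an opaque bottom image (alpha 1 everywhere) this reduces to the simpler `src * a + dst * (1 - a)` blend, and the bottom alpha stays at 1.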
[Related but separate concepts](https://github.com/opencv/opencv/issues?q=is%3Aissue+label%3A%22category%3A+imgproc%22++%28blend+OR+alpha+OR+transparen%29):
- `cvtColor` with `BGRA2BGR`, which simply drops the fourth channel
- drawing of primitives (lines and shapes) with some transparency
- blend modes, which only take into account the color channels of two images (in the basic formulation)
- simple fading, which applies the same transparency to every overlying pixel
- taking a subregion of a Mat and copying data into it
I think this would be a popular function to have. It could do more or less of the described functionality.
Until now, getting this functionality has required taking an explicit subregion, splitting the channels of the `Mat`s, and calculating the result manually (using whole-array operations).
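
The fiddly part of doing this by hand is finding that subregion when the overlay is partially off-screen. A small sketch of just that step in plain Python, assuming `position` is the (x, y) of the overlay's top-left corner in destination coordinates. `overlap_slices` is a hypothetical helper name, not something OpenCV provides.

```python
def overlap_slices(dst_size, src_size, position):
    """Return (dst_slices, src_slices) as (rows, cols) slice pairs, or None without overlap.

    Sizes are (width, height); position is (x, y) and may be negative.
    """
    (dst_w, dst_h), (src_w, src_h), (x, y) = dst_size, src_size, position
    # clip the overlay rectangle to the destination bounds
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + src_w, dst_w), min(y + src_h, dst_h)
    if x0 >= x1 or y0 >= y1:
        return None
    dst_slices = (slice(y0, y1), slice(x0, x1))
    # the same region, expressed in the overlay's own coordinates
    src_slices = (slice(y0 - y, y1 - y), slice(x0 - x, x1 - x))
    return dst_slices, src_slices
```

Indexing with the result, as in `dst[dst_slices]` and `src[src_slices]`, gives two views of equal height and width that can then be blended with whole-array operations.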
Here is some... *pseudocode* that should be a good guideline for a C++ implementation. Ask me if any of that Python is unfamiliar to you. I chose to work with floats because that makes the numerics slightly simpler. This code should be able to handle `uint8` and `float32` without problems.
Another "defect": this assumes a linear color space. That is often *not* the case but all of OpenCV assumes it. Accounting for the gamma-mapping would be "more" correct.
```python
#!/usr/bin/env python3

import numpy as np
import cv2 as cv

np.set_printoptions(suppress=True, linewidth=120)


def intersect_rects(r1, r2):
    # rects are (x, y, width, height); returns the overlap, or None if there is none
    (r1x, r1y, r1w, r1h) = r1
    (r2x, r2y, r2w, r2h) = r2

    rx = max(r1x, r2x)
    rw = min(r1x+r1w, r2x+r2w) - rx
    ry = max(r1y, r2y)
    rh = min(r1y+r1h, r2y+r2h) - ry

    if rw > 0 and rh > 0:
        return np.array([rx, ry, rw, rh])
    else:
        return None


def rect_to_slice(rect):
    (rx, ry, rw, rh) = rect
    return np.s_[ry:ry+rh, rx:rx+rw]


def composite(src, dst=None, position=(0,0), background_color=None):
    # this is pronounced "com-PUH-sit" to all you non-native speakers
    (src_height, src_width, src_ch) = src.shape
    src_has_alpha = (src_ch == 4)

    # shortcut
    # composite() may be useful to use in imshow, so people "see" transparent data
    # and if data has no alpha channel, pass through
    if dst is None and not src_has_alpha:
        return src

    if dst is None:
        # background (grid or flat color)
        dst = np.empty((src_height, src_width, 3), dtype=np.uint8)
        if background_color is None:
            (i,j) = np.mgrid[0:src_height, 0:src_width]
            i = (i // 8) % 2
            j = (j // 8) % 2
            dst[i == j] = 192
            dst[i != j] = 255
        else:
            dst[:,:] = background_color

    (dst_height, dst_width) = dst.shape[:2]
    dst_has_alpha = (dst.shape[2] == 4)

    src_rect = np.array([0, 0, src_width, src_height])
    dst_rect = np.array([0, 0, dst_width, dst_height])
    # (x, y, 0, 0): shifts a rect's origin, leaves its size alone
    offset = np.array([position[0], position[1], 0, 0])

    src_roi = intersect_rects(src_rect, dst_rect - offset)
    dst_roi = intersect_rects(dst_rect, src_rect + offset)

    if src_roi is not None: # there is overlap
        assert dst_roi is not None

        dst_slice = dst[rect_to_slice(dst_roi)]
        src_slice = src[rect_to_slice(src_roi)]

        if src_has_alpha:
            src_alpha = src_slice[:,:,3][..., None] # None adds a dimension for numpy broadcast rules
            if src_alpha.dtype == np.uint8:
                src_alpha = src_alpha / np.float32(255)

            # note: colors are blended as if dst were opaque;
            # dst's own alpha only enters the alpha update below
            blended = src_slice[:,:,0:3] * src_alpha + dst_slice[:,:,0:3] * (1-src_alpha)
        else:
            blended = src_slice[:,:,0:3]

        dst_slice[:,:,0:3] = blended.astype(dst.dtype)

        if dst_has_alpha:
            if src_has_alpha:
                # src_alpha[:,:,0] drops the broadcast axis so the shapes match (h, w)
                new_alpha = src_slice[:,:,3] + dst_slice[:,:,3] * (1-src_alpha[:,:,0])
            else:
                # an overlay without alpha is opaque, so the result is fully opaque too
                opaque = 255 if dst.dtype == np.uint8 else 1.0
                new_alpha = np.full(dst_slice.shape[:2], opaque)
            dst_slice[:,:,3] = new_alpha.astype(dst.dtype)

    return dst


def main():
    logo = cv.imread(cv.samples.findFile("opencv-logo.png"), cv.IMREAD_UNCHANGED)
    lena = cv.imread(cv.samples.findFile("lena.jpg"), cv.IMREAD_UNCHANGED)

    # smaller logo...
    # inpainting here to fill ordinarily transparent areas with color
    # those transparent pixels will become part of edge pixels (where alpha channel has gradients)
    # and if they stay black, those edge pixels will turn dark
    # a different fix would require alpha-aware resizing
    logo[:,:,0:3] = cv.inpaint(logo[:,:,0:3], inpaintMask=255-logo[:,:,3], inpaintRadius=0, flags=cv.INPAINT_NS)
    logo = cv.pyrDown(logo)

    composite0 = composite(logo)
    composite1 = composite(logo, background_color=(128, 255, 255))
    composite2 = composite(logo, dst=lena.copy(), position=(257,59))
    # NOTE: `dst` will be altered
    # this is a useful behavior in case we want to composite multiple things onto the same background
    # it's like the drawing primitives in imgproc
    # if you don't want it to be touched, pass a copy instead (as is done here)

    logo = (cv.pyrDown(cv.pyrDown(logo)))
    logo[:,:,3] //= 2 # halve the opacity for this demo
    for k in range(100):
        x = np.random.randint(-100, +600)
        y = np.random.randint(-100, +600)
        composite(logo, dst=lena, position=(x,y))

    cv.imshow("composite0", composite0)
    cv.imshow("composite1", composite1)
    cv.imshow("composite2", composite2)
    cv.imshow("lena", lena)

    while True:
        key = cv.waitKey(-1)
        if key in (13, 27): # ENTER, ESC
            break

    cv.destroyAllWindows()


if __name__ == '__main__':
    main()
```
![composite0](https://user-images.githubusercontent.com/7065108/135487541-4979e1f5-5f85-4384-8cfb-c750d7b23837.png) ![composite1](https://user-images.githubusercontent.com/7065108/135487524-1bf7122d-ef50-4eb5-bf64-8a39f07dba11.png)
![composite2](https://user-images.githubusercontent.com/7065108/135487497-a17b1b18-541f-452b-a9e6-40cbd6caebb9.png)
![lena](https://user-images.githubusercontent.com/7065108/135488615-9a856a1e-a9ee-4bef-83f6-81d49235565b.png)
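
On the linear color space caveat above: a rough sketch of what gamma-aware blending could look like, assuming sRGB-encoded float colors in 0..1 and an opaque background. The helper names are illustrative only. The piecewise curves are the standard sRGB transfer functions, which OpenCV does not apply for you here.

```python
import numpy as np

def srgb_to_linear(c):
    """Decode sRGB-encoded values (0..1) to linear light."""
    c = np.asarray(c, dtype=np.float64)
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)

def linear_to_srgb(c):
    """Encode linear-light values (0..1) back to sRGB."""
    c = np.clip(np.asarray(c, dtype=np.float64), 0.0, None)
    return np.where(c <= 0.0031308, c * 12.92, 1.055 * c ** (1 / 2.4) - 0.055)

def blend_gamma_aware(src_rgb, dst_rgb, alpha):
    """Blend sRGB-encoded colors in linear light; alpha is 0..1 and is not gamma-encoded."""
    lin = srgb_to_linear(src_rgb) * alpha + srgb_to_linear(dst_rgb) * (1.0 - alpha)
    return linear_to_srgb(lin)
```

The difference is visible: blending white over black at 50% alpha this way gives a noticeably lighter gray than the 0.5 that the plain blend in `composite` produces.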