README.md: 53 additions & 22 deletions
@@ -34,6 +34,10 @@ I've heard good things about this deep learning stuff, so let's try that. I firs
 
 I had a look at the corresponding [Python example](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/examples/python/label_image.py), [C++ example](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/examples/label_image), and [Android example](https://github.com/tensorflow/examples/tree/master/lite/examples/image_segmentation/android), and based on those, I first cobbled together a [Python demo](https://github.com/floe/deepbacksub/blob/master/deepseg.py). That was running at about 2.5 FPS, which is really excruciatingly slow, so I built a [C++ version](https://github.com/floe/deepbacksub/blob/master/deepseg.cc) which manages 10 FPS without too much hand optimization. Good enough.
 
+I've also tested a TFLite-converted version of the [Body-Pix model](https://blog.tensorflow.org/2019/11/updated-bodypix-2.html), but the results haven't been much different to DeepLab for this use case.
+
+More recently, Google has released a model specifically trained for [person segmentation that's used in Google Meet](https://ai.googleblog.com/2020/10/background-features-in-google-meet.html). This has way better performance than DeepLab, both in terms of speed and of accuracy, so this is now the default. It needs one custom op from the MediaPipe framework, but that was quite easy to integrate. Thanks to @jiangjianping for pointing this out in the [corresponding issue](https://github.com/floe/deepbacksub/issues/28).
+
 ## Replace Background
 
 This is basically one line of code with OpenCV: `bg.copyTo(raw,mask);` Told you that's the easy part.
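
To make the semantics of that masked copy concrete, here is a minimal pure-Python sketch of what a call like `bg.copyTo(raw,mask)` does per pixel. It is an illustration only, not the project's code: the helper name `copy_with_mask` and the toy nested-list "images" are assumptions, and the mask polarity (nonzero means "take the background pixel") is simply what the call shape implies.

```python
def copy_with_mask(src, dst, mask):
    """Copy src pixels into dst wherever mask is nonzero (modifies dst in place).

    Hypothetical stand-in for an OpenCV-style masked copy; images are
    plain nested lists here so the example needs no third-party packages.
    """
    for y, mask_row in enumerate(mask):
        for x, m in enumerate(mask_row):
            if m:  # nonzero mask value: overwrite with the source pixel
                dst[y][x] = src[y][x]
    return dst


# Toy 3x2 "images": "B" = background pixel, "p" = person, "x" = clutter to hide.
bg = [["B", "B", "B"], ["B", "B", "B"]]
raw = [["p", "p", "x"], ["x", "p", "x"]]
mask = [[0, 0, 255], [255, 0, 255]]  # assumed polarity: nonzero = replace with background

composited = copy_with_mask(bg, raw, mask)
```

After the call, `raw` keeps its "person" pixels and has the background written everywhere the mask is nonzero.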
@@ -48,46 +52,73 @@ The dataflow through the whole program is roughly as follows:
 
 - init
   - load background.png, convert to YUYV
-  - load DeepLab v3+ network, initialize TFLite
+  - initialize TFLite, register custom op
+  - load Google Meet segmentation model
   - setup V4L2 Loopback device (w,h,YUYV)
 - loop
   - grab raw YUYV image from camera
-  - extract square ROI in center
-  - downscale ROI to 257 x 257 (*)
+  - extract portrait ROI in center
+  - downscale ROI to 144 x 256 (*)
   - convert to RGB (*)
-  - run DeepLab v3+
-  - convert result to binary mask for class "person"
+  - run Google Meet segmentation model
+  - convert result to binary mask using softmax
   - denoise mask using erode/dilate
   - upscale mask to raw image size
   - copy background over raw image with mask (see above)
   - `write()` data to virtual video device
 
-(*) these are required input parameters for DeepLab v3+
+(*) these are required input parameters for this model
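
Two of the new loop steps, "extract portrait ROI in center" and "convert result to binary mask using softmax", can be sketched in a few lines of Python. This is a hedged illustration, not the code from the repository: the function names, the 0.5 threshold, the reading of "144 x 256" as width x height, and the assumption that the model emits one (background, person) score pair per pixel are all assumptions made for the example.

```python
import math


def center_roi(frame_w, frame_h, aspect_w, aspect_h):
    """Largest centered region with aspect ratio aspect_w:aspect_h.

    Returns (x, y, w, h) in frame coordinates. Hypothetical helper.
    """
    roi_h = frame_h
    roi_w = frame_h * aspect_w // aspect_h
    if roi_w > frame_w:  # frame is too narrow: limit by width instead
        roi_w = frame_w
        roi_h = frame_w * aspect_h // aspect_w
    return (frame_w - roi_w) // 2, (frame_h - roi_h) // 2, roi_w, roi_h


def person_probability(bg_score, person_score):
    """Two-class softmax, shifted by the max score for numerical stability."""
    m = max(bg_score, person_score)
    e_bg = math.exp(bg_score - m)
    e_person = math.exp(person_score - m)
    return e_person / (e_bg + e_person)


def binary_mask(scores, threshold=0.5):
    """scores: rows of (bg_score, person_score) pairs, one pair per pixel.

    Returns 255 for background pixels and 0 for person pixels, so the result
    can serve as the mask for copying the background over the raw image.
    The threshold value is an assumption.
    """
    return [
        [0 if person_probability(b, p) > threshold else 255 for (b, p) in row]
        for row in scores
    ]


# Portrait ROI for a 640x480 frame and a 144x256 model input.
roi = center_roi(640, 480, 144, 256)
```

For a 640x480 frame this gives a 270x480 region starting at x = 185, which would then be downscaled to the model's input size.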
 
 ## Requirements
 
 Tested with the following dependencies:
+
+- Ubuntu 20.04, x86-64
+  - Linux kernel 5.6 (stock package)
+  - OpenCV 4.2.0 (stock package)
+  - V4L2-Loopback 0.12.5 (stock package)
+  - Tensorflow Lite 2.4.0 (from [repo](https://github.com/tensorflow/tensorflow/tree/v2.4.0/tensorflow/lite))
 - Ubuntu 18.04.5, x86-64
-  - Linux kernel 4.15 (stock package)
-  - OpenCV 3.2.0 (stock package)
-  - V4L2-Loopback 0.10.0 (stock package)
-  - Tensorflow Lite 2.1.0 (from [repo](https://github.com/tensorflow/tensorflow/tree/v2.1.0/tensorflow/lite))
-  - Ultra-short build guide for Tensorflow Lite C++ library: clone repo above, then...
-    - run `./tensorflow/lite/tools/make/download_dependencies.sh`
-    - run `./tensorflow/lite/tools/make/build_lib.sh`
+  - Linux kernel 4.15 (stock package)
+  - OpenCV 3.2.0 (stock package)
+  - V4L2-Loopback 0.10.0 (stock package)
+  - Tensorflow Lite 2.1.0 (from [repo](https://github.com/tensorflow/tensorflow/tree/v2.1.0/tensorflow/lite))
 
 Tested with the following software:
+
 - Firefox
-  - 74.0.1 (works)
+  - 84.0 (works)
   - 76.0.1 (works)
+  - 74.0.1 (works)
 - Skype
-  - 8.58.0.93 (works)
+  - 8.67.0.96 (works)
   - 8.60.0.76 (works)
-- guvcview 2.0.5 (works with parameter `-c read`)
-- Microsoft Teams 1.3.00.5153 (works)
-- Chrome 81.0.4044.138 (works)
-- Zoom 5.0.403652.0509 (works - yes, I'm a hypocrite, I tested it with Zoom after all :-)
-
+  - 8.58.0.93 (works)
+- guvcview
+  - 2.0.6 (works with parameter `-c read`)
+  - 2.0.5 (works with parameter `-c read`)
+- Microsoft Teams
+  - 1.3.00.30857 (works)
+  - 1.3.00.5153 (works)
+- Chrome
+  - 87.0.4280.88 (works)
+  - 81.0.4044.138 (works)
+- Zoom - yes, I'm a hypocrite, I tested it with Zoom after all :-)
Run `make` to build everything (should also clone and build Tensorflow Lite).
+
+If the first part doesn't work:
+- Clone https://github.com/tensorflow/tensorflow/ repo into tensorflow/ folder
+- Checkout tag v2.4.0
+- run ./tensorflow/lite/tools/make/download_dependencies.sh
+- run ./tensorflow/lite/tools/make/build_lib.sh
+
 ## Usage
 
 First, load the v4l2loopback module (extra settings needed to make Chrome work):
@@ -106,13 +137,13 @@ As usual: pull requests welcome.
 - Resolution is currently hardcoded to 640x480 (lowest common denominator).
 - Only works with Linux, because that's what I use.
 - Needs a webcam that can produce raw YUYV data (but extending to the common YUV420 format should be trivial)
-- CPU hog: maxes out two cores on my 2.7 GHz i5 machine for just VGA @ 10 FPS.
-- Uses stock Deeplab v3+ network. Maybe re-training with only "person" and "background" classes could improve performance?
 
 ## Fixed
 
 - Should probably do an erosion (+ dilation?) operation on the mask.
 - Background image size needs to match camera resolution (see issue #1).
+- CPU hog: maxes out two cores on my 2.7 GHz i5 machine for just VGA @ 10 FPS. Fixed via Google Meet segmentation model.
+- Uses stock Deeplab v3+ network. Maybe re-training with only "person" and "background" classes could improve performance? Fixed via Google Meet segmentation model.