Commit ff82cb6

Food-on-Fork Detection (#169)
* [WIP] FoodOnForkDetector class, dummy instantiation, and node are in place but untested
* Tested dummy food on fork detector
* Deleted mistakenly added fork handle masks
* [WIP] added and tested train-test script
* Added PointCloudTTestDetector and tested on offline data
* Retrained with new dataset
* Fix config file
* Fix imports
* Re-add wrongly removed masks
* Retrained with new rosbag
* Added filters, retrained with only the new rosbag
* Have a great, working detector
* Formatting, cleaning up code
* Updated README
* Added launchfile
* Moved to forkTip frame and changed distance aggregator to 90th percentile
* Fixes from in-person testing
* Overlaid stored noFof points on camera image

---------

Co-authored-by: Ethan K. Gordon <[email protected]>
1 parent 6fc8907 commit ff82cb6

15 files changed: +2937 −23 lines

.pylintrc

Lines changed: 18 additions & 0 deletions
@@ -197,24 +197,42 @@ good-names=a,
            b,
            c,
            d,
+           f,
            i,
            j,
            k,
+           m,
+           M,
+           n,
+           p,
+           ps,
            x,
+           x0,
+           x1,
+           X,
            y,
+           y0,
+           y1,
            z,
            u,
+           us,
            v,
+           vs,
            w,
            h,
            r,
            rc,
+           S,
+           S_inv,
+           t,
            ax,
            ex,
            hz,
            kw,
            ns,
            Run,
+           train_X,
+           test_X,
            _

 # Good variable names regexes, separated by a comma. If names match any regex,

ada_feeding_msgs/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ find_package(rosidl_default_generators REQUIRED)
 rosidl_generate_interfaces(${PROJECT_NAME}
   "msg/AcquisitionSchema.msg"
   "msg/FaceDetection.msg"
+  "msg/FoodOnForkDetection.msg"
   "msg/Mask.msg"

   "action/AcquireFood.action"
ada_feeding_msgs/msg/FoodOnForkDetection.msg

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+# A message with the results of food on fork detection on a single frame.
+
+# The header for the image the detection corresponds to
+std_msgs/Header header
+
+# The status of the food-on-fork detector.
+int32 status
+int32 SUCCESS=1
+int32 ERROR_TOO_FEW_POINTS=-1
+int32 ERROR_NO_TRANSFORM=-2
+int32 UNKNOWN_ERROR=-99
+
+# A probability in [0,1] that indicates the likelihood that there is food on the
+# fork in the image. Only relevant if status == FoodOnForkDetection.SUCCESS
+float64 probability
+
+# Contains more details of the result, including any error messages that were encountered
+string message
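To make the message semantics concrete, here is a minimal sketch (not part of this commit) of an rclpy subscriber that consumes `FoodOnForkDetection` and only trusts `probability` when the detector reports `SUCCESS`. The topic name `/food_on_fork_detection` is taken from the README below; the node name, callback, and QoS depth are illustrative assumptions.

```python
# Minimal sketch (not from this commit): only act on `probability` when
# the detector reports SUCCESS; otherwise log the error details in `message`.
import rclpy
from rclpy.node import Node
from ada_feeding_msgs.msg import FoodOnForkDetection


class FoodOnForkListener(Node):
    def __init__(self):
        super().__init__("food_on_fork_listener")
        self.subscription = self.create_subscription(
            FoodOnForkDetection, "/food_on_fork_detection", self.callback, 1
        )

    def callback(self, msg: FoodOnForkDetection) -> None:
        if msg.status == FoodOnForkDetection.SUCCESS:
            self.get_logger().info(f"P(food on fork) = {msg.probability:.2f}")
        else:
            # On error (e.g., too few points, no transform), `message` has details.
            self.get_logger().warn(f"Detection failed ({msg.status}): {msg.message}")


def main():
    rclpy.init()
    rclpy.spin(FoodOnForkListener())


if __name__ == "__main__":
    main()
```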

ada_feeding_perception/README.md

Lines changed: 32 additions & 0 deletions
@@ -91,3 +91,35 @@ Launch the web app along with all the other nodes (real or dummy) as documented
 - `offline.images` (list of strings, required): The paths, relative to `install/ada_feeding_perception/share/ada_feeding_perception`, to the images to test.
 - `offline.point_xs` (list of ints, required): The x-coordinates of the seed points. Must be the same length as `offline.images`.
 - `offline.point_ys` (list of ints, required): The y-coordinates of the seed points. Must be the same length as `offline.images`.
+
+## Food-on-Fork Detection
+
+Our eye-in-hand Food-on-Fork Detection node and its training/testing infrastructure were designed to make it easy to substitute and compare other food-on-fork detectors. Below are instructions on how to do so.
+
+1. **Developing a new food-on-fork detector**: Create a subclass of `FoodOnForkDetector` that implements all of the abstract methods. Note that, as of now, a model does not have access to a real-time TF Buffer at test time; hence, **all transforms that the model relies on must be static**.
+2. **Gather the dataset**: Because this node uses the eye-in-hand camera, it is sensitive to the relative pose between the camera and the fork. If you are using PRL's robot, [the dataset collected in early 2024](https://drive.google.com/drive/folders/1hNciBOmuHKd67Pw6oAvj_iN_rY1M8ZV0?usp=drive_link) may be sufficient. Otherwise, you should collect your own dataset:
+    1. The dataset should consist of a series of ROS2 bags, each recording the following: (a) the aligned depth to color image topic; (b) the color image topic; (c) the camera info topic (we assume it is the same for both); and (d) the TF topic(s).
+    2. We recorded three types of bags: (a) bags where the robot was going through the motions of feeding without food on the fork and without the fork nearing a person or plate; (b) the same as above but with food on the fork; and (c) bags where the robot was acquiring and feeding a bite to someone. We used the first two types of bags for training and the third type for evaluation.
+    3. All ROS2 bags should be in the same directory, with a file `bags_metadata.csv` at the top level of that directory.
+    4. `bags_metadata.csv` contains the following columns: `rosbag_name` (str), `time_from_start` (float), `food_on_fork` (0/1), and `arm_moving` (0/1). The file only needs rows for timestamps when one or both of the latter two columns change; intermediate timestamps are assumed to keep the same values.
+    5. To generate `bags_metadata.csv`, we recommend launching RVIZ, adding your depth and/or RGB image topic, and playing back the bag, e.g.:
+        1. `ros2 run rviz2 rviz2 --ros-args -p use_sim_time:=true`
+        2. `ros2 bag play 2024_03_01_two_bites_3 --clock`
+        3. Pause and play the rosbag when food goes on/off the fork and when the arm starts/stops moving, and populate `bags_metadata.csv` accordingly (the elapsed time since bag start is visible at the bottom of RVIZ2).
+3. **Train/test the model on offline data**: We provide a flexible Python script, `food_on_fork_train_test.py`, to train, test, and/or compare one or more food-on-fork models. To use it, first ensure you have built and sourced your workspace and that you are in the directory that contains the script (e.g., `cd ~/colcon_ws/src/ada_feeding/ada_feeding_perception/ada_feeding_perception`). To enable flexible use, the script has **many** command-line arguments; we recommend you read their descriptions with `python3 food_on_fork_train_test.py -h`. For reference, we include the command we used to train our model below:
+    ```
+    python3 food_on_fork_train_test.py --model-classes '{"distance_no_fof_detector_with_filters": "ada_feeding_perception.food_on_fork_detectors.FoodOnForkDistanceToNoFOFDetector"}' --model-kwargs '{"distance_no_fof_detector_with_filters": {"camera_matrix": [614.5933227539062, 0.0, 312.1358947753906, 0.0, 614.6914672851562, 223.70831298828125, 0.0, 0.0, 1.0], "min_distance": 0.001}}' --lower-thresh 0.25 --upper-thresh 0.75 --train-set-size 0.5 --crop-top-left 344 272 --crop-bottom-right 408 336 --depth-min-mm 310 --depth-max-mm 340 --rosbags-select 2024_03_01_no_fof 2024_03_01_no_fof_1 2024_03_01_no_fof_2 2024_03_01_no_fof_3 2024_03_01_no_fof_4 2024_03_01_fof_cantaloupe_1 2024_03_01_fof_cantaloupe_2 2024_03_01_fof_cantaloupe_3 2024_03_01_fof_strawberry_1 2024_03_01_fof_strawberry_2 2024_03_01_fof_strawberry_3 2024_02_29_no_fof 2024_02_29_fof_cantaloupe 2024_02_29_fof_strawberry --seed 42 --temporal-window-size 5 --spatial-num-pixels 10
+    ```
+    Note that we trained our model on data where the fork either had or didn't have food the whole time and never neared any objects (e.g., the plate or the user's mouth). (Also note that not all of the above ROS2 bags are necessary; we've trained accurate detectors with half of them.) We then did an offline evaluation of the model on bags of actual feeding data:
+    ```
+    python3 food_on_fork_train_test.py --model-classes '{"distance_no_fof_detector_with_filters": "ada_feeding_perception.food_on_fork_detectors.FoodOnForkDistanceToNoFOFDetector"}' --model-kwargs '{"distance_no_fof_detector_with_filters": {"camera_matrix": [614.5933227539062, 0.0, 312.1358947753906, 0.0, 614.6914672851562, 223.70831298828125, 0.0, 0.0, 1.0], "min_distance": 0.001}}' --lower-thresh 0.25 --upper-thresh 0.75 --train-set-size 0.5 --crop-top-left 308 248 --crop-bottom-right 436 332 --depth-min-mm 310 --depth-max-mm 340 --rosbags-select 2024_03_01_two_bites 2024_03_01_two_bites_2 2024_03_01_two_bites_3 2024_02_29_two_bites --seed 42 --temporal-window-size 5 --spatial-num-pixels 10 --no-train
+    ```
+4. **Test the model on online data**: First, copy the parameters you used when training your model, as well as the filename of the saved model, to `config/food_on_fork_detection.yaml`. Re-build and source your workspace.
+    1. **Live Robot**:
+        1. Launch the robot as usual; the `ada_feeding_perception` launchfile will launch food-on-fork detection.
+        2. Toggle food-on-fork detection on: `ros2 service call /toggle_food_on_fork_detection std_srvs/srv/SetBool "{data: true}"`
+        3. Echo the output of food-on-fork detection: `ros2 topic echo /food_on_fork_detection`
+    2. **ROS2 bag data**:
+        1. Launch perception: `ros2 launch ada_feeding_perception ada_feeding_perception.launch.py`
+        2. Toggle food-on-fork detection on and echo the output of food-on-fork detection, as documented above.
+        3. Launch RVIZ and play back a ROS2 bag, as documented above.
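As a concrete illustration of the `bags_metadata.csv` format described in step 2.4 of the README addition above, here is a small hedged sample. The column names and the 0/1 semantics come from the README; the bag name is taken from the commands above, but the timestamps and annotation values are made up for illustration.

```
rosbag_name,time_from_start,food_on_fork,arm_moving
2024_03_01_two_bites_3,0.0,0,0
2024_03_01_two_bites_3,2.5,0,1
2024_03_01_two_bites_3,14.8,1,1
2024_03_01_two_bites_3,31.2,0,1
2024_03_01_two_bites_3,47.0,0,0
```

Each row marks a timestamp at which `food_on_fork` and/or `arm_moving` changes; the values are assumed to hold until the next row for that bag.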

ada_feeding_perception/ada_feeding_perception/depth_post_processors.py

Lines changed: 18 additions & 4 deletions
@@ -7,11 +7,13 @@
 from typing import Callable

 # Third-party imports
+from builtin_interfaces.msg import Time
 import cv2 as cv
 from cv_bridge import CvBridge
 import numpy as np
 import numpy.typing as npt
 from sensor_msgs.msg import Image
+from std_msgs.msg import Header


 def create_mask_post_processor(
@@ -58,7 +60,10 @@ def mask_post_processor(msg: Image) -> Image:

         # Get the new img message
         masked_msg = bridge.cv2_to_imgmsg(masked_img)
-        masked_msg.header = msg.header
+        masked_msg.header = Header(
+            stamp=Time(sec=msg.header.stamp.sec, nanosec=msg.header.stamp.nanosec),
+            frame_id=msg.header.frame_id,
+        )

         return masked_msg

@@ -124,7 +129,10 @@ def temporal_post_processor(msg: Image) -> Image:

         # Get the new img message
         masked_msg = bridge.cv2_to_imgmsg(masked_img)
-        masked_msg.header = msg.header
+        masked_msg.header = Header(
+            stamp=Time(sec=msg.header.stamp.sec, nanosec=msg.header.stamp.nanosec),
+            frame_id=msg.header.frame_id,
+        )

         return masked_msg

@@ -176,7 +184,10 @@ def spatial_post_processor(msg: Image) -> Image:

         # Get the new img message
         masked_msg = bridge.cv2_to_imgmsg(masked_img)
-        masked_msg.header = msg.header
+        masked_msg.header = Header(
+            stamp=Time(sec=msg.header.stamp.sec, nanosec=msg.header.stamp.nanosec),
+            frame_id=msg.header.frame_id,
+        )

         return masked_msg

@@ -234,7 +245,10 @@ def threshold_post_processor(msg: Image) -> Image:

         # Get the new img message
         masked_msg = bridge.cv2_to_imgmsg(masked_img)
-        masked_msg.header = msg.header
+        masked_msg.header = Header(
+            stamp=Time(sec=msg.header.stamp.sec, nanosec=msg.header.stamp.nanosec),
+            frame_id=msg.header.frame_id,
+        )

         return masked_msg

ada_feeding_perception/ada_feeding_perception/face_detection.py

Lines changed: 4 additions & 4 deletions
@@ -56,6 +56,8 @@ class FaceDetectionNode(Node):
     let the client decide which face to use.
     """

+    # pylint: disable=duplicate-code
+    # Much of the logic of this node mirrors FoodOnForkDetection. This is fine.
     # pylint: disable=too-many-instance-attributes
     # Needed for multiple model loads, publisher, subscribers, and shared variables
     def __init__(
@@ -305,10 +307,6 @@ def toggle_face_detection_callback(
         the face detection on or off depending on the request.
         """

-        # pylint: disable=duplicate-code
-        # We follow similar logic in any service to toggle a node
-        # (e.g., face detection)
-
         self.get_logger().info(f"Incoming service request. data: {request.data}")
         response.success = False
         response.message = f"Failed to set is_on to {request.data}"
@@ -563,6 +561,7 @@ def get_mouth_depth(
             f"Corresponding RGB image message received at {rgb_msg.header.stamp}. "
             f"Time difference: {min_time_diff} seconds."
         )
+        # TODO: This should use the ros_msg_to_cv2_image helper function
         image_depth = self.bridge.imgmsg_to_cv2(
             closest_depth_msg,
             desired_encoding="passthrough",
@@ -651,6 +650,7 @@ def run(self) -> None:
                 continue

             # Detect the largest face in the RGB image
+            # TODO: This should use the ros_msg_to_cv2_image helper function
             image_bgr = cv2.imdecode(
                 np.frombuffer(rgb_msg.data, np.uint8), cv2.IMREAD_COLOR
             )
