[Object Detection] Add YOLOv11 Architecture and Presets #1952
Conversation
Haven't gone over the code yet, but re the API questions...
I think this is probably the right call? If we think it will overall reduce the amount of code without becoming a spaghetti of if, sounds like a worthy clean up.
In general if we are seeing the same layers being used across different models, and we can write a common one that covers both cases, that's a good time to consider pulling things out as a layer. Adding common routines as a layer will up the requirements a bit (solid testing, good docs with examples, etc.), but that's not an issue just something to keep in mind. Random and not originating on this PR, but |
Thanks for weighing in @mattdangerw!
The original repo is written to allow for this customizability, so in principle, it shouldn't be too hard to do it here. Though, the original repo has a lot of proprietary structures and code which we don't want to port, so we'll have to trim down a lot on the sides.
Agreed. It was named |
Yeah, let's keep consistent for now! Leave as is for this PR.
Closing because we decided not to add YOLOv11 to KerasHub.
May I ask why this decision was made? Since YOLOv8 is already included in keras-cv, I believe adding the latest version of YOLO (YOLOv11), which offers state-of-the-art performance, would be a valuable addition. Personally, I prefer using Keras over TensorFlow, PyTorch, or OpenCV due to its developer-friendly design. Supporting newer models like YOLOv11 would further enhance the usability and appeal of Keras for many developers.
Draft PR for transparency.
Done
Planned
Out of Scope
These will be exported as separate tasks (i.e. ImagePoseEstimator, ImageInstanceSegmentor, their respective preprocessors, etc.) in separate PRs.
API Considerations
There will be a lot of reuse between YOLOv11 OD, YOLOv11 Pose, etc. Some functions, such as non-max suppression, can be wrapped into generic public layers and reused across object detectors. We could benefit from refactoring these into general utils in KerasHub (currently, they belong to individual models, as in the case of RetinaNet).
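To make the reuse argument concrete, here is a minimal, framework-agnostic sketch of greedy non-max suppression in plain Python. It is not the KerasHub or RetinaNet implementation. The function names, the `(x1, y1, x2, y2)` box format, and the default threshold are assumptions chosen for illustration; a real shared layer would operate on batched tensors.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first.

    Hypothetical helper for illustration only; the name and signature
    are not taken from any existing library.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
```

Because nothing in this routine depends on a specific detector head, the same function could in principle serve any detector that emits boxes and scores, which is the case for pulling it out of model-specific code.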
Some YOLO versions share the same overall architecture and differ only in their config. Enabling v11 will enable v8 as well, for example. These differences can be handled through presets. We could turn YOLOv11 into a generic YOLO class that is configurable through presets and layers. This lets us support multiple versions, and also port and publish YOLOv{N} and subsequent versions in the future with minimal code changes (i.e. a layer or two + config).
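One possible shape for such a generic class is sketched below in plain Python, with strings standing in for real layers. Everything here is hypothetical: the `YOLOConfig` fields, the preset names, the multiplier values, and the block registry are placeholders for illustration, not actual KerasHub presets or published model configs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class YOLOConfig:
    """Hypothetical per-version settings; all values used below are placeholders."""
    block: str
    depth_multiple: float
    width_multiple: float


# Stand-ins for version-specific layers. A real implementation would
# return Keras layers; these return a description string instead.
def c2f_block(channels):
    return f"C2f({channels})"


def c3k2_block(channels):
    return f"C3k2({channels})"


# Adding a new YOLO version would mean one new block plus one new preset.
BLOCKS = {"c2f": c2f_block, "c3k2": c3k2_block}

PRESETS = {
    "yolo_v8_demo": YOLOConfig(block="c2f", depth_multiple=0.33, width_multiple=0.25),
    "yolo_v11_demo": YOLOConfig(block="c3k2", depth_multiple=0.5, width_multiple=0.25),
}


class GenericYOLO:
    """Version-agnostic skeleton: the config selects the block and scaling."""

    def __init__(self, config, base_channels=(64, 128, 256)):
        self.config = config
        make_block = BLOCKS[config.block]
        self.stages = [
            make_block(int(c * config.width_multiple)) for c in base_channels
        ]

    @classmethod
    def from_preset(cls, name):
        return cls(PRESETS[name])


v8 = GenericYOLO.from_preset("yolo_v8_demo")
v11 = GenericYOLO.from_preset("yolo_v11_demo")
```

In this sketch the two versions share all of the class code and differ only in the preset entry and the registered block, which mirrors the "a layer or two + config" cost described above.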
/cc @divyashreepathihalli @mattdangerw @fchollet for API discussions and considerations.