Eval bug: Phi-4 mini in iOS with xcframework #12232
Comments
I don't have a real [...] It looks like this error is happening when the model is being compiled by Metal. This is done in:
static struct ggml_backend_metal_context * ggml_metal_init(ggml_backend_dev_t dev) {
...
metal_library = [device newLibraryWithSource:src options:options error:&error];
if (error) {
GGML_LOG_ERROR("%s: error: %s\n", __func__, [[error description] UTF8String]);
return NULL;
}
When [...]
(lldb) p src
(__NSCFString *) 0x0000000148058000 @"#define GGML_COMMON_DECL_METAL\n#define GGML_COMMON_IMPL_METAL\n#if defined(GGML_METAL_EMBED_LIBRARY)\n#ifndef GGML_COMMON_DECL\n\n#if defined(GGML_COMMON_DECL_C)\n#include <stdint.h>\n\ntypedef uint16_t ggml_half;\ntypedef uint32_t ggml_half2;\n\n#define GGML_COMMON_AGGR_U\n#define GGML_COMMON_AGGR_S\n\n#define GGML_COMMON_DECL\n#elif defined(GGML_COMMON_DECL_CPP)\n#include <cstdint>\n\ntypedef uint16_t ggml_half;\ntypedef uint32_t ggml_half2;\n\n// std-c++ allow anonymous unions but some compiler warn on it\n#define GGML_COMMON_AGGR_U data\n// std-c++ do not allow it.\n#define GGML_COMMON_AGGR_S data\n\n#define GGML_COMMON_DECL\n#elif defined(GGML_COMMON_DECL_METAL)\n#include <metal_stdlib>\n\ntypedef half ggml_half;\ntypedef half2 ggml_half2;\n\n#define GGML_COMMON_AGGR_U\n#define GGML_COMMON_AGGR_S\n\n#define GGML_COMMON_DECL\n#elif defined(GGML_COMMON_DECL_CUDA)\n#if defined(GGML_COMMON_DECL_MUSA)\n#include <musa_fp16.h>\n#else\n#include <cuda_fp16.h>\n#endif\n#include <cstdint>\n\ntypedef half ggml_half;\ntypedef half2 ggml_half2;\n\n#define GGML_"
So the [...]
In the output above we can see the following message:
Memory pressure warning received
Perhaps the system is terminating the XPC service because of memory pressure.
Are the older models you were able to load smaller or around the same size as this model? |
I also experience this error sporadically on my iPhone 13 mini. Try to disable residency sets and see if it helps. |
Sure. Recently the problem has not happened again with the Phi-4 mini model. As for the older models, they come in a variety of sizes, such as Qwen 2.5 1.5B Q8 and Qwen 2.5 7B Q4. Hope it helps. |
I think the |
Try to disable residency set usage by applying this patch:
diff --git a/ggml/src/ggml-metal/ggml-metal.m b/ggml/src/ggml-metal/ggml-metal.m
index e51a4169a..e0595351b 100644
--- a/ggml/src/ggml-metal/ggml-metal.m
+++ b/ggml/src/ggml-metal/ggml-metal.m
@@ -24,12 +24,12 @@
#endif
// create residency sets only on macOS >= 15.0
-#if !TARGET_CPU_X86_64 && TARGET_OS_OSX && __MAC_OS_X_VERSION_MAX_ALLOWED >= 150000 || \
- TARGET_OS_IOS && __IPHONE_OS_VERSION_MAX_ALLOWED >= 180000 || \
- TARGET_OS_TV && __TV_OS_VERSION_MAX_ALLOWED >= 180000 || \
- TARGET_OS_VISION && __VISION_OS_VERSION_MAX_ALLOWED >= 200000
-#define GGML_METAL_HAS_RESIDENCY_SETS 1
-#endif
+//#if !TARGET_CPU_X86_64 && TARGET_OS_OSX && __MAC_OS_X_VERSION_MAX_ALLOWED >= 150000 || \
+// TARGET_OS_IOS && __IPHONE_OS_VERSION_MAX_ALLOWED >= 180000 || \
+// TARGET_OS_TV && __TV_OS_VERSION_MAX_ALLOWED >= 180000 || \
+// TARGET_OS_VISION && __VISION_OS_VERSION_MAX_ALLOWED >= 200000
+//#define GGML_METAL_HAS_RESIDENCY_SETS 1
+//#endif
// globals
And then you have to rebuild the XCFramework. Let us know if the issue persists in this case. |
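For anyone who would rather not comment the guard out by hand, the same effect could probably be had with a build-time opt-out. This is only a minimal sketch: the `GGML_METAL_DISABLE_RESIDENCY_SETS` macro and the `metal_residency_sets_compiled_in` helper are hypothetical names made up for illustration, not upstream flags or functions. The platform conditions are the ones from the patch above, with parentheses added.

```c
#if defined(__APPLE__)
#include <TargetConditionals.h>
#include <Availability.h>
#endif

// Hypothetical opt-out switch (an assumption of this sketch, not an upstream
// flag). Normally it would be passed on the compiler command line with -D;
// it is defined here so that the sketch takes the disabled path.
#define GGML_METAL_DISABLE_RESIDENCY_SETS 1

// Same platform conditions as the guard in the patch, explicitly
// parenthesized. Identifiers that are not defined evaluate to 0 inside #if.
#if !defined(GGML_METAL_DISABLE_RESIDENCY_SETS) && ( \
    (!TARGET_CPU_X86_64 && TARGET_OS_OSX && __MAC_OS_X_VERSION_MAX_ALLOWED >= 150000) || \
    (TARGET_OS_IOS    && __IPHONE_OS_VERSION_MAX_ALLOWED >= 180000) || \
    (TARGET_OS_TV     && __TV_OS_VERSION_MAX_ALLOWED     >= 180000) || \
    (TARGET_OS_VISION && __VISION_OS_VERSION_MAX_ALLOWED >= 200000))
#define GGML_METAL_HAS_RESIDENCY_SETS 1
#endif

// Hypothetical helper: returns 1 when residency-set support is compiled in,
// 0 otherwise.
static int metal_residency_sets_compiled_in(void) {
#if defined(GGML_METAL_HAS_RESIDENCY_SETS)
    return 1;
#else
    return 0;
#endif
}
```

With the opt-out defined, `GGML_METAL_HAS_RESIDENCY_SETS` is never set, so the helper returns 0 on every platform. Removing the `#define` of the opt-out restores the original behavior.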
Thanks, let me try with this one. |
Disabling residency sets seems to work better in my case. |
I found a slight warm-up delay in the bench when residency sets are disabled, but in most cases on iOS I don't think it affects the user experience. I believe it would be better to disable this by default for devices with limited memory. Also, the primary reason I attempted to enable it is that I also have an iOS app running on macOS (Designed for iPad target). |
Just following up on this one: should we just change from
to
|
@Animaxx I would also remove the Vision check.
Hm, I guess that's a valid use case, though that would require having the [...] Anyway, I think it is better to remove the check for now so that iPhone and iPad don't panic. |
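One possible reading of the suggestion above (drop the iOS, tvOS and visionOS clauses and keep residency sets on macOS only) is sketched below. The exact change discussed in the thread is not shown here, so treat this as an assumption. `RESIDENCY_SETS_MACOS_ONLY` and `residency_sets_compiled_in` are hypothetical names; the condition is written as a function-like macro only so that it can be checked with explicit values.

```c
#if defined(__APPLE__)
#include <TargetConditionals.h>
#include <Availability.h>
#endif

// Hypothetical helper macro (not upstream): the macOS clause of the original
// guard on its own. There is no iOS, tvOS or visionOS input at all.
#define RESIDENCY_SETS_MACOS_ONLY(cpu_x86_64, os_osx, mac_sdk_max) \
    (!(cpu_x86_64) && (os_osx) && (mac_sdk_max) >= 150000)

// Identifiers that are not defined evaluate to 0 inside #if, so on an iPhone
// or iPad build (TARGET_OS_OSX == 0) the macro below is never defined.
#if RESIDENCY_SETS_MACOS_ONLY(TARGET_CPU_X86_64, TARGET_OS_OSX, __MAC_OS_X_VERSION_MAX_ALLOWED)
#define GGML_METAL_HAS_RESIDENCY_SETS 1
#endif

// Hypothetical helper: returns 1 when residency-set support is compiled in,
// 0 otherwise.
static int residency_sets_compiled_in(void) {
#if defined(GGML_METAL_HAS_RESIDENCY_SETS)
    return 1;
#else
    return 0;
#endif
}
```

Under this guard an iOS build never compiles residency sets in, whatever the SDK version. It says nothing about the "Designed for iPad" on macOS case mentioned above, which would need a separate decision.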
This issue was closed because it has been inactive for 14 days since being marked as stale. |
Name and Version
Running on a real iOS device with the Phi-4 model fails with "llama_init_from_model: failed to initialize Metal backend"
Operating systems
Mac
GGML backends
Metal
Hardware
iPhone 16 Pro
Models
Phi-4 mini Q4_K_L
https://huggingface.co/bartowski/microsoft_Phi-4-mini-instruct-GGUF/blob/main/microsoft_Phi-4-mini-instruct-Q4_K_L.gguf
Problem description & steps to reproduce
Unable to load Phi-4 mini, although other (older) models load fine.
First Bad Commit
No response
Relevant log output