-
Notifications
You must be signed in to change notification settings - Fork 9.4k
Price index performance on very large stores #37700
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Hi @nicka101. Thank you for your report.
Join Magento Community Engineering Slack and ask your questions in #github channel. |
…on catalog_product_index_price_temp), addressing magento#37700
Hi @engcom-Dash. Thank you for working on this issue.
|
Hi @engcom-Hotel. Thank you for working on this issue.
|
Hello @nicka101, Thanks for the report and contribution! We have tried to reproduce the issue in 2.4-develop as well as 2.4.4 just prior to 2.4.5 (Before that commit), but it seems the issue is not reproducible for us. We have tried to reproduce the issue with almost Have you checked the same with some profiling tools? Thanks |
Faced with the same issue on a catalog of 850k products, the proposed solution really speeds up the indexing process. |
Hello @DmitryFurs, Thanks for the response! Can you please let us know the steps to reproduce the issue? Actually, in the first attempt, we were unable to reproduce the issue. You can got through our comment here #37700 (comment). Thanks |
We faced the same issue using Magento 2.4.6 + MSI. We have ~2600 products (simple, configurable, bundle) and 9 sources/stocks. In our particular case, the price index itself takes less than 1 minute but the inventory index takes more than 1 hour. With the proposed solution, the inventory index takes less than 2 minutes now. |
Hi @engcom-Hotel, We reproduced it as follows:
As @vladimirspucko described above, the inventory indexing itself was performed quickly, about a minute for 850k products in our case. But then a plugin came into operation, which starts price indexing in which calls \Magento\Catalog\Model\Indexer\Product\Price\Processor::reindexList() And if the mode is not Note: if we use |
Hello @DmitryFurs, Thanks for the quick response! The issue is reproducible for us in the 2.4-develop branch. We have tried to reproduce the issue with almost 40K products. Hence confirming the issue. Thanks |
✅ Jira issue https://jira.corp.adobe.com/browse/AC-9470 is successfully created for this GitHub issue. |
✅ Confirmed by @engcom-Hotel. Thank you for verifying the issue. |
This is really a big issue. My partial price index takes several hours. I solved it by just adding an index to the "catalog_product_index_price_tmp" for "entity_id", "customer_group_id", "website_id" |
Any updates on this issue? In our case, reindexing of 100k products takes hours... |
In Magento Interestingly, there's a quality patch I'm surprised that maintainers didn't mention this patch here as a possible solution. I've adopted the |
Hello @speller, The patch will be incorporated into the forthcoming release, namely 2.4.8-beta1. But we will update you if this can be incorporated sooner. Thanks |
@engcom-Hotel could you also make the patch available for 2.4.7? |
@speller I will update you on this soon. |
This was actually introduced in 2.4.6: https://github.com/magento/magento2/blob/2.4.6/app/code/Magento/Catalog/Model/Indexer/Product/Price/AbstractAction.php We've experienced the indexing to not complete even though it was left running overnight from the end of the working day to the start of the new one due to a very large amount of prices; there's multiple stores, multiple customer groups and a lot of products. I can see that the solution proposed at #37701 has been rejected because indexes slow down table writes. However, the table writes are for a temporary table that is only involved in the indexing process and for the most part the tradeoff is better overall. I also tried changing the sql query itself to improve performance. This worked well, but I've found the best results, for our case anyway, was to combine both the query change and add indexes to the temporary table
<constraint xsi:type="unique" referenceId="CAT_PRD_IDX_PRICE_TMP_ENTT_ID_CSTR_GROUP_ID_WS_ID">
<column name="entity_id"/>
<column name="customer_group_id"/>
<column name="website_id"/>
</constraint> Note I've not fully tested data integrity yet with this. Probably be on Monday 17th I get back to look at that. I'll post a patch if all looks good then. EDIT: See below |
On further testing we've found that the change on the sql query only makes a slight difference so in terms in risk and reward, it's probably better just leaving the query alone. If anyone needs the patch (based on #37701) to change the table indexes then here's a patch for 2.4.7:
diff --git a/vendor/magento/module-catalog/etc/db_schema_whitelist.json b/vendor/magento/module-catalog/etc/db_schema_whitelist.json
--- a/vendor/magento/module-catalog/etc/db_schema_whitelist.json 2024-06-17 11:12:06.595496943 +0100
+++ b/vendor/magento/module-catalog/etc/db_schema_whitelist.json 2024-06-14 08:27:31.648158048 +0100
@@ -987,7 +987,8 @@
"final_price": true,
"min_price": true,
"max_price": true,
- "tier_price": true
+ "tier_price": true,
+ "id": true
},
"index": {
"CATALOG_PRODUCT_INDEX_PRICE_TMP_CUSTOMER_GROUP_ID": true,
@@ -995,7 +996,8 @@
"CATALOG_PRODUCT_INDEX_PRICE_TMP_MIN_PRICE": true
},
"constraint": {
- "PRIMARY": true
+ "PRIMARY": true,
+ "CAT_PRD_IDX_PRICE_TMP_ENTT_ID_CSTR_GROUP_ID_WS_ID": true
}
},
"catalog_category_product_index_tmp": {
diff --git a/vendor/magento/module-catalog/etc/db_schema.xml b/vendor/magento/module-catalog/etc/db_schema.xml
--- a/vendor/magento/module-catalog/etc/db_schema.xml 2024-06-17 11:11:56.495474101 +0100
+++ b/vendor/magento/module-catalog/etc/db_schema.xml 2024-06-14 08:28:02.284226062 +0100
@@ -1645,6 +1645,11 @@
<constraint xsi:type="primary" referenceId="PRIMARY">
<column name="id"/>
</constraint>
+ <constraint xsi:type="unique" referenceId="CAT_PRD_IDX_PRICE_TMP_ENTT_ID_CSTR_GROUP_ID_WS_ID">
+ <column name="entity_id"/>
+ <column name="customer_group_id"/>
+ <column name="website_id"/>
+ </constraint>
</table>
<table name="catalog_category_product_index_tmp" resource="default" engine="innodb"
comment="Catalog Category Product Indexer temporary table"> Apply patch with |
would it be possible to apply the patch to magento 2.3.7 ? nicka101@90e4ecc |
The patch in the Quality Patch Tool is not available yet for 2.4.7, and the change is not implemented in 2.4.7-p4 (only in Please can you check if the patch will be released officially for 2.4.7? |
Hello @mzeis, Thanks for raising the concern! We are in discussion for this concern with the internal team. I will update you soon on this. Meanwhile moving this issue on hold. Thanks |
Preconditions and environment
Steps to reproduce
Expected result
Price reindex is workable, and uses indexes for its join in deleteOutdatedData, because they exist
Actual result
It's 800-1200x slower than it should be, making a reindex take multiple days (based on the runtime of the individual queries, but can't tell for sure, as i'm not waiting that long), when on previous versions it took 4 hours
Additional information
When catalog_product_index_price_temp is created, it's based on catalog_product_index_price_tmp , which since 263c17f has had different primary key and indexes to the main table, meaning this JOIN has no indexes to use for the select, and is forced to do a full tablescan
We reproduced it as follows:
All indexes have been switched to
realtime
mode -bin/magento indexer:set-mode realtime
Then we started indexing only for MSI -
bin/magento indexer:reindex inventory
As @vladimirspucko described above, the inventory indexing itself was performed quickly, about a minute for 850k products in our case. But then a plugin came into operation, which starts price indexing in
realtime
mode:\Magento\InventoryCatalog\Plugin\InventoryIndexer\Indexer\Stock\Strategy\Sync\PriceIndexUpdatePlugin::afterExecuteList()
which calls \Magento\Catalog\Model\Indexer\Product\Price\Processor::reindexList()
And if the mode is not
schedule
this indexing has never been performed on our installation with 850k products.Note: if we use
schedule
mode, then full inventory indexing takes about 50 seconds for all the same 850k productsRelease note
No response
Triage and priority
The text was updated successfully, but these errors were encountered: