Skip to content

Price index performance on very large stores #37700

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
1 of 5 tasks
nicka101 opened this issue Jul 3, 2023 · 23 comments
Open
1 of 5 tasks

Price index performance on very large stores #37700

nicka101 opened this issue Jul 3, 2023 · 23 comments
Labels
Area: Framework Component: Catalog Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed Issue: On Hold Priority: P2 A defect with this priority could have functionality issues which are not to expectations. Progress: ready for dev Reported on 2.4.5 Indicates original Magento version for the Issue report. Reproduced on 2.4.x The issue has been reproduced on latest 2.4-develop branch Triage: Dev.Experience Issue related to Developer Experience and needs help with Triage to Confirm or Reject it Triage: Need PO Confirmation Requirements should be clarified/approved/confirmed with Product Manager. Not ready for fix/delivery

Comments

@nicka101
Copy link
Contributor

nicka101 commented Jul 3, 2023

Preconditions and environment

  • Magento version 2.4.5+ (or any release after 263c17f)
  • Have a lot of products (the more, with more prices, the easier this issue is to reproduce for demo)

Steps to reproduce

  1. Reindex catalog_product_index_price
  2. Observe that since 263c17f the performance of deleteOutdatedData is much worse (around 800-1200x slower than it should be in our very large instances with ~30 million prices)

Expected result

Price reindex is workable, and uses indexes for its join in deleteOutdatedData, because they exist

Actual result

It's 800-1200x slower than it should be, making a reindex take multiple days (based on the runtime of the individual queries, but can't tell for sure, as i'm not waiting that long), when on previous versions it took 4 hours

Additional information

When catalog_product_index_price_temp is created, it's based on catalog_product_index_price_tmp , which since 263c17f has had different primary key and indexes to the main table, meaning this JOIN has no indexes to use for the select, and is forced to do a full tablescan

We reproduced it as follows:

  1. All indexes have been switched to realtime mode - bin/magento indexer:set-mode realtime

  2. Then we started indexing only for MSI - bin/magento indexer:reindex inventory

As @vladimirspucko described above, the inventory indexing itself was performed quickly, about a minute for 850k products in our case. But then a plugin came into operation, which starts price indexing in realtime mode:
\Magento\InventoryCatalog\Plugin\InventoryIndexer\Indexer\Stock\Strategy\Sync\PriceIndexUpdatePlugin::afterExecuteList()

which calls \Magento\Catalog\Model\Indexer\Product\Price\Processor::reindexList()

And if the mode is not schedule this indexing has never been performed on our installation with 850k products.

Note: if we use schedule mode, then full inventory indexing takes about 50 seconds for all the same 850k products

Release note

No response

Triage and priority

  • Severity: S0 - Affects critical data or functionality and leaves users without workaround.
  • Severity: S1 - Affects critical data or functionality and forces users to employ a workaround.
  • Severity: S2 - Affects non-critical data or functionality and forces users to employ a workaround.
  • Severity: S3 - Affects non-critical data or functionality and does not force users to employ a workaround.
  • Severity: S4 - Affects aesthetics, professional look and feel, “quality” or “usability”.
@m2-assistant
Copy link

m2-assistant bot commented Jul 3, 2023

Hi @nicka101. Thank you for your report.
To speed up processing of this issue, make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, Add a comment to the issue:


Join Magento Community Engineering Slack and ask your questions in #github channel.
⚠️ According to the Magento Contribution requirements, all issues must go through the Community Contributions Triage process. Community Contributions Triage is a public meeting.
🕙 You can find the schedule on the Magento Community Calendar page.
📞 The triage of issues happens in the queue order. If you want to speed up the delivery of your contribution, join the Community Contributions Triage session to discuss the appropriate ticket.

@engcom-Bravo engcom-Bravo added the Reported on 2.4.5 Indicates original Magento version for the Issue report. label Jul 3, 2023
@engcom-Hotel engcom-Hotel added the Priority: P2 A defect with this priority could have functionality issues which are not to expectations. label Jul 4, 2023
nicka101 added a commit to nicka101/magento2 that referenced this issue Jul 6, 2023
@engcom-Dash engcom-Dash self-assigned this Aug 10, 2023
@m2-assistant
Copy link

m2-assistant bot commented Aug 10, 2023

Hi @engcom-Dash. Thank you for working on this issue.
In order to make sure that issue has enough information and ready for development, please read and check the following instruction: 👇

    1. Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).
    1. Verify that issue has a meaningful description and provides enough information to reproduce the issue.
    1. Add Area: XXXXX label to the ticket, indicating the functional areas it may be related to.
    1. Verify that the issue is reproducible on 2.4-develop branch
      Details- Add the comment @magento give me 2.4-develop instance to deploy test instance on Magento infrastructure.
      - If the issue is reproducible on 2.4-develop branch, please, add the label Reproduced on 2.4.x.
      - If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and stop verification process here!

@engcom-Dash engcom-Dash removed their assignment Aug 14, 2023
@engcom-Dash engcom-Dash added the Triage: Dev.Experience Issue related to Developer Experience and needs help with Triage to Confirm or Reject it label Aug 14, 2023
@engcom-Hotel engcom-Hotel self-assigned this Aug 16, 2023
@m2-assistant
Copy link

m2-assistant bot commented Aug 16, 2023

Hi @engcom-Hotel. Thank you for working on this issue.
In order to make sure that issue has enough information and ready for development, please read and check the following instruction: 👇

  • 1. Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).
  • 2. Verify that issue has a meaningful description and provides enough information to reproduce the issue.
  • 3. Add Area: XXXXX label to the ticket, indicating the functional areas it may be related to.
  • 4. Verify that the issue is reproducible on 2.4-develop branch
    Details- Add the comment @magento give me 2.4-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.4-develop branch, please, add the label Reproduced on 2.4.x.
    - If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and stop verification process here!
  • 5. Add label Issue: Confirmed once verification is complete.
  • 6. Make sure that automatic system confirms that report has been added to the backlog.

@engcom-Hotel
Copy link
Contributor

Hello @nicka101,

Thanks for the report and contribution!

We have tried to reproduce the issue in 2.4-develop as well as 2.4.4 just prior to 2.4.5 (Before that commit), but it seems the issue is not reproducible for us.

We have tried to reproduce the issue with almost 39360 products, by changing the price of the products but for us the reindexing took a similar time.

Have you checked the same with some profiling tools?

Thanks

@engcom-Hotel engcom-Hotel added the Issue: needs update Additional information is require, waiting for response label Aug 16, 2023
@DmitryFurs
Copy link
Contributor

Faced with the same issue on a catalog of 850k products, the proposed solution really speeds up the indexing process.

@engcom-Hotel
Copy link
Contributor

Hello @DmitryFurs,

Thanks for the response!

Can you please let us know the steps to reproduce the issue? Actually, in the first attempt, we were unable to reproduce the issue. You can got through our comment here #37700 (comment).

Thanks

@vladimirspucko
Copy link

We faced the same issue using Magento 2.4.6 + MSI. We have ~2600 products (simple, configurable, bundle) and 9 sources/stocks. In our particular case, the price index itself takes less than 1 minute but the inventory index takes more than 1 hour. With the proposed solution, the inventory index takes less than 2 minutes now.

@DmitryFurs
Copy link
Contributor

DmitryFurs commented Sep 5, 2023

Hi @engcom-Hotel,

We reproduced it as follows:

  1. All indexes have been switched to realtime mode - bin/magento indexer:set-mode realtime

  2. Then we started indexing only for MSI - bin/magento indexer:reindex inventory

As @vladimirspucko described above, the inventory indexing itself was performed quickly, about a minute for 850k products in our case. But then a plugin came into operation, which starts price indexing in realtime mode:
\Magento\InventoryCatalog\Plugin\InventoryIndexer\Indexer\Stock\Strategy\Sync\PriceIndexUpdatePlugin::afterExecuteList()

which calls \Magento\Catalog\Model\Indexer\Product\Price\Processor::reindexList()

And if the mode is not schedule this indexing has never been performed on our installation with 850k products.

Note: if we use schedule mode, then full inventory indexing takes about 50 seconds for all the same 850k products

@engcom-Hotel
Copy link
Contributor

Hello @DmitryFurs,

Thanks for the quick response!

The issue is reproducible for us in the 2.4-develop branch. We have tried to reproduce the issue with almost 40K products.

Hence confirming the issue.

Thanks

@engcom-Hotel engcom-Hotel added Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed Reproduced on 2.4.x The issue has been reproduced on latest 2.4-develop branch Area: Framework and removed Issue: needs update Additional information is require, waiting for response labels Sep 6, 2023
@github-jira-sync-bot
Copy link

✅ Jira issue https://jira.corp.adobe.com/browse/AC-9470 is successfully created for this GitHub issue.

@m2-assistant
Copy link

m2-assistant bot commented Sep 6, 2023

✅ Confirmed by @engcom-Hotel. Thank you for verifying the issue.
Issue Available: @engcom-Hotel, You will be automatically unassigned. Contributors/Maintainers can claim this issue to continue. To reclaim and continue work, reassign the ticket to yourself.

@mvenghaus
Copy link

This is really a big issue. My partial price index takes several hours. I solved it by just adding an index to the "catalog_product_index_price_tmp" for "entity_id", "customer_group_id", "website_id"

@speller
Copy link

speller commented Mar 3, 2024

Any updates on this issue? In our case, reindexing of 100k products takes hours...
On custom product bulk import, we try to reindex only changed products using the reindexList method. It takes hours. If we do the full reindex with the reindexAll method, it takes a couple of minutes.

@speller
Copy link

speller commented Apr 25, 2024

In Magento 2.4.7, the issue is still there.

Interestingly, there's a quality patch ACSD-56415 Fixes the issue where the performance of the partial price indexing is slowed down due to a DELETE query when the database has a lot of partial price data to index for 2.4.6-p3 which fixes the issue but this patch is not available for 2.4.7. The ACSD-56415 patch does exactly the same as what @mvenghaus mentioned.

I'm surprised that maintainers didn't mention this patch here as a possible solution.

I've adopted the ACSD-56415 patch to 2.4.7 by converting it to a composer patch and can confirm that it fixes the issue on my side. For example, without the patch, on my dev machine, 250 products were reindexed in 10 mins so I didn't even wait for the full reindex. It's clear that it'll take hours for 50K products. The full reindex of the 2M products database took 80 mins. With the patch applied, 50K products were reindexed in 1 minute.

@engcom-Hotel
Copy link
Contributor

Hello @speller,

The patch will be incorporated into the forthcoming release, namely 2.4.8-beta1. But we will update you if this can be incorporated sooner.

Thanks

@speller
Copy link

speller commented May 7, 2024

@engcom-Hotel could you also make the patch available for 2.4.7?

@engcom-Hotel
Copy link
Contributor

@speller I will update you on this soon.

@andy-igoo
Copy link

andy-igoo commented Jun 13, 2024

This was actually introduced in 2.4.6: https://github.com/magento/magento2/blob/2.4.6/app/code/Magento/Catalog/Model/Indexer/Product/Price/AbstractAction.php

We've experienced the indexing to not complete even though it was left running overnight from the end of the working day to the start of the new one due to a very large amount of prices; there's multiple stores, multiple customer groups and a lot of products.

I can see that the solution proposed at #37701 has been rejected because indexes slow down table writes. However, the table writes are for a temporary table that is only involved in the indexing process and for the most part the tradeoff is better overall.

I also tried changing the sql query itself to improve performance. This worked well, but I've found the best results, for our case anyway, was to combine both the query change and add indexes to the temporary table catalog_product_index_price_tmp as in the pull request #37701 (also linked above).

magento/vendor/magento/module-catalog/etc/db_schema.xml

        <constraint xsi:type="unique" referenceId="CAT_PRD_IDX_PRICE_TMP_ENTT_ID_CSTR_GROUP_ID_WS_ID">
            <column name="entity_id"/>
            <column name="customer_group_id"/>
            <column name="website_id"/>
        </constraint>

Note I've not fully tested data integrity yet with this. Probably be on Monday 17th I get back to look at that. I'll post a patch if all looks good then.

EDIT: See below

@andy-igoo
Copy link

andy-igoo commented Jun 17, 2024

This was actually introduced in 2.4.6: https://github.com/magento/magento2/blob/2.4.6/app/code/Magento/Catalog/Model/Indexer/Product/Price/AbstractAction.php

We've experienced the indexing to not complete even though it was left running overnight from the end of the working day to the start of the new one due to a very large amount of prices; there's multiple stores, multiple customer ....

On further testing we've found that the change on the sql query only makes a slight difference so in terms in risk and reward, it's probably better just leaving the query alone.

If anyone needs the patch (based on #37701) to change the table indexes then here's a patch for 2.4.7:

37700_inventoryIndexer.patch

diff --git a/vendor/magento/module-catalog/etc/db_schema_whitelist.json b/vendor/magento/module-catalog/etc/db_schema_whitelist.json
--- a/vendor/magento/module-catalog/etc/db_schema_whitelist.json	2024-06-17 11:12:06.595496943 +0100
+++ b/vendor/magento/module-catalog/etc/db_schema_whitelist.json	2024-06-14 08:27:31.648158048 +0100
@@ -987,7 +987,8 @@
             "final_price": true,
             "min_price": true,
             "max_price": true,
-            "tier_price": true
+            "tier_price": true,
+            "id": true
         },
         "index": {
             "CATALOG_PRODUCT_INDEX_PRICE_TMP_CUSTOMER_GROUP_ID": true,
@@ -995,7 +996,8 @@
             "CATALOG_PRODUCT_INDEX_PRICE_TMP_MIN_PRICE": true
         },
         "constraint": {
-            "PRIMARY": true
+            "PRIMARY": true,
+            "CAT_PRD_IDX_PRICE_TMP_ENTT_ID_CSTR_GROUP_ID_WS_ID": true
         }
     },
     "catalog_category_product_index_tmp": {

diff --git a/vendor/magento/module-catalog/etc/db_schema.xml b/vendor/magento/module-catalog/etc/db_schema.xml
--- a/vendor/magento/module-catalog/etc/db_schema.xml	2024-06-17 11:11:56.495474101 +0100
+++ b/vendor/magento/module-catalog/etc/db_schema.xml	2024-06-14 08:28:02.284226062 +0100
@@ -1645,6 +1645,11 @@
         <constraint xsi:type="primary" referenceId="PRIMARY">
             <column name="id"/>
         </constraint>
+        <constraint xsi:type="unique" referenceId="CAT_PRD_IDX_PRICE_TMP_ENTT_ID_CSTR_GROUP_ID_WS_ID">
+            <column name="entity_id"/>
+            <column name="customer_group_id"/>
+            <column name="website_id"/>
+        </constraint>
     </table>
     <table name="catalog_category_product_index_tmp" resource="default" engine="innodb"
            comment="Catalog Category Product Indexer temporary table">

Apply patch with patch -p1 --forward < 37700_inventoryIndexer.patch

@engcom-Hotel engcom-Hotel moved this to Ready for Development in High Priority Backlog Aug 19, 2024
@vandark25
Copy link

would it be possible to apply the patch to magento 2.3.7 ? nicka101@90e4ecc

@mzeis
Copy link
Contributor

mzeis commented Mar 11, 2025

@engcom-Hotel

The patch in the Quality Patch Tool is not available yet for 2.4.7, and the change is not implemented in 2.4.7-p4 (only in 2.4.-develop with 2.4.8-beta1 as the first tag containing the commit, see details).

Please can you check if the patch will be released officially for 2.4.7?

@engcom-Hotel
Copy link
Contributor

Hello @mzeis,

Thanks for raising the concern!

We are in discussion for this concern with the internal team. I will update you soon on this. Meanwhile moving this issue on hold.

Thanks

@engcom-Hotel engcom-Hotel added Triage: Need PO Confirmation Requirements should be clarified/approved/confirmed with Product Manager. Not ready for fix/delivery Issue: On Hold labels Mar 13, 2025
@github-project-automation github-project-automation bot moved this to Ready for Confirmation in Issue Confirmation and Triage Board Mar 13, 2025
@engcom-Hotel engcom-Hotel moved this from Ready for Confirmation to On Hold in Issue Confirmation and Triage Board Mar 13, 2025
@engcom-Hotel
Copy link
Contributor

Hello @mzeis,

We have an update from the internal team, they have created this JIRA for the patch release for 2.4.7 release line.

I will update you once same has been done. Till the time keeping this issue On Hold status.

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Framework Component: Catalog Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed Issue: On Hold Priority: P2 A defect with this priority could have functionality issues which are not to expectations. Progress: ready for dev Reported on 2.4.5 Indicates original Magento version for the Issue report. Reproduced on 2.4.x The issue has been reproduced on latest 2.4-develop branch Triage: Dev.Experience Issue related to Developer Experience and needs help with Triage to Confirm or Reject it Triage: Need PO Confirmation Requirements should be clarified/approved/confirmed with Product Manager. Not ready for fix/delivery
Projects
Development

Successfully merging a pull request may close this issue.