Description
Is your feature request related to a problem? Please describe.
When working on time series forecasting and regression in Python, I often need a robust, interpretable non-linear regression technique. Merlion offers many powerful models, but none of them is a direct equivalent of Multivariate Adaptive Regression Splines (MARS). I therefore fall back on an external library, pyearth, and then need extra steps to bring its outputs into a Merlion-centric workflow. In addition, pyearth, while excellent, appears to be less actively maintained, which raises concerns about long-term compatibility with new Python versions and dependencies.
Describe the solution you'd like
I would like Multivariate Adaptive Regression Splines (MARS) added as a new model in the Salesforce Merlion library. The implementation should aim for functionality and interpretability similar to pyearth's MARS implementation. Key features would include:
- Non-linear modeling: Ability to capture complex, non-linear relationships in data.
- Automatic feature interaction detection: Identify and model interactions between features without manual specification.
- Interpretability: Provide clear insights into feature importance and the functional form of the relationships.
- Integration with Merlion's existing framework: Work with Merlion's data loading, forecasting, and evaluation utilities, so that the model can be used in existing pipelines and benefit from Merlion's infrastructure for hyperparameter tuning and model selection.
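To make the requested behaviour concrete, here is a minimal sketch of the hinge basis functions that MARS builds on, using only NumPy. It is not pyearth's or Merlion's API: the helper names `hinge_pair` and `fit_one_knot` are hypothetical, the data is synthetic, and the search places a single knot on one feature rather than running the full forward and pruning passes.

```python
import numpy as np

def hinge_pair(x, knot):
    """Return the mirrored hinge features max(0, x - knot) and max(0, knot - x)."""
    return np.maximum(0.0, x - knot), np.maximum(0.0, knot - x)

def fit_one_knot(x, y):
    """Greedy search for one knot: try each candidate, keep the lowest squared error.

    Hypothetical helper for illustration only. A full MARS fit repeats this step
    over all features, multiplies hinges together to model interactions, and
    then prunes terms that do not improve a penalized fit criterion.
    """
    best = None
    for knot in np.unique(x)[1:-1]:  # skip the extremes so both hinges are active
        pos, neg = hinge_pair(x, knot)
        design = np.column_stack([np.ones_like(x), pos, neg])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = float(np.sum((design @ coef - y) ** 2))
        if best is None or sse < best[0]:
            best = (sse, knot, coef)
    return best

# Synthetic piecewise-linear data with a kink at x = 2 (illustrative only).
x = np.arange(41) / 10.0
y = np.where(x < 2.0, 1.0, 1.0 + 3.0 * (x - 2.0))

sse, knot, coef = fit_one_knot(x, y)
print(f"knot={knot:.2f}, intercept={coef[0]:.2f}, slopes={coef[1]:.2f}, {coef[2]:.2f}")
```

The fitted model reads directly as "flat below the knot, linear above it", which is the kind of interpretability this request is after.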
Describe alternatives you've considered
I have primarily relied on the pyearth library for MARS regression. It works well, but its lack of active maintenance is a long-term concern for compatibility with future Python versions and the evolving scientific computing ecosystem. Using pyearth also requires an extra step to bring its results into a Merlion workflow, which adds complexity. Other alternatives within Merlion, such as generalized additive models (GAMs) or more complex deep learning models, are powerful but may not offer the same balance of interpretability and non-linear fitting as MARS for certain problem types.
Additional context
MARS regression is useful in many domains, including time series analysis and general regression tasks, because it handles non-linearities and interactions while remaining interpretable. Including it would make Merlion a more comprehensive library for users who want robust, interpretable regression models, especially those looking for an alternative to less actively maintained external libraries. The pyearth library could serve as a strong reference for the desired functionality and API design.