[Deployment] 4.5 Should feature selection be part of the Machine Learning automated pipeline?
Third Party Machine Learning Pipeline: Leveraging the power of Scikit-Learn
Feature selection in CI/CD: advantages and disadvantages
Advantages
- Reduced overhead in the implementation of the new model
- The new model is almost immediately available to the business systems
Disadvantages
- Lack of data versatility
- No additional data can be fed through the pipeline, because the entire process is based on the first dataset on which it was built
Feature selection in CI/CD: when to use it
Suitable:
- The model is built and refreshed on the same data
- The model is built and refreshed on smaller datasets
Not Suitable:
- The model is built using datasets with a large feature space
- The model is constantly enriched with new data sources
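
To make the trade-off concrete, here is a minimal sketch of feature selection as one step of a Scikit-Learn pipeline. It is an illustration under stated assumptions, not the course's actual pipeline: the synthetic data, the Lasso-based selector, the `alpha` value, and the step names are all chosen for the example. It shows the advantage (one `fit` call re-runs selection and training together) and the disadvantage (the fitted pipeline rejects data with an extra column).

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for the real training set (assumption).
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=0.1, random_state=0
)

# Feature selection is a pipeline step, so every refresh of the model
# re-runs selection and training in a single fit() call.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    # Lasso-based selection is an assumption; any selector could sit here.
    ("selector", SelectFromModel(Lasso(alpha=1.0, random_state=0))),
    ("model", LinearRegression()),
])
pipe.fit(X, y)

# Boolean mask over the 20 input columns: True where a feature was kept.
selected_mask = pipe.named_steps["selector"].get_support()
n_selected = int(selected_mask.sum())
predictions = pipe.predict(X)

# The fitted pipeline is tied to the columns it was built on: adding a
# column from a new data source makes predict() fail until it is refit.
X_extra = np.hstack([X, np.ones((X.shape[0], 1))])
try:
    pipe.predict(X_extra)
    accepts_new_column = True
except ValueError:
    accepts_new_column = False
```

Because the selector is refit on every `fit`, the set of selected features can change between refreshes. That is acceptable when the model is rebuilt on the same data, and harder to control when the feature space is large or new sources keep arriving.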
