[Deployment] 4.5 Should feature selection be part of the Machine Learning automated pipeline?


Third Party Machine Learning Pipeline: Leveraging the power of Scikit-Learn


4.5 Should feature selection be part of the Machine Learning automated pipeline?

Feature selection in CI/CD 장단점

Advantages

  • Reudced overhead in the implementation of the new model
  • THe new model is almost imediately available to the business systems

Disadvantages

  • Lack of data versatility
  • No additional data can be fed through the pipeline, as the entire processes are based on the first dataset on which it was built

Feature selection in CI/CD

Suitable:

  • Model build and refresh on same data
  • Model build and refresh on smaller datasets

Not Suitable:

  • If model is built using datasets with a high feature space
  • If model is constantly enriched with new data sources





© 2018. by statssy

Powered by statssy