[Deployment] 4.3 Custom Machine Learning Pipeline
4.3 Custom Machine Learning Pipeline
: The drawbacks of procedural programming are hard-coded parameters and the need to save multiple objects or data structures.
Object-Oriented Programming (OOP) addresses both, so it should be used instead.
- Data -> attributes
- Instructions or procedures -> methods
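To make the hard-coding drawback concrete, here is a minimal procedural sketch. The constant `AGE_MEAN`, the function `impute_age`, and the sample values are hypothetical, chosen only for illustration. Because the mean is written into the code by hand, it goes stale as soon as the training data changes.

```python
# Procedural style: the learned value lives in a hard-coded constant.
# AGE_MEAN and impute_age are hypothetical names used for illustration.
AGE_MEAN = 30.0  # computed once by hand from an older training set


def impute_age(ages):
    """Replace missing ages (None) with the hard-coded mean."""
    return [AGE_MEAN if a is None else a for a in ages]


# The mean of the known values in the new data is 40.0, not 30.0.
new_training_ages = [20.0, 40.0, 60.0, None]
imputed = impute_age(new_training_ages)
print(imputed)  # the gap is filled with the stale hard-coded value
```

Nothing in this code updates `AGE_MEAN` when the data changes; someone has to recompute and edit it manually.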
Custom ML Pipeline: OOP
: In OOP, the “objects” can learn and store these parameters.
- Parameters are refreshed automatically every time the model is re-trained
- No manual hard-coding is needed
- Methods:
- Fit: learns the parameters and saves them in an object attribute
- Transform: transforms data with the learnt parameters (it helps to think of fit as used purely for learning and then set aside)
- Attributes:
- Store the learnt parameters
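The fit/transform split can be sketched as a small class. This is an illustrative example, not the course's implementation: the class name `MeanImputer` and the attribute `mean_` are assumptions, and missing values are represented as `None`.

```python
from statistics import mean


class MeanImputer:
    """Hypothetical transformer: fills missing values (None) with a learned mean."""

    def __init__(self):
        self.mean_ = None  # attribute that stores the learned parameter

    def fit(self, values):
        # Learn the parameter from the non-missing training values and store it.
        self.mean_ = mean(v for v in values if v is not None)
        return self

    def transform(self, values):
        # Apply the stored parameter; nothing is re-learned here.
        if self.mean_ is None:
            raise RuntimeError("call fit before transform")
        return [self.mean_ if v is None else v for v in values]


imputer = MeanImputer().fit([20.0, 40.0, 60.0, None])
filled = imputer.transform([None, 25.0])
print(imputer.mean_, filled)

# Re-training refreshes the parameter automatically, with no manual edit.
imputer.fit([10.0, 30.0])
print(imputer.mean_)
```

Calling `fit` again on new data is all it takes to update the stored parameter, which is the behaviour described above.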
Custom ML Pipeline : Pipeline
: A pipeline is a set of data processing steps connected in series, where typically the output of one step is the input of the next.
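A minimal sketch of this chaining, assuming each step exposes `fit` and `transform`. The names `FillMissing`, `MinMaxScale`, and `SimplePipeline` are hypothetical and written from scratch here; they are not taken from any library.

```python
class FillMissing:
    """Hypothetical step: replaces None with the mean learned in fit."""

    def fit(self, values):
        known = [v for v in values if v is not None]
        self.mean_ = sum(known) / len(known)
        return self

    def transform(self, values):
        return [self.mean_ if v is None else v for v in values]


class MinMaxScale:
    """Hypothetical step: rescales values using the min and max learned in fit."""

    def fit(self, values):
        self.min_, self.max_ = min(values), max(values)
        return self

    def transform(self, values):
        span = (self.max_ - self.min_) or 1.0  # avoid division by zero
        return [(v - self.min_) / span for v in values]


class SimplePipeline:
    """Runs the steps in series: each step's output is the next step's input."""

    def __init__(self, steps):
        self.steps = steps

    def fit(self, values):
        # Each step learns from the data as transformed by the previous steps.
        for step in self.steps:
            values = step.fit(values).transform(values)
        return self

    def transform(self, values):
        for step in self.steps:
            values = step.transform(values)
        return values


pipe = SimplePipeline([FillMissing(), MinMaxScale()]).fit([0.0, None, 10.0])
result = pipe.transform([None, 10.0])
print(result)
```

During `fit`, the scaler sees data that has already been imputed, so the order of the steps matters.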
Custom ML Pipeline : Overview
- Advantages
- Can be tested, versioned, tracked, and controlled
- Can build future models on top
- Good software developer practice
- Built to satisfy business needs
- Disadvantages
- Requires a team of software developers to build and maintain
- Overhead for data scientists (DS) to familiarise themselves with the code for debugging or adding future models
- Preprocessor is not reusable; the Preprocessor class must be re-written for each new ML model (because everything is bundled into one block it cannot be reused, so it has to be split up into separate classes)
- Need to write a new pipeline for each new ML model
- Lacks versatility; may constrain DS to what is available in the implemented pipeline
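One possible way to reduce the reusability problem is to split the single Preprocessor into small classes that are configured per model. The sketch below is an assumption about how that could look, not the original design; the class `CategoricalImputer`, its parameters, and the column names are all hypothetical. Rows are plain dictionaries to keep the example dependency-free.

```python
class CategoricalImputer:
    """Hypothetical reusable step: fills None in the given columns with a label."""

    def __init__(self, variables, fill_value="Missing"):
        self.variables = variables
        self.fill_value = fill_value

    def fit(self, rows):
        return self  # nothing to learn for a constant fill

    def transform(self, rows):
        out = []
        for row in rows:
            row = dict(row)  # copy so the input is left untouched
            for var in self.variables:
                if row.get(var) is None:
                    row[var] = self.fill_value
            out.append(row)
        return out


# The same class serves two different (hypothetical) models via configuration.
house_step = CategoricalImputer(variables=["garage_type"])
loan_step = CategoricalImputer(variables=["employer", "purpose"], fill_value="Unknown")

houses = house_step.fit([]).transform([{"garage_type": None, "area": 80}])
loans = loan_step.fit([]).transform([{"employer": "ACME", "purpose": None}])
print(houses)
print(loans)
```

Because the column names are passed in rather than written into the class, nothing has to be re-written when a new model uses different variables.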