Process of training a model to classify facies labels using ML

1. Preprocessing

- Feature Augmentation

Most commonly used: feature windows and spatial gradients (from the ‘ISPL team’).
“Polynomial features” were also used to augment the features.
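
As a rough illustration of the first two ideas, the sketch below builds a feature window (each sample concatenated with its neighbours above and below) and a spatial gradient (the change of each feature per unit depth). The function names `augment_window` and `augment_gradient`, the zero padding at the well edges, and the toy data are assumptions made for this example, not the ISPL team's actual code.

```python
import numpy as np

def augment_window(X, n_neighbors=1):
    """Concatenate each sample with its n_neighbors rows above and below."""
    n_samples, n_features = X.shape
    # Zero-pad top and bottom so edge samples still get a full window (assumption).
    pad = np.zeros((n_neighbors, n_features))
    padded = np.vstack([pad, X, pad])
    # Shifted copies of the log, stacked side by side.
    shifted = [padded[i:i + n_samples] for i in range(2 * n_neighbors + 1)]
    return np.hstack(shifted)

def augment_gradient(X, depth):
    """Finite-difference gradient of each feature with respect to depth."""
    d_depth = np.diff(depth).reshape(-1, 1)
    d_depth[d_depth == 0] = 0.001  # guard against repeated depth values
    grad = np.diff(X, axis=0) / d_depth
    # Append a zero row so the output has one row per sample.
    return np.vstack([grad, np.zeros((1, X.shape[1]))])

# Toy well: 5 depth samples, 2 features (hypothetical values).
depth = np.array([100.0, 100.5, 101.0, 101.5, 102.0])
X = np.array([[1.0, 10.0], [2.0, 20.0], [4.0, 40.0], [7.0, 70.0], [11.0, 110.0]])

X_aug = np.hstack([augment_window(X, n_neighbors=1), augment_gradient(X, depth)])
print(X_aug.shape)  # (5, 8): 3 window positions x 2 features + 2 gradient columns
```

In practice this would be applied well by well, so that windows and gradients never mix samples from different wells.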

- Outlier Removal

Z-score, MAD (modified Z-score)

If a sample’s Z-score exceeds a threshold, it is regarded as an outlier and removed. (Only the ‘PA team’ used this.)
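
A minimal sketch of both tests on a single feature is shown below. The thresholds (2.0 for the plain Z-score, 3.5 for the modified Z-score) and the sample values are assumptions chosen for the example, not the PA team's settings. The constant 0.6745 is the usual scaling factor that makes the MAD-based score comparable to a standard Z-score for normally distributed data.

```python
import numpy as np

def zscore_mask(x, threshold=2.0):
    """True for samples whose |Z-score| is within the threshold."""
    z = (x - np.mean(x)) / np.std(x)
    return np.abs(z) <= threshold

def modified_zscore_mask(x, threshold=3.5):
    """True for samples whose |modified Z-score| is within the threshold.

    Uses the median and the median absolute deviation (MAD), which are
    less affected by the outliers themselves. Assumes MAD is non-zero.
    """
    median = np.median(x)
    mad = np.median(np.abs(x - median))
    m = 0.6745 * (x - median) / mad
    return np.abs(m) <= threshold

# Hypothetical log values with one obvious outlier at the end.
values = np.array([2.1, 2.3, 2.2, 2.4, 2.0, 2.2, 9.5])

keep_z = zscore_mask(values)
keep_mad = modified_zscore_mask(values)
print(values[keep_z])
print(values[keep_mad])
```

With only a few samples, a single extreme value inflates the mean and standard deviation, so the plain Z-score can never get very large. That is one reason the MAD-based variant is often preferred.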

- Handling Missing Data

PE imputation
  • Replace NaN with 0 or the mean value
  • make_pipeline(… , …)
  • MLP Regressor
  • PCA, TPOT
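
The sketch below contrasts two of these options on synthetic data: filling missing PE values with the mean, and predicting them from the other logs with an `MLPRegressor` inside a `make_pipeline`. The pipeline steps, the network size, and the synthetic logs are assumptions for illustration. The notes above do not specify what the teams actually put in their pipelines.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for well logs: three complete logs plus PE with gaps.
n = 200
other_logs = rng.normal(size=(n, 3))
pe = 3.0 + other_logs @ np.array([0.5, -0.3, 0.2]) + rng.normal(scale=0.1, size=n)
pe[rng.choice(n, size=40, replace=False)] = np.nan

missing = np.isnan(pe)

# Option 1: replace NaN with the mean of the observed PE values.
pe_mean = np.where(missing, np.nanmean(pe), pe)

# Option 2: train a regressor on rows where PE is known, predict the rest.
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
)
model.fit(other_logs[~missing], pe[~missing])

pe_reg = pe.copy()
pe_reg[missing] = model.predict(other_logs[missing])

print(np.isnan(pe_mean).sum(), np.isnan(pe_reg).sum())
```

Mean imputation is simple but gives every gap the same value. The regression approach keeps some of the relationship between PE and the other logs.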

2. Train Model

A scaling step is usually included.

  • Deep Neural Network
  • SVM
  • Random Forest (RandomForestClassifier)
  • Gradient Boosting (GradientBoostingClassifier)
  • AdaBoost (AdaBoostClassifier)
  • XGBoost (XGBClassifier)

To tune parameters, grid search is mostly used (GridSearchCV, or testing values one by one).
(Cross-validation to validate the model is used only by the ‘LA team’.)
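
As a sketch of how scaling, a classifier, and a grid search can fit together, the example below wraps a `StandardScaler` and a `RandomForestClassifier` in one pipeline and tunes it with `GridSearchCV`. The synthetic data, the parameter grid, the `f1_micro` scoring choice, and the 3-fold split are all assumptions for illustration, not any team's configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the well-log features and facies labels.
X, y = make_classification(
    n_samples=300, n_features=7, n_informative=5, n_classes=3, random_state=0
)

# Scaling sits inside the pipeline, so it is re-fit on each training fold
# and never sees the validation fold.
pipeline = make_pipeline(StandardScaler(), RandomForestClassifier(random_state=0))

# make_pipeline names each step after its lowercased class name.
param_grid = {
    "randomforestclassifier__n_estimators": [50, 100],
    "randomforestclassifier__max_depth": [5, None],
}

search = GridSearchCV(pipeline, param_grid, scoring="f1_micro", cv=3)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```

`GridSearchCV` already cross-validates each parameter combination internally, so it covers both the tuning and a basic form of validation.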

3. Apply to Test Data

Scoring
  • Deterministic score : F1 score of the result when the model is applied to the test well.
  • Stochastic score : mean of the F1 scores obtained with 100 different random seeds.
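
The two scores can be sketched as follows, using a synthetic train/test split in place of the real wells. The micro-averaged F1, the small random forest, and the helper name `score_with_seed` are assumptions made for this example. The notes above do not state which F1 averaging was used.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for training wells and a held-out test well.
X, y = make_classification(
    n_samples=300, n_features=7, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

def score_with_seed(seed):
    """Train with the given random seed and return the F1 score on the test set."""
    model = RandomForestClassifier(n_estimators=20, random_state=seed)
    model.fit(X_train, y_train)
    return f1_score(y_test, model.predict(X_test), average="micro")

# Deterministic score: one run with a fixed seed.
deterministic_score = score_with_seed(0)

# Stochastic score: mean F1 over 100 different seeds.
seeds = range(100)
stochastic_score = np.mean([score_with_seed(s) for s in seeds])

print(deterministic_score, stochastic_score)
```

Averaging over many seeds shows how much of a single score is due to a lucky or unlucky initialization rather than the method itself.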
