Cross-validation methods
Each method below is described by its concept, the datasets it is applicable to, its advantages, and its disadvantages.
Standard holdout validation 
You split your dataset into two parts: a training set and a test set. The training set is used to train your model, and the test set is used to evaluate its performance. The primary purpose of this approach is to estimate how well your model will generalize to unseen data. 
Large datasets (typical in deep learning)
Simplicity: It is easy to understand and implement. You only need to split the dataset into two parts, making it a straightforward method for assessing a model's performance.
Efficiency: Standard holdout validation is computationally efficient, especially when dealing with large datasets. It requires fewer computations than some other cross-validation techniques.
Speed: Training and evaluating a model using a holdout validation set is quicker than some more complex cross-validation methods, making it practical for rapid model prototyping and development.
Useful for Large Datasets: It is well-suited for cases where you have a large amount of data, and the performance of the model on the holdout set can provide a reasonable estimate of generalization performance. 
Variance: The performance estimate from a single train/test split can be highly variable. Depending on the random split, you might get different results, which may not be representative of the model's true generalization performance.
Bias: The performance estimate can be biased, especially when the dataset is imbalanced. A random split may lead to an unrepresentative distribution of classes in the training and test sets.
Limited Information: You are using only a portion of your data for testing, which means you might not be fully utilizing the information available in the dataset to assess your model's performance.
Overfitting Risk: There is a risk of overfitting to the specific holdout set, as the model might perform well on that particular data but poorly on unseen data.
Unreliable for Small Datasets: In cases where you have a small dataset, the standard holdout method might not provide a robust estimate of the model's performance, and you might be better off using more sophisticated cross-validation techniques. 
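A minimal sketch of holdout validation using scikit-learn (assuming it is installed); the dataset here is synthetic and purely illustrative:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy dataset: 200 samples, 5 features, binary labels (hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Hold out 20% of the data as a test set; stratify to preserve class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = LogisticRegression().fit(X_train, y_train)
holdout_accuracy = model.score(X_test, y_test)
```

Note that `holdout_accuracy` depends on the particular random split, which is exactly the variance problem described above.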
K-Fold Cross-Validation 
The dataset is randomly divided into K subsets (folds) of roughly equal size. The model is trained and tested K times, with each fold used as the test set once and the remaining folds as the training data. The results are averaged to evaluate model performance. 
Small (e.g. ~100 samples) and large datasets 
Provides a robust estimate of model performance.
Helps assess model stability and generalization.
Can be useful for both small and large datasets. 
Can be computationally expensive, especially with a large number of folds.
The results may vary depending on the random splitting of data. 
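A short sketch of 5-fold cross-validation with scikit-learn (assuming it is available); the data is synthetic:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = (X[:, 0] > 0).astype(int)

# 5 folds: every sample appears in a test set exactly once.
cv = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(), X, y, cv=cv)
mean_score = scores.mean()  # averaged estimate of model performance
```

The spread of `scores` across folds also gives a rough sense of model stability.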
Stratified K-Fold Cross-Validation 
Similar to K-Fold Cross-Validation, but it ensures that each fold has a similar class distribution to the entire dataset. It's particularly useful when dealing with imbalanced datasets. 
Imbalanced datasets (any size) 
Ensures a more representative distribution of classes in each fold, which is important for imbalanced datasets. 
Still subject to computational cost and randomness. 
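The class-preserving behaviour can be seen directly with scikit-learn's `StratifiedKFold` (assuming scikit-learn is installed); the imbalanced labels below are made up for illustration:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Imbalanced toy labels: 90 negatives, 10 positives (hypothetical).
y = np.array([0] * 90 + [1] * 10)
X = np.arange(100).reshape(-1, 1)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
# Each test fold keeps the 9:1 class ratio of the full dataset,
# so each of the 5 folds receives exactly 2 of the 10 positives.
positives_per_fold = [int(y[test].sum()) for _, test in cv.split(X, y)]
```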
Leave-One-Out Cross-Validation (LOOCV) 
K is set to the number of samples in the dataset. Each data point is used as the test set once while the rest serve as the training data. This method is useful for small datasets but can be computationally expensive. 
Extremely small (e.g. 20–50 samples) 
Provides the least biased estimate of model performance for small datasets. 
Extremely computationally expensive for large datasets.
Prone to high variance in the performance estimate. 
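A minimal LOOCV sketch with scikit-learn (assumed available); the 30-sample dataset is synthetic:

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))
y = (X[:, 0] > 0).astype(int)

# One fold per sample: 30 samples -> 30 train/test rounds,
# each score is 0 or 1 because the test set has a single point.
loo = LeaveOneOut()
scores = cross_val_score(KNeighborsClassifier(n_neighbors=3), X, y, cv=loo)
```

The all-or-nothing per-fold scores are what drive the high variance noted above.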
Leave-P-Out Cross-Validation 
Generalizing from LOOCV, this method involves leaving out P data points as the test set while using the remaining data for training. It strikes a balance between computational cost and variance in the estimated performance. 
Small datasets 
Low Bias: LPOCV provides a less biased estimate of a model's performance compared to simpler techniques like holdout validation (e.g., train/test split). This is because you are leaving out multiple data points as the test set, allowing you to assess how well the model generalizes to various subsets of the data.
Variability in Evaluation: LPOCV allows you to evaluate your model using multiple different test sets (combinations of P data points). This helps to assess how robust your model is and provides a better understanding of its overall performance.
Utilizes Most of the Data: Since you are repeatedly using (N − P) data points for training and P data points for testing, LPOCV makes efficient use of your dataset. This can be important when you have limited data. 
Computational Intensity: LPOCV can be computationally expensive, especially when P is a large fraction of the total number of data points (N). With a large dataset, the number of possible test sets can be extremely high, leading to long training and evaluation times.
High Variance: LPOCV can yield high variance in the estimated performance, which may make it less stable compared to techniques like K-Fold Cross-Validation. The variance is especially prominent when P is close to N, approaching the Leave-One-Out Cross-Validation (LOOCV) scenario.
Resource Intensive: With a large value of P, the amount of memory and computational resources required can become a limiting factor. This is especially true for datasets with a substantial number of features.
Dependence on P: The choice of P is critical. If P is too small, the test sets may not be representative enough, leading to biased estimates. On the other hand, if P is too large, you might encounter the computational and variance issues mentioned above. 
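The combinatorial growth in the number of splits is easy to verify with scikit-learn's `LeavePOut` (assuming scikit-learn is installed):

```python
import numpy as np
from sklearn.model_selection import LeavePOut

X = np.arange(10).reshape(-1, 1)  # 10 samples
lpo = LeavePOut(p=2)
# Number of splits is "10 choose 2" = 45; this count grows
# combinatorially in N and P, which is the main cost of LPOCV.
n_splits = lpo.get_n_splits(X)
```

Even at N = 10 and P = 2 there are already 45 train/test rounds; with realistic N the count quickly becomes infeasible.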
Time Series Cross-Validation 
Specifically designed for time series data, it involves splitting the dataset into consecutive and non-overlapping time periods. This helps to evaluate a model's ability to make predictions into the future. 
Time series and other sequential data 
Specifically designed for time-dependent data.
Helps evaluate a model's ability to make future predictions. 
May not be suitable for non-time-series data.
Limited to sequential data with a clear time order. 
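A sketch using scikit-learn's `TimeSeriesSplit` (assumed available) on 12 time-ordered observations, showing that every training window precedes its test window:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(12).reshape(-1, 1)  # 12 time-ordered observations
tscv = TimeSeriesSplit(n_splits=3)

splits = [(train.tolist(), test.tolist()) for train, test in tscv.split(X)]
# Every index in a training window precedes its test window,
# so the model never "sees the future" during training.
ordered = all(max(train) < min(test) for train, test in splits)
```

With 12 samples and 3 splits, the first round trains on indices 0–2 and tests on 3–5, then the training window grows forward in time.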
ShuffleSplit Cross-Validation 
In ShuffleSplit, the dataset is randomly shuffled and split into multiple independent train/test partitions; within each split the training and test sets do not overlap, although test sets may recur across iterations. This approach is useful for large datasets or when you want to assess model stability. 
Large datasets 
Useful for large datasets or when assessing model stability. 
May introduce some randomness into the results. 
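A brief sketch with scikit-learn's `ShuffleSplit` (assuming scikit-learn is installed), drawing ten independent random partitions:

```python
import numpy as np
from sklearn.model_selection import ShuffleSplit

X = np.arange(100).reshape(-1, 1)

# 10 independent random splits, each holding out 25% for testing.
ss = ShuffleSplit(n_splits=10, test_size=0.25, random_state=0)
test_sizes = [len(test) for _, test in ss.split(X)]
```

Unlike K-Fold, the number of splits here is independent of the test-set fraction, which makes the method convenient for large datasets.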
Repeated K-Fold Cross-Validation 
It's a variation of K-Fold Cross-Validation where the process is repeated multiple times with different random splits. This helps in obtaining more reliable estimates of model performance. 

Provides more reliable and less biased estimates than a single K-Fold validation. 
Increased computational cost due to repetition. 
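A minimal sketch with scikit-learn's `RepeatedKFold` (assumed available); the data is synthetic:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedKFold, cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 3))
y = (X[:, 0] > 0).astype(int)

# 5-fold CV repeated 3 times with different shuffles -> 15 scores in total.
cv = RepeatedKFold(n_splits=5, n_repeats=3, random_state=0)
scores = cross_val_score(LogisticRegression(), X, y, cv=cv)
```

Averaging over all 15 scores smooths out the split-to-split noise of any single K-Fold run, at three times the cost.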
Group Cross-Validation 
Used when dealing with data that has natural groupings, such as medical data for patients from different hospitals. It ensures that all data from a specific group is either in the training or test set but not in both. 
Data with natural groupings (e.g. patients from different hospitals) 
Suits datasets with groupings or clusters, like medical data from different hospitals. 
Requires additional information about groupings. 
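The no-leakage guarantee can be checked directly with scikit-learn's `GroupKFold` (assumed available); the group IDs below are hypothetical hospital labels:

```python
import numpy as np
from sklearn.model_selection import GroupKFold

X = np.arange(12).reshape(-1, 1)
y = np.zeros(12)
# Hypothetical hospital IDs: three samples from each of four hospitals.
groups = np.repeat([0, 1, 2, 3], 3)

gkf = GroupKFold(n_splits=4)
# No group ever appears in both the training and the test indices
# of the same split, so data from one hospital cannot leak across.
leak_free = all(
    set(groups[train]).isdisjoint(groups[test])
    for train, test in gkf.split(X, y, groups=groups)
)
```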
Nested Cross-Validation 
This technique is often used for hyperparameter tuning and model selection. It involves having an inner and an outer loop of cross-validation, where the inner loop optimizes model parameters, and the outer loop evaluates the model's generalization performance. 

Helps in hyperparameter tuning and model selection.
Provides a more robust assessment of model performance. 
Increases computational complexity. 
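One common way to sketch nested CV in scikit-learn (assumed available) is to wrap a `GridSearchCV` as the estimator scored by an outer `cross_val_score`; the data and parameter grid here are illustrative:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X = rng.normal(size=(80, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Inner loop: tune the C hyperparameter via grid search.
inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)
tuned = GridSearchCV(SVC(), {"C": [0.1, 1.0, 10.0]}, cv=inner_cv)

# Outer loop: estimate the generalization performance of the
# whole "tune-then-fit" procedure, not of one fixed C value.
outer_cv = KFold(n_splits=4, shuffle=True, random_state=1)
nested_scores = cross_val_score(tuned, X, y, cv=outer_cv)
```

Each outer fold re-runs the full inner grid search, which is where the extra computational cost comes from.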
Monte Carlo Cross-Validation 
Subsets of data are randomly sampled, and cross-validation is performed on each subset. It's useful for assessing model performance in situations where data sampling is stochastic or uncertain. 

Useful when data sampling is stochastic or uncertain. 
Can be computationally intensive if the number of Monte Carlo samples is high. 
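A hand-rolled sketch of Monte Carlo (repeated random subsampling) validation using NumPy and scikit-learn (both assumed available); the number of rounds and test fraction are arbitrary choices for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)

# Repeatedly draw a fresh random train/test partition and score the model.
n_rounds, test_frac = 20, 0.3
scores = []
for _ in range(n_rounds):
    idx = rng.permutation(len(X))
    cut = int(len(X) * test_frac)
    test, train = idx[:cut], idx[cut:]
    model = LogisticRegression().fit(X[train], y[train])
    scores.append(model.score(X[test], y[test]))

mean_score = float(np.mean(scores))
std_score = float(np.std(scores))  # spread reflects sampling uncertainty
```

The standard deviation across rounds quantifies how sensitive the estimate is to the random sampling, at the cost of `n_rounds` full model fits.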