=================================================================================
Supervised learning algorithms can be broadly divided into two categories: parametric and non-parametric.

Parametric Learning Algorithms

These algorithms assume a specific functional form for the mapping from input data to output data, and they have a fixed number of parameters regardless of the number of training samples. The primary goal is to estimate optimal values for these parameters; once the model is trained, the parameter count stays the same no matter how much additional data is fed to it. Examples include (a brief sketch follows the list):
- Linear Regression
- Logistic Regression
- Perceptron
- Naive Bayes
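To make the fixed-parameter property concrete, here is a minimal sketch (assuming NumPy is available; the helper name fit_linear_regression is our own) that fits a linear regression by least squares. The learned weight vector has one entry per feature plus a bias, independent of the sample count:

import numpy as np

def fit_linear_regression(X, y):
    # Append a bias column, then solve the least-squares problem.
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w  # shape: (n_features + 1,), independent of len(y)

rng = np.random.default_rng(0)
for n_samples in (50, 5000):
    X = rng.normal(size=(n_samples, 3))
    y = X @ np.array([2.0, -1.0, 0.5]) + 4.0
    w = fit_linear_regression(X, y)
    print(n_samples, "samples ->", w.size, "parameters")  # always 4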
Non-Parametric Learning Algorithms

Non-parametric algorithms, on the other hand, do not make strong assumptions about the form of the mapping function. Their complexity grows with the amount of training data, which lets them fit a wide range of data shapes. They are called non-parametric not because they have no parameters, but because the number of parameters is not fixed before training; it grows with the training data. Examples include (a brief sketch follows the list):
- k-Nearest Neighbors (k-NN)
- Decision Trees
- Random Forests
- Kernel SVMs
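As an illustration of a model that grows with the data, here is a minimal k-NN sketch (NumPy only; the class name KNNClassifier is our own). Note that fit() estimates nothing; it simply stores the training set, so the model's memory footprint scales with the number of samples:

import numpy as np

class KNNClassifier:
    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # No parameters are estimated; the "model" is the data itself.
        self.X_, self.y_ = np.asarray(X), np.asarray(y)
        return self

    def predict(self, X):
        X = np.asarray(X)
        # Pairwise Euclidean distances from each query to each stored point.
        d = np.linalg.norm(X[:, None, :] - self.X_[None, :, :], axis=-1)
        nearest = np.argsort(d, axis=1)[:, :self.k]
        # Majority vote among the k nearest labels.
        return np.array([np.bincount(v).argmax() for v in self.y_[nearest]])

X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
print(KNNClassifier(k=3).fit(X, y).predict([[0.05, 0.1], [1.0, 0.9]]))  # [0 1]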
While parametric models offer simplicity and speed, non-parametric models offer flexibility and are often more powerful at the cost of increased computational complexity and data needs. Each type has its strengths and scenarios where it is most appropriate.
Unsupervised learning algorithms can also be categorized as parametric or non-parametric, depending on whether they assume a specific form, with a fixed number of parameters, for the underlying data distribution.

Parametric Unsupervised Learning Algorithms
Parametric unsupervised algorithms assume a specific form for the data distribution and attempt to learn the parameters of that model from the data. A common example is the Gaussian Mixture Model (GMM), where the data is assumed to be generated from a mixture of several Gaussian distributions. The task then is to estimate the parameters of these Gaussians (means, variances, and mixture coefficients).
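A minimal GMM sketch (assuming scikit-learn is available) that fits two components to synthetic one-dimensional data and reads back the estimated parameters:

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic data drawn from two Gaussians, so the parametric assumption holds.
X = np.vstack([rng.normal(-2.0, 0.5, size=(200, 1)),
               rng.normal(3.0, 1.0, size=(300, 1))])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
print("mixture coefficients:", gmm.weights_)   # sum to 1
print("means:", gmm.means_.ravel())            # approximately -2 and 3
print("variances:", gmm.covariances_.ravel())  # approximately 0.25 and 1.0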
Non-Parametric Unsupervised Learning Algorithms
Non-parametric algorithms, on the other hand, do not assume a fixed form or number of parameters for the model. These methods adapt to the data structure as more data becomes available, typically growing in complexity with the size of the dataset. Examples include (a brief sketch follows the list):
- K-means clustering: K-means does fix the number of clusters (K) in advance, but it makes no distributional assumptions about the data beyond an implicit preference for compact, roughly spherical clusters.
- DBSCAN (Density-Based Spatial Clustering of Applications with Noise): This algorithm does not assume any specific number of clusters. Instead, it groups points that are closely packed, marking as outliers the points that lie alone in low-density regions.
- Kernel density estimation (KDE): Estimates the underlying probability density function of the data without assuming its shape beforehand.
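To illustrate the last item, a short KDE sketch (assuming SciPy is available): every sample point contributes a kernel, so the estimator's complexity grows with the data and no density shape is fixed in advance:

import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
# A bimodal sample that a single parametric Gaussian would fit poorly.
sample = np.concatenate([rng.normal(-2.0, 0.5, 200), rng.normal(3.0, 1.0, 300)])

kde = gaussian_kde(sample)      # bandwidth set automatically (Scott's rule)
grid = np.linspace(-5.0, 7.0, 5)
print(np.round(kde(grid), 4))   # estimated density at a few evaluation points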
The choice between parametric and non-parametric unsupervised learning methods depends on the specific characteristics of the data and the goals of the analysis. Non-parametric methods are generally more flexible but can require more data and computational resources to effectively capture the underlying patterns in the data.
===========================================