Training Process (Train Model) in ML - Python for Integrated Circuits - An Online Book - http://www.globalsino.com/ICs/
=================================================================================

Training data sets require several example predictor variables to classify or predict a response. In machine learning, the predictor variables are called features and the responses are called labels. The training data is used to fit models, the validation data is used to make sure the model generalizes well to new data, and the test set is used to evaluate the performance of the final model. In predictive modeling and machine learning, the term "fit" or "model fitting" refers to the process of training a machine learning model on a dataset. When you "fit" a model to your training data, you are essentially teaching the model to learn patterns, relationships, and associations within that dataset. The goal is to make the model capable of making accurate predictions or classifications based on the input data.
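The three-way split described above can be illustrated with a minimal scikit-learn sketch. The dataset, split ratios, and choice of LogisticRegression are illustrative assumptions, not part of the original text:

# Minimal sketch: splitting data into training, validation, and test sets,
# then fitting a model on the training set only. All data is illustrative.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Illustrative dataset: 1000 samples, 20 features (predictors), binary labels.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# First carve out the test set, then split the rest into train/validation.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)                                 # "fitting" = learning from the training data

print("validation accuracy:", model.score(X_val, y_val))   # used for model selection/tuning
print("test accuracy:", model.score(X_test, y_test))       # final, one-time evaluation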
Therefore, "fitting" a model refers to the training phase where the model learns from the training data to make predictions or classifications. The validation and test datasets are used to ensure that the model can generalize to new data and performs well when deployed for practical use. On the other hand, when training a machine learning model, the goal is often to minimize a certain objective function (e.g., a loss function) by adjusting the weights of the model. In fact, in most cases, the "training" or "model training" is actually the process, where the goal is to minimize the difference between the predicted output (hypothesis(x)) and the actual output (y), which can be given by (h(x)-y)2. The learning rate plays a crucial role in this process. If the learning rate is too small, the model may take a long time to converge or may get stuck in a local minimum. On the other hand, if the learning rate is too large, the optimization process may oscillate or even diverge, making it difficult to find the optimal set of weights. In supervised learning, this process is a fundamental step where the machine learning algorithm adjusts its parameters to make the predictions as close as possible to the true target values in the training dataset. The process of minimizing the difference between the hypothesis and the actual target, by choising correct parameters (θ), is typically achieved through various optimization techniques, such as gradient descent, which iteratively updates the model's parameters to reduce the prediction error. The objective is to find the set of parameters that results in the best possible fit of the model to the training data, allowing it to generalize well to new, unseen data. Figure 4109a shows how supervised learning works. To provide a precise characterization of the supervised learning problem, the objective is to acquire a function h: X → Y from a given training set. This function, denoted as h(x), should excel at predicting the associated value y. Traditionally, this function h is referred to as a "hypothesis" due to historical conventions. Figure 4109a. Workflow of supervised learning. Figure 4109b shows the Vertex AI providing a unified set of APIs for the ML lifecycle. When sending training jobs to Vertex AI, most of the logic is split into a task.py and a model.py. Figure 4109b. Vertex AI providing a unified set of APIs for the ML lifecycle. [1] To make your code compatible with Vertex AI, there are three basic steps that must be completed in a specific order: You can use either pre-built containers or custom containers to run training jobs. Both containers require you specify settings that Vertex AI needs to run your training code, including region, display-name and worker-pool-spec. Figure 4109c shows simplified overview of a machine learning workflow. The machine learning model is trained on input data gathered from different databases. Once it is trained, it can be applied to make predictions for other input data.
Non-linearity helps a model train much faster and more accurately without losing important information. Non-saturating, non-linear activation functions such as ReLUs can be used to fix the successive reduction of signal versus noise caused by each additional layer in the network during the training process.

ML models can be trained on cloud infrastructure through the Google Cloud training service. The training package can be run in Docker containers, and the resulting training Docker images can be pushed to a Docker registry.

With a Keras model, the significance of the .fit() method is that it defines the number of training epochs, as sketched in the example below.

Table 4109. Training models and training objectives.
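As an illustration of ReLU activations and of .fit() defining the number of epochs, here is a minimal Keras sketch; the layer sizes and random data are illustrative assumptions, not part of the original text:

# Minimal Keras sketch: ReLU activations in the hidden layers and the
# number of epochs defined through .fit(). Data shapes are illustrative.
import numpy as np
import tensorflow as tf

X_train = np.random.rand(100, 20).astype("float32")   # 100 samples, 20 features
y_train = np.random.randint(0, 2, size=(100,))        # binary labels

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),     # non-saturating ReLU
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# .fit() defines the number of epochs (full passes over the training data).
model.fit(X_train, y_train, epochs=5, batch_size=16)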
============================================
[1] Diagram courtesy of Henry Tappen and Brian Kobashikawa.