What is a hyperparameter?

Large Language Model - Interview Questions

In machine learning, a hyperparameter is a parameter that is set before the training process begins and controls the behavior of the training algorithm. Unlike regular parameters, which are learned by the model during training, hyperparameters are set by the developer or researcher based on prior knowledge or trial-and-error experimentation.

Examples of hyperparameters include the learning rate, batch size, number of epochs, regularization strength, and network architecture. The selection of appropriate hyperparameters can greatly affect the performance of a model, and tuning hyperparameters is often an essential part of developing an effective machine learning system.

Hyperparameters are typically set using heuristics or by searching over a range of possible values. Grid search and random search are common techniques for hyperparameter tuning, but other more advanced methods, such as Bayesian optimization and gradient-based optimization, have also been developed.