Machine Learning Interview Questions
Machine learning is a branch of Artificial Intelligence that builds systems which automate data analysis, enabling computers to learn and act from experience without being explicitly programmed.
Example: Robots can be programmed to perform tasks based on the data they collect from sensors. They learn from that data and improve with experience.
Machine Learning algorithms can be primarily classified depending on the presence/absence of target variables.
Supervised learning : [Target is present]
* The machine learns using labelled data. The model is trained on an existing data set before it starts making decisions with the new data.
* When the target variable is continuous (regression): Linear Regression, Polynomial Regression, Quadratic Regression.
* When the target variable is categorical (classification): Logistic Regression, Naive Bayes, KNN, SVM, Decision Tree, Gradient Boosting, AdaBoost, Bagging, Random Forest, etc.
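As a minimal sketch of supervised learning with a categorical target, the snippet below implements a tiny k-nearest-neighbours (KNN) classifier in plain Python. The data set, the meaning of the two features, and the names `knn_predict`, `train_X`, and `train_y` are illustrative assumptions, not part of any library.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Pair each training point with its label and sort by Euclidean distance to the query
    neighbours = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    # Keep the labels of the k closest points and return the most common one
    top_labels = [label for _, label in neighbours[:k]]
    return Counter(top_labels).most_common(1)[0][0]

# Hypothetical labelled data: (hours studied, hours slept) -> exam outcome
train_X = [(1.0, 4.0), (2.0, 5.0), (1.5, 3.5), (8.0, 7.0), (9.0, 8.0), (7.5, 6.5)]
train_y = ["fail", "fail", "fail", "pass", "pass", "pass"]

prediction_low = knn_predict(train_X, train_y, (2.0, 4.0))
prediction_high = knn_predict(train_X, train_y, (8.5, 7.5))
print(prediction_low, prediction_high)
```

The labels in `train_y` are what make this supervised: the model only ever predicts values it has seen as targets in the training data.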
Unsupervised learning : [Target is absent]
* The machine is trained on unlabelled data, without guidance. It automatically infers patterns and relationships in the data, for example by creating clusters. The model learns through observations and deduced structures in the data.
* Examples: Principal Component Analysis, Factor Analysis, Singular Value Decomposition, etc.
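To illustrate how a model can create clusters from unlabelled data, here is a small sketch of k-means (Lloyd's algorithm) in plain Python. K-means is used here only as a familiar clustering example. The sample points and the choice of the first `k` points as starting centroids are assumptions made to keep the sketch deterministic.

```python
import math

def k_means(points, k, iterations=10):
    """Group unlabelled points into k clusters using basic Lloyd's algorithm."""
    # Assumption: start from the first k points so the result is reproducible
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters

# Hypothetical unlabelled 2-D points forming two visible groups
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
centroids, clusters = k_means(points, k=2)
print(clusters)
```

No target labels are supplied anywhere; the grouping comes only from distances between the points.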
Reinforcement Learning :
* The model learns by trial and error. An agent interacts with the environment by taking actions and then discovers the errors or rewards that those actions produce.
Machine learning uses two types of data:

* Structured Data
* Unstructured Data
Structured Data : This type of data is predefined, labeled, and well-formatted before it is stored. Example: Student Records Table.
Unstructured Data : This type of data is kept in its native format and is not processed until it is used. Example: Text, Audio, Video, Emails, etc.
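The difference can be sketched in a few lines of Python. The student records and the email text below are made-up examples: the structured rows can be queried directly by field name, while the raw text has to be processed (here, split into word counts) at the moment it is used.

```python
from collections import Counter

# Structured: every record follows the same predefined schema (hypothetical student table)
students = [
    {"id": 1, "name": "Asha", "grade": 88},
    {"id": 2, "name": "Ben", "grade": 73},
]
average_grade = sum(row["grade"] for row in students) / len(students)

# Unstructured: raw text has no schema, so features are extracted only when needed
email = "Meeting moved to Friday. Please confirm the Friday slot."
word_counts = Counter(word.strip(".,").lower() for word in email.split())

print(average_grade)
print(word_counts["friday"])
```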
Reinforcement learning is an area of machine learning in which the model is trained according to the rewards it receives for its previous actions in the environment. The agent's task is to act so as to maximize the rewards it collects from the environment. For example, if the agent performs a task correctly, it gets a +1 reward, but if it performs the task incorrectly, it gets a -1 reward.
Applications :

* Self-driving cars
* Automatic parking
* Puzzle solvers, etc.
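The reward loop described above can be sketched with a very small agent. This is a simplified epsilon-greedy learner, not a full reinforcement learning algorithm: the two actions, the rule that "right" earns +1 and "left" earns -1, and the function name `run_agent` are all assumptions chosen for illustration.

```python
import random

def run_agent(episodes=200, epsilon=0.1, seed=0):
    """Learn by trial and error which action earns the higher reward."""
    rng = random.Random(seed)
    actions = ["left", "right"]
    value = {a: 0.0 for a in actions}   # running average reward per action
    count = {a: 0 for a in actions}     # how often each action was tried
    total_reward = 0
    for _ in range(episodes):
        # Explore occasionally, otherwise exploit the best-known action
        if rng.random() < epsilon:
            action = rng.choice(actions)
        else:
            action = max(actions, key=value.get)
        # Hypothetical environment: "right" is the correct action
        reward = 1 if action == "right" else -1
        # Incremental update of the average reward for this action
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]
        total_reward += reward
    return value, total_reward

value, total_reward = run_agent()
best_action = max(value, key=value.get)
print(best_action, total_reward)
```

The agent is never told which action is correct. It only sees the +1 or -1 that follows each choice and gradually prefers the action with the higher average reward.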
Some of the popular Machine Learning algorithms are :
* K-Means
* Naive Bayes
* Random Forest
* Linear Regression
* Gradient Boosting algorithms
* Logistic Regression
* Decision Tree
* Dimensionality Reduction Algorithms
Machine Learning is the study, design, and development of algorithms that enable computers to learn without being explicitly programmed.
Data Mining is the process of extracting knowledge or previously unknown, interesting patterns from data (often unstructured), using Machine Learning algorithms.
Ensemble learning is a machine learning technique that combines several base models, such as classifiers or experts, to produce a stronger predictive model. To solve a given computational problem, these models are strategically generated and combined. An ensemble is itself a supervised learning algorithm, as it can be trained and then used to make predictions.
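One common way to combine base models is majority voting. The sketch below assumes three hand-written rules standing in for trained classifiers. The rule names, the spam/ham labels, and the thresholds are hypothetical and exist only to show how the votes are combined.

```python
from collections import Counter

# Three hypothetical base classifiers (simple rules standing in for trained models)
def by_length(msg):
    return "spam" if len(msg) > 40 else "ham"

def by_keyword(msg):
    return "spam" if "free" in msg.lower() else "ham"

def by_exclamation(msg):
    return "spam" if msg.count("!") >= 2 else "ham"

def majority_vote(models, x):
    """Return the label predicted by most of the base models."""
    votes = [model(x) for model in models]
    return Counter(votes).most_common(1)[0][0]

models = [by_length, by_keyword, by_exclamation]

result_a = majority_vote(models, "Win a FREE prize now!! Click here to claim your reward")
result_b = majority_vote(models, "Feel free to call me")
print(result_a, result_b)
```

In the second message one base model votes "spam" because of the word "free", but the other two outvote it, which is the kind of individual error an ensemble is meant to absorb.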
Model Building : Choose a suitable algorithm for the model and train it according to the requirements.

Model Testing : Check the accuracy of the model using the test data.

Applying the Model : Make the required changes after testing and use the final model for real-time projects.

Remember that the model must be checked periodically to make sure it is still working correctly, and modified when needed to keep it up to date.
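The three stages can be sketched end to end with a deliberately simple model. Everything here is an assumption made for illustration: the one-feature data set, the 6/2 train/test split, and the "threshold halfway between the class means" model, which stands in for whatever algorithm is actually chosen.

```python
def train_threshold(samples):
    """Model building: learn a cut-off halfway between the two class means."""
    positives = [x for x, y in samples if y == 1]
    negatives = [x for x, y in samples if y == 0]
    return (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2

def predict(threshold, x):
    """Classify a value as 1 if it reaches the learned threshold, else 0."""
    return 1 if x >= threshold else 0

def accuracy(threshold, samples):
    """Model testing: fraction of samples whose prediction matches the label."""
    correct = sum(predict(threshold, x) == y for x, y in samples)
    return correct / len(samples)

# Hypothetical data: (feature value, label)
data = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1), (2.5, 0), (7.5, 1)]
train, test = data[:6], data[6:]           # hold out the last two rows for testing

threshold = train_threshold(train)          # model building
test_accuracy = accuracy(threshold, test)   # model testing
new_prediction = predict(threshold, 6.2)    # applying the model to unseen input
print(threshold, test_accuracy, new_prediction)
```

Re-running `accuracy` on freshly collected labelled data from time to time is one simple way to carry out the periodic check mentioned above.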
Regression : Regression is the process of finding the relationship between the dependent and independent variables. It is used to predict continuous variables, such as stock prices or house prices. In regression, the task is to find the best-fitting line or curve that predicts the output accurately.
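As a rough illustration of finding the best-fitting line, the sketch below fits a straight line with ordinary least squares for a single independent variable. The house sizes and prices are made-up numbers, and `fit_line` is a hypothetical helper rather than a library function.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: house size in square metres -> price in thousands
sizes = [50, 60, 80, 100]
prices = [170, 200, 260, 320]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 70 + intercept   # predict a continuous value for a new house
print(slope, intercept, predicted_price)
```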
Classification : Classification is the process of finding a function that divides the data into different classes. It is mainly used when the target is discrete. In classification, the aim is to find the decision boundary that separates the dataset into different classes.
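One classic way to learn a linear decision boundary is the perceptron rule, sketched below in plain Python. The perceptron is used here only as an example technique, and the six training points, the +1/-1 labels, and the helper names are assumptions. The data is chosen to be linearly separable, which is the case in which this rule is guaranteed to converge.

```python
def train_perceptron(samples, epochs=20):
    """Learn weights w and bias b so that the sign of w.x + b matches the labels."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), label in samples:
            score = w[0] * x1 + w[1] * x2 + b
            # A point on the wrong side (or exactly on the boundary) triggers an update
            if label * score <= 0:
                w[0] += label * x1
                w[1] += label * x2
                b += label
                mistakes += 1
        if mistakes == 0:
            break   # every training point is on the correct side
    return w, b

def classify(w, b, point):
    """Return +1 or -1 depending on which side of the boundary the point falls."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b > 0 else -1

# Hypothetical, linearly separable training data: ((x1, x2), class)
samples = [((1.0, 1.0), -1), ((2.0, 1.0), -1), ((1.0, 2.0), -1),
           ((4.0, 4.0), 1), ((5.0, 3.0), 1), ((4.0, 5.0), 1)]

w, b = train_perceptron(samples)
print(classify(w, b, (0.0, 0.0)), classify(w, b, (6.0, 6.0)))
```

The learned boundary is the line where `w[0] * x1 + w[1] * x2 + b` equals zero, and points are assigned to a class according to the side they fall on.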
There are various ways to select important variables from a data set, including the following:
* Lasso Regression
* Random Forest with a variable importance plot
* Forward, Backward, and Stepwise selection
* Select variables based on p-values from Linear Regression
* Identify and discard correlated variables before finalizing the important variables
* Top features can be selected based on information gain for the available set of features.
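As one concrete instance from the list above, ranking features by information gain can be sketched as follows. The tiny data set, the feature names `outlook` and `weekday`, and the helper functions are hypothetical. Information gain is computed here as the entropy of the labels minus the weighted entropy after splitting on a categorical feature.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on one categorical feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

# Hypothetical data set: two candidate features and one target
features = {
    "outlook": ["sunny", "sunny", "rainy", "rainy"],
    "weekday": ["mon", "tue", "mon", "tue"],
}
labels = ["yes", "yes", "no", "no"]

gains = {name: information_gain(values, labels) for name, values in features.items()}
ranked = sorted(gains, key=gains.get, reverse=True)   # most informative feature first
print(ranked)
```

In this toy data `outlook` separates the classes perfectly while `weekday` tells nothing about the target, so a top-k selection would keep `outlook` first.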