Posts

Mechanism of Principal Component Analysis

Principal component analysis (PCA) is a popular approach for deriving a low-dimensional set of features from a large set of variables. It is a tool for unsupervised learning and is often used as a dimension reduction technique in regression problems to counter the curse of dimensionality.

The Curse of Dimensionality

The dimensionality of a dataset is the number of attributes or features it contains. As the dimensionality of a problem increases, so does the probability of adding noise features that are not truly associated with the response. These degrade the fitted model and consequently increase the test set error, so higher dimensionality exacerbates the risk of overfitting. Even when the added features are relevant, the variance incurred in fitting their coefficients may outweigh the reduction in bias that they bring. Thus, the curse of dimensionality includes the role of the bias-variance trade-off and the danger of ...
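As a rough illustration of deriving a low-dimensional feature from several variables, the sketch below finds the first principal component of a tiny dataset in plain Python. It assumes a power-iteration approach on the sample covariance matrix, which is one common way to obtain the leading component and not necessarily the method the post goes on to describe. The helper names (`center`, `covariance`, `first_component`) and the toy data are hypothetical.

```python
import math

def center(rows):
    """Subtract each column's mean so every feature has mean zero."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]

def covariance(centered):
    """Sample covariance matrix (divides by n - 1) of centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(cov, iterations=200):
    """Approximate the leading eigenvector of cov by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p  # arbitrary unit-length starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Toy data (an assumption for illustration): two perfectly correlated features.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered = center(data)
cov = covariance(centered)
v = first_component(cov)

# Project each observation onto the component: one feature replaces two.
scores = [sum(r[j] * v[j] for j in range(len(v))) for r in centered]
score_variance = sum(s * s for s in scores) / (len(scores) - 1)
total_variance = sum(cov[j][j] for j in range(len(cov)))
```

Because the two toy features are perfectly correlated, the single score column carries all of the variance in the data. With real data the first component captures only part of it.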

Complete Linear Regression Analysis

Much of mathematics is devoted to studying variables that are deterministically related. Saying that x and y are related in this manner means that once we are told the value of x, the value of y is completely specified. The equation for a linear relationship between x and y is:

\begin{equation} y = \beta_0 + \beta_1 x \end{equation}

However, many variables appear to be related to one another, but not in a deterministic fashion; that is, even for a fixed value of x, there is uncertainty in the value of y. In this case, x is called the independent, predictor, or explanatory variable.

Regression analysis is the part of statistics that investigates the relationship between two or more variables related in a non-deterministic fashion. The linear regression model has the form:

\begin{equation} f(X) = \beta_0 + \sum_{j=1}^{n} X_j \beta_j \end{equation}

The variable whose value is fixed will be denoted by x and will be called the independent predict...
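To make the single-predictor case $y = \beta_0 + \beta_1 x$ concrete, here is a minimal sketch that estimates the two coefficients from noisy data. It assumes ordinary least squares as the fitting criterion, which the excerpt above does not state. The function name and the sample values are hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (beta_0, beta_1) minimising the sum of squared residuals."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Closed-form least-squares estimates for one predictor.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    beta_1 = s_xy / s_xx
    beta_0 = y_mean - beta_1 * x_mean
    return beta_0, beta_1

# Illustrative data: y is close to 2x, but not deterministically so.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

beta_0, beta_1 = fit_simple_linear_regression(xs, ys)

# Residuals measure the part of y that the fitted line leaves unexplained.
residuals = [y - (beta_0 + beta_1 * x) for x, y in zip(xs, ys)]
```

The nonzero residuals are the non-deterministic part of the relationship: for a fixed x, the observed y still deviates from the fitted line.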