Least Median of Squares Regression | ymxb

The Least Median of Squares (LMedS) is a robust statistical method predominantly used in the context of regression analysis. This technique is designed to fit a model to a dataset in a way that is resistant to outliers. Developed as an alternative to more traditional methods like Ordinary Least Squares (OLS) regression, LMedS is distinguished by its focus on minimizing the median of the squares of the residuals rather than their mean. Residuals are the differences between observed and predicted values.

The key advantage of LMedS is its robustness against outliers. In contrast to methods that minimize the mean squared residuals, the median is less influenced by extreme values, making LMedS more reliable in datasets where outliers are present. This is particularly useful in linear regression, where it identifies the line that minimizes the median of the squared residuals, ensuring that the line is not overly influenced by anomalies.

A critical feature of the LMedS method is its robustness, particularly its resilience to outliers. The method boasts a high breakdown point, which is a measure of an estimator's capacity to handle outliers. In the context of LMedS, this breakdown point is approximately 50%, indicating that it can tolerate corruption of up to half of the input data points without a significant degradation in accuracy. This robustness makes LMedS particularly valuable in real-world data analysis scenarios, where outliers are common and can severely skew the results of less robust methods.

Rousseeuw, Peter J.. “Least Median of Squares Regression.” Journal of the American Statistical Association 79 (1984): 871-880.

The LMedS estimator is also characterized by its equivariance under linear transformations of the response variable. This means that whether you transform the data first and then apply LMedS, or apply LMedS first and then transform the data, the end result remains consistent. However, it's important to note that LMedS is not equivariant under affine transformations of both the predictor and response variables.

The algorithm randomly selects pairs of points, calculates the slope (m) and intercept (b) of the line, and then evaluates the median squared deviation (mr2) from this line. The line minimizing this median squared deviation is considered the best fit.

In the LMedS approach, a subset of the data is randomly selected to compute potential models (e.g., lines in linear regression). The method then evaluates these models based on the median of the squared residuals. Since the selection of data points is random, different runs may select different subsets, leading to variability in the computed models.
Skript med en öppen källkod

I sann TradingView-anda har författaren publicerat detta skript med öppen källkod så att andra handlare kan förstå och verifiera det. Hatten av för författaren! Du kan använda det gratis men återanvändning av den här koden i en publikation regleras av våra ordningsregler. Du kan ange den som favorit för att använda den i ett diagram.

Frånsägelse av ansvar

Informationen och publikationerna är inte avsedda att vara, och utgör inte heller finansiella, investerings-, handels- eller andra typer av råd eller rekommendationer som tillhandahålls eller stöds av TradingView. Läs mer i Användarvillkoren.

Vill du använda det här skriptet i ett diagram?