{"id":1416,"date":"2020-08-12T09:49:03","date_gmt":"2020-08-12T09:49:03","guid":{"rendered":"http:\/\/blogs.gm.fh-koeln.de\/westenberger\/?p=1416"},"modified":"2020-08-19T20:54:23","modified_gmt":"2020-08-19T20:54:23","slug":"test-based-detection-of-required-re-training-of-a-prediction-method","status":"publish","type":"post","link":"https:\/\/blogs.gm.fh-koeln.de\/westenberger\/2020\/08\/12\/test-based-detection-of-required-re-training-of-a-prediction-method\/","title":{"rendered":"Test-based Detection of Required Re-Training of a Prediction Method"},"content":{"rendered":"<p class=\"lead\">A method is given which predicts a value for each time window W(i).<\/p>\n<p>A posteriori, the quality of a past prediction can be checked against the realized value. A prediction error E(i) and an accuracy measure A(i) can be computed for each past window W(i). We assume that A(i) can be regarded as an observation of a random variable X with normal distribution (parameters: expected value &#8220;mu&#8221; and standard deviation &#8220;sigma&#8221;, the square root of the variance).<\/p>\n<p><strong>Scenario<\/strong>: The prediction quality for day n is to be tested with a one-tailed test.<\/p>\n<p>A rough approach to detecting that re-training of the prediction model is indicated works as follows: The parameters mu and sigma of the assumed normal distribution are estimated from the observations A(1), \u2026, A(n-1) of days 1, \u2026, n-1.<\/p>\n<p>The new observation then yields the new accuracy A(n), which is checked against the null hypothesis that this value is a realization of the same random variable with identical distribution. If the new value A(n) is too extreme (that is, it drops below a specific threshold), the null hypothesis is rejected. 
This leads to the conclusion that the trained model is outdated and a new training cycle is needed.<\/p>\n<p>The algorithm is illustrated by the following small example (n=6).<\/p>\n<table style=\"height: 250px\" width=\"438\">\n<tbody>\n<tr>\n<td width=\"187\"><\/td>\n<td width=\"80\">Day<\/td>\n<td width=\"80\">Accuracy A(i)<\/td>\n<\/tr>\n<tr>\n<td width=\"187\">previous day<\/td>\n<td width=\"80\">W(1)<\/td>\n<td width=\"80\">0.8<\/td>\n<\/tr>\n<tr>\n<td width=\"187\">previous day<\/td>\n<td width=\"80\">W(2)<\/td>\n<td width=\"80\">0.75<\/td>\n<\/tr>\n<tr>\n<td width=\"187\">previous day<\/td>\n<td width=\"80\">W(3)<\/td>\n<td width=\"80\">0.6<\/td>\n<\/tr>\n<tr>\n<td width=\"187\">previous day<\/td>\n<td width=\"80\">W(4)<\/td>\n<td width=\"80\">0.65<\/td>\n<\/tr>\n<tr>\n<td width=\"187\">previous day<\/td>\n<td width=\"80\">W(5)<\/td>\n<td width=\"80\">0.75<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>Parameter estimates for the normal distribution of X (sample mean and sample standard deviation of A(1), \u2026, A(5)):<\/p>\n<table width=\"160\">\n<tbody>\n<tr>\n<td width=\"80\">mu<\/td>\n<td width=\"80\">0.71<\/td>\n<\/tr>\n<tr>\n<td width=\"80\">sigma<\/td>\n<td width=\"80\">0.0822<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<a class=\"thickbox\" href=\"http:\/\/blogs.gm.fh-koeln.de\/westenberger\/files\/2020\/08\/Estimated-Distribution-of-Accuracy-Random-Variable-X-2.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"300\" height=\"176\" class=\"alignnone size-medium wp-image-1432\" src=\"http:\/\/blogs.gm.fh-koeln.de\/westenberger\/files\/2020\/08\/Estimated-Distribution-of-Accuracy-Random-Variable-X-2-300x176.jpg\" alt=\"\" srcset=\"https:\/\/blogs.gm.fh-koeln.de\/westenberger\/files\/2020\/08\/Estimated-Distribution-of-Accuracy-Random-Variable-X-2-300x176.jpg 300w, https:\/\/blogs.gm.fh-koeln.de\/westenberger\/files\/2020\/08\/Estimated-Distribution-of-Accuracy-Random-Variable-X-2-1024x602.jpg 1024w, 
https:\/\/blogs.gm.fh-koeln.de\/westenberger\/files\/2020\/08\/Estimated-Distribution-of-Accuracy-Random-Variable-X-2-768x452.jpg 768w, https:\/\/blogs.gm.fh-koeln.de\/westenberger\/files\/2020\/08\/Estimated-Distribution-of-Accuracy-Random-Variable-X-2.jpg 1280w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a>\n<p>&nbsp;<\/p>\n<p>Fig: Estimated distribution of X based on the values and assumptions made above.<\/p>\n<p>&nbsp;<\/p>\n<p>A significance level of alpha = 0.2 for a one-sided test (left-hand side) results in a threshold value of<\/p>\n<p>P = 0.641.<\/p>\n<p>That means that an accuracy value A(n) below P = 0.641 leads to rejecting the null hypothesis. If we refine the approach to account for the fact that sigma is only estimated, by using Student\u2019s t-distribution with 4 degrees of freedom, the derived threshold for the one-tailed test becomes<\/p>\n<p>P = 0.675<\/p>\n<p>(this figure is reproduced when the 20% quantile of that t-distribution is combined with the standard error sigma\/sqrt(5) instead of sigma itself).<\/p>\n<p>If the test method is applied to drift detection in a speed layer, the moving time frame leads to recomputed values of mu and sigma for each day. Especially when a drift creeps in slowly, it may be worth introducing a parameter beta that scales the inertia of the moving average in order to obtain a more stable estimate:<\/p>\n<p>mu(i) = beta*mu(new) + (1-beta)*mu(i-1)<\/p>\n<p>sigma(i) = beta*sigma(new) + (1-beta)*sigma(i-1)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A method is given which predicts a value for time window W(i). A posteriori, the quality of the predicted value in the past can be checked against the realized value. A prediction error E(i) and an accuracy measure A(i) can be computed for each window W(i) of the past. 
We assume that A(i) can be considered&#8230;  <a href=\"https:\/\/blogs.gm.fh-koeln.de\/westenberger\/2020\/08\/12\/test-based-detection-of-required-re-training-of-a-prediction-method\/\" class=\"more-link\" title=\"Read Test-based Detection of Required Re-Training of a Prediction Method\"><?php _e(\"Read more &raquo;\",\"wpbootstrap\"); ?><\/a><\/p>\n","protected":false},"author":27,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[134],"tags":[],"class_list":["post-1416","post","type-post","status-publish","format-standard","hentry","category-allgemein"],"acf":[],"_links":{"self":[{"href":"https:\/\/blogs.gm.fh-koeln.de\/westenberger\/wp-json\/wp\/v2\/posts\/1416","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.gm.fh-koeln.de\/westenberger\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.gm.fh-koeln.de\/westenberger\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.gm.fh-koeln.de\/westenberger\/wp-json\/wp\/v2\/users\/27"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.gm.fh-koeln.de\/westenberger\/wp-json\/wp\/v2\/comments?post=1416"}],"version-history":[{"count":10,"href":"https:\/\/blogs.gm.fh-koeln.de\/westenberger\/wp-json\/wp\/v2\/posts\/1416\/revisions"}],"predecessor-version":[{"id":1433,"href":"https:\/\/blogs.gm.fh-koeln.de\/westenberger\/wp-json\/wp\/v2\/posts\/1416\/revisions\/1433"}],"wp:attachment":[{"href":"https:\/\/blogs.gm.fh-koeln.de\/westenberger\/wp-json\/wp\/v2\/media?parent=1416"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.gm.fh-koeln.de\/westenberger\/wp-json\/wp\/v2\/categories?post=1416"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.gm.fh-koeln.de\/westenberger\/wp-json\/wp\/v2\/tags?post=1416"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}