Editors' Pick OPEN-SOURCE SCRIPT

Weighted percentile nearest rank

Updated

Yo, posting this for the whole internet. It took me a whole day to find / design an actually working solution for the weighted percentile 'nearest rank' algorithm: there's almost no reliable info online, and a lot of library-style/textbook-style solutions that don't hold up at a real-world production level.

The principle:

0) initial data
data = 22, 33, 11, 44, 55
weights = 5 , 3 , 2 , 1 , 4

array(s) size = 5

1) sort the data array and apply the same sorting pattern to the weights array, resulting in:
data = 11, 22, 33, 44, 55
weights = 2 , 5 , 3 , 1 , 4

2) get weights cumsum and sum:
weights = 2, 5, 3 , 1 , 4
weights_cum = 2, 7, 10, 11, 15
weights_sum = 15

3) say we wanna find the 50th percentile, get a threshold value:
n = 50
thres = weights_sum / 100 * n
7.5 = 15 / 100 * 50

4) iterate through weights_cum until you find a value that is >= the threshold:
for i = 0 to size - 1
2 >= 7.5 ? nah
7 >= 7.5 ? nah
10 >= 7.5 ? aye

5) take the iteration index that resulted in "aye" and find the data value at the same index in the sorted data array; that's gonna be the resulting percentile.
i = 2
data[2] = 33
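
Steps 0 to 5 can be sketched in Python like this. This is a minimal sketch, not the script's actual code: the function name weighted_percentile_nearest_rank and its signature are made up for illustration, and it assumes non-negative weights and a percentile n between 0 and 100.

```python
from itertools import accumulate


def weighted_percentile_nearest_rank(data, weights, n):
    """Weighted n-th percentile (nearest rank), following steps 1-5 above.

    Hypothetical helper for illustration; assumes non-negative weights
    and 0 <= n <= 100.
    """
    if not data or len(data) != len(weights):
        raise ValueError("data and weights must be non-empty and of equal length")

    # 1) sort the data and carry each weight along with its data value
    pairs = sorted(zip(data, weights), key=lambda p: p[0])
    sorted_data = [d for d, _ in pairs]
    sorted_weights = [w for _, w in pairs]

    # 2) cumulative sum of the weights; the last entry is the total
    weights_cum = list(accumulate(sorted_weights))
    weights_sum = weights_cum[-1]

    # 3) threshold for the requested percentile
    thres = weights_sum / 100 * n

    # 4) + 5) the first cumulative weight that reaches the threshold
    #         picks the data value at the same index
    for i, cum in enumerate(weights_cum):
        if cum >= thres:
            return sorted_data[i]

    # Only reachable if float rounding pushes thres above the total at n = 100
    return sorted_data[-1]


data = [22, 33, 11, 44, 55]
weights = [5, 3, 2, 1, 4]
print(weighted_percentile_nearest_rank(data, weights, 50))  # 33
```

The linear scan mirrors step 4 one to one, so it is easy to check against the worked example by hand.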

This one is not an approximation, not an estimator, it's the actual weighted percentile nearest rank as it is.

I tested the thing extensively and it works perfectly.
For the skeptics, check lines 40, 41, 69 in the code: you can comment/uncomment dem to switch to unit (1) weights, resulting in the usual non-weighted percentile nearest rank that exactly matches TV's built-in function.
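
The unit-weights check can be reproduced outside the script too. The sketch below is only an illustration under assumptions: both function names are hypothetical, the plain version uses the textbook nearest-rank rule (rank = ceil(size / 100 * n), clamped to 1..size), and it is not a reimplementation of the TV built-in.

```python
import math
from itertools import accumulate


def weighted_nearest_rank(data, weights, n):
    # Hypothetical helper: same steps as described in the post
    pairs = sorted(zip(data, weights), key=lambda p: p[0])
    weights_cum = list(accumulate(w for _, w in pairs))
    thres = weights_cum[-1] / 100 * n
    for (value, _), cum in zip(pairs, weights_cum):
        if cum >= thres:
            return value
    return pairs[-1][0]  # float-rounding guard for n = 100


def plain_nearest_rank(data, n):
    # Textbook non-weighted nearest rank: the ceil(size / 100 * n)-th smallest value
    ordered = sorted(data)
    rank = math.ceil(len(ordered) / 100 * n)
    rank = min(max(rank, 1), len(ordered))  # clamp so n = 0 maps to the minimum
    return ordered[rank - 1]


data = [22, 33, 11, 44, 55]
unit_weights = [1] * len(data)

# With every weight set to 1, both versions should agree for every percentile
mismatches = [
    n for n in range(0, 101)
    if weighted_nearest_rank(data, unit_weights, n) != plain_nearest_rank(data, n)
]
print(mismatches)  # []
```

With unit weights the cumulative sums are just 1, 2, ..., size, so "first cumulative weight >= threshold" and "ceil of the threshold" land on the same element.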

Shoutout to @wallneradam for the sorting function, mane
...
Live Long and Prosper

Release Notes

Significant Update Alert

- 10x or more faster calculation thanks to algo complexity improved from O(n²) to O(n log n), so you can comfortably use the thing on long moving windows (as you shoulda anyways), like 256 datapoints and more;
- Now supports combined weighting by time and inferred volume at the same time (as it should've).
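
On the O(n log n) point: the notes don't say where exactly the speedup comes from, so the following is just one way to keep the whole thing within O(n log n), not the script's actual code. It assumes a comparison sort (O(n log n)), a single cumulative-sum pass (O(n)), and a binary search for the threshold (O(log n)). The name weighted_percentile_fast is hypothetical, and non-negative weights are assumed so that the cumulative sums never decrease.

```python
from bisect import bisect_left
from itertools import accumulate


def weighted_percentile_fast(data, weights, n):
    """Weighted nearest-rank percentile in O(n log n) overall.

    Hypothetical sketch; assumes non-negative weights and 0 <= n <= 100.
    """
    # Sort indices by data value so the weights follow the same ordering
    order = sorted(range(len(data)), key=lambda i: data[i])

    # Cumulative weights in sorted order; non-decreasing for weights >= 0
    weights_cum = list(accumulate(weights[i] for i in order))
    thres = weights_cum[-1] / 100 * n

    # bisect_left returns the first position whose cumulative weight is >= thres
    pos = min(bisect_left(weights_cum, thres), len(order) - 1)
    return data[order[pos]]


print(weighted_percentile_fast([22, 33, 11, 44, 55], [5, 3, 2, 1, 4], 50))  # 33
```

The binary search only trims the lookup step; the sort dominates the cost, so a slow (quadratic) sort is the first thing to replace when the window gets long.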