  1. #1
    Join Date
    Aug 2008
    Location
    Indianapolis, IN
    Posts
    7,465

    RPM Primer

    Real plus-minus (RPM) was developed by Jeremias Engelmann, formerly of the Phoenix Suns, in consultation with Steve Ilardi, University of Kansas psychology professor and former NBA consultant.

    It follows the development of adjusted plus-minus (APM) by several analysts and regularized adjusted plus-minus (RAPM) by Joe Sill.

    RPM reflects enhancements to RAPM by Engelmann, among them the use of Bayesian priors, aging curves, score of the game and extensive out-of-sample testing to improve RPM's predictive accuracy.
    This guide is going to be broken into two posts. The first will be some brief history on the statistic, and what it’s trying to accomplish. The second will be issues with RPM, and how we use it wrong. This is not meant to be exhaustive, or detail how to actually calculate RPM, but rather to give everyone a good idea on what the stat is and what it can (and can’t) do. Almost all the math will be skipped, because no one really cares about that. There will be a link at the bottom if anyone really wants it.

    1. What is RPM

    RPM stands for Real Plus-Minus, which is really just marketing talk. It is a derivative of xRAPM, or Expected Regularized Adjusted Plus-Minus, which was created by Engelmann. xRAPM is, as you would expect, itself a derivative of RAPM, which is a derivative of APM, which was an attempt to fix the OG plus-minus.

    PM in its most basic form is how much the score changes when a player is on the floor. This can be looked at for individuals, combinations of multiple players, or entire lineups. The issue with PM at an individual level is noise. There are 9 other players on the floor at any given time (4 teammates and 5 opponents), all contributing varying amounts to the final score. How can you attribute that to one player? It becomes even more difficult when two or more players play so many of their minutes together that PM cannot tell them apart. This is called collinearity. The example I give is Kobe Bryant and Derek Fisher. Whenever Fisher was on the floor, Bryant almost always was as well. Essentially, PM said they were equally impactful, despite how absurd that statement is.
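
    To make the Bryant/Fisher problem concrete, here is a minimal sketch of raw plus-minus. The stint data is made up for illustration (it is not real Lakers data), and it only tracks one team's players to keep things short.

```python
# Hypothetical stints: (players on the floor for one team, point margin in that stint).
# The numbers are invented purely to illustrate the idea.
stints = [
    ({"Bryant", "Fisher", "Gasol"}, +6),
    ({"Bryant", "Fisher", "Odom"}, +4),
    ({"Bryant", "Fisher", "Gasol"}, +5),
    ({"Gasol", "Odom"}, -3),
]


def raw_plus_minus(stints):
    """Credit every player on the floor with the full margin of each stint."""
    totals = {}
    for players, margin in stints:
        for player in players:
            totals[player] = totals.get(player, 0) + margin
    return totals


pm = raw_plus_minus(stints)
for player in sorted(pm):
    print(player, pm[player])
```

    Because Bryant and Fisher never appear apart in this toy data, raw PM gives them identical totals no matter who actually drove the result.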

    APM uses linear regression to model the minutes the players are apart to attempt to isolate one player from the rest. Unfortunately, this does not resolve collinearity, since the minutes Fisher and Bryant were apart were so few that you ended up with extremely small samples wildly swinging the data.
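
    A rough sketch of the APM idea, assuming a simplified setup where each row is a stint, each column is a player (1 if on the floor), and opponents are ignored. The player values and stints are hypothetical; real APM uses full play-by-play data with both teams in the model.

```python
import numpy as np

players = ["Bryant", "Fisher", "Gasol"]

# Hypothetical design matrix: one row per stint, 1 means the player was on the floor.
# Bryant and Fisher are apart in only the last stint.
X = np.array([
    [1, 1, 1],
    [1, 1, 0],
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
], dtype=float)

# Made-up "true" impacts, used only to generate example margins.
true_impact = np.array([6.0, 1.0, 3.0])
y = X @ true_impact


def apm(X, y):
    """Ordinary least squares: find the player values that best explain the margins."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


clean = apm(X, y)

# Add 4 points of luck to the single stint where Bryant played without Fisher.
y_noisy = y.copy()
y_noisy[-1] += 4.0
noisy = apm(X, y_noisy)

for name, a, b in zip(players, clean, noisy):
    print(f"{name}: clean={a:.2f} noisy={b:.2f}")
```

    In this toy setup, one lucky stint moves Bryant's estimate up by 4 and Fisher's down by 4, which is the "small samples wildly swinging the data" problem in miniature.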

    RAPM attempts to solve this problem by applying ridge regression, which pulls the data toward a prior, or predetermined expectation. RAPM uses a player’s previous seasons as the prior. If a player rates higher or lower than previous seasons would indicate, RAPM assumes that is a fluke that will average out long term, and pulls the rating closer to previous seasons. This unfortunately has the effect of throwing out a lot of relevant data.
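
    Here is a minimal sketch of what the ridge step does, using the textbook closed form with shrinkage toward zero. The stint data and the penalty value `lam` are assumptions for illustration only; this is not the actual RAPM pipeline.

```python
import numpy as np

# Same hypothetical stints as a plain APM setup: columns are Bryant, Fisher, Gasol.
X = np.array([
    [1, 1, 1],
    [1, 1, 0],
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
], dtype=float)

# Made-up margins where the one stint with Bryant but no Fisher was unusually lucky.
y = np.array([10.0, 7.0, 10.0, 7.0, 13.0])


def ols(X, y):
    """Plain least squares (the APM-style fit)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def ridge(X, y, lam):
    """Textbook ridge regression: solve (X'X + lam * I) b = X'y."""
    n_players = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_players), X.T @ y)


ols_fit = ols(X, y)
ridge_fit = ridge(X, y, lam=1.0)  # lam=1.0 is an arbitrary illustrative penalty

print("OLS:  ", np.round(ols_fit, 2))
print("ridge:", np.round(ridge_fit, 2))
```

    The penalty keeps any one rating from running away on thin evidence, so the gap between Bryant and Fisher is much smaller under ridge than under plain least squares in this example.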

    Consider a rookie player. They haven’t played an 82-game season before, haven’t spent a lot of time in the weight room, are playing against grown men much bigger, stronger, and faster than any competition they’ve had before, and have to learn a whole new system. Rookies are usually bad. Now as a sophomore, that player is often substantially better, but RAPM tries to ignore those improvements, because it is skeptical of fluctuations that differ from the prior.

    xRAPM (we’re almost there) is a more aggressive version of RAPM. RAPM uses a prior rating of 0, whereas xRAPM sets the prior rating dynamically for each player. Essentially, given the Bryant/Fisher example, if there is a large positive effect, xRAPM tends towards crediting the player we think is better (Bryant).
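
    One generic way to shrink toward a per-player prior instead of zero is sketched below. To be clear, this is an assumption about the general technique, not the published xRAPM or RPM method, and the prior values, stint data, and `lam` are all made up.

```python
import numpy as np

# Hypothetical stints: columns are Bryant, Fisher, Gasol.
X = np.array([
    [1, 1, 1],
    [1, 1, 0],
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
], dtype=float)
y = np.array([10.0, 7.0, 10.0, 7.0, 13.0])


def ridge_with_prior(X, y, lam, prior):
    """Ridge regression that shrinks toward `prior` instead of toward zero.

    Fit the part of the margin the prior does not explain, then add the prior back.
    """
    n_players = X.shape[1]
    residual = y - X @ prior
    adjustment = np.linalg.solve(X.T @ X + lam * np.eye(n_players), X.T @ residual)
    return prior + adjustment


zero_prior = np.zeros(3)
# Invented prior saying "we already think Bryant is the better player".
star_prior = np.array([5.0, 0.0, 2.0])

shrunk_to_zero = ridge_with_prior(X, y, lam=1.0, prior=zero_prior)
shrunk_to_prior = ridge_with_prior(X, y, lam=1.0, prior=star_prior)

print("prior of 0:     ", np.round(shrunk_to_zero, 2))
print("per-player prior:", np.round(shrunk_to_prior, 2))
```

    With the per-player prior, more of the shared Bryant/Fisher credit lands on Bryant than it does when everyone is pulled toward 0, which matches the behavior described above.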

    So how is RPM different? “RPM reflects enhancements to RAPM by Engelmann, among them the use of Bayesian priors, aging curves, score of the game and extensive out-of-sample testing to improve RPM's predictive accuracy.” What does that mean? I don’t know, because ESPN has not published how to calculate RPM. Presumably, aging curves account for things like the rookie/sophomore example above. Bayesian priors refers to the regression. The rest is gibberish. If someone with more statistical background than me wants to take a crack at it, links are below:

    https://cornerthreehoops.wordpress.c...al-plus-minus/

    https://deadspin.com/just-what-the-h...a-s-1560361469

    https://www.poundingtherock.com/2014...eal-plus-minus

    The math:
    https://squared2020.com/2017/09/18/d...ctory-example/

  2. #2
    Join Date
    Aug 2008
    Location
    Indianapolis, IN
    Posts
    7,465
    2. What do we get wrong about RPM?

    A) The first and most glaring issue with how we use RPM is that it doesn’t tell you what happened. Why? From above: “If a player rates higher or lower than previous seasons would indicate, RAPM assumes that is a fluke that will average out long term, and pulls the rating closer to previous seasons.” RPM uses prior seasons as a template for the current one, and pulls the current data toward what happened in previous years. As a result, except in the case of rookies, RPM never actually shows you what happened THIS YEAR. What RPM is designed to do is predict what’s going to happen. “If my primary goal is to evaluate how well a player did this season, it wouldn’t make a lot of sense to use data from other seasons. However, if I want to predict what will happen in the future, the older numbers can help me differentiate between players who have been consistently good (and will likely keep being good) and players who are merely going through a hot streak (and will likely regress to their mean).”

    Take for example LeBron James. At the time I write this, he has career highs in TS%, 2pt FG%, 3pt FG%, and assist rate, plus the 2nd highest rebounding rate of his career. He is 2nd in Win Shares, 2nd in BPM, 2nd in PER, and 1st in VORP. In RPM he’s 4th. This is likely because LeBron is posting numbers substantially higher than he has in the last 3 years, and because of ridge regression RPM thinks that it’s a fluke and that LeBron will regress to the mean. Thus, his RPM value is lower for this year than what EVERYTHING ELSE suggests. The crux is that RPM should not be used to say, “LeBron James is not in the running for MVP.” RPM never really says that player A has outperformed player B; it says it thinks that player A will outperform player B “IN THE FUTURE”.

    B) The next issue is with how the priors are rated. Remember that RAPM uses a prior rating of 0, meaning that it’s extremely skeptical of new data. RPM uses a variable prior rating which skews the data to try and throw out less. Essentially, we KNOW that Kobe Bryant is better than Derek Fisher, so we’re going to count Kobe for more. Every player has a different prior rating, reflecting how confident RPM is that new data is a fluke. So, how is the rating determined? Who knows. Again, ESPN has not seen fit to publish their calculations, and there is no consensus in economics on how it should be done. For all we know, there’s a keyboard monkey in Bristol who just types in numbers manually based on what he thinks. More likely, there’s some complex calculation that pulls in non-PM boxscore data. Regardless, what we know is that RPM DOES NOT TREAT ALL DATA EQUALLY. By design, RPM biases the data.

    C) Position matters. The average value for centers is different than the average for point guards. RPM is not position adjusted like PER, where an average player at any position is 15.00. Comparing across positions is tenuous.

    D) It doesn’t necessarily work. This is a problem inherent in all PM based statistics, and what every single iteration has tried to correct. Does RPM really differentiate between a player and his teammates? A good example is 2014 Jae Crowder, who ranked 9th among SFs that year. I’m just going to quote this, because I can’t say it any better.

    Jae Crowder is used primarily in a specific role in a certain line-up -- he's the small forward when Carlisle makes his second rotation of the half. Carlisle takes Dirk out early and goes with a small-ball team with Shawn Marion at the four, then he brings Dirk back in with Crowder, Brandan Wright and Devin Harris. That unit absolutely kills opposing second teams and Jae has very little to do with it.

    All the Mavs are asking him to do is pass the ball, don't turn it over and be passable at defense. They run pick-and-rolls with Dirk or Wright and Jae is one of the guys who rotates the ball off the pick-and-roll. He's a league-average shooter who doesn't take many shots. The ones he takes generally tend to be open and he scores 10 points on 44% shooting per-36 minutes. There's just not much going on with the guy.

    Jae is literally out there to give Shawn Marion a breather. Marion plays 31 minutes, splitting those between the three and four. The minutes Marion doesn't play at the three, Carlisle splits between Crowder and Vince Carter. Crowder will get spot minutes in a blow-out or when Marion can't go, but playing at the three when Dirk and Wright are in the game is the only defined role he has on this team.

    Anything that measures the output of the possessions that Jae Crowder is on the floor is only measuring how well the Mavericks are doing as a team in that one role. He spots up and shoots threes at a league average rate and he is a decent defender at the wing position. The Mavs can't give the line-up he is in more minutes because a team that played Wright and Dirk together for more than 15+ minutes would not have enough rebounding or interior defense.

    The only reason Dallas can get away with using that line-up -- the one that makes Crowder an effective player -- is because Carlisle uses it for four to five minute stretches against second units. He's putting Dirk and Wright, two of the most efficient offensive players in the NBA, against what are usually straight up terrible defenders. Backup fours and fives have pretty much zero chance of guarding Dirk or Wright. Just as important, they can't exploit their "defense" at those positions.
    https://www.mavsmoneyball.com/2014/4...plus-minus-rpm

  3. #3
    Join Date
    Sep 2013
    Posts
    17,015
    Props for your effort. Will read tomorrow. I think there should be a basketball guide involving statistics. Sticky it and have it explained so a ten year old could understand. Who's up to contribute?

  4. #4
    Join Date
    Sep 2013
    Posts
    17,015

    NBA Advanced Statistics FAQ/Guide:

    It's no secret that analytics have been vital in the NBA's progression over these past few years. This is going to be a thread that I hope can spread some knowledge about the advanced statistics being used. This isn't going to be a debate but purely a knowledge-based thread where we can all contribute. Please do not turn this thread into a LeBron vs Curry or Jordan vs LeBron thread UNLESS you are using it as context to explain the advanced statistic. At the moment, I am still finding posters who are interested. However, if you are interested, just post below and I'll organize the list.

    Requirements:
    1) Try to explain it in a way that a casual fan would understand. Many of us began as casual fans and aren't basketball scouts. We don't need a Harvard-translated definition.

    2) Use examples if it helps but do NOT turn this into a debate thread.

    3) Feel free to correct or add to a definition respectfully. Some of us know more than others. That's perfectly fine. We all have something worthy to contribute at the end of the day.

    4) You can use Basketball-Reference but I think we would all prefer some originality in your example.

    Some advanced statistics you could work on:
    PER, VORP, BPM, RPM, TS%, EFG%, USG%, WS48, WS TOTAL.

  5. #5
    Join Date
    Oct 2007
    Posts
    17,277
    Thanks Indy!

    Can the mods please sticky this?

    We can use this thread as a reference.
    “It’s about winning,” Stoudemire said. “You win, you’re going to get on national TV. Simple. In Phoenix, we won — Western Conference finals three, four years, playoffs every year. We won. If you don’t win, nobody really wants to see you.”

  6. #6
    Join Date
    Sep 2013
    Posts
    17,015
    PER: Measures a player's productivity on a per-minute basis, taking into account both positive and negative contributions. It heavily favors offensive contributions over defensive ones, and it adjusts for pace. However, it is not a ranking of how good a player is. We've seen players with very few minutes post a high PER (Hassan Whiteside, 2014-2015), and we've seen players such as Draymond Green (PER of 16.5 in 2016-2017) post only a slightly above-average PER. For reference, league-average PER is set at 15.

    TS%:

    EFG%:

    FtR%:


    WS Total:

    WS48:

    VORP:

    BPM:

    RPM:
    Last edited by FlashBolt; 12-07-2017 at 12:20 AM.

  7. #7
    Join Date
    Sep 2013
    Posts
    17,015
    Quote Originally Posted by aman_13 View Post
    Thanks Indy!

    Can the mods please sticky this?

    We can use this thread as a reference.
    I'm making an official thread where we can all add and contribute to an overall glossary of advanced statistics. I'll add this to it if Indy Realist permits.

  8. #8
    Join Date
    Jun 2009
    Posts
    8,248
    I was spat on, pissed on, crucified for saying the exact same thing about PM stats. I clearly defined how there were so many variables (coaching schemes, rotations, player roles) that were completely unrelated to the player that would make PM stats unreliable as a sole measurement.

    Use it IN CONJUNCTION with a multitude of other stats - not by itself or you'll be fooled into thinking the George Hills, Kyle Lowrys, and Dellavedovas are better than Kyrie. Funny how Kyrie's advanced stats did a massive 180 when switching to a different team with different schemes, players, coaching, and culture!

    Extreme but accurate example: I would have higher PM stats if I played on a team with 4th graders compared to Kemba Walker playing on an Olympic team. Does that mean I'm a better/superior player than Kemba? No.

    I like RPM but it is used SOOO incorrectly and it should NEVER be used as the SOLE and ONLY stat to declare one player as being superior or more impactful or more conducive to winning than another.

    /mic drop

  9. #9
    Join Date
    Sep 2013
    Posts
    17,015
    All advanced stats are similar in that way. Never understood why someone would type:

    Player X is better because he has a better PER or VORP than player Y. Therefore, I have a bigger penis.

    I've actually given up on using advanced statistics. I've literally enjoyed the game more just watching them play and seeing how they can actually contribute to a championship team. At the end of the day, it's about who can contribute to winning the most and some of these players generate nice numbers and stats but end up being losers. I've watched enough LeBron to admit that it's not even worth looking into advanced statistics the majority of the time. Just watch them play! I just watched RWB put up a historic individual season and after he got that triple double, it was empty as hell. No one really cared anymore. Even today, who cares? It's the past. We're losing and that's the biggest part of the game. Blake put up nice stats for years. He's, excuse my harshness, a total loser of a player. Draymond defies all advanced stats but makes winning plays. Give me that guy.

  10. #10
    Join Date
    Aug 2008
    Location
    Indianapolis, IN
    Posts
    7,465
    Quote Originally Posted by FlashBolt View Post
    I'm making an official thread where we can all add and contribute to an overall glossary of advanced statistics. I'll add this to it if Indy Realist permits.
    Go for it!

  11. #11
    Join Date
    Jul 2005
    Location
    parts unknown
    Posts
    31,872
    Thanks Indy. that was a good read

    Quote Originally Posted by Raps08-09 Champ View Post
    My dick is named 'Ewing'.

  12. #12
    Join Date
    Oct 2005
    Posts
    27,082
    In basketball, true shooting percentage is an APBRmetrics statistic that measures a player's efficiency at shooting the ball. It is intended to more accurately calculate a player's shooting than field goal percentage, free throw percentage, and three-point field goal percentage taken individually.

    TS% should always be used instead of FG%. Simple reason why:

    Dwight Howard shoots 6 for 10 from the field.
    Steph Curry shoots 4 for 10 from the field.

    If you just compare FG%, it would appear that Dwight Howard is the more efficient offensive player. However, let's add more context.

    Dwight Howard shoots 6 for 10 from the field. These shots are all 2pt attempts. He also makes 4 out of 8 from the free throw line. Dwight Howard has scored 16 points.

    Steph Curry shoots 4 for 10 from the field. These shots are all 3pt attempts. He also makes 7 out of 8 from the free throw line. Steph Curry has scored 19 points.

    They have both taken exactly the same number of field goal attempts and free throw attempts. Steph Curry's TS% would be higher because it gives him added value for shooting 3's and for shooting a higher percentage from the free throw line. Those are the two important aspects of basketball that true shooting percentage successfully incorporates and field goal percentage does not.
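
    To put numbers on the example, here is a quick sketch using the commonly cited formula, TS% = PTS / (2 * (FGA + 0.44 * FTA)). The formula and its 0.44 free throw factor are not spelled out in the post above, so treat them as the standard convention rather than something specific to this thread.

```python
def field_goal_pct(fgm, fga):
    """Plain FG%: made field goals divided by attempts."""
    return fgm / fga


def true_shooting_pct(points, fga, fta):
    """TS% using the common 0.44 weight on free throw attempts."""
    return points / (2 * (fga + 0.44 * fta))


# Stat lines from the example above.
howard_fg = field_goal_pct(6, 10)
curry_fg = field_goal_pct(4, 10)

howard_ts = true_shooting_pct(points=16, fga=10, fta=8)
curry_ts = true_shooting_pct(points=19, fga=10, fta=8)

print(f"Howard: FG% {howard_fg:.3f}, TS% {howard_ts:.3f}")
print(f"Curry:  FG% {curry_fg:.3f}, TS% {curry_ts:.3f}")
```

    Howard wins on FG% (.600 to .400), but Curry comes out ahead on TS% (roughly .703 to .592) because of the 3's and the better free throw shooting.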


    Kristaps Porzingis
    Stronger than most 15 year old girls.

  13. #13
    Join Date
    Sep 2006
    Posts
    21,998
    Good reading Indy. I think people always look for a single number that is "the answer".
    Last edited by Scoots; 12-07-2017 at 12:40 PM.

  14. #14
    Join Date
    Sep 2006
    Posts
    21,998
    Quote Originally Posted by KnicksorBust View Post
    TS% should always be used instead of FG%.
    No. TS% is just another data point, it's not a replacement for FG%. TS% has issues too in that it undervalues great FT% and overvalues bad FT%. It's not useless but it's not a replacement for basic data either.

    PPP is probably a more useful stat than TS% for figuring out a player's offensive efficiency, but like all stats PPP has its own set of caveats.
    Last edited by Scoots; 12-07-2017 at 12:59 PM.

  15. #15
    Join Date
    Aug 2011
    Location
    Swinging from Bruce Bochy's sack
    Posts
    48,377
    So this is a place to copy paste advanced stat definitions from basketball-reference?