
Official Competitive Character Impressions 2.0


Kokiden

Smash Ace
Joined
Apr 24, 2019
Messages
601

:]
Very interesting data.

Wow I did not expect Joker's win rate to be that low. As for Bayonetta, that would explain why she's not getting any buffs.

As for Richter, yeah he's a menace to deal with online so I can't say I'm surprised.
 

StrangeKitten

Smash Lord
Joined
Mar 25, 2020
Messages
1,109
Location
Pokemon Stadium 2
Very interesting data.

Wow I did not expect Joker's win rate to be that low. As for Bayonetta, that would explain why she's not getting any buffs.

As for Richter, yeah he's a menace to deal with online so I can't say I'm surprised.
I've heard Joker's pretty bad online. At the very least, he's like Pika and Fox where he heavily appreciates offline's faster pace to extend his combos on reaction. That gets heavily nerfed online.
 

Thinkaman

Moderator
Moderator
Joined
Aug 26, 2007
Messages
6,324
Location
Madison, WI
NNID
Thinkaman
3DS FC
1504-5749-3616
As I've said over and over and over, win rates aren't that telling of a statistic. They are fun trivia, but I've never heard of any serious esports live design team using win-rate as a primary guiding star.

It's confounded by everything you can think of. This data is proof, I mean we all agree that for "our" purposes this is basically garbage, right? Show of hands who wants the next patch based on this list?

On the other hand, raw usage is extremely useful, or at least as much as any single statistic over a broad-yet-ambiguous unevenly-sampled population can be.

Happy New Year, I plotted it for you:

smash_gg_data_2021_0.png


Oh look, it actually correlates with OrionRank, just like the old usage numbers did. Spoilers: It also correlates with the averaged PGR opinion data that they collected. Whereas win-rate doesn't correlate with anything. I'm not even going to bother showing you the plot, because it's literally just noise with a perfectly flat "trendline."

It kills me that this is the best usage data we have, a fantastic achievement in data collection, and all anyone talks about is useless win-rates that mean nothing.
 

Nobie

Smash Champion
Joined
Sep 27, 2002
Messages
2,156
NNID
SDShamshel
3DS FC
2809-8958-8223
As I've said over and over and over, win rates aren't that telling of a statistic. They are fun trivia, but I've never heard of any serious esports live design team using win-rate as a primary guiding star.

It's confounded by everything you can think of. This data is proof, I mean we all agree that for "our" purposes this is basically garbage, right? Show of hands who wants the next patch based on this list?

On the other hand, raw usage is extremely useful, or at least as much as any single statistic over a broad-yet-ambiguous unevenly-sampled population can be.

Happy New Year, I plotted it for you:

View attachment 297942

Oh look, it actually correlates with OrionRank, just like the old usage numbers did. Spoilers: It also correlates with the averaged PGR opinion data that they collected. Whereas win-rate doesn't correlate with anything. I'm not even going to bother showing you the plot, because it's literally just noise with a perfectly flat "trendline."

It kills me that this is the best usage data we have, a fantastic achievement in data collection, and all anyone talks about is useless win-rates that mean nothing.
I'm no statistician, but when I look at the characters with the highest usage rates, I see the best characters in each rough archetype and/or the ones with the most reliable gimmicks. For example, I think most agree that Bowser is not a top tier, but very few would disagree that he's the best of the superheavies. Ness back throw never goes out of style. Cloud is arguably the strongest swordsman in a purely netplay setting, and limit gives him the ability to just throw out even more incredibly safe kill moves.
 

Diddy Kong

Smash Obsessed
Joined
Dec 8, 2004
Messages
24,574
Switch FC
SW-1597-979602774

:]
Diddy scores surprisingly high for a character so notoriously considered bad on WiFi. Diddy requires enormous precision, and that's often a bad call on WiFi.

Sheik and Marth being considered so weak online is not a big surprise however.

The Belmonts are also a hell to fight online so them placing high is also no surprise.
 

blackghost

Smash Lord
Joined
Jul 9, 2015
Messages
1,925
As I've said over and over and over, win rates aren't that telling of a statistic. They are fun trivia, but I've never heard of any serious esports live design team using win-rate as a primary guiding star.

It's confounded by everything you can think of. This data is proof, I mean we all agree that for "our" purposes this is basically garbage, right? Show of hands who wants the next patch based on this list?

On the other hand, raw usage is extremely useful, or at least as much as any single statistic over a broad-yet-ambiguous unevenly-sampled population can be.

Happy New Year, I plotted it for you:

View attachment 297942

Oh look, it actually correlates with OrionRank, just like the old usage numbers did. Spoilers: It also correlates with the averaged PGR opinion data that they collected. Whereas win-rate doesn't correlate with anything. I'm not even going to bother showing you the plot, because it's literally just noise with a perfectly flat "trendline."

It kills me that this is the best usage data we have, a fantastic achievement in data collection, and all anyone talks about is useless win-rates that mean nothing.
Win rate is heavily used for card game balance. For example, when Demon Hunter launched in Hearthstone it had a 60 percent win rate and had to be nerfed.
If a character in Smash was in the top 5 in usage AND had north of a 55 percent win rate, I think you would see that character get nerfed. That is likely what happened with Mac at Smash 4's launch.

Most esports in both the fgc and others use win rate data but they also have people that just actively play test. Given that Nintendo has actively overnerfed a character just to appease a crowd reaction, I'd guess and wager they aren't using better approaches on what characters should get nerfed or buffed.
 

Thinkaman

Moderator
Moderator
Joined
Aug 26, 2007
Messages
6,324
Location
Madison, WI
NNID
Thinkaman
3DS FC
1504-5749-3616
Win-rates only make sense if your players are less elastic and you have voluminous data.

For example, win-rate in StarCraft is reasonably useful:
  • It's extremely rare for players (at any level) to switch playable factions.
  • Each faction is such a huge slice of the player population that most confounding population factors are controlled for. (It is unlikely that the entire 3_% of Zerg players are of a specific personality type predisposed to be better or worse at the game than the 3_% of Terran or Protoss players.)
  • There is without a doubt sufficient matchup data to drive even narrow conclusions; each SC matchup has (relatively speaking) about 8x the data of the median entire character in Smash Ultimate.
These factors hold true for most games when addressing stuff at lower levels of play. Riot Games has explicitly codified their balance goals around this line of thinking: They use win-rate for lower levels of play and usage (including draft ban-rate) for the big boy leagues. (Top 0.1% originally)


I want to remind everyone exactly what type of data this is. It's not raw win-rates or usage. It's total usage in tourney sets with registered characters on smash.gg.

If we accept some regional bias, we can probably safely assume that this is very similar to total usage in tourney sets. Then we are still left with:
  • Win (and loss) statistics will be exaggerated, since you are measuring total games/wins in an elimination-based format.
    • This also means that the average player skill represented is considerably higher than the median; players who go 2-2 are represented twice as much as those who go 0-2.
  • Switching to Bo5 farther in the bracket, as well as running Grand Finals in general, provides an identical but smaller extra magnification. (Yet more games/wins for the winners)
  • It captures secondaries in a lopsided-context. Characters played more frequently as a secondary will post more futile-character-switch losing games. Character mains more likely to be played by those willing to switch to a secondary will have correspondingly fewer losses as well.
  • More volatile characters have a slight usage penalty, as it's easier for them to underperform than overperform in elimination-based formats.
Contrast with say, if all of these events were round-robin or a non-elimination based format. Think how the data would be different--all of those factors except for the bit involving secondaries would go away.

Now assume it's a Bo1 endless ladder, like most online esports. No more secondary confounding bias either. (Games with "secondaries" would be against fresh opponents.) Then you would have traditional win-rate statistics for your sample.
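For anyone who wants to see the elimination weighting in numbers, here is a minimal sketch. The eight (wins, losses) records are a hand-built example of one 8-entrant double-elimination bracket with no bracket reset, not anything taken from the smash.gg data, and `RECORDS` and `appearances` are names made up for the illustration. It counts sets rather than games, so the Bo5 magnification is not modeled.

```python
# (wins, losses) in sets for each entrant of one hypothetical 8-player
# double-elimination bracket without a bracket reset. Hand-built example,
# listed from best finish to worst.
RECORDS = [(4, 0), (3, 2), (3, 2), (2, 2), (1, 2), (1, 2), (0, 2), (0, 2)]


def appearances(record):
    """Number of sets in which a player with this record shows up."""
    wins, losses = record
    return wins + losses


# Each set involves two players, so this total is twice the number of sets.
total = sum(appearances(r) for r in RECORDS)

# The first four entries are the top-half finishers.
top_half_share = sum(appearances(r) for r in RECORDS[:4]) / total

# How much more often a 2-2 player appears than a 0-2 player.
ratio_2_2_vs_0_2 = appearances((2, 2)) / appearances((0, 2))

print(f"player-set appearances: {total}")
print(f"share from top-half finishers: {top_half_share:.0%}")
print(f"2-2 player vs 0-2 player: {ratio_2_2_vs_0_2:.0f}x")
```

In this toy bracket the top four finishers account for 18 of the 28 player-set appearances, and the 2-2 player appears in twice as many sets as each 0-2 player. A round-robin would give every entrant the same number of appearances.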


But would that even be valuable for this level of play? This data set is about 2010k total games, or just over 1 mil total sets. If all or almost all were double elim (roughly two sets per entrant), that's 500k entrants. Most players only enter a few tournaments, but then some enter 100. My guesstimate would be 500k entrants over 2+ years boils down to somewhere around 40k players. That would be:
  • <0.2% of the copies of the game sold.
  • ~0.45% of people who have played online.
  • ~5.4% of the population of Elite Smash.
And remember, the average level of play within this sample is skewed due to elimination; it's not evenly representative of the 40k players represented. It's safe to say that the average play level of this sample exceeds the top 0.2% of general online play.
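The back-of-envelope numbers above can be reproduced in a few lines. This is only a sketch: the 40k player count is the guesstimate from the post, the three percentages are the ones quoted in the list, and the population sizes it prints are simply what those percentages imply, not official figures. All variable names are made up for the example.

```python
TOTAL_SETS = 1_000_000   # "just over 1 mil total sets"
SETS_PER_ENTRANT = 2     # assumption: double elim, about two sets per entrant
PLAYERS = 40_000         # the post's guesstimate of unique players

entrants = TOTAL_SETS // SETS_PER_ENTRANT
entries_per_player = entrants / PLAYERS

# Shares quoted in the post, as fractions of each population.
QUOTED_SHARES = {
    "copies sold": 0.002,      # "<0.2%", so the implied size is a lower bound
    "played online": 0.0045,   # "~0.45%"
    "Elite Smash": 0.054,      # "~5.4%"
}

# Population size each quoted share implies for 40k players.
implied_population = {
    name: round(PLAYERS / share) for name, share in QUOTED_SHARES.items()
}

print(f"entrants: {entrants:,}")
print(f"average tournament entries per player: {entries_per_player}")
for name, size in implied_population.items():
    print(f"{name}: about {size:,}")
```

Changing `PLAYERS` shows how sensitive the percentages are to that one guess: halving it halves every share.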

It's a pretty small sample in that lens. Even ignoring DLC and combining echoes, we still have more than 10 characters with less than 20,000 games lifetime. And it spans online and offline games, spans multiple patches, with entire characters released at different times.


So is this data useless? NO! I'm just trying to get you to understand the limitations. So we can appreciate the real value.

Look here:


smash_gg_data_2021_1.png


That is change in usage from the original dataset (about 1050k games) to the games since then. (Around 2010k games total, a nice even doubling.) We can see who was played more (and less) in the "second half" of Ultimate's competitive life thus-far vs. the first.

Please note that DLCs should, in a vacuum, have all improved. Their usage was handicapped in the original data, since they were only out for some of the time. So:
  • Terry at +67% is really impressive, but in truth much less meaningful than those just below him.
  • Byleth at only +20% should actually be read as lackluster. Same with Hero at -1%.
  • Joker at -17% should probably be read as more like -23% or so.
Technically, the typical characters should be down a couple %, because the roster has grown. This is not controlled for or otherwise reflected in the numbers shown.

The loudest trend this chart shows is "which buffs worked." It also suggests, pretty loudly, that the balance changes both had a measurable and clear change on player behavior in this sample, and that their magnitude heavily outweighed any gradual trend to the contrary of people slowly migrating to higher-tier characters.

(Keep in mind that all of these deltas are relative to previous usage. Sure, Joker plummeted over -17%, but he was also the #1 used character in the game. (He has "plummeted" to a mere #3, the horror.) On the other hand, Olimar has "surged" over +15%, but he was (and still is!) the least-played character in the game. Joker is still being played over 10x as much as Olimar.)
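To make the "relative to previous usage" point concrete, here is a small sketch. The usage shares are invented purely for illustration (they are not the chart's real values), and `relative_change`, `top_pick`, and `least_played` are hypothetical names.

```python
def relative_change(old_share, new_share):
    """Change in usage share, expressed relative to the old share."""
    return (new_share - old_share) / old_share


# Hypothetical (old share, new share) pairs, NOT real smash.gg numbers.
usage = {
    "top_pick": (0.036, 0.030),        # a heavily played character that dropped
    "least_played": (0.0025, 0.0029),  # a rarely played character that rose
}

for name, (old, new) in usage.items():
    print(f"{name}: {relative_change(old, new):+.1%} relative, {new:.2%} of games now")

# Ratio of current usage: the character that "plummeted" vs the one that "surged".
ratio = usage["top_pick"][1] / usage["least_played"][1]
print(f"top_pick is still played {ratio:.1f}x as much as least_played")
```

A large negative delta on a big share and a large positive delta on a tiny share can coexist with the first character still being played about ten times as often.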

Also note, we are on the cusp of all characters being under 3% usage under the direct terms of this data set. (If you count one way Ness is a few games over, while if you count losses as a better proxy for total players Joker is the only one over.) Exactly half the cast is used 1%-3% of the time, exactly half the cast less than 1%. STDEV is down to 0.75%, excluding Steve and Sephiroth.
 

Hydreigonfan01

Smash Lord
Joined
Aug 24, 2018
Messages
1,296
Reddit's new tier list. Would just like to point out that r/smashbros is pretty big on the competitive scene.
g5xh8y3.png

My personal opinion is that Pika is too high; Lucina should be lower than Roy; Bowser, Ike, Pichu, Rosa and Sheik are not high-mid tier; and Toon Link is way too low compared to the other Links. Ganon should also be lower than Little Mac.
 

Rizen

Smash Legend
Joined
May 7, 2009
Messages
14,327
Location
Colorado
Reddit's new tier list. Would just like to point out that r/smashbros is pretty big on the competitive scene.
View attachment 297985
My personal opinion is that I think Pika is too high, Lucina should be lower than Roy, Bowser,Ike Pichu, Rosa and Sheik are not high-mid tier and Toon Link is way too low compared to the other Links. Ganon should also be lower than Little Mac.
:ultrob:'s the new :ultpalutena: in the sense that he's getting results everywhere: #1 on Orion rank, several players popping up in top 8s all over including Grayson, Wadi, Benny and the Jets, Epic Gaberial and less so 8bitman. He has a ridiculous kit that can steal stocks at 80% with offstage rotor arms, but everyone's still saying "no he's not top tier but rather a gatekeeper". Then a ROB beats MKLeo and everyone starts saying he's one of the best in the game. Pika wishes he had those kinds of results.

How many roads must a man walk down before you call him a man? How many results must ROB get before you call him top tier?
 

Hydreigonfan01

Smash Lord
Joined
Aug 24, 2018
Messages
1,296
My personal belief with :ultpikachu: is that he's in a similar vein to :ultshulk:, and I have the opinion that I think Shulk is about the 6th best character in the game. He's not nearly as bad as his lack of results suggests and he's a top tier for sure, but he's not top 1 until someone picks up Pikachu and shows that their theorycrafting isn't just "with perfect play, Pika wins nearly every matchup". I was hoping Zackray was going to pick up Pika after ESAM's request (Especially because Zackray had a Pika at Squad Strike at Smash Summit 2), which may have been a showcase of what the character can really do, but that doesn't seem to have happened.

I also wonder whether :ultrob: would have more people believing the character is top 10 if Zackray solo-mained the character, but because he uses Joker to cover ROB's bad matchups people don't think the character is good enough to be top 10 yet.
 

Frihetsanka

Smash Lord
Joined
Apr 26, 2016
Messages
1,808
Location
Sweden
I think ROB is currently overrated by most people.

According to the Tier List Tier List (see: https://youtu.be/suQ0kohHg9E), tiers are decided in the following order: Weighted Matchups > Matchups > Theorycrafting/Observation > Results. I'm inclined to agree with it: Ultimately, the main factor of how good a character is is their weighted MU spread.

One could argue that some players underestimate their character. That is plausible, although it seems somewhat unlikely. I would be interested in seeing what proponents of ROB as a top 15 (or even top 10) character think his MU spread would look like. Let's see what some notable ROB mains think:


We see that ROB mains seem to be rather pessimistic, that is not the MU spread of a top 15 character. As such, we can draw one of two conclusions: Either ROB is not top 15, or these ROB players are wrong.

What about non-ROB players? We can look at Mr. Game & Watch and see that they also seem to believe that ROB loses hard: https://twitter.com/FridoSSB/status/1277199712949538816

And one exception where ROB only loses slightly:


Pikachu players also seem to think Pikachu is bad for ROB:


It is possible (although not probable) that they are all incorrect and that ROB's MU chart actually is good enough to warrant him being top 15. It does not seem very plausible to me, and even if we skip the Weighted Matchups and Matchups part and go to Theorycrafting/Observation I don't think ROB has what it takes to be top 15.

Results do matter to some extent, but less than Theorycrafting/Observation, and much less than Weighted Matchups. I suspect that in a few years ROB will have fallen off quite a bit and his results won't last.

As for the tier list, it's a Reddit list. Greninja slept on, I'm not entirely sure why but it is what it is. Sonic is underrated, Young Link is (based on what I've seen) overrated, Sheik and Corrin are both very underrated (I would put both in top 20 probably), Jigglypuff seems underrated (perhaps understandably so since she got buffed over time in Ultimate and was quite bad in 4). It's not the worst list ever but it's fairly flawed.


I also wonder for :ultrob: their would have more people believing the character is top 10 if Zackray solo-mained the character[...]
The meta in Japan is a bit more kind to ROB since there aren't many top level Pikachu mains or G&W mains (Zackray himself is probably one of the best G&Ws in Japan). Having probable -2 MUs to two top tiers is rough if you want to solo main a character. By no means impossible to win, but still quite rough.
 

RonNewcomb

Smash Journeyman
Joined
Nov 29, 2014
Messages
377
Reddit's new tier list. Would just like to point out that r/smashbros is pretty big on the competitive scene.
The pics used in the list are top tier. : chef's kiss : (List itself is idk idc.)

How many roads must a man walk down before you call him a man? How many results must ROB get before you call him top tier?
A Like for a LOL, but Rob clearly has some very bad MUs, and characters with clearly bad MUs will always be vulnerable to counter-picking. Hence, never top-tier as I see the term used. AFAIK trying to counterpick a toptier instead of using your main is a dicey proposition at best. As we've discussed before, a low tier char discovering a great MU vs a top-tier doesn't usually cause the low tier to rise much; it just drags down the top-tier. And ROB has more than one bad MU of notable severity.
My personal belief with :ultpikachu: is that he's in a similar vein to :ultshulk:,
I've asked before why Shulk jumped to theoretical toptier and never received a definitive answer, so I'll propose one and people can yell at me when I get it wrong. :)

1) Monado dial instead of pressing a button eighty bagillion times to select an art is a big "framedata-like" buff all on its own.
2) Switching during hitstun is a buff, whether intentional or not. But it mainly just reinforces point #1 which allows getting what Art you want, when you want it.
3) General gamespeed increase!!

The first two allow Shulk to actually use an Art when it's needed instead of having to plan ahead so much. The gamespeed increase from Smash4 means that all his aerials, and all his approaches in general, are a lot less reactable. Being less reactable in what he does means that range on his sword actually means something. As in, something other than, "what can my character whiff-punish him with?" Now you have to respect him even from a few steps outside his range.
 

RonNewcomb

Smash Journeyman
Joined
Nov 29, 2014
Messages
377
Matchup charts are theorycraft.
Disagree. MU charts are backward-looking; theorycraft is future-looking.

Also: Back in my day we didn't have statistics. You just played CE Bison vs CE Guile until you felt your options dwindle to nothing, and called it. :)
 

Thinkaman

Moderator
Moderator
Joined
Aug 26, 2007
Messages
6,324
Location
Madison, WI
NNID
Thinkaman
3DS FC
1504-5749-3616
Disagree. MU charts are backward-looking; theorycraft is future-looking.
What are all these losing Pikachu matches everyone is looking back on? How did Das Koopa and all the results guys miss them all?

Results are backwards-looking, theorycraft is forward-looking, "matchup charts" are opinions and anecdotes. Nothing less, nothing more.
 

Rizen

Smash Legend
Joined
May 7, 2009
Messages
14,327
Location
Colorado
I think ROB is currently overrated by most people.

According to the Tier List Tier List (see: https://youtu.be/suQ0kohHg9E), tiers are decided in the following order: Weighted Matchups > Matchups > Theorycrafting/Observation > Results. I'm inclined to agree with it: Ultimately, the main factor of how good a character is is their weighted MU spread.

One could argue that some players underestimate their character. That is plausible, although it seems somewhat unlikely. I would be interested in seeing what proponents of ROB as a top 15 (or even top 10) character think his MU spread would look like. Let's see what some notable ROB mains think:


We see that ROB mains seem to be rather pessimistic, that is not the MU spread of a top 15 character. As such, we can draw one of two conclusions: Either ROB is not top 15, or these ROB players are wrong.

What about non-ROB players? We can look at Mr. Game & Watch and see that they also seem to believe that ROB loses hard: https://twitter.com/FridoSSB/status/1277199712949538816

And one exception where ROB only loses slightly:


Pikachu players also seem to think Pikachu is bad for ROB:


It is possible (although not probable) that they are all incorrect and that ROB's MU chart actually is good enough to warrant him being top 15. It does not seem very plausible to me, and even if we skip the Weighted Matchups and Matchups part and go to Theorycrafting/Observation I don't think ROB has what it takes to be top 15.

Results do matter to some extent, but less than Theorycrafting/Observation, and much less than Weighted Matchups. I suspect that in a few years ROB will have fallen off quite a bit and his results won't last.

As for the tier list, it's a Reddit list. Greninja slept on, I'm not entirely sure why but it is what it is. Sonic is underrated, Young Link is (based on what I've seen) overrated, Sheik and Corrin are both very underrated (I would put both in top 20 probably), Jigglypuff seems underrated (perhaps understandably so since she got buffed over time in Ultimate and was quite bad in 4). It's not the worst list ever but it's fairly flawed.


The meta in Japan is a bit more kind to ROB since there aren't many top level Pikachu mains or G&W mains (Zackray himself is probably one of the best G&Ws in Japan). Having probable -2 MUs to two top tiers is rough if you want to solo main a character. By no means impossible to win, but still quite rough.
That's a good case; however, I disagree with it. The problem with this is that you're treating MUs as if they aren't theory. And they are. You can say ROB doesn't have a top 15 MU spread all you want but at the end of the day he's winning tournaments more than anyone else. Look at it scientifically: you have a hypothesis about ROB's placement, MUs etc. so you gather data. You test your hypothesis. Smash isn't an exact science with factors like skill and popularity but you can still get a pretty good idea of how viable a character is by how well they do. The more evidence the stronger the case is and ROB has the most evidence of any character. There's no such thing as an air tight case in smash but I find it very unlikely that so much data is the product of environmental errors. If ROB's not a top 15 character then why does he win so much?
Second, people are generally very bad at correctly assessing data. It's human nature. Take St. Elmo's fire: pointed objects like ship masts glowing during thunderstorms. For centuries sailors and others came up with all manner of hypotheses for this phenomenon, things like favorable omens or even signs that god would destroy an opposing army. But now, with scientific methods and better understanding, we know it's "a form of plasma. The electric field around the object in question causes ionization of the air molecules, producing a faint glow easily visible in low-light conditions." This is why I value results; results are the raw data. At first I thought Joker was mid tier. Then I got more data both from playing vs him in tournaments and results of players like MKLeo. I amended my theory and now consider him top tier.
 

Thinkaman

Moderator
Moderator
Joined
Aug 26, 2007
Messages
6,324
Location
Madison, WI
NNID
Thinkaman
3DS FC
1504-5749-3616
Different topic: Stage performance metrics are a fantastic case study of Simpson's Paradox.

You can find tons of wacky, totally "backwards" results, so many I won't bother linking them all. Mac does worse on FD than most stages, and better on not just Battlefield but Kalos. Mario does poor on Yoshi's Story. Lucario apparently hates big stages like Kalos, and does better on small stages.

What is going on??? Is literally all of our stage knowledge backwards?

No, it's Simpson's Paradox. There is a confounding lurking variable that "reverses" the data. Can you figure out what it is?

Our stage selection process means you play on your preferred stages when you are losing, and your less-preferred stages when you are winning. You mostly play your own counterpicks against people better than you, and never against people significantly worse.
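Here is a toy illustration of that reversal. All counts are made up to show the mechanism and are not drawn from any real stage data; the stage labels, the two opponent groups, and the `win_rate` helper are assumptions for the sketch. Within each opponent group the character does better on the preferred stage, but because the preferred stage is mostly played against stronger opponents, the pooled numbers flip.

```python
# (stage, opponent strength) -> (wins, games). Invented counts: the
# preferred stage is mostly seen against stronger opponents, because
# that is when the player is losing and gets to counterpick.
DATA = {
    ("preferred", "stronger"): (30, 100),
    ("preferred", "weaker"): (16, 20),
    ("other", "stronger"): (4, 20),
    ("other", "weaker"): (70, 100),
}


def win_rate(stage, opponent=None):
    """Win rate on a stage, optionally restricted to one opponent group."""
    wins = games = 0
    for (s, o), (w, g) in DATA.items():
        if s == stage and (opponent is None or o == opponent):
            wins += w
            games += g
    return wins / games


for opponent in ("stronger", "weaker"):
    print(
        f"vs {opponent}: preferred {win_rate('preferred', opponent):.0%}, "
        f"other {win_rate('other', opponent):.0%}"
    )
print(f"pooled: preferred {win_rate('preferred'):.0%}, other {win_rate('other'):.0%}")
```

Controlling for who the opponent was recovers the expected direction; pooling everything makes the preferred stage look like the worse one.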
 

Kokiden

Smash Ace
Joined
Apr 24, 2019
Messages
601
In regards to the win rate discussion earlier, didn't Sakurai refer to the online win rates in an interview to discuss the game balance? It was between 45-55% average and he thinks the game is in a good place generally speaking (I don't agree with his assessment but... nothing I, or anyone else, can do...)

I'm not saying it's the be all, end all, when it comes to game balance, but I think it may have a hand in who gets nerfed or buffed. That's all, really.

I'm not too well invested into the smash scene like I used to be, so I'm missing out on some statistics and data charts, but just saying is all.
 

Thinkaman

Moderator
Moderator
Joined
Aug 26, 2007
Messages
6,324
Location
Madison, WI
NNID
Thinkaman
3DS FC
1504-5749-3616
In regards to the win rate discussion earlier, didn't Sakurai refer to the online win rates in an interview to discuss the game balance? It was between 45-55% average and he thinks the game is in a good place generally speaking (I don't agree with his assessment but... nothing I, or anyone else, can do...)

I'm not saying it's the be all, end all, when it comes to game balance, but I think it may have a hand in who gets nerfed or buffed. That's all, really.

I'm not too well invested into the smash scene like I used to be, so I'm missing out on some statistics and data charts, but just saying is all.
It seems likely that, like Riot does with League, they used win-rates to make early decisions about low-level play. This is surely what guided the early (2.0) nerfs to K. Rool, Inkling, etc., as well as the early Smash 4 nerfs to Little Mac, DDD, and Greninja.

In both games, they then seemed to switch immediately to only caring about high level play after the first major patch. The closest changes we've gotten to lower-level play focus was the targeted buffs to Ness/Sonic, and maybe Cloud/Young Link?

Whatever the case, it's very obvious that the win-rate data we are seeing is not guiding balance decisions at all, for which we should all be thankful.
 

RonNewcomb

Smash Journeyman
Joined
Nov 29, 2014
Messages
377
You can find tons of wacky, totally "backwards" results, so many I won't bother linking them all. Mac does worse on FD than most stages, and better on not just Battlefield but Kalos.
My first thought: Kalos improves Mac's weakest attribute, his recovery, via both distance and mixup. Also, camping and "lame play" is harder than is given credit for.

Results are backwards-looking, theorycraft is forward-looking, "matchup charts" are opinions and anecdotes. Nothing less, nothing more.
Hard disagree. MUs are based on tool interactions, not tourney results or theory. If you want to get an idea of a particular MU, you don't watch tourney matches or online play and you don't chatter in forums. You take a friend to Training Mode and do drills. "I'm going to recover from this point via strategies like X, Y, or Z. You try to stop me using R, S, T or anything else you can think of. After a half-hour or so, we'll repeat with my character's extra resource burned from the start..." ..and so on, all night, in all the major situations, for just one MU. You first get a sense of what's possible, and then a sense of what's probable, and then a sense of counterplay, all the way up the yomi tree til it circles back to zero. You talk to other pairs that did the same just to ensure no one had a blindspot.

Just because modern Smash players make "match up charts" via opinions and anecdotes and some online play with a dash of Twitter highlights doesn't change the definition of what matchup charts are or how they're properly constructed. It just means there are literally no MU charts for smash, anywhere.

(In their defense, there's thousands of MUs + seasonal patch changes, so corners are cut of course. Sometimes they make MU charts not to make a MU chart, but to make content, for their brand on YouTube or Twitch or whatever. I see ZeRo and Esam both do this pretty nakedly, what with their tier lists having two tiers called "top" and three tiers called "high" and a bottom called "mid but viable", so as not to offend any subscriber, current or future. I respect this. But it's not a MU chart despite the label on the thumbnail.)

I would also say the old "Backroom" served a purpose here, for giving players a safe space to speak off-brand. We did stuff like that in SF years ago too, less formally and crucially never giving a name to such a fluid group. That at least was wise.
 

KirbySquad101

Smash Ace
Joined
Sep 7, 2015
Messages
811
Reddit's new tier list. Would just like to point out that r/smashbros is pretty big on the competitive scene.
View attachment 297985
My personal opinion is that I think Pika is too high, Lucina should be lower than Roy, Bowser,Ike Pichu, Rosa and Sheik are not high-mid tier and Toon Link is way too low compared to the other Links. Ganon should also be lower than Little Mac.
For some clarity on the list:

Low-Mid, High-Mid, and Borderline weren't even tiers in the "beta" version of this tier list; instead, the top half of Borderline made it into top tier, while the bottom half was thrown into High, and so forth for the other tiers, which led to this tier list prior:

Top Tier: :ultpikachu::ultjoker::ultpalutena::ultpeach::ultwario::ultzss::ultshulk::ultwolf::ultmario::ultpokemontrainerf::ultgnw::ultrob::ultsnake::ultlucina:
High Tier: :ultroy::ultpacman::ultfox::ultchrom::ultmegaman::ultinkling::ultgreninja::ultyounglink::ultsonic::ultness::ultolimar::ultcloud::ultlink::ult_terry::ultminmin::ultsamus::ultyoshi::ultken::ultdiddy::ultbowser::ultike::ultfalcon::ultpichu:
Mid Tier: :ultrosalina::ultsheik::ultryu::ultfalco::ultcorrinf::ultwiifittrainer::ultbrawler::ultluigi::ulttoonlink::ultmarth::ultduckhunt::ultlucas::ultsteve::ultrobin::ultbanjokazooie::ultdarkpit::ultpit::ultmewtwo::ultvillager::ulticeclimbers::ultmetaknight::ultzelda::ultbayonetta::ultbowserjr::ultpiranha:
Low Tier: :ultridley::ultsimon::ultgunner::ultkrool::ultswordfighter::ultdk::ultkingdedede::ultjigglypuff::ultkirby::ultisabelle::ultincineroar::ultdoc::ultlucario:
Bottom Tier: :ultganondorf::ultlittlemac:

Needless to say, some parts of the tier list raised some major red flags for many players, biggest ones being:

  • :ultroy::ultpacman::ultfox::ultchrom: all placed in High Tier
  • :ultlucina: being placed in Top Tier, especially over the four characters listed
  • :ultrosalina::ultsheik: being placed in Mid Tier

That led to the creation of the borderline, upper-mid, and low-mid sections which tried to alleviate the issues players had with the list. Unfortunately, this led to some questionable movements like Snake and ROB being robbed of their Top Tier positions.

=======================================================================================================

On the subject of ROB MUs and whatnot: We've seen Zackray absolutely floor Etsuji's Pikachu and WaDi defeat ESAM's Pikachu as well as bring Maister's G&W to a game 5 situation. I won't deny that ROB most likely loses to both characters, but I also feel like you have to play really carefully with both and capitalize on your advantage to actually make things rough for the robot. ROB's combo game and aggression can be just as ruthless against both characters as theirs are against ROB, and in last-stock situations, ROB's comparatively abundant kill options and confirms force both to walk on eggshells. In that regard, I can't see either being -2 levels of bad, ESPECIALLY Pikachu, who basically has to take whatever ROB's throwing at him once he gets the pain train rolling and doesn't even have an eject button for those kinds of scenarios.
 

Frihetsanka

Smash Lord
Joined
Apr 26, 2016
Messages
1,808
Location
Sweden
Matchup charts are theorycraft.

We don't have even 1% of the data that would be required to make matchup charts with even a mediocre statistical basis.
At this point in time? Yes, I would be inclined to agree. It takes a good amount of time (many years) to actually build decent MU charts, and with Ultimate we're not that close yet (and even after years people could still disagree). Good theorycrafting still trumps results, although bad theorycrafting does not. We also need to keep in mind solo-maining vs using a character as part of a roster: Many, but not all, ROB players use ROB alongside many other characters, thus avoiding some of his worst matchups. This could be an indication that ROB isn't as good as people think, although it could also just be that ROB players tend to prefer playing multiple characters.

That's a good case; however, I disagree with it. The problem is that you're treating MUs as if they aren't theory. And they are. You can say ROB doesn't have a top 15 MU spread all you want, but at the end of the day he's winning tournaments more than anyone else. Look at it scientifically: you have a hypothesis about ROB's placement, MUs etc., so you gather data. You test your hypothesis. Smash isn't an exact science with factors like skill and popularity, but you can still get a pretty good idea of how viable a character is by how well they do. The more evidence, the stronger the case, and ROB has the most evidence of any character. There's no such thing as an airtight case in Smash, but I find it very unlikely that so much data is the product of environmental errors. If ROB's not a top 15 character, then why does he win so much?
Is ROB actually winning more tournaments than other characters? Or are several ROB players placing high to score points on OrionStats? Zackray uses a good amount of Joker too, no? As for data, ideally you'd test every MU (or every relevant MU) for every character with multiple players in order to get information on MU spread. This would, of course, be extremely time intensive. Instead you could look at tournament results, but there are so many factors involved (luck is huge, if you play a character with many losing MUs you'll really hope to run into easy MUs in your bracket). I also don't agree that he has "the most" evidence, in phase 1 he was #13, in phase 2 he was #7, in phase 3 he's #1, on OrionStats. Wolf was #1, #2, #2. Joker #22 (released late), #1, #5. Also, two very interesting cases for why results can be misleading: Shulk #25, #19, #23, Pikachu #24, #23, #29. An approach too focused on results would put Pikachu and Shulk outside of top 10, maybe even outside of top 15!
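To make the phase-by-phase comparison easier to eyeball, here is a rough sketch that averages the OrionStats placements quoted above. The placements come straight from this post; averaging them is only my own illustrative choice, not how OrionStats scores anything, and it glosses over Joker's late release in phase 1.

```python
# OrionStats placements per phase, copied from the post above.
phase_ranks = {
    "ROB": [13, 7, 1],
    "Wolf": [1, 2, 2],
    "Joker": [22, 1, 5],
    "Shulk": [25, 19, 23],
    "Pikachu": [24, 23, 29],
}


def mean_rank(ranks):
    """Plain average of a character's placements (lower is better)."""
    return sum(ranks) / len(ranks)


def by_mean_rank(table):
    """Character names sorted from best to worst average placement."""
    return sorted(table, key=lambda name: mean_rank(table[name]))


for name in by_mean_rank(phase_ranks):
    print(f"{name}: {mean_rank(phase_ranks[name]):.1f}")
```

On these numbers Wolf averages ahead of ROB, and Shulk and Pikachu both average outside the top 20, which is the point about results-only rankings being misleading.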

On the subject of ROB MUs and whatnot: We've seen Zackray absolutely floor Etsuji's Pikachu and WaDi defeat ESAM's Pikachu as well as bring Maister's G&W to a game 5 situation. I won't deny that ROB most likely loses to both characters, but I also feel like you have to play really carefully with both and capitalize on your advantage to actually make things rough for the robot. ROB's combo game and aggression can be just as ruthless against both characters as theirs are against ROB, and in last-stock situations, ROB's comparatively abundant kill options and confirms force both to walk on eggshells. In that regard, I can't see either being -2 levels of bad, ESPECIALLY Pikachu, who basically has to take whatever ROB's throwing at him once he gets the pain train rolling and doesn't even have an eject button for those kinds of scenarios.
Could it be that the top ROB and G&W and Pikachu players all underestimate ROB in those MUs? Sure, that certainly happens with other characters and has happened in the past as well. Non-Japanese Zero Suit Samus players seem to underestimate her MU spread (if you look at Bankai, JeBB/Clovers, Juice, and Marss ZSS looks like a top 25 character at best). One difference is that non-ZSS MU charts don't seem to reflect that she loses hard to them, notable Wolf mains don't think they win +2 (some of them even think it's Even!), and ESAM thinks Pikachu is just slightly winning, Cosmos Inkling slightly winning. So with ZSS it does seem that most of the notable mains underestimate her, but the main difference is that mains of other characters don't seem to. In the case of ROB, both the mains and the opponents seem to think ROB loses hard. Are they all mistaken? It's not impossible, although perhaps a bit improbable.

Something worth considering: ROB appears to be the 10th most used character according to the SmashGG picture that Thinkaman posted earlier. This is probably a factor for his results, especially considering players like WaDi and Zackray play him.

I am legit curious though: If ROB is top 10, then he should have a top 10 MU chart. What would that look like?

Hypothetically, let's assume someone watched two equally skilled top players play 30 serious games versus each other. One plays ROB, while the other plays every other character. In this scenario, assuming both players know every MU and are equally skilled with every character, would ROB still get top 10 results? I'm inclined to believe that he wouldn't, and that his good results in phase 3 are due to a combination of multiple factors, such as: using multiple characters to cover for ROB's bad MUs (like Zackray and Joker/G&W), having a large playerbase (#10 on SmashGG), having a good amount of top/high level players (such as WaDi, 8BitMan, Zackray, Epic_Gabriel, Grayson, Meteo, OCEAN, Raffi-X), and being a character which benefits from a lack of MU experience (and, let's be real, this is a pretty significant factor in Ultimate, unfortunately). I suspect that he won't remain #1 on OrionStats for very long; I would be very surprised if he still is in two years. Is it possible that I'm underestimating ROB? Of course. I don't think I am, but I could be. For now, I would put him in top 20 or top 25, which is still high tier. He's a good character, but I don't think he's top 10 material in this patch and this meta.

Oh, what do people here think about Sonic :ultsonic: ? How much did he benefit from the 9.0.0 buffs? Wrath and Sonido seem very optimistic, KEN a bit less so. Personally, I think the character is very underrated, and has a strong case for being top 15, if not top 10.
 

Thinkaman

Moderator
Moderator
Joined
Aug 26, 2007
Messages
6,324
Location
Madison, WI
NNID
Thinkaman
3DS FC
1504-5749-3616
Just because modern Smash players make "match up charts" via opinions and anecdotes and some online play with a dash of Twitter highlights doesn't change the definition of what matchup charts are or how they're properly constructed. It just means there are literally no MU charts for smash, anywhere.

(In their defense, there's thousands of MUs + seasonal patch changes, so corners are cut of course. Sometimes they make MU charts not to make a MU chart, but to make content, for their brand on YouTube or Twitch or whatever. I see ZeRo and Esam both do this pretty nakedly, what with their tier lists having two tiers called "top" and three tiers called "high" and a bottom called "mid but viable", so as not to offend any subscriber, current or future. I respect this. But it's not a MU chart despite the label on the thumbnail.)
This is basically my position. The daily sets of charts being pumped out, whatever that is called, is theory/opinion. (Regardless of what the label might have meant previously or aspires to be.) The usage of the term has sort of been bent, and in my case I've just sort of begrudgingly accepted the new semantics. (You are 100% right that it's the drive for "content" driving much of this editorialization.)

And you can't blame anyone. The traditional ideal of the matchup chart is not realistic for Smash Ultimate. Think how many Melee or SF2 matchups comprise at least 1% of competitive play. (Several are over 2%, to say nothing of the Fox ditto.) Now reflect that this data suggests even the most-played Smash Ultimate matchup (Ness/Cloud) is less than 0.1% of matches played. The typical, median Ultimate matchup is less than 0.01% of all matches played.

You'd have to play Smash Ultimate around 30x as much to get comparable confidence to older FGC games or Melee. And that's just a baseline, absurdly assuming remembering/parsing more experience/data is not more difficult or error-prone, pretending that a more balanced game isn't harder to decide MUs for, and ignoring patches constantly moving the goalposts. It's a completely unreasonable proposition.

Even if you restrict things to a single character's set of matchups, it is still a magnitude bigger, baseline.
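For a feel of how roster size alone blows up the matchup count, here is a back-of-the-envelope sketch. The roster sizes (26 and 80) are placeholder numbers I picked for illustration, and the "every matchup equally common" assumption is deliberately naive; it does not reproduce the 30x figure above, which depends on real usage data.

```python
def matchup_count(roster_size, include_dittos=True):
    """Number of unordered character pairings for a roster of this size."""
    pairs = roster_size * (roster_size - 1) // 2
    return pairs + roster_size if include_dittos else pairs


def average_share(roster_size):
    """Share of all matches per matchup if every pairing were equally common."""
    return 1 / matchup_count(roster_size)


# Hypothetical roster sizes, chosen only for illustration.
for size in (26, 80):
    print(
        f"{size} characters: {matchup_count(size)} matchups, "
        f"about {average_share(size):.3%} of matches each"
    )
```

Real usage is far from uniform, so the median matchup gets even less play than this flat average suggests.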


League of Legends has a somewhat similar problem with making accurate counter lists--a huge roster and frequent patches. But LoL generates over 1000x the data, logging more matches each day than we have recorded for Smash Ultimate lifetime. (With 5x the players per game!) And it's still a struggle.
 

Frihetsanka

Smash Lord
Joined
Apr 26, 2016
Messages
1,808
Location
Sweden
This is basically my position. The daily sets of charts being pumped out, whatever that is called, is theory/opinion. (Regardless of what the label might have meant previously or aspires to be.)
I agree with this. Currently, every MU chart is speculative to a high degree, and is based on a mix of theory, analysis, and experience. It is not pure theory, since it is guided by results and experience, but currently most people are leaning heavily on theory. This is understandable, given the huge amount of characters Ultimate has. Ultimate being better balanced also makes it trickier to make MU charts: In Brawl or Melee, it's a fairly safe assumption that a random top or high tier will beat a random mid tier. In Ultimate, it is fairly common for mid tiers to occasionally go even with top/high tiers, and some mid tiers might even slightly beat some high tiers!

How does this affect tier lists? Not all that much. In the end, even MU charts based on theory seem to be more reliable than going purely by results. We shouldn't entirely ignore results, but results are often misleading. They should be kept in mind, but they are not the be-all and end-all of making a tier list.

So... Sonic :ultsonic: -the biggest nerf from 4 was not being able to cancel Spin Dash to shield, correct? But from watching Sonic players this doesn't seem to be that big of a deal, he can still cancel with dair or find ways to play around it. In Smash 4, pretty much everyone agreed that Sonic was a top 10 character. I believe that, after buffs, he may yet again be a top tier character. I could be wrong and would be interested in hearing why. Sure, he can't cancel Spin Dash. In which other ways is he worse compared to 4? Is anyone up for making a comparison between Smash 4 Sonic and Ultimate Sonic?
 

The_Bookworm

Smash Champion
Joined
Jan 10, 2018
Messages
2,713
So... Sonic :ultsonic: -the biggest nerf from 4 was not being able to cancel Spin Dash to shield, correct? But from watching Sonic players this doesn't seem to be that big of a deal, he can still cancel with dair or find ways to play around it. In Smash 4, pretty much everyone agreed that Sonic was a top 10 character. I believe that, after buffs, he may yet again be a top tier character. I could be wrong and would be interested in hearing why. Sure, he can't cancel Spin Dash. In which other ways is he worse compared to 4? Is anyone up for making a comparison between Smash 4 Sonic and Ultimate Sonic?
Other notable nerfs include:
  • In addition to Spin Dash not being shield cancelable, it no longer goes through the opponent's shields unless fully charged. Sonic simply stops in his tracks and is wide open for a punish. This also applies to Spin Charge.
  • Up air, a key move from SSB4, is not very reliable in this game. Up air connected very reliably in SSB4, but in Ultimate, it drops a lot. It got a little bit better thanks to 9.0, but the issue isn't fully fixed, since all the patch did was add a new hitbox below Sonic on the second hit without addressing the first hit.
  • Up throw pretty much lost all of its combo potential and Spring Jump setups, due to its increased endlag.
  • Spring Jump lost quite a bit of its intangibility, and the spring projectile sends at a more horizontal angle. Not really that much of a nerf, especially since Spring Jump received notable improvements in other areas, but something to note.
  • He is significantly lighter in this game than in SSB4. He has gone from 94 units (33rd-36th heaviest out of 58 characters) to 86 units (67th-68th heaviest out of 85 characters). This noticeably weakens his endurance. This would have the benefit of making him less susceptible to combos, but his increased falling speed counteracts this.
He has, however, received many notable buffs:
  • The new attack canceling mechanics work pretty well with Sonic's amazing mobility.
  • His initial dash got massively buffed from SSB4, going from the 26th fastest initial dash (out of 58) to the 3rd fastest initial dash (out of 85).
  • His traction got massively buffed from SSB4, going from 14th-24th highest traction (out of 58) to the absolute highest traction in the game (out of 85).
  • His down tilt now launches more vertically and has much more knockback, improving its combo potential.
  • Pretty much all of his Smash attacks got noticeably buffed. Forward smash now has more range. Up smash has less startup, connects more reliably, has more intangibility, and its final hit has more knockback (although it is still rather weak). Down smash's front hit has more knockback to match the back hit, although it does have less reach.
  • Possessing pretty laggy aerials in SSB4, he benefits quite a bit from Ultimate's (almost) universal landing lag reductions, although his landing lag is still a bit in the higher end. This grants neutral air combo potential and makes spacing back airs more viable.
  • Homing Attack got reworked to be much, much, MUCH better than it was previously. It can now be held for much longer, has less startup when uncharged, deals much higher damage with knockback not being fully compensated, has better accuracy due to Sonic having a much straighter trajectory, has larger hitboxes, and has much less endlag, which grants the uncharged version combo potential. It did receive a 0.5x shieldstun multiplier, making it less safe on shield despite the endlag reduction, but this is still a huge buff to what was otherwise a relatively worthless move.
  • Spring Jump has slightly more distance, matching Brawl's vertical distance, while being boosted further by the addition of directional airdodging.

So while a lot of Sonic's standard attacks, particularly the underwhelming ones from SSB4, got greatly buffed, a lot of Sonic's key tools that made him top tier in SSB4 got nerfed or removed entirely. This forces Sonic to make use of his standard moveset, instead of solely relying on Spin Dash and Spin Charge to compensate for it. He now must take more risks, which his lowered weight certainly doesn't help.

However, thanks to his retained strengths, direct buffs, and massive indirect buffs, he is able to do so, especially after game updates fixed a lot of annoying inconsistencies that plagued him (at least in terms of perception) early in the game. His standard moveset is now strong enough to hold its own, helped by the fact that he still has fundamentally potent advantages, and that Spin Dash/Charge are still pretty good moves (especially online).

He is basically the reverse of Brawl Sonic. In that game, he was pretty much Spin Dash/Charge: The Character, because the rest of his moveset was pretty underwhelming. It was enough to give him a spot in mid tier though. In Ultimate, his moveset is relatively more well-rounded, but he lacks some of the sauce his SSB4 counterpart had.

Hopefully that paints a picture on why Sonic is less effective in this game than in SSB4. He is a very good character that could MAYBE land a spot in the top tiers, but for now, he is a solid high tier character.
 

TennisBall

Smash Journeyman
Joined
Aug 17, 2019
Messages
251
Location
The Darker Side of the Moon
Oh, what do people here think about Sonic :ultsonic: ? How much did he benefit from the 9.0.0 buffs? Wrath and Sonido seem very optimistic, KEN a bit less so. Personally, I think the character is very underrated, and has a strong case for being top 15, if not top 10.
Very optimistic about this character. The masterful way KEN maneuvered around hitboxes and forced his opponents to guess what options he was going to "commit" to looked very promising for this character's meta. Could very likely be a top tier imo when offline comes back and we see if the gap between Online and Offline Sonic is really that big.
 

Melonsismyusername

Smash Apprentice
Joined
Feb 14, 2019
Messages
153
Very optimistic about this character. The masterful way KEN maneuvered around hitboxes and forced his opponents to guess what options he was going to "commit" to looked very promising for this character's meta. Could very likely be a top tier imo when offline comes back and we see if the gap between Online and Offline Sonic is really that big.
Sonic as a character has such a unique gameplan that he basically obliterates characters, so although he has a decent top-tier mu spread, what really pushes him there is the fact that he obliterates so many mid, low, and high tiers.
 

Swamp Sensei

Today is always the most enjoyable day!
BRoomer
Joined
Jan 4, 2013
Messages
35,737
Location
Um....Lost?
NNID
Swampasaur
3DS FC
4141-2776-0914
Switch FC
SW-6476-1588-8392
Sonic seems to be consistently able to do what all characters WANT to do at all times. Hit when they're comfortable and get back to safety when they're not. People like to say he's campy but he really isn't. He has a hit and run playstyle and is REALLY darn good at it. Good Sonics need to be moving at all times, either at the opponent or away. Every character tries to do this. Some do well, some simply can't. Sonic excels at it.

Hit and run tactics are blatantly the best kind of strategy if you can actually pull them off. It's a simple fact of combat in and out of video games and Ultimate is no exception. Hit the opponent, don't get hit. It's why we call a match where no one gets hit "perfect." The sheer fact that Sonic can run away better than most of the cast is a huge boon and gives him the chance to pick his fights.

Sonic is simply the best at using a winning formula. He can't be any less than high tier.
 

Melonsismyusername

Smash Apprentice
Joined
Feb 14, 2019
Messages
153
Who actually needs nerfs?
GnW
Sonic seems to be consistently able to do what all characters WANT to do at all times. Hit when they're comfortable and get back to safety when they're not. People like to say he's campy but he really isn't. He has a hit and run playstyle and is REALLY darn good at it. Good Sonics need to be moving at all times, either at the opponent or away. Every character tries to do this. Some do well, some simply can't. Sonic excels at it.

Hit and run tactics are blatantly the best kind of strategy if you can actually pull them off. It's a simple fact of combat in and out of video games and Ultimate is no exception. Hit the opponent, don't get hit. It's why we call a match where no one gets hit "perfect." The sheer fact that Sonic can run away better than most of the cast is a huge boon and gives him the chance to pick his fights.

Sonic is simply the best at using a winning formula. He can't be any less than high tier.
Hit and Run and Campy are not mutually exclusive words, if anything they are correlative.
 

Diddy Kong

Smash Obsessed
Joined
Dec 8, 2004
Messages
24,574
Switch FC
SW-1597-979602774
Diddy and Roy / Chrom seem to have a similar hit and run playstyle, but more combo based than Sonic. 🤔 And in Diddy's case, it has item play and defensive tactics with the hit and run, and with Roy and Chrom a lot has to do with their range as well.

All are also high tier characters, so we could state this type of character is quite effective in Smash yeah.
 

Swamp Sensei

Today is always the most enjoyable day!
BRoomer
Joined
Jan 4, 2013
Messages
35,737
Location
Um....Lost?
NNID
Swampasaur
3DS FC
4141-2776-0914
Switch FC
SW-6476-1588-8392
All are also high tier characters, so we could state this type of character is quite effective in Smash yeah.
It's effective in every fighting game if the character has the tools to make the strategy actually work.

Teleports on characters are so powerful simply because they allow characters to move in or move out depending on the situation. I mean look at Street Fighter's Dhalsim. He's the posterboy for zoning with normals, but his teleports are the only reason the character works nowadays. If Dhalsim couldn't actually escape a bad situation or capitalize on a good one, he couldn't do his job (and even if Dhalsim's normals are Min Min tier in range, his damaging combos only work if he's up close).

The characters you mentioned have similar attributes. Diddy's Banana lets him control the pace of the battle and Roy and Chrom's raw speed and frame data allow them to weave in and out of combat.

Sonic has even more speed and some useful utility tools. Hit and run is very effective for him.

Hit and Run and Campy are not mutually exclusive words, if anything they are correlative.
I may be arguing semantics but I'm going to disagree here.

"Hit and Run" implies you'll approach sometime, just when it's safe.

"Camping" means you don't want to approach.

Saying those two terms are correlative is like saying Hit and Run and Rushdown are correlative because they both involve rushing towards your opponent sometimes.

Sonic is best when played with a hit and run style. He CAN be campy, but I don't believe it's the intention behind his design. Moreover, the really successful Sonics like KEN incorporate constant pressure with their movement. And sometimes that pressure involves running away from the opponent just to prompt a reaction.


I can't in good conscience call KEN's Sonic campy, and it's hard to argue he isn't one of, if not the best, Sonic mains.
 

Melonsismyusername

Smash Apprentice
Joined
Feb 14, 2019
Messages
153
Diddy and Roy / Chrom seem to have a similar hit and run playstyle, but more combo based than Sonic. 🤔 And in Diddy's case, it has item play and defensive tactics with the hit and run, and with Roy and Chrom a lot has to do with their range as well.

All are also high tier characters, so we could state this type of character is quite effective in Smash yeah.
Calling Chrom, Diddy Kong and Sonic all the same archetype is a bit of a stretch, don't ya think?
 

NotLiquid

Smash Lord
Joined
Jul 14, 2014
Messages
1,193
"Campy" and "hit-and-run" are distinctions without a difference when it comes to Sonic. At the end of the day he's capable of doing both these things very well. People associate an ability to play campy with projectiles, but that's setting aside the fact that Sonic is designed in a way where he himself, for all intents and purposes, is the projectile.
 

Thinkaman

Moderator
Moderator
Joined
Aug 26, 2007
Messages
6,324
Location
Madison, WI
NNID
Thinkaman
3DS FC
1504-5749-3616
The value of theory as a forward-looking alternative to hard data is entirely predicated on its ability to predict future outcomes.

For example, people confidently predicted Joker was the #1 character months before he finally reached that goal in any of the stats. By the time it happened, no one was surprised. This theory had multiple strong points of basis:
  • Joker was still somewhat new DLC. Everyone else had a head start in existing results, adoption, and mastery.
  • Leo had quickly cemented his position as #1 player with Joker, and it was only a matter of time before others adopted his techniques.
  • Both Joker's nuanced design and his status as an entirely new character predisposed his userbase to start small and grow over time.
You don't even have to get into nebulous gameplay theorycraft, attempting to explain what about Joker actually makes him good, mechanically. This theory had strong pillars of support without getting into the weeds. And, of course, it "came true." The hypothesis was proven correct, in a way one would be hard pressed to deny.


If you had asserted that Ike was the #1 character, it's fair to say your theory would have been proven wrong. Or if you hid behind "data" and claimed that the results supported this 'cause Leo had won the most with Ike, then your (very bad) interpretation of the data would have been proven wrong.

If you continued to insist that no, Ike is in truth #1, and people just don't play him enough or play him right for whatever reasons, your tree falling in the woods is pointless until someone shows up to hear it. At some point your theory has to be actually predicting something about observable reality, or else you concede it's all just in your head.


So.

How well have tier lists predicted changes in results?

To compare, I looked back at 2019, where we have 2 six-month blocks of OrionRank data. When Phase 1 was ending, how well did the tier lists of the time predict which characters would rise and fall in Phase 2? (As opposed to a baseline of say, assuming all the results would continue exactly the same.)

Reddit compiles a community tier list every month, but we're comparing across six-month blocks of results. Since the OrionRank Phase 1 data spans activity from January to June, it would be an unfair comparison to use only Reddit's last list of the period (June), just as it would be unfair to use the January one. I decided to be generous to Reddit and use the May 2019 list, the last list before the first big round of balance changes.

I excluded DLC, combined the 4 appropriate echoes, and excluded Marth to be extra-generous to Reddit. I used absolute rankings of the characters, #1-69.

This is where OrionRank 2 falls compared to Reddit (May 2019) and OrionRank Phase 1:

smash_gg_data_2021_2.png

(Big numbers means an overestimate, big negative numbers means an underestimate. 0 is dead-on.)
.......ouch.

Yup. The deviation is lower when comparing Phase 2 with just Phase 1. Using this tier list to predict results trends is worse than literally doing nothing. After six months, results only shifted in the tier list's direction in 35 cases; literally a coin flip.

This is like index funds outperforming money managers picking stocks. Except in this case it's more like money-under-your-mattress beating r/wallstreetbets.
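For anyone who wants to run the same kind of check on their own lists, here is a minimal sketch of the two measures described above: average rank deviation against the later results, and how often results moved in the direction the tier list implied. The four characters and all their ranks are made-up placeholders, and the exact formulas (mean absolute deviation, skipping characters with no movement) are my assumptions about a reasonable way to do it, not necessarily what was used for the plot.

```python
def mean_abs_deviation(predicted, actual):
    """Average absolute gap between predicted and actual rank."""
    return sum(abs(predicted[c] - actual[c]) for c in actual) / len(actual)


def direction_hit_rate(baseline, forecast, actual):
    """Fraction of characters whose results moved the way the forecast implied.

    Characters where the forecast equals the baseline, or where results did
    not move at all, are skipped because no direction is defined for them.
    """
    hits = 0
    counted = 0
    for c in actual:
        expected = forecast[c] - baseline[c]
        moved = actual[c] - baseline[c]
        if expected == 0 or moved == 0:
            continue
        counted += 1
        if (expected > 0) == (moved > 0):
            hits += 1
    return hits / counted if counted else 0.0


# Hypothetical ranks for four placeholder characters (1 = best).
phase1 = {"A": 1, "B": 2, "C": 3, "D": 4}
tier_list = {"A": 3, "B": 1, "C": 4, "D": 2}
phase2 = {"A": 2, "B": 3, "C": 1, "D": 4}

print(mean_abs_deviation(tier_list, phase2))  # tier list as the predictor
print(mean_abs_deviation(phase1, phase2))     # "do nothing" baseline
print(direction_hit_rate(phase1, tier_list, phase2))
```

In this toy data the do-nothing baseline has the smaller deviation, which is the same shape of result as the real comparison, but the numbers themselves mean nothing.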

And it's actually much worse than it seems:
  • This is comparing directly to the raw total results, not even taking any trends into account. Everyone following results at the time saw DDD/Bayo/MK results slowing before their eyes.
  • External events are generally fair to both comparisons, as they apply equally: Neither could have predicted that Pichu would get nerfed or that Leo would drop Ike. However, Reddit got lucky a few times: it overestimated WFT and Diddy, only for them to get large buffs and rise in results anyway. Two wrongs made a right!
  • Meanwhile, when results underestimated Ken (and then he got buffed), comparing with previous results was extra-penalized.
But wait, there's more! I also compiled the December 2019 tier list, to run the comparison the other way--how well did results predict the changes to the tier list?

Even the raw results data was around 60% accurate in guessing whether a given character would move up or down on the tier list over the next six months.

And if you factor out Pikachu, Ken, and WFT as explainable outliers, the Spring 2019 results actually correlate more tightly overall with the future December 2019 tier list than the Spring 2019 tier list does! And if you permit inclusion of the most obvious trends (in this case DDD, Bayo, etc.), the remaining deviation can go down by a full third or more, with much of the remainder being external events like patches.

It seems tier lists follow results, more than the other way around. (At least for this one case study.)
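One way to put a number on "correlates more tightly" is a rank correlation. The sketch below uses Spearman's formula for rankings without ties; that choice is my assumption, since the post doesn't say which measure was used, and the characters and ranks are placeholders rather than real data.

```python
def spearman_no_ties(ranks_a, ranks_b):
    """Spearman rank correlation for two rankings of the same items.

    Assumes both inputs assign each item a distinct rank from 1 to n
    (no ties), so the shortcut formula 1 - 6*sum(d^2) / (n*(n^2 - 1)) applies.
    """
    n = len(ranks_a)
    d_squared = sum((ranks_a[c] - ranks_b[c]) ** 2 for c in ranks_a)
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical ranks for four placeholder characters (1 = best).
spring_results = {"A": 1, "B": 2, "C": 3, "D": 4}
spring_tier_list = {"A": 3, "B": 1, "C": 4, "D": 2}
december_tier_list = {"A": 2, "B": 3, "C": 1, "D": 4}

print(spearman_no_ties(spring_results, december_tier_list))
print(spearman_no_ties(spring_tier_list, december_tier_list))
```

A value near 1 means the two orderings agree, near -1 means they are reversed. Whichever earlier list scores higher against the December list is the better predictor of it.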
 

RonNewcomb

Smash Journeyman
Joined
Nov 29, 2014
Messages
377
The value of theory as a forward-looking alternative to hard data is entirely predicated on its ability to predict future outcomes.
Maybe I'm again just an old man yelling at :ultcloud::ultcloud::ultcloud:, but tier lists aren't supposed to predict anything. They're just a summation of all MU charts, full stop. Hence they're backward-looking and a kind of report on the state of balance.

I realize enacting the formula of "old tier list + new patch notes = new tier list" is practically its own sport (and a fun one), but that isn't how they're actually made. So throwing statistics at a 'tier list' which was made that way seems like the long way around to saying what's evident in the definition.

Heck, in many cases one can't even say if adding 3 points to a char's low kick will affect any MUs at all, let alone a tier position. Esp. since, generally speaking, a tier isn't in any particular order, since the differences between same-tier chars is slight enough to be swamped by human factors.
 

The_Bookworm

Smash Champion
Joined
Jan 10, 2018
Messages
2,713
Maybe I'm again just an old man yelling at :ultcloud::ultcloud::ultcloud:, but tier lists aren't supposed to predict anything. They're just a summation of all MU charts, full stop. Hence they're backward-looking and a kind of report on the state of balance.

I realize enacting the formula of "old tier list + new patch notes = new tier list" is practically its own sport (and a fun one), but that isn't how they're actually made. So throwing statistics at a 'tier list' which was made that way seems like the long way around to saying what's evident in the definition.

Heck, in many cases one can't even say if adding 3 points to a char's low kick will affect any MUs at all, let alone a tier position. Esp. since, generally speaking, a tier isn't in any particular order, since the differences between same-tier chars is slight enough to be swamped by human factors.
[Btw, this post doesn't just apply to you. Just want to share my general thoughts since this is the current topic of the thread.]

New patch notes don't always directly affect tier lists. If they provide changes that do impact a character's viability, then yes, that character's tier position may change.

For example, the changes in two of our most recent patches, 9.0 and 10.1 didn't really change up the meta too much. The most that these patches did, in my opinion, was adding Steve and Sephiroth to the game, but I personally doubt their presence in the meta is going to change the viability of characters ranked below them too much.

However, most tier lists throughout the ages, in all Smash games, get adjusted whenever a character's meta changes. Did the character's standing in the meta improve, and does the character continue to perform as well as or even better than before? Or has the character in question stagnated in the meta? Does the character have the potential to maybe move higher/lower?
Those are merely some of the questions that are put in when making a tier list.


There are many examples of this in Ultimate, but two of the most notable examples are with two of the Mii Fighters: :ultbrawler::ultswordfighter:.

In the early to even mid meta, Brawler was considered to be one of the worst characters in the game, while Swordfighter was considered to be a solid high tier character, mainly thanks to Gale Strike + Hero's Spin. Granted, Brawler was a legitimately terrible character at launch, but despite being heavily buffed in later patches, most players still believed him to be a very mediocre character and the worst of the 3 Miis. Swordfighter, meanwhile, had some notable mains as well as some notable players using him as a secondary, and was considered to be the best of the 3 Miis by a large margin.

However, the tables turned later on. Brawler obtained increasingly notable results as time went on, culminating in Rizeasu getting very notable results in offline tourneys with Brawler. Now the character is considered to be mid to upper-mid tier, and the best of the 3 Miis by far. Swordfighter, on the other hand, got the opposite treatment. The character was soon deemed not as good as initially thought and came to be considered a mid tier. He then stagnated even more as time went on, with his main players either not placing that well and/or not being very active, and his secondaries not picking him as often. The end result is that the character, both statistically and in public perception, plummeted drastically. Most players now view him to be in the low tiers, and likely the weakest of the 3 Miis. And the character, outside of Gale Strike and Chakram getting their distance reduced in the literal first patch of the game, didn't even receive any nerfs to cause this change in perception.



I notice people bring up the problem that tier lists sometimes end up being merely a snapshot of what is going on right now, and not actual, more everlasting placements based on theory/matchups. The thing is, in my opinion, tier lists always end up falling into this trap, NO MATTER what you value most when assembling one. You have top players like ESAM basing their lists mainly off of theory, leaving current performances / results by the wayside, but you still see their lists constantly changing with time, even when patches aren't a factor.

In Melee and Brawl, games that lack any sort of balance patches (outside of Melee PAL), we see the tier list change drastically with time, as new theories, strategies, and demonstrations of character strength emerge and character results improve or drop off as a result. Looking at the current official Melee tier list, made in 2015, we see many things that could be changed, such as the large rise of Yoshi, Marth stepping up to the plate, Dr. Mario falling off a bit, Ice Climbers losing Wobbling thanks to rulesets, and more. Melee tier lists, although more stagnant than those of other Smash games in terms of tier placements, are still changing a noticeable amount, and this is for a now 20-year-old game.

Brawl also saw some notable tier changes in late 2013/early 2014 that aren't reflected in the final official tier list, including characters like Zero Suit Samus, Fox, and Sonic rising a bit more in terms of perception, among other changes. I also still hold the belief that SSB4 deserved one more official tier list, as the final one was made in December 2017, and the meta changed quite a lot in 2018, including the rise of characters like Corrin, Lucina, Pac-Man, Duck Hunt, and Wario; the fall of characters like Diddy, Bowser, Pit, Robin, and Palutena; and many other changes.

No matter what, tier lists are going to change. Tier lists are always going to differ from player to player. It is impossible, especially this early in the meta, to come up with a concrete tier list that is going to be everlasting and accurate with matchups + results + much more. Whenever I make a tier list myself, I always make it with the knowledge that it is inevitably going to be different in the future.
 

Frihetsanka

Smash Lord
Joined
Apr 26, 2016
Messages
1,808
Location
Sweden

TL;DW: Mid/High online, High/Top offline. I can see it, that sounds about where I rate him right now (somewhere around #10-15 offline, worse online but I don't really make online tier lists). Obviously he's a very new character still, but he seems very strong to me.
 