Ok, so tiers can't be made by just comparing raw move statistics? I think they can, because it all really boils down to mathematics. Everything is numbers, and the only objective way to determine tiers is to define what the human limits are; that way you can work out what each character can actually do in human hands. It could be as simple as TAS testing under rules that simulate human controller input and reaction limits to determine the outcome. Why not try the direct experimental approach? Just try it with the two suspected best characters as a small-scale experiment.
TAS testing wouldn't work. Imagine a chess game where two perfect computers play against each other. Every single game would most likely end in a draw. It's kind of the same thing here. You know those times when two Smash players take dozens of seconds to get anything going from neutral because no one wants to approach? You would just get that every time.
Smash relies a lot on mindgames. If you get hit, it's because you made an error, as simple as that. It means you either messed up yourself by making an unnecessary or dangerous maneuver, or the opponent outplayed or baited you and forced you into one. Since AIs don't think like we do, perfect TAS AIs would always take the safest option, and neither would commit to anything that can be punished, so the match would never end. Deciding when a "worse" option is actually "better" because it keeps you from becoming predictable is what differentiates human players from AIs.
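To put rough numbers on the predictability point, here is a minimal sketch of a toy mixup. Everything in it is made up for illustration: the option names, the payoff values, and the helper functions `guaranteed_value` and `best_mix` are assumptions, not data from any real matchup. It only shows why a player who always picks the safest option can be read, while one who throws in the risky option now and then does better even against an opponent who knows his habits.

```python
# Toy mixup: payoff to the attacker for each (attacker, defender) choice.
# All numbers are invented for illustration, not real game data.
PAYOFF = {
    ("safe", "shield"): 0,    # safe poke into shield: nothing happens
    ("safe", "jump"): 2,      # safe poke clips the jump for small damage
    ("risky", "shield"): 10,  # risky grab beats shield for a big reward
    ("risky", "jump"): -8,    # risky grab whiffs and gets punished
}
DEFENDER_OPTIONS = ("shield", "jump")


def guaranteed_value(p_risky):
    """Worst-case expected payoff for the attacker when he picks the
    risky option with probability p_risky and the defender knows it."""
    outcomes = []
    for defense in DEFENDER_OPTIONS:
        expected = ((1 - p_risky) * PAYOFF[("safe", defense)]
                    + p_risky * PAYOFF[("risky", defense)])
        outcomes.append(expected)
    return min(outcomes)


def best_mix(steps=100):
    """Grid search for the risky-option frequency with the best worst case."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=guaranteed_value)


if __name__ == "__main__":
    print("always safe :", guaranteed_value(0.0))
    print("always risky:", guaranteed_value(1.0))
    print("best mix    :", best_mix(), guaranteed_value(best_mix()))
```

With these invented numbers, always playing safe is worth 0 and always playing risky is worth -8, but going for the risky option about 10% of the time guarantees roughly 1. The "worse" option earns its place purely by keeping the opponent honest.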
Making humans play frame by frame in a TAS setting, even one that simulates human input, wouldn't work either, for two reasons. First, not everyone has the same reaction time. Where do you set the limit? Do you take air force pilots' average reaction time, or the reaction time of semi-competitive players? It would be too hard to draw a line because there is no universal human reaction or input time. Second, that's not how we play the game. Yes, in theory, that's how we should make tiers, but playing in an environment where you can and will make frame-perfect (or the best you're capable of) decisions and inputs will simply never happen in a real competitive setting, so it's kind of pointless. Yes, you would get an accurate list of what the "real" tiers are supposed to look like, but it wouldn't be representative of how we really play Smash, which I think is important. That brings me to my next point.
----
There are different things that determine where a character is placed on a tier list.
1. Frame data. Moves with less startup and endlag give the opponent less time to react and are safer. Generally, a character with good frame data will do better than a character with bad frame data.
2. Options. The more options a character has, the more he can mix things up and the less predictable he becomes. If a character has amazing frame data but only one option for each situation, he will be outplayed hard because you will always know his next move.
3. Results. I know some people will not like this point, but I believe it is essential. A character might be the best on paper, but if nobody is ever winning with it, I don't care, it's not the best character. Theorycrafting is good to an extent, but without concrete results to back it up it doesn't mean anything. There can always be some aspect we forgot to cover that makes the character worse, and if he actually is any good, it's only a matter of time before real results confirm it. TAS testing will never give us "real" results, and neither will theorycrafting.
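On point 1, here is a rough sketch of how startup frames relate to reaction time. It assumes the game runs at 60 frames per second and uses a deliberately simplified model: a move counts as "reactable" if its startup is longer than the defender's reaction time in frames, ignoring how long the defensive option itself takes to come out. The reaction times and startup value are hypothetical, and `reaction_frames` and `is_reactable` are names made up for this example.

```python
import math

FPS = 60  # assumed frame rate of the game


def reaction_frames(reaction_ms, fps=FPS):
    """Number of whole frames that pass before a player can respond."""
    return math.ceil(reaction_ms * fps / 1000)


def is_reactable(startup_frames, reaction_ms, fps=FPS):
    """Toy model: the move can be reacted to only if its startup is
    longer than the defender's reaction time measured in frames."""
    return startup_frames > reaction_frames(reaction_ms, fps)


if __name__ == "__main__":
    startup = 14  # hypothetical move startup in frames
    for reaction_ms in (200, 250, 300):  # hypothetical reaction times
        print(reaction_ms, "ms =", reaction_frames(reaction_ms), "frames,",
              "reactable" if is_reactable(startup, reaction_ms) else "not reactable")
```

In this toy model the same 14-frame move is reactable for someone at 200 ms (12 frames) but not for someone at 250 ms (15 frames), which is exactly why picking one "human limit" for everybody is so hard.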
There are some other things to consider, though. Range, for example, is important. This is often overlooked, but if a character has phenomenal range, it matters much less that his options or frame data are not that good. He can simply abuse his range to get significant results. This is why disjointed hitboxes are so good. You can't hit the opponent, while he can hit you. He has little to lose and a lot to gain. Conversely, if a character can't hit you at all without getting hit in return, he's bad. This is a little extreme, but it illustrates my point. You could make similar arguments about weight, speed, and other characteristics like these.
I think these are the three main things to consider when making a tier list: frame data, options, and results. Yes, Yoshi has really good frame data, but he doesn't have many approach options. Yes, Pac-Man has amazing options, but he is lacking in frame data. Yes, Villager seems amazing, yet he doesn't have many significant results. Don't kill me for these examples, please. I know they aren't 100% accurate, but you get the idea. A character will almost never (it's debatable) excel in every category, and that's why it's so damn hard to make a tier list. Is frame data more important than options? How much should we factor results into the tier list? Is being fast and really light better than being slow but super heavy? That's why we should not try to make perfect tier lists, but lists that are as close as possible to what the majority of players can agree on, and that's also why they change all the time.
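Here is a minimal sketch of why the weighting question matters so much. The 0 to 10 ratings below are invented purely to mirror the rough examples above (they are not real assessments of these characters), and `RATINGS` and `rank` are names made up for this example. The point is only that the same ratings produce a different order depending on how much weight each category gets.

```python
# Invented 0-10 ratings that loosely mirror the examples in the post.
# These are NOT real assessments of the characters.
RATINGS = {
    "Yoshi":    {"frame_data": 9, "options": 5, "results": 6},
    "Pac-Man":  {"frame_data": 4, "options": 9, "results": 6},
    "Villager": {"frame_data": 7, "options": 8, "results": 3},
}


def rank(weights):
    """Order characters by the weighted sum of their category ratings."""
    def score(name):
        return sum(weights[cat] * RATINGS[name][cat] for cat in weights)
    return sorted(RATINGS, key=score, reverse=True)


if __name__ == "__main__":
    frame_heavy = {"frame_data": 0.6, "options": 0.2, "results": 0.2}
    options_heavy = {"frame_data": 0.2, "options": 0.6, "results": 0.2}
    results_heavy = {"frame_data": 0.2, "options": 0.2, "results": 0.6}
    for label, weights in (("frame data first", frame_heavy),
                           ("options first", options_heavy),
                           ("results first", results_heavy)):
        print(label, "->", rank(weights))
```

With these made-up numbers, each weighting puts the three characters in a different order, so two people can agree on every individual rating and still disagree on the tier list.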
Woooah sorry if it's TL;DR. I tried to make it short, but I really wanted to express myself on this and didn't want to leave things out.