So comparing the balance of different fighting games runs into four primary problems.
First, it doesn't take into account differences in the games' formats. Suppose I say a matchup is "3:7"; does that mean I win 3 out of 10...
- 1-stock games?
- 2-stock games?
- 3-stock games?
- 4-stock games?
- Any of the above with a timer added?
- Bo3 rounds of any of the above?
- Bo3 rounds with meter carry-over?
- Bo5 rounds of either?
- 2-round sets of any of those?
- 3-round sets of any of those?
Sure, we can say "default settings" (pretending they exist and are universally accepted), but now we're just comparing apples and oranges, even more than we already are.
The different structure of each game, and the radical difference made by carrying over damage, stale moves, rage, meter, or other resources, has a huge impact on what these ratios actually mean.
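To put a rough number on how much format alone moves a ratio, here is a minimal sketch. It assumes every game in a set is independent with the same win chance and that nothing carries over between games, which is exactly the simplification that real games break. The function name `set_win_probability` is made up for illustration.

```python
from math import comb

def set_win_probability(p_game: float, best_of: int) -> float:
    """Chance of winning a best-of-N set, given a fixed per-game win chance.

    Assumes independent games and no carry-over of meter, damage, etc.
    """
    if best_of < 1 or best_of % 2 == 0:
        raise ValueError("best_of must be a positive odd number")
    wins_needed = best_of // 2 + 1
    # Sum over how many games are lost before the clinching win.
    return sum(
        comb(wins_needed - 1 + losses, losses)
        * p_game ** wins_needed
        * (1 - p_game) ** losses
        for losses in range(wins_needed)
    )

# A "3:7" measured per game turns into a different number per set.
for n in (1, 3, 5):
    print(f"Bo{n}: {set_win_probability(0.3, n):.3f}")
# Bo1: 0.300
# Bo3: 0.216
# Bo5: 0.163
```

So even under these generous assumptions, the same underlying matchup reads as roughly 3:7, 2:8, or worse depending on whether you count games, Bo3s, or Bo5s.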
Second, these ratios are so abstracted that, by the time communities discuss them, they have little connection to actual statistical results and are just mnemonic labels the community ascribes based on social convention. An "8:2" in one community is not the same as an "8:2" in another community.
Third, there is no single definition of what makes a set of data "more balanced." Two people could have access to perfect data for two different games and still disagree on which one is "more balanced." For example, I am pretty sure the best measure is "standard deviation of matchup ratios"; other people have other viewpoints, and I think they are wrong, but I certainly can't ignore that it is a disputed subject. Though I disagree, I don't think it's asinine to instead demand a criterion fixated solely on top-level viability, nor to insist on some kind of recursive, weighted measure.
For example, suppose we agree that Ganon in Brawl was the worst character and had a 2:8 matchup against Olimar. If we improve that to a 3:7, is the game more balanced? I would insist it is (just the tiniest bit), but others would say it actually isn't since Ganon is still not viable at top-level play. The point isn't who is correct, but that it's a matter of dispute with no common definition.
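A toy sketch of that disagreement follows. Only the Ganon vs. Olimar 2:8 comes from the example above; every other number, the placeholder characters `CharA` and `CharB`, and the 35% "viability floor" are invented purely for illustration. One assumption worth flagging: each matchup is counted from both sides, so the mean ratio is always exactly even.

```python
from statistics import pstdev

def spread(chart: dict[tuple[str, str], float]) -> float:
    """Population standard deviation of matchup ratios (both sides counted)."""
    values = []
    for share in chart.values():
        values.extend([share, 1.0 - share])
    return pstdev(values)

def viable_count(chart: dict[tuple[str, str], float], floor: float = 0.35) -> int:
    """Hypothetical viability measure: characters whose worst matchup is >= floor."""
    worst: dict[str, float] = {}
    for (a, b), share in chart.items():
        worst[a] = min(worst.get(a, 1.0), share)
        worst[b] = min(worst.get(b, 1.0), 1.0 - share)
    return sum(1 for w in worst.values() if w >= floor)

# Made-up chart; each value is the first character's share of wins.
before = {
    ("Ganon", "Olimar"): 0.2,
    ("Ganon", "CharA"): 0.4,
    ("Ganon", "CharB"): 0.4,
    ("Olimar", "CharA"): 0.6,
    ("Olimar", "CharB"): 0.5,
    ("CharA", "CharB"): 0.5,
}
after = {**before, ("Ganon", "Olimar"): 0.3}

print(f"spread: {spread(before):.4f} -> {spread(after):.4f}")
print(f"viable: {viable_count(before)} -> {viable_count(after)}")
```

In this toy chart the spread goes down while the viable count does not move, so the two measures give different answers to "is it more balanced now?" With only four characters the drop in spread is exaggerated; on a full roster a single matchup would barely register.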
Fourth, even if you agree with my (definitely best) definition of "standard deviation of matchup ratios", it implies that larger casts will tend to be more balanced. This is at odds with most people's intuitions regarding balance, since balance is a subtractive design element (so laymen evaluate it based on outlier anecdotes) and balance design work is indeed exponentially more difficult with more characters. (There is on average "less distance to go", but the difficulty of going that distance is exponentially higher.)
For example, if Ryu has an 8:2 matchup against E.Honda in SF2, that's 1 of 120 matchups, or 0.833% of the entire game.
But if Meta Knight has an 8:2 matchup against Little Mac, that's 1 of 1326 matchups, or 0.075% of the entire game.
A single skewed matchup in Smash 4, by almost any statistical measure you could consider, has less than a tenth the significance of an equally skewed matchup in Super Turbo. The larger the roster gets, the higher the natural density of even matchups becomes. This is the sort of statistical reality that many people have trouble internalizing, and reject on principle.
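The arithmetic behind those figures can be sketched as below. The roster sizes of 16 and 52 are inferred from the 120 and 1326 totals (unordered pairs, no mirror matches) rather than stated above, so treat them as an assumption; `single_matchup_share` is a made-up helper name.

```python
from math import comb

def single_matchup_share(roster_size: int) -> float:
    """Fraction of all non-mirror matchups that one matchup represents."""
    return 1 / comb(roster_size, 2)

for size in (16, 52):
    total = comb(size, 2)
    share = single_matchup_share(size)
    print(f"{size} characters: {total} matchups, one matchup = {share:.3%}")
# 16 characters: 120 matchups, one matchup = 0.833%
# 52 characters: 1326 matchups, one matchup = 0.075%

# How much one matchup "weighs" in the big roster relative to the small one.
print(f"relative weight: {single_matchup_share(52) / single_matchup_share(16):.3f}")
# relative weight: 0.090
```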
In other words, even if you overcame all the other issues to establish a "fair" comparison, most people would not care to listen to it anyway.