HallOfFame {Lahman}    R Documentation

Hall of Fame Voting Data

Description

Hall of Fame table. It contains the voting results for all candidates nominated for the Baseball Hall of Fame.

Usage

data(HallOfFame)

Format

A data frame with 4015 observations on the following 9 variables.

hofID

Player ID code

yearID

Year of ballot

votedBy

Method by which player was voted upon. See Details

ballots

Total ballots cast in that year

needed

Number of votes needed for selection in that year

votes

Total votes received

inducted

Whether player was inducted by that vote or not (Y or N)

category

Category of candidate; a factor with levels Manager Pioneer/Executive Player Umpire

needed_note

Explanation of qualifiers for special elections

Details

This table links to the Master table via the hofID.

votedBy: Most Hall of Fame inductees have been elected by the Baseball Writers' Association of America (BBWAA). The rules for election are described at http://en.wikipedia.org/wiki/National_Baseball_Hall_of_Fame_and_Museum#Selection_process.

Source

Lahman, S. (2010) Lahman's Baseball Database, 1871-2010, v.5.8, http://baseball1.com/statistics/

Examples


## Some examples for Hall of Fame induction data

data('HallOfFame')
require('plyr')          ## extensive use of plyr for data manipulation 
require('ggplot2')

############################################################
## Some simple queries

# What are the different types of votedBy?
table(HallOfFame$votedBy)
## 
##            BBWAA       Centennial     Final Ballot     Negro League 
##             3587                6               21               26 
##  Nominating Vote       Old Timers          Run Off Special Election 
##               76               30               81                2 
##         Veterans 
##              186

# What was the first year of Hall of Fame elections?
min(HallOfFame$yearID)
## [1] 1936
# Who comprised the original class?
subset(HallOfFame, yearID == 1936 & inducted == 'Y')
##        hofID yearID votedBy ballots needed votes inducted category
## 1  cobbty01h   1936   BBWAA     226    170   222        Y   Player
## 2  ruthba01h   1936   BBWAA     226    170   215        Y   Player
## 3 wagneho01h   1936   BBWAA     226    170   215        Y   Player
## 4 mathech01h   1936   BBWAA     226    170   205        Y   Player
## 5 johnswa01h   1936   BBWAA     226    170   189        Y   Player
##   needed_note
## 1        <NA>
## 2        <NA>
## 3        <NA>
## 4        <NA>
## 5        <NA>
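
# Attaching names to the original class: a minimal sketch of the hofID link
# to the Master table described in Details. To stay self-contained, it
# rebuilds the five 1936 rows shown above and uses a small stand-in for
# Master. The stand-in (masterToy) and its nameFirst/nameLast columns are
# assumptions made for illustration, not a copy of the real table.
class1936 <- data.frame(
    hofID   = c('cobbty01h', 'ruthba01h', 'wagneho01h', 'mathech01h', 'johnswa01h'),
    ballots = 226, needed = 170,
    votes   = c(222, 215, 215, 205, 189),
    stringsAsFactors = FALSE)
masterToy <- data.frame(
    hofID     = c('cobbty01h', 'ruthba01h', 'wagneho01h', 'mathech01h', 'johnswa01h'),
    nameFirst = c('Ty', 'Babe', 'Honus', 'Christy', 'Walter'),
    nameLast  = c('Cobb', 'Ruth', 'Wagner', 'Mathewson', 'Johnson'),
    stringsAsFactors = FALSE)
# Join on the shared key, then add the vote share and the margin over 'needed'
named1936 <- merge(class1936, masterToy, by = 'hofID')
named1936$pct    <- round(100 * named1936$votes / named1936$ballots, 1)
named1936$margin <- named1936$votes - named1936$needed
named1936[order(-named1936$votes),
          c('nameFirst', 'nameLast', 'votes', 'pct', 'margin')]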

# Result of a player's last year on the BBWAA ballot
# Restrict to players voted by BBWAA:
HOFplayers <- subset(HallOfFame, votedBy == 'BBWAA' & category == 'Player')


# Function to calculate number of years as HOF candidate, last pct vote, etc.
# for a given player
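HOFun <- function(d) {
    d <- d[order(d$yearID), ]      # make sure the last row is the latest ballot
    nyears <- nrow(d)              # number of years on the BBWAA ballot
    fy <- d[nyears, ]              # final year on the ballot
    lastPct <- with(fy, 100 * round(votes/ballots, 3))
    data.frame(hofID = fy$hofID, nyears, induct = fy$inducted,
               lastPct, lastYear = fy$yearID)
}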

playerOutcomesHOF <- ddply(HOFplayers, .(hofID), HOFun)
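
# A base-R sketch of the same split-apply-combine step, for readers without
# plyr. The two candidates and their vote counts below are hypothetical toy
# values, not rows from HallOfFame. lastBallot() is an illustrative helper
# that sorts each candidate's rows by year and summarises the final ballot.
toyHOF <- data.frame(
    hofID    = c('toyaa01h', 'toyaa01h', 'toyaa01h', 'toybb01h'),
    yearID   = c(2002, 2000, 2001, 2000),
    ballots  = c(400, 500, 480, 500),
    votes    = c(320, 250, 300, 20),
    inducted = c('Y', 'N', 'N', 'N'),
    stringsAsFactors = FALSE)
lastBallot <- function(d) {
    d <- d[order(d$yearID), ]           # chronological order within candidate
    fy <- d[nrow(d), ]                  # final year on the ballot
    data.frame(hofID = fy$hofID, nyears = nrow(d), induct = fy$inducted,
               lastPct = round(100 * fy$votes / fy$ballots, 1),
               lastYear = fy$yearID, stringsAsFactors = FALSE)
}
# split() makes one data frame per candidate; rbind() stacks the summaries
toyOutcomes <- do.call(rbind, lapply(split(toyHOF, toyHOF$hofID), lastBallot))
rownames(toyOutcomes) <- NULL
toyOutcomes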


############################################################
# How many voting years until election?
inducted <- subset(playerOutcomesHOF, induct == 'Y')
table(inducted$nyears)
## 
##  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 
## 43 10  8  7  8  4  2  4  6  3  3  1  4  1  2
barplot(table(inducted$nyears), main="Number of voting years until election",
        ylab="Number of players", xlab="Years")



# What is the form of this distribution?
require('vcd')
## Loading required package: vcd
## Loading required package: MASS
## Loading required package: grid
## Loading required package: colorspace
## Warning: package 'colorspace' was built under R version 2.15.3
goodfit(inducted$nyears)
## 
## Observed and fitted values for poisson distribution
## with parameters estimated by `ML' 
## 
##  count observed    fitted
##      0        0  1.519158
##      1       43  6.449255
##      2       10 13.689456
##      3        8 19.371871
##      4        7 20.559769
##      5        8 17.456408
##      6        4 12.351232
##      7        2  7.490639
##      8        4  3.974985
##      9        6  1.874993
##     10        3  0.795988
##     11        3  0.307199
##     12        1  0.108679
##     13        4  0.035490
##     14        1  0.010762
##     15        2  0.003046
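
# Where the fitted column comes from: a hand computation, assuming (as the
# output header states) a Poisson fit by maximum likelihood, for which the
# estimated rate is the sample mean. The counts are copied from
# table(inducted$nyears) above; the variable names are illustrative.
yrsOnBallot <- 1:15
nObserved   <- c(43, 10, 8, 7, 8, 4, 2, 4, 6, 3, 3, 1, 4, 1, 2)
nPlayers    <- sum(nObserved)                         # players elected by the BBWAA
lambdaHat   <- sum(yrsOnBallot * nObserved) / nPlayers  # mean years on the ballot
fittedPois  <- nPlayers * dpois(0:15, lambdaHat)      # expected counts for 0..15 years
data.frame(count = 0:15, observed = c(0, nObserved),
           fitted = round(fittedPois, 6))
# The first-ballot count (43 observed) is far above its fitted value, which
# suggests that a single Poisson distribution describes these data poorly.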
plot(goodfit(inducted$nyears), xlab='Number of years',
    main="Rootogram of Poisson fit to number of years voting until election")


Ord_plot(table(inducted$nyears), xlab='Number of years')
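
# What an Ord plot is built from: a sketch of the ratios k * n[k] / n[k-1]
# plotted against k, where n[k] is the number of players elected after k
# years. For Poisson counts these ratios are roughly constant, so a clear
# trend points to another distribution. The counts are copied from
# table(inducted$nyears) above. The unweighted lm() fit is a simplification
# chosen for illustration and is not claimed to match the line Ord_plot draws.
ordCounts <- c(43, 10, 8, 7, 8, 4, 2, 4, 6, 3, 3, 1, 4, 1, 2)
ordData <- data.frame(
    k     = 2:15,
    ratio = (2:15) * ordCounts[-1] / ordCounts[-length(ordCounts)])
ordFit <- lm(ratio ~ k, data = ordData)   # straight-line summary of the trend
ordData
coef(ordFit)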





# First ballot inductees:
subset(playerOutcomesHOF, nyears == 1L & induct == 'Y')
##           hofID nyears induct lastPct lastYear
## 1    aaronha01h      1      Y    97.8     1982
## 39   bankser01h      1      Y    83.8     1977
## 58   benchjo01h      1      Y    96.4     1989
## 84   boggswa01h      1      Y    91.9     2005
## 102  brettge01h      1      Y    98.2     1999
## 107  brocklo01h      1      Y    79.7     1985
## 147  carewro01h      1      Y    90.5     1991
## 149  carltst01h      1      Y    95.6     1994
## 182   cobbty01h      1      Y    98.2     1936
## 274  eckerde01h      1      Y    83.2     2004
## 294  fellebo01h      1      Y    93.8     1962
## 342  gibsobo01h      1      Y    84.0     1981
## 384  gwynnto01h      1      Y    97.6     2007
## 411  henderi01h      1      Y    94.8     2009
## 460  jacksre01h      1      Y    93.6     1993
## 473  johnswa01h      1      Y    83.6     1936
## 489  kalinal01h      1      Y    88.3     1980
## 520  koufasa01h      1      Y    86.9     1972
## 583  mantlmi01h      1      Y    88.2     1974
## 598  mathech01h      1      Y    90.7     1936
## 610   mayswi01h      1      Y    94.7     1979
## 616  mccovwi01h      1      Y    81.4     1986
## 658  molitpa01h      1      Y    85.2     2004
## 669  morgajo02h      1      Y    81.8     1990
## 683  murraed02h      1      Y    85.3     2003
## 686  musiast01h      1      Y    93.2     1969
## 721  palmeji01h      1      Y    92.6     1990
## 760  puckeki01h      1      Y    82.1     2001
## 793  ripkeca01h      1      Y    98.5     2007
## 801  robinbr01h      1      Y    92.0     1983
## 802  robinfr02h      1      Y    89.2     1982
## 803  robinja02h      1      Y    77.5     1962
## 822   ruthba01h      1      Y    95.1     1936
## 824   ryanno01h      1      Y    98.8     1999
## 844  schmimi01h      1      Y    96.5     1995
## 854  seaveto01h      1      Y    98.8     1992
## 886  smithoz01h      1      Y    91.7     2002
## 903  stargwi01h      1      Y    82.4     1988
## 985  wagneho01h      1      Y    95.1     1936
## 1027 willite01h      1      Y    93.4     1966
## 1033 winfida01h      1      Y    84.5     2001
## 1046 yastrca01h      1      Y    94.6     1989
## 1054 yountro01h      1      Y    77.5     1999

# Who took at least ten years on the ballot before induction?
# (Includes Bert Blyleven, who was inducted in 2011.)
subset(playerOutcomesHOF, nyears >= 10L & induct == 'Y')
##          hofID nyears induct lastPct lastYear
## 80  blylebe01h     14      Y    79.7     2011
## 93  boudrlo01h     10      Y    77.3     1970
## 210 cronijo01h     10      Y    78.8     1956
## 264 drysddo01h     10      Y    78.4     1984
## 402 hartnga01h     11      Y    77.7     1955
## 407 heilmha01h     11      Y    86.8     1952
## 506 kinerra01h     13      Y    75.4     1975
## 548 lemonbo01h     12      Y    78.6     1976
## 585 maranra01h     13      Y    82.9     1954
## 785  riceji01h     15      Y    76.4     2009
## 889 snidedu01h     11      Y    86.5     1980
## 926 suttebr01h     13      Y    76.9     2006
## 938 terrybi01h     13      Y    77.4     1954
## 971 vanceda01h     15      Y    81.7     1955

############################################################
## Plots of voting percentages over time for the borderline
## HOF candidates, according to the BBWAA:

# (1) Set up the data:
longTimers <- as.character(unlist(subset(playerOutcomesHOF,
                                         nyears >= 10, select = 'hofID')))
HOFlt <- subset(HallOfFame, hofID %in% longTimers & votedBy == 'BBWAA')
HOFlt <- ddply(HOFlt, .(hofID), mutate,
                  elected = ifelse(any(inducted == 'Y'),"Elected", "Not elected"),
                  pct = 100 * round(votes/ballots, 3))

# Plot the voting profiles:
ggplot(HOFlt, aes(x = yearID, y = pct,
                  group = hofID)) +
    ggtitle("Profiles of voting percentage for long-time HOF candidates") +
    geom_line() +
    geom_hline(yintercept = 75, col = 'red') +
    labs(x = "Year", y = "Percentage of votes") +
    facet_wrap(~ elected, ncol = 1)



# Note: In these data, all but one of the players whose maximum voting
# percentage was over 60% but who were not elected by the BBWAA have
# eventually been inducted into the HOF. Red Ruffing was elected in a 1967
# runoff election, while the others were voted in by the Veterans Committee.
# The lone exception is Gil Hodges; his profile is the one that flatlines
# around 60% for several years in the late 70s and early 80s.


[Package Lahman version 2.0-1 Index]