Teams {Lahman} | R Documentation |
Yearly statistics and standings for teams
data(Teams)
A data frame with 2715 observations on the following 48 variables.
yearID
Year
lgID
League; a factor with levels AA AL FL NL PL UA
teamID
Team; a factor
franchID
Franchise (links to TeamsFranchises table)
divID
Team's division; a factor with levels C E W
Rank
Position in final standings
G
Games played
Ghome
Games played at home
W
Wins
L
Losses
DivWin
Division Winner (Y or N)
WCWin
Wild Card Winner (Y or N)
LgWin
League Champion (Y or N)
WSWin
World Series Winner (Y or N)
R
Runs scored
AB
At bats
H
Hits by batters
X2B
Doubles
X3B
Triples
HR
Home runs by batters
BB
Walks by batters
SO
Strikeouts by batters
SB
Stolen bases
CS
Caught stealing
HBP
Batters hit by pitch
SF
Sacrifice flies
RA
Opponents' runs scored
ER
Earned runs allowed
ERA
Earned run average
CG
Complete games
SHO
Shutouts
SV
Saves
IPouts
Outs pitched (innings pitched x 3)
HA
Hits allowed
HRA
Home runs allowed
BBA
Walks allowed
SOA
Strikeouts by pitchers
E
Errors
DP
Double plays
FP
Fielding percentage
name
Team's full name
park
Name of team's home ballpark
attendance
Home attendance total
BPF
Three-year park factor for batters
PPF
Three-year park factor for pitchers
teamIDBR
Team ID used by Baseball Reference website
teamIDlahman45
Team ID used in Lahman database version 4.5
teamIDretro
Team ID used by Retrosheet
Variables X2B and X3B are named 2B and 3B in the original database.
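Several columns need a small conversion before use: IPouts counts outs rather than innings, and doubles and triples are held in X2B and X3B. The following is a minimal sketch of such conversions, not part of the package's own examples. It uses a small made-up data frame (toy, a hypothetical name) in place of Teams so that it runs without the package installed; the derived column names IP, ERAcalc, and X1B are likewise illustrative assumptions.

```r
# Toy stand-in for a few Teams columns; all values are made up for illustration.
toy <- data.frame(
  H      = c(1400, 1500),
  X2B    = c(250, 300),
  X3B    = c(30, 40),
  HR     = c(150, 200),
  IPouts = c(4350, 4374),
  ER     = c(600, 729)
)

toy <- within(toy, {
  IP      <- IPouts / 3            # innings pitched, recovered from outs
  ERAcalc <- 9 * ER / IP           # earned runs allowed per nine innings
  X1B     <- H - X2B - X3B - HR    # singles: hits that were not extra-base hits
})

toy[, c("IP", "ERAcalc", "X1B")]
```

With the real data, the same within() call could be applied to Teams directly, since it only refers to columns documented above.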
Lahman, S. (2010) Lahman's Baseball Database, 1871-2010, v.5.8, http://baseball1.com/statistics/
data(Teams)
# keep only AL and NL teams, for seasons after 1900
teams <- subset(Teams, lgID %in% c("AL", "NL"))
teams <- subset(teams, yearID>1900)
# drop some variables
teams <- subset(teams, select=-c(Ghome,divID,DivWin:WSWin,name,park,teamIDBR:teamIDretro))
teams <- subset(teams, select=-c(HBP,CS,BPF,PPF))
# subset to remove infrequent teams
tcount <- table(teams$teamID)
teams <- subset(teams, teams$teamID %in% names(tcount)[tcount>15], drop=TRUE)
teams$teamID <- factor(teams$teamID, levels=names(tcount)[tcount>15])
# relevel lgID
teams$lgID <- factor(teams$lgID, levels= c("AL", "NL"))
# create new variables
teams <- within(teams, {
  WinPct <- W / G   # winning percentage
})
library(lattice)
xyplot(attendance/1000 ~ WinPct|yearID, groups=lgID, data=subset(teams, yearID>1980),
       type=c("p", "r"), col=c("red","blue"))
## Not run:
##D if(require(googleVis)) {
##D motion1 <- gvisMotionChart(teams, idvar='teamID', timevar='yearID',
##D chartid="gvisTeams", options=list(width=700, height=600))
##D plot(motion1)
##D #print(motion1, file="gvisTeams.html")
##D
##D #### merge with average salary, for those years where salary is available
##D
##D avesal <- aggregate(salary ~ yearID + teamID, data=Salaries, FUN=mean)
##D
##D # salary data just starts after 1980
##D teamsSal <- subset(teams, yearID>=1980)
##D
##D # add salary to team data
##D teamsSal <- merge(teamsSal,
##D avesal[,c("yearID", "teamID", "salary")],
##D by=c("yearID", "teamID"), all.x=TRUE)
##D
##D motion2 <- gvisMotionChart(teamsSal, idvar='teamID', timevar='yearID',
##D xvar="attendance", yvar="salary", sizevar="WinPct",
##D chartid="gvisTeamsSal", options=list(width=700, height=600))
##D plot(motion2)
##D #print(motion2, file="gvisTeamsSal.html")
##D
##D }
## End(Not run)
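The salary step in the not-run example first averages salaries per team and season with aggregate(), then attaches them with merge(..., all.x=TRUE), so that team-seasons without salary data are kept with NA rather than dropped. A minimal sketch of that pattern on made-up data follows; teams_toy, sal_toy, the team IDs "AAA" and "BBB", and all values are hypothetical and stand in for teams and Salaries.

```r
# Toy team-season table; IDs and values are made up for illustration.
teams_toy <- data.frame(
  yearID = c(1985, 1985, 1986),
  teamID = c("AAA", "BBB", "AAA"),
  W      = c(80, 90, 85)
)

# Toy player-level salaries; there are no rows for the 1986 season.
sal_toy <- data.frame(
  yearID = c(1985, 1985, 1985),
  teamID = c("AAA", "AAA", "BBB"),
  salary = c(100, 300, 500)
)

# Mean salary per team and season
avesal <- aggregate(salary ~ yearID + teamID, data = sal_toy, FUN = mean)

# Left join: every row of teams_toy is kept; unmatched rows get salary = NA
merged <- merge(teams_toy, avesal, by = c("yearID", "teamID"), all.x = TRUE)
merged
```

Dropping all.x=TRUE would instead keep only the team-seasons that have a matching salary row.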