Running Through the Data: Tempo Run: 5K by Runtracker

In the summer of 2020, I set a really simple goal for myself: run a 5K in under 30 minutes. At the time, I had just completed the Couch to 5K (C25K) program and was able to complete the distance in around 32-33 minutes, but I couldn’t seem to get much quicker than that and wanted to see if a different training plan would help. After some experimenting, I settled on the Tempo Run: 5K plan on the Runtracker app to help me break the 30-minute mark.

Runtracker is an app made by the Fitness 22 company, featuring a series of running plans tailored to individuals’ current fitness levels and goals. Since I could already run the 5K distance and ran about three times a week, the app recommended the “Tempo Run: 5K” plan. The plan consisted of three runs per week for eight weeks, with the same structure each week. The first run of the week was an interval session, with intervals of various lengths throughout the program, while the second run was always a tempo run of steadily increasing duration. The third and final run each week was a 35-minute base run at an easy pace. This format remained consistent over all eight weeks and was built to progressively increase both mileage and intensity.

Tempo Run: 5K Training Plan, by Runtracker

My experience with this running plan was great for a variety of reasons. The most structured kind of running I had done before was the run/walk method used in Couch to 5K (C25K). Interval sessions, which included high-intensity running, easy-pace running, and walking, helped me build power and figure out pacing. Tempo sessions pushed me to find the gear between interval and easy pace, which helped me develop the habit of running the second half of my runs faster than the first (i.e. “negative splits”). The long, easy sessions on the weekends helped build confidence and efficiency. By the end of the program, I had taken minutes off my 5K time and had a much better understanding of pacing, which was the biggest takeaway for me. Many of the things I do now as a runner mirror the types of workouts I was first introduced to in this app, so this data has been fun to look at a few years removed.

Training Progress 

To get a better picture of my progress throughout the program, I focused on three primary variables: Pace, measured in miles per hour (mph); Distance, measured in miles; and Training Session, numbered 1 to 24 and completed in order. Running paces ranged from 5.09 to 6.58 mph (11:47 min/mile to 9:07 min/mile), with a mean of 5.83 mph (10:18 min/mile), while distances ranged from 2.4 to 5.43 miles, with a mean of 3.44 miles per run. Since these workouts are ordered by session, progress can be visualized through scatter plots. Below are plots of running distance and pace over the course of the 24 workout sessions. Notice how the spread of the data opens up as training progresses, especially with respect to distance. This “fanning effect” (non-constant variance, or heteroscedasticity) would normally be problematic in statistics, but in running it is often a desired feature of training:

Image by Author
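
For anyone who wants to check the mph and min/mile figures quoted above, here is a minimal sketch of the conversion in base R. The function name mph_to_pace() is my own label for illustration and is not part of the analysis code at the end of this post; it rounds to the nearest second, which is an assumption about how the paces were rounded.

```r
# Convert a speed in miles per hour to a pace string (minutes:seconds per mile).
# mph_to_pace() is a hypothetical helper, not part of the original script.
mph_to_pace <- function(mph) {
  total_seconds <- round(3600 / mph)   # seconds needed to cover one mile
  minutes <- total_seconds %/% 60      # whole minutes
  seconds <- total_seconds %% 60       # leftover seconds
  sprintf("%d:%02d", as.integer(minutes), as.integer(seconds))
}

# Slowest and fastest speeds reported above
mph_to_pace(c(5.09, 6.58))
# returns "11:47" "9:07"
```

Rounding the total number of seconds first, and only then splitting into minutes and seconds, avoids ever printing something like "9:60".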

Workout Type

As I mentioned above, the biggest takeaway of the program for me was my understanding of pacing. Interval sessions, tempo runs, and base runs require very different kinds of effort, all of which can improve performance. Interval sessions remained the most consistent with respect to running pace, but had the largest range and highest average number of miles run. Tempo runs and base runs remained relatively consistent in terms of mileage, with tempo runs having the widest range of paces along with the highest average running pace. These findings can be better visualized through the box plots below for both pace and distance:
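
Alongside the plots, the same comparison can be put into numbers. The sketch below is illustrative only: the rows are made up rather than taken from my training log, and spread() is a hypothetical helper name. The column names Workout, Distance, and Pace_MPH match the ones used in the code at the end of this post.

```r
# Illustrative values only: these rows are made up, not the actual training data.
runs <- data.frame(
  Workout  = c("Interval", "Interval", "Tempo", "Tempo", "Base", "Base"),
  Distance = c(2.5, 5.0, 3.0, 3.4, 3.3, 3.5),
  Pace_MPH = c(5.6, 5.8, 5.9, 6.5, 5.4, 5.6)
)

# Minimum, maximum, range (max - min), and mean of a numeric vector
spread <- function(x) {
  c(min = min(x), max = max(x), range = max(x) - min(x), mean = mean(x))
}

# One row per workout type: first for distance, then for pace
distance_by_workout <- t(sapply(split(runs$Distance, runs$Workout), spread))
pace_by_workout     <- t(sapply(split(runs$Pace_MPH, runs$Workout), spread))

distance_by_workout
pace_by_workout
```

In this made-up example, the interval rows have the widest distance range and the tempo rows have the widest pace range, which is the pattern described above.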

Comparing with C25K

In my previous blog post, we went through the data of the C25K program. Since both of these training plans were focused on the same distance, I thought it would be fun to compare progress side by side on the primary variable of interest, pace. The C25K program had a range of 4.01 to 5.51 mph, with an average of 4.79 mph, while the Tempo Run: 5K program had a range of 5.09 to 6.58 mph, with an average of 5.83 mph. Given that both programs had a sequential component (i.e. “training session”), these data can also be expressed as a regression. Below are box plots of running pace distributions (left) and scatter plots of running pace throughout training (right) for both programs. Notice how the Tempo Run: 5K program is noticeably higher on average than the C25K program, while the C25K program has a more positive slope. Since the Couch to 5K program is designed to take runners from sedentary to being able to complete a 3.1-mile run, there are naturally going to be much greater gains (i.e. a higher slope) in the beginning, with later improvements occurring more incrementally:

Image by Author
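
To make the "higher on average, but flatter slope" comparison concrete, here is a minimal sketch that fits a separate straight line to each program and pulls out the slope. The data are made up and noise-free, chosen only so the pattern is easy to see, and the Program labels are assumptions rather than the actual values in my data file.

```r
# Illustrative data only: made-up, noise-free paces, not the actual measurements.
session <- 1:24
plans <- data.frame(
  Program  = rep(c("C25K", "Tempo Run: 5K"), each = length(session)),
  Session  = rep(session, times = 2),
  Pace_MPH = c(4.0 + 0.06 * session,   # lower start, steeper improvement
               5.5 + 0.02 * session)   # higher start, flatter improvement
)

# Fit Pace_MPH ~ Session separately for each program and keep only the slope
slopes <- sapply(split(plans, plans$Program), function(d) {
  unname(coef(lm(Pace_MPH ~ Session, data = d))["Session"])
})

# Mean pace per program, for the "higher on average" half of the comparison
means <- tapply(plans$Pace_MPH, plans$Program, mean)

slopes
means
```

The slope is the average change in mph per training session, so a larger slope for C25K means faster improvement per session even though its average pace is lower.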

Final Thoughts

The Tempo Run: 5K plan on the Runtracker app is geared towards regular runners who can currently run a 5K and are interested in improving performance. The training structure consists of three runs per week: one interval session, one tempo run, and one 35-minute steady-state run. The variety of workouts in the program is designed to build both aerobic (endurance) and anaerobic (speed) capacity in runners. For anyone who is new to running, or hasn’t had structured training before, this program could be an excellent introduction.

Below are some links related to improving 5K times, along with the raw data and code used to create the charts and analysis. If you are interested in my experience with Couch to 5K, you can find that post here, and for my first half marathon, you can find that here.

Thanks for reading! 

Resources & Code:

# FRONT MATTER

### Note: The Faster5k.xlsx file will need to be converted to Faster5k.csv to read in correctly. Also, all packages can be installed using the install.packages() function. This only needs to be done once before loading.

# clean up (this clears out the previous environment)
rm(list = ls())

# Load Packages 
library(tidyverse)
library(wordcloud2)
library(mosaic)
library(readxl)
library(hrbrthemes)
library(viridis)

# Likert Data Packages
library(psych)
library(FSA)
library(lattice)
library(boot)
library(likert)

#install.packages("wordcloud")
library(wordcloud)
library(tm)


# Grid Extra for Multiplots
library("gridExtra")

# Multiple plot function (just copy paste code)

multiplot <- function(..., plotlist=NULL, file, cols=1, layout=NULL) {
  library(grid)

  # Make a list from the ... arguments and plotlist
  plots <- c(list(...), plotlist)

  numPlots = length(plots)

  # If layout is NULL, then use 'cols' to determine layout
  if (is.null(layout)) {
    # Make the panel
    # ncol: Number of columns of plots
    # nrow: Number of rows needed, calculated from # of cols
    layout <- matrix(seq(1, cols * ceiling(numPlots/cols)),
                    ncol = cols, nrow = ceiling(numPlots/cols))
  }

 if (numPlots==1) {
    print(plots[[1]])

  } else {
    # Set up the page
    grid.newpage()
    pushViewport(viewport(layout = grid.layout(nrow(layout), ncol(layout))))

    # Make each plot, in the correct location
    for (i in 1:numPlots) {
      # Get the i,j matrix positions of the regions that contain this subplot
      matchidx <- as.data.frame(which(layout == i, arr.ind = TRUE))

      print(plots[[i]], vp = viewport(layout.pos.row = matchidx$row,
                                      layout.pos.col = matchidx$col))
    }
  }
}



# FASTER 5K

# Import data from CSV, no factors

Faster5K <- read.csv("Faster5k.csv", stringsAsFactors = FALSE)

Faster5K <- Faster5K %>%
  na.omit()

Faster5K

# Plot 1 - Density Plot of Running Distances

p1 <- ggplot(Faster5K, aes(x=Distance)) + 
  geom_density(color="light blue", fill="Pink") + labs( x ="Distance (Miles)", y = "", title = "Running Distances",  subtitle = "Tempo Run: 5K Training Plan", caption = "Data source: TheDataRunner.com") +
  theme(plot.title = element_text(hjust = 0.5, size = 20, face = "bold"),
    plot.subtitle = element_text(hjust = 0.5, size = 12),
    plot.caption = element_text(hjust = 1, face = "italic"), 
    axis.text.y=element_blank(),
    axis.ticks.y=element_blank(),
    panel.background = element_blank())

p1

# Plot 2 - Density Plot of Running Speeds

p2 <- ggplot(Faster5K, aes(x=Pace_MPH)) + 
  geom_density(color="Pink", fill="light blue") + 
  labs( x ="Speed (Miles per Hour)", y = "", title = "Running Speeds",  subtitle = "Tempo Run: 5K Training Plan", caption = "Data source: TheDataRunner.com") +
  theme(plot.title = element_text(hjust = 0.5, size = 20, face = "bold"),
    plot.subtitle = element_text(hjust = 0.5, size = 12),
    plot.caption = element_text(hjust = 1, face = "italic"), 
    axis.text.y=element_blank(),
    axis.ticks.y=element_blank(),
    panel.background = element_blank())

p2

# Combine plots using multi-plot function:

multiplot( p1, p2, cols=1)

# Plot 3 - Scatter Plot of Running Distance over Time (with linear trend line)

p3 <- ggplot(Faster5K, aes(x= Session, y= Distance)) + geom_point(color="Purple") +  geom_smooth(method=lm , color="Green", se=TRUE) + labs(x ="Training Session", y = "Distance (Miles)", title = "Running Distance",  subtitle = "Tempo Run: 5K Training Plan", caption = "Data source: TheDataRunner.com") +
   theme(
    plot.title = element_text(hjust = 0.5, size = 20, face = "bold"),
    plot.subtitle = element_text(hjust = 0.5, size = 12), 
    plot.caption = element_text(hjust = 1, face = "italic"),
    panel.background = element_blank())

p3

# Plot 4 - Scatter Plot of Running Speed over Time (with linear trend line)

p4<- ggplot(Faster5K, aes(x=Session, y= Pace_MPH)) + geom_point(color="Green") +  geom_smooth(method=lm , color="Purple", se=TRUE) + labs( x ="Training Session", y = "Speed (Miles per Hour)", title = "Running Speed",  subtitle = "Tempo Run: 5K Training Plan", caption = "Data source: TheDataRunner.com") +
  theme(
    plot.title = element_text(hjust = 0.5, size = 20, face = "bold"),
    plot.subtitle = element_text(hjust = 0.5, size = 12), 
    plot.caption = element_text(hjust = 1, face = "italic"),
    panel.background = element_blank())

p4

# Combine plots using multi-plot function
multiplot( p3, p4, cols=1)

# Summary Statistics of Distance
favstats(Faster5K$Distance)

# Summary Statistics of Pace
favstats(Faster5K$Pace_MPH)

# Pearson Product Correlation of Distance over Time (session)
cor.test(Faster5K$Session, Faster5K$Distance, method = "pearson")

# Pearson Product Correlation of Pace over Time (session)
cor.test(Faster5K$Session, Faster5K$Pace_MPH, method = "pearson")


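# Pearson Product Correlation of Pace over Time (session) for C25K
# Note: the C25K data frame is not created in this script, so this only runs if it has already been loaded
if (exists("C25K")) print(cor.test(C25K$Session, C25K$Pace_MPH, method = "pearson"))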

# Simple Linear Model of Distance & Session
Distance <- lm(Distance ~ Session, data = Faster5K)
summary(Distance)

# Simple Linear Model of Pace & Session
Speed <- lm(Pace_MPH ~ Session, data = Faster5K)
summary(Speed)


# Import data from CSV, no factors

Plans_5K <- read.csv("5K_Plans.csv",  stringsAsFactors = FALSE)

Plans_5K

# Plot 7 - Box Plot of Distance by Workout Type
p7 <- Faster5K %>%
  ggplot( aes(x=Workout, y= Distance, fill=Workout)) +
    geom_boxplot() +
    geom_jitter(color="Black", size=0.4, alpha=0.9) + 
  labs( x ="", y = "Distance (Miles)", title = "Distance by Workout",  subtitle = "Tempo Run: 5K Running Plan", caption = "Data source: TheDataRunner.com") +
  theme(plot.title = element_text(hjust = 0.5, size = 20, face = "bold"),
    plot.subtitle = element_text(hjust = 0.5, size = 12), 
    plot.caption = element_text(hjust = 1, face = "italic"),
    panel.background = element_blank(),
    legend.position = "none") +
    scale_fill_brewer(palette="Greens")
  

# Plot 8 - Box Plot of Speed by Workout Type
p8 <- Faster5K %>%
  ggplot( aes(x=Workout, y= Pace_MPH, fill=Workout)) +
  geom_boxplot() +
    geom_jitter(color="Black", size=0.4, alpha=0.9) + 
  labs( x ="", y = "Speed (Miles per Hour)", title = "Speed by Workout",  subtitle = "Tempo Run: 5K Running Plan", caption = "Data source: TheDataRunner.com") +
  theme(plot.title = element_text(hjust = 0.5, size = 20, face = "bold"),
    plot.subtitle = element_text(hjust = 0.5, size = 12), 
    plot.caption = element_text(hjust = 1, face = "italic"),
    panel.background = element_blank(),
    legend.position = "none") +
    scale_fill_brewer(palette="Purples")


# Combine plots using multi-plot function
multiplot( p7, p8, cols=1)

# Combine plots using multi-plot function
multiplot( p7, p8, cols=2)


# Combine plots using multi-plot function
multiplot( p1, p7, cols=2)


# Combine plots using multi-plot function
multiplot( p2, p8, cols=2)

# Mean Pace by Workout Type (values to average come first, grouping variable second)
aggregate(Faster5K$Pace_MPH, list(Faster5K$Workout), FUN=mean)


# Summarize Mean Distance & Pace by Workout Type
Faster5K  %>%
  group_by(Workout) %>%
  summarise_at(vars(Distance, Pace_MPH), list(Average = mean))

Plans_5K  %>%
  group_by(Program) %>%
  summarise_at(vars(Distance, Pace_MPH), list(Average = mean))

# Plot 5 - Box Plot of Speed by Training Program
p5 <- Plans_5K %>%
  ggplot( aes(x=Program, y= Pace_MPH, fill=Program)) +
  geom_boxplot() +
    geom_jitter(color="Black", size=0.4, alpha=0.9) + 
  labs( x ="Training Program", y = "Speed (Miles per Hour)", title = "Comparing Paces",  subtitle = "C25K & Tempo Run: 5K Training Plans", caption = "Data source: TheDataRunner.com") +
  theme(plot.title = element_text(hjust = 0.5, size = 20, face = "bold"),
    plot.subtitle = element_text(hjust = 0.5, size = 12), 
    plot.caption = element_text(hjust = 1, face = "italic"),
    panel.background = element_blank(),
    legend.position = "none") +
    scale_fill_brewer(palette="BuPu")

p5

# Plot 6 - Box Plot of Distance by Training Program
p6 <- Plans_5K %>%
  ggplot( aes(x=Program, y= Distance, fill=Program)) +
    geom_boxplot() +
    geom_jitter(color="Black", size=0.4, alpha=0.9) + 
  labs( x ="Training Program", y = "Distance (Miles)", title = "Comparing Distances",  subtitle = "C25K & Tempo Run: 5K Training Plans", caption = "Data source: TheDataRunner.com") +
  theme(plot.title = element_text(hjust = 0.5, size = 20, face = "bold"),
    plot.subtitle = element_text(hjust = 0.5, size = 12), 
    plot.caption = element_text(hjust = 1, face = "italic"),
    panel.background = element_blank(),
    legend.position = "none") +
    scale_fill_brewer(palette="PRGn")

p6


multiplot( p5, p6, cols=2)

t.test(Pace_MPH ~ Program, data = Plans_5K)

t.test(Distance ~ Program, data = Plans_5K)

# Plot

p10 <- ggplot(Plans_5K, aes(x=Session, y= Pace_MPH, color = Program )) + geom_point() +  geom_smooth(method=lm , se=TRUE,aes(color=Program)) + labs( x ="Training Session", y = "Speed (Miles per Hour)", title = "Pace Through Training",  subtitle = "C25K & Tempo Run: 5K Training Plans", caption = "Data source: TheDataRunner.com") +
  theme(
    plot.title = element_text(hjust = 0.5, size = 20, face = "bold"),
    plot.subtitle = element_text(hjust = 0.5, size = 12), 
    plot.caption = element_text(hjust = 1, face = "italic"),
    panel.background = element_blank()) + 
  scale_color_manual(values=c('blue', 'orange'))+
  theme(legend.position="none")


p10


multiplot( p5, p10, cols=2)

Author: Scott Atchison

I am a statistician and data scientist who enjoys writing, visualizing, and talking about data, especially when we can use it to answer interesting questions.
