Fix the ‘User Data Dir’ Error in R Selenider on Linux

Recently, I decided to move one of my web scraping scripts from Python over to R because, frankly, I’m much more comfortable in R. Everything ran perfectly on my Windows machine, but the moment I tried it on my Linux server, boom, I hit the dreaded:

“probably user data directory is already in use specify a unique value for user-data-dir, or don’t use it”

If you’ve been there, you know it’s frustrating. So let me walk you through exactly what I did, why the error happens, and how I fixed it with a more production-ready scraping setup.

My Starting Point

Here’s the code I started with, a pretty standard selenider + rvest scraper:

library(selenider)
library(rvest)

session <- selenider_session("selenium", browser = "chrome")
Sys.sleep(3)

open_url("https://egamersworld.com/callofduty/matches")

elements <- session |> get_page_source() |> html_elements(".item_teams__cKXQT")

res <- data.frame(
  home_team_name = elements |>
    html_elements(".item_team__evhUQ:nth-child(1) .item_teamName__NSnfH") |>
    html_text(trim = TRUE),
  home_team_odds = elements |>
    html_elements(".item_team__evhUQ:nth-child(1) .item_odd__Lm2Wl") |>
    html_text(trim = TRUE),
  away_team_name = elements |>
    html_elements(".item_team__evhUQ:nth-child(3) .item_teamName__NSnfH") |>
    html_text(trim = TRUE),
  away_team_odds = elements |>
    html_elements(".item_team__evhUQ:nth-child(3) .item_odd__Lm2Wl") |>
    html_text(trim = TRUE),
  match_date = elements |>
    html_elements(".item_scores__Vi7YX .item_date__g4cq_") |>
    html_text(trim = TRUE),
  match_time = elements |>
    html_elements(".item_scores__Vi7YX .item_time__xBia_") |>
    html_text(trim = TRUE),
  match_type = elements |>
    html_elements(".item_scores__Vi7YX .item_bo__u2C9Q") |>
    html_text(trim = TRUE)
)

This happily pulled match data on Windows. On Linux? Not so much.

Why Linux Throws This Error

Here’s the thing: Chrome uses something called the user data directory to store your profile, cookies, cache, and extensions. On Linux, if:

  • multiple Chrome/Chromedriver processes try to use the same directory, or
  • a previous run crashed and left a lock file

then Chrome refuses to start. Selenium tries to launch Chrome, Chrome says “nope,” and you get the error.
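
You can often see the culprit directly. A crashed run leaves Chrome’s singleton lock files behind in the profile directory, and a quick check like this shows whether they are still there (the path assumes Chrome’s default profile location on Linux, so adjust it for Chromium or a custom install):

# Hypothetical check for leftover lock files from a crashed Chrome run
# (path assumes the default profile location ~/.config/google-chrome)
profile_dir <- "~/.config/google-chrome"
lock_files <- file.path(profile_dir, c("SingletonLock", "SingletonSocket", "SingletonCookie"))
file.exists(lock_files)

# Remove stale locks only if no Chrome process is actually running
# unlink(lock_files)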

Two key points I learned:

  1. --user-data-dir is a Chrome flag, not a Selenium server flag. If you pass it to Selenium’s server options, nothing happens; it has to end up in Chrome’s own options (see the sketch after this list).
  2. Python’s Selenium bindings often create a temporary profile automatically. R’s selenider wasn’t doing that here; it was trying to reuse the default Chrome profile.
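
To make point 1 concrete, here is roughly where the flag has to travel in a WebDriver session: inside the goog:chromeOptions entry of the capabilities payload, not among the Selenium server’s own arguments. A minimal sketch (the profile path is just a placeholder):

# Sketch of the W3C capabilities structure that carries Chrome flags.
# The profile path below is a hypothetical placeholder.
caps <- list(
  "goog:chromeOptions" = list(
    args = list("--user-data-dir=/tmp/chrome-profile-xyz", "--headless=new")
  )
)

This is the shape the improved script below builds, just wrapped in selenider’s option helpers.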

A Unique Chrome Profile

The trick is simple: create a new temporary profile directory every time the script runs and tell Chrome to use it. While I was at it, I added some extras so it runs nicely on a headless Linux server or in Docker:

  • --headless=new for modern headless mode
  • --no-sandbox and --disable-dev-shm-usage for low-memory or container setups
  • explicit waits so JavaScript-rendered elements load
  • cleanup of temporary profile directories

The Improved, Linux-Safe Script

library(selenider)
library(rvest)
library(glue)
library(withr)

# Create a unique Chrome profile directory
chrome_profile <- tempfile("chrome-profile-")
dir.create(chrome_profile, recursive = TRUE, showWarnings = FALSE)

# Chrome arguments (passed to Chrome, not Selenium server)
chrome_args <- c(
  glue("--user-data-dir={chrome_profile}"),
  "--headless=new",
  "--no-sandbox",
  "--disable-dev-shm-usage",
  "--disable-gpu",
  "--window-size=1920,1080"
)

# Selenium options: the Chrome arguments travel inside the browser's
# capabilities (goog:chromeOptions), not the Selenium server's arguments
selenium_opts <- selenium_options(
  client_options = selenium_client_options(
    capabilities = list(
      "goog:chromeOptions" = list(args = as.list(chrome_args))
    )
  )
)

# Clean up on exit; withr::defer() works at the top level of a script
# (unlike on.exit()), and the handler runs when the R session ends
withr::defer({
  try(close_session(), silent = TRUE)
  try(unlink(chrome_profile, recursive = TRUE, force = TRUE), silent = TRUE)
}, envir = globalenv())

# Start session
session <- selenider_session(
  "selenium",
  browser = "chrome",
  options = selenium_opts
)

# Go to the page
open_url("https://egamersworld.com/callofduty/matches")

# Wait until the JavaScript-rendered match cards are present
s(".item_teams__cKXQT") |>
  elem_expect(is_present, timeout = 10)

# Parse the page
src <- get_page_source()
elements <- html_elements(src, ".item_teams__cKXQT")

res <- data.frame(
  home_team_name = elements |>
    html_elements(".item_team__evhUQ:nth-child(1) .item_teamName__NSnfH") |>
    html_text(trim = TRUE),
  home_team_odds = elements |>
    html_elements(".item_team__evhUQ:nth-child(1) .item_odd__Lm2Wl") |>
    html_text(trim = TRUE),
  away_team_name = elements |>
    html_elements(".item_team__evhUQ:nth-child(3) .item_teamName__NSnfH") |>
    html_text(trim = TRUE),
  away_team_odds = elements |>
    html_elements(".item_team__evhUQ:nth-child(3) .item_odd__Lm2Wl") |>
    html_text(trim = TRUE),
  match_date = elements |>
    html_elements(".item_scores__Vi7YX .item_date__g4cq_") |>
    html_text(trim = TRUE),
  match_time = elements |>
    html_elements(".item_scores__Vi7YX .item_time__xBia_") |>
    html_text(trim = TRUE),
  match_type = elements |>
    html_elements(".item_scores__Vi7YX .item_bo__u2C9Q") |>
    html_text(trim = TRUE),
  stringsAsFactors = FALSE
)

print(glue("Scraped {nrow(res)} rows"))

# Save results
outfile <- sprintf("egamers_callofduty_matches_%s.csv", format(Sys.time(), "%Y%m%d_%H%M%S"))
write.csv(res, outfile, row.names = FALSE)
message(glue("Saved to {outfile}"))

What’s Better Now

  • No more “user data dir already in use”
    Every run gets a fresh profile.
  • Truly headless
    Works without a display server, perfect for cron jobs.
  • Safe in Docker
    Flags like --no-sandbox prevent Chrome crashes in containers.
  • Dynamic content handled
    Explicit waits make sure JavaScript has done its thing before scraping.
  • Clean exit
    Temp profiles are deleted automatically.

Bonus: No Selenium at All

If you don’t need Selenium, selenider also supports Chromote, which talks directly to Chrome’s DevTools protocol. No driver, no --user-data-dir headaches:

library(selenider)
library(rvest)

session <- selenider_session("chromote")
open_url("https://egamersworld.com/callofduty/matches")

# Wait until the JavaScript-rendered match cards are present
s(".item_teams__cKXQT") |>
  elem_expect(is_present, timeout = 10)

src <- get_page_source()
elements <- html_elements(src, ".item_teams__cKXQT")

This often works out of the box on Linux.

Final thought

Moving scraping jobs to Linux often reveals weird edge cases you never hit on Windows; this --user-data-dir issue was one of them. The fix turned out to be straightforward once I understood that Chrome itself, not Selenium, needed the argument.

Now my script runs quietly in the background on a schedule, reliably pulling match data without choking on profile locks. If you’re deploying your own selenider scrapers to Linux, start with a unique profile per run; you’ll save yourself a lot of head-scratching later.
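
If you’d rather set up that schedule from R instead of editing the crontab by hand, the cronR package can register the script for you. A minimal sketch, assuming the script is saved as scrape_matches.R (the path, time, and id below are placeholders):

# Scheduling sketch using the cronR package (assumed installed).
# The script path, run time, and job id are hypothetical placeholders.
library(cronR)

cmd <- cron_rscript("/home/me/scripts/scrape_matches.R")
cron_add(cmd, frequency = "daily", at = "07:00",
         id = "cod_matches", description = "Scrape Call of Duty match odds")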
