Recently, I decided to move one of my web scraping scripts from Python over to R because, frankly, I’m much more comfortable in R. Everything ran perfectly on my Windows machine, but the moment I tried it on my Linux server, boom, I hit the dreaded:
“probably user data directory is already in use specify a unique value for user-data-dir, or don’t use it”
If you’ve been there, you know it’s frustrating. So let me walk you through exactly what I did, why the error happens, and how I fixed it with a more production-ready scraping setup.
My Starting Point
Here’s the code I started with: pretty standard selenider + rvest scraping:
library(selenider)
library(rvest)
session <- selenider_session("selenium", browser = "chrome")
Sys.sleep(3)
open_url("https://egamersworld.com/callofduty/matches")
elements <- session |> get_page_source() |> html_elements(".item_teams__cKXQT")
res <- data.frame(
home_team_name = elements |>
html_elements(".item_team__evhUQ:nth-child(1) .item_teamName__NSnfH") |>
html_text(trim = TRUE),
home_team_odds = elements |>
html_elements(".item_team__evhUQ:nth-child(1) .item_odd__Lm2Wl") |>
html_text(trim = TRUE),
away_team_name = elements |>
html_elements(".item_team__evhUQ:nth-child(3) .item_teamName__NSnfH") |>
html_text(trim = TRUE),
away_team_odds = elements |>
html_elements(".item_team__evhUQ:nth-child(3) .item_odd__Lm2Wl") |>
html_text(trim = TRUE),
match_date = elements |>
html_elements(".item_scores__Vi7YX .item_date__g4cq_") |>
html_text(trim = TRUE),
match_time = elements |>
html_elements(".item_scores__Vi7YX .item_time__xBia_") |>
html_text(trim = TRUE),
match_type = elements |>
html_elements(".item_scores__Vi7YX .item_bo__u2C9Q") |>
html_text(trim = TRUE)
)
This happily pulled match data on Windows. On Linux? Not so much.
Why Linux Throws This Error
Here’s the thing: Chrome uses something called the user data directory to store your profile, cookies, cache, and extensions. On Linux, if:
- multiple Chrome/Chromedriver processes try to use the same directory, or
- a previous run crashed and left a lock file
then Chrome refuses to start. Selenium tries to launch Chrome, Chrome says “nope,” and you get the error.
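If you want to check whether a stale lock is what’s blocking you, a quick diagnostic from R looks something like this. The path is an assumption: Chrome’s default profile on Linux usually lives in ~/.config/google-chrome (Chromium uses ~/.config/chromium), so adjust it to your setup:
# Quick diagnostic sketch: look for Chrome's profile lock in the default user data dir
# (path is an assumption; adjust for Chromium or a custom profile location)
lock_file <- path.expand("~/.config/google-chrome/SingletonLock")
# SingletonLock is a symlink whose target encodes "<hostname>-<pid>" of the Chrome
# process holding the profile; if that process is long gone, the lock is stale
lock_target <- Sys.readlink(lock_file)
if (!is.na(lock_target) && nzchar(lock_target)) {
  message("Profile lock present: ", lock_file, " -> ", lock_target)
} else {
  message("No profile lock found")
}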
Two key points I learned:
- --user-data-dir is a Chrome flag, not a Selenium server flag. If you pass it to Selenium’s server options, nothing happens (see the sketch below for where it actually has to go).
- Python Selenium often creates a temporary profile automatically. R’s selenider wasn’t doing that here; it was trying to reuse the default Chrome profile.
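To make the first point concrete, here’s a minimal sketch of where the flag ultimately has to land: inside the goog:chromeOptions capability that chromedriver forwards to Chrome. The exact way you attach this list to a session varies by client package and version; the shape of the list is the part Chrome actually sees:
# Sketch: a per-run profile plus the capability payload chromedriver expects.
# How you hand this list to your Selenium client differs between packages;
# the "goog:chromeOptions" -> args structure is what reaches Chrome.
chrome_profile <- tempfile("chrome-profile-")
caps <- list(
  "goog:chromeOptions" = list(
    args = list(
      paste0("--user-data-dir=", chrome_profile),  # a Chrome flag, not a server flag
      "--headless=new"
    )
  )
)
str(caps)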
A Unique Chrome Profile
The trick is simple: create a new temporary profile directory every time the script runs, and tell Chrome to use it. While we’re at it, I added some extras so it runs nicely on a headless Linux server or in Docker:
- --headless=new for modern headless mode
- --no-sandbox and --disable-dev-shm-usage for low-memory or container setups
- explicit waits so JavaScript-rendered elements load
- cleanup of temporary profile directories
The Improved, Linux-Safe Script
library(selenider)
library(rvest)
library(glue)
library(withr)
# Create a unique Chrome profile directory
chrome_profile <- tempfile("chrome-profile-")
dir.create(chrome_profile, recursive = TRUE, showWarnings = FALSE)
# Chrome arguments (passed to Chrome, not Selenium server)
chrome_args <- c(
glue("--user-data-dir={chrome_profile}"),
"--headless=new",
"--no-sandbox",
"--disable-dev-shm-usage",
"--disable-gpu",
"--window-size=1920,1080"
)
# Selenium options: browser arguments have to reach Chrome itself, so they go
# into the "goog:chromeOptions" capability that chromedriver forwards to Chrome,
# not into the Selenium server's own options
session_options <- selenium_options(
  client_options = selenium_client_options(
    capabilities = list(
      "goog:chromeOptions" = list(args = as.list(chrome_args))
    )
  )
)
# Clean up when the session ends. on.exit() only works inside a function,
# so for a top-level script register the cleanup with withr::defer() instead.
defer({
  try(close_session(), silent = TRUE)
  try(unlink(chrome_profile, recursive = TRUE, force = TRUE), silent = TRUE)
}, envir = globalenv())
# Start session
session <- selenider_session(
  "selenium",
  browser = "chrome",
  options = session_options
)
# Go to the page
open_url("https://egamersworld.com/callofduty/matches")
# Wait for the dynamic (JavaScript-rendered) match cards to appear
s(".item_teams__cKXQT") |>
  elem_wait_until(is_present, timeout = 10)
# Parse the page
src <- get_page_source()
elements <- html_elements(src, ".item_teams__cKXQT")
res <- data.frame(
home_team_name = elements |>
html_elements(".item_team__evhUQ:nth-child(1) .item_teamName__NSnfH") |>
html_text(trim = TRUE),
home_team_odds = elements |>
html_elements(".item_team__evhUQ:nth-child(1) .item_odd__Lm2Wl") |>
html_text(trim = TRUE),
away_team_name = elements |>
html_elements(".item_team__evhUQ:nth-child(3) .item_teamName__NSnfH") |>
html_text(trim = TRUE),
away_team_odds = elements |>
html_elements(".item_team__evhUQ:nth-child(3) .item_odd__Lm2Wl") |>
html_text(trim = TRUE),
match_date = elements |>
html_elements(".item_scores__Vi7YX .item_date__g4cq_") |>
html_text(trim = TRUE),
match_time = elements |>
html_elements(".item_scores__Vi7YX .item_time__xBia_") |>
html_text(trim = TRUE),
match_type = elements |>
html_elements(".item_scores__Vi7YX .item_bo__u2C9Q") |>
html_text(trim = TRUE),
stringsAsFactors = FALSE
)
print(glue("Scraped {nrow(res)} rows"))
# Save results
outfile <- sprintf("egamers_callofduty_matches_%s.csv", format(Sys.time(), "%Y%m%d_%H%M%S"))
write.csv(res, outfile, row.names = FALSE)
message(glue("Saved to {outfile}"))
What’s Better Now
- No more “user data dir already in use”: every run gets a fresh profile.
- Truly headless: works without a display server, perfect for cron jobs (see the example after this list).
- Safe in Docker: flags like --no-sandbox prevent Chrome crashes in containers.
- Dynamic content handled: explicit waits make sure JavaScript has done its thing before scraping.
- Clean exit: temp profiles are deleted automatically.
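Since the whole point of going headless is unattended runs, here’s roughly how a scheduled job can be wired up. This is only an illustrative crontab entry; the schedule, paths, and script name are placeholders, and it assumes Rscript is on the PATH that cron sees:
# Hypothetical crontab entry: run the scraper at the top of every hour, append output to a log
0 * * * * Rscript /home/deploy/scrape_matches.R >> /home/deploy/scrape_matches.log 2>&1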
Bonus: no Selenium at all
If you don’t need Selenium, selenider also supports chromote, which talks directly to Chrome’s DevTools Protocol. No driver, no --user-data-dir headaches:
library(selenider)
library(rvest)
session <- selenider_session("chromote")
open_url("https://egamersworld.com/callofduty/matches")
s(".item_teams__cKXQT") |>
  elem_wait_until(is_present, timeout = 10)
src <- get_page_source()
elements <- html_elements(src, ".item_teams__cKXQT")
This often works out of the box on Linux.
Final thought
Moving scraping jobs to Linux often reveals weird edge cases you never hit on Windows, and this --user-data-dir issue was one of them. The fix turned out to be straightforward once I understood that Chrome itself, not Selenium, needed the argument.
Now my script runs quietly in the background on a schedule, reliably pulling match data without choking on profile locks. If you’re deploying your own selenider scrapers to Linux, start with a unique profile per run; you’ll save yourself a lot of head-scratching later.