How I Handled the “Invalid Grant” Error in My Ruby Service

I’ve been working on a Rails application that integrates with the NetDocuments API, where multiple users share a single service account. The API issues single-use refresh tokens: each one can be exchanged for a new access token only once. But every few months, we’d hit a frustrating error: unexpected token at 'Invalid grant'. That message is what Ruby’s JSON parser reports when it is handed the plain-text body Invalid grant instead of JSON.

After digging into the logs, I realized the root cause: race conditions. When multiple threads or processes tried to refresh the token simultaneously, the first request would invalidate the refresh token, causing subsequent requests to fail. To make matters worse, our token expiration logic was flawed: we hardcoded a 1-hour expiry instead of using the API’s expires_in value.
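To make the failure mode concrete, here is a minimal sketch of the race. FakeTokenServer is a hypothetical in-memory stand-in, not the NetDocuments API, and the interleaving is written out sequentially so the result is deterministic: two workers read the same stored refresh token, and whichever one refreshes second is rejected.

```ruby
require "securerandom"

# Hypothetical stand-in for an OAuth server that issues single-use refresh tokens.
class FakeTokenServer
  InvalidGrant = Class.new(StandardError)

  attr_reader :current_refresh_token

  def initialize
    @current_refresh_token = SecureRandom.hex(8)
  end

  # Exchanges a refresh token for a new token pair and invalidates the old one.
  def refresh(refresh_token)
    raise InvalidGrant, "Invalid grant" unless refresh_token == @current_refresh_token

    @current_refresh_token = SecureRandom.hex(8)
    { "access_token" => SecureRandom.hex(8),
      "refresh_token" => @current_refresh_token,
      "expires_in" => 1800 }
  end
end

server = FakeTokenServer.new

# Both workers read the stored refresh token before either one has refreshed.
token_seen_by_a = server.current_refresh_token
token_seen_by_b = server.current_refresh_token

first_result = server.refresh(token_seen_by_a) # succeeds and rotates the token

second_error =
  begin
    server.refresh(token_seen_by_b) # stale token: rejected
    nil
  rescue FakeTokenServer::InvalidGrant => e
    e
  end

puts "worker A got a new token pair: #{first_result.key?("access_token")}"
puts "worker B failed with: #{second_error.message}"
```

Worker B did nothing wrong on its own; it simply acted on a token that worker A had already spent.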

Prevent Race Conditions with Database Locks

The core issue was concurrent token refreshes. To solve this, I used database row-level locking to ensure only one thread/process could refresh the token at a time.

def self.netdocuments_credential
  # find_by returns nil when no row matches; it never raises RecordNotFound
  credential = Credential.find_by(kind: 'netdocuments')
  unless credential
    Rails.logger.error("NetDocuments credential not found.")
    return
  end

  # with_lock opens a transaction and reloads the row with a lock
  # (SELECT ... FOR UPDATE), so the check below sees the latest data
  credential.with_lock do
    refresh_credential(credential) if credential.expired?
  end

  credential
end

Why this works:

  • with_lock locks the database row inside a transaction, so any other thread or process that calls with_lock on the same row waits until the lock is released.
  • with_lock also reloads the record as it acquires the lock, so the expired? check runs against the freshest data. A caller that waited for the lock sees the token the previous caller just stored and skips the refresh.
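The lock-then-recheck idea can be tried without a database. The sketch below is an illustration only: SharedCredential is a hypothetical in-memory object, and a Mutex plays the role of the row lock. Ten threads ask for a fresh token at once, yet only the first one through the lock performs the refresh.

```ruby
# Hypothetical in-memory stand-in for the credentials row; a Mutex plays the
# role of the database row lock.
class SharedCredential
  attr_reader :refresh_count

  def initialize
    @lock = Mutex.new
    @expires_at = Time.now - 1 # start out expired
    @refresh_count = 0
  end

  def expired?
    Time.now >= @expires_at
  end

  # Take the lock, then re-check expiry: a caller that waited for the lock
  # finds a fresh token and skips the refresh.
  def ensure_fresh
    @lock.synchronize do
      refresh! if expired?
    end
  end

  private

  # Placeholder for the real token request.
  def refresh!
    sleep 0.01 # simulate network latency
    @refresh_count += 1
    @expires_at = Time.now + 1800
  end
end

credential = SharedCredential.new
threads = Array.new(10) { Thread.new { credential.ensure_fresh } }
threads.each(&:join)

puts "refreshes performed: #{credential.refresh_count}" # refreshes performed: 1
```

The expiry check has to happen after the lock is taken. Checking first and locking second would let every waiting thread refresh in turn.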

Use the Correct Expiration Time

Initially, we hardcoded expires_at to 1 hour (3600 seconds). But the API returns an expires_in value indicating the actual token lifespan. Ignoring this caused tokens to expire prematurely or linger past their validity.

def self.update_credential(credential, response_body)
  credential.update!(
    token: response_body["access_token"],
    refresh_token: response_body["refresh_token"],
    expires_at: Time.now + response_body["expires_in"].to_i
  )
end

Key takeaway: Always use the API’s expires_in value to calculate expiration.
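One detail worth guarding against: nil.to_i is 0, so a response without expires_in would quietly produce a token that expires immediately. Here is a small sketch of a stricter calculation in plain Ruby. The token_expiry helper and its 60-second skew default are my own assumptions for illustration, not something the API prescribes.

```ruby
# Computes when a token should be treated as expired.
# `skew` (hypothetical default: 60 seconds) refreshes slightly early to allow
# for clock drift and request latency.
def token_expiry(response_body, now: Time.now, skew: 60)
  # fetch raises KeyError if the field is missing; Integer() rejects junk values
  expires_in = Integer(response_body.fetch("expires_in"))
  raise ArgumentError, "expires_in must be positive" unless expires_in.positive?

  now + [expires_in - skew, 0].max
end

issued_at = Time.utc(2024, 1, 1, 12, 0, 0)
expiry = token_expiry({ "expires_in" => 1800 }, now: issued_at)
puts expiry # 2024-01-01 12:29:00 UTC
```

Passing now: explicitly also makes the logic easy to test with fixed timestamps.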

Add Robust Error Handling & Retries

APIs fail: network issues, timeouts, server errors. We added retries for transient errors and explicit handling for invalid_grant:

def self.refresh_credential(credential)
  retries ||= 0 # keeps its value across `retry`, which re-runs the method body
  resp = Faraday.post("#{API_BASE_URL[get_data_region]}/v1#{TOKEN_PATH}") do |req|
    # ... token refresh logic ...
  end

  handle_refresh_response(credential, resp)
rescue Faraday::ConnectionFailed, Faraday::TimeoutError => e
  # Retry only transient network failures, up to 3 attempts in total
  retry if (retries += 1) < 3
  Rails.logger.error("NetDocuments connection failed: #{e.message}")
  raise
end

def self.handle_refresh_error(credential, resp)
  # String key, so the lookup below also works for the fallback hash
  error_body = JSON.parse(resp.body) rescue { "error" => "unknown" }
  case error_body["error"]
  when 'invalid_grant'
    Rails.logger.error("Invalid grant error: #{error_body}")
    # Force re-authentication. Caution: if this runs inside with_lock, the
    # raise below rolls back the transaction, including this destroy.
    credential.destroy
    raise StandardError, "Re-authentication required."
  else
    Rails.logger.error("Token refresh error: #{error_body}")
    raise StandardError, "Failed to refresh token."
  end
end

What this does:

  • Makes up to 3 attempts in total when a transient Faraday error occurs.
  • Destroys the credential on invalid_grant, forcing a re-authentication flow.
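The retry logic can also live in a small reusable helper with a pause between attempts. This is a sketch in plain Ruby: TransientError is a hypothetical class standing in for connection failures and timeouts, and the delays are arbitrary. One caveat with single-use refresh tokens: if a request timed out after the server had already processed it, the retry may be sending a token that is already spent, so it is worth checking how your provider behaves in that case.

```ruby
# Hypothetical error class standing in for connection failures and timeouts.
class TransientError < StandardError; end

# Runs the block, retrying only on TransientError, with exponential backoff.
# Returns the block's value, or re-raises after the final attempt.
def with_retries(max_attempts: 3, base_delay: 0.5)
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue TransientError
    raise if attempt >= max_attempts

    sleep(base_delay * (2**(attempt - 1))) # base, 2x base, 4x base, ...
    retry
  end
end

calls = 0
result = with_retries(base_delay: 0.01) do
  calls += 1
  raise TransientError, "timeout" if calls < 3

  "refreshed"
end

puts "#{result} after #{calls} attempts" # refreshed after 3 attempts
```

Errors other than TransientError pass straight through, so a rejected grant is never retried.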

Enhance Observability

We added logging to track token refreshes and errors:

Rails.logger.info("Token refreshed at #{Time.now} for NetDocuments.")

This helps debug issues and monitor token lifecycle events.
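If the logs feed a search tool, one JSON object per event is easier to filter than free text. Here is a sketch using only Ruby's standard Logger. The log_token_event helper, the event names, and the fields are illustrative choices of mine. The token values themselves should never be logged.

```ruby
require "logger"
require "json"
require "stringio"
require "time"

# Writes one JSON log line per token event so refreshes can be searched and counted.
# Event names and fields here are illustrative, not part of any API.
def log_token_event(logger, event, **details)
  payload = { event: event, service: "netdocuments", at: Time.now.utc.iso8601 }.merge(details)
  logger.info(JSON.generate(payload))
end

output = StringIO.new
logger = Logger.new(output)
# Emit the message only, so each line is valid JSON on its own
logger.formatter = proc { |_severity, _time, _progname, msg| "#{msg}\n" }

log_token_event(logger, "token_refreshed", expires_in: 1800)
log_token_event(logger, "token_refresh_failed", error: "invalid_grant")

puts output.string
```

In a Rails app the same helper could be handed Rails.logger instead of the StringIO-backed logger used here.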

Leveling Up: Advanced Practices

Once the core logic worked, I added these improvements:

Background Token Refresh

Use a scheduled background job (for example, Sidekiq with a cron-style scheduler) to refresh tokens before they expire:

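class TokenRefreshJob < ApplicationJob
  def perform
    credential = Credential.find_by(kind: 'netdocuments')
    # Skip when there is no credential or no recorded expiry
    return unless credential&.expires_at
    # Only refresh tokens that expire within the next 5 minutes
    return if credential.expires_at > 5.minutes.from_now

    # Trigger refresh logic here
  end
end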

Proactive refreshes reduce latency during API calls.
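The "is a refresh due?" decision is easy to isolate and test on its own. Here is a sketch in plain Ruby, with no Rails time helpers. The refresh_due? name and the choice to treat a missing expiry as due are my assumptions, so adjust them to your own policy.

```ruby
# Returns true when a token should be refreshed ahead of time.
# A missing expiry is treated as "refresh now". `lead_time` is in seconds.
def refresh_due?(expires_at, now: Time.now, lead_time: 300)
  return true if expires_at.nil?

  expires_at - now <= lead_time
end

now = Time.utc(2024, 1, 1, 12, 0, 0)
puts refresh_due?(now + 120, now: now)  # true: expires within 5 minutes
puts refresh_due?(now + 1800, now: now) # false: still has 30 minutes left
puts refresh_due?(nil, now: now)        # true: no expiry recorded
```

The lead time should be longer than the interval between scheduled runs, otherwise a token can expire between two checks.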

Circuit Breaker Pattern

Prevent cascading failures with a circuit breaker (using the circuitbox gem):

def self.refresh_credential(credential)
  # exceptions: lists the errors that count as failures for this circuit
  Circuitbox.circuit(:netdocuments, exceptions: [Faraday::Error]).run do
    # Token refresh logic
  end
end

After repeated failures the circuit opens and further attempts are skipped for a while, which reduces load on the API.
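To show what the pattern does under the hood, here is a minimal breaker written from scratch. It is not the circuitbox API, just a sketch of the idea: count consecutive failures, refuse calls once a threshold is reached, and allow a trial call after a cool-down. The class name, thresholds, and now: parameter are all assumptions made for the example.

```ruby
# Minimal circuit breaker, written from scratch to illustrate the pattern.
# It is not the circuitbox API.
class SimpleBreaker
  OpenError = Class.new(StandardError)

  def initialize(failure_threshold: 3, cool_down: 30)
    @failure_threshold = failure_threshold
    @cool_down = cool_down # seconds to wait before allowing a trial call
    @failures = 0
    @opened_at = nil
    @lock = Mutex.new
  end

  def run(now: Time.now)
    @lock.synchronize do
      raise OpenError, "circuit open" if open?(now)
    end

    begin
      result = yield
    rescue StandardError
      @lock.synchronize { record_failure(now) }
      raise
    end
    @lock.synchronize { reset }
    result
  end

  private

  def open?(now)
    !@opened_at.nil? && (now - @opened_at) < @cool_down
  end

  def record_failure(now)
    @failures += 1
    @opened_at = now if @failures >= @failure_threshold
  end

  def reset
    @failures = 0
    @opened_at = nil
  end
end

breaker = SimpleBreaker.new(failure_threshold: 2, cool_down: 30)
2.times do
  begin
    breaker.run { raise IOError, "token endpoint unreachable" }
  rescue IOError
    # counted as a failure by the breaker
  end
end

begin
  breaker.run { "never called" }
rescue SimpleBreaker::OpenError => e
  puts e.message # circuit open
end
```

A production library adds things this sketch leaves out, such as failure-rate windows and shared state across processes.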

Admin Alerts

Notify admins when credentials need re-authentication:

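def self.handle_refresh_error(credential, resp)
  # Parse the body first; fall back to "unknown" if it is not valid JSON
  error_body = JSON.parse(resp.body) rescue { "error" => "unknown" }
  if error_body["error"] == 'invalid_grant'
    AdminMailer.authentication_required.deliver_later
  end
end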

Thread-Safe Token Caching

Cache tokens in memory to reduce database hits:

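CACHE_LOCK = Mutex.new

def self.netdocuments_credential
  CACHE_LOCK.synchronize do
    # Serve from memory only while the cached token is still valid
    return @cached_credential if @cached_credential && !@cached_credential.expired?

    @cached_credential = begin
      # Fetch from DB and refresh if needed
    end
  end
end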
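Here is a self-contained sketch of such a cache in plain Ruby. TokenCache is a hypothetical class. It is guarded by a Mutex, and every entry carries its own expiry so a stale token is never served from memory. Keep in mind that an in-process cache is per process: other processes still coordinate through the database, which remains the source of truth.

```ruby
# In-process cache guarded by a Mutex. Entries carry their own expiry, so a
# stale token is never served from memory.
class TokenCache
  Entry = Struct.new(:value, :expires_at)

  def initialize
    @entries = {}
    @lock = Mutex.new
  end

  # Returns the cached value for `key`, or runs the block to load a fresh
  # [value, expires_at] pair when the entry is missing or expired.
  def fetch(key, now: Time.now)
    @lock.synchronize do
      entry = @entries[key]
      return entry.value if entry && entry.expires_at > now

      value, expires_at = yield
      @entries[key] = Entry.new(value, expires_at)
      value
    end
  end
end

cache = TokenCache.new
loads = 0
loader = lambda do
  loads += 1
  ["token-#{loads}", Time.now + 1800]
end

threads = Array.new(5) { Thread.new { cache.fetch(:netdocuments, &loader) } }
puts threads.map(&:value).uniq.inspect # ["token-1"]
puts "loads: #{loads}"                 # loads: 1
```

Holding the lock while loading keeps the example simple. It also means callers queue behind a slow load, which is usually acceptable for a single token.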

Final Thoughts

Handling OAuth tokens in a multi-threaded environment is tricky, but manageable with:

  1. Database locks to prevent race conditions.
  2. Accurate expiration using the API’s expires_in.
  3. Retries & error handling for resilience.
  4. Proactive refreshes to avoid edge cases.

If I were to start over, I’d design the system with these principles from day one. The key lesson? Never assume token refreshes are thread-safe; always coordinate them.

Gotchas to Watch For:

  • Ensure your database supports row-level locks (PostgreSQL/MySQL do).
  • Test token expiration logic with realistic values (e.g., 30-minute tokens).
  • Monitor logs for invalid_grant, since it could indicate a compromised token.
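On that last point, monitoring only helps if the code recognizes the error in whatever shape it arrives. The error quoted at the top of this post suggests the body can be plain text rather than JSON, so here is a sketch that handles both. The oauth_error_code helper and its plain-text matching rule are assumptions based on that one message, not documented API behavior.

```ruby
require "json"

# Extracts an OAuth-style error code from a response body that may be JSON
# or plain text. The plain-text matching is an assumption based on the
# "Invalid grant" message quoted in this post.
def oauth_error_code(body)
  parsed = JSON.parse(body.to_s)
  parsed.is_a?(Hash) ? parsed.fetch("error", "unknown").to_s : "unknown"
rescue JSON::ParserError
  body.to_s.match?(/invalid[ _]grant/i) ? "invalid_grant" : "unknown"
end

puts oauth_error_code('{"error":"invalid_grant"}')    # invalid_grant
puts oauth_error_code("Invalid grant")                # invalid_grant
puts oauth_error_code("<html>502 Bad Gateway</html>") # unknown
```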

By addressing these issues, we reduced invalid_grant errors to zero.
