How to Get PDF Attachments from Gmail Using JavaScript

Fetching email attachments via the Gmail API in JavaScript seems straightforward until you encounter complex, nested MIME structures. If your code is stuck in infinite loops or failing to extract PDFs, here’s a step-by-step guide to fix it and add practical improvements.

Nested MIME Structures Causing Infinite Loops

The original code uses a recursive function to search for PDF attachments in an email’s MIME parts. However, emails often have deeply nested structures (e.g., multipart/mixed, multipart/alternative), which can cause recursion to fail or loop indefinitely.

Why It Fails:

Recursion Limitations: Each level of nesting adds a call-stack frame, so a sufficiently deep structure can exceed JavaScript’s call-stack limit.
Redundant Checks: Parts may reference subparts in a way that revisits the same nodes.
Incomplete Traversal: The code stops at the first PDF found, but nested parts may contain duplicates or invalid data.
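
To make the nesting concrete, here is a hand-written sketch of the kind of payload tree the Gmail API returns for a message with a text body, an HTML body, and one PDF. The field names (mimeType, filename, body.attachmentId, parts) follow the Gmail API message format; every value, including the attachment ID and filename, is invented for illustration.

```javascript
// Hypothetical payload for illustration only; IDs, sizes and names are made up.
const samplePayload = {
  mimeType: 'multipart/mixed',
  filename: '',
  body: { size: 0 },
  parts: [
    {
      mimeType: 'multipart/alternative',
      filename: '',
      body: { size: 0 },
      parts: [
        { mimeType: 'text/plain', filename: '', body: { size: 120, data: '...' } },
        { mimeType: 'text/html', filename: '', body: { size: 340, data: '...' } },
      ],
    },
    {
      mimeType: 'application/pdf',
      filename: 'invoice.pdf',
      body: { size: 48213, attachmentId: 'ATTACHMENT_ID_PLACEHOLDER' },
    },
  ],
};

// The PDF sits one level below the root; the text bodies sit two levels down.
console.log(samplePayload.parts[1].mimeType); // application/pdf
```

Note that the PDF part carries an attachmentId rather than inline data, which is why a second request is needed to download the file itself.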

Iterative Traversal for MIME Parts

Replace the recursive approach with an iterative, breadth-first search (BFS) to reliably traverse nested parts without infinite loops.

Modified Code:

function findPdfPart(rootPart) {
  const queue = [rootPart]; // Use a queue for BFS traversal
  while (queue.length > 0) {
    const currentPart = queue.shift();
    // Check if current part is a PDF attachment
    if (
      currentPart.mimeType === 'application/pdf' &&
      currentPart.body?.attachmentId
    ) {
      return currentPart;
    }
    // Add subparts to the queue for further traversal
    if (currentPart.parts) {
      queue.push(...currentPart.parts);
    }
  }
  return null; // No PDF found
}

Key Changes:

BFS Traversal: Processes parts level-by-level using a queue.
No Recursion: Avoids stack overflow in deeply nested structures.
Early Exit: Returns the first valid PDF found.
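
As a quick sanity check, the sketch below runs the same breadth-first idea against a small hand-built payload. It is a variant rather than the code above verbatim: it advances an index instead of calling shift(), so the array is never re-indexed on dequeue. The name findPdfPartIndexed and the sample data are hypothetical.

```javascript
function findPdfPartIndexed(rootPart) {
  const queue = [rootPart];
  // Walk the array with an index. Subparts are appended at the end,
  // so the visiting order is still breadth-first.
  for (let i = 0; i < queue.length; i++) {
    const part = queue[i];
    if (part.mimeType === 'application/pdf' && part.body?.attachmentId) {
      return part;
    }
    if (Array.isArray(part.parts)) {
      queue.push(...part.parts);
    }
  }
  return null; // No PDF found
}

// Invented sample: a PDF nested two levels below the root.
const demoPayload = {
  mimeType: 'multipart/mixed',
  parts: [
    { mimeType: 'text/plain', body: { data: '...' } },
    {
      mimeType: 'multipart/related',
      parts: [
        { mimeType: 'application/pdf', filename: 'report.pdf', body: { attachmentId: 'abc' } },
      ],
    },
  ],
};

const found = findPdfPartIndexed(demoPayload);
console.log(found ? found.filename : 'none'); // report.pdf
```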

Enhancements for Robustness

Handle Multiple Attachments

Modify the code to collect all PDFs in the email:

function findAllPdfParts(rootPart) {
  const queue = [rootPart];
  const pdfParts = [];
  while (queue.length > 0) {
    const currentPart = queue.shift();
    if (
      currentPart.mimeType === 'application/pdf' &&
      currentPart.body?.attachmentId
    ) {
      pdfParts.push(currentPart);
    }
    if (currentPart.parts) {
      queue.push(...currentPart.parts);
    }
  }
  return pdfParts;
}
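
Some mail clients label PDFs with a generic type such as application/octet-stream, so a strict mimeType comparison can miss them. The predicate below is one possible looser check, assuming the filename extension is an acceptable fallback signal. isLikelyPdfPart is a hypothetical helper, not part of the Gmail API, and it could replace the mimeType condition inside the loop.

```javascript
function isLikelyPdfPart(part) {
  // Without an attachmentId there is nothing to download.
  if (!part?.body?.attachmentId) return false;

  const mime = (part.mimeType || '').toLowerCase();
  const name = (part.filename || '').toLowerCase();

  if (mime === 'application/pdf') return true;
  // Fallback: generic binary type, but the filename says PDF.
  return mime === 'application/octet-stream' && name.endsWith('.pdf');
}

console.log(isLikelyPdfPart({
  mimeType: 'application/octet-stream',
  filename: 'Report.PDF',
  body: { attachmentId: 'abc' },
})); // true
```

The extension check is only a heuristic, so validating the downloaded bytes afterwards is still worthwhile.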

Add Error Handling for Attachments

Check for valid attachment data before saving:

// Inside the try block:
const pdfParts = findAllPdfParts(email.payload);
if (pdfParts.length === 0) {
  console.log("No PDF attachments found.");
  return null;
}

for (const pdfPart of pdfParts) {
  try {
    const attachmentData = await checkInbox({
      token: accessToken,
      messageId: email.id,
      attachmentId: pdfPart.body.attachmentId,
    });
    
    if (!attachmentData?.data) {
      console.log(`Skipping invalid attachment: ${pdfPart.filename}`);
      continue;
    }
    
    // Save the file...
  } catch (error) {
    console.error(`Failed to process ${pdfPart.filename}: ${error.message}`);
  }
}
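
The "Save the file" step depends on how the data is encoded. The Gmail API returns attachment bodies as a base64url string, so the sketch below assumes attachmentData.data has that shape (checkInbox is the author's own helper, and its exact return value is not shown here). decodeAttachmentData and looksLikePdf are hypothetical helper names.

```javascript
function decodeAttachmentData(base64UrlData) {
  // Map the URL-safe alphabet back to standard base64 before decoding.
  const base64 = base64UrlData.replace(/-/g, '+').replace(/_/g, '/');
  return Buffer.from(base64, 'base64');
}

function looksLikePdf(buffer) {
  // PDF files begin with the ASCII signature "%PDF-".
  return buffer.subarray(0, 5).toString('latin1') === '%PDF-';
}

// "%PDF-1.7" encoded as unpadded base64url, standing in for real attachment data.
const sampleData = 'JVBERi0xLjc';
const fileBuffer = decodeAttachmentData(sampleData);
console.log(looksLikePdf(fileBuffer)); // true
```

The resulting Buffer can then be passed to a file-writing call such as fs.promises.writeFile, and the signature check gives a cheap way to skip parts that are labeled as PDF but contain something else.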

Add Filename Deduplication

Prevent overwriting files with identical names:

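const path = require('path'); // place with the other imports at the top of the file
// pdfPart.filename comes from the email, so strip any directory components first
const filename = path.basename(pdfPart.filename || 'attachment.pdf');
const uniqueFilename = `${Date.now()}_${filename}`;
const filePath = path.join(downloadPath, uniqueFilename);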
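
A timestamp prefix can still collide when two attachments with the same name are saved within the same millisecond, which is plausible inside a tight loop. One alternative sketch tracks the names already used during the current run and appends a counter. makeUniqueName and usedNames are hypothetical names, and the approach does not guard against files left over from earlier runs.

```javascript
function makeUniqueName(filename, usedNames) {
  // Split "invoice.pdf" into stem "invoice" and extension ".pdf".
  const dot = filename.lastIndexOf('.');
  const stem = dot > 0 ? filename.slice(0, dot) : filename;
  const ext = dot > 0 ? filename.slice(dot) : '';

  let candidate = filename;
  let counter = 1;
  // Keep bumping the counter until the name has not been handed out yet.
  while (usedNames.has(candidate)) {
    candidate = `${stem} (${counter})${ext}`;
    counter += 1;
  }
  usedNames.add(candidate);
  return candidate;
}

const used = new Set();
console.log(makeUniqueName('invoice.pdf', used)); // invoice.pdf
console.log(makeUniqueName('invoice.pdf', used)); // invoice (1).pdf
```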

Final Thoughts

Avoid Recursion for MIME Traversal: Use BFS/DFS with a loop to handle arbitrary nesting depths.
Validate Attachment Data: Not all parts marked as application/pdf may have valid content.
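Leverage Libraries: Consider the official googleapis client for authentication and request handling. With the default full message format, the Gmail API already returns the MIME tree as parsed JSON under payload.parts, so no separate MIME parser is needed for this traversal.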

Next Steps:

Add support for ZIP/TXT attachments.
Integrate email filtering by date/sender.
Implement retries for failed downloads.

By anchoring your code to iterative traversal and robust validation, you’ll reliably extract PDFs from even the most convoluted emails.

