When I first started using the requests-html library, I was pretty excited about how easily it allowed me to scrape content and even render JavaScript with very little setup. Things were going smoothly until I decided to integrate it with a Flask app. That’s when I hit a wall, an annoying and confusing error:
RuntimeError: There is no current event loop in thread 'Thread-3'
At first, I wasn’t sure what was going wrong. Everything worked fine in standalone Python scripts. But the moment I deployed it inside a Flask route, things broke. In this post, I’ll share my journey fixing this issue, explain why it happens, and show how I added extra functionality to make the script production-ready.
The Original Code
Here’s the script I started with. It was working great outside of Flask:
```python
from flask import Flask
from requests_html import HTMLSession
from bs4 import BeautifulSoup

app = Flask(__name__)

@app.route('/<user>')
def hello_world(user):
    session = HTMLSession()
    r = session.get('https://medium.com/@' + str(user))
    r.html.render()  # <--- Triggers the error in Flask

    divs = r.html.find('div')
    lst = []
    for div in divs:
        soup = BeautifulSoup(div.html, 'html5lib')
        div_tag = soup.find()
        try:
            title = div_tag.section.div.h1.a['href']
            if title not in lst:
                lst.append(title)
        except:
            pass
    return "\n".join(lst)

if __name__ == '__main__':
    app.run(debug=True)
```
The Problem
Once I moved this into a Flask app and deployed it, I ran into this error:
RuntimeError: There is no current event loop in thread 'Thread-3'
Why It Happens
After digging into the traceback and looking through the requests-html source, I figured out the root cause.
The r.html.render() call relies on Pyppeteer, a library that drives a headless Chromium browser, and Pyppeteer needs an asyncio event loop to do its work.
But Flask runs your route handler in a worker thread, and that thread doesn’t come with an event loop by default. So when requests-html tries to grab a loop using asyncio.get_event_loop(), it fails because there is no loop in that thread.
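To see the problem in isolation, here’s a minimal sketch (no Flask or requests-html involved) of what happens when asyncio.get_event_loop() is called from a worker thread that has never had a loop set:

```python
import asyncio
import threading

def worker():
    try:
        # In a non-main thread with no loop set, this raises the same
        # "There is no current event loop in thread ..." RuntimeError
        asyncio.get_event_loop()
    except RuntimeError as exc:
        print(exc)

threading.Thread(target=worker).start()
```

This is essentially the situation .render() finds itself in inside a Flask request thread.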
The Fix
The solution? I needed to create and set an event loop manually before using .render().
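Stripped down to its essence, the fix is just two lines at the top of the route handler; the full route, with everything else wired in, is in the next section:

```python
import asyncio

# Give the current request thread its own event loop so render() has one to use
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
```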
Updated Code with the Fix
Here’s the updated and enhanced version of the code:
```python
from flask import Flask, jsonify
from requests_html import HTMLSession
from bs4 import BeautifulSoup
import asyncio

app = Flask(__name__)

@app.route('/<user>')
def get_medium_articles(user):
    try:
        # Create and set a new event loop for the current thread
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)

        session = HTMLSession()
        url = f'https://medium.com/@{user}'
        r = session.get(url)

        # Render JavaScript
        r.html.render(timeout=20, sleep=2)

        # Find all divs on the page
        divs = r.html.find('div')
        article_links = []
        for div in divs:
            soup = BeautifulSoup(div.html, 'html5lib')
            div_tag = soup.find()
            try:
                # Try extracting an article link from the nested HTML structure
                title = div_tag.section.div.h1.a['href']
                if title not in article_links:
                    article_links.append(title)
            except (AttributeError, KeyError, TypeError):
                # This div doesn't contain an article link; skip it
                continue

        return jsonify({
            'user': user,
            'url': url,
            'article_links': article_links,
            'total_found': len(article_links)
        })
    except Exception as e:
        return jsonify({'error': str(e)}), 500

if __name__ == '__main__':
    app.run(debug=True)
```
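One quick way to sanity-check the endpoint locally is a tiny client script; this assumes the app above is saved as app.py and already running, and the username is just a placeholder:

```python
import requests

# Replace 'some-user' with a real Medium username
resp = requests.get('http://127.0.0.1:5000/some-user')
print(resp.status_code)
print(resp.json())
```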
What I Improved
- Fixed the Event Loop Error: Now it works inside Flask.
- Better Output: Returns JSON instead of plain text.
- Timeout and Sleep Added: Prevents the render from hanging indefinitely.
- Cleaner Code: Used f-strings for readability.
- Improved Error Handling: Gracefully handles failures and sends back error messages.
Extra Practice Ideas
Once I fixed the main issue, I started experimenting with ways to make the project more useful. Here are some ideas that helped me stretch this project further:
- Try using Quart instead of Flask: Quart supports async natively and works beautifully with asyncio (see the sketch after this list).
- Add caching using Redis or file-based storage so repeated requests don’t always trigger new scrapes.
- Add pagination handling for long profiles on Medium that require “scroll to load more”.
- Hook it up to a frontend: Use JavaScript to call this API and show live results in a web app.
- Scrape different sites: This same setup can be modified to scrape public profiles on other platforms like Twitter or GitHub.
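For the first idea, here’s a rough sketch of what the same route could look like in Quart. It assumes Quart is installed (pip install quart) and swaps the synchronous session for requests-html’s AsyncHTMLSession with arender(), so there’s no manual event loop juggling at all:

```python
from quart import Quart, jsonify
from requests_html import AsyncHTMLSession

app = Quart(__name__)

@app.route('/<user>')
async def get_medium_articles(user):
    # Quart routes are natively async, so awaiting arender() just works
    session = AsyncHTMLSession()
    r = await session.get(f'https://medium.com/@{user}')
    await r.html.arender(timeout=20, sleep=2)
    # Keep it simple here: return every absolute link the rendered page exposes
    return jsonify({'user': user, 'links': sorted(r.html.absolute_links)})

if __name__ == '__main__':
    app.run(debug=True)
```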
Final Thought
This project taught me a lot about how Flask handles threads and how libraries like requests-html use asyncio under the hood. If you’re building something that combines synchronous frameworks (like Flask) with async operations (like Pyppeteer), you’ll likely run into this same issue.
Creating and managing your own event loop isn’t too difficult, but it’s something I didn’t expect at first. If you plan on doing more advanced scraping or automation, I’d also recommend looking into Playwright or Selenium, which provide more robust control over headless browsers and have better async support.
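As a taste of that, here’s a minimal sketch of the same “fetch and render the page” step using Playwright’s synchronous API. It assumes Playwright and its Chromium build are installed (pip install playwright, then playwright install chromium), and the profile URL is just an example:

```python
from playwright.sync_api import sync_playwright

def fetch_rendered_html(url: str) -> str:
    # Launch headless Chromium, load the page, and return the rendered HTML
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, timeout=20_000)  # timeout is in milliseconds
        html = page.content()
        browser.close()
    return html

if __name__ == '__main__':
    print(len(fetch_rendered_html('https://medium.com/@some-user')))
```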