How I Fixed a 404 Error When Pulling a Page with a Raw Socket

I thought my setup was bullet‑proof: a tiny Apache server inside an Ubuntu VM, a single index.html, and a friendly DNS shortcut in /etc/hosts. Type that URL in any browser on the host and the page pops right up.

Then I tried to be clever and fetch the same HTML with nothing but Python sockets. Instead of the page, Apache tossed me a 404 Not Found. One hour of head‑scratching later, I discovered a one‑character typo that broke the whole request.

Below is the journey from the wrong code, through the detective work, to the quick fix and a handful of practice ideas so you can stretch your own socket muscles.

The Broken Code

import socket
import sys # for sys.exit()

try:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
except socket.error:
    print("Failed to initialize socket")
    sys.exit()

print("Socket initialized")

host = "vulnerable"
port = 80

try:
    remote_ip = socket.gethostbyname(host)
except socket.gaierror:
    print("Hostname could not be resolved. Exiting")
    sys.exit()

s.connect((remote_ip, port))
print(f"Socket connected to {host} on IP {remote_ip}")

message = "GET /HTTP/1.1\r\n\r\n".encode() # send as bytes

try:
    s.sendall(message)
except socket.error:
    print("Send failed")
    sys.exit()

print("Message sent successfully")

reply = s.recv(4096)
print(reply)

The program ran, but Apache answered:

HTTP/1.1 404 Not Found
...
The requested URL /HTTP/1.1 was not found on this server.

Why the Server Spat Out 404

An HTTP/1.1 request line must look exactly like this:

<METHOD> <PATH> <VERSION>\r\n

My request line was:

GET /HTTP/1.1

There’s only one space, so Apache parses it as:

Field     Value
Method    GET
Path      /HTTP/1.1
Version   (missing)

Apache dutifully looked for a file called /HTTP/1.1, couldn’t find it, and returned 404. A browser, however, sends something like:

GET / HTTP/1.1
Host: vulnerable

Note the space after the slash, which separates the path from the version, and the Host header, which HTTP/1.1 requires on every request.
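
To make the parsing concrete, here is a rough sketch of the field-splitting step. The helper name parse_request_line is my own invention for illustration, and this is not how Apache is implemented; real servers validate far more strictly. It only shows how a missing space shifts the version into the path.

```python
def parse_request_line(line: str) -> dict:
    """Split an HTTP request line into method, path, and version.

    Hypothetical helper for illustration only: it mimics the
    whitespace-splitting idea, not Apache's actual parser.
    """
    parts = line.rstrip("\r\n").split(" ")
    # Pad with None so a missing field shows up instead of raising.
    method, path, version = (parts + [None] * 3)[:3]
    return {"method": method, "path": path, "version": version}


broken = parse_request_line("GET /HTTP/1.1\r\n")
fixed = parse_request_line("GET / HTTP/1.1\r\n")

print(broken)  # {'method': 'GET', 'path': '/HTTP/1.1', 'version': None}
print(fixed)   # {'method': 'GET', 'path': '/', 'version': 'HTTP/1.1'}
```

With the space missing, the whole string after GET lands in the path field, which matches the URL Apache reported in its 404 page.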

A Minimal, Working Fix

import socket, sys

HOST, PORT = "vulnerable", 80

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    try:
        s.connect((socket.gethostbyname(HOST), PORT))
        print(f"Connected to {HOST}")
    except socket.error as e:
        sys.exit(f"Socket error → {e}")

    request = (
        "GET / HTTP/1.1\r\n"
        f"Host: {HOST}\r\n"
        "Connection: close\r\n\r\n"
    ).encode()

    s.sendall(request)

    response = bytearray()
    while chunk := s.recv(4096):
        response.extend(chunk)

print(response.decode(errors="replace"))

What changed?

  • GET / HTTP/1.1: the path is now / and a space separates it from the version
  • Added Host: header (mandatory)
  • Added Connection: close so Apache closes the socket when done
  • Replaced the single recv with a loop so I capture the whole response, not just the first 4 kB
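
The last point is easy to check without a web server. The sketch below uses socket.socketpair() as a stand-in for the connection to Apache; the recv_all name and the 6000-byte payload are assumptions I picked for the demo, not part of the original script.

```python
import socket


def recv_all(sock: socket.socket, bufsize: int = 4096) -> bytes:
    """Read from sock until the peer closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # b"" means the other side closed its end
            break
        chunks.append(chunk)
    return b"".join(chunks)


# Local demo: a connected socket pair stands in for client and server.
left, right = socket.socketpair()
payload = b"x" * 6000        # deliberately larger than one 4096-byte recv
left.sendall(payload)
left.close()                 # closing signals end-of-stream to the reader

first = right.recv(4096)     # a single call returns at most 4096 bytes
rest = recv_all(right)       # the loop drains whatever is left
right.close()

print(len(first) <= 4096, first + rest == payload)  # True True
```

A single recv never returns more than the buffer size you pass it, so any response longer than that needs the loop. Connection: close is what makes the "read until empty" approach terminate.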

Extra Practice Ideas

  • Save the HTML to a file
    _, _, body = response.partition(b"\r\n\r\n")  # drop the headers
    with open("index_from_socket.html", "wb") as f:
        f.write(body)
  • Measure round‑trip time
    import time
    start = time.perf_counter()
    # …send/recv…
    print(f"Took {time.perf_counter() - start:.3f}s")
  • Send a HEAD request
    Swap GET / for HEAD / to fetch headers only.
  • Parse response headers
    header_blob, _, body = response.partition(b"\r\n\r\n")
    headers = {}
    for line in header_blob.split(b"\r\n")[1:]:  # skip status line
        name, _, value = line.decode().partition(":")
        headers[name.strip()] = value.strip()  # drop the space after the colon
    print(headers.get("Content-Type"))
  • Turn it into a helper
    Wrap everything in fetch(host, path="/", *, port=80) that returns (status, headers, body).
  • Try basic HTTPS
    import ssl
    context = ssl.create_default_context()
    with context.wrap_socket(socket.socket(), server_hostname=HOST) as s:
        s.connect((HOST, 443))
        s.sendall(request)            # same request bytes
        ...
  • Follow one redirect
    If the status is 301 or 302, grab the Location: header and call fetch() again.
  • Mini stress‑test
    Spin up 50 threads via concurrent.futures.ThreadPoolExecutor and record average latency.
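
For the helper idea, here is one possible shape, offered as a sketch rather than a finished client. The names build_request and parse_response are my own, the 10-second timeout is an arbitrary choice, and the body is returned as-is: chunked transfer encoding is not decoded. Header names are lower-cased so lookups do not depend on the server's capitalisation.

```python
import socket


def build_request(host: str, path: str = "/", method: str = "GET") -> bytes:
    """Return a minimal HTTP/1.1 request as bytes."""
    return (
        f"{method} {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n\r\n"
    ).encode()


def parse_response(raw: bytes) -> tuple:
    """Split a raw response into (status, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    status = int(status_line.split(" ", 2)[1])  # e.g. "HTTP/1.1 200 OK"
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body


def fetch(host: str, path: str = "/", *, port: int = 80) -> tuple:
    """Fetch one resource over plain HTTP; returns (status, headers, body)."""
    with socket.create_connection((host, port), timeout=10) as s:
        s.sendall(build_request(host, path))
        raw = bytearray()
        while chunk := s.recv(4096):  # read until the server closes
            raw.extend(chunk)
    return parse_response(bytes(raw))


# Offline check of the parser on a canned response (no server needed).
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 11\r\n\r\n"
    b"<h1>hi</h1>"
)
status, headers, body = parse_response(sample)
print(status, headers["content-type"], body)  # 200 text/html b'<h1>hi</h1>'
```

Against the VM from this post, the call would be status, headers, body = fetch("vulnerable"). For the redirect exercise, check whether status is 301 or 302 and read headers.get("location") before calling fetch() a second time.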

Final Thoughts

One missing space (/HTTP/1.1 instead of / HTTP/1.1) cost me an hour of head‑scratching. Raw sockets give you total control, but that means every byte must be perfect. Once the syntax is fixed, the same 30‑line script becomes a playground: you can benchmark your server, parse headers, experiment with HTTPS, or even build a toy web crawler.
