Convert HTML to PDF with Python: WeasyPrint, pdfkit and Playwright
Generating PDFs from HTML is a common task: invoices, reports, certificates, receipts. Python offers several libraries depending on your use case.
Quick Comparison
| Tool | Engine | JavaScript | Modern CSS | Installation |
|---|---|---|---|---|
| WeasyPrint | Own layout engine (Python) | ❌ | CSS3 + Flexbox | pip install weasyprint (plus Pango) |
| pdfkit | wkhtmltopdf | Basic | Good | Requires external binary |
| Playwright | Chromium | ✅ full | ✅ full | pip install playwright + browser download |
| xhtml2pdf | Pure Python | ❌ | Limited | pip install xhtml2pdf |
WeasyPrint — Modern CSS Without External Dependencies
WeasyPrint is the most Pythonic option: it needs no browser or external rendering binary, only a few system libraries such as Pango.
Installation
pip install weasyprint
# WeasyPrint relies on Pango; install it if your system lacks it:
# Ubuntu/Debian:
sudo apt-get install libpango-1.0-0 libpangoft2-1.0-0
# macOS with Homebrew:
brew install pango
Basic Usage
from weasyprint import HTML, CSS
# From HTML string
HTML(string='<h1>Hello</h1><p>My first PDF.</p>').write_pdf('output.pdf')
# From HTML file
HTML(filename='report.html').write_pdf('report.pdf')
# From URL
HTML(url='https://example.com/page').write_pdf('page.pdf')
# With separate CSS
HTML(string='<h1>Title</h1>').write_pdf(
    'output.pdf',
    stylesheets=[CSS(string='h1 { color: navy; }')]
)
Templates with Jinja2
The most powerful workflow: HTML template + data → dynamically generated PDF.
from weasyprint import HTML
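from jinja2 import Environment, FileSystemLoader, select_autoescape
# Configure Jinja2; autoescaping keeps values such as client names from being read as HTML
env = Environment(
    loader=FileSystemLoader('templates/'),
    autoescape=select_autoescape(),
)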
template = env.get_template('invoice.html')
# Invoice data
data = {
    "number": "2024-042",
    "client": "XYZ Corp Ltd",
    "items": [
        {"description": "Web design", "price": 800},
        {"description": "Monthly SEO", "price": 300},
    ],
    "total": 1100
}
# Render HTML
rendered_html = template.render(**data)
# Generate PDF
HTML(string=rendered_html).write_pdf('invoice_2024-042.pdf')
print("PDF generated: invoice_2024-042.pdf")
Sample templates/invoice.html:
<!DOCTYPE html>
<html>
<head>
<style>
body { font-family: Arial, sans-serif; margin: 40px; }
table { width: 100%; border-collapse: collapse; }
th, td { border: 1px solid #ddd; padding: 8px; }
.total { font-weight: bold; font-size: 1.2em; }
</style>
</head>
<body>
<h1>Invoice #{{ number }}</h1>
<p>Client: {{ client }}</p>
<table>
<tr><th>Description</th><th>Price</th></tr>
{% for item in items %}
<tr><td>{{ item.description }}</td><td>{{ item.price }} USD</td></tr>
{% endfor %}
<tr class="total"><td>TOTAL</td><td>{{ total }} USD</td></tr>
</table>
</body>
</html>
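The hard-coded `total` in the data dictionary can drift out of sync with the line items. A minimal sketch, assuming prices are plain numbers as in the example above, derives it from the items instead. `build_invoice_context` is a hypothetical helper name, not part of WeasyPrint or Jinja2.

```python
from decimal import Decimal


def build_invoice_context(number, client, items):
    """Return the template context with the total computed from the items.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Decimal(str(...)) avoids binary floating-point rounding in money sums
    total = sum((Decimal(str(item["price"])) for item in items), Decimal("0"))
    return {"number": number, "client": client, "items": items, "total": total}


context = build_invoice_context(
    "2024-042",
    "XYZ Corp Ltd",
    [
        {"description": "Web design", "price": 800},
        {"description": "Monthly SEO", "price": 300},
    ],
)
```

The result can be passed to `template.render(**context)` in the same way as the `data` dictionary.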
Print CSS (@page, page breaks)
WeasyPrint supports print CSS:
/* Page setup */
@page {
  size: A4;
  margin: 2cm 1.5cm;
  /* Footer with page number */
  @bottom-center {
    content: "Page " counter(page) " of " counter(pages);
    font-size: 10px;
    color: #666;
  }
}
/* Prevent page breaks inside important elements */
.invoice-item {
  page-break-inside: avoid;
}
/* Force new page before section */
.new-section {
  page-break-before: always;
}
/* Hide elements that shouldn't appear in the PDF (WeasyPrint renders with the print media type) */
@media print {
  .screen-only {
    display: none;
  }
}
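When the page size or margins vary per document, the `@page` rule can be generated in Python and passed through `stylesheets=`, as in the basic usage section. The sketch below assumes nothing beyond that. `page_css` and `render_with_page_css` are hypothetical helpers.

```python
def page_css(size="A4", margin="2cm 1.5cm", footer=True):
    """Build an @page rule as a CSS string (hypothetical helper)."""
    lines = ["@page {", f"  size: {size};", f"  margin: {margin};"]
    if footer:
        lines += [
            "  @bottom-center {",
            '    content: "Page " counter(page) " of " counter(pages);',
            "    font-size: 10px;",
            "  }",
        ]
    lines.append("}")
    return "\n".join(lines)


def render_with_page_css(html, output_path, **page_options):
    """Render html to output_path using a generated @page stylesheet."""
    # Imported lazily so page_css() stays usable without WeasyPrint installed
    from weasyprint import CSS, HTML

    HTML(string=html).write_pdf(
        output_path, stylesheets=[CSS(string=page_css(**page_options))]
    )
```

A call such as `render_with_page_css('<h1>Report</h1>', 'report.pdf', size='A4 landscape')` would then produce a landscape document with the same footer.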
pdfkit — wkhtmltopdf from Python
pdfkit is a thin wrapper around the wkhtmltopdf command-line tool, which renders pages with an older Qt WebKit engine:
# Install pdfkit
pip install pdfkit
# Install wkhtmltopdf (external binary)
# Windows: download from https://wkhtmltopdf.org/downloads.html
# Ubuntu: sudo apt-get install wkhtmltopdf
# macOS: brew install --cask wkhtmltopdf
import pdfkit
options = {
    'page-size': 'A4',
    'margin-top': '1.5cm',
    'margin-bottom': '1.5cm',
    'margin-left': '1.5cm',
    'margin-right': '1.5cm',
    'encoding': 'UTF-8',
    'no-outline': None,
    'footer-right': '[page] of [topage]',
    'footer-font-size': '9',
}
# From URL
pdfkit.from_url('https://example.com', 'web.pdf', options=options)
# From HTML file
pdfkit.from_file('report.html', 'report.pdf', options=options)
# From string
pdfkit.from_string('<h1>Hello</h1>', 'simple.pdf', options=options)
# Return bytes (without saving file)
pdf_bytes = pdfkit.from_string('<h1>Hello</h1>', False, options=options)
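It helps to read the `options` dictionary as wkhtmltopdf command-line flags: each key becomes a `--flag`, and a value of `None` marks a flag that takes no argument, such as `no-outline`. The function below only illustrates that mapping. `options_to_args` is a hypothetical name, this is not pdfkit's actual implementation, and repeatable options are not covered.

```python
def options_to_args(options):
    """Approximate how an options dict maps to wkhtmltopdf flags.

    Illustration only: pdfkit performs its own normalisation internally.
    """
    args = []
    for key, value in options.items():
        # Keys may be written with or without the leading dashes
        flag = key if key.startswith("--") else f"--{key}"
        if value is None or value is False or value == "":
            # Flags without an argument, e.g. --no-outline
            args.append(flag)
        else:
            args += [flag, str(value)]
    return args


print(options_to_args({"page-size": "A4", "no-outline": None}))
# prints ['--page-size', 'A4', '--no-outline']
```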
Limitation: wkhtmltopdf is in maintenance mode
The wkhtmltopdf project no longer receives active updates, and its WebKit engine predates much of modern CSS. For pages that rely on CSS Grid, advanced Flexbox or heavy JavaScript, use Playwright instead.
Playwright — Real Chromium for JavaScript-Heavy Pages
pip install playwright
python -m playwright install chromium
import asyncio
from playwright.async_api import async_playwright
async def html_to_pdf(url: str, output_file: str):
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        # Load the page and wait until the network has gone idle,
        # which usually means client-side rendering has finished
        await page.goto(url, wait_until='networkidle')
        # Generate PDF
        await page.pdf(
            path=output_file,
            format='A4',
            print_background=True,  # include background colors
            margin={
                'top': '1.5cm',
                'bottom': '1.5cm',
                'left': '1.5cm',
                'right': '1.5cm'
            }
        )
        await browser.close()
    print(f"PDF saved: {output_file}")
# Run
asyncio.run(html_to_pdf('https://my-app.com/invoice/42', 'invoice42.pdf'))
Synchronous version (simpler):
from playwright.sync_api import sync_playwright
def generate_pdf(html_content: str, output: str):
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.set_content(html_content, wait_until='domcontentloaded')
        page.pdf(path=output, format='A4', print_background=True)
        browser.close()
generate_pdf('<h1>Report</h1><p>Processed data.</p>', 'report.pdf')
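Chromium can also print page numbers, comparable to the `[page] of [topage]` footer shown for pdfkit. As a sketch: `page.pdf()` accepts `display_header_footer`, `header_template` and `footer_template`, and fills elements carrying the classes `pageNumber` and `totalPages`. `pdf_options_with_footer` is a hypothetical helper that only assembles the keyword arguments. The styling values are arbitrary.

```python
def pdf_options_with_footer(path, label="Page"):
    """Return keyword arguments for page.pdf() that add a numbered footer."""
    footer = (
        '<div style="font-size:9px; width:100%; text-align:center;">'
        f'{label} <span class="pageNumber"></span>'
        ' of <span class="totalPages"></span></div>'
    )
    return {
        "path": path,
        "format": "A4",
        "print_background": True,
        "display_header_footer": True,
        # An empty header template keeps the top margin blank
        "header_template": "<span></span>",
        "footer_template": footer,
        # The footer is drawn inside the bottom margin, so leave room for it
        "margin": {"top": "1.5cm", "bottom": "2cm", "left": "1.5cm", "right": "1.5cm"},
    }
```

In the synchronous version above, the call would become `page.pdf(**pdf_options_with_footer('report.pdf'))`.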
Playwright Advantages
- Supports React, Vue, Angular and any SPA
- CSS Grid, Flexbox, Web Fonts (including Google Fonts)
- Waits for network requests to complete (wait_until='networkidle')
- Emulates devices (mobile, tablet) for responsive PDFs
REST API for PDF Generation
If your project generates PDFs from a web server, here's a Flask example:
from flask import Flask, request, send_file
from weasyprint import HTML
import io
app = Flask(__name__)
@app.route('/generate-pdf', methods=['POST'])
def generate_pdf():
    data = request.json
    html = f"""
    <html><body>
    <h1>Order #{data['order_id']}</h1>
    <p>Client: {data['client']}</p>
    <p>Total: {data['total']} USD</p>
    </body></html>
    """
    pdf_bytes = HTML(string=html).write_pdf()
    return send_file(
        io.BytesIO(pdf_bytes),
        mimetype='application/pdf',
        as_attachment=True,
        download_name=f"order_{data['order_id']}.pdf"
    )
if __name__ == '__main__':
    app.run(debug=True)
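The handler above interpolates request values straight into the markup, so a missing key raises a `KeyError` and a value containing `<` ends up as HTML. One way to tighten this with the standard library is sketched below. `render_order_html` and `REQUIRED_FIELDS` are hypothetical names, and the field list is taken from the example payload.

```python
from html import escape

# Fields the example handler reads from the JSON payload
REQUIRED_FIELDS = ("order_id", "client", "total")


def render_order_html(data):
    """Validate the payload and return HTML with all values escaped."""
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    # escape() turns <, >, & and quotes into HTML entities
    order_id, client, total = (escape(str(data[f])) for f in REQUIRED_FIELDS)
    return (
        "<html><body>"
        f"<h1>Order #{order_id}</h1>"
        f"<p>Client: {client}</p>"
        f"<p>Total: {total} USD</p>"
        "</body></html>"
    )
```

Inside the route, the call could be wrapped in `try`/`except ValueError` to answer with a 400 status, and the returned string passed to `HTML(string=...)` as before.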
Which Tool Should You Choose?
| Use case | Recommended tool |
|---|---|
| Invoices, reports, certificates (CSS-designed) | WeasyPrint |
| Existing web page without complex JS | pdfkit |
| Dashboard with React/Vue or dynamic content | Playwright |
| No browser or external binary to install | WeasyPrint |
| Production on Linux server | WeasyPrint or Playwright |
WeasyPrint is a good starting point for most static documents such as invoices, reports and certificates. If you need to render JavaScript, switch to Playwright.
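The table can be condensed into a small decision helper. This is only one reading of the recommendations above. `choose_tool` and its parameters are hypothetical.

```python
def choose_tool(needs_javascript=False, existing_web_page=False):
    """Pick a library name following the use-case table (illustrative only)."""
    if needs_javascript:
        # Dashboards, SPAs and other dynamic content need a real browser
        return "Playwright"
    if existing_web_page:
        # Existing page without complex JS
        return "pdfkit"
    # CSS-designed documents such as invoices, reports and certificates
    return "WeasyPrint"
```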