mirror of https://gh.wpcy.net/https://github.com/bohdanbobrowski/blog2epub.git synced 2026-06-10 03:04:03 +08:00

Convert blog (blogspot.com, wordpress.com...) or any website to epub using GUI, CLI or Python. https://github.com/bohdanbobrowski/blog2epub



Bohdan Bobrowski df1adde354 AppImage		2025-11-27 22:22:09 +01:00


blog2epub

Convert a website (blog) to epub using the command line or a GUI.

My main goal in creating this app is to preserve the legacy of the blogosphere for future generations.

Supported blogs:

*.blogspot.com
*.wordpress.com
multiple other blogs and even some webpages

Main features

command-line interface (CLI) and graphical user interface (GUI),
downloads all text content of the selected blog into an epub file,
includes post comments where possible,
images are downsized to a given resolution (600*800, 640*960 or 1236*1648),
images are converted to grayscale by default,
one post = one epub chapter,
chapters are sorted by date, ascending,
the cover is generated automatically from downloaded images.
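
As a rough illustration of two of the features above (downsizing images to a target resolution and ordering chapters by date), here is a minimal sketch. It is not taken from the blog2epub sources: the `Post` class and both function names are hypothetical, and it assumes that images are scaled to fit inside the target size while keeping their aspect ratio, which the feature list does not spell out.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Post:
    """Hypothetical stand-in for one downloaded blog post."""
    title: str
    published: datetime


def fit_within(width: int, height: int, max_width: int, max_height: int) -> tuple[int, int]:
    """Scale (width, height) down to fit the bounds, keeping the aspect ratio.

    Assumption: images that already fit are left untouched (never upscaled).
    """
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def order_chapters(posts: list[Post]) -> list[Post]:
    """One post becomes one chapter; chapters are ordered oldest first."""
    return sorted(posts, key=lambda post: post.published)


if __name__ == "__main__":
    # A 1200x1600 photo scaled for the 600*800 target resolution.
    print(fit_within(1200, 1600, 600, 800))
    posts = [
        Post("Newer post", datetime(2024, 5, 1)),
        Post("Older post", datetime(2023, 1, 15)),
    ]
    print([post.title for post in order_chapters(posts)])
```

Never upscaling small images is a design choice of this sketch, not documented behaviour of the app.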

Example covers

Installation

Check out the latest available builds.

Running from sources

Easiest way

pip install git+https://github.com/bohdanbobrowski/blog2epub.git

Developer environment

git clone git@github.com:bohdanbobrowski/blog2epub.git
cd blog2epub
python -m venv venv

Windows:

venv\Scripts\activate
pip install -e .[dev]

macOS/Linux:

source ./venv/bin/activate
pip install -e .[dev]

Building your own executable

The build environment should contain only what is needed to prepare the build:

pip install .
pip install pyinstaller

Android

buildozer android debug

Windows

python build_windows.py
"C:\Program Files (x86)\NSIS\makensis" blog2epub_windows_installer.nsi

macOS

python build_macos.py
./make_macos_dmg.sh

Linux

This is always a struggle (we all know why), but the AppImage build is at least more or less finished. Most importantly, building Linux images requires a dedicated environment with a minimal set of installed packages:

python -m venv ./venv_build
source ./venv_build/bin/activate
pip install .
pip install pyinstaller

AppImage

First, prepare "binary":

python build_linux.py

...and finally:

./make_linux_appimage.sh

To build a signed AppImage, use this command:

./make_linux_appimage.sh --sign

Snap

This is promising, despite taking 150 MB, but I still have issues with the plyer modules called fileselect and notification.

snapcraft pack

Screenshots of GUI

Android (Google Pixel 6a)

Windows (11)

Linux (Ubuntu 24.04)

macOS (Sequoia 15.6)

CLI

blog2epub --help
usage: Blog2epub Cli interface [-h] [-l LIMIT] [-s SKIP] [-q QUALITY] [-o OUTPUT] [-d] url

Convert blog (blogspot.com, wordpress.com or another based on Wordpress) to epub using CLI or GUI.

positional arguments:
  url                   url of blog to download

options:
  -h, --help            show this help message and exit
  -l LIMIT, --limit LIMIT
                        articles limit
  -s SKIP, --skip SKIP  number of skipped articles
  -q QUALITY, --quality QUALITY
                        images quality (0-100)
  -o OUTPUT, --output OUTPUT
                        output epub file name
  -d, --debug           turn on debug
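
For readers who want to see how such an interface is typically declared, below is a minimal `argparse` sketch reconstructed only from the help text above. It is not the project's actual CLI code: the `build_parser` name, the `int` types for `--limit`, `--skip` and `--quality`, and the `None` defaults are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Rebuild a parser that accepts the options listed in the help output."""
    parser = argparse.ArgumentParser(
        prog="blog2epub",
        description="Convert blog to epub using CLI or GUI.",
    )
    parser.add_argument("url", help="url of blog to download")
    # Assumption: numeric options are parsed as integers.
    parser.add_argument("-l", "--limit", type=int, default=None, help="articles limit")
    parser.add_argument("-s", "--skip", type=int, default=None, help="number of skipped articles")
    parser.add_argument("-q", "--quality", type=int, default=None, help="images quality (0-100)")
    parser.add_argument("-o", "--output", default=None, help="output epub file name")
    parser.add_argument("-d", "--debug", action="store_true", help="turn on debug")
    return parser


if __name__ == "__main__":
    # The "-l=2" spelling is accepted by argparse as well as "-l 2".
    args = build_parser().parse_args(["starybezpiek.blogspot.com", "-l=2", "-o=example.epub"])
    print(args.url, args.limit, args.output)
```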

Example:

blog2epub starybezpiek.blogspot.com -l=2 -o=example.epub
Starting blogger.com crawler
Found 54 articles to crawl.
Downloading.
1. 10 lat kremlowskiej propagandy, czyli RT ujawnia swoje sekrety
Downloading.
2. "Komunę obaliliśmy, a nadal jest źle. Dlaczego?" Czyli 1984 Orwella właściwie odczytany
Locale set as en_US.UTF-8
Generating cover (800px*600px) from 1 images.
Cover generated: .\starybezpiek.blogspot.com\example.epub.jpg
Epub created: .\example.epub

Examples

blog2epub starybezpiek.blogspot.com
blog2epub velosov.blogspot.com -l=10
blog2epub poznanskiehistorie.blogspot.com -q=100
blog2epub classicameras.blogspot.com --limit=10 --no-images
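
If you prefer to drive the CLI from a Python script, the examples above can be wrapped in a small helper. This is only a sketch: `build_command` and `run` are hypothetical names, only flags shown in the `--help` output are used, and it assumes the `blog2epub` executable is on your `PATH`.

```python
import shutil
import subprocess
from typing import Optional


def build_command(
    url: str,
    limit: Optional[int] = None,
    skip: Optional[int] = None,
    quality: Optional[int] = None,
    output: Optional[str] = None,
    debug: bool = False,
) -> list[str]:
    """Translate keyword arguments into a blog2epub command line."""
    command = ["blog2epub", url]
    if limit is not None:
        command.append(f"--limit={limit}")
    if skip is not None:
        command.append(f"--skip={skip}")
    if quality is not None:
        command.append(f"--quality={quality}")
    if output is not None:
        command.append(f"--output={output}")
    if debug:
        command.append("--debug")
    return command


def run(command: list[str]) -> int:
    """Run the command and return its exit code; fail early if the tool is missing."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} not found on PATH")
    return subprocess.run(command, check=True).returncode


if __name__ == "__main__":
    # Only prints the command; call run(...) to actually start a download.
    print(build_command("velosov.blogspot.com", limit=10))
```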

Running tests

pytest ./tests
pytest --cov=blog2epub ./tests
pytest --cov=blog2epub --cov-report=html ./tests

Current version

v1.5.0

integration testing
increase unit test coverage
use sitemaps.xml for scraping
crawlers refactor
- use builtin dataclasses instead of pydantic
- more common methods in crawler class
- expand crawler abstract
cli interface refactor
greek alphabet support
image download and attachment bug solved (e.g. modernistyczny-poznan.blogspot.com)
color/bw images and covers
custom image/cover sizes
improved resistance to http errors
dedicated crawler class for zeissikonveb.de
(in the GUI) the skip value is increased by the limit value (if one is set)
download progress is much more verbose, and in the GUI it can be cancelled at any time
removed poetry, as it was overcomplicated for this use case,
results of cancelled downloads might be converted to epub
Android version
Windows installer (published on Microsoft Store)
Linux packages: AppImage and Snap (still experimental)
GitHub actions builds for macOS, Windows and Linux
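
To give an idea of what scraping via a sitemap (one of the items in the list above) involves, here is a minimal standard-library sketch. It is not the project's crawler code: the `urls_from_sitemap` name is hypothetical, and it assumes the blog publishes a sitemap in the standard sitemaps.org XML format.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Return every <loc> URL found in a sitemap document, in document order."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


if __name__ == "__main__":
    # Hypothetical sitemap content; a real one would be downloaded from the blog.
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://example.com/2024/01/first-post.html</loc></url>"
        "<url><loc>https://example.com/2024/02/second-post.html</loc></url>"
        "</urlset>"
    )
    for url in urls_from_sitemap(sample):
        print(url)
```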

» Complete Changelog here «

Project backlog

And finally, a list of known bugs and plans for new functions and enhancements: Backlog.md

Road map:

1.0 - somewhat working
2.0 - fully working project, 90% unit tested, with builds available for Android/Windows/Linux/macOS
