Upgrade python-scraperlib to 3.x, including CLI support for descripti…#191
clencyc wants to merge 2 commits into
…on / long_description flags
- please upgrade directly to latest 5.x (5.3 ATM)
- please add an entry to the CHANGELOG
- please link this PR to the issue it fixes so that the issue gets closed automatically on merge
- I don't see where you've truncated the description, and we should never truncate the description but instead invite users to pass an adequate one
- please take inspiration from how things are done in other active scrapers (youtube, gutenberg, mindtouch, freecodecamp, ...); there is no "perfect" scraper ATM, but there are at least good "vibes" to draw from
AFAIK, the scraper is a bit broken ATM, so how did you test your changes? I've always considered #175 to be the most urgent issue to tackle, but I'm glad if you find a way to do better. I will not merge something we have not tested.
- Upgraded zimscraperlib from 3.x to 5.2.0
- Updated Jinja2 to 3.x for MarkupSafe compatibility
- Updated lxml to 5.x for Python 3.13 support
- Added long_description parameter support (up to 4000 chars)
- Removed description truncation, added validation warnings instead
- Updated imports for zimscraperlib 5.2.0 API changes
- Defined local metadata constants

Fixes openzim#175
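For readers following the "validation warnings instead of truncation" item, here is a minimal sketch of what such a check could look like. It is not the code from this PR: `check_descriptions` and the two constants are hypothetical names, the limits are simply the 80 and 4000 figures quoted in this thread, and `len()` counts code points, which may differ from how zimscraperlib measures length.

```python
import logging

logger = logging.getLogger("openedx2zim")  # logger name is an assumption

# Hypothetical local constants, using the limits quoted in this PR.
MAX_DESCRIPTION_LENGTH = 80
MAX_LONG_DESCRIPTION_LENGTH = 4000


def check_descriptions(description, long_description=None):
    """Warn about over-long metadata without ever altering the input.

    Returns the list of warning messages so callers can decide
    whether to abort or carry on.
    """
    problems = []
    if len(description) > MAX_DESCRIPTION_LENGTH:
        problems.append(
            f"description is {len(description)} chars, "
            f"limit is {MAX_DESCRIPTION_LENGTH}"
        )
    if (
        long_description is not None
        and len(long_description) > MAX_LONG_DESCRIPTION_LENGTH
    ):
        problems.append(
            f"long_description is {len(long_description)} chars, "
            f"limit is {MAX_LONG_DESCRIPTION_LENGTH}"
        )
    for message in problems:
        logger.warning(message)
    return problems
```

Returning the messages rather than raising keeps the decision with the caller; a stricter variant could raise `ValueError` so that an inadequate description stops the run early.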
@benoit74 Thanks for the review! I've made the following changes:
✅ Upgraded zimscraperlib from 3.x to 5.2.0
The scraper now initializes correctly and the CLI help shows both
I would be really surprised if this works as expected, given all the breaking changes in zimscraperlib 4 and 5. Waiting for your input after you've looked into #175.
Upgrades zimscraperlib from 1.x to 3.x and adds support for the --long-description CLI flag, as required by the new metadata API.
Changes
requirements.txt
Bumped zimscraperlib>=1.3.6,<1.4 to >=3.4.0,<4.0
entrypoint.py
Added --long-description CLI flag (max 4000 chars)
Updated --description help text to mention the 80-char limit
scraper.py
Added long_description parameter to Openedx2Zim.__init__()
Imported MAXIMUM_DESCRIPTION_METADATA_LENGTH from zimscraperlib.zim.metadata
Updated get_zim_info() to truncate description to 80 chars using the constant
Updated get_zim_info() to include long_description in the returned dict
Renamed favicon= → illustration= in make_zim_file() call (3.x API change)
Added long_description= to make_zim_file() call
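As an illustration of the CLI side of these changes, the sketch below shows one way the two flags could be declared and forwarded. It is a hedged stand-in, not the actual entrypoint.py: `build_parser`, `collect_metadata`, and the `openedx2zim` program name are assumptions, and the sketch deliberately avoids calling zimscraperlib.

```python
import argparse


def build_parser():
    """Declare the two description flags discussed in this PR."""
    parser = argparse.ArgumentParser(prog="openedx2zim")  # prog name assumed
    parser.add_argument(
        "--description",
        default=None,
        help="Description of the ZIM (max 80 chars)",
    )
    parser.add_argument(
        "--long-description",
        dest="long_description",
        default=None,
        help="Long description of the ZIM (max 4000 chars)",
    )
    return parser


def collect_metadata(args):
    """Hypothetical helper: gather metadata, omitting unset optional fields."""
    info = {"description": args.description}
    if args.long_description is not None:
        info["long_description"] = args.long_description
    return info
```

Example usage:

```python
args = build_parser().parse_args(
    ["--description", "A course", "--long-description", "A longer summary"]
)
metadata = collect_metadata(args)
```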