Manhuarm (ALL): fix OCR text pages not appearing by MuhamadSyabitHidayattulloh · Pull Request #83 · MuhamadSyabitHidayattulloh/extensions-source

MuhamadSyabitHidayattulloh · 2026-06-12T16:50:40Z

Extract OCR tokens directly from the page HTML via regex.
Bypass new anti-scraping measures that used iframes and Math.random detection.
Preserve all pages in the reader by mapping OCR data to original pages.
Improve null safety in DTO deserialization.
Remove obsolete WebView-based OCR interceptor and associated scripts.
Bump version code to 24.
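
The regex-based token extraction described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the variable name `ocrToken`, the token character set, and the helper `extractOcrToken` are all assumptions, since the page markup is not shown here.

```kotlin
// Hypothetical pattern: the real key name and token format on the site are not given in this PR.
private val TOKEN_REGEX = Regex("""ocrToken\s*[:=]\s*["']([A-Za-z0-9_\-]+)["']""")

// Returns the first token found in the raw page HTML, or null when the page has none.
// Returning null (instead of throwing) lets the caller keep the page without OCR data.
fun extractOcrToken(html: String): String? =
    TOKEN_REGEX.find(html)?.groupValues?.get(1)
```

Reading the token straight from the HTML string avoids loading the page in a WebView, which is presumably why the WebView-based interceptor could be removed.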

Checklist:

Updated extVersionCode value in build.gradle for individual extensions
Updated overrideVersionCode or baseVersionCode as needed for all multisrc extensions
Referenced all related issues in the PR body (e.g. "Closes #xyz")
Added the isNsfw = true flag in build.gradle when appropriate
Have not changed source names
Have explicitly kept the id if a source's name or language were changed
Have tested the modifications by compiling and running the extension through Android Studio
Have removed web_hi_res_512.png when adding a new extension
This PR is AI-assisted, I have reviewed the changes manually and confirmed they are not slop

- Extract OCR tokens directly from the page HTML via regex.
- Bypass new anti-scraping measures that used iframes and Math.random detection.
- Preserve all pages in the reader by mapping OCR data to original pages.
- Improve null safety in DTO deserialization.
- Remove obsolete WebView-based OCR interceptor and associated scripts.
- Bump version code to 24.

- Extract OCR tokens directly from the page HTML via robust regex.
- Use network.client directly for OCR POST requests to avoid interceptor recursion and ensure clean connection.
- Ensure OCR endpoint URL is absolute.
- Improve DTO serialization by using data classes and explicit @SerialName for private properties.
- Preserve all pages in the reader by mapping OCR data to original pages.
- Address code review feedback on null safety.
- Remove obsolete WebView-based OCR interceptor and associated scripts.
- Bump version code to 24.
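
As a rough illustration of the "ensure OCR endpoint URL is absolute" step, the sketch below resolves an endpoint against a base URL. It uses `java.net.URI` from the standard library so that it stays self-contained; the extension itself most likely relies on OkHttp's URL handling instead. The function name `toAbsoluteUrl` and the example paths are hypothetical.

```kotlin
import java.net.URI

// Resolves a possibly relative endpoint against the site's base URL.
// Already-absolute endpoints are returned unchanged.
// Note: pass root-relative paths ("/...") or a base URL ending in "/",
// because java.net.URI does not insert a slash after an empty base path.
fun toAbsoluteUrl(baseUrl: String, endpoint: String): String =
    URI(baseUrl).resolve(endpoint).toString()
```

For example, `toAbsoluteUrl("https://example.com", "/api/ocr")` yields `https://example.com/api/ocr`, while an endpoint that already starts with `https://` passes through untouched.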

- Extract OCR tokens directly from page HTML script tags.
- Use the main OkHttp client for OCR requests to maintain session/cookies.
- Make PAGE_REGEX robust to image URLs with query parameters and fragments.
- Refactor page matching logic to preserve all pages in the reader.
- Improve ComposedImageInterceptor reliability and Content-Type preservation.
- Update DTOs to follow contributor guidelines (regular classes, toJsonRequestBody).
- Remove obsolete WebView-based components.
- Bump version code to 24.
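
One way a page regex can tolerate query parameters and fragments is shown below. This is a sketch under assumptions: the actual `PAGE_REGEX` in the extension is not quoted in this PR, so the pattern, the extension list, and the helper `pageFileName` are illustrative only.

```kotlin
// Hypothetical pattern: captures the image file name and allows an optional
// "?query" and "#fragment" after it. \z anchors the match at the end of the input.
private val PAGE_REGEX = Regex(
    """/([^/?#]+\.(?:jpe?g|png|webp|gif))(?:\?[^#]*)?(?:#.*)?\z""",
    RegexOption.IGNORE_CASE,
)

// Returns the file name part of an image URL, or null if the URL is not an image.
fun pageFileName(imageUrl: String): String? =
    PAGE_REGEX.find(imageUrl)?.groupValues?.get(1)
```

Excluding `?` and `#` from the file-name character class is what keeps the trailing query string or fragment from ending up in the captured name.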

- Extract OCR tokens directly from page HTML script tags.
- Use the main OkHttp client for OCR requests to maintain session/cookies and bypass Cloudflare.
- Make PAGE_REGEX robust to various image URL formats, including potentially encoded fragments.
- Improve matching logic in pageListParse to reliably attach OCR data to existing pages.
- Enhance ComposedImageInterceptor with better error handling and proper Content-Type preservation.
- Update DTOs to follow CONTRIBUTING.md guidelines (regular classes, toJsonRequestBody).
- Remove obsolete WebView-based interceptor and associated scripts.
- Bump version code to 24.
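
The idea of attaching OCR data to existing pages without dropping any of them can be sketched as below. `ReaderPage`, `fileNameOf`, and `attachOcr` are stand-ins invented for this example; the real `pageListParse` works with the app's own page model and whatever key the OCR response actually uses.

```kotlin
// Stand-in for the reader's page model; the real extension uses the app's own Page class.
data class ReaderPage(val index: Int, val imageUrl: String, val ocrJson: String? = null)

// Hypothetical matching key: the image file name with any query string or fragment removed.
fun fileNameOf(url: String): String =
    url.substringBefore('#').substringBefore('?').substringAfterLast('/')

// Maps over the ORIGINAL page list, so every page survives.
// Pages with no matching OCR entry simply keep ocrJson = null.
fun attachOcr(pages: List<ReaderPage>, ocrByFileName: Map<String, String>): List<ReaderPage> =
    pages.map { page -> page.copy(ocrJson = ocrByFileName[fileNameOf(page.imageUrl)]) }
```

Iterating over the pages rather than over the OCR results is the design choice that matters here: a missing or failed OCR entry leaves the page in the reader without text, instead of removing the page altogether.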

MuhamadSyabitHidayattulloh added 4 commits June 12, 2026 16:49

MuhamadSyabitHidayattulloh wants to merge 4 commits into CI-PR from fix-manhuarm-ocr-pages-18443251943517762522
