Page holdings checker improvements #59
ETT-1560 filename format validation
ETT-1561 loading states and clear button
accept .tsv regardless of extension case
focus output container on load, fix VoiceOver announcement
add encoding detection, member ID validation, and pluralization fix
Improve error message clarity and consistency in holdings checker
Fix meta display, add clarifying comments, fix indentation
import { buildCard, buildPendingCard, showError } from './ui.js';

// Fetch is CORS-blocked outside www.hathitrust.org (local dev, test); catch returns null and member ID check is silently skipped.
const memberIdsPromise = fetch('https://www.hathitrust.org/files/ht_institutions.tsv')
Because of CORS, we can only fully test the member ID check once it's in production.
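A minimal sketch of the fallback described in the code comment above, not the PR's actual code. The helper names `loadMemberIds` and `checkMemberId` are hypothetical, and the parsing assumes the institutions file has a header row with the member ID in the first tab-separated column. Only the URL comes from the diff.

```javascript
// URL taken from the diff; everything else here is illustrative.
const INSTITUTIONS_URL = 'https://www.hathitrust.org/files/ht_institutions.tsv';

// Resolves to a Set of member IDs, or to null when the fetch fails
// (for example when CORS blocks it outside www.hathitrust.org).
// Assumption: row 1 is a header and column 1 holds the member ID.
async function loadMemberIds(fetchImpl = fetch, url = INSTITUTIONS_URL) {
  try {
    const res = await fetchImpl(url);
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    const text = await res.text();
    const ids = text
      .split(/\r?\n/)
      .slice(1)
      .map((line) => line.split('\t')[0].trim())
      .filter(Boolean);
    return new Set(ids);
  } catch {
    return null;
  }
}

// Returns an error message, or null when the ID is known
// or when the list is unavailable and the check is skipped.
function checkMemberId(memberId, memberIds) {
  if (memberIds === null) return null; // list unavailable: skip silently
  return memberIds.has(memberId) ? null : `Unrecognized member ID "${memberId}"`;
}
```

Passing the fetch function in as a parameter is only there so the blocked-fetch path can be exercised locally with a stub instead of waiting for production.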
@eumalin Not sure if this is out of scope, but I found a pattern of false positives in the recently submitted file from Example line: holdings-backend's I think it would be fair to make a new ticket to address OCN parsing if it is indeed out of scope. It's also fair to say that
This includes tickets ETT-1507, ETT-1508, ETT-1559, ETT-1560, ETT-1561, and ETT-1563.
You can check it out in dev at https://dev-3.www.hathitrust.org/print-holdings-checker/
ETT-1559 - Code quality
I addressed security in the previous PR for this ticket.
In this PR I used Promise.all instead of a manual counter for parallel file processing. There are a few smaller cleanups too: the file extension check is now case-insensitive, so .TSV works.
Page loaded and valid .TSV accepted
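As a rough illustration of the Promise.all approach, here is a sketch under assumptions: `processFile` and `processFiles` are hypothetical names, and the per-file work is reduced to the case-insensitive extension check.

```javascript
// Hypothetical per-file check; the real one would read and validate the file.
async function processFile(file) {
  const ok = /\.tsv$/i.test(file.name); // case-insensitive, so .TSV passes
  return { name: file.name, ok };
}

// Starts every file at once and waits for all of them.
// Promise.all keeps results in input order, so no manual counter is needed.
// Each worker's failure is converted to a result so one bad file
// does not reject the whole batch.
function processFiles(files, worker = processFile) {
  return Promise.all(
    Array.from(files, (file) =>
      Promise.resolve()
        .then(() => worker(file))
        .catch((err) => ({ name: file.name, ok: false, error: String(err.message || err) }))
    )
  );
}
```

Catching per file is a design choice in this sketch: a bare Promise.all rejects as soon as any one promise rejects, which would hide the results for the other files.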
ETT-1560 - Filename validation
Here we check a bunch of things: file extension, member ID, holding type, update type, and date.
partial is an error since the backend never implemented it. Date has to be YYYYMMDD and a real calendar date. Member ID is validated against the HathiTrust institutions list, fetched on load.
The institutions fetch is blocked by CORS everywhere except www.hathitrust.org itself, so the member ID check only runs in production. Elsewhere the fetch fails silently and the check is skipped rather than blocking, so we can only realistically test it fully once it's in production.
Bad date in filename
Partial update type rejected
Unrecognized member ID
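The filename checks could look roughly like the sketch below. Several things here are assumptions rather than facts from this PR: the underscore-separated field order, the list of holding types, `full` being the only accepted update type, and the names `isRealDate` and `validateFilename`. The date check and the rejection of `partial` follow the description above.

```javascript
// Assumed layout: <member_id>_<holding_type>_<update_type>_<YYYYMMDD>.tsv
const HOLDING_TYPES = new Set(['mix', 'mon', 'spm', 'mpm', 'ser']); // assumed list

// True only for an 8-digit string that names a real calendar date.
function isRealDate(yyyymmdd) {
  if (!/^\d{8}$/.test(yyyymmdd)) return false;
  const year = Number(yyyymmdd.slice(0, 4));
  const month = Number(yyyymmdd.slice(4, 6));
  const day = Number(yyyymmdd.slice(6, 8));
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date rolls impossible days over (Feb 30 becomes Mar 1),
  // so compare the parts back against the input.
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
}

// Returns a list of error messages; an empty list means the name passed.
// memberIds is a Set, or null when the institutions list is unavailable.
function validateFilename(name, memberIds = null) {
  if (!/\.tsv$/i.test(name)) return ['File must have a .tsv extension'];
  const parts = name.replace(/\.tsv$/i, '').split('_');
  if (parts.length !== 4) return ['Expected four underscore-separated parts'];
  const [memberId, holdingType, updateType, date] = parts;
  const errors = [];
  if (memberIds && !memberIds.has(memberId)) errors.push(`Unrecognized member ID "${memberId}"`);
  if (!HOLDING_TYPES.has(holdingType)) errors.push(`Unknown holding type "${holdingType}"`);
  if (updateType === 'partial') errors.push('Update type "partial" is not supported');
  else if (updateType !== 'full') errors.push(`Unknown update type "${updateType}"`);
  if (!isRealDate(date)) errors.push(`Date "${date}" must be YYYYMMDD and a real calendar date`);
  return errors;
}
```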
ETT-1561 - Loading states and accessibility
Each file gets a spinner indicator from the moment it's dropped until processing finishes. Added aria-live="polite" to the results container. VoiceOver on Safari doesn't reliably announce live region updates on a container that starts as display:none, so we also set aria-label to "Processing N files..." and focus the container, which gets announced on focus instead. A Clear results button appears after processing. The spinner respects prefers-reduced-motion.
Spinner loading state
Multiple files - one pass, one error
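The VoiceOver workaround can be sketched as follows. `processingLabel` and `announceProcessing` are hypothetical names, and the sketch assumes the results container was hidden with an inline display:none and is not natively focusable, hence the tabindex.

```javascript
// Plural-aware label: "Processing 1 file..." vs "Processing 3 files...".
function processingLabel(count) {
  return `Processing ${count} ${count === 1 ? 'file' : 'files'}...`;
}

// container is the results element. Assumption: it starts as display:none.
function announceProcessing(container, count) {
  container.style.display = 'block';
  container.setAttribute('aria-live', 'polite');
  container.setAttribute('aria-label', processingLabel(count));
  // tabindex="-1" lets a non-interactive element receive programmatic focus.
  container.setAttribute('tabindex', '-1');
  // Focusing makes the screen reader announce the label right away,
  // instead of relying on the live region update.
  container.focus();
}
```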
ETT-1507, ETT-1508, ETT-1563 - Data row validation
Checks the first 1,000 rows and notes on the card when a file was too large to check fully.
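A sketch of the row cap, assuming a hypothetical `validateRow` callback that returns a list of error strings for one row, and assuming there is no header row. The `truncated` flag is what a card would use to note that the file was too large to check fully.

```javascript
const MAX_ROWS = 1000;

// Validates at most maxRows non-empty lines and reports whether any were left unchecked.
// Row numbers count non-empty lines, starting at 1.
function checkRows(text, validateRow, maxRows = MAX_ROWS) {
  const rows = text.split(/\r?\n/).filter((line) => line !== '');
  const checked = rows.slice(0, maxRows);
  const errors = [];
  checked.forEach((line, index) => {
    for (const message of validateRow(line.split('\t'))) {
      errors.push({ row: index + 1, message });
    }
  });
  return {
    errors,
    checkedRows: checked.length,
    totalRows: rows.length,
    truncated: rows.length > maxRows,
  };
}
```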

If there were a lot of invalid status values:

99, over 15 digits, with hint to use OCLC numbers instead), and other non-integer values (must be a positive integer)
Status must be CH, LM, WD, or empty. Condition must be BRT or empty. Govdoc must be 0, 1, or empty.
Data row errors
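The field rules above could be expressed as the sketch below. It takes a row as an object with named fields to avoid assuming a column order. The function names, the error wording, and the exact, case-sensitive matching are assumptions. It covers only the OCN rules that are readable in this description (over 15 digits, non-integer values).

```javascript
const STATUS_VALUES = new Set(['CH', 'LM', 'WD', '']);
const CONDITION_VALUES = new Set(['BRT', '']);
const GOVDOC_VALUES = new Set(['0', '1', '']);

// Returns an error message for a bad OCN, or null when it is acceptable.
function validateOcn(value) {
  if (!/^\d+$/.test(value) || /^0+$/.test(value)) {
    return 'OCN must be a positive integer';
  }
  if (value.length > 15) {
    return 'OCN has more than 15 digits; use OCLC numbers instead';
  }
  return null;
}

// row: { ocn, status, condition, govdoc }, all strings; missing fields count as empty.
function validateRow(row) {
  const { ocn = '', status = '', condition = '', govdoc = '' } = row;
  const errors = [];
  const ocnError = validateOcn(ocn);
  if (ocnError) errors.push(ocnError);
  if (!STATUS_VALUES.has(status)) errors.push('Status must be CH, LM, WD, or empty');
  if (!CONDITION_VALUES.has(condition)) errors.push('Condition must be BRT or empty');
  if (!GOVDOC_VALUES.has(govdoc)) errors.push('Govdoc must be 0, 1, or empty');
  return errors;
}
```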
If there were a lot of rows with similar errors:

Encoding detection
I used TextDecoder with fatal: true to detect non-UTF-8 files. If it throws, we warn the user and fall back to lenient decoding so the rest of the checks still run.
Non-UTF-8 encoding warning
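A minimal sketch of that detect-then-fall-back pattern. The function name `decodeBytes` and the shape of the return value are hypothetical.

```javascript
// bytes: a Uint8Array or ArrayBuffer with the file contents.
// Returns the decoded text plus a flag the caller can use to show a warning.
function decodeBytes(bytes) {
  try {
    // With fatal: true, decode() throws a TypeError on invalid UTF-8.
    const text = new TextDecoder('utf-8', { fatal: true }).decode(bytes);
    return { text, isUtf8: true };
  } catch {
    // Lenient decoding replaces invalid sequences with U+FFFD,
    // so the remaining checks can still run on the text.
    const text = new TextDecoder('utf-8').decode(bytes);
    return { text, isUtf8: false };
  }
}
```

In the browser the bytes would typically come from `await file.arrayBuffer()` on the dropped File object.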
Rejected files
Non .tsv files get an inline error.
CSV rejected
Created with help of AI