fix: forward storage_options as kwargs in CloudParquetDir #836

Open
SAY-5 wants to merge 1 commit into Lightning-AI:main from SAY-5:fix/parquet-storage-options-kwargs

SAY-5 commented Jun 18, 2026

Contributor
Before submitting
  • Was this discussed/agreed via a GitHub issue? (not needed for typo fixes and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

What does this PR do?

Fixes #835.

CloudParquetDir unpacked storage_options positionally with fsspec.filesystem(provider, *self.storage_options). Unpacking a dict with a single * passes its keys as positional arguments, so any non-empty storage_options (e.g. S3 credentials) raised TypeError: filesystem() takes 1 positional argument but ... were given. Switching to **self.storage_options forwards the options as keyword arguments, matching how litdata/raw/indexer.py already calls fsspec.filesystem.
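
For reviewers who want to see the failure mode in isolation, here is a minimal sketch. It uses a local stand-in function with the same parameter shape as fsspec.filesystem (one positional protocol plus keyword options) so it runs without fsspec installed; the stand-in and the credential values are illustrative assumptions, not code from this PR.

```python
def filesystem(protocol, **storage_options):
    """Stand-in (assumption) mirroring the parameter shape of fsspec.filesystem."""
    return protocol, storage_options


# Hypothetical S3-style options, for illustration only.
storage_options = {"key": "my-key", "secret": "my-secret"}

# Single star: iterating a dict yields its keys, so this call is equivalent to
# filesystem("s3", "key", "secret") and is rejected as too many positionals.
try:
    filesystem("s3", *storage_options)
    positional_error = None
except TypeError as exc:
    positional_error = str(exc)

# Double star: each key/value pair becomes a keyword argument.
protocol, forwarded = filesystem("s3", **storage_options)

print(positional_error)
print(protocol, forwarded)
```

With an empty dict both spellings behave the same, which is likely why the bug only surfaced once credentials were supplied.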

Added a regression test asserting the storage options reach fsspec.filesystem as keyword arguments.
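
The shape of such a regression test can be sketched as follows. This is not the test added in tests/streaming/test_parquet.py; make_filesystem is a hypothetical helper that mirrors the fixed call, and a unittest.mock.Mock stands in for fsspec.filesystem so the example is self-contained.

```python
from unittest import mock


def make_filesystem(factory, provider, storage_options):
    """Hypothetical helper mirroring the fixed call: options go in as kwargs."""
    return factory(provider, **(storage_options or {}))


def test_storage_options_forwarded_as_kwargs():
    # The mock plays the role of fsspec.filesystem and records how it is called.
    factory = mock.Mock(name="filesystem")
    options = {"key": "my-key", "secret": "my-secret"}  # illustrative values

    make_filesystem(factory, "s3", options)

    # Fails if the options were passed positionally instead of by keyword.
    factory.assert_called_once_with("s3", key="my-key", secret="my-secret")


test_storage_options_forwarded_as_kwargs()
```

Asserting on the recorded call, rather than on a real filesystem object, keeps the check free of network access and cloud credentials.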

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

Signed-off-by: Sai Asish Y <say.apm35@gmail.com>

codecov-commenter commented Jun 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81%. Comparing base (82a9c3c) to head (3b6b036).

Additional details and impacted files
@@         Coverage Diff         @@
##           main   #836   +/-   ##
===================================
  Coverage    81%    81%           
===================================
  Files        54     54           
  Lines      7671   7671           
===================================
  Hits       6186   6186           
  Misses     1485   1485           

Comment thread tests/streaming/test_parquet.py

Development

Successfully merging this pull request may close these issues.

index_parquet_dataset() fails with TypeError when passing storage_options for S3

3 participants