docs: document how data is organized in SST files #2531

Merged
fengjiachun merged 3 commits into main from docs/sst-data-layout
Jun 9, 2026

evenyag commented Jun 5, 2026

Contributor

What's Changed in this PR

Explain the physical SST layout in the storage engine docs:

  • storage-engine.md: add Data Layout in SST Files (sort order, internal
    columns, row groups, file metadata) and Scan Pruning sections with
    diagrams; touch up the WAL/SST/Compactor bullets.
  • data-persistence-indexing.md: note skipping and full-text indexes are
    also stored in Puffin files.
  • data-model.md and design-table.md: cross-link to the storage engine doc
    for how data is laid out in SST files.

Checklist

  • Please confirm that all corresponding versions of the documents have been revised.
  • Please ensure that the content in sidebars.ts matches the current document structure when you changed the document structure.
  • This change requires follow-up update in localized docs.

Signed-off-by: evenyag <realevenyag@gmail.com>

Contributor Author

Will sync the versioned docs and zh docs later.

cloudflare-workers-and-pages Bot commented Jun 5, 2026

Deploying greptime-docs with Cloudflare Pages

Latest commit: 8d49806
Status: ✅  Deploy successful!
Preview URL: https://24954488.greptime-docs.pages.dev
Branch Preview URL: https://docs-sst-data-layout.greptime-docs.pages.dev

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 83ff57a1f4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread docs/contributor-guide/datanode/storage-engine.md Outdated
React/MDX requires the style prop to be an object, not a string.
A string value made SSR fail and broke the Docusaurus build.

Signed-off-by: evenyag <realevenyag@gmail.com>
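
To illustrate the fix described in this commit message: JSX expects `style` as an object with camelCase keys (for example `style={{ maxWidth: "600px" }}`), whereas HTML uses a string (`style="max-width: 600px"`). The sketch below shows the mapping between the two forms. `styleStringToObject` is a hypothetical helper written for illustration only; it is not part of this PR or of Docusaurus, and it assumes simple declarations without semicolons inside values.

```typescript
// Hypothetical helper (an assumption for illustration, not code from the PR):
// converts an HTML-style string such as "max-width: 600px; margin: 0 auto"
// into the object form that a JSX `style` prop expects.
function styleStringToObject(style: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const declaration of style.split(";")) {
    // Split on the first colon only, so values like "url(http://...)" survive.
    const separator = declaration.indexOf(":");
    if (separator === -1) continue; // skip empty or malformed declarations
    const property = declaration.slice(0, separator).trim();
    const value = declaration.slice(separator + 1).trim();
    if (!property || !value) continue;
    // CSS custom properties (--foo) keep their name; others become camelCase.
    const key = property.startsWith("--")
      ? property
      : property.replace(/-([a-z])/g, (_match: string, letter: string) =>
          letter.toUpperCase()
        );
    result[key] = value;
  }
  return result;
}

// Example: the string form that broke SSR, converted to the object form.
const imageStyle = styleStringToObject("max-width: 600px; margin: 0 auto");
console.log(imageStyle);
```

In an MDX file the object would normally be written inline on the element rather than computed at runtime; the helper only makes the key and value mapping explicit.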

fengjiachun left a comment

Contributor

LGTM

evenyag commented Jun 9, 2026

Contributor Author

@codex update zh docs and version 1.0 docs.

@chatgpt-codex-connector

Summary

  • Synced the 1.0 storage engine docs with the SST data layout and scan-pruning sections, including row ordering, internal columns, row-group/file metadata, SST format links, and React object-style image props.
  • Updated 1.0 indexing docs to state that inverted, skipping, and full-text indexes are stored as Puffin blobs.
  • Added 1.0 cross-links from the data model and table design docs to the SST layout documentation.
  • Added matching zh current and zh 1.0 translations for the SST layout, scan pruning, Puffin index storage, and SST-layout cross-links.
  • Confirmed the MDX image style props remain JSX object styles, consistent with Docusaurus MDX guidance that HTML-style string style values should be written as JSX object styles: https://docusaurus.io/docs/3.9.2/markdown-features/react
  • Committed the changes as d81e7ca and created a PR titled "docs: sync SST layout docs to zh and 1.0".

Testing

  • pnpm install --frozen-lockfile
  • ⚠️ pnpm build (raw build was blocked by pre-existing untracked Yarn PnP files .pnp.cjs / .pnp.loader.mjs intercepting resolution)
  • mv .pnp.cjs /tmp/docs-pnp.cjs.codexbak && mv .pnp.loader.mjs /tmp/docs-pnp.loader.mjs.codexbak && pnpm build; code=$?; mv /tmp/docs-pnp.cjs.codexbak .pnp.cjs; mv /tmp/docs-pnp.loader.mjs.codexbak .pnp.loader.mjs; exit $code
  • git diff --check -- docs versioned_docs/version-1.0 i18n/zh/docusaurus-plugin-content-docs/current i18n/zh/docusaurus-plugin-content-docs/version-1.0

fengjiachun left a comment

Contributor

LGTM

fengjiachun merged commit b0a168b into main Jun 9, 2026
5 checks passed
fengjiachun deleted the docs/sst-data-layout branch June 9, 2026 08:23
evenyag added a commit that referenced this pull request Jun 10, 2026
Main's PR #2531 added authoritative "Data Layout in SST Files" and "Scan Pruning" sections to the storage engine docs. Trim the overlapping explanation in the table design guide and link out to those sections instead. Also fix the storage engine cross-link to the renamed "SST format" section.

Signed-off-by: evenyag <realevenyag@gmail.com>