implement translation infrastructure - use Portuguese as test example #18

Open
rpetit3 wants to merge 11 commits into master from i18n-translations

rpetit3 commented May 18, 2026

Summary

Type of Change

  • Content update (new or modified documentation)
  • Correction (typo, broken link, inaccurate information)
  • Site infrastructure (config, styling, components, CI/CD)
  • Auto-generation (templates, scripts, data files)

Version Impact

  • This change affects the current live version only (no snapshot needed)
  • A version snapshot should be created before merging (new Bactopia release)
  • A snapshot rebuild is needed after merging (fix to snapshotted content)

Checklist

  • Site builds without errors (npm run build)
  • Changes verified in the dev server (npm start)
  • snapshots.json updated (if adding/removing a version)
  • LLM catalog regenerated (make llms-catalog) if pages were added/removed/renamed

Copilot AI review requested due to automatic review settings May 18, 2026 17:52

netlify Bot commented May 18, 2026

Deploy Preview for bactopia-docs ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | ad601ac |
| 🔍 Latest deploy log | https://app.netlify.com/projects/bactopia-docs/deploys/6a0b7a44d23b5e0008670f09 |
| 😎 Deploy Preview | https://deploy-preview-18--bactopia-docs.netlify.app |

Copilot AI left a comment

Pull request overview

This PR adds initial translation infrastructure for the Bactopia documentation site, using Portuguese as the first test locale.

Changes:

  • Enables Docusaurus i18n for English and Portuguese with a locale dropdown.
  • Adds Claude-based translation tooling, prompts, glossary post-processing, sync/verify commands, and Makefile targets.
  • Adds initial Portuguese UI translation JSON files and updates acknowledgements for translation attribution.

Reviewed changes

Copilot reviewed 14 out of 32 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `docusaurus.config.ts` | Enables Portuguese locale and adds locale dropdown. |
| `Makefile` | Adds translation sync/full/verify targets. |
| `bin/translate/__init__.py` | Adds package metadata comments for translation tooling. |
| `bin/translate/__main__.py` | Enables `python -m bin.translate`. |
| `bin/translate/api.py` | Adds Claude API integration and retry/continuation logic. |
| `bin/translate/cli.py` | Adds translation CLI commands for sync, single-file translation, and verification. |
| `bin/translate/config.py` | Defines translation constants, paths, plugin mappings, and API key validation. |
| `bin/translate/glossary.py` | Adds glossary-based post-processing and admonition handling. |
| `bin/translate/sync.py` | Adds source discovery, hash tracking, and orphan detection. |
| `bin/translate/verify.py` | Adds structural validation checks for translated files. |
| `bin/translate/prompts/general.md` | Adds general translation rules for Markdown/MDX docs. |
| `bin/translate/prompts/pt.md` | Adds Brazilian Portuguese-specific translation guidance. |
| `data/translations/pt/glossary.yml` | Adds Portuguese glossary and protected terms. |
| `i18n/pt/code.json` | Adds Portuguese theme/UI strings. |
| `i18n/pt/docusaurus-theme-classic/navbar.json` | Adds Portuguese navbar strings. |
| `i18n/pt/docusaurus-theme-classic/footer.json` | Adds Portuguese footer strings. |
| `i18n/pt/docusaurus-plugin-content-blog/options.json` | Adds localized blog metadata. |
| `i18n/pt/docusaurus-plugin-content-docs/current.json` | Adds localized main docs metadata. |
| `i18n/pt/docusaurus-plugin-content-docs-impact/current.json` | Adds localized impact docs metadata. |
| `i18n/pt/docusaurus-plugin-content-docs-developers/current.json` | Adds localized developer docs metadata. |
| `i18n/pt/docusaurus-plugin-content-docs-bactopia-tools/current.json` | Adds localized tools docs metadata. |
| `i18n/pt/docusaurus-plugin-content-docs-bactopia-pipelines/current.json` | Adds localized pipelines docs metadata. |
| `templates/acknowledgements.j2` | Adds translation infrastructure acknowledgement to generated content. |
| `impact/acknowledgements.md` | Regenerates acknowledgements with translation attribution. |
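
The summary for bin/translate/sync.py mentions hash tracking. As a rough illustration of that idea (not the PR's actual code), the sketch below decides which source files need retranslation by comparing SHA-256 digests against a recorded manifest. The function names and the manifest format (relative path mapped to digest) are assumptions made for this example.

```python
import hashlib


def content_digest(text: str) -> str:
    """Return a SHA-256 hex digest of the source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def files_needing_translation(
    sources: dict[str, str], manifest: dict[str, str]
) -> list[str]:
    """List source paths that are new or changed since the manifest was written.

    sources:  relative path -> current source text
    manifest: relative path -> digest recorded at the last translation
              (hypothetical format, assumed for this sketch)
    """
    stale = []
    for path, text in sorted(sources.items()):
        # A missing manifest entry (new file) or a different digest
        # (edited file) both mean the translation is out of date.
        if manifest.get(path) != content_digest(text):
            stale.append(path)
    return stale
```

Hashing content rather than checking modification times keeps the result stable across fresh clones and CI checkouts, where timestamps are not meaningful.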

Comment thread docusaurus.config.ts Outdated
locales: ['en', 'pt'],
localeConfigs: {
en: { label: 'English', direction: 'ltr' },
pt: { label: 'Portugues', direction: 'ltr' },
rpetit3 and others added 9 commits May 18, 2026 13:16
- Add blog to PLUGIN_MAP with versioned flag for correct i18n path
- Fix sync.py to handle non-versioned plugins (blog has no current/ subdir)
- Add prompt caching to API calls (system prompt cached across all calls)
- Update prompt tone from "professional" to "friendly and approachable"
- Remove __pycache__ from tracking and add to .gitignore

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
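
To illustrate the versioned flag described in this commit, here is a minimal sketch of how a source path could map to its i18n destination. The plugin folder names match those listed in the file summary. The shape of `PLUGIN_MAP`, the `docs` and `blog` source directory names, and the `translated_path` helper are assumptions for this example, not the PR's code.

```python
from pathlib import PurePosixPath

# Hypothetical mapping: source directory -> (Docusaurus plugin folder, versioned?)
PLUGIN_MAP = {
    "docs": ("docusaurus-plugin-content-docs", True),
    "blog": ("docusaurus-plugin-content-blog", False),
}


def translated_path(source: str, locale: str) -> PurePosixPath:
    """Map a source file such as 'docs/tutorial.md' to its i18n location."""
    section, _, rest = source.partition("/")
    plugin_dir, versioned = PLUGIN_MAP[section]
    base = PurePosixPath("i18n") / locale / plugin_dir
    if versioned:
        # Docs plugins keep the unversioned translation under current/.
        base = base / "current"
    # Blog translations sit directly in the plugin folder.
    return base / rest
```

With this mapping, `translated_path("docs/tutorial.md", "pt")` resolves under `current/`, while a blog post resolves directly under `docusaurus-plugin-content-blog/`.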
Translated main documentation pages: index, quick-start, installation,
beginners-guide, full-guide, tutorial, plus developers/cli/bactopia-docs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Fix orphan removal bug: only scan plugin dirs present in the filtered
file set, preventing --include from deleting other sections' translations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
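
The fix described in this commit can be sketched as follows. This is an illustration under assumptions, not the code in sync.py: paths are taken to be relative strings whose first component is the section, and `find_orphans` with its three arguments is a hypothetical helper.

```python
def find_orphans(
    all_sources: set[str], selected: set[str], translated: set[str]
) -> set[str]:
    """Return translated files with no source, limited to the selected sections.

    all_sources: every source file in the repository
    selected:    the subset left after an --include style filter
    translated:  files currently present in the locale's translation tree
    """
    # Only sections that appear in the filtered set are eligible for cleanup,
    # so a run restricted to one section never touches another section's files.
    active_sections = {path.split("/", 1)[0] for path in selected}
    return {
        path
        for path in translated
        if path.split("/", 1)[0] in active_sections and path not in all_sources
    }
```

Comparing against `all_sources` rather than `selected` is a deliberate choice in this sketch: it also protects sibling files inside a section when the filter matches only part of that section.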
Fix unclosed code fences: LLM consistently drops trailing ``` at EOF.
Added fix_unclosed_fences to post-processing pipeline to prevent recurrence.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
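
For readers unfamiliar with this kind of post-processing step, a minimal sketch follows. The function name `fix_unclosed_fences` comes from the commit message, but the body is an assumption: it only handles backtick fences and treats every fence line as a toggle, so tilde fences and nested fences of different lengths are out of scope.

```python
FENCE = "`" * 3  # the Markdown code fence marker


def fix_unclosed_fences(text: str) -> str:
    """Append a closing fence when the document ends inside a code block."""
    inside_fence = False
    for line in text.splitlines():
        # Each line that starts with a fence marker opens or closes a block.
        if line.lstrip().startswith(FENCE):
            inside_fence = not inside_fence
    if not inside_fence:
        return text
    # Make sure the closing fence starts on its own line.
    if not text.endswith("\n"):
        text += "\n"
    return text + FENCE + "\n"
```

Running this after every translation is idempotent: a document whose fences are already balanced is returned unchanged.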
…iles)

Completes all content translations: 317 files across 6 sections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Copy blog post images to i18n directory (relative paths need local copies)
- Fix broken anchor links in PT full-guide and tutorial where translated
  headings generate different slug anchors than the English originals

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
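
The anchor problem in this commit arises because heading anchors are derived from heading text, so a translated heading yields a different slug. The sketch below shows the effect with a simplified slug function. It approximates the behaviour and is not the slugger Docusaurus actually uses, and the example heading in the usage note is hypothetical.

```python
import re


def heading_slug(heading: str) -> str:
    """Approximate a heading anchor: lowercase, drop punctuation, hyphenate spaces."""
    cleaned = re.sub(r"[^\w\s-]", "", heading.lower())
    return re.sub(r"\s", "-", cleaned.strip())


def broken_anchors(links: list[str], translated_headings: list[str]) -> list[str]:
    """Return the in-page anchors that no translated heading produces."""
    available = {heading_slug(h) for h in translated_headings}
    return [link for link in links if link.lstrip("#") not in available]
```

For example, a link to `#quick-start` breaks once the heading becomes "Início Rápido", whose slug is `início-rápido`. One way to avoid this is an explicit heading ID (`## Início Rápido {#quick-start}`), which Docusaurus supports and which keeps the English anchor valid in the translated page.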
Changed full-guide.md to ./full-guide.mdx to match actual filename
and use proper relative path syntax.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Impact & Outreach -> Impacto e Divulgação
Developers -> Desenvolvedores
CLI Reference -> Referência CLI
Modules -> Módulos

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

rpetit3 commented May 18, 2026

@Mxrcon howdy here is the PR I mentioned


rpetit3 commented May 18, 2026

Glossary inconsistencies (the LLM chose a translation other than the glossary term):

python -m bin.translate verify --locale pt
  docs/tutorial.md: glossary term 'sequence type' -> 'tipo de sequência' not found in translation
  bactopia-tools/gtdb.mdx: glossary term 'reference genome' -> 'genoma de referência' not found in translation
  bactopia-tools/index.mdx: glossary term 'reference genome' -> 'genoma de referência' not found in translation
  bactopia-tools/index.mdx: glossary term 'variant calling' -> 'chamada de variantes' not found in translation
  bactopia-tools/mashdist.mdx: glossary term 'reference genome' -> 'genoma de referência' not found in translation
  bactopia-tools/mashtree.mdx: glossary term 'reference genome' -> 'genoma de referência' not found in translation
  bactopia-tools/merlin.mdx: glossary term 'reference genome' -> 'genoma de referência' not found in translation
  bactopia-tools/midas.mdx: glossary term 'reference genome' -> 'genoma de referência' not found in translation
  bactopia-tools/tbprofiler.mdx: glossary term 'reference genome' -> 'genoma de referência' not found in translation
  developers/modules/bactopia_sketcher.mdx: glossary term 'reference genome' -> 'genoma de referência' not found in translation
  developers/modules/csvtk_concat.mdx: glossary term 'reference genome' -> 'genoma de referência' not found in translation
  developers/modules/deacon_fetch.mdx: glossary term 'reference genome' -> 'genoma de referência' not found in translation
  developers/modules/gtdbtk_classifywf.mdx: glossary term 'reference genome' -> 'genoma de referência' not found in translation
  developers/modules/gubbins.mdx: glossary term 'variant calling' -> 'chamada de variantes' not found in translation
  developers/modules/index.mdx: glossary term 'variant calling' -> 'chamada de variantes' not found in translation
  developers/modules/iqtree.mdx: glossary term 'variant calling' -> 'chamada de variantes' not found in translation
  developers/modules/mash_dist.mdx: glossary term 'reference genome' -> 'genoma de referência' not found in translation
  developers/modules/midas_download.mdx: glossary term 'reference genome' -> 'genoma de referência' not found in translation
  developers/modules/ncbigenomedownload.mdx: glossary term 'variant calling' -> 'chamada de variantes' not found in translation
  developers/subworkflows/checkm2.mdx: glossary term 'reference genome' -> 'genoma de referência' not found in translation
  developers/subworkflows/fastani.mdx: glossary term 'reference genome' -> 'genoma de referência' not found in translation
  developers/subworkflows/gubbins.mdx: glossary term 'variant calling' -> 'chamada de variantes' not found in translation
  developers/subworkflows/index.mdx: glossary term 'sequence type' -> 'tipo de sequência' not found in translation
  developers/subworkflows/iqtree.mdx: glossary term 'variant calling' -> 'chamada de variantes' not found in translation
  developers/subworkflows/mashdist.mdx: glossary term 'reference genome' -> 'genoma de referência' not found in translation
  developers/subworkflows/ncbigenomedownload.mdx: glossary term 'variant calling' -> 'chamada de variantes' not found in translation
  developers/subworkflows/sylph.mdx: glossary term 'reference genome' -> 'genoma de referência' not found in translation

Verified 317 files: 291 ok, 26 with issues
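
A check that produces messages like the ones above could look roughly like this. It is a sketch under assumptions, not the code in bin/translate/verify.py: the glossary is taken to be a plain mapping from English term to expected translation, and matching is a case-insensitive substring test.

```python
def check_glossary(
    source: str, translation: str, glossary: dict[str, str]
) -> list[str]:
    """Report glossary terms used in the source whose expected translation is absent."""
    issues = []
    source_lower = source.lower()
    translation_lower = translation.lower()
    for term, expected in glossary.items():
        # Only terms that occur in the English source are checked.
        if term.lower() in source_lower and expected.lower() not in translation_lower:
            issues.append(
                f"glossary term '{term}' -> '{expected}' not found in translation"
            )
    return issues
```

A substring test like this also flags inflected forms (for example a plural such as "genomas de referência"), so some reports may be acceptable translations rather than real inconsistencies and are worth a manual look.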
