Skip to content

C++ implementation of FL Core + SDK.#705

Open
skottmckay wants to merge 14 commits into
mainfrom
skottmckay/CppCore_PR
Open

C++ implementation of FL Core + SDK.#705
skottmckay wants to merge 14 commits into
mainfrom
skottmckay/CppCore_PR

Conversation

@skottmckay
Copy link
Copy Markdown
Collaborator

@skottmckay skottmckay commented May 11, 2026

C++ implementation of FL Core + SDK.
C# and python SDKs to use new setup.
Currently in sdk_v2 for ease of comparison. Will replace sdk implementation pre-checkin.

To test locally build the c++ sdk first (sdk_v2/cpp/build.{sh|bat}.
C# and python bindings look for the default RelWithDebInfo build from that.

CI produces nuget package that can also be used for testing. Not published to a feed yet though.

Various design decisions are captured in sdk_v2/cpp/docs for reference.

The current implementation uses a static catalog listing. Will change to live catalog when possible. Easily updated if we have additional models to add.

Set the TEST_MODEL_CACHE_DIR environment variable to point to test-data-shared if you have that locally and want to avoid some model downloads.

Update C# and python SDKs to use new setup.
Copilot AI review requested due to automatic review settings May 11, 2026 09:41
@vercel
Copy link
Copy Markdown

vercel Bot commented May 11, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
foundry-local Ready Ready Preview, Comment May 21, 2026 6:29am

Request Review

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review this pull request because it exceeds the maximum number of files (300). Try reducing the number of changed files and requesting a review from Copilot again.

Add note about AzureModelCatalog cache-only mode
…ed copilot-cpp-review-report.md to capture.

Added some format/lint setup as well.
Fix some issues from the C++ Copilot review changes.
Handle no CPU Whisper models being returned by the catalog.
Fix some C# test issues and gaps.
- CI pipeline and packaging fixes
- fix issues that came up during this
  - some code and some test issues
- add netstandard support to C# SDK
- add winml bootstrapping support
- improve memory lifetime management in python and c# SDKs
- delete some orphaned files and directory

CI should now produce all packages for C++, C# and Python.
Fix some test issues
Explicitly cleanup ORT env
fix a couple of test model ids
…rms (matches ORT and GenAI)

Update CI documentation.
Minor cleanups
Comment thread sdk_v2/cpp/examples/basic_chat/main.cc
Comment thread sdk_v2/cpp/examples/basic_chat/main.cc
Comment thread sdk_v2/cpp/examples/embeddings/main.cc
Comment thread sdk_v2/cpp/include/foundry_local/foundry_local_c.h
Comment thread sdk_v2/python/_build_backend/__init__.py
Comment thread sdk_v2/python/src/foundry_local_sdk/_native/api.py
Use 2.0 for sdk_v2 build
const char* FL_API_T(Status_GetErrorMessage, _In_ const flStatus* status);

/* Manager lifecycle */
FL_API_STATUS(Manager_Create, _In_ const flConfiguration* config, _Outptr_ flManager** out_manager);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Manager really need to be a Singleton? Could we decouple real process globals from the manager?

If we really feel like the singleton pattern is a better fit (so it owns the ortenv and other process globals), could we document that in the c api?

Another point to note is that Manager_Create called a second time will return an error. So, it becomes difficult for a single process that has two different usages of Foundry Local (maybe as two different plug-ins to an application that share the process) to both acquire the manager. It also becomes difficult to know which instance should release the manager if somehow the manager instance can be shared. Should we consider always returning the manager and internally refcount the number of manager creations and releases?

@prathikr
Copy link
Copy Markdown
Contributor

Can we align the naming of the artifacts to be the same as the existing pipeline so that the publishing pipeline can be re-used to publish to ORT-Nightly or Public Feeds?

https://github.com/ai-platform-microsoft/onnxruntime-release-pipelines/blob/95661fb56d66a0bdc874c87e9a70b7d4897c3c51/foundry-local/foundry-local-publishing.yml#L6

@prathikr
Copy link
Copy Markdown
Contributor

@baijumeswani noticed the Core package was renamed to Microsoft.AI.Foundry.Local.Runtime. Any particular reason for that? We think that using the existing Microsoft.AI.Foundry.Local.Core naming would make it easier for customers to transition to the new core without them needing to change their package dependencies. 

typedef int (*flProgressCallback)(float value, void* user_data);

/** Streaming response event callback. Return 0 to continue, non-zero to cancel. */
typedef int (*flStreamingCallback)(flStreamingCallbackData event, void* user_data);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should flStreamingCallbackData be passed in as a const pointer?

/// Manager_GetDiscoverableEps or Manager destruction.
FL_API_STATUS(Manager_GetDiscoverableEps, _In_ const flManager* manager,
_Out_ const char* const** out_names,
_Out_ const int** out_is_registered,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some bool types are represented as int* input arguments. While some other methods seem to use bool in their return types? Shall we keep bool where it makes sense to instead of using ints?

bool FL_API_T(Manager_IsEpDownloadInProgress, _In_ const flManager* manager);

/// Begin graceful shutdown. Safe to call from any thread. Idempotent.
FL_API_STATUS(Manager_Shutdown, _In_ flManager* manager);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a note about the differences between shutdown and manager_release?
I think having both makes sense. But to a consumer, it might be meaningful to understand when to call shutdown.

bool FL_API_T(Manager_IsEpDownloadInProgress, _In_ const flManager* manager);

/// Begin graceful shutdown. Safe to call from any thread. Idempotent.
FL_API_STATUS(Manager_Shutdown, _In_ flManager* manager);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On another note, does Manager_Release also release process globals? Can one re-create a manager after releasing it?

FL_API_STATUS(Status_Create, flErrorCode error_code, _In_ const char* error_msg);
FL_TYPE_RELEASE(Status);
flErrorCode FL_API_T(Status_GetErrorCode, _In_ const flStatus* status);
const char* FL_API_T(Status_GetErrorMessage, _In_ const flStatus* status);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a note on the lifetime of the const char* (for all such functions that return a const char*)?

Comment on lines +574 to +575
_Out_ const char* const** out_names,
_Out_ const int** out_is_registered,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of returning names and is_registered separately, could we return a type here (a list of epinfo). Then expose functions on each epinfo for name, is registered (other fields in the future)?

The list of epinfo can be owned by the manager and can also be made threadsafe with this change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants