C++ implementation of FL Core + SDK.#705
Conversation
Update C# and python SDKs to use new setup.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Add note about AzureModelCatalog cache-only mode
…ed copilot-cpp-review-report.md to capture. Added some format/lint setup as well.
Fix some issues from the C++ Copilot review changes. Handle no CPU Whisper models being returned by the catalog.
Fix some C# test issues and gaps.
- CI pipeline and packaging fixes - fix issues that came up during this - some code and some test issues - add netstandard support to C# SDK - add winml bootstrapping support - improve memory lifetime management in python and c# SDKs - delete some orphaned files and directory CI should now produce all packages for C++, C# and Python.
Fix some test issues Explicitly cleanup ORT env
…rms (matches ORT and GenAI) Update CI documentation. Minor cleanups
Use 2.0 for sdk_v2 build
| const char* FL_API_T(Status_GetErrorMessage, _In_ const flStatus* status); | ||
|
|
||
| /* Manager lifecycle */ | ||
| FL_API_STATUS(Manager_Create, _In_ const flConfiguration* config, _Outptr_ flManager** out_manager); |
There was a problem hiding this comment.
Does Manager really need to be a Singleton? Could we decouple real process globals from the manager?
If we really feel like the singleton pattern is a better fit (so it owns the ortenv and other process globals), could we document that in the c api?
Another point to note is that Manager_Create called a second time will return an error. So, it becomes difficult for a single process that has two different usages of Foundry Local (maybe as two different plug-ins to an application that share the process) to both acquire the manager. It also becomes difficult to know which instance should release the manager if somehow the manager instance can be shared. Should we consider always returning the manager and internally refcount the number of manager creations and releases?
|
Can we align the naming of the artifacts to be the same as the existing pipeline so that the publishing pipeline can be re-used to publish to ORT-Nightly or Public Feeds? |
|
@baijumeswani noticed the Core package was renamed to Microsoft.AI.Foundry.Local.Runtime. Any particular reason for that? We think that using the existing Microsoft.AI.Foundry.Local.Core naming would make it easier for customers to transition to the new core without them needing to change their package dependencies. |
| typedef int (*flProgressCallback)(float value, void* user_data); | ||
|
|
||
| /** Streaming response event callback. Return 0 to continue, non-zero to cancel. */ | ||
| typedef int (*flStreamingCallback)(flStreamingCallbackData event, void* user_data); |
There was a problem hiding this comment.
Should flStreamingCallbackData be passed in as a const pointer?
| /// Manager_GetDiscoverableEps or Manager destruction. | ||
| FL_API_STATUS(Manager_GetDiscoverableEps, _In_ const flManager* manager, | ||
| _Out_ const char* const** out_names, | ||
| _Out_ const int** out_is_registered, |
There was a problem hiding this comment.
Some bool types are represented as int* input arguments. While some other methods seem to use bool in their return types? Shall we keep bool where it makes sense to instead of using ints?
| bool FL_API_T(Manager_IsEpDownloadInProgress, _In_ const flManager* manager); | ||
|
|
||
| /// Begin graceful shutdown. Safe to call from any thread. Idempotent. | ||
| FL_API_STATUS(Manager_Shutdown, _In_ flManager* manager); |
There was a problem hiding this comment.
Could we add a note about the differences between shutdown and manager_release?
I think having both makes sense. But to a consumer, it might be meaningful to understand when to call shutdown.
| bool FL_API_T(Manager_IsEpDownloadInProgress, _In_ const flManager* manager); | ||
|
|
||
| /// Begin graceful shutdown. Safe to call from any thread. Idempotent. | ||
| FL_API_STATUS(Manager_Shutdown, _In_ flManager* manager); |
There was a problem hiding this comment.
On another note, does Manager_Release also release process globals? Can one re-create a manager after releasing it?
| FL_API_STATUS(Status_Create, flErrorCode error_code, _In_ const char* error_msg); | ||
| FL_TYPE_RELEASE(Status); | ||
| flErrorCode FL_API_T(Status_GetErrorCode, _In_ const flStatus* status); | ||
| const char* FL_API_T(Status_GetErrorMessage, _In_ const flStatus* status); |
There was a problem hiding this comment.
Could we add a note on the lifetime of the const char* (for all such functions that return a const char*)?
| _Out_ const char* const** out_names, | ||
| _Out_ const int** out_is_registered, |
There was a problem hiding this comment.
Instead of returning names and is_registered separately, could we return a type here (a list of epinfo). Then expose functions on each epinfo for name, is registered (other fields in the future)?
The list of epinfo can be owned by the manager and can also be made threadsafe with this change?
C++ implementation of FL Core + SDK.
C# and python SDKs to use new setup.
Currently in sdk_v2 for ease of comparison. Will replace sdk implementation pre-checkin.
To test locally build the c++ sdk first (sdk_v2/cpp/build.{sh|bat}.
C# and python bindings look for the default RelWithDebInfo build from that.
CI produces nuget package that can also be used for testing. Not published to a feed yet though.
Various design decisions are captured in sdk_v2/cpp/docs for reference.
The current implementation uses a static catalog listing. Will change to live catalog when possible. Easily updated if we have additional models to add.
Set the TEST_MODEL_CACHE_DIR environment variable to point to test-data-shared if you have that locally and want to avoid some model downloads.