13 changes: 5 additions & 8 deletions docker-compose.dev.yml
@@ -46,15 +46,12 @@ services:
- ${CUSTOM_SCRIPT:-./toktagger/api/run.py}:/app/run.py
- ~/.sal/:/root/.sal
environment:
- MONGO_URL: "mongodb://${MONGO_USERNAME}:${MONGO_PASSWORD}@mongo:27017"
- UDA_HOST: "uda2.mast.l"
- UDA_META_PLUGINNAME: "MASTU_DB"
- UDA_METANEW_PLUGINNAME: "MAST_DB"
- SAL_HOST: "https://sal.jetdata.eu"
- MODEL_STORAGE: "/app/data/models"
- API_URL: "http://api_app:8002"
+ DATABASE_MONGO_URL: "mongodb://${MONGO_USERNAME}:${MONGO_PASSWORD}@mongo:27017"
+ MODELS_CACHE_DIR: "/app/data/models"
+ SERVER_HOST: api_app
+ SERVER_PORT: 8002
+ SERVER_RELOAD: "true"
  CUSTOM_SCRIPT: ${CUSTOM_SCRIPT}
- RELOAD: "true"
working_dir: /app
command: ["python", "run.py"]
networks:
13 changes: 5 additions & 8 deletions docker-compose.yml
@@ -46,15 +46,12 @@ services:
- ${CUSTOM_SCRIPT:-./toktagger/api/run.py}:/app/run.py
- ~/.sal/:/root/.sal
environment:
- MONGO_URL: "mongodb://${MONGO_USERNAME}:${MONGO_PASSWORD}@mongo:27017"
- UDA_HOST: "uda2.mast.l"
- UDA_META_PLUGINNAME: "MASTU_DB"
- UDA_METANEW_PLUGINNAME: "MAST_DB"
- MODEL_STORAGE: "/app/data/models"
- SAL_HOST: "https://sal.jetdata.eu"
- API_URL: "http://api_app:8002"
+ DATABASE_MONGO_URL: "mongodb://${MONGO_USERNAME}:${MONGO_PASSWORD}@mongo:27017"
+ MODELS_CACHE_DIR: "/app/data/models"
+ SERVER_HOST: api_app
+ SERVER_PORT: 8002
+ SERVER_RELOAD: "false"
  CUSTOM_SCRIPT: ${CUSTOM_SCRIPT}
- RELOAD: "false"
working_dir: /app
command: ["python", "run.py"]
networks:
43 changes: 43 additions & 0 deletions docs/configuration.md
@@ -0,0 +1,43 @@
# Configuration Options
The following options can be configured in TokTagger. They can be set either in a `toktagger.toml` configuration file in your working directory or via environment variables. Environment variables take precedence over settings in the TOML file.

## Server settings
These settings should be defined under the `[server]` heading in the TOML file:

| Setting | Environment Variable | Type | Default | Description |
|-----------------|-------------------------|--------------|-----------------------------------------|--------------------------------------------------------------------------|
| host | SERVER_HOST | str | localhost | Address of the host to launch TokTagger on. |
| port | SERVER_PORT | int | 8002 | The port to use for the TokTagger REST API. |
| reload | SERVER_RELOAD | bool | False | Whether to hot reload the TokTagger server on changes to files. |
| cache_dir | SERVER_CACHE_DIR | pathlib.Path | ~/.cache/toktagger | The directory to use for storing entries in the Mongita database. |

## Database Settings
These settings should be defined under the `[database]` heading in the TOML file:

| Setting | Environment Variable | Type | Default | Description |
|-----------------|-------------------------|--------------|-----------------------------------------|---------------------------------------------------------------------------------------------|
| mongo_url | DATABASE_MONGO_URL | str | ./toktagger_db | URL of the MongoDB server to use as a backend. By default, a local Mongita client is used. |

## Models Settings
These settings should be defined under the `[models]` heading in the TOML file:

| Setting | Environment Variable | Type | Default | Description |
|-----------------|-------------------------|--------------|-----------------------------------------|--------------------------------------------------------------------------|
| cache_dir | MODELS_CACHE_DIR | pathlib.Path | ~/.cache/toktagger/models | The directory to use for storing ML model weights. |
| max_actors | MODELS_MAX_ACTORS | int | 5 | The maximum number of ML models which can be loaded concurrently. |

## UDA Connection Settings
These settings should be defined under the `[uda]` heading in the TOML file:

| Setting | Environment Variable | Type | Default | Description |
|--------------------|-------------------------|--------------|-----------------------------------------|--------------------------------------------------------------------------|
| host | UDA_HOST | str | uda2.mast.l | Host name for the UDA server to connect to for MAST data loaders. |
| meta_pluginname | UDA_META_PLUGINNAME | str | MASTU_DB | ??? |
| metanew_pluginname | UDA_METANEW_PLUGINNAME | str | MAST_DB | ??? |

## SAL Connection Settings
These settings should be defined under the `[sal]` heading in the TOML file:

| Setting | Environment Variable | Type | Default | Description |
|-----------------|-------------------------|--------------|-----------------------------------------|--------------------------------------------------------------------------|
| host | SAL_HOST | str | https://sal.jetdata.eu | URL for the SAL server to connect to for JET data loaders. |
51 changes: 44 additions & 7 deletions docs/custom_dataloaders.md
@@ -294,6 +294,31 @@ server.run()

Here's an example of loading data from a SQL database:

### Update Config Settings
If your data loader requires configuration inputs from the user, update the `config.Settings` object to accept them. `Settings` is a [Pydantic Settings object](https://pydantic.dev/docs/validation/latest/concepts/pydantic_settings/#usage), in which nested `BaseModel` classes represent sections of the `toktagger.toml` configuration file. For example, we can create a new settings class that inherits from the one in `toktagger.api.config` and adds an `SQL` section holding the URL of the database our data loader connects to:
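```python
import pydantic

from toktagger.api.config import Settings


class SQL(pydantic.BaseModel):
    """Settings for the `[sql]` section of `toktagger.toml`."""

    url: str | None = pydantic.Field(
        None,
        description="URL of the SQL database to connect to",
    )


class UpdatedSettings(Settings):
    # default_factory builds a fresh SQL() when no [sql] values are supplied
    sql: SQL = pydantic.Field(default_factory=SQL)
```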
Note that settings are loaded from the following sources, in decreasing order of precedence:
1. Any values which the `UpdatedSettings` class is initialized with.
2. Environment variables, matched case-insensitively and named after the nested model names. For the setting above, this would be `SQL_URL`.
3. Values in the `toktagger.toml` configuration file, with section titles following the nesting, e.g.:
    ```toml
    [sql]
    url = "sqlite:///./test.db"
    ```
4. Environment variables provided in a `.env` file.
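
To make the ordering concrete, here is a minimal sketch that models the four sources as plain dictionaries and resolves a single setting with `collections.ChainMap`. It assumes the list above runs from highest to lowest precedence, which matches the statement in the configuration docs that environment variables override the TOML file. This is not how pydantic-settings is implemented, and all values and the `"sql.url"` key are hypothetical.

```python
from collections import ChainMap

# Hypothetical values for the `sql.url` setting from each source.
init_values = {}                                        # 1. arguments passed to UpdatedSettings(...)
env_vars = {"sql.url": "postgresql://db.example/prod"}  # 2. SQL_URL environment variable
toml_file = {"sql.url": "sqlite:///./test.db"}          # 3. [sql] section of toktagger.toml
dotenv_file = {"sql.url": "sqlite:///./fallback.db"}    # 4. SQL_URL in a .env file

# ChainMap searches its mappings left to right, so earlier sources win.
resolved = ChainMap(init_values, env_vars, toml_file, dotenv_file)
print(resolved["sql.url"])  # the environment variable's value

# Without the environment variable, the TOML value is the first match.
without_env = ChainMap(init_values, toml_file, dotenv_file)
print(without_env["sql.url"])
```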

### Create the DataLoader
We can then create our dataloader, accessing the setting we defined above:
```python
import sqlalchemy as sa
from typing import Type
@@ -302,18 +327,16 @@ import pydantic
from toktagger.api.core.data_loaders import DataLoader, LoaderRegistry
from toktagger.api.schemas.data import MultiVariateTimeSeriesData, TimeSeriesData, DataParams
from toktagger.api.schemas.samples import ShotData

import toktagger.api.config as config

@LoaderRegistry.register("sql_database")
class SQLDatabaseLoader(DataLoader):
    """DataLoader for retrieving data from a SQL database"""

    def __init__(self):
        # Initialize database connection
-       # Connection string should be in environment variable
-       import os
-       connection_string = os.environ.get("DATABASE_URL")
-       self.engine = sa.create_engine(connection_string)
+       # Connection string should be in the settings object
+       self.engine = sa.create_engine(config.settings.sql.url)

    @classmethod
    def sample_data_type(cls) -> Type[ShotData]:
@@ -353,6 +376,20 @@ class SQLDatabaseLoader(DataLoader):

return MultiVariateTimeSeriesData(values=results)
```
### Launch the Server
To run the server with our custom Settings object and DataLoader, we should create a run script as follows:
```python title="run.py"
from settings import UpdatedSettings
from loader import SQLDatabaseLoader
from toktagger.api.main import Server
import toktagger.api.config as config

# Update config.settings to use our new object
config.settings = UpdatedSettings()

server = Server()
server.run()
```
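
The run script swaps in the new settings by assigning to `config.settings`. One detail worth keeping in mind, which is general Python behaviour rather than anything specific to TokTagger, is that this only affects code that looks the attribute up on the module when it runs, as the data loader above does with `config.settings.sql.url`. A name bound earlier with `from ... import settings` keeps pointing at the old object. The sketch below demonstrates this with a stand-in module; `fake_config` and its contents are hypothetical.

```python
import types

# Stand-in for a config module holding a module-level `settings` object.
fake_config = types.ModuleType("fake_config")
fake_config.settings = {"sql_url": None}

# Equivalent of `from fake_config import settings`: copies the current binding.
settings = fake_config.settings


def read_via_module():
    # Looks up `settings` on the module each time it is called.
    return fake_config.settings["sql_url"]


def read_via_imported_name():
    # Uses the name bound before the module attribute was replaced.
    return settings["sql_url"]


# Equivalent of `config.settings = UpdatedSettings()` in the run script.
fake_config.settings = {"sql_url": "sqlite:///./test.db"}

print(read_via_module())         # sees the replacement object
print(read_via_imported_name())  # still sees the original object
```

This is why the loader example imports the module (`import toktagger.api.config as config`) and reads `config.settings` inside `__init__`, rather than importing `settings` by name.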

## Using Docker
If you are using the docker compose option to run the server, you can provide a custom script similar to the one above to add your own data loaders. To do this, create a file similar to the one above, but making sure to pass the following arguments into `server.run()`:
@@ -364,10 +401,10 @@ server.run(
)
```

- You can then provide the path to your script when running docker compose. For example, say we have the above script in a file called `custom_toktagger.py` - We simply need to add `CUSTOM_SCRIPT=./custom_toktagger.py` before the docker compose command!
+ You can then provide the path to your script when running docker compose. For example, if the above script is in a file called `custom_toktagger.py`, add `CUSTOM_SCRIPT=./custom_toktagger.py` before the docker compose command, along with the SQL URL as an environment variable:

```sh
- CUSTOM_SCRIPT=./custom_toktagger.py docker compose --env-file .env.dev -f docker-compose.dev.yml up --build
+ CUSTOM_SCRIPT=./custom_toktagger.py SQL_URL=<Your URL> docker compose --env-file .env.dev -f docker-compose.dev.yml up --build
```

!!! tip
3 changes: 3 additions & 0 deletions docs/index.md
@@ -57,6 +57,9 @@ toktagger

This will start a local instance of the application running at `http://localhost:8002`.

## Configuration
TokTagger has a number of additional options that you can configure to customise its behaviour; see the [configuration options](./configuration.md) for details.

## Project Links

- [Git Repo](https://github.com/ukaea/toktagger)
1 change: 1 addition & 0 deletions pyproject.toml
@@ -38,6 +38,7 @@ dependencies = [
"sal-xarray>=0.2.1",
"bump-my-version>=1.2.7",
"platformdirs>=4.4.0",
"pydantic-settings>=2.11.0",
]
[project.optional-dependencies]
models = [