# ML Extensions
The ML Extensions specification defines pipeline-level attributes for AI/ML workloads. It complements the ML Resources extension (which handles compute infrastructure) with semantic metadata about the ML task itself.
| Extension | Prefix | Focus |
|---|---|---|
| ML Resources | `ext_ml_` | GPU/TPU/CPU requirements, scheduling, preemption |
| ML Extensions | `ext_mlx_` | Model metadata, training, inference, evaluation |
## Model Metadata

Describe the model a job operates on, including framework and I/O schemas.
| Attribute | Type | Description |
|---|---|---|
| `ext_mlx_model_name` | string | Human-readable model name |
| `ext_mlx_model_version` | string | Semantic version of the model artifact |
| `ext_mlx_model_framework` | string | `pytorch`, `tensorflow`, `onnx`, `jax`, `sklearn`, `xgboost`, or `custom` |
| `ext_mlx_model_input_schema` | object | JSON Schema for expected model input |
| `ext_mlx_model_output_schema` | object | JSON Schema for expected model output |
| `ext_mlx_experiment_id` | string | Experiment tracker ID |
| `ext_mlx_run_id` | string | Run ID within the experiment |
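Since the framework field is an enum, clients may want to reject unknown values before submission. The following sketch is a hypothetical client-side check, not something the spec mandates; the helper name and return convention are illustrative.

```python
# Hypothetical client-side validation of ext_mlx_model_* attributes.
# The framework enum comes from the table above; everything else here
# is an illustrative convention, not part of the spec.
ALLOWED_FRAMEWORKS = {
    "pytorch", "tensorflow", "onnx", "jax", "sklearn", "xgboost", "custom",
}

def validate_model_metadata(job: dict) -> list[str]:
    """Return a list of problems found in a job's model-metadata fields."""
    problems = []
    fw = job.get("ext_mlx_model_framework")
    if fw is not None and fw not in ALLOWED_FRAMEWORKS:
        problems.append(f"unknown framework: {fw!r}")
    for key in ("ext_mlx_model_name", "ext_mlx_model_version"):
        if key in job and not isinstance(job[key], str):
            problems.append(f"{key} must be a string")
    return problems

assert validate_model_metadata({"ext_mlx_model_framework": "onnx"}) == []
assert validate_model_metadata({"ext_mlx_model_framework": "keras"}) != []
```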
## Training Jobs

Training attributes describe datasets, hyperparameters, and output locations.
```json
{
  "type": "ml.train",
  "ext_ml_gpu_type": "nvidia-a100",
  "ext_ml_gpu_count": 4,
  "ext_ml_precision": "bf16",
  "ext_mlx_model_name": "llama-3.1-8b-custom",
  "ext_mlx_model_framework": "pytorch",
  "ext_mlx_training_dataset_uri": "s3://datasets/train.parquet",
  "ext_mlx_training_hyperparameters": {
    "learning_rate": 2e-5,
    "batch_size": 16,
    "weight_decay": 0.01,
    "seed": 42
  },
  "ext_mlx_training_epochs": 3,
  "ext_mlx_training_output_model_uri": "s3://models/custom/v1.0.0/"
}
```

| Attribute | Type | Description |
|---|---|---|
| `ext_mlx_training_dataset_uri` | string | Training dataset URI |
| `ext_mlx_training_validation_uri` | string | Validation dataset URI |
| `ext_mlx_training_hyperparameters` | object | Key-value map of hyperparameters |
| `ext_mlx_training_epochs` | integer | Number of training epochs |
| `ext_mlx_training_checkpoint_uri` | string | URI for saving checkpoints |
| `ext_mlx_training_resume_from` | string | Checkpoint URI to resume from |
| `ext_mlx_training_output_model_uri` | string | Output model artifact URI |
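Because `ext_mlx_training_hyperparameters` is a free-form map, a submitting client might merge per-run overrides over project defaults. This is an illustrative sketch under that assumption; the helper and the default values are hypothetical, not defined by the spec.

```python
# Hypothetical builder for an ml.train job payload: per-run overrides
# are merged over project-level default hyperparameters.
DEFAULT_HPARAMS = {"learning_rate": 2e-5, "batch_size": 16, "seed": 42}

def make_training_job(dataset_uri: str, output_uri: str, **overrides) -> dict:
    return {
        "type": "ml.train",
        "ext_mlx_training_dataset_uri": dataset_uri,
        "ext_mlx_training_output_model_uri": output_uri,
        "ext_mlx_training_hyperparameters": {**DEFAULT_HPARAMS, **overrides},
    }

job = make_training_job("s3://datasets/train.parquet",
                        "s3://models/custom/v1.0.0/",
                        batch_size=32)
assert job["ext_mlx_training_hyperparameters"]["batch_size"] == 32
assert job["ext_mlx_training_hyperparameters"]["seed"] == 42
```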
## Inference Jobs

Configure how a trained model processes input data.
```json
{
  "type": "ml.inference",
  "ext_ml_gpu_type": "nvidia-l4",
  "ext_ml_gpu_count": 1,
  "ext_mlx_model_name": "intent-classifier",
  "ext_mlx_model_framework": "onnx",
  "ext_mlx_inference_model_uri": "s3://models/classifier/v3.1.0/model.onnx",
  "ext_mlx_inference_batch_size": 256,
  "ext_mlx_inference_input_uri": "s3://data/input.jsonl",
  "ext_mlx_inference_output_uri": "s3://data/output.jsonl",
  "ext_mlx_inference_mode": "batch"
}
```

| Attribute | Type | Description |
|---|---|---|
| `ext_mlx_inference_model_uri` | string | Model artifact URI |
| `ext_mlx_inference_batch_size` | integer | Inputs per batch |
| `ext_mlx_inference_timeout_ms` | integer | Per-request timeout (ms) |
| `ext_mlx_inference_input_uri` | string | Input data URI (batch mode) |
| `ext_mlx_inference_output_uri` | string | Output data URI |
| `ext_mlx_inference_mode` | string | `batch`, `streaming`, or `realtime` |
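In batch mode, a worker would typically read records from `ext_mlx_inference_input_uri` and group them according to `ext_mlx_inference_batch_size`. A minimal sketch of that grouping, assuming the worker streams records one at a time (the spec defines only the attribute, not worker behavior):

```python
# Illustrative batching loop for a batch-mode inference worker:
# yields lists of at most batch_size records, with a final short batch.
def iter_batches(records, batch_size: int):
    batch = []
    for rec in records:
        batch.append(rec)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush any remaining records
        yield batch

batches = list(iter_batches(range(10), batch_size=4))
assert [len(b) for b in batches] == [4, 4, 2]
```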
## Data Preprocessing Jobs

Describe data transformations that prepare raw data for training or inference.
| Attribute | Type | Description |
|---|---|---|
| `ext_mlx_preprocess_input_uri` | string | Raw input data URI |
| `ext_mlx_preprocess_output_uri` | string | Processed output URI |
| `ext_mlx_preprocess_input_format` | string | `csv`, `parquet`, `json`, `jsonl`, `tfrecord`, `arrow`, `custom` |
| `ext_mlx_preprocess_output_format` | string | Output format (same enum) |
| `ext_mlx_preprocess_transformations` | array | Ordered list of transformations |
| `ext_mlx_preprocess_split_ratios` | object | Train/validation/test split ratios |
Transformation types: `tokenize`, `normalize`, `augment`, `filter`, `sample`, `deduplicate`, `encode`.
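One plausible interpretation of `ext_mlx_preprocess_split_ratios` is slicing the dataset by the given proportions. The sketch below uses a deterministic tail split for clarity; a real job would shuffle with a fixed seed first. The helper is hypothetical.

```python
# Illustrative application of ext_mlx_preprocess_split_ratios.
# Deterministic contiguous split shown for clarity; production jobs
# would shuffle with a fixed seed before splitting.
def split_dataset(records, ratios):
    n = len(records)
    train_end = int(n * ratios["train"])
    val_end = train_end + int(n * ratios["validation"])
    return records[:train_end], records[train_end:val_end], records[val_end:]

train, val, test = split_dataset(
    list(range(100)), {"train": 0.8, "validation": 0.1, "test": 0.1})
assert (len(train), len(val), len(test)) == (80, 10, 10)
```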
## Evaluation Jobs

Assess model quality with metrics and thresholds that can gate deployment.
```json
{
  "type": "ml.evaluate",
  "ext_mlx_eval_dataset_uri": "s3://data/test.parquet",
  "ext_mlx_eval_model_uri": "s3://models/v1.0.0/",
  "ext_mlx_eval_metrics": ["accuracy", "f1", "latency_p99"],
  "ext_mlx_eval_thresholds": {
    "accuracy": { "min": 0.92 },
    "f1": { "min": 0.88 },
    "latency_p99": { "max": 100 }
  }
}
```

| Attribute | Type | Description |
|---|---|---|
| `ext_mlx_eval_dataset_uri` | string | Evaluation dataset URI |
| `ext_mlx_eval_model_uri` | string | Model to evaluate |
| `ext_mlx_eval_metrics` | array | Metrics to compute |
| `ext_mlx_eval_thresholds` | object | Min/max thresholds per metric |
| `ext_mlx_eval_output_uri` | string | Results output URI |
| `ext_mlx_eval_baseline_run_id` | string | Baseline run for comparison |
Well-known metrics: `accuracy`, `f1`, `precision`, `recall`, `auc_roc`, `loss`, `perplexity`, `bleu`, `rouge_l`, `mae`, `rmse`, `latency_p50`, `latency_p99`, `throughput`.
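The gating behavior implied by `ext_mlx_eval_thresholds` can be sketched as: every metric with a `min` must meet or exceed it, and every `max` must not be exceeded. The function below is a minimal illustration of that rule, not a normative implementation.

```python
# Illustrative gating check for ext_mlx_eval_thresholds:
# a "min" bound is a floor, a "max" bound is a ceiling.
def passes_thresholds(metrics: dict, thresholds: dict) -> bool:
    for name, bound in thresholds.items():
        value = metrics[name]
        if "min" in bound and value < bound["min"]:
            return False
        if "max" in bound and value > bound["max"]:
            return False
    return True

metrics = {"accuracy": 0.94, "f1": 0.89, "latency_p99": 85}
thresholds = {"accuracy": {"min": 0.92}, "f1": {"min": 0.88},
              "latency_p99": {"max": 100}}
assert passes_thresholds(metrics, thresholds)
assert not passes_thresholds({**metrics, "accuracy": 0.90}, thresholds)
```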
## Artifact Management

Unified artifact references for models, datasets, and checkpoints.
| Attribute | Type | Description |
|---|---|---|
| `ext_mlx_artifact_model_uri` | string | Primary model artifact URI |
| `ext_mlx_artifact_dataset_uri` | string | Primary dataset URI |
| `ext_mlx_artifact_checkpoint_uri` | string | Checkpoint artifact URI |
| `ext_mlx_artifact_output_uri` | string | Primary output artifact URI |
| `ext_mlx_artifact_registry` | string | `s3`, `gcs`, `azure_blob`, `mlflow`, `wandb`, `huggingface`, `local` |
Supported URI schemes: `s3://`, `gs://`, `az://`, `hdfs://`, `mlflow://`, `wandb://`, `hf://`, `file://`.
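A client could infer `ext_mlx_artifact_registry` from an artifact URI's scheme. The mapping below is an assumed correspondence between the two lists above (note `hdfs://` has no registry counterpart in the registry enum); the helper is hypothetical.

```python
# Assumed scheme-to-registry correspondence between the URI-scheme list
# and the ext_mlx_artifact_registry enum; hdfs:// has no registry value.
SCHEME_TO_REGISTRY = {
    "s3": "s3", "gs": "gcs", "az": "azure_blob", "mlflow": "mlflow",
    "wandb": "wandb", "hf": "huggingface", "file": "local",
}

def registry_for_uri(uri: str):
    """Return the registry value implied by a URI scheme, or None."""
    scheme, sep, _ = uri.partition("://")
    return SCHEME_TO_REGISTRY.get(scheme) if sep else None

assert registry_for_uri("s3://models/v1.0.0/") == "s3"
assert registry_for_uri("hf://org/model") == "huggingface"
assert registry_for_uri("not-a-uri") is None
```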
## Scheduling Hints

Advisory attributes for optimizing job placement.
| Attribute | Type | Description |
|---|---|---|
| `ext_mlx_scheduling_gpu_affinity` | string | `spread`, `pack`, `dedicated` |
| `ext_mlx_scheduling_preemption_policy` | string | `never`, `save_and_retry`, `immediate` |
| `ext_mlx_scheduling_spot_preference` | string | `spot_preferred`, `on_demand_only`, `any` |
| `ext_mlx_scheduling_data_locality` | string | URI hint for data-local scheduling |
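Since these attributes are advisory, one way a backend might act on `ext_mlx_scheduling_spot_preference` is the decision below. This is purely a sketch of plausible backend behavior, not something the spec prescribes.

```python
# Sketch of one plausible backend interpretation of
# ext_mlx_scheduling_spot_preference; actual placement is backend-specific.
def choose_capacity(preference: str, spot_available: bool) -> str:
    if preference == "on_demand_only":
        return "on_demand"
    # "spot_preferred" and "any" both take spot when it exists,
    # falling back to on-demand otherwise.
    return "spot" if spot_available else "on_demand"

assert choose_capacity("on_demand_only", True) == "on_demand"
assert choose_capacity("spot_preferred", False) == "on_demand"
assert choose_capacity("any", True) == "spot"
```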
## Pipeline Example

A complete ML pipeline expressed as an OJS workflow chain:
```json
{
  "workflow_type": "chain",
  "id": "019539a4-ml-pipeline-001",
  "name": "classifier-training-pipeline",
  "steps": [
    {
      "type": "ml.preprocess",
      "queue": "cpu-workers",
      "ext_ml_accelerator": "cpu",
      "ext_mlx_preprocess_input_uri": "s3://raw/tickets.jsonl",
      "ext_mlx_preprocess_output_uri": "s3://processed/tickets-v3/"
    },
    {
      "type": "ml.train",
      "queue": "gpu-training",
      "ext_ml_gpu_type": "nvidia-a100",
      "ext_ml_gpu_count": 4,
      "ext_mlx_training_dataset_uri": "s3://processed/tickets-v3/train.parquet",
      "ext_mlx_training_output_model_uri": "s3://models/v1.0.0/"
    },
    {
      "type": "ml.evaluate",
      "queue": "gpu-inference",
      "ext_ml_gpu_count": 1,
      "ext_mlx_eval_model_uri": "s3://models/v1.0.0/",
      "ext_mlx_eval_thresholds": { "accuracy": { "min": 0.92 } }
    },
    {
      "type": "ml.deploy",
      "queue": "deployment",
      "ext_ml_accelerator": "cpu"
    }
  ]
}
```

Each step has independent resource requirements; the backend schedules each on appropriate hardware as the chain progresses. If evaluation thresholds are not met, the pipeline stops before deployment.
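The stop-on-failure chain semantics can be sketched as a loop that runs each step in order and halts when one fails. The step runner is hypothetical; only the ordering and early-exit behavior reflect the spec.

```python
# Sketch of chain semantics: run steps in order, stop at the first
# failure (e.g. an ml.evaluate step missing its thresholds).
# run_step is a hypothetical callable returning True on success.
def run_chain(steps, run_step):
    completed = []
    for step in steps:
        if not run_step(step):
            break  # downstream steps (e.g. ml.deploy) never run
        completed.append(step["type"])
    return completed

steps = [{"type": "ml.preprocess"}, {"type": "ml.train"},
         {"type": "ml.evaluate"}, {"type": "ml.deploy"}]

# Simulate an evaluation step that misses its accuracy threshold:
completed = run_chain(steps, lambda s: s["type"] != "ml.evaluate")
assert completed == ["ml.preprocess", "ml.train"]
```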