Saving and loading models
Trained models live in memory unless you persist them. Here, you’ll save the model artifact that Part 3 will later serve. The artifact contains the parameters learned during training (the model weights) along with related metadata, so reloading it gives you the same trained model without retraining.
Save, reload, and score the model
- Run Cell 13 to save, reload, and score with the persisted model.
Cell 13: Save, reload, and score with the persisted model
    import os

    from pyspark.ml.classification import LogisticRegressionModel

    # Create the output directory and choose where the model artifact will live.
    os.makedirs("./models", exist_ok=True)
    model_path = "./models/trip_decline_risk_lr"

    # Persist the trained model, replacing any earlier copy at this path.
    lr_model.write().overwrite().save(model_path)

    # Restore the model from disk and score the test set with it.
    lr_model_loaded = LogisticRegressionModel.load(model_path)
    loaded_predictions = lr_model_loaded.transform(test_df)
    loaded_predictions.select("driver_id", "label", "prediction").show(5)

    print(f"Saved model to {model_path}")
    print("Reloaded model and scored test rows")

Output:

    +----------+-----+----------+
    | driver_id|label|prediction|
    +----------+-----+----------+
    |driver_007|    0|       0.0|
    |driver_012|    1|       1.0|
    |driver_019|    0|       0.0|
    |driver_023|    0|       0.0|
    |driver_031|    1|       1.0|
    +----------+-----+----------+
    Saved model to ./models/trip_decline_risk_lr
    Reloaded model and scored test rows

Saving and reloading confirms that your training output is portable and can be restored later for inference. No new learning happens during load: loading deserializes the model, and scoring then proceeds as usual with the restored artifact.
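One way to convince yourself that loading does not retrain anything is to compare the learned parameters of the in-memory model and the reloaded one. The helper below is a minimal sketch, not part of the tutorial's notebook: `same_lr_parameters` is a hypothetical name, and it assumes both models are binary logistic regression models exposing `coefficients` (with `toArray()`) and `intercept`, as PySpark's `LogisticRegressionModel` does for binomial models.

```python
import math


def same_lr_parameters(original, reloaded, tol=1e-12):
    """Return True if two binary logistic regression models hold the same parameters.

    Assumption: both objects expose `coefficients.toArray()` and `intercept`,
    which is the case for PySpark's binomial LogisticRegressionModel.
    """
    a = list(original.coefficients.toArray())
    b = list(reloaded.coefficients.toArray())
    if len(a) != len(b):
        return False

    # Compare each weight and the intercept within an absolute tolerance.
    weights_match = all(
        math.isclose(x, y, rel_tol=0.0, abs_tol=tol) for x, y in zip(a, b)
    )
    intercept_match = math.isclose(
        original.intercept, reloaded.intercept, rel_tol=0.0, abs_tol=tol
    )
    return weights_match and intercept_match
```

In the notebook, after Cell 13 has run, `same_lr_parameters(lr_model, lr_model_loaded)` should return `True` if the round trip preserved the weights and intercept.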
Why this matters for Part 3
In Part 3, you’ll combine this saved model with live feature retrieval from Aerospike to build an end-to-end serving flow.
You now have the training artifact needed for model serving.
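Before moving on, you may want to confirm what the saved artifact actually holds. The sketch below uses only the Python standard library and assumes the model was written to the local filesystem in Spark ML's usual persistence layout: a `metadata/part-00000` file with one JSON record, and a `data/` directory of Parquet files holding the weights. That layout, the metadata keys `class` and `sparkVersion`, and the helper names `read_model_metadata` and `summarize_artifact` are assumptions for illustration; the layout can differ across Spark versions and storage backends.

```python
import json
from pathlib import Path


def read_model_metadata(model_path):
    """Load the JSON metadata record from a saved Spark ML model directory.

    Assumption: <model_path>/metadata/part-00000 holds one JSON object
    on its first line (the default Spark ML persistence layout).
    """
    metadata_file = Path(model_path) / "metadata" / "part-00000"
    with metadata_file.open(encoding="utf-8") as f:
        return json.loads(f.readline())


def summarize_artifact(model_path):
    """Return the model class, Spark version, and Parquet data file names."""
    meta = read_model_metadata(model_path)
    data_dir = Path(model_path) / "data"
    # Tolerate a missing data directory rather than failing the whole summary.
    if data_dir.is_dir():
        data_files = sorted(p.name for p in data_dir.glob("*.parquet"))
    else:
        data_files = []
    return {
        "class": meta.get("class"),
        "sparkVersion": meta.get("sparkVersion"),
        "data_files": data_files,
    }
```

Calling `summarize_artifact("./models/trip_decline_risk_lr")` after Cell 13 gives a quick view of the model class and the Spark version that wrote it, which is useful to check against the environment that will load the model in Part 3.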