Putting it all together

You now have a complete training result. You defined a training slice, materialized it, trained a model, tested it on held-out data, and saved the artifact for reuse. In this tutorial run, the model predicts label from ds_decl_rate, ds_avg_rating, and da_trips_today, and test accuracy often lands in the low-to-mid 90% range. Exact values can vary slightly by Spark version and partitioning.
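As a reminder of what the test accuracy figure measures, here is a minimal plain-Python sketch. It is not the Spark MLlib evaluator used in the notebook; the `accuracy` helper and the toy values are assumptions made up for illustration, not output from the tutorial run.

```python
def accuracy(labels, predictions):
    """Fraction of held-out rows whose predicted label equals the true label."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Made-up values for illustration only; not results from the tutorial.
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
toy_predictions = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]

toy_accuracy = accuracy(toy_labels, toy_predictions)
print(f"accuracy = {toy_accuracy:.2f}")
```

Because the metric is a simple ratio over the held-out rows, a different train/test split or partitioning can shift it slightly from run to run.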

The model fitting itself happened in Spark MLlib, but Aerospike remained the data backbone for the workflow. Aerospike stored the feature/entity records you trained from, and it stored the Dataset definition that documents exactly which columns and filters were used. That means your training slice is reproducible: teammates can load the same Dataset definition and materialize the same training slice again.
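The reproducibility idea can be sketched in plain Python. This is a conceptual stand-in, not the Aerospike Dataset API: the `definition` dictionary, the `materialize` helper, the filter on `da_trips_today`, the `driver_id` field, and the toy records are all hypothetical. Only the three feature names and `label` come from this tutorial.

```python
FEATURES = ["ds_decl_rate", "ds_avg_rating", "da_trips_today"]

# Hypothetical stand-in for a stored dataset definition:
# which columns to keep and which rows to include.
definition = {
    "columns": FEATURES + ["label"],
    "filter": {"column": "da_trips_today", "min": 1},  # assumed filter
}


def materialize(records, definition):
    """Apply the definition's filter and column selection to raw records."""
    flt = definition["filter"]
    return [
        {col: rec[col] for col in definition["columns"]}
        for rec in records
        if rec[flt["column"]] >= flt["min"]
    ]


# Made-up records for illustration only.
toy_records = [
    {"driver_id": "d1", "ds_decl_rate": 0.10, "ds_avg_rating": 4.8, "da_trips_today": 5, "label": 1},
    {"driver_id": "d2", "ds_decl_rate": 0.40, "ds_avg_rating": 3.9, "da_trips_today": 0, "label": 0},
    {"driver_id": "d3", "ds_decl_rate": 0.05, "ds_avg_rating": 4.9, "da_trips_today": 2, "label": 1},
]

first = materialize(toy_records, definition)
second = materialize(toy_records, definition)
print(first == second)
```

The point of the sketch is that the slice is a pure function of the stored definition and the underlying records, so anyone who loads the same definition against the same data gets the same rows and columns.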

Continue to Part 3

Part 3 takes this offline model and turns it into a real-time serving path. You will load the saved model, fetch feature values from Aerospike by key, and return live predictions fast enough for request-time decisions. Continue to Part 3: Model Serving and resume the notebook at Cell 14.