Beyond the tutorial

You built a full ML pipeline: feature store, training, and serving. Inference was fast, as you’d expect with a small model. Here’s what changes in larger deployments and how this pipeline adapts.

Real models are more complex

The tutorial model uses three features. A production smart dispatch model would balance many more signals to make accurate predictions. Here’s what a realistic feature set might look like:

| Feature | Source | Why it matters |
| --- | --- | --- |
| `decline_rate` | Historical | Baseline reliability signal |
| `avg_rating` | Historical | Proxy for driver quality and engagement |
| `trips_today` | Real-time | Current workload and fatigue |
| `hours_on_shift` | Real-time | Fatigue accumulates over a shift |
| `consecutive_trips` | Real-time | No-break streaks increase decline likelihood |
| `distance_to_pickup` | Per-request | Longer pickups are more likely to be declined |
| `rider_rating` | Per-request | Drivers may avoid low-rated riders |
| `is_peak_hour` | Temporal | Behavior varies during rush vs off-peak |
| `surge_multiplier` | Per-request | Financial incentive reduces decline likelihood |
| `estimated_trip_duration` | Per-request | Short vs long trips have different decline profiles |
| `acceptance_rate_1h` | Real-time | Recent behavior vs historical average |
| `total_lifetime_trips` | Historical | Experience level and platform commitment |

Why not just use decline rate?

With the three-feature tutorial model, sorting drivers by `decline_rate` gives roughly the same ranking as the model does. With a richer feature set, the model sees what a single metric can’t:

Driver A has a 4% historical decline rate, which looks strong on paper. But right now:

| Feature | Value |
| --- | --- |
| `decline_rate` | 0.04 |
| `trips_today` | 19 |
| `hours_on_shift` | 11 |
| `consecutive_trips` | 6 |
| `distance_to_pickup` | 14 min |

This driver has been going nonstop for 11 hours. The model recognizes the fatigue pattern and predicts high risk right now, despite the low historical rate.

A naive sort by decline rate would rank this driver near the top. The model knows better.

Driver B has a 14% historical decline rate, which looks less reliable. But right now:

| Feature | Value |
| --- | --- |
| `decline_rate` | 0.14 |
| `trips_today` | 1 |
| `hours_on_shift` | 0.5 |
| `consecutive_trips` | 0 |
| `distance_to_pickup` | 2 min |

This driver just started their shift and is close to the pickup. The model predicts low risk right now, despite the elevated historical rate.

A naive sort would bury this driver at the bottom. The model recognizes the favorable conditions.

Driver C has a 33% decline rate, the highest of the three. But they have only 3 lifetime trips on record, so a single declined request accounts for the entire rate.

| Feature | Value |
| --- | --- |
| `decline_rate` | 0.33 |
| `total_lifetime_trips` | 3 |
| `avg_rating` | 4.9 |
| `hours_on_shift` | 1.0 |

With more features, the model recognizes that new drivers with few trips have unreliable rates. The high rating and short shift suggest this driver is fine.

Naive sorting permanently penalizes new drivers with small sample sizes.
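As a minimal sketch, the naive ranking that the three examples argue against looks like this. The list-of-dicts layout and the `naive_rank` helper are illustrative assumptions, not part of the tutorial code; only the feature values come from the driver examples above.

```python
# Feature values copied from the Driver A, B, and C examples.
# The record layout and helper name are hypothetical, for illustration only.
drivers = [
    {"driver": "A", "decline_rate": 0.04, "hours_on_shift": 11, "consecutive_trips": 6},
    {"driver": "B", "decline_rate": 0.14, "hours_on_shift": 0.5, "consecutive_trips": 0},
    {"driver": "C", "decline_rate": 0.33, "hours_on_shift": 1.0, "total_lifetime_trips": 3},
]


def naive_rank(candidates):
    """Rank candidates by historical decline rate alone, lowest first."""
    return sorted(candidates, key=lambda d: d["decline_rate"])


naive_order = [d["driver"] for d in naive_rank(drivers)]
print(naive_order)  # ['A', 'B', 'C']
```

The sort key never looks at `hours_on_shift`, `consecutive_trips`, or `total_lifetime_trips`, so the fatigued Driver A lands first and the barely sampled Driver C lands last.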

The model captures interaction effects between features that no single metric can represent. The pipeline handles this with no architectural changes: define the new features in Part 1, train on them in Part 2, and serve them in Part 3. The `get_feature_vector()` call simply returns more keys.

Adding features without downtime

Suppose the data science team wants to predict driver churn and needs a new feature, `days_since_last_trip`, from a driver-engagement pipeline.

Does this require a migration? Do you need to coordinate a deployment across teams? What happens to the features that are already there?

Because Aerospike is schemaless and the Entity class builds its schema dynamically, the answer is straightforward:

1. Register the Feature metadata: Create a new Feature record for `driver-engagement_days_since_last_trip` with its type, description, and tags.
2. Update the pipeline: The driver-engagement pipeline starts computing the new feature and writing it using `saveDF`. For each driver, it adds (or updates) the `de_last_trip` bin.
3. Existing features are untouched: Aerospike merges the new bin into each driver’s existing record. The driver-stats features stay exactly as they were.
4. Consumers opt in: Code that reads entities includes the new column in its schema when it wants the new feature. Existing code continues working unchanged.

In most cases, this avoids blocking migrations and downtime for existing consumers. Different teams can independently add features to the same entity type without breaking existing pipelines, as long as they keep feature definitions documented.
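The merge-and-opt-in behavior described above can be modeled with plain dictionaries. This is a simulation under stated assumptions, not Aerospike client code and not the tutorial's `saveDF` path: the `store`, `write_bins`, and `read_bins` names are hypothetical, as are the `ds_*` bin names and the sample values. Only the `de_last_trip` bin name comes from the text.

```python
# In-memory stand-in for one driver's record: bin name -> value.
# Only de_last_trip comes from the text; the ds_* bin names are hypothetical.
store = {
    "driver_42": {"ds_decline_rate": 0.04, "ds_avg_rating": 4.8},
}


def write_bins(store, key, bins):
    """Merge the given bins into the record, creating the record if needed."""
    store.setdefault(key, {}).update(bins)


def read_bins(store, key, wanted):
    """Return only the bins the consumer asked for; absent bins are skipped."""
    record = store.get(key, {})
    return {name: record[name] for name in wanted if name in record}


# An existing consumer's read before the new feature exists.
before = read_bins(store, "driver_42", ["ds_decline_rate", "ds_avg_rating"])

# The driver-engagement pipeline writes only its own bin.
write_bins(store, "driver_42", {"de_last_trip": 3})

# The existing consumer sees the same result; a new consumer opts in.
after = read_bins(store, "driver_42", ["ds_decline_rate", "ds_avg_rating"])
churn_view = read_bins(store, "driver_42", ["ds_decline_rate", "de_last_trip"])
```

Because the writer touches only the bins it owns and each reader names the bins it wants, neither side needs to know about the other's additions.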

Real datasets are much larger

The tutorial used 100 drivers. The average city has about 8 drivers per 1,000 residents, so a city of 500,000 residents would have roughly 4,000 drivers. For each incoming ride request, the platform scores every nearby candidate, potentially 20 to 50 drivers, and must return a ranked list within its latency budget.

If feature retrieval takes even 5 milliseconds per driver and a busy area has 50 candidates read one after another, that’s 250 milliseconds spent reading features before the model runs any inference. At peak times, a city may see hundreds of ride requests per second, so the total read load multiplies quickly.
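The arithmetic behind these figures can be sketched as follows. The helper names are hypothetical, reads are assumed to happen one at a time, and 200 requests per second is an assumed stand-in for "hundreds"; the 5 ms and 20 to 50 candidate figures come from the text.

```python
def sequential_read_ms(candidates, ms_per_read):
    """Total feature-read time when candidates are read one after another."""
    return candidates * ms_per_read


def reads_per_second(requests_per_second, candidates):
    """Feature reads the store must serve per second across all requests."""
    return requests_per_second * candidates


# 5 ms per read and 20-50 candidates come from the text.
quiet_area_ms = sequential_read_ms(20, 5)   # 100 ms
busy_area_ms = sequential_read_ms(50, 5)    # 250 ms

# 200 requests/second is an assumed value for "hundreds".
peak_reads = reads_per_second(200, 50)      # 10,000 reads per second

print(quiet_area_ms, busy_area_ms, peak_reads)
```

Per-request latency depends on how many candidates are read and how fast each read is, while the aggregate read rate is what the feature store has to sustain at peak.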

In the next section, you’ll generate a larger dataset with more features, write it to Aerospike, and measure whether feature retrieval time changes.