Beyond the tutorial

You built a full ML pipeline: feature store, training, and serving. Inference was fast, as you’d expect with a small model. Here’s what changes in larger deployments and how this pipeline adapts.

Real models are more complex

The tutorial model uses three features. A production smart dispatch model would balance many more signals to make accurate predictions. Here’s what a realistic feature set might look like:

| Feature | Source | Why it matters |
| --- | --- | --- |
| decline_rate | Historical | Baseline reliability signal |
| avg_rating | Historical | Proxy for driver quality and engagement |
| trips_today | Real-time | Current workload and fatigue |
| hours_on_shift | Real-time | Fatigue accumulates over a shift |
| consecutive_trips | Real-time | No-break streaks increase decline likelihood |
| distance_to_pickup | Per-request | Longer pickups are more likely to be declined |
| rider_rating | Per-request | Drivers may avoid low-rated riders |
| is_peak_hour | Temporal | Behavior varies during rush vs off-peak |
| surge_multiplier | Per-request | Financial incentive reduces decline likelihood |
| estimated_trip_duration | Per-request | Short vs long trips have different decline profiles |
| acceptance_rate_1h | Real-time | Recent behavior vs historical average |
| total_lifetime_trips | Historical | Experience level and platform commitment |
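
The table mixes features that can be stored per driver with signals that exist only once a ride request arrives. As a rough sketch of how those sources might be combined into one model input, the snippet below merges three dictionaries. The function name build_model_input, the rush-hour window, and all values are illustrative assumptions, not part of the tutorial code.

```python
from datetime import datetime

# Assumed rush-hour window, for illustration only.
PEAK_HOURS = {7, 8, 16, 17, 18}


def build_model_input(stored: dict, request: dict, now: datetime) -> dict:
    """Combine stored driver features with request-time and temporal signals."""
    temporal = {"is_peak_hour": now.hour in PEAK_HOURS}
    # Later dictionaries win on key collisions; none are expected here.
    return {**stored, **request, **temporal}


# Historical and real-time features, as they might come back from the feature store.
stored = {
    "decline_rate": 0.04,
    "avg_rating": 4.8,
    "trips_today": 19,
    "hours_on_shift": 11,
}

# Per-request features, known only when the ride request arrives.
request = {
    "distance_to_pickup": 14,
    "rider_rating": 4.6,
    "surge_multiplier": 1.5,
}

features = build_model_input(stored, request, datetime(2024, 1, 8, 17, 30))
```

Only the stored portion needs a feature-store read; the per-request and temporal parts are computed in the serving path.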

Why not just use decline rate?

With the three-feature tutorial model, sorting drivers by decline_rate gives you roughly the same ranking as the model. With a richer feature set, the model sees what a single metric can’t:

Driver A has a 4% historical decline rate, which looks strong on paper. But right now:

| Feature | Value |
| --- | --- |
| decline_rate | 0.04 |
| trips_today | 19 |
| hours_on_shift | 11 |
| consecutive_trips | 6 |
| distance_to_pickup | 14 min |

This driver is 11 hours into a shift, has completed 19 trips today, and is on a six-trip streak with no break. The model recognizes the fatigue pattern and predicts a high decline risk right now, despite the low historical rate.

A naive sort by decline rate would rank this driver near the top. The model knows better.

The model captures interaction effects between features that no single metric can represent. The pipeline handles this with no architectural changes: define the new features in Part 1, train on them in Part 2, and serve them in Part 3. The get_feature_vector() call returns more keys.
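
To make the interaction effect concrete, here is a small sketch that ranks Driver A against a second driver in two ways: by decline_rate alone, and by a hand-written risk score. The score is a stand-in for a trained model, with made-up weights and thresholds. Driver B and the function toy_decline_risk are hypothetical and exist only for this comparison.

```python
from dataclasses import dataclass


@dataclass
class Driver:
    name: str
    decline_rate: float
    trips_today: int
    hours_on_shift: float
    consecutive_trips: int
    distance_to_pickup_min: float


def toy_decline_risk(d: Driver) -> float:
    """Hand-written stand-in for a trained model (illustrative weights only)."""
    risk = d.decline_rate
    # Interaction term: fatigue counts only when a long shift AND a
    # no-break streak occur together.
    if d.hours_on_shift >= 8 and d.consecutive_trips >= 5:
        risk += 0.30
    # Pickups longer than 5 minutes add risk in proportion to the extra time.
    risk += 0.01 * max(0.0, d.distance_to_pickup_min - 5)
    return min(risk, 1.0)


driver_a = Driver("A", 0.04, 19, 11, 6, 14)  # values from the table above
driver_b = Driver("B", 0.10, 3, 2, 1, 4)     # hypothetical comparison driver

# Lowest value first, so the first entry is the preferred driver.
by_decline_rate = sorted([driver_a, driver_b], key=lambda d: d.decline_rate)
by_toy_risk = sorted([driver_a, driver_b], key=toy_decline_risk)

print([d.name for d in by_decline_rate])  # A ranks first
print([d.name for d in by_toy_risk])      # B ranks first
```

The naive sort prefers Driver A, while the score that combines shift length with the no-break streak prefers Driver B. A trained model learns such combinations from data rather than from hand-set thresholds.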

Adding features without downtime

Suppose the data science team wants to predict driver churn and needs a new feature: days_since_last_trip from a driver-engagement pipeline.

Does this require a migration? Do you need to coordinate a deployment across teams? What happens to the features that are already there?

Because Aerospike is schemaless and the Entity class builds its schema dynamically, the answer is straightforward:

  1. Register the Feature metadata: Create a new Feature record for driver-engagement_days_since_last_trip with its type, description, and tags.

  2. Update the pipeline: The driver-engagement pipeline starts computing the new feature and writing it using saveDF. For each driver, it adds (or updates) the de_last_trip bin.

  3. Existing features are untouched: Aerospike merges the new bin into each driver’s existing record. The driver-stats features stay exactly as they were.

  4. Consumers opt in: Code that reads entities includes the new column in its schema when it wants the new feature. Existing code continues working unchanged.

In most cases, this avoids blocking migrations and downtime for existing consumers. Different teams can independently add features to the same entity type without breaking existing pipelines, as long as they keep feature definitions documented.
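
The four steps above can be sketched with an in-memory stand-in for a schemaless record store. This is not the Aerospike client or the tutorial's Entity class. The helper names write_bins and read_bins, the record key, and the ds_ bin names are assumptions for illustration; only de_last_trip comes from the steps above.

```python
# In-memory stand-in for a schemaless record store (not the Aerospike client).
store = {}


def write_bins(key, bins):
    """Merge the given bins into the record at key, leaving other bins untouched."""
    store.setdefault(key, {}).update(bins)


def read_bins(key, wanted):
    """Return only the requested bins; bins that do not exist come back as None."""
    record = store.get(key, {})
    return {name: record.get(name) for name in wanted}


# The driver-stats pipeline writes its features first.
write_bins("driver:42", {"ds_decline_rate": 0.04, "ds_avg_rating": 4.8})

# Later, the driver-engagement pipeline adds its own bin to the same record.
write_bins("driver:42", {"de_last_trip": 3})

# An existing consumer asks only for the bins it already knew about.
existing_consumer = read_bins("driver:42", ["ds_decline_rate", "ds_avg_rating"])

# A churn-model consumer opts in to the new bin.
churn_consumer = read_bins("driver:42", ["ds_decline_rate", "de_last_trip"])
```

The second write touches only its own bin, so the existing consumer sees the same result before and after the new feature appears.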

Real datasets are much larger

The tutorial used 100 drivers. The average city has about 8 drivers per 1,000 residents, so a medium to large city probably has several thousand drivers. For each incoming ride request, the platform scores every nearby candidate, potentially 20 to 50 drivers, and must return a ranked list within its latency budget.

If feature retrieval takes even 5 milliseconds per driver and the reads run one after another, scoring 50 candidates in a busy area costs 250 milliseconds just to read features, before any model inference. At peak times, a single city may see hundreds of ride requests per second, so that latency compounds quickly.
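
The arithmetic behind that estimate, as a short sketch. The 5 ms read time and the 20 to 50 candidate range come from the text; the figure of 200 requests per second is an assumed stand-in for "hundreds", and both function names are hypothetical.

```python
def sequential_latency_ms(candidates: int, per_read_ms: float) -> float:
    """Total feature-read time when reads are issued one after another."""
    return candidates * per_read_ms


def reads_per_second(candidates: int, requests_per_sec: int) -> int:
    """Feature reads the store must serve each second for one city."""
    return candidates * requests_per_sec


PER_READ_MS = 5         # from the text
REQUESTS_PER_SEC = 200  # assumed value for "hundreds of ride requests per second"

for candidates in (20, 50):
    latency = sequential_latency_ms(candidates, PER_READ_MS)
    load = reads_per_second(candidates, REQUESTS_PER_SEC)
    print(f"{candidates} candidates: {latency:.0f} ms per request, {load} reads/s")
```

Under these assumptions a single request spends 100 to 250 ms on feature reads alone, and the store serves 4,000 to 10,000 reads per second for one city.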

In the next section, you’ll generate a larger dataset with more features, write it to Aerospike, and measure whether feature retrieval time changes.
