Every missed ETA triggers a chain reaction: delayed docks, idle crews, disrupted production, and disappointed customers. Accurate arrival times are critical for supply chain efficiency because they enable better planning, coordination, and resource allocation throughout the entire logistics network.
Reliable ETAs enable companies to anticipate delays, cut down waiting times, optimize warehouse and dock operations, and deliver to customers on schedule. The result is higher customer satisfaction, reduced operational costs, and fewer disruptions like production stoppages or stockouts. The challenge is that predicting ETAs for truckload freight requires robust machine learning models that capture the many factors influencing arrival times, including weather, traffic, seasonality, lane variance, operational nuances, and even human behavior.
Unpacking the truckload ETA problem
Unlike parcel or small-package moves, a truckload journey involves far more than “time on the road.” A shipment timeline includes:
- Drive time between stops.
- Dwell time at facilities (like warehouses or yards) waiting for loading, unloading, or paperwork.
- Short breaks for fuel, meals, or minor rests.
- Long breaks for overnight stays or extended rest periods.

On short trips, a single facility delay can throw off the timeline. On long-haul runs, mandated rest or unexpected dwell can matter even more. And those are just the controllable elements: traffic jams, weather events, and yard congestion all add another layer of uncertainty.
Compounding the problem are data quality gaps (spotty GPS pings, missing milestone updates), lane-level variability across thousands of routes, and operational differences among carriers and facilities. A simplistic distance-based model can’t capture this complexity.
How we measure FTL ETA
To measure ETA accuracy, we track two complementary metrics for each prediction: Actual Hours Out (the true time remaining until arrival at the moment the prediction was made) and Predicted Hours Out (the time remaining implied by the predicted arrival).
Together, these measures highlight not only how accurate the ETA was but also how much advance notice it provided, balancing precision with practical usefulness in real-world operations.
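For illustration, here is a minimal Python sketch of how an accuracy-at-horizon metric built on these two measures could be computed. The `EtaSnapshot` shape, field names, and thresholds are illustrative assumptions, not project44’s internal implementation:

```python
from dataclasses import dataclass

@dataclass
class EtaSnapshot:
    predicted_hours_out: float  # model's estimate of time remaining until arrival
    actual_hours_out: float     # true time remaining, known once the trip completes

def accuracy_at_horizon(snapshots, horizon_hours=10.0, tolerance_hours=2.0):
    """Share of predictions made at or beyond the horizon that land within the tolerance window."""
    at_horizon = [s for s in snapshots if s.actual_hours_out >= horizon_hours]
    if not at_horizon:
        return None
    hits = sum(
        1 for s in at_horizon
        if abs(s.predicted_hours_out - s.actual_hours_out) <= tolerance_hours
    )
    return hits / len(at_horizon)

# Example: two of three long-horizon predictions fall inside the ±2 h window.
demo = [
    EtaSnapshot(predicted_hours_out=11.5, actual_hours_out=10.2),  # within ±2 h
    EtaSnapshot(predicted_hours_out=14.9, actual_hours_out=10.8),  # off by >2 h
    EtaSnapshot(predicted_hours_out=12.0, actual_hours_out=12.4),  # within ±2 h
]
print(accuracy_at_horizon(demo))  # 0.666...
```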

Why is it so hard to predict an accurate ETA?
Achieving more accurate ETAs requires systematically addressing the issues outlined below:
- Data quality issues: Gaps in ping coverage, low-quality location signals, and incomplete milestone updates reduce reliability.
- Lane and shipment variability: Routes and trip lengths vary widely, and intermodal legs (e.g., truck–rail–truck) add complexity. Each lane behaves differently, making it difficult to build a single model that performs well across a diverse network.
- Unscheduled dwell: Driver rest, facility delays, or yard congestion can drastically extend trips in unpredictable ways.
- External disruptions: Weather events and traffic congestion are highly variable and difficult to forecast.
- Operational nuances: Differences in carrier behavior, driver habits, and poor visibility into facility operating hours all shape outcomes.
- Systematic error drivers: Dwell-heavy areas, missed milestones, route deviations, and data discrepancies tend to produce persistent ETA inaccuracies.
These challenges highlight why ETA modeling requires robust data pipelines, adaptive models, continuous error analysis, and monitoring to stay accurate in real-world logistics.
From raw data to reliable ETAs: An end-to-end view
To understand how ETA predictions are made, you need to start with the journey itself. As a truck moves from origin to destination, events such as position updates and milestone completions are transmitted to the project44 tracking system. This data is captured in three primary ways:
- ELDs (electronic logging devices) installed on the truck, which continuously emit location pings.
- EDI/API integrations directly with the carrier.
- Drive View mobile app, which shares location data from the driver’s phone.
All of these signals are ingested into the project44 tracking platform and then passed to the ETA service, where data science models generate predictions. Each time a new update is received, the ETA service recalculates the prediction using multiple proprietary models, including regressors, classifiers, transformers, and error correction frameworks, powered by over 150 input features.

The predicted ETA is then sent back to the project44 platform, displayed in the tracking UI, and delivered to customers via webhooks for real-time visibility.
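As a rough sketch of this event-driven loop, the hypothetical code below recalculates an ETA whenever a new update arrives and runs it through a chain of model stages. The function names, stage registry, and placeholder math are invented for illustration; the production service uses proprietary models and over 150 features:

```python
from typing import Callable

# Hypothetical registry of prediction stages; the real service combines
# regressors, classifiers, transformers, and error correction frameworks.
MODEL_STAGES: list[Callable[[dict], dict]] = []

def stage(fn):
    MODEL_STAGES.append(fn)
    return fn

@stage
def base_regressor(features: dict) -> dict:
    # Placeholder: a drive-time regressor would run here.
    features["eta_hours"] = features["remaining_km"] / 65.0
    return features

@stage
def error_correction(features: dict) -> dict:
    # Placeholder: a residual model would add learned dwell/disruption delay.
    features["eta_hours"] += features.get("expected_dwell_hours", 0.0)
    return features

def on_tracking_update(event: dict) -> float:
    """Recalculate the ETA whenever a new ping or milestone arrives."""
    features = {"remaining_km": event["remaining_km"],
                "expected_dwell_hours": event.get("expected_dwell_hours", 0.0)}
    for run_stage in MODEL_STAGES:
        features = run_stage(features)
    return features["eta_hours"]  # published to the platform and webhooks

print(on_tracking_update({"remaining_km": 325.0, "expected_dwell_hours": 1.5}))  # ~6.5 h
```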

How project44 approaches ETA differently
Over the past year, we’ve experimented with and deployed a wide range of models across our large tenant base of 500+ FTL shippers. In total, we shipped more than 10 production-ready models, each designed to capture different aspects of variability in truckload freight. The result is a layered ETA engine that adapts to real-world complexity and consistently delivers higher accuracy in ETA prediction.
1. Ensemble model
Short-haul and long-haul trips behave very differently. Short hauls (<200 km) are usually completed within a day, often with a single driver and minimal breaks. Long hauls (>200 km), however, often span multiple days and include significant mandatory rest periods that are harder to anticipate and model. By training specialized models for each and combining their predictions into a unified output, we captured the distinct behaviors of both trip types and improved prediction accuracy.
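A minimal sketch of this routing idea appears below. The 200 km threshold comes from the text, but the specialist models, the blending rule, and the numbers inside them are placeholders:

```python
def predict_short_haul(features: dict) -> float:
    # Placeholder for a model trained only on trips under ~200 km.
    return features["remaining_km"] / 60.0

def predict_long_haul(features: dict) -> float:
    # Placeholder for a model trained on multi-day trips; adds mandated rest.
    drive = features["remaining_km"] / 70.0
    rest_breaks = drive // 11  # e.g., one long rest per ~11 h of driving
    return drive + rest_breaks * 10.0

def ensemble_eta(features: dict, threshold_km: float = 200.0) -> float:
    """Route the trip to the specialist model; blend near the boundary."""
    km = features["trip_km"]
    if km < threshold_km * 0.9:
        return predict_short_haul(features)
    if km > threshold_km * 1.1:
        return predict_long_haul(features)
    # Near the cutoff, average the two specialists to avoid a hard seam.
    return 0.5 * (predict_short_haul(features) + predict_long_haul(features))

print(ensemble_eta({"trip_km": 150.0, "remaining_km": 150.0}))    # short-haul path
print(ensemble_eta({"trip_km": 1200.0, "remaining_km": 1200.0}))  # long-haul path
```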

2. Classifier model
While the ensemble model improved ETA predictions, it still struggled in scenarios with high variability, like inconsistent dwell behaviors or lane-level differences, because it produced a single point estimate. To address this, we designed a classifier-based approach. Instead of predicting one exact ETA, the model starts with a reference ETA (such as a historical median transit time or the midpoint of an appointment window). We then calculate the residuals (differences between actual arrival times and the reference) and bucket those residuals into ranges.
The classifier’s job is to predict the bucket (or class) the final ETA falls into. Once the bucket is identified (e.g., BKT2), the ETA is computed as:
Final ETA = Reference ETA + Midpoint of the Predicted Bucket
This bucketed approach allowed the model to better capture variability and delivered reasonable lift in accuracy when deployed.
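In code, the final step looks roughly like the sketch below. The bucket boundaries and IDs are invented for illustration; only the formula above reflects the actual design:

```python
# Hypothetical residual buckets, in hours relative to the reference ETA.
BUCKETS = {
    "BKT0": (-4.0, -1.0),  # arriving earlier than reference
    "BKT1": (-1.0, 1.0),   # roughly on reference
    "BKT2": (1.0, 4.0),    # moderately late
    "BKT3": (4.0, 12.0),   # significantly late
}

def bucket_midpoint(bucket_id: str) -> float:
    lo, hi = BUCKETS[bucket_id]
    return (lo + hi) / 2.0

def final_eta(reference_eta_hours: float, predicted_bucket: str) -> float:
    """Final ETA = reference ETA + midpoint of the predicted residual bucket."""
    return reference_eta_hours + bucket_midpoint(predicted_bucket)

# A classifier that predicts BKT2 shifts a 30 h reference ETA to 32.5 h.
print(final_eta(30.0, "BKT2"))  # 32.5
```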

3. Transformer model
Deep learning offers clear advantages for ETA prediction, particularly the ability to model complex patterns at scale. Transformer models, trained on nearly a year of shipment data, were a natural fit. Their key strength is the attention mechanism, which allows the model to focus on the most relevant signals (like traffic spikes, dwell hotspots, or route deviations) that disproportionately impact ETA accuracy.
Unlike tree-based methods, which consume each update as part of a flat feature vector, transformers dynamically weigh the importance of each ping and event in context. This capability delivered a significant accuracy lift and allowed the model to generalize across thousands of lanes. Because of these gains, transformers are now our baseline architecture for future ETA development.
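The following minimal PyTorch sketch shows the shape of such a model: a transformer encoder over a sequence of per-ping feature vectors, with a regression head for hours-to-arrival. The dimensions, feature count, and readout choice are assumptions, not the production architecture:

```python
import torch
import torch.nn as nn

class PingTransformerEta(nn.Module):
    """Minimal transformer regressor over a sequence of ping feature vectors."""

    def __init__(self, ping_features: int = 8, d_model: int = 64,
                 nhead: int = 4, num_layers: int = 2):
        super().__init__()
        self.embed = nn.Linear(ping_features, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, 1)  # hours until arrival

    def forward(self, pings: torch.Tensor) -> torch.Tensor:
        # pings: (batch, seq_len, ping_features), e.g. speed, dwell flags,
        # remaining distance, local hour, day of week, weather codes...
        x = self.embed(pings)
        x = self.encoder(x)          # attention weighs each ping in context
        return self.head(x[:, -1])   # read the prediction off the latest ping

model = PingTransformerEta()
batch = torch.randn(4, 32, 8)        # 4 shipments, 32 pings each
print(model(batch).shape)            # torch.Size([4, 1])
```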
4. Error correction model
Pure drive-time models miss factors such as dwell, disruptions, and facility delays; these gaps can be addressed by learning a residual error. We started with a pseudo-deterministic HERE Maps baseline to estimate pure drive time under ideal conditions. Then a secondary machine learning model predicts the residual delays caused by dwell, carrier behaviors, or disruption events. Layering error correction on top of the transformer delivered an additional 8–10 percentage point accuracy gain.
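Schematically, the two layers compose like the sketch below, where both the HERE Maps baseline and the residual model are stand-ins with invented features and coefficients:

```python
def here_drive_time_hours(remaining_km: float, avg_speed_kmh: float = 72.0) -> float:
    # Stand-in for the HERE Maps baseline: pure drive time under ideal conditions.
    return remaining_km / avg_speed_kmh

def predicted_residual_hours(features: dict) -> float:
    # Stand-in for the secondary ML model trained on residuals
    # (actual hours minus baseline), driven here by invented dwell features.
    return (features.get("scheduled_stops", 0) * 0.75
            + features.get("dwell_hotspot_score", 0.0) * 2.0)

def corrected_eta_hours(features: dict) -> float:
    """Error-corrected ETA = deterministic baseline + learned residual."""
    baseline = here_drive_time_hours(features["remaining_km"])
    return baseline + predicted_residual_hours(features)

print(corrected_eta_hours({"remaining_km": 540.0,
                           "scheduled_stops": 2,
                           "dwell_hotspot_score": 0.6}))  # 7.5 + 1.5 + 1.2 = 10.2
```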

What we achieved
Over the past year, we’ve deployed more than ten production-ready models across 500+ FTL shippers. The outcome: a +28 percentage-point improvement in accuracy at the 10-hour horizon (±2-hour window).
For shippers, that improvement means more reliable planning, fewer surprises, and enough lead time to make proactive decisions, whether that’s reallocating dock labor, adjusting inventory, or rerouting freight to avoid disruptions.
Performance of key customers
While a global ETA model works for many shippers, some customers operate with unique patterns that require tailored approaches. A few examples show how flexible modeling delivers real-world impact:
High-volume short-haul shipper (Europe):
This customer ran very high full-truckload volumes on a daily basis. The trips were mostly undertaken by the same drivers along consistent routes, resulting in low variability. We observed a very tight distribution of HERE Maps-based residuals and deployed a transformer model to predict them. This solution delivered significant gains, improving accuracy by 20–25 percentage points.
Multi-stop shipper (North America):
This customer was unique in that their shipments involved multi-stop trips, sometimes with up to 14 stops, quite different from the scenarios our main model was trained on. We experimented with several approaches and ultimately developed a heuristic model to predict dwell times at each stop. The combined prediction, pairing HERE Maps transit times with heuristic scheduled-dwell estimates, unlocked an improvement of more than 35 percentage points in accuracy.
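A simplified version of that combination might look like the following, assuming a median-dwell-per-facility heuristic with a default for unseen facilities (the real heuristic is more involved):

```python
from statistics import median

def heuristic_dwell_hours(stop: dict, history: dict) -> float:
    """Scheduled-dwell heuristic: median historical dwell at this facility,
    falling back to a default when the facility is unseen."""
    past = history.get(stop["facility_id"], [])
    return median(past) if past else 1.0  # 1 h default is an assumption

def multi_stop_eta_hours(legs: list[dict], stops: list[dict], history: dict) -> float:
    """Sum map-provider transit time per leg with heuristic dwell per stop."""
    transit = sum(leg["here_transit_hours"] for leg in legs)
    dwell = sum(heuristic_dwell_hours(stop, history) for stop in stops)
    return transit + dwell

legs = [{"here_transit_hours": 3.2}, {"here_transit_hours": 2.1},
        {"here_transit_hours": 4.0}]
stops = [{"facility_id": "DC-14"}, {"facility_id": "DC-90"}]
history = {"DC-14": [0.8, 1.1, 0.9], "DC-90": [2.5, 3.0, 2.0]}
print(multi_stop_eta_hours(legs, stops, history))  # 9.3 transit + 3.4 dwell = 12.7
```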
Long-haul overnight carrier (North America):
This key customer operated very long, overnight trips with extended dwell times at scheduled locations. We developed a dedicated model by grouping tenants with similar long-trip and multi-stop patterns. We engineered features related to dwell time, such as median dwell duration and lane-specific dwell characteristics. This specialized model yielded an overall accuracy improvement of 8 percentage points.
What have we learned so far?
1. Data quality drives accuracy: When location data is frequent, accurate, and widespread, arrival-time predictions are more reliable. Weak data leads to poor performance.
2. Drift is constant: Supply chains shift weekly, so regular retraining and drift monitoring are essential to keep models aligned with the latest patterns (see the sketch after this list).
3. No more “one-size-fits-all”: Dedicated models for intermodal (rail + truck), multi-stop trips, and short- vs. long-haul scenarios provided notable improvements.
4. Transformers were a step-change: Investing in transformer architectures, and upgrading infrastructure to train larger models, materially improved generalization and performance.
5. Addressing unplanned events matters a lot: Unscheduled, long, or weekend dwell and missed or shifted appointments are major drivers of error, requiring careful feature engineering.
6. Error correction adds lift: Alternate modeling techniques like error-correction approaches (e.g., modeling residuals over HERE/plan baselines) delivered sizable gains in ETA accuracy.
7. Strong foundations amplify everything: Proactive monitoring, retraining hygiene, and larger historical corpora have all compounded model improvements.
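As a rough illustration of the drift monitoring mentioned in lesson 2, the sketch below tracks a rolling hit rate against a baseline and raises a flag when it degrades. The window size, trigger threshold, and baseline logic are all invented:

```python
from collections import deque

class DriftMonitor:
    """Rolling hit-rate tracker that raises a flag when accuracy
    degrades past a trigger, suggesting retraining may be due."""

    def __init__(self, window: int = 500, trigger: float = 0.05):
        self.recent = deque(maxlen=window)  # rolling window of hit/miss flags
        self.baseline = None                # first full reading stands in for post-retrain accuracy
        self.trigger = trigger

    def record(self, error_hours: float, tolerance_hours: float = 2.0) -> bool:
        """Log one prediction's absolute error; return True if drift is detected."""
        self.recent.append(abs(error_hours) <= tolerance_hours)
        current = sum(self.recent) / len(self.recent)
        if self.baseline is None:
            self.baseline = current
            return False
        return (self.baseline - current) > self.trigger

monitor = DriftMonitor(window=4, trigger=0.2)
for err in [0.5, 1.0, 0.8, 1.2, 5.0, 6.0, 4.5]:
    if monitor.record(err):
        print("drift detected, schedule retraining")
```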
Looking ahead
Even with a +28pp lift in accuracy, we’re not stopping here. Several initiatives are already in motion to push ETA performance even further:
- Tenant-aware post-processing. We’re adding a lightweight layer that adapts a single global model to tenant cohorts, fine-tuning outputs for groups of similar shippers without spinning up separate models for each (see the sketch after this list).
- Tenant Profiler. A new module will cluster similar tenants using shared operating patterns (lanes, service levels, dwell behavior), powering the post-processing layer above.
- Distributed training at scale. We’re investing in a distributed training framework to parallelize learning across nodes, so the model can train on a much larger historical corpus, converge faster, and generalize better.
- Deeper explainability. We’re expanding our explainability stack to clearly show why a prediction was made or changed (e.g., “ETA adjusted +3 hours due to prolonged dwell”), building transparency and trust.
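To make the tenant-aware post-processing idea concrete, here is a minimal sketch that learns an additive per-cohort correction from a global model’s recent errors. The cohort names, offset scheme, and data shapes are assumptions:

```python
from statistics import mean

def fit_cohort_offsets(records: list[dict]) -> dict[str, float]:
    """Learn a per-cohort additive correction from recent signed errors
    (actual minus predicted hours) of the global model."""
    by_cohort: dict[str, list[float]] = {}
    for r in records:
        by_cohort.setdefault(r["cohort"], []).append(
            r["actual_hours"] - r["predicted_hours"]
        )
    return {cohort: mean(errs) for cohort, errs in by_cohort.items()}

def adjusted_eta(global_eta_hours: float, cohort: str,
                 offsets: dict[str, float]) -> float:
    # Unseen cohorts fall back to the unadjusted global prediction.
    return global_eta_hours + offsets.get(cohort, 0.0)

history = [
    {"cohort": "short_haul_eu", "predicted_hours": 6.0, "actual_hours": 6.5},
    {"cohort": "short_haul_eu", "predicted_hours": 4.0, "actual_hours": 4.3},
    {"cohort": "multi_stop_na", "predicted_hours": 20.0, "actual_hours": 24.0},
]
offsets = fit_cohort_offsets(history)
print(adjusted_eta(8.0, "short_haul_eu", offsets))  # 8.0 + 0.4 = 8.4
```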
These initiatives complement the core models with strong ML operations discipline and contextual explanations, pushing ETA accuracy even further.