Your driver app needs to work in a Faraday cage

Logistics software that assumes connectivity will fail in production on roughly day three. Building offline-first is not premium engineering; it is the only kind that ships.

The most useful exercise we run for a new logistics client is the elevator test. Take the driver's phone into the freight elevator at the loading dock. Close the doors. Watch the app.

If the app shows an endless spinner, freezes, loses the user's place, or (the most common failure) silently drops the scan the driver just performed, it is not ready for production. Drivers spend a meaningful fraction of their day in steel boxes, concrete buildings, rural dead zones, and tunnels. Connectivity is not a property they have. It is a property they sometimes have.

This is the actual constraint. Building for it is not premium engineering; it is the only kind that ships.

The three failure modes that get teams every time

After enough logistics engagements, three failure modes show up repeatedly.

The optimistic write that wasn't. The driver scans a package. The app shows "scanned." The server, due to a network blip, never received the scan. The driver believes the package is in the system; the system does not. The package gets to the next stop and nobody knows what to do with it. The root cause: the app showed success based on a local action alone, with no acknowledgment from the server and no way for the driver to tell "saved on this phone" from "received by the server."

The login that needs a network. A driver shows up at the start of a shift and the phone has lost its session. The login screen needs to call the auth server. The driver is in the parking lot, with two bars of coverage that drop to zero whenever the truck moves. The shift cannot start. Every minute of this is paid time, and the dispatcher's phone is now ringing.

The "sync later" queue that loses data. Many teams build offline support by queueing actions locally and syncing when the network returns. This works until the queue grows past what the device can hold, or the user closes the app and the queue is lost, or the sync code has a bug and silently drops items. The driver does their job correctly. The data does not arrive. The driver gets blamed.

What offline-first actually requires

Offline-first is not a feature you add. It is a property of the architecture, and either the app is built around it from day one or it is not.

The shape that works:

The local store is the source of truth for the session. When the driver scans a package, the scan is written to a durable local store immediately, and the UI reads from that store. The server sync is a separate process that runs whenever it can. The UI does not wait for the server; it shows the local state, which is the real state the driver experiences.
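A minimal sketch of that shape, using Python's built-in sqlite3 module as a stand-in for whatever durable store the mobile platform offers. The table layout and the function names (open_store, record_scan, scans_for_ui, mark_synced) are assumptions made for illustration, not a prescribed schema.

```python
import sqlite3
import time
import uuid


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the local store. A real app would pass a file path on the device."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS scans (
               action_id  TEXT PRIMARY KEY,
               barcode    TEXT NOT NULL,
               scanned_at REAL NOT NULL,
               synced     INTEGER NOT NULL DEFAULT 0
           )"""
    )
    return conn


def record_scan(conn: sqlite3.Connection, barcode: str) -> str:
    """Write the scan locally and return its ID. No network call happens here."""
    action_id = str(uuid.uuid4())
    with conn:  # commits on success, rolls back if the insert raises
        conn.execute(
            "INSERT INTO scans (action_id, barcode, scanned_at) VALUES (?, ?, ?)",
            (action_id, barcode, time.time()),
        )
    return action_id


def scans_for_ui(conn: sqlite3.Connection) -> list:
    """What the screen renders: local state, in insertion order, with a sync flag."""
    rows = conn.execute("SELECT barcode, synced FROM scans ORDER BY rowid").fetchall()
    return [(barcode, bool(synced)) for barcode, synced in rows]


def mark_synced(conn: sqlite3.Connection, action_id: str) -> None:
    """Called by the separate sync process once the server has acknowledged."""
    with conn:
        conn.execute("UPDATE scans SET synced = 1 WHERE action_id = ?", (action_id,))
```

The point of the split is that record_scan and scans_for_ui never touch the network. Only the sync process calls mark_synced, so the screen can always show what the driver did and whether the server has it yet.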

Every action has an ID generated client-side. Use UUIDs, generated on the device when the action is created and before it is sent anywhere. Because every retry carries the same ID, the server can recognize a repeat, so the same action can be retried indefinitely without creating duplicates server-side. It also means the action exists in the local store with a stable identity before the server has ever seen it.
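As a rough illustration, here is how a client-generated ID travels through retries. The Action class, the message shape, and the send callable are hypothetical; a real client would also persist the action and back off between attempts.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    kind: str
    payload: dict
    # Assigned on the device when the action is created, before any network call.
    action_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def sync_with_retry(action: Action, send: Callable[[dict], None], max_attempts: int = 5) -> bool:
    """Try to deliver the action. Every attempt carries the same action_id.

    Returns True once an attempt completes without a ConnectionError,
    False if all attempts fail (the action stays pending for a later run).
    """
    message = {"id": action.action_id, "kind": action.kind, "payload": action.payload}
    for _ in range(max_attempts):
        try:
            send(message)
            return True
        except ConnectionError:
            continue  # sketch only: a real client would wait before retrying
    return False
```

Note that a ConnectionError does not tell the client whether the server received the request. The acknowledgment may have been lost on the way back, which is exactly why the ID has to stay fixed across attempts.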

Sync is observable to the user, not hidden. A small indicator: "12 actions pending, last synced 4 minutes ago." Drivers find this reassuring. They want to know whether their work has been recorded. Hiding the sync state in the name of UX cleanliness is what leads to the "scanned, but not really" scenario.
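The indicator itself can be very small. A sketch of the formatting logic, where the function name sync_status and the exact wording for edge cases (never synced, synced under a minute ago) are our assumptions:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def sync_status(pending: int, last_synced: Optional[datetime], now: datetime) -> str:
    """Build the one-line status the driver sees."""
    noun = "action" if pending == 1 else "actions"
    if last_synced is None:
        return f"{pending} {noun} pending, not synced yet"
    minutes = int((now - last_synced).total_seconds() // 60)
    if minutes < 1:
        ago = "just now"
    elif minutes == 1:
        ago = "1 minute ago"
    else:
        ago = f"{minutes} minutes ago"
    return f"{pending} {noun} pending, last synced {ago}"


now = datetime(2024, 1, 1, 9, 14, tzinfo=timezone.utc)
print(sync_status(12, now - timedelta(minutes=4, seconds=30), now))
# prints: 12 actions pending, last synced 4 minutes ago
```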

Authentication tokens have long offline validity. Login once, get a token that survives a multi-day offline period. Refresh the token in the background whenever the network is available. Treat a network failure on token refresh as a non-event; only a token that has actually expired without refresh should force a re-login. This single change eliminates the most common day-start failure.
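The refresh rule can be sketched as follows. The seven-day OFFLINE_VALIDITY is a placeholder value, and Token, try_refresh, and fetch_new_token are hypothetical names; the right lifetime depends on your security requirements.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable

OFFLINE_VALIDITY = timedelta(days=7)  # placeholder, not a recommendation


@dataclass(frozen=True)
class Token:
    value: str
    expires_at: datetime


def needs_relogin(token: Token, now: datetime) -> bool:
    """Only a token that has actually expired forces the login screen."""
    return now >= token.expires_at


def try_refresh(token: Token, now: datetime, fetch_new_token: Callable[[str], str]) -> Token:
    """Opportunistic background refresh.

    fetch_new_token stands in for the call to the auth server. If the
    network is down, the failure is a non-event and the old token is kept.
    """
    try:
        return Token(fetch_new_token(token.value), now + OFFLINE_VALIDITY)
    except ConnectionError:
        return token
```

The design choice worth noticing is that try_refresh never raises on a network failure and never invalidates the current token. The only path to the login screen goes through needs_relogin.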

What this changes about the server side

Offline-first is usually framed as a client-side property. It also has consequences for the server.

Server endpoints have to be idempotent. The client will retry. The retries will happen days after the original. The original might or might not have been received. The endpoint has to handle "I already processed this action with this client-generated ID" gracefully, returning the same result rather than creating a duplicate or rejecting the request.
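A minimal in-memory sketch of that behavior. ScanEndpoint and its fields are hypothetical; in a real service the lookup and the write would be one atomic step, for example a unique constraint on the action ID in the database.

```python
class ScanEndpoint:
    """Handles scan actions keyed by the client-generated action ID."""

    def __init__(self) -> None:
        self._results = {}   # action_id -> response already returned
        self.packages = []   # side effect we must not duplicate

    def handle(self, action_id: str, barcode: str) -> dict:
        # A retry of something we already processed: return the same result,
        # do not repeat the side effect, do not reject the request.
        if action_id in self._results:
            return self._results[action_id]

        self.packages.append(barcode)
        result = {"status": "recorded", "action_id": action_id, "barcode": barcode}
        self._results[action_id] = result
        return result


endpoint = ScanEndpoint()
first = endpoint.handle("a1", "PKG-1")
retry = endpoint.handle("a1", "PKG-1")  # e.g. sent again days later
print(first == retry, len(endpoint.packages))
# prints: True 1
```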

Time on the server is not the same as time on the client. The action was performed at 9:14 AM driver-local-time, but the server is recording it at 11:47 AM after the driver came back into coverage. Every record needs both timestamps. Reports that conflate them will mislead the business.
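In code, this amounts to carrying two fields and never collapsing them. A sketch with assumed field names (performed_at from the device clock, received_at from the server clock), using the times from the example above and an arbitrary date:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ScanRecord:
    action_id: str
    performed_at: datetime  # device clock: when the driver did it
    received_at: datetime   # server clock: when the server first saw it

    @property
    def sync_delay(self) -> timedelta:
        """How long the action waited on the device before arriving."""
        return self.received_at - self.performed_at


record = ScanRecord(
    action_id="a1",
    performed_at=datetime(2024, 1, 1, 9, 14, tzinfo=timezone.utc),
    received_at=datetime(2024, 1, 1, 11, 47, tzinfo=timezone.utc),
)
print(record.sync_delay)
# prints: 2:33:00
```

A delivery-time report would read performed_at; a monitoring dashboard for sync health would read received_at and sync_delay. Device clocks can be wrong, which is one more reason to keep the server's timestamp alongside.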

The data shape has to tolerate eventually-consistent edits. A driver might mark a package delivered, then the supervisor might mark it as a return, before the driver's action has synced. The conflict resolution needs to be deliberate, not "last write wins by happenstance." A common rule is "driver action takes priority within the shift, supervisor action takes priority after the shift," but the right rule depends on the business.
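One way to make such a rule explicit is to write it as a single function that both edits pass through. This sketch assumes "within the shift" is judged by when the supervisor's edit was made relative to the shift end; that reading, and the Edit and resolve names, are ours, and your business rule may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Edit:
    actor: str             # "driver" or "supervisor"
    status: str            # e.g. "delivered", "return"
    performed_at: datetime


def resolve(driver_edit: Edit, supervisor_edit: Edit, shift_end: datetime) -> Edit:
    """Pick the winning edit for one package.

    Assumed rule: a supervisor edit made at or after the end of the shift
    wins; one made during the shift loses to the driver's action. Arrival
    order at the server plays no part in the decision.
    """
    if supervisor_edit.performed_at >= shift_end:
        return supervisor_edit
    return driver_edit


shift_end = datetime(2024, 1, 1, 17, 0, tzinfo=timezone.utc)
delivered = Edit("driver", "delivered", datetime(2024, 1, 1, 9, 14, tzinfo=timezone.utc))
returned = Edit("supervisor", "return", datetime(2024, 1, 1, 9, 30, tzinfo=timezone.utc))
print(resolve(delivered, returned, shift_end).status)
# prints: delivered
```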

The pragmatic test

If you are evaluating a logistics platform — buying or building — run the elevator test before anything else. Take the app, take the phone, close the elevator doors, do a normal task. See what happens.

The platforms that pass this test are usually fine. The platforms that fail it will fail it in production every day, and they will blame the network when they do.

