New improvements to mobile-use are out!

Several big improvements to mobile-use shipped this week. Below is a detailed overview!

Technical Insights

Performance

8 Sep 2025

10 min

Big reliability upgrades in mobile-use: cortex vision, better tools, cleaner prompts

We’ve shipped a batch of improvements that make mobile-use more reliable and much more transparent for users. Here’s exactly what changed this week!

1. Cortex now has full-time visual context

The cortex previously relied on the UI hierarchy and only triggered screen analysis when it thought it needed it. This led to assumptions and inconsistent decisions.

Now the screenshot feed is integrated directly into cortex at every step. With constant visual context, it pairs hierarchy data with what is actually on the screen.

The impact: higher accuracy, reliability, and precision!
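To make the idea concrete, here is a minimal sketch of a decision loop that captures both the hierarchy and a screenshot on every step, rather than only when the agent asks for one. All names here (`Observation`, `run_cortex`, and the `decide`, `dump_hierarchy` and `take_screenshot` callables) are hypothetical and chosen for illustration; they are not taken from the mobile-use codebase.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Observation:
    """What the decision step sees: hierarchy and pixels together."""
    hierarchy: str     # serialized UI hierarchy (hypothetical format)
    screenshot: bytes  # raw image bytes of the current screen


def run_cortex(
    decide: Callable[[Observation], Optional[str]],
    dump_hierarchy: Callable[[], str],
    take_screenshot: Callable[[], bytes],
    max_steps: int = 10,
) -> List[str]:
    """Run a decision loop where every step gets a fresh screenshot."""
    actions: List[str] = []
    for _ in range(max_steps):
        # The screenshot is captured unconditionally, not on demand.
        obs = Observation(dump_hierarchy(), take_screenshot())
        action = decide(obs)
        if action is None:  # the decision step signals it is done
            break
        actions.append(action)
    return actions
```

The point of the sketch is the unconditional `take_screenshot()` call: the decision step never has to guess whether the hierarchy still matches the screen.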

2. App launch tool now validates launches

Previously, launching an app would report “success” even if the app didn’t open. With no verification step, downstream tools would break.

We added a verification loop that confirms the app has actually launched.

You get:

Consistent launch behavior
No more silent app-start failures
A clear alert if the launch didn’t happen

Minitap now knows with certainty which app is actually running before continuing.
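A verification loop of this kind could look roughly like the sketch below. It assumes two hypothetical callables, `start_app` and `get_foreground_app`, standing in for whatever device bridge is in use; the function name, the retry count and the delay are illustrative choices, not the actual mobile-use implementation.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class LaunchResult:
    ok: bool
    message: str


def launch_and_verify(
    package: str,
    start_app: Callable[[str], None],
    get_foreground_app: Callable[[], Optional[str]],
    attempts: int = 5,
    delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> LaunchResult:
    """Start an app, then poll until it is really in the foreground."""
    start_app(package)
    seen: Optional[str] = None
    for _ in range(attempts):
        seen = get_foreground_app()
        if seen == package:
            return LaunchResult(True, f"{package} is in the foreground")
        sleep(delay_s)  # give the app time to come up before re-checking
    # Explicit failure instead of a silent "success".
    return LaunchResult(
        False,
        f"{package} did not reach the foreground after {attempts} checks "
        f"(last seen: {seen})",
    )
```

Returning a result object with a message, rather than a bare boolean, is what lets the caller raise a clear alert when the launch did not happen.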

3. Tap tool now exposes error feedback

The tap tool used to fail as a black box, making it hard to know why a tap didn’t work. Without error tracking, downstream agents blindly retried the same action over and over.

We added full error traceability: every tap failure now returns a clear, plainly worded reason, giving agents the context they need to adjust instead of repeating the same mistake.

This creates a stable memory of what went wrong, leading to cleaner follow-up reasoning, fewer useless retries, and much faster action.
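As an illustration, a tap tool that reports why it failed might be shaped like the sketch below. The `TapResult` type, the `find_bounds` and `send_tap` callables, and the specific failure messages are all assumptions made for this example; the real tool's return format may differ.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

Bounds = Tuple[int, int, int, int]  # left, top, right, bottom


@dataclass(frozen=True)
class TapResult:
    ok: bool
    target: str
    reason: Optional[str] = None  # filled in only when the tap fails


def tap(
    target: str,
    find_bounds: Callable[[str], Optional[Bounds]],
    send_tap: Callable[[int, int], None],
) -> TapResult:
    """Tap the center of an element, explaining any failure in words."""
    bounds = find_bounds(target)
    if bounds is None:
        return TapResult(False, target, f"no element matching '{target}' in the current hierarchy")
    left, top, right, bottom = bounds
    if right <= left or bottom <= top:
        return TapResult(False, target, f"element '{target}' has empty bounds {bounds}")
    try:
        send_tap((left + right) // 2, (top + bottom) // 2)
    except Exception as exc:  # surface device errors instead of hiding them
        return TapResult(False, target, f"device rejected the tap: {exc}")
    return TapResult(True, target)


def already_failed(history: Sequence[TapResult], target: str) -> bool:
    """Let an agent check its history before retrying the same tap."""
    return any(r.target == target and not r.ok for r in history)
```

Keeping failed results in a history list is one simple way to give an agent the memory it needs to try something different instead of retrying blindly.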

4. Prompt volume reduced by ~50%

We fully rewrote the prompt structure across all agents. This now allows faster reasoning, lower token usage, fewer contradictory instructions, and more consistent agent behaviour!

All prompts are now optimised and made easier for the model to interpret.

5. Swipe tool split into two simpler tools

The old swipe tool tried to handle every swipe scenario in a single tool. Its schema became overloaded, and the LLM regularly struggled to decide on the correct swipe type.

This is why swipes were one of the most common failures on mobile-use. We fixed this by dividing the tool into two separate swipe tools, each with a much simpler and clearer structure.

Because the LLM now has two well-defined options instead of one complicated one, it can choose the right swipe much more reliably. And since each tool is explicit about what type of swipe it performs, cortex no longer has to build swipe logic from scratch. It just calls the available tool that matches the intent, making the entire interaction more stable and predictable.
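The post does not say how the two tools are divided, so the sketch below assumes one plausible split purely for illustration: a direction-based swipe and a coordinate-based swipe. The function names, parameters and the `Swipe` type are hypothetical and are not the actual mobile-use tool definitions.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[int, int]


@dataclass(frozen=True)
class Swipe:
    start: Point
    end: Point


def swipe_direction(direction: str, width: int, height: int, fraction: float = 0.5) -> Swipe:
    """Assumed tool 1: swipe through the screen center in a named direction."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    cx, cy = width // 2, height // 2
    dx, dy = int(width * fraction / 2), int(height * fraction / 2)
    moves = {
        "up": Swipe((cx, cy + dy), (cx, cy - dy)),
        "down": Swipe((cx, cy - dy), (cx, cy + dy)),
        "left": Swipe((cx + dx, cy), (cx - dx, cy)),
        "right": Swipe((cx - dx, cy), (cx + dx, cy)),
    }
    if direction not in moves:
        raise ValueError(f"unknown direction: {direction!r}")
    return moves[direction]


def swipe_coordinates(start: Point, end: Point, width: int, height: int) -> Swipe:
    """Assumed tool 2: swipe between two explicit on-screen points."""
    for x, y in (start, end):
        if not (0 <= x < width and 0 <= y < height):
            raise ValueError(f"point {(x, y)} is outside the {width}x{height} screen")
    return Swipe(start, end)
```

Under this assumed split, each tool takes only the arguments it needs, so the model picks a tool by intent instead of filling in one large schema with mostly irrelevant fields.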

Mini-summary

Mobile-use is now more:

Reliable: fewer black-box failures
Transparent: error signals are now explicit and trackable
Accurate: cortex is now grounded in continuous screenshots
Efficient: leaner prompts, faster reasoning
Predictable: simpler tools

These upgrades make mobile-use much easier to build on, debug, and extend.

Try it now at https://github.com/minitap-ai/mobile-use
