We shipped some big improvements to mobile-use this week. Below is a detailed overview!


Big reliability upgrades in mobile-use: cortex vision, better tools, cleaner prompts
We’ve shipped a batch of improvements that make mobile-use more reliable and much more transparent for users. Here’s exactly what changed this week!
1. Cortex now has full-time visual context
The cortex previously relied on the UI hierarchy and only triggered screen analysis when it thought it needed it. This led to assumptions and inconsistent decisions.
Now the screenshot feed is integrated directly into cortex at all times. With constant visual context, cortex pairs hierarchy data with what’s actually on the screen at every step.
The impact: higher accuracy, reliability, and precision!
2. App launch tool now validates launches
Previously, launching an app would report “success” even if the app didn’t open. With no second verification, downstream tools would break.
We have now added a verification loop that confirms the app has actually launched.
The result: Minitap now knows with certainty which app is actually running before it continues.
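A verification loop of this kind can be sketched in a few lines of Python. This is only an illustration of the idea, not the mobile-use implementation: `launch_and_verify`, the `launch` and `get_foreground_app` callables, and the `attempts` and `delay_s` parameters are all hypothetical names chosen for the example.

```python
import time
from typing import Callable


def launch_and_verify(
    package: str,
    launch: Callable[[str], None],          # hypothetical: asks the device to open the app
    get_foreground_app: Callable[[], str],  # hypothetical: returns the package currently on screen
    attempts: int = 3,
    delay_s: float = 0.5,
) -> bool:
    """Launch `package`, then poll until it is in the foreground or attempts run out."""
    launch(package)
    for _ in range(attempts):
        if get_foreground_app() == package:
            return True  # the app is confirmed to be running in the foreground
        time.sleep(delay_s)  # give the device time to finish opening the app
    return False  # report failure instead of an unverified "success"
```

Returning `False` when the check never passes is the point of the sketch: the caller learns that the launch did not take effect instead of continuing on a false "success".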
3. Tap tool now exposes error feedback
The tap tool used to fail as a black box, making it hard to know why a tap didn’t work. Without error tracking, downstream agents blindly retried the same action over and over.
We added full error traceability: every tap failure now returns a clear, plainly worded reason, giving agents the context they need to adjust instead of repeating the same mistake.
This creates a stable memory of what went wrong, leading to clean follow-up reasoning, fewer useless retries, and way faster action.
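To make the idea concrete, here is a minimal Python sketch of a tap result that carries a readable failure reason, plus a retry check that refuses to repeat a failure it has already seen. Everything in it is an assumption made for illustration: `TapResult`, `tap`, `should_retry`, and the bounds check used as the example failure are not taken from the mobile-use codebase.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TapResult:
    ok: bool
    reason: Optional[str] = None  # human-readable failure reason, None on success


def tap(x: int, y: int, screen_width: int, screen_height: int) -> TapResult:
    """Hypothetical tap: fails with an explicit reason when the point is off screen."""
    if not (0 <= x < screen_width and 0 <= y < screen_height):
        return TapResult(
            False,
            f"coordinates ({x}, {y}) are outside the {screen_width}x{screen_height} screen",
        )
    return TapResult(True)


def should_retry(result: TapResult, history: List[str]) -> bool:
    """Allow a retry only for a failure reason that has not been recorded yet."""
    if result.ok or result.reason is None:
        return False
    if result.reason in history:
        return False  # the same failure was already seen, so the agent must change approach
    history.append(result.reason)
    return True
```

Keeping the reasons in a list gives the agent a small memory of what went wrong, so an identical failure leads to a different action rather than another blind retry.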
4. Prompt volume reduced by ~50%
We fully rewrote the prompt structure across all agents. This now allows faster reasoning, lower token usage, fewer contradictory instructions, and more consistent agent behaviour!
All prompts are now optimised and made easier for the model to interpret.
5. Swipe tool split into two simpler tools
The old swipe tool tried to handle every swipe scenario on its own. Its schema became overloaded, and the LLM regularly struggled to decide on the correct swipe type.
This is why swipes were one of the most common failures on mobile-use. We fixed this by dividing the tool into two separate swipe tools, each with a much simpler and clearer structure.
Because the LLM now has two well-defined options instead of one complicated one, it can choose the right swipe much more reliably. And since each tool is explicit about what type of swipe it performs, cortex no longer has to build swipe logic from scratch. It just calls the available tool that matches the intent, making the entire interaction more stable and predictable.
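The post does not say how the swipe tool was divided, so the following Python sketch assumes one plausible split purely for illustration: a direction-based swipe and a point-to-point swipe. The function names, parameters, and the `fraction` default are hypothetical and are not the actual mobile-use tools.

```python
from typing import Literal, Tuple

Point = Tuple[int, int]
Direction = Literal["up", "down", "left", "right"]


def swipe_by_direction(
    direction: Direction, width: int, height: int, fraction: float = 0.5
) -> Tuple[Point, Point]:
    """Assumed tool 1: swipe through the screen centre, covering `fraction` of the screen."""
    cx, cy = width // 2, height // 2
    dx, dy = int(width * fraction / 2), int(height * fraction / 2)
    # Each entry maps a direction to its (start, end) points.
    paths = {
        "up": ((cx, cy + dy), (cx, cy - dy)),
        "down": ((cx, cy - dy), (cx, cy + dy)),
        "left": ((cx + dx, cy), (cx - dx, cy)),
        "right": ((cx - dx, cy), (cx + dx, cy)),
    }
    return paths[direction]


def swipe_between_points(
    start: Point, end: Point, width: int, height: int
) -> Tuple[Point, Point]:
    """Assumed tool 2: swipe between two explicit points, rejecting off-screen input."""
    for x, y in (start, end):
        if not (0 <= x < width and 0 <= y < height):
            raise ValueError(f"point ({x}, {y}) is outside the {width}x{height} screen")
    return start, end
```

Under this assumption each tool needs only the arguments relevant to its own kind of swipe, which is what keeps each schema small and the choice between them unambiguous.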
Mini-summary
Mobile-use is now more:
These upgrades make mobile-use much easier to build on, debug, and extend.
Try it now at https://github.com/minitap-ai/mobile-use
