Jun 23, 2026 | Laboriosam aut et s

Dolor culpa saepe a

Omnis quia vero aut

In my last blog post, I showed you how to work with JFR files using DuckDB, which started a blog series that I surely will continue. Just not this week. Instead, I want to showcase a tiny app to run AI models using the MediaPipe API directly on your phone. I created the app for another purpose (perhaps described in a future blog post) earlier this year, but never wrote anything about it. So here we are.

TL;DR: I built an Android app that offers AI models via a server

The app is open-source and available on GitHub; it’s experimental, but maybe it can help you build your own apps. You can download it from the releases page of the repo and install it.

The LLM API endpoint, writing a poem on a backyard scene

The Android App

As already described, you can just download the app, but to fully use it, you need to install some AI models. For models like Google’s Gemma, which require authentication for download, you must click “…” to access the download link and then download the files from HuggingFace after agreeing to the license terms. After downloading, load the model file into the app using the “Load” button. The app can download other models directly. Please note that you may need to refresh the page manually. After installation, you can test the model directly with a basic prompt:

The app opens a port (typically 8005) and allows you to test its web endpoints directly. You can use it to capture images with the rear and front cameras and run object detection on them, using the EfficientDet Lite 2 model (not the best, but it’s small):

As you saw in the TL;DR section, you can also prompt the installed LLMs, using them, for example, for better on-device object detection:

Which leads to “slightly” better results than the EfficientDet Lite 2 model:

[
  {
    "object": "chair",
    "details": "woven wicker chair with a curved back and a metal frame. Covered in fallen leaves."
  },
  {
    "object": "table",
    "details": "wooden table, partially visible."
  },
  {
    "object": "leaves",
    "details": "Numerous fallen leaves, primarily yellow and brown, scattered on the ground."
  },
  {
    "object": "plants/vines",
    "details": "Green plants and vines growing on a wall or fence behind the chair and table."
  },
  {
    "object": "ground",
    "details": "Paved ground with a brick or stone pattern."
  }
]
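Since the LLM returns its detections as a JSON array of "object"/"details" pairs, a client can post-process them with a few lines of code. Here is a minimal Python sketch, assuming the response is valid JSON in exactly the shape shown above (an LLM does not guarantee that). The name `summarize_detections` is hypothetical and not part of the app.

```python
import json

# Sample response in the shape shown above, shortened to two entries.
SAMPLE_RESPONSE = """
[
  {"object": "chair", "details": "woven wicker chair with a curved back and a metal frame. Covered in fallen leaves."},
  {"object": "table", "details": "wooden table, partially visible."}
]
"""


def summarize_detections(raw: str) -> list:
    """Parse the JSON array and return one 'object: details' line per entry."""
    detections = json.loads(raw)
    lines = []
    for entry in detections:
        # Use .get() so a missing key does not raise; LLM output may be incomplete.
        name = entry.get("object", "unknown")
        details = entry.get("details", "")
        lines.append(f"{name}: {details}")
    return lines


if __name__ == "__main__":
    for line in summarize_detections(SAMPLE_RESPONSE):
        print(line)
```

If the model wraps its answer in extra text, `json.loads` raises a `json.JSONDecodeError`, so real client code would want to catch that.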

However, in defense of the smaller model, the LLM took roughly 40 times longer (46 seconds vs. 1.2 seconds).
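To double-check the slowdown, here is the arithmetic as a tiny Python sketch. It uses only the two timings quoted above; the variable names are just for illustration.

```python
# Timings quoted in the text, in seconds.
llm_seconds = 46.0
efficientdet_seconds = 1.2

# How many times longer the LLM took than EfficientDet Lite 2.
slowdown = llm_seconds / efficientdet_seconds

if __name__ == "__main__":
    print(f"LLM slowdown: {slowdown:.1f}x")
```

The exact ratio is a bit below 40, so the factor in the text is a rounded figure.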

Please note that, for privacy reasons, the app must be open and visible to capture images.

The app can also report the phone’s current orientation, but that API works much like the others.

Server Functionality

As I mentioned earlier, this app starts a server at port 8005, allowing you to easily access its AI capabilities from other apps and from the terminal, such as Termux or the Linux Terminal App.
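Reaching the server from another program looks roughly like this. It is a minimal Python sketch, assuming the server listens on localhost:8005 and answers plain HTTP GET requests on its root URL. The helper names `overview_url` and `fetch_overview` are hypothetical, and no specific endpoint paths are assumed.

```python
import urllib.error
import urllib.request
from typing import Optional


def overview_url(host: str = "localhost", port: int = 8005) -> str:
    """Build the root URL of the app's server (port 8005 is the app's usual port)."""
    return f"http://{host}:{port}/"


def fetch_overview(host: str = "localhost", port: int = 8005,
                   timeout: float = 5.0) -> Optional[str]:
    """Return the body of the root URL, or None if the server is not reachable."""
    try:
        with urllib.request.urlopen(overview_url(host, port), timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError):
        # The app is not running, the port differs, or the request timed out.
        return None


if __name__ == "__main__":
    body = fetch_overview()
    print(body if body is not None else "Server not reachable on port 8005")
```

Returning `None` instead of raising keeps the sketch usable in scripts that should simply skip the AI step when the app is not running.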

You can find all the available APIs and their request and response formats in the project’s README, but curl localhost:8005 also gives you an overview: