
The Problem & Vision

Why build a local-first AI persona app? The inspiration, motivation, and ethical considerations.

What We're Building

Ok, so most likely you skipped through that preface pretty quick. That's ok, that's fine; let's talk about what we're actually building.

This is a local macOS (sorry, Windows / Linux users) Tauri desktop application that should weigh in at ~150MB, letting you download and set up a local agent that knows how to interact with and search your iMessages. This is not sniffing around the Ollama space, and it is not meant to be a bring-your-own-local-model application. Beyond that basic functionality, it's also meant to give you better insight into your relationships with your peers, surface interesting texting patterns and analytics, and overall just be a cool tool for visualization and self-reflection. That's really what I want - for people to be able to answer questions about their lives in a safe and secure way, gain better insights, and have fun using this application.

Core Features

These are the features I want to highlight:

  • local-first: all data and processing happens on your Mac. no cloud, no external servers, messages stay private
  • iMessage integration: indexes and understands your iMessages, enabling advanced search, semantic queries, and natural language Q&A
  • convo analytics: visualize your texting patterns as the illustrious GitHub activity heatmap, message frequency, response times, and so on
  • casper agent: interact with an agent that can help you answer daily questions
  • performant: lightweight application, downloads weights on the fly, won't burn out your VRAM
  • secure: again, your data is your data
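To make the analytics bullet concrete: the heatmap piece mostly boils down to bucketing message timestamps by day. A minimal sketch, not the app's actual code (the function name and inputs are mine):

```python
from collections import Counter
from datetime import datetime, timezone

def daily_counts(timestamps):
    """Bucket message timestamps (unix seconds) into per-day counts,
    the raw input for a GitHub-style activity heatmap."""
    counts = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
        for ts in timestamps
    )
    return dict(counts)

# Two messages on the first day of the epoch, one the next day (25h later)
print(daily_counts([0, 3600, 90000]))  # → {'1970-01-01': 2, '1970-01-02': 1}
```

From there, message frequency and response-time stats are the same idea with different grouping keys.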

Architecture

System Diagram

Here I have embedded an Excalidraw diagram showing the overall system; I'll be doing this throughout.

We'll get into this a lot more later.

Key Design Principles

These were / are the basic tenets that I wanted to ensure survive for this project:

1. local first, no exceptions

Not local until we're using too big a model and then we fall back. Not local until RAG performance sucks and we reach out for OpenAI embeddings in the cloud. No, this application is truly all local. I have bricked my computer building this more times than I can count. I try to take some preemptive measures to protect against an OOM when your unified memory is entirely soaked by fine-tuning, but this project can still definitely do that.

2. your data doesn't leave your machine

This goes hand in hand with the point above, and I know my girlfriend is sick of me hammering on it in the marketing for this project, but your data never goes over the network. The only sockets in this project are unix domain sockets - there's no TCP, let alone TLS.
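For the curious, here's what IPC over a unix domain socket looks like, as a hedged Python sketch (the socket path and names are hypothetical; the app itself isn't Python). The key property: the socket is addressed by a filesystem path, so there's no IP address or port anywhere for traffic to escape through.

```python
import os
import socket
import tempfile
import threading

# hypothetical socket path; the real app would pick its own location
sock_path = os.path.join(tempfile.mkdtemp(), "casper.sock")

# AF_UNIX sockets are addressed by filesystem path: no IP, no port, no TLS
srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
srv.bind(sock_path)
srv.listen(1)

def serve_once():
    # accept a single connection and echo the request back
    conn, _ = srv.accept()
    with conn:
        conn.sendall(b"pong: " + conn.recv(1024))

t = threading.Thread(target=serve_once)
t.start()

with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as cli:
    cli.connect(sock_path)
    cli.sendall(b"ping")
    reply = cli.recv(1024)
t.join()
srv.close()

print(reply)  # → b'pong: ping'
```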

3. grounded responses from your data

Yeah yeah, I know: RAG is dead, and has been dying for about three years now. Long context!! 1M context for Opus 4.6 - yes. I get it and I know.

It's a hot-button topic in the engineering community, and there are many, many great posts about it. <- Those links are roughly chronological.

Regardless, I use vector embeddings and RAG. And I use them in a very fun way through DuckDB's VSS extension, which I'll be talking a lot about later on. Plus, this project has some other nifty bells and whistles: there's hybrid BM25 search plus a custom reranker! The practical upshot is that Casper actually references chunks of your iMessage data, so hallucinations should be minimal.
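To illustrate the hybrid idea (the reranker here is custom, so this is not the app's method), one common way to fuse a BM25 ranking with a vector ranking is reciprocal rank fusion. Chunk ids and orderings below are made up:

```python
def rrf_fuse(bm25_ranked, vector_ranked, k=60):
    """Reciprocal rank fusion: merge two best-first ranked lists of chunk ids.
    The constant k dampens how much the very top ranks dominate."""
    scores = {}
    for ranked in (bm25_ranked, vector_ranked):
        for rank, chunk_id in enumerate(ranked):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical chunk ids: keyword search and vector search partly disagree,
# but "c2" ranks high in both lists, so fusion surfaces it first.
bm25 = ["c1", "c2", "c3"]
vec = ["c2", "c4", "c1"]
print(rrf_fuse(bm25, vec))  # → ['c2', 'c1', 'c4', 'c3']
```

The nice property is that it works on ranks alone, so BM25 scores and cosine distances never need to be normalized onto the same scale.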

4. self-reflection over simulation

I am trying to show people insights, grounded in their data, that they might not have seen before. It's legitimately not that far off from what I am trying to do with Odyssey. With regards to fine-tuning on someone else's voice: that's not allowed. You can create a digital clone of yourself and chat with it, but you cannot do that for others. Consent needs to be guaranteed, and that is an engineering + design guarantee I have no way to provide.

And concretely: if Casper doesn't have data on something, it should say so - not play along. If you ever catch it doing otherwise, file a GitHub issue and I'll take a look.
