First commit

2026-03-09 16:09:28 +00:00
commit c3f8ba4cb6
7 changed files with 4538 additions and 0 deletions

1
.gitignore vendored Normal file

@@ -0,0 +1 @@
/target

2397
Cargo.lock generated Normal file

File diff suppressed because it is too large

13
Cargo.toml Normal file

@@ -0,0 +1,13 @@
[package]
name = "vector_explorer"
version = "0.1.0"
edition = "2024"

[dependencies]
anyhow = "1.0"
chrono = { version = "0.4", default-features = false, features = ["clock"] }
gtk4 = "0.10"
rfd = "0.15"
rusqlite = { version = "0.37", features = ["bundled"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"

150
README.md Normal file

@@ -0,0 +1,150 @@
# Vector Explorer
`vector_explorer` is a Rust + GTK desktop viewer for SQLite-backed vector stores. It is designed around OpenClaw's memory layout and can fall back to a generic adapter for similar databases such as NullClaw- or ZeroClaw-style stores that keep chunk text and embeddings in SQLite tables.
## Features
- Store overview with detected adapter, file count, chunk count, embedding dimensions, models, and inferred backend features such as FTS5 or `sqlite-vec`
- Filterable file list with a top-level reset entry to return to the full semantic map
- Rotatable 3D semantic map based on a lightweight PCA-style projection of embeddings
- Detail panel for the selected chunk or file
- OpenClaw-first schema detection with generic SQLite heuristics for other stores
## Supported store layouts
### OpenClaw
The first-class adapter targets the documented OpenClaw memory structure:
- `files` for indexed documents
- `chunks` for chunk text and stored embeddings
- `meta` for memory index metadata
- `chunks_fts` and `chunks_vec` as optional hybrid-search acceleration tables
The sample `main.sqlite` in this repository matches that layout.
### Generic SQLite vector stores
If the database is not recognized as OpenClaw, the app falls back to schema heuristics:
- Find a table with text/content columns
- Prefer tables that also expose `embedding` or `vector`
- Reconstruct file groupings from a path-like column or from the chunk table itself
That makes the viewer usable for adjacent SQLite-backed vector stores such as NullClaw- or ZeroClaw-style databases, as long as they expose chunk text and embeddings in a reasonably conventional schema.
## Rust dependencies
These are declared in [Cargo.toml](Cargo.toml):
- `anyhow`
- `chrono`
- `gtk4`
- `rfd`
- `rusqlite` with `bundled`
- `serde`
- `serde_json`
## System dependencies
### Required
- Rust toolchain with `cargo`
- GTK 4 development files and runtime
- `pkg-config`
- C/C++ build tooling appropriate for your platform
### WSL / Ubuntu packages
If you are building in WSL on an Ubuntu-like distro, install:
```bash
sudo apt update
sudo apt install -y build-essential pkg-config libgtk-4-dev
```
### Windows native build
If you are building natively on Windows instead of WSL, you need:
- Visual Studio Build Tools with the MSVC C++ toolchain
- GTK 4 SDK/runtime
- `pkg-config` configured to find the GTK installation
The project compiles more reliably in WSL than in a partially configured native Windows environment.
## How to compile
### WSL
```bash
cd /mnt/c/Users/Trude/Desktop/vector_explorer
cargo build
```
For an optimized build:
```bash
cd /mnt/c/Users/Trude/Desktop/vector_explorer
cargo build --release
```
### Windows native
```powershell
cd C:\Users\Trude\Desktop\vector_explorer
cargo build
```
For an optimized build:
```powershell
cd C:\Users\Trude\Desktop\vector_explorer
cargo build --release
```
## How to run
### WSL
```bash
cd /mnt/c/Users/Trude/Desktop/vector_explorer
cargo run -- main.sqlite
```
To open a different database:
```bash
cd /mnt/c/Users/Trude/Desktop/vector_explorer
cargo run -- /path/to/other.sqlite
```
### Windows native
```powershell
cd C:\Users\Trude\Desktop\vector_explorer
cargo run -- .\main.sqlite
```
## Notes for WSL graphics
- The app is GTK-based and needs a working GUI/display path in WSL.
- If WSLg or X/Wayland forwarding is not available, the binary can compile successfully but fail to open a display at runtime.
- Unless `GSK_RENDERER` is already set in the environment, the app defaults it to `cairo` to avoid the flaky EGL/Zink GPU paths commonly seen on WSL.
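To experiment with other render paths, the variable can be set per invocation. The renderer names are GTK's own (`cairo`, `gl`); whether the GL path works depends on your WSLg/EGL stack:

```bash
# Keep the software renderer explicitly (same as the app's default):
GSK_RENDERER=cairo cargo run -- main.sqlite
# Try the GL renderer instead; expect failures on broken EGL setups:
GSK_RENDERER=gl cargo run -- main.sqlite
```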
## Development workflow
Useful commands:
```bash
cargo fmt
cargo check
cargo build
```
## Repository contents
- [src/main.rs](src/main.rs): GTK application and UI behavior
- [src/store.rs](src/store.rs): SQLite schema detection and adapter loading
- [src/projection.rs](src/projection.rs): embedding projection for the semantic map
- [main.sqlite](main.sqlite): sample OpenClaw-style memory database

990
src/main.rs Normal file

@@ -0,0 +1,990 @@
mod projection;
mod store;
use std::cell::RefCell;
use std::path::{Path, PathBuf};
use std::rc::Rc;
use chrono::{DateTime, Utc};
use gtk::cairo::Context;
use gtk::gio;
use gtk::prelude::*;
use gtk::{
Align, Application, ApplicationWindow, Box as GtkBox, Button, DrawingArea, Entry, Frame,
GestureClick, HeaderBar, Label, ListBox, Orientation, Paned, PolicyType, Scale, ScrolledWindow,
SelectionMode, TextView, WrapMode,
};
use gtk4 as gtk;
use crate::projection::ProjectedPoint;
use crate::store::{ChunkRecord, FileRecord, LoadedStore, detect_and_load};
#[derive(Clone)]
struct AppWidgets {
window: ApplicationWindow,
open_button: Button,
clear_button: Button,
path_label: Label,
adapter_label: Label,
stats_label: Label,
models_label: Label,
backend_label: Label,
notes_label: Label,
filter_entry: Entry,
file_list: ListBox,
yaw_scale: Scale,
pitch_scale: Scale,
plot_area: DrawingArea,
selection_label: Label,
meta_label: Label,
chunk_view: TextView,
tables_label: Label,
}
struct AppState {
store: Option<LoadedStore>,
selected_file: Option<String>,
selected_chunk_id: Option<String>,
filter_text: String,
yaw_deg: f64,
pitch_deg: f64,
}
impl Default for AppState {
fn default() -> Self {
Self {
store: None,
selected_file: None,
selected_chunk_id: None,
filter_text: String::new(),
yaw_deg: 28.0,
pitch_deg: -18.0,
}
}
}
struct Runtime {
widgets: AppWidgets,
state: Rc<RefCell<AppState>>,
}
thread_local! {
static RUNTIME: RefCell<Option<Rc<Runtime>>> = const { RefCell::new(None) };
}
fn main() {
if std::env::var_os("GSK_RENDERER").is_none() {
// The viewer draws with Cairo; forcing the Cairo renderer avoids flaky EGL/GPU paths on WSL.
unsafe {
std::env::set_var("GSK_RENDERER", "cairo");
}
}
let application = Application::builder()
.application_id("ai.openclaw.vector-explorer")
.flags(gio::ApplicationFlags::HANDLES_OPEN)
.build();
application.connect_activate(|application| {
let runtime = ensure_runtime(application);
if runtime.state.borrow().store.is_none() {
if let Some(start_path) = startup_path() {
load_database(&runtime.widgets, &runtime.state, &start_path);
} else {
runtime
.widgets
.path_label
.set_text("Open a SQLite memory file to begin.");
}
}
runtime.widgets.window.present();
});
application.connect_open(|application, files, _| {
let runtime = ensure_runtime(application);
if let Some(path) = files.first().and_then(|file| file.path()) {
load_database(&runtime.widgets, &runtime.state, &path);
}
runtime.widgets.window.present();
});
application.run();
}
fn ensure_runtime(application: &Application) -> Rc<Runtime> {
RUNTIME.with(|runtime| {
if let Some(existing) = runtime.borrow().as_ref() {
return existing.clone();
}
let widgets = build_widgets(application);
let state = Rc::new(RefCell::new(AppState::default()));
wire_events(&widgets, &state);
let created = Rc::new(Runtime { widgets, state });
runtime.replace(Some(created.clone()));
created
})
}
fn build_widgets(application: &Application) -> AppWidgets {
let window = ApplicationWindow::builder()
.application(application)
.title("Vector Explorer")
.default_width(1440)
.default_height(920)
.build();
let open_button = Button::with_label("Open Database");
let clear_button = Button::with_label("Clear File Filter");
let filter_entry = Entry::builder()
.placeholder_text("Filter by path, source, model, or chunk text")
.hexpand(true)
.build();
let header = HeaderBar::builder().show_title_buttons(true).build();
header.pack_start(&open_button);
header.pack_start(&clear_button);
header.pack_end(&filter_entry);
window.set_titlebar(Some(&header));
let root = GtkBox::new(Orientation::Vertical, 12);
root.set_margin_top(12);
root.set_margin_bottom(12);
root.set_margin_start(12);
root.set_margin_end(12);
let path_label = Label::builder()
.xalign(0.0)
.wrap(true)
.selectable(true)
.build();
let adapter_label = Label::builder().xalign(0.0).build();
let stats_label = Label::builder().xalign(0.0).build();
let models_label = Label::builder().xalign(0.0).wrap(true).build();
let backend_label = Label::builder().xalign(0.0).wrap(true).build();
let notes_label = Label::builder().xalign(0.0).wrap(true).build();
let overview_frame = Frame::builder().label("Overview").build();
let overview_box = GtkBox::new(Orientation::Vertical, 8);
overview_box.append(&path_label);
overview_box.append(&adapter_label);
overview_box.append(&stats_label);
overview_box.append(&models_label);
overview_box.append(&backend_label);
overview_box.append(&notes_label);
overview_frame.set_child(Some(&overview_box));
root.append(&overview_frame);
let file_list = ListBox::new();
file_list.set_selection_mode(SelectionMode::None);
let file_scroller = ScrolledWindow::builder()
.hscrollbar_policy(PolicyType::Never)
.min_content_width(300)
.build();
file_scroller.set_child(Some(&file_list));
let files_frame = Frame::builder().label("Files").build();
files_frame.set_child(Some(&file_scroller));
let yaw_scale = Scale::with_range(Orientation::Horizontal, -180.0, 180.0, 1.0);
yaw_scale.set_value(28.0);
yaw_scale.set_hexpand(true);
let pitch_scale = Scale::with_range(Orientation::Horizontal, -75.0, 75.0, 1.0);
pitch_scale.set_value(-18.0);
pitch_scale.set_hexpand(true);
let plot_area = DrawingArea::builder()
.content_width(800)
.content_height(640)
.hexpand(true)
.vexpand(true)
.build();
let controls_box = GtkBox::new(Orientation::Horizontal, 8);
controls_box.append(&Label::builder().label("Yaw").build());
controls_box.append(&yaw_scale);
controls_box.append(&Label::builder().label("Pitch").build());
controls_box.append(&pitch_scale);
let plot_panel = GtkBox::new(Orientation::Vertical, 8);
plot_panel.append(&controls_box);
plot_panel.append(&plot_area);
let plot_frame = Frame::builder().label("Semantic Map").build();
plot_frame.set_child(Some(&plot_panel));
let selection_label = Label::builder()
.xalign(0.0)
.wrap(true)
.selectable(true)
.build();
let meta_label = Label::builder()
.xalign(0.0)
.wrap(true)
.selectable(true)
.build();
let chunk_view = TextView::builder()
.editable(false)
.monospace(true)
.wrap_mode(WrapMode::WordChar)
.vexpand(true)
.build();
let chunk_scroller = ScrolledWindow::builder().min_content_width(360).build();
chunk_scroller.set_child(Some(&chunk_view));
let tables_label = Label::builder()
.xalign(0.0)
.wrap(true)
.selectable(true)
.build();
let detail_box = GtkBox::new(Orientation::Vertical, 8);
detail_box.append(&selection_label);
detail_box.append(&meta_label);
detail_box.append(&chunk_scroller);
detail_box.append(&tables_label);
let detail_frame = Frame::builder().label("Selection").build();
detail_frame.set_child(Some(&detail_box));
let center_pane = Paned::builder()
.orientation(Orientation::Horizontal)
.start_child(&plot_frame)
.end_child(&detail_frame)
.wide_handle(true)
.shrink_start_child(false)
.build();
center_pane.set_position(900);
let main_pane = Paned::builder()
.orientation(Orientation::Horizontal)
.start_child(&files_frame)
.end_child(&center_pane)
.wide_handle(true)
.build();
main_pane.set_position(330);
root.append(&main_pane);
window.set_child(Some(&root));
open_button.set_valign(Align::Center);
clear_button.set_valign(Align::Center);
let widgets = AppWidgets {
window,
open_button,
clear_button,
path_label,
adapter_label,
stats_label,
models_label,
backend_label,
notes_label,
filter_entry,
file_list,
yaw_scale,
pitch_scale,
plot_area,
selection_label,
meta_label,
chunk_view,
tables_label,
};
widgets
}
fn wire_events(widgets: &AppWidgets, state: &Rc<RefCell<AppState>>) {
{
let widgets = widgets.clone();
let open_button = widgets.open_button.clone();
let state = Rc::clone(state);
open_button.connect_clicked(move |_| {
if let Some(path) = rfd::FileDialog::new()
.add_filter("SQLite database", &["sqlite", "db"])
.set_directory(current_directory())
.pick_file()
{
load_database(&widgets, &state, &path);
}
});
}
{
let widgets = widgets.clone();
let clear_button = widgets.clear_button.clone();
let state = Rc::clone(state);
clear_button.connect_clicked(move |_| {
state.borrow_mut().selected_file = None;
let state_ref = state.borrow();
refresh_files(&widgets, &state_ref, &state);
refresh_selection(&widgets, &state_ref);
widgets.plot_area.queue_draw();
});
}
{
let widgets = widgets.clone();
let filter_entry = widgets.filter_entry.clone();
let state = Rc::clone(state);
filter_entry.connect_changed(move |entry| {
state.borrow_mut().filter_text = entry.text().to_string();
let state_ref = state.borrow();
refresh_files(&widgets, &state_ref, &state);
refresh_selection(&widgets, &state_ref);
widgets.plot_area.queue_draw();
});
}
{
let widgets = widgets.clone();
let state = Rc::clone(state);
let yaw_scale = widgets.yaw_scale.clone();
yaw_scale.connect_value_changed(move |scale| {
state.borrow_mut().yaw_deg = scale.value();
widgets.plot_area.queue_draw();
});
}
{
let widgets = widgets.clone();
let state = Rc::clone(state);
let pitch_scale = widgets.pitch_scale.clone();
pitch_scale.connect_value_changed(move |scale| {
state.borrow_mut().pitch_deg = scale.value();
widgets.plot_area.queue_draw();
});
}
{
let widgets = widgets.clone();
let state = Rc::clone(state);
widgets
.plot_area
.set_draw_func(move |_, context, width, height| {
draw_plot(context, width, height, &state.borrow());
});
}
{
let widgets = widgets.clone();
let state = Rc::clone(state);
let plot_area = widgets.plot_area.clone();
let click = GestureClick::new();
click.connect_pressed(move |_, _, x, y| {
let width = widgets.plot_area.width() as f64;
let height = widgets.plot_area.height() as f64;
let chunk_id = {
let state_ref = state.borrow();
nearest_chunk_at(&state_ref, x, y, width, height, 12.0)
};
if let Some(chunk_id) = chunk_id {
let mut state_ref = state.borrow_mut();
state_ref.selected_chunk_id = Some(chunk_id.clone());
drop(state_ref);
let state_ref = state.borrow();
refresh_files(&widgets, &state_ref, &state);
refresh_selection(&widgets, &state_ref);
widgets.plot_area.queue_draw();
}
});
plot_area.add_controller(click);
}
}
fn load_database(widgets: &AppWidgets, state: &Rc<RefCell<AppState>>, path: &Path) {
match detect_and_load(path) {
Ok(store) => {
widgets.filter_entry.set_text("");
{
let mut state_ref = state.borrow_mut();
state_ref.filter_text.clear();
state_ref.store = Some(store);
state_ref.selected_file = None;
let selected_chunk = state_ref
.store
.as_ref()
.and_then(|store| store.chunks.first().map(|chunk| chunk.id.clone()));
state_ref.selected_chunk_id = selected_chunk;
}
let state_ref = state.borrow();
refresh_all(widgets, &state_ref, state);
}
Err(error) => {
let mut state_ref = state.borrow_mut();
state_ref.store = None;
state_ref.selected_file = None;
state_ref.selected_chunk_id = None;
drop(state_ref);
widgets.path_label.set_text(&format!(
"Failed to open {}: {error:#}",
path.to_string_lossy()
));
widgets.adapter_label.set_text("");
widgets.stats_label.set_text("");
widgets.models_label.set_text("");
widgets.backend_label.set_text("");
widgets.notes_label.set_text("");
widgets.selection_label.set_text("");
widgets.meta_label.set_text("");
widgets.tables_label.set_text("");
widgets.chunk_view.buffer().set_text("");
clear_list_box(&widgets.file_list);
widgets.plot_area.queue_draw();
}
}
}
fn refresh_all(widgets: &AppWidgets, state: &AppState, shared_state: &Rc<RefCell<AppState>>) {
refresh_overview(widgets, state);
refresh_files(widgets, state, shared_state);
refresh_selection(widgets, state);
widgets.plot_area.queue_draw();
}
fn refresh_overview(widgets: &AppWidgets, state: &AppState) {
if let Some(store) = state.store.as_ref() {
widgets
.path_label
.set_text(&format!("Database: {}", store.db_path.display()));
widgets
.adapter_label
.set_text(&format!("Adapter: {}", store.adapter_name));
widgets.stats_label.set_text(&format!(
"Files: {} Chunks: {} Embeddings: {} Dims: {}",
store.metrics.total_files,
store.metrics.total_chunks,
store.metrics.embedding_rows,
store
.metrics
.embedding_dims
.map(|dims| dims.to_string())
.unwrap_or_else(|| "unknown".to_string())
));
widgets.models_label.set_text(&format!(
"Models: {}",
join_or_unknown(&store.metrics.models)
));
widgets.backend_label.set_text(&format!(
"Backend: {} Sources: {}",
store
.metrics
.vector_backend
.clone()
.unwrap_or_else(|| "not inferred".to_string()),
join_or_unknown(&store.metrics.sources)
));
widgets
.notes_label
.set_text(&format!("Notes: {}", store.notes.join(" ")));
}
}
fn refresh_files(widgets: &AppWidgets, state: &AppState, shared_state: &Rc<RefCell<AppState>>) {
clear_list_box(&widgets.file_list);
let reset_title = Label::builder()
.xalign(0.0)
.wrap(true)
.selectable(false)
.label(if state.selected_file.is_none() {
"Showing all chunks"
} else {
"Show all chunks"
})
.build();
let reset_detail = Label::builder()
.xalign(0.0)
.label("Reset file focus and return to the full semantic map")
.build();
let reset_box = GtkBox::new(Orientation::Vertical, 4);
reset_box.append(&reset_title);
reset_box.append(&reset_detail);
reset_box.set_margin_top(8);
reset_box.set_margin_bottom(8);
reset_box.set_margin_start(8);
reset_box.set_margin_end(8);
let reset_widgets = widgets.clone();
let reset_state = Rc::clone(shared_state);
let reset_button = Button::new();
reset_button.set_halign(Align::Fill);
reset_button.set_hexpand(true);
reset_button.add_css_class("flat");
reset_button.set_child(Some(&reset_box));
reset_button.connect_clicked(move |_| {
{
let mut state_ref = reset_state.borrow_mut();
state_ref.selected_file = None;
}
let state_ref = reset_state.borrow();
refresh_files(&reset_widgets, &state_ref, &reset_state);
refresh_selection(&reset_widgets, &state_ref);
reset_widgets.plot_area.queue_draw();
});
widgets.file_list.append(&reset_button);
for file in filtered_files(state) {
let title_text = if state.selected_file.as_deref() == Some(file.path.as_str()) {
format!("Focused: {}", file.path)
} else {
file.path.clone()
};
let title = Label::builder()
.xalign(0.0)
.wrap(true)
.selectable(false)
.label(&title_text)
.build();
let detail = Label::builder()
.xalign(0.0)
.label(&format!(
"{} chunks {} {}",
file.chunk_count,
file.size
.map(|size| format!("{} bytes", size))
.unwrap_or_else(|| "size unknown".to_string()),
file.mtime_ms
.map(format_timestamp_ms)
.unwrap_or_else(|| "time unknown".to_string())
))
.build();
let path = file.path.clone();
let widgets_for_click = widgets.clone();
let state_for_click = Rc::clone(shared_state);
let row_box = GtkBox::new(Orientation::Vertical, 4);
row_box.append(&title);
row_box.append(&detail);
row_box.set_margin_top(8);
row_box.set_margin_bottom(8);
row_box.set_margin_start(8);
row_box.set_margin_end(8);
let row_button = Button::new();
row_button.set_halign(Align::Fill);
row_button.set_hexpand(true);
row_button.add_css_class("flat");
row_button.set_child(Some(&row_box));
row_button.connect_clicked(move |_| {
{
let mut state_ref = state_for_click.borrow_mut();
state_ref.selected_file = Some(path.clone());
let selected_chunk = state_ref.store.as_ref().and_then(|store| {
store
.chunks
.iter()
.find(|chunk| chunk.path == path)
.map(|chunk| chunk.id.clone())
});
state_ref.selected_chunk_id = selected_chunk;
}
let state_ref = state_for_click.borrow();
refresh_files(&widgets_for_click, &state_ref, &state_for_click);
refresh_selection(&widgets_for_click, &state_ref);
widgets_for_click.plot_area.queue_draw();
});
widgets.file_list.append(&row_button);
}
}
fn refresh_selection(widgets: &AppWidgets, state: &AppState) {
let Some(store) = state.store.as_ref() else {
widgets.selection_label.set_text("");
widgets.meta_label.set_text("");
widgets.tables_label.set_text("");
widgets.chunk_view.buffer().set_text("");
return;
};
if let Some(chunk) = selected_chunk(state) {
widgets.selection_label.set_text(&format!(
"{}\n{}",
chunk.path,
preview_text(&chunk.text, 180)
));
widgets.meta_label.set_text(&format!(
"Chunk: {}\nSource: {}\nModel: {}\nLines: {}\nUpdated: {}\nEmbedding dims: {}",
chunk.id,
chunk
.source
.clone()
.unwrap_or_else(|| "unknown".to_string()),
chunk.model.clone().unwrap_or_else(|| "unknown".to_string()),
format_line_range(chunk),
chunk
.updated_at_ms
.map(format_timestamp_ms)
.unwrap_or_else(|| "unknown".to_string()),
chunk
.embedding
.as_ref()
.map(|embedding| embedding.len().to_string())
.unwrap_or_else(|| "none".to_string())
));
widgets.chunk_view.buffer().set_text(&chunk.text);
} else if let Some(file) = selected_file(state) {
widgets.selection_label.set_text(&file.path);
widgets.meta_label.set_text(&format!(
"Source: {}\nChunks: {}\nSize: {}\nUpdated: {}",
file.source.clone().unwrap_or_else(|| "unknown".to_string()),
file.chunk_count,
file.size
.map(|size| format!("{} bytes", size))
.unwrap_or_else(|| "unknown".to_string()),
file.mtime_ms
.map(format_timestamp_ms)
.unwrap_or_else(|| "unknown".to_string())
));
widgets.chunk_view.buffer().set_text(
&store
.chunks
.iter()
.filter(|chunk| chunk.path == file.path)
.map(|chunk| preview_text(&chunk.text, 160))
.take(3)
.collect::<Vec<_>>()
.join("\n\n"),
);
} else {
widgets.selection_label.set_text("Nothing selected.");
widgets.meta_label.set_text("");
widgets.chunk_view.buffer().set_text("");
}
let table_summary = store
.tables
.iter()
.take(10)
.map(|table| {
format!(
"{} ({}) [{}]",
table.name,
table
.row_count
.map(|count| count.to_string())
.unwrap_or_else(|| "unreadable".to_string()),
table.columns.join(", ")
)
})
.collect::<Vec<_>>()
.join("\n");
widgets
.tables_label
.set_text(&format!("Tables:\n{}", table_summary));
}
fn filtered_files(state: &AppState) -> Vec<FileRecord> {
let Some(store) = state.store.as_ref() else {
return Vec::new();
};
let filter = state.filter_text.to_ascii_lowercase();
store
.files
.iter()
.filter(|file| {
filter.is_empty()
|| file.path.to_ascii_lowercase().contains(&filter)
|| file
.source
.as_ref()
.is_some_and(|source| source.to_ascii_lowercase().contains(&filter))
|| store
.chunks
.iter()
.any(|chunk| chunk.path == file.path && chunk_matches_filter(chunk, &filter))
})
.cloned()
.collect()
}
fn visible_points<'a>(state: &'a AppState) -> Vec<(&'a ChunkRecord, &'a ProjectedPoint)> {
let Some(store) = state.store.as_ref() else {
return Vec::new();
};
let filter = state.filter_text.to_ascii_lowercase();
store
.chunks
.iter()
.filter(|chunk| {
state
.selected_file
.as_ref()
.is_none_or(|selected_file| &chunk.path == selected_file)
&& (filter.is_empty() || chunk_matches_filter(chunk, &filter))
})
.filter_map(|chunk| {
store
.points
.iter()
.find(|point| point.chunk_id == chunk.id)
.map(|point| (chunk, point))
})
.collect()
}
#[derive(Clone, Copy)]
struct RenderedPoint<'a> {
chunk: &'a ChunkRecord,
point: &'a ProjectedPoint,
canvas_x: f64,
canvas_y: f64,
depth: f64,
radius: f64,
}
fn rendered_points<'a>(state: &'a AppState, width: f64, height: f64) -> Vec<RenderedPoint<'a>> {
let mut rendered = visible_points(state)
.into_iter()
.map(|(chunk, point)| {
let (view_x, view_y, depth, scale) = rotated_view(point, state);
let (canvas_x, canvas_y) = point_to_canvas(view_x, view_y, width, height);
RenderedPoint {
chunk,
point,
canvas_x,
canvas_y,
depth,
radius: 3.5 + (depth + 1.0) * 2.2 * scale,
}
})
.collect::<Vec<_>>();
rendered.sort_by(|left, right| {
left.depth
.partial_cmp(&right.depth)
.unwrap_or(std::cmp::Ordering::Equal)
});
rendered
}
fn draw_plot(context: &Context, width: i32, height: i32, state: &AppState) {
context.set_source_rgb(0.97, 0.97, 0.95);
let _ = context.paint();
if state.store.is_none() {
context.set_source_rgb(0.20, 0.20, 0.20);
context.move_to(30.0, 40.0);
let _ = context.show_text("Open a SQLite vector store to draw its chunk map.");
return;
}
let points = rendered_points(state, width as f64, height as f64);
let margin = 48.0;
let plot_width = (width as f64 - margin * 2.0).max(1.0);
let plot_height = (height as f64 - margin * 2.0).max(1.0);
context.set_source_rgb(0.88, 0.88, 0.86);
for step in 0..=4 {
let ratio = step as f64 / 4.0;
let x = margin + plot_width * ratio;
let y = margin + plot_height * ratio;
context.move_to(x, margin);
context.line_to(x, margin + plot_height);
context.move_to(margin, y);
context.line_to(margin + plot_width, y);
}
let _ = context.stroke();
context.set_source_rgb(0.25, 0.25, 0.25);
draw_axis_guides(context, width as f64, height as f64, state);
let selected_id = state.selected_chunk_id.as_deref();
let semantic = points.iter().any(|point| point.point.from_embeddings);
context.move_to(margin, 24.0);
let _ = context.show_text(if semantic {
"3D semantic view from embeddings"
} else {
"3D fallback layout because embeddings were unavailable"
});
for (index, rendered) in points.iter().enumerate() {
let radius = if selected_id == Some(rendered.chunk.id.as_str()) {
rendered.radius + 3.0
} else {
rendered.radius
};
let color = color_for_index(index, points.len().max(1), rendered.depth);
context.set_source_rgba(
color.0,
color.1,
color.2,
0.55 + ((rendered.depth + 1.0) / 2.0) * 0.4,
);
context.arc(
rendered.canvas_x,
rendered.canvas_y,
radius,
0.0,
std::f64::consts::TAU,
);
let _ = context.fill();
}
}
fn nearest_chunk_at(
state: &AppState,
x: f64,
y: f64,
width: f64,
height: f64,
max_distance: f64,
) -> Option<String> {
rendered_points(state, width, height)
.into_iter()
.map(|rendered| {
let dx = rendered.canvas_x - x;
let dy = rendered.canvas_y - y;
(rendered.chunk.id.clone(), (dx * dx + dy * dy).sqrt())
})
.filter(|(_, distance)| *distance <= max_distance)
.min_by(|left, right| {
left.1
.partial_cmp(&right.1)
.unwrap_or(std::cmp::Ordering::Equal)
})
.map(|(chunk_id, _)| chunk_id)
}
fn point_to_canvas(view_x: f64, view_y: f64, width: f64, height: f64) -> (f64, f64) {
let margin = 48.0;
let plot_width = (width - margin * 2.0).max(1.0);
let plot_height = (height - margin * 2.0).max(1.0);
let x = margin + ((view_x + 1.0) / 2.0) * plot_width;
let y = margin + (1.0 - (view_y + 1.0) / 2.0) * plot_height;
(x, y)
}
fn rotated_view(point: &ProjectedPoint, state: &AppState) -> (f64, f64, f64, f64) {
let yaw = state.yaw_deg.to_radians();
let pitch = state.pitch_deg.to_radians();
let x = point.x as f64;
let y = point.y as f64;
let z = point.z as f64;
let yaw_x = x * yaw.cos() + z * yaw.sin();
let yaw_z = -x * yaw.sin() + z * yaw.cos();
let pitch_y = y * pitch.cos() - yaw_z * pitch.sin();
let pitch_z = y * pitch.sin() + yaw_z * pitch.cos();
let camera_distance = 3.2;
let perspective = camera_distance / (camera_distance - pitch_z * 0.9);
let view_x = (yaw_x * perspective).clamp(-1.2, 1.2);
let view_y = (pitch_y * perspective).clamp(-1.2, 1.2);
(view_x, view_y, pitch_z.clamp(-1.0, 1.0), perspective)
}
fn draw_axis_guides(context: &Context, width: f64, height: f64, state: &AppState) {
let axes = [
("X", 1.0_f32, 0.0_f32, 0.0_f32),
("Y", 0.0_f32, 1.0_f32, 0.0_f32),
("Z", 0.0_f32, 0.0_f32, 1.0_f32),
];
let origin = ProjectedPoint {
chunk_id: String::new(),
x: 0.0,
y: 0.0,
z: 0.0,
from_embeddings: true,
};
let (origin_x, origin_y, _, _) = rotated_view(&origin, state);
let (origin_x, origin_y) = point_to_canvas(origin_x, origin_y, width, height);
context.set_source_rgba(0.20, 0.20, 0.20, 0.7);
for (label, x, y, z) in axes {
let axis_point = ProjectedPoint {
chunk_id: String::new(),
x,
y,
z,
from_embeddings: true,
};
let (axis_x, axis_y, _, _) = rotated_view(&axis_point, state);
let (axis_x, axis_y) = point_to_canvas(axis_x, axis_y, width, height);
context.move_to(origin_x, origin_y);
context.line_to(axis_x, axis_y);
let _ = context.stroke();
context.move_to(axis_x + 4.0, axis_y - 4.0);
let _ = context.show_text(label);
}
}
fn selected_chunk<'a>(state: &'a AppState) -> Option<&'a ChunkRecord> {
let store = state.store.as_ref()?;
let selected_id = state.selected_chunk_id.as_ref()?;
store.chunks.iter().find(|chunk| &chunk.id == selected_id)
}
fn selected_file<'a>(state: &'a AppState) -> Option<&'a FileRecord> {
let store = state.store.as_ref()?;
let selected_path = state.selected_file.as_ref()?;
store.files.iter().find(|file| &file.path == selected_path)
}
fn chunk_matches_filter(chunk: &ChunkRecord, filter: &str) -> bool {
chunk.path.to_ascii_lowercase().contains(filter)
|| chunk.text.to_ascii_lowercase().contains(filter)
|| chunk
.model
.as_ref()
.is_some_and(|model| model.to_ascii_lowercase().contains(filter))
|| chunk
.source
.as_ref()
.is_some_and(|source| source.to_ascii_lowercase().contains(filter))
}
fn clear_list_box(list_box: &ListBox) {
while let Some(child) = list_box.first_child() {
list_box.remove(&child);
}
}
fn preview_text(text: &str, max_chars: usize) -> String {
let compact = text.replace("\r\n", "\n");
let trimmed = compact.trim();
if trimmed.chars().count() <= max_chars {
trimmed.to_string()
} else {
format!("{}...", trimmed.chars().take(max_chars).collect::<String>())
}
}
fn format_line_range(chunk: &ChunkRecord) -> String {
match (chunk.start_line, chunk.end_line) {
(Some(start), Some(end)) if start == end => start.to_string(),
(Some(start), Some(end)) => format!("{start}-{end}"),
_ => "unknown".to_string(),
}
}
fn format_timestamp_ms(value: i64) -> String {
DateTime::<Utc>::from_timestamp_millis(value)
.map(|time| time.format("%Y-%m-%d %H:%M:%S UTC").to_string())
.unwrap_or_else(|| value.to_string())
}
fn join_or_unknown(values: &[String]) -> String {
if values.is_empty() {
"unknown".to_string()
} else {
values.join(", ")
}
}
fn color_for_index(index: usize, total: usize, depth: f64) -> (f64, f64, f64) {
let hue = (index as f64 / total as f64) * 0.78;
let value = 0.55 + ((depth + 1.0) / 2.0) * 0.30;
hsv_to_rgb(hue, 0.70, value)
}
fn hsv_to_rgb(hue: f64, saturation: f64, value: f64) -> (f64, f64, f64) {
let section = (hue * 6.0).floor();
let fraction = hue * 6.0 - section;
let p = value * (1.0 - saturation);
let q = value * (1.0 - fraction * saturation);
let t = value * (1.0 - (1.0 - fraction) * saturation);
match section as i32 % 6 {
0 => (value, t, p),
1 => (q, value, p),
2 => (p, value, t),
3 => (p, q, value),
4 => (t, p, value),
_ => (value, p, q),
}
}
fn startup_path() -> Option<PathBuf> {
let default = current_directory().join("main.sqlite");
default.exists().then_some(default)
}
fn current_directory() -> PathBuf {
std::env::current_dir().unwrap_or_else(|_| PathBuf::from("."))
}

223
src/projection.rs Normal file

@@ -0,0 +1,223 @@
use crate::store::ChunkRecord;
#[derive(Clone, Debug)]
pub struct ProjectedPoint {
pub chunk_id: String,
pub x: f32,
pub y: f32,
pub z: f32,
pub from_embeddings: bool,
}
pub fn project_chunks(chunks: &[ChunkRecord]) -> Vec<ProjectedPoint> {
let dims = dominant_embedding_dimension(chunks);
if let Some(dims) = dims {
let embedding_points = chunks
.iter()
.filter_map(|chunk| {
chunk.embedding.as_ref().and_then(|embedding| {
(embedding.len() == dims).then(|| (chunk.id.clone(), embedding.clone()))
})
})
.collect::<Vec<_>>();
if embedding_points.len() >= 3 {
let embeddings = embedding_points
.iter()
.map(|(_, embedding)| embedding.clone())
.collect::<Vec<_>>();
let (component_x, component_y, component_z, mean) = principal_components(&embeddings);
let mut scores = embedding_points
.iter()
.map(|(chunk_id, embedding)| {
let centered = embedding
.iter()
.zip(mean.iter())
.map(|(value, avg)| *value - *avg)
.collect::<Vec<_>>();
let x = dot(&centered, &component_x);
let y = dot(&centered, &component_y);
let z = dot(&centered, &component_z);
(chunk_id.clone(), x, y, z)
})
.collect::<Vec<_>>();
normalize_scores(&mut scores);
let mut projected = scores
.into_iter()
.map(|(chunk_id, x, y, z)| ProjectedPoint {
chunk_id,
x,
y,
z,
from_embeddings: true,
})
.collect::<Vec<_>>();
let existing_ids = projected
.iter()
.map(|point| point.chunk_id.clone())
.collect::<std::collections::BTreeSet<_>>();
projected.extend(
chunks
.iter()
.filter(|chunk| !existing_ids.contains(&chunk.id))
.enumerate()
.map(|(index, chunk)| fallback_point(chunk, index, chunks.len())),
);
return projected;
}
}
chunks
.iter()
.enumerate()
.map(|(index, chunk)| fallback_point(chunk, index, chunks.len()))
.collect()
}
/// Places a chunk on an evenly spaced grid in [-1, 1]^3, with depth ordered by
/// index; used when embeddings are missing or too sparse to project.
fn fallback_point(chunk: &ChunkRecord, index: usize, total: usize) -> ProjectedPoint {
let columns = (total as f32).sqrt().ceil().max(1.0) as usize;
let row = index / columns;
let column = index % columns;
let width = columns.max(1) as f32;
let height = ((total + columns - 1) / columns).max(1) as f32;
let x = if width <= 1.0 {
0.0
} else {
(column as f32 / (width - 1.0)) * 2.0 - 1.0
};
let y = if height <= 1.0 {
0.0
} else {
(row as f32 / (height - 1.0)) * 2.0 - 1.0
};
let z = if total <= 1 {
0.0
} else {
(index as f32 / (total as f32 - 1.0)) * 2.0 - 1.0
};
ProjectedPoint {
chunk_id: chunk.id.clone(),
x,
y,
z,
from_embeddings: false,
}
}
fn dominant_embedding_dimension(chunks: &[ChunkRecord]) -> Option<usize> {
let mut dims = std::collections::BTreeMap::<usize, usize>::new();
for embedding in chunks.iter().filter_map(|chunk| chunk.embedding.as_ref()) {
*dims.entry(embedding.len()).or_default() += 1;
}
dims.into_iter()
.max_by_key(|(_, count)| *count)
.map(|(dims, _)| dims)
}
/// Returns the top three principal components of the centered rows, plus the mean vector.
fn principal_components(rows: &[Vec<f32>]) -> (Vec<f32>, Vec<f32>, Vec<f32>, Vec<f32>) {
let dims = rows.first().map(|row| row.len()).unwrap_or(0);
let mean = mean_vector(rows, dims);
let first = power_iteration(rows, &mean, &[]);
let second = power_iteration(rows, &mean, &[first.as_slice()]);
let third = power_iteration(rows, &mean, &[first.as_slice(), second.as_slice()]);
(first, second, third, mean)
}
fn mean_vector(rows: &[Vec<f32>], dims: usize) -> Vec<f32> {
let mut mean = vec![0.0; dims];
for row in rows {
for (index, value) in row.iter().enumerate() {
mean[index] += *value;
}
}
if !rows.is_empty() {
let scale = 1.0 / rows.len() as f32;
for value in &mut mean {
*value *= scale;
}
}
mean
}
/// Estimates the dominant eigenvector of the covariance of `rows` via power
/// iteration, deflating against the previously found vectors in `orthogonal_to`.
fn power_iteration(rows: &[Vec<f32>], mean: &[f32], orthogonal_to: &[&[f32]]) -> Vec<f32> {
let dims = mean.len();
let mut vector = vec![0.0; dims];
if dims > 0 {
vector[0] = 1.0;
}
for _ in 0..24 {
let mut next = covariance_mul(rows, mean, &vector);
for orthogonal in orthogonal_to {
orthogonalize(&mut next, orthogonal);
}
let norm = l2_norm(&next);
if norm <= f32::EPSILON {
break;
}
for value in &mut next {
*value /= norm;
}
vector = next;
}
if l2_norm(&vector) <= f32::EPSILON && dims > 1 {
vector[1] = 1.0;
}
vector
}
fn covariance_mul(rows: &[Vec<f32>], mean: &[f32], vector: &[f32]) -> Vec<f32> {
let mut accum = vec![0.0; mean.len()];
for row in rows {
let centered = row
.iter()
.zip(mean.iter())
.map(|(value, avg)| *value - *avg)
.collect::<Vec<_>>();
let score = dot(&centered, vector);
for (index, value) in centered.iter().enumerate() {
accum[index] += *value * score;
}
}
accum
}
fn normalize_scores(scores: &mut [(String, f32, f32, f32)]) {
let max_x = scores
.iter()
.map(|(_, x, _, _)| x.abs())
.fold(0.0_f32, f32::max)
.max(1.0);
let max_y = scores
.iter()
.map(|(_, _, y, _)| y.abs())
.fold(0.0_f32, f32::max)
.max(1.0);
let max_z = scores
.iter()
.map(|(_, _, _, z)| z.abs())
.fold(0.0_f32, f32::max)
.max(1.0);
for (_, x, y, z) in scores {
*x /= max_x;
*y /= max_y;
*z /= max_z;
}
}
fn orthogonalize(vector: &mut [f32], basis: &[f32]) {
let projection = dot(vector, basis);
for (value, basis_value) in vector.iter_mut().zip(basis.iter()) {
*value -= basis_value * projection;
}
}
fn dot(left: &[f32], right: &[f32]) -> f32 {
left.iter().zip(right.iter()).map(|(a, b)| a * b).sum()
}
fn l2_norm(values: &[f32]) -> f32 {
dot(values, values).sqrt()
}
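The projection above can be exercised in isolation. This sketch reimplements just the first-component power iteration on toy 2D data; the name `first_component` and the sample rows are illustrative, not part of the crate:

```rust
// Toy sketch of the power-iteration PCA used by project_chunks: rows are
// centered against their mean, repeatedly multiplied by the covariance
// operator, and renormalized until the vector settles on the dominant axis.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

// Assumes `rows` is non-empty and all rows share the same dimension.
fn first_component(rows: &[Vec<f32>]) -> Vec<f32> {
    let dims = rows[0].len();
    let mean: Vec<f32> = (0..dims)
        .map(|i| rows.iter().map(|r| r[i]).sum::<f32>() / rows.len() as f32)
        .collect();
    let mut v = vec![0.0; dims];
    v[0] = 1.0;
    for _ in 0..24 {
        // next = sum over rows of (row - mean) * <row - mean, v>
        let mut next = vec![0.0; dims];
        for row in rows {
            let centered: Vec<f32> = row.iter().zip(&mean).map(|(x, m)| x - m).collect();
            let score = dot(&centered, &v);
            for (n, c) in next.iter_mut().zip(&centered) {
                *n += c * score;
            }
        }
        let norm = dot(&next, &next).sqrt();
        if norm <= f32::EPSILON {
            break;
        }
        for value in &mut next {
            *value /= norm;
        }
        v = next;
    }
    v
}

fn main() {
    // Points spread mostly along the y axis: the dominant component
    // should be close to (0, ±1).
    let rows = vec![
        vec![0.1_f32, -3.0],
        vec![-0.1, -1.0],
        vec![0.0, 1.0],
        vec![0.0, 3.0],
    ];
    let c = first_component(&rows);
    assert!(c[1].abs() > 0.99);
    assert!(c[0].abs() < 0.2);
    println!("component = {:?}", c);
}
```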

764
src/store.rs Normal file

@@ -0,0 +1,764 @@
#![allow(dead_code)]
use std::collections::{BTreeMap, BTreeSet, HashMap};
use std::path::{Path, PathBuf};
use anyhow::{Context, Result, anyhow};
use rusqlite::types::ValueRef;
use rusqlite::{Connection, Row};
use serde::Deserialize;
use crate::projection::{ProjectedPoint, project_chunks};
#[derive(Clone, Debug)]
pub struct LoadedStore {
pub db_path: PathBuf,
pub adapter_name: String,
pub metrics: OverviewMetrics,
pub tables: Vec<TableSummary>,
pub files: Vec<FileRecord>,
pub chunks: Vec<ChunkRecord>,
pub points: Vec<ProjectedPoint>,
pub notes: Vec<String>,
}
#[derive(Clone, Debug, Default)]
pub struct OverviewMetrics {
pub total_files: usize,
pub total_chunks: usize,
pub embedding_rows: usize,
pub embedding_dims: Option<usize>,
pub vector_backend: Option<String>,
pub models: Vec<String>,
pub sources: Vec<String>,
}
#[derive(Clone, Debug)]
pub struct TableSummary {
pub name: String,
pub columns: Vec<String>,
pub create_sql: Option<String>,
pub row_count: Option<i64>,
}
#[derive(Clone, Debug)]
pub struct FileRecord {
pub path: String,
pub source: Option<String>,
pub size: Option<i64>,
pub mtime_ms: Option<i64>,
pub chunk_count: usize,
}
#[derive(Clone, Debug)]
pub struct ChunkRecord {
pub id: String,
pub path: String,
pub source: Option<String>,
pub model: Option<String>,
pub start_line: Option<i64>,
pub end_line: Option<i64>,
pub updated_at_ms: Option<i64>,
pub text: String,
pub embedding: Option<Vec<f32>>,
}
/// Opens the database, snapshots its schema, and loads it with the first
/// adapter that recognizes the layout: OpenClaw first, then the generic heuristic.
pub fn detect_and_load(db_path: &Path) -> Result<LoadedStore> {
let connection = Connection::open(db_path)
.with_context(|| format!("failed to open SQLite database at {}", db_path.display()))?;
let schema = SchemaSnapshot::read(&connection)?;
let adapters: Vec<Box<dyn VectorStoreAdapter>> =
vec![Box::new(OpenClawAdapter), Box::new(GenericSqliteAdapter)];
let adapter = adapters
.into_iter()
.find(|adapter| adapter.detect(&schema))
.ok_or_else(|| anyhow!("no compatible adapter found"))?;
adapter.load(&connection, db_path, &schema)
}
trait VectorStoreAdapter {
fn name(&self) -> &'static str;
fn detect(&self, schema: &SchemaSnapshot) -> bool;
fn load(
&self,
connection: &Connection,
db_path: &Path,
schema: &SchemaSnapshot,
) -> Result<LoadedStore>;
}
struct OpenClawAdapter;
impl VectorStoreAdapter for OpenClawAdapter {
fn name(&self) -> &'static str {
"OpenClaw Memory"
}
fn detect(&self, schema: &SchemaSnapshot) -> bool {
schema.has_table_with_columns(
"chunks",
&[
"id",
"path",
"source",
"start_line",
"end_line",
"model",
"text",
"embedding",
],
) && schema.has_table_with_columns("files", &["path", "source", "size", "mtime"])
}
fn load(
&self,
connection: &Connection,
db_path: &Path,
schema: &SchemaSnapshot,
) -> Result<LoadedStore> {
let meta = load_openclaw_meta(connection)?;
let mut files = load_openclaw_files(connection)?;
let chunks = load_openclaw_chunks(connection)?;
files.sort_by(|left, right| {
right
.chunk_count
.cmp(&left.chunk_count)
.then_with(|| left.path.cmp(&right.path))
});
let metrics = build_metrics(
&files,
&chunks,
meta.as_ref().and_then(|meta| meta.vector_dims),
schema,
);
let notes = build_openclaw_notes(schema, meta.as_ref(), &metrics);
let tables = schema.to_summaries(connection);
let points = project_chunks(&chunks);
Ok(LoadedStore {
db_path: db_path.to_path_buf(),
adapter_name: self.name().to_string(),
metrics,
tables,
files,
chunks,
points,
notes,
})
}
}
struct GenericSqliteAdapter;
impl VectorStoreAdapter for GenericSqliteAdapter {
fn name(&self) -> &'static str {
"Generic SQLite Vector Store"
}
fn detect(&self, schema: &SchemaSnapshot) -> bool {
choose_content_mapping(schema).is_some()
}
fn load(
&self,
connection: &Connection,
db_path: &Path,
schema: &SchemaSnapshot,
) -> Result<LoadedStore> {
let mapping = choose_content_mapping(schema)
.ok_or_else(|| anyhow!("unable to find a chunk/content table"))?;
let chunks = load_generic_chunks(connection, &mapping)?;
let files = load_generic_files(connection, schema, &mapping, &chunks)?;
let metrics = build_metrics(&files, &chunks, None, schema);
let mut notes = vec![format!(
"Detected chunk-like table `{}` using heuristic column matching.",
mapping.table_name
)];
if let Some(vector_backend) = metrics.vector_backend.as_ref() {
notes.push(format!(
"Vector backend inferred from schema artifacts: {}.",
vector_backend
));
}
let tables = schema.to_summaries(connection);
let points = project_chunks(&chunks);
Ok(LoadedStore {
db_path: db_path.to_path_buf(),
adapter_name: self.name().to_string(),
metrics,
tables,
files,
chunks,
points,
notes,
})
}
}
#[derive(Clone, Debug)]
struct SchemaSnapshot {
tables: Vec<TableSchema>,
}
#[derive(Clone, Debug)]
struct TableSchema {
name: String,
columns: Vec<String>,
create_sql: Option<String>,
}
impl SchemaSnapshot {
fn read(connection: &Connection) -> Result<Self> {
let mut statement = connection.prepare(
"SELECT name, sql FROM sqlite_master WHERE type IN ('table', 'view') ORDER BY name",
)?;
let raw_tables = statement.query_map([], |row| {
Ok((row.get::<_, String>(0)?, row.get::<_, Option<String>>(1)?))
})?;
let mut tables = Vec::new();
for table in raw_tables {
let (name, create_sql) = table?;
let columns = read_columns(connection, &name).unwrap_or_default();
tables.push(TableSchema {
name,
columns,
create_sql,
});
}
Ok(Self { tables })
}
fn has_table_with_columns(&self, table_name: &str, required: &[&str]) -> bool {
self.table(table_name).is_some_and(|table| {
required
.iter()
.all(|required_name| table.has_column(required_name))
})
}
fn table(&self, table_name: &str) -> Option<&TableSchema> {
self.tables.iter().find(|table| table.name == table_name)
}
fn to_summaries(&self, connection: &Connection) -> Vec<TableSummary> {
self.tables
.iter()
.map(|table| TableSummary {
name: table.name.clone(),
columns: table.columns.clone(),
create_sql: table.create_sql.clone(),
row_count: count_rows(connection, &table.name).ok(),
})
.collect()
}
}
impl TableSchema {
fn has_column(&self, column_name: &str) -> bool {
self.columns.iter().any(|column| column == column_name)
}
}
#[derive(Debug, Deserialize)]
struct OpenClawMeta {
#[serde(default)]
model: Option<String>,
#[serde(default)]
provider: Option<String>,
#[serde(rename = "vectorDims", default)]
vector_dims: Option<usize>,
}
fn load_openclaw_meta(connection: &Connection) -> Result<Option<OpenClawMeta>> {
let raw_value = connection
.query_row(
"SELECT value FROM meta WHERE key = 'memory_index_meta_v1'",
[],
|row| row.get::<_, String>(0),
)
.ok();
raw_value
.map(|value| serde_json::from_str::<OpenClawMeta>(&value).context("failed to parse meta"))
.transpose()
}
fn load_openclaw_files(connection: &Connection) -> Result<Vec<FileRecord>> {
let mut statement = connection.prepare(
"SELECT
f.path,
f.source,
f.size,
CAST(f.mtime AS INTEGER) AS mtime_ms,
COUNT(c.id) AS chunk_count
FROM files f
LEFT JOIN chunks c ON c.path = f.path
GROUP BY f.path, f.source, f.size, f.mtime
ORDER BY f.path",
)?;
let rows = statement.query_map([], |row| {
Ok(FileRecord {
path: row.get(0)?,
source: row.get(1)?,
size: row.get(2)?,
mtime_ms: row.get(3)?,
chunk_count: row.get::<_, i64>(4).unwrap_or_default().max(0) as usize,
})
})?;
rows.collect::<rusqlite::Result<Vec<_>>>()
.map_err(Into::into)
}
fn load_openclaw_chunks(connection: &Connection) -> Result<Vec<ChunkRecord>> {
let mut statement = connection.prepare(
"SELECT
id,
path,
source,
model,
start_line,
end_line,
updated_at,
text,
embedding
FROM chunks
ORDER BY path, start_line, end_line, id",
)?;
let rows = statement.query_map([], decode_chunk_row)?;
rows.collect::<rusqlite::Result<Vec<_>>>()
.map_err(Into::into)
}
fn read_columns(connection: &Connection, table_name: &str) -> Result<Vec<String>> {
let pragma = format!("PRAGMA table_info({})", quote_ident(table_name));
let mut statement = connection.prepare(&pragma)?;
let rows = statement.query_map([], |row| row.get::<_, String>(1))?;
rows.collect::<rusqlite::Result<Vec<_>>>()
.map_err(Into::into)
}
fn count_rows(connection: &Connection, table_name: &str) -> Result<i64> {
let sql = format!("SELECT COUNT(*) FROM {}", quote_ident(table_name));
connection
.query_row(&sql, [], |row| row.get::<_, i64>(0))
.map_err(Into::into)
}
fn build_openclaw_notes(
schema: &SchemaSnapshot,
meta: Option<&OpenClawMeta>,
metrics: &OverviewMetrics,
) -> Vec<String> {
let mut notes = Vec::new();
if let Some(meta) = meta {
if let (Some(provider), Some(model)) = (&meta.provider, &meta.model) {
notes.push(format!(
"Memory index metadata reports provider `{}` with embedding model `{}`.",
provider, model
));
}
}
if schema.table("chunks_fts").is_some() {
notes.push(
"FTS5 is present, so the store can support lexical search alongside vectors."
.to_string(),
);
}
if schema.table("chunks_vec").is_some() {
notes.push("`sqlite-vec` artifacts are present, which aligns with OpenClaw's hybrid search layout.".to_string());
}
notes.push(format!(
"Loaded {} files and {} chunks from the memory index.",
metrics.total_files, metrics.total_chunks
));
notes
}
fn build_metrics(
files: &[FileRecord],
chunks: &[ChunkRecord],
meta_dims: Option<usize>,
schema: &SchemaSnapshot,
) -> OverviewMetrics {
let mut models = BTreeSet::new();
let mut sources = BTreeSet::new();
let mut dims = BTreeMap::<usize, usize>::new();
for chunk in chunks {
if let Some(model) = chunk.model.as_ref() {
models.insert(model.clone());
}
if let Some(source) = chunk.source.as_ref() {
sources.insert(source.clone());
}
if let Some(embedding) = chunk.embedding.as_ref() {
*dims.entry(embedding.len()).or_default() += 1;
}
}
for file in files {
if let Some(source) = file.source.as_ref() {
sources.insert(source.clone());
}
}
let vector_backend = detect_vector_backend(schema);
let embedding_dims = meta_dims.or_else(|| {
dims.into_iter()
.max_by_key(|(_, count)| *count)
.map(|(dims, _)| dims)
});
OverviewMetrics {
total_files: files.len(),
total_chunks: chunks.len(),
embedding_rows: chunks
.iter()
.filter(|chunk| chunk.embedding.is_some())
.count(),
embedding_dims,
vector_backend,
models: models.into_iter().collect(),
sources: sources.into_iter().collect(),
}
}
fn detect_vector_backend(schema: &SchemaSnapshot) -> Option<String> {
let mut backends = Vec::new();
for table in &schema.tables {
let sql = table
.create_sql
.as_deref()
.unwrap_or_default()
.to_ascii_lowercase();
if sql.contains("using vec0") {
backends.push("sqlite-vec".to_string());
} else if sql.contains("using vss0") {
backends.push("sqlite-vss".to_string());
} else if table.name.contains("fts") {
backends.push("fts5".to_string());
}
}
backends.sort();
backends.dedup();
(!backends.is_empty()).then(|| backends.join(" + "))
}
#[derive(Clone, Debug)]
struct GenericContentMapping {
table_name: String,
id_column: Option<String>,
path_column: Option<String>,
text_column: String,
source_column: Option<String>,
model_column: Option<String>,
start_line_column: Option<String>,
end_line_column: Option<String>,
updated_at_column: Option<String>,
embedding_column: Option<String>,
}
/// Scores every table for chunk-like columns (text, embedding, id, path) and
/// returns the best-scoring candidate mapping, if any table qualifies.
fn choose_content_mapping(schema: &SchemaSnapshot) -> Option<GenericContentMapping> {
let mut best_mapping = None;
let mut best_score = i32::MIN;
for table in &schema.tables {
let lower = table
.columns
.iter()
.map(|column| column.to_ascii_lowercase())
.collect::<Vec<_>>();
let text_column = find_column(
&lower,
&[
"text",
"content",
"chunk_text",
"body",
"document",
"payload",
],
);
let embedding_column = find_column(
&lower,
&["embedding", "vector", "embedding_json", "embedding_blob"],
);
let id_column = find_column(&lower, &["id", "chunk_id", "uuid"]);
let path_column = find_column(
&lower,
&[
"path",
"file_path",
"document_path",
"source_path",
"uri",
"doc_id",
"document_id",
],
);
if let Some(text_index) = text_column {
let mut score = 15;
if embedding_column.is_some() {
score += 20;
}
if id_column.is_some() {
score += 4;
}
if path_column.is_some() {
score += 4;
}
if table.name.contains("chunk") || table.name.contains("embedding") {
score += 3;
}
if score > best_score {
best_score = score;
best_mapping = Some(GenericContentMapping {
table_name: table.name.clone(),
id_column: id_column.map(|index| table.columns[index].clone()),
path_column: path_column.map(|index| table.columns[index].clone()),
text_column: table.columns[text_index].clone(),
source_column: find_column(&lower, &["source", "provider", "namespace"])
.map(|index| table.columns[index].clone()),
model_column: find_column(&lower, &["model", "embedding_model"])
.map(|index| table.columns[index].clone()),
start_line_column: find_column(&lower, &["start_line", "line_start"])
.map(|index| table.columns[index].clone()),
end_line_column: find_column(&lower, &["end_line", "line_end"])
.map(|index| table.columns[index].clone()),
updated_at_column: find_column(&lower, &["updated_at", "mtime", "created_at"])
.map(|index| table.columns[index].clone()),
embedding_column: embedding_column.map(|index| table.columns[index].clone()),
});
}
}
}
best_mapping
}
fn find_column(columns: &[String], candidates: &[&str]) -> Option<usize> {
candidates
.iter()
.find_map(|candidate| columns.iter().position(|column| column == candidate))
}
fn load_generic_chunks(
connection: &Connection,
mapping: &GenericContentMapping,
) -> Result<Vec<ChunkRecord>> {
let sql = format!(
"SELECT
{id_expr} AS item_id,
{path_expr} AS item_path,
{source_expr} AS item_source,
{model_expr} AS item_model,
{start_expr} AS item_start_line,
{end_expr} AS item_end_line,
{updated_expr} AS item_updated_at,
{text_expr} AS item_text,
{embedding_expr} AS item_embedding
FROM {table_name}",
id_expr = mapping
.id_column
.as_ref()
.map(|column| quote_ident(column))
.unwrap_or_else(|| "CAST(rowid AS TEXT)".to_string()),
path_expr = nullable_ident(mapping.path_column.as_ref()),
source_expr = nullable_ident(mapping.source_column.as_ref()),
model_expr = nullable_ident(mapping.model_column.as_ref()),
start_expr = nullable_ident(mapping.start_line_column.as_ref()),
end_expr = nullable_ident(mapping.end_line_column.as_ref()),
updated_expr = nullable_ident(mapping.updated_at_column.as_ref()),
text_expr = quote_ident(&mapping.text_column),
embedding_expr = nullable_ident(mapping.embedding_column.as_ref()),
table_name = quote_ident(&mapping.table_name),
);
let mut statement = connection.prepare(&sql)?;
let rows = statement.query_map([], |row| {
let id = row.get::<_, String>(0)?;
let path = row
.get::<_, Option<String>>(1)?
.unwrap_or_else(|| format!("{}#{}", mapping.table_name, id));
let source = row.get(2)?;
let model = row.get(3)?;
let start_line = row.get(4)?;
let end_line = row.get(5)?;
let updated_at_ms = row.get(6)?;
let text = row.get::<_, String>(7)?;
let embedding = decode_embedding_value(row, 8)?;
Ok(ChunkRecord {
id,
path,
source,
model,
start_line,
end_line,
updated_at_ms,
text,
embedding,
})
})?;
rows.collect::<rusqlite::Result<Vec<_>>>()
.map_err(Into::into)
}
fn load_generic_files(
connection: &Connection,
schema: &SchemaSnapshot,
_mapping: &GenericContentMapping,
chunks: &[ChunkRecord],
) -> Result<Vec<FileRecord>> {
if let Some(file_table) = choose_file_table(schema) {
let sql = format!(
"SELECT
{path_expr} AS file_path,
{source_expr} AS file_source,
{size_expr} AS file_size,
{mtime_expr} AS file_mtime
FROM {table_name}",
path_expr = quote_ident(&file_table.path_column),
source_expr = nullable_ident(file_table.source_column.as_ref()),
size_expr = nullable_ident(file_table.size_column.as_ref()),
mtime_expr = nullable_ident(file_table.mtime_column.as_ref()),
table_name = quote_ident(&file_table.table_name),
);
let chunk_counts = group_chunk_counts(chunks);
let mut statement = connection.prepare(&sql)?;
let rows = statement.query_map([], |row| {
let path = row.get::<_, String>(0)?;
Ok(FileRecord {
chunk_count: *chunk_counts.get(&path).unwrap_or(&0),
path,
source: row.get(1)?,
size: row.get(2)?,
mtime_ms: row.get(3)?,
})
})?;
let mut files = rows.collect::<rusqlite::Result<Vec<_>>>()?;
files.sort_by(|left, right| left.path.cmp(&right.path));
return Ok(files);
}
let chunk_counts = group_chunk_counts(chunks);
let mut files = chunk_counts
.into_iter()
.map(|(path, chunk_count)| FileRecord {
path,
source: None,
size: None,
mtime_ms: None,
chunk_count,
})
.collect::<Vec<_>>();
files.sort_by(|left, right| left.path.cmp(&right.path));
Ok(files)
}
#[derive(Clone, Debug)]
struct FileTableMapping {
table_name: String,
path_column: String,
source_column: Option<String>,
size_column: Option<String>,
mtime_column: Option<String>,
}
fn choose_file_table(schema: &SchemaSnapshot) -> Option<FileTableMapping> {
schema.tables.iter().find_map(|table| {
let lower = table
.columns
.iter()
.map(|column| column.to_ascii_lowercase())
.collect::<Vec<_>>();
let path_column = find_column(
&lower,
&["path", "file_path", "document_path", "source_path", "uri"],
)?;
let has_size = find_column(&lower, &["size", "byte_size"]);
let has_mtime = find_column(&lower, &["mtime", "updated_at", "modified_at"]);
if table.name.contains("file") || has_size.is_some() || has_mtime.is_some() {
Some(FileTableMapping {
table_name: table.name.clone(),
path_column: table.columns[path_column].clone(),
source_column: find_column(&lower, &["source", "provider", "namespace"])
.map(|index| table.columns[index].clone()),
size_column: has_size.map(|index| table.columns[index].clone()),
mtime_column: has_mtime.map(|index| table.columns[index].clone()),
})
} else {
None
}
})
}
fn group_chunk_counts(chunks: &[ChunkRecord]) -> HashMap<String, usize> {
let mut counts = HashMap::new();
for chunk in chunks {
*counts.entry(chunk.path.clone()).or_default() += 1;
}
counts
}
fn decode_chunk_row(row: &Row<'_>) -> rusqlite::Result<ChunkRecord> {
Ok(ChunkRecord {
id: row.get(0)?,
path: row.get(1)?,
source: row.get(2)?,
model: row.get(3)?,
start_line: row.get(4)?,
end_line: row.get(5)?,
updated_at_ms: row.get(6)?,
text: row.get(7)?,
embedding: decode_embedding_value(row, 8)?,
})
}
/// Decodes an embedding stored either as a JSON array (TEXT) or as packed
/// little-endian f32 values (BLOB); other SQLite types yield None.
fn decode_embedding_value(
row: &Row<'_>,
column_index: usize,
) -> rusqlite::Result<Option<Vec<f32>>> {
match row.get_ref(column_index)? {
ValueRef::Null => Ok(None),
ValueRef::Text(bytes) => Ok(parse_text_embedding(bytes)),
ValueRef::Blob(bytes) => Ok(parse_blob_embedding(bytes)),
ValueRef::Integer(_) | ValueRef::Real(_) => Ok(None),
}
}
fn parse_text_embedding(bytes: &[u8]) -> Option<Vec<f32>> {
let text = std::str::from_utf8(bytes).ok()?;
if text.trim().is_empty() {
return None;
}
serde_json::from_str::<Vec<f32>>(text)
.ok()
.filter(|values| !values.is_empty())
}
fn parse_blob_embedding(bytes: &[u8]) -> Option<Vec<f32>> {
if bytes.len() % 4 != 0 {
return None;
}
let mut values = Vec::with_capacity(bytes.len() / 4);
for chunk in bytes.chunks_exact(4) {
values.push(f32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]));
}
(!values.is_empty()).then_some(values)
}
/// Double-quotes a SQL identifier, escaping embedded quotes, so heuristically
/// discovered table and column names can be interpolated safely.
fn quote_ident(identifier: &str) -> String {
format!("\"{}\"", identifier.replace('"', "\"\""))
}
fn nullable_ident(identifier: Option<&String>) -> String {
identifier
.map(|identifier| quote_ident(identifier))
.unwrap_or_else(|| "NULL".to_string())
}
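`store.rs` expects BLOB embeddings as packed little-endian `f32` values with no header or length prefix. A minimal round-trip sketch of that format, with a hypothetical `encode_blob` helper that is not part of the crate and a `decode_blob` that mirrors `parse_blob_embedding`:

```rust
// Sketch of the little-endian f32 blob format accepted by parse_blob_embedding:
// each embedding value occupies exactly four bytes.
fn encode_blob(values: &[f32]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}

fn decode_blob(bytes: &[u8]) -> Option<Vec<f32>> {
    // Reject blobs whose length is not a multiple of four bytes.
    if bytes.len() % 4 != 0 {
        return None;
    }
    let values: Vec<f32> = bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect();
    (!values.is_empty()).then_some(values)
}

fn main() {
    let embedding = vec![0.25_f32, -1.5, 3.0];
    let blob = encode_blob(&embedding);
    assert_eq!(blob.len(), 12);
    assert_eq!(decode_blob(&blob), Some(embedding));
    // A truncated blob is rejected rather than silently misread.
    assert_eq!(decode_blob(&blob[..10]), None);
    println!("round-trip ok");
}
```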