First commit

.gitignore (vendored, new file, 1 line)
@@ -0,0 +1 @@
/target

Cargo.lock (generated, new file, 2397 lines)
File diff suppressed because it is too large.

Cargo.toml (new file, 13 lines)

@@ -0,0 +1,13 @@
[package]
name = "vector_explorer"
version = "0.1.0"
edition = "2024"

[dependencies]
anyhow = "1.0"
chrono = { version = "0.4", default-features = false, features = ["clock"] }
gtk4 = "0.10"
rfd = "0.15"
rusqlite = { version = "0.37", features = ["bundled"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"

README.md (new file, 150 lines)

@@ -0,0 +1,150 @@
# Vector Explorer

`vector_explorer` is a Rust + GTK desktop viewer for SQLite-backed vector stores. It is designed around OpenClaw's memory layout and can fall back to a generic adapter for similar databases, such as NullClaw- or ZeroClaw-style stores that keep chunk text and embeddings in SQLite tables.

## Features

- Store overview with detected adapter, file count, chunk count, embedding dimensions, models, and inferred backend features such as FTS5 or `sqlite-vec`
- Filterable file list with a top-level reset entry to return to the full graph
- Rotatable 3D semantic map based on a lightweight PCA-style projection of embeddings
- Detail panel for the selected chunk or file
- OpenClaw-first schema detection with generic SQLite heuristics for other stores
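
The "lightweight PCA-style projection" is implemented in `src/projection.rs`; as a rough illustration of the idea only (the function names below are hypothetical, not the crate's API), the top principal component of the embeddings can be found by centering the vectors and running a few rounds of power iteration on the implicit covariance matrix:

```rust
// Illustrative sketch: project embeddings onto their top principal
// component via power iteration. Not the actual src/projection.rs code.
fn project_top_component(vectors: &[Vec<f64>]) -> Vec<f64> {
    let n = vectors.len();
    if n == 0 {
        return Vec::new();
    }
    let dims = vectors[0].len();
    // Center the data around the mean.
    let mut mean = vec![0.0; dims];
    for v in vectors {
        for (m, x) in mean.iter_mut().zip(v) {
            *m += x / n as f64;
        }
    }
    let centered: Vec<Vec<f64>> = vectors
        .iter()
        .map(|v| v.iter().zip(&mean).map(|(x, m)| x - m).collect())
        .collect();
    // Start from the centered vector with the largest norm.
    let mut axis = centered
        .iter()
        .max_by(|a, b| norm(a).partial_cmp(&norm(b)).unwrap())
        .cloned()
        .unwrap();
    if norm(&axis) == 0.0 {
        return vec![0.0; n];
    }
    // Power iteration: axis <- sum_i x_i (x_i . axis), then normalize.
    for _ in 0..50 {
        let mut next = vec![0.0; dims];
        for row in &centered {
            let dot: f64 = row.iter().zip(&axis).map(|(a, b)| a * b).sum();
            for (acc, x) in next.iter_mut().zip(row) {
                *acc += dot * x;
            }
        }
        let len = norm(&next);
        if len == 0.0 {
            break;
        }
        for x in &mut next {
            *x /= len;
        }
        axis = next;
    }
    // Project each centered vector onto the dominant axis.
    centered
        .iter()
        .map(|row| row.iter().zip(&axis).map(|(a, b)| a * b).sum())
        .collect()
}

fn norm(v: &[f64]) -> f64 {
    v.iter().map(|x| x * x).sum::<f64>().sqrt()
}
```

Repeating the same procedure on the residuals yields further components for the 3D map.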

## Supported store layouts

### OpenClaw

The first-class adapter targets the documented OpenClaw memory structure:

- `files` for indexed documents
- `chunks` for chunk text and stored embeddings
- `meta` for memory index metadata
- `chunks_fts` and `chunks_vec` as optional hybrid-search acceleration tables

The sample `main.sqlite` in this repository matches that layout.
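
As a sketch of what the detection step could look like (the real logic lives in `src/store.rs`; the names here are illustrative), the adapter only needs to confirm that the core tables exist and note which optional accelerators are present:

```rust
// Illustrative sketch: a store is treated as OpenClaw-shaped when the
// three core tables are all present; FTS5/sqlite-vec tables are optional.
fn looks_like_openclaw(table_names: &[&str]) -> bool {
    ["files", "chunks", "meta"]
        .iter()
        .all(|required| table_names.contains(required))
}

// Report which optional acceleration tables the database actually has.
fn optional_accelerators(table_names: &[&str]) -> Vec<&'static str> {
    ["chunks_fts", "chunks_vec"]
        .into_iter()
        .filter(|t| table_names.contains(t))
        .collect()
}
```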

### Generic SQLite vector stores

If the database is not recognized as OpenClaw, the app falls back to schema heuristics:

- Find a table with text/content columns
- Prefer tables that also expose `embedding` or `vector`
- Reconstruct file groupings from a path-like column, or from the chunk table itself

That makes the viewer usable for adjacent SQLite-backed vector stores, such as NullClaw- or ZeroClaw-style databases, as long as they expose chunk text and embeddings in a reasonably conventional schema.
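
A minimal sketch of that column-name scoring, with hypothetical helper names (the real heuristics are in `src/store.rs`):

```rust
// Illustrative sketch of the fallback heuristics: score candidate tables
// by their column names and pick the most chunk-table-like one.
fn score_chunk_table(columns: &[&str]) -> i32 {
    let mut score = 0;
    if columns.iter().any(|c| c.contains("text") || c.contains("content")) {
        score += 2; // must hold chunk text
    }
    if columns.iter().any(|c| c.contains("embedding") || c.contains("vector")) {
        score += 2; // preferred: stored embeddings
    }
    if columns.iter().any(|c| c.contains("path") || c.contains("file")) {
        score += 1; // lets us reconstruct file groupings
    }
    score
}

// Pick the best-scoring table, or None when nothing looks like chunks.
fn pick_chunk_table<'a>(tables: &'a [(&'a str, Vec<&'a str>)]) -> Option<&'a str> {
    tables
        .iter()
        .map(|(name, cols)| (score_chunk_table(cols), *name))
        .filter(|(score, _)| *score > 0)
        .max_by_key(|(score, _)| *score)
        .map(|(_, name)| name)
}
```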

## Rust dependencies

These are declared in [Cargo.toml](Cargo.toml):

- `anyhow`
- `chrono`
- `gtk4`
- `rfd`
- `rusqlite` with the `bundled` feature
- `serde`
- `serde_json`

## System dependencies

### Required

- Rust toolchain with `cargo`
- GTK 4 development files and runtime
- `pkg-config`
- C/C++ build tooling appropriate for your platform

### WSL / Ubuntu packages

If you are building in WSL on an Ubuntu-like distro, install:

```bash
sudo apt update
sudo apt install -y build-essential pkg-config libgtk-4-dev
```

### Windows native build

If you are building natively on Windows instead of WSL, you need:

- Visual Studio Build Tools with the MSVC C++ toolchain
- GTK 4 SDK/runtime
- `pkg-config` configured to find the GTK installation

The project compiles more reliably in WSL than in a partially configured native Windows environment.

## How to compile

### WSL

```bash
cd /mnt/c/Users/Trude/Desktop/vector_explorer
cargo build
```

For an optimized build:

```bash
cd /mnt/c/Users/Trude/Desktop/vector_explorer
cargo build --release
```

### Windows native

```powershell
cd C:\Users\Trude\Desktop\vector_explorer
cargo build
```

For an optimized build:

```powershell
cd C:\Users\Trude\Desktop\vector_explorer
cargo build --release
```

## How to run

### WSL

```bash
cd /mnt/c/Users/Trude/Desktop/vector_explorer
cargo run -- main.sqlite
```

To open a different database:

```bash
cd /mnt/c/Users/Trude/Desktop/vector_explorer
cargo run -- /path/to/other.sqlite
```

### Windows native

```powershell
cd C:\Users\Trude\Desktop\vector_explorer
cargo run -- .\main.sqlite
```

## Notes for WSL graphics

- The app is GTK-based and needs a working GUI/display path in WSL.
- If WSLg or X/Wayland forwarding is not available, the binary can compile successfully but still fail to open a display at runtime.
- The app forces `GSK_RENDERER=cairo` by default to avoid the flaky EGL/Zink GPU paths commonly seen on WSL.

## Development workflow

Useful commands:

```bash
cargo fmt
cargo check
cargo build
```

## Repository contents

- [src/main.rs](src/main.rs): GTK application and UI behavior
- [src/store.rs](src/store.rs): SQLite schema detection and adapter loading
- [src/projection.rs](src/projection.rs): embedding projection for the semantic map
- [main.sqlite](main.sqlite): sample OpenClaw-style memory database

src/main.rs (new file, 990 lines)

@@ -0,0 +1,990 @@
mod projection;
mod store;

use std::cell::RefCell;
use std::path::{Path, PathBuf};
use std::rc::Rc;

use chrono::{DateTime, Utc};
use gtk::cairo::Context;
use gtk::gio;
use gtk::prelude::*;
use gtk::{
    Align, Application, ApplicationWindow, Box as GtkBox, Button, DrawingArea, Entry, Frame,
    GestureClick, HeaderBar, Label, ListBox, Orientation, Paned, PolicyType, Scale, ScrolledWindow,
    SelectionMode, TextView, WrapMode,
};
use gtk4 as gtk;

use crate::projection::ProjectedPoint;
use crate::store::{ChunkRecord, FileRecord, LoadedStore, detect_and_load};

#[derive(Clone)]
struct AppWidgets {
    window: ApplicationWindow,
    open_button: Button,
    clear_button: Button,
    path_label: Label,
    adapter_label: Label,
    stats_label: Label,
    models_label: Label,
    backend_label: Label,
    notes_label: Label,
    filter_entry: Entry,
    file_list: ListBox,
    yaw_scale: Scale,
    pitch_scale: Scale,
    plot_area: DrawingArea,
    selection_label: Label,
    meta_label: Label,
    chunk_view: TextView,
    tables_label: Label,
}

struct AppState {
    store: Option<LoadedStore>,
    selected_file: Option<String>,
    selected_chunk_id: Option<String>,
    filter_text: String,
    yaw_deg: f64,
    pitch_deg: f64,
}

impl Default for AppState {
    fn default() -> Self {
        Self {
            store: None,
            selected_file: None,
            selected_chunk_id: None,
            filter_text: String::new(),
            yaw_deg: 28.0,
            pitch_deg: -18.0,
        }
    }
}

struct Runtime {
    widgets: AppWidgets,
    state: Rc<RefCell<AppState>>,
}

thread_local! {
    static RUNTIME: RefCell<Option<Rc<Runtime>>> = const { RefCell::new(None) };
}

fn main() {
    if std::env::var_os("GSK_RENDERER").is_none() {
        // The viewer draws with Cairo; forcing the Cairo renderer avoids flaky EGL/GPU paths on WSL.
        unsafe {
            std::env::set_var("GSK_RENDERER", "cairo");
        }
    }

    let application = Application::builder()
        .application_id("ai.openclaw.vector-explorer")
        .flags(gio::ApplicationFlags::HANDLES_OPEN)
        .build();

    application.connect_activate(|application| {
        let runtime = ensure_runtime(application);
        if runtime.state.borrow().store.is_none() {
            if let Some(start_path) = startup_path() {
                load_database(&runtime.widgets, &runtime.state, &start_path);
            } else {
                runtime
                    .widgets
                    .path_label
                    .set_text("Open a SQLite memory file to begin.");
            }
        }
        runtime.widgets.window.present();
    });
    application.connect_open(|application, files, _| {
        let runtime = ensure_runtime(application);
        if let Some(path) = files.first().and_then(|file| file.path()) {
            load_database(&runtime.widgets, &runtime.state, &path);
        }
        runtime.widgets.window.present();
    });
    application.run();
}

fn ensure_runtime(application: &Application) -> Rc<Runtime> {
    RUNTIME.with(|runtime| {
        if let Some(existing) = runtime.borrow().as_ref() {
            return existing.clone();
        }
        let widgets = build_widgets(application);
        let state = Rc::new(RefCell::new(AppState::default()));
        wire_events(&widgets, &state);
        let created = Rc::new(Runtime { widgets, state });
        runtime.replace(Some(created.clone()));
        created
    })
}

fn build_widgets(application: &Application) -> AppWidgets {
    let window = ApplicationWindow::builder()
        .application(application)
        .title("Vector Explorer")
        .default_width(1440)
        .default_height(920)
        .build();

    let open_button = Button::with_label("Open Database");
    let clear_button = Button::with_label("Clear File Filter");
    let filter_entry = Entry::builder()
        .placeholder_text("Filter by path, source, model, or chunk text")
        .hexpand(true)
        .build();

    let header = HeaderBar::builder().show_title_buttons(true).build();
    header.pack_start(&open_button);
    header.pack_start(&clear_button);
    header.pack_end(&filter_entry);
    window.set_titlebar(Some(&header));

    let root = GtkBox::new(Orientation::Vertical, 12);
    root.set_margin_top(12);
    root.set_margin_bottom(12);
    root.set_margin_start(12);
    root.set_margin_end(12);

    let path_label = Label::builder()
        .xalign(0.0)
        .wrap(true)
        .selectable(true)
        .build();
    let adapter_label = Label::builder().xalign(0.0).build();
    let stats_label = Label::builder().xalign(0.0).build();
    let models_label = Label::builder().xalign(0.0).wrap(true).build();
    let backend_label = Label::builder().xalign(0.0).wrap(true).build();
    let notes_label = Label::builder().xalign(0.0).wrap(true).build();

    let overview_frame = Frame::builder().label("Overview").build();
    let overview_box = GtkBox::new(Orientation::Vertical, 8);
    overview_box.append(&path_label);
    overview_box.append(&adapter_label);
    overview_box.append(&stats_label);
    overview_box.append(&models_label);
    overview_box.append(&backend_label);
    overview_box.append(&notes_label);
    overview_frame.set_child(Some(&overview_box));
    root.append(&overview_frame);

    let file_list = ListBox::new();
    file_list.set_selection_mode(SelectionMode::None);
    let file_scroller = ScrolledWindow::builder()
        .hscrollbar_policy(PolicyType::Never)
        .min_content_width(300)
        .build();
    file_scroller.set_child(Some(&file_list));
    let files_frame = Frame::builder().label("Files").build();
    files_frame.set_child(Some(&file_scroller));

    let yaw_scale = Scale::with_range(Orientation::Horizontal, -180.0, 180.0, 1.0);
    yaw_scale.set_value(28.0);
    yaw_scale.set_hexpand(true);
    let pitch_scale = Scale::with_range(Orientation::Horizontal, -75.0, 75.0, 1.0);
    pitch_scale.set_value(-18.0);
    pitch_scale.set_hexpand(true);
    let plot_area = DrawingArea::builder()
        .content_width(800)
        .content_height(640)
        .hexpand(true)
        .vexpand(true)
        .build();
    let controls_box = GtkBox::new(Orientation::Horizontal, 8);
    controls_box.append(&Label::builder().label("Yaw").build());
    controls_box.append(&yaw_scale);
    controls_box.append(&Label::builder().label("Pitch").build());
    controls_box.append(&pitch_scale);
    let plot_panel = GtkBox::new(Orientation::Vertical, 8);
    plot_panel.append(&controls_box);
    plot_panel.append(&plot_area);
    let plot_frame = Frame::builder().label("Semantic Map").build();
    plot_frame.set_child(Some(&plot_panel));

    let selection_label = Label::builder()
        .xalign(0.0)
        .wrap(true)
        .selectable(true)
        .build();
    let meta_label = Label::builder()
        .xalign(0.0)
        .wrap(true)
        .selectable(true)
        .build();
    let chunk_view = TextView::builder()
        .editable(false)
        .monospace(true)
        .wrap_mode(WrapMode::WordChar)
        .vexpand(true)
        .build();
    let chunk_scroller = ScrolledWindow::builder().min_content_width(360).build();
    chunk_scroller.set_child(Some(&chunk_view));
    let tables_label = Label::builder()
        .xalign(0.0)
        .wrap(true)
        .selectable(true)
        .build();

    let detail_box = GtkBox::new(Orientation::Vertical, 8);
    detail_box.append(&selection_label);
    detail_box.append(&meta_label);
    detail_box.append(&chunk_scroller);
    detail_box.append(&tables_label);
    let detail_frame = Frame::builder().label("Selection").build();
    detail_frame.set_child(Some(&detail_box));

    let center_pane = Paned::builder()
        .orientation(Orientation::Horizontal)
        .start_child(&plot_frame)
        .end_child(&detail_frame)
        .wide_handle(true)
        .shrink_start_child(false)
        .build();
    center_pane.set_position(900);

    let main_pane = Paned::builder()
        .orientation(Orientation::Horizontal)
        .start_child(&files_frame)
        .end_child(&center_pane)
        .wide_handle(true)
        .build();
    main_pane.set_position(330);

    root.append(&main_pane);
    window.set_child(Some(&root));

    open_button.set_valign(Align::Center);
    clear_button.set_valign(Align::Center);

    AppWidgets {
        window,
        open_button,
        clear_button,
        path_label,
        adapter_label,
        stats_label,
        models_label,
        backend_label,
        notes_label,
        filter_entry,
        file_list,
        yaw_scale,
        pitch_scale,
        plot_area,
        selection_label,
        meta_label,
        chunk_view,
        tables_label,
    }
}

fn wire_events(widgets: &AppWidgets, state: &Rc<RefCell<AppState>>) {
    {
        let widgets = widgets.clone();
        let open_button = widgets.open_button.clone();
        let state = Rc::clone(state);
        open_button.connect_clicked(move |_| {
            if let Some(path) = rfd::FileDialog::new()
                .add_filter("SQLite database", &["sqlite", "db"])
                .set_directory(current_directory())
                .pick_file()
            {
                load_database(&widgets, &state, &path);
            }
        });
    }

    {
        let widgets = widgets.clone();
        let clear_button = widgets.clear_button.clone();
        let state = Rc::clone(state);
        clear_button.connect_clicked(move |_| {
            state.borrow_mut().selected_file = None;
            let state_ref = state.borrow();
            refresh_files(&widgets, &state_ref, &state);
            refresh_selection(&widgets, &state_ref);
            widgets.plot_area.queue_draw();
        });
    }

    {
        let widgets = widgets.clone();
        let filter_entry = widgets.filter_entry.clone();
        let state = Rc::clone(state);
        filter_entry.connect_changed(move |entry| {
            state.borrow_mut().filter_text = entry.text().to_string();
            let state_ref = state.borrow();
            refresh_files(&widgets, &state_ref, &state);
            refresh_selection(&widgets, &state_ref);
            widgets.plot_area.queue_draw();
        });
    }

    {
        let widgets = widgets.clone();
        let state = Rc::clone(state);
        let yaw_scale = widgets.yaw_scale.clone();
        yaw_scale.connect_value_changed(move |scale| {
            state.borrow_mut().yaw_deg = scale.value();
            widgets.plot_area.queue_draw();
        });
    }

    {
        let widgets = widgets.clone();
        let state = Rc::clone(state);
        let pitch_scale = widgets.pitch_scale.clone();
        pitch_scale.connect_value_changed(move |scale| {
            state.borrow_mut().pitch_deg = scale.value();
            widgets.plot_area.queue_draw();
        });
    }

    {
        let widgets = widgets.clone();
        let state = Rc::clone(state);
        widgets
            .plot_area
            .set_draw_func(move |_, context, width, height| {
                draw_plot(context, width, height, &state.borrow());
            });
    }

    {
        let widgets = widgets.clone();
        let state = Rc::clone(state);
        let plot_area = widgets.plot_area.clone();
        let click = GestureClick::new();
        click.connect_pressed(move |_, _, x, y| {
            let width = widgets.plot_area.width() as f64;
            let height = widgets.plot_area.height() as f64;
            let chunk_id = {
                let state_ref = state.borrow();
                nearest_chunk_at(&state_ref, x, y, width, height, 12.0)
            };
            if let Some(chunk_id) = chunk_id {
                let mut state_ref = state.borrow_mut();
                state_ref.selected_chunk_id = Some(chunk_id.clone());
                drop(state_ref);
                let state_ref = state.borrow();
                refresh_files(&widgets, &state_ref, &state);
                refresh_selection(&widgets, &state_ref);
                widgets.plot_area.queue_draw();
            }
        });
        plot_area.add_controller(click);
    }
}

fn load_database(widgets: &AppWidgets, state: &Rc<RefCell<AppState>>, path: &Path) {
    match detect_and_load(path) {
        Ok(store) => {
            widgets.filter_entry.set_text("");
            {
                let mut state_ref = state.borrow_mut();
                state_ref.filter_text.clear();
                state_ref.store = Some(store);
                state_ref.selected_file = None;
                let selected_chunk = state_ref
                    .store
                    .as_ref()
                    .and_then(|store| store.chunks.first().map(|chunk| chunk.id.clone()));
                state_ref.selected_chunk_id = selected_chunk;
            }
            let state_ref = state.borrow();
            refresh_all(widgets, &state_ref, state);
        }
        Err(error) => {
            let mut state_ref = state.borrow_mut();
            state_ref.store = None;
            state_ref.selected_file = None;
            state_ref.selected_chunk_id = None;
            drop(state_ref);
            widgets.path_label.set_text(&format!(
                "Failed to open {}: {error:#}",
                path.to_string_lossy()
            ));
            widgets.adapter_label.set_text("");
            widgets.stats_label.set_text("");
            widgets.models_label.set_text("");
            widgets.backend_label.set_text("");
            widgets.notes_label.set_text("");
            widgets.selection_label.set_text("");
            widgets.meta_label.set_text("");
            widgets.tables_label.set_text("");
            widgets.chunk_view.buffer().set_text("");
            clear_list_box(&widgets.file_list);
            widgets.plot_area.queue_draw();
        }
    }
}

fn refresh_all(widgets: &AppWidgets, state: &AppState, shared_state: &Rc<RefCell<AppState>>) {
    refresh_overview(widgets, state);
    refresh_files(widgets, state, shared_state);
    refresh_selection(widgets, state);
    widgets.plot_area.queue_draw();
}

fn refresh_overview(widgets: &AppWidgets, state: &AppState) {
    if let Some(store) = state.store.as_ref() {
        widgets
            .path_label
            .set_text(&format!("Database: {}", store.db_path.display()));
        widgets
            .adapter_label
            .set_text(&format!("Adapter: {}", store.adapter_name));
        widgets.stats_label.set_text(&format!(
            "Files: {} Chunks: {} Embeddings: {} Dims: {}",
            store.metrics.total_files,
            store.metrics.total_chunks,
            store.metrics.embedding_rows,
            store
                .metrics
                .embedding_dims
                .map(|dims| dims.to_string())
                .unwrap_or_else(|| "unknown".to_string())
        ));
        widgets.models_label.set_text(&format!(
            "Models: {}",
            join_or_unknown(&store.metrics.models)
        ));
        widgets.backend_label.set_text(&format!(
            "Backend: {} Sources: {}",
            store
                .metrics
                .vector_backend
                .clone()
                .unwrap_or_else(|| "not inferred".to_string()),
            join_or_unknown(&store.metrics.sources)
        ));
        widgets
            .notes_label
            .set_text(&format!("Notes: {}", store.notes.join(" ")));
    }
}

fn refresh_files(widgets: &AppWidgets, state: &AppState, shared_state: &Rc<RefCell<AppState>>) {
    clear_list_box(&widgets.file_list);

    let reset_title = Label::builder()
        .xalign(0.0)
        .wrap(true)
        .selectable(false)
        .label(if state.selected_file.is_none() {
            "Showing all chunks"
        } else {
            "Show all chunks"
        })
        .build();
    let reset_detail = Label::builder()
        .xalign(0.0)
        .label("Reset file focus and return to the full semantic map")
        .build();
    let reset_box = GtkBox::new(Orientation::Vertical, 4);
    reset_box.append(&reset_title);
    reset_box.append(&reset_detail);
    reset_box.set_margin_top(8);
    reset_box.set_margin_bottom(8);
    reset_box.set_margin_start(8);
    reset_box.set_margin_end(8);
    let reset_widgets = widgets.clone();
    let reset_state = Rc::clone(shared_state);
    let reset_button = Button::new();
    reset_button.set_halign(Align::Fill);
    reset_button.set_hexpand(true);
    reset_button.add_css_class("flat");
    reset_button.set_child(Some(&reset_box));
    reset_button.connect_clicked(move |_| {
        {
            let mut state_ref = reset_state.borrow_mut();
            state_ref.selected_file = None;
        }
        let state_ref = reset_state.borrow();
        refresh_files(&reset_widgets, &state_ref, &reset_state);
        refresh_selection(&reset_widgets, &state_ref);
        reset_widgets.plot_area.queue_draw();
    });
    widgets.file_list.append(&reset_button);

    for file in filtered_files(state) {
        let title_text = if state.selected_file.as_deref() == Some(file.path.as_str()) {
            format!("Focused: {}", file.path)
        } else {
            file.path.clone()
        };
        let title = Label::builder()
            .xalign(0.0)
            .wrap(true)
            .selectable(false)
            .label(&title_text)
            .build();
        let detail = Label::builder()
            .xalign(0.0)
            .label(&format!(
                "{} chunks {} {}",
                file.chunk_count,
                file.size
                    .map(|size| format!("{} bytes", size))
                    .unwrap_or_else(|| "size unknown".to_string()),
                file.mtime_ms
                    .map(format_timestamp_ms)
                    .unwrap_or_else(|| "time unknown".to_string())
            ))
            .build();
        let path = file.path.clone();
        let widgets_for_click = widgets.clone();
        let state_for_click = Rc::clone(shared_state);
        let row_box = GtkBox::new(Orientation::Vertical, 4);
        row_box.append(&title);
        row_box.append(&detail);
        row_box.set_margin_top(8);
        row_box.set_margin_bottom(8);
        row_box.set_margin_start(8);
        row_box.set_margin_end(8);
        let row_button = Button::new();
        row_button.set_halign(Align::Fill);
        row_button.set_hexpand(true);
        row_button.add_css_class("flat");
        row_button.set_child(Some(&row_box));
        row_button.connect_clicked(move |_| {
            {
                let mut state_ref = state_for_click.borrow_mut();
                state_ref.selected_file = Some(path.clone());
                let selected_chunk = state_ref.store.as_ref().and_then(|store| {
                    store
                        .chunks
                        .iter()
                        .find(|chunk| chunk.path == path)
                        .map(|chunk| chunk.id.clone())
                });
                state_ref.selected_chunk_id = selected_chunk;
            }
            let state_ref = state_for_click.borrow();
            refresh_files(&widgets_for_click, &state_ref, &state_for_click);
            refresh_selection(&widgets_for_click, &state_ref);
            widgets_for_click.plot_area.queue_draw();
        });
        widgets.file_list.append(&row_button);
    }
}

fn refresh_selection(widgets: &AppWidgets, state: &AppState) {
    let Some(store) = state.store.as_ref() else {
        widgets.selection_label.set_text("");
        widgets.meta_label.set_text("");
        widgets.tables_label.set_text("");
        widgets.chunk_view.buffer().set_text("");
        return;
    };

    if let Some(chunk) = selected_chunk(state) {
        widgets.selection_label.set_text(&format!(
            "{}\n{}",
            chunk.path,
            preview_text(&chunk.text, 180)
        ));
        widgets.meta_label.set_text(&format!(
            "Chunk: {}\nSource: {}\nModel: {}\nLines: {}\nUpdated: {}\nEmbedding dims: {}",
            chunk.id,
            chunk
                .source
                .clone()
                .unwrap_or_else(|| "unknown".to_string()),
            chunk.model.clone().unwrap_or_else(|| "unknown".to_string()),
            format_line_range(chunk),
            chunk
                .updated_at_ms
                .map(format_timestamp_ms)
                .unwrap_or_else(|| "unknown".to_string()),
            chunk
                .embedding
                .as_ref()
                .map(|embedding| embedding.len().to_string())
                .unwrap_or_else(|| "none".to_string())
        ));
        widgets.chunk_view.buffer().set_text(&chunk.text);
    } else if let Some(file) = selected_file(state) {
        widgets.selection_label.set_text(&file.path);
        widgets.meta_label.set_text(&format!(
            "Source: {}\nChunks: {}\nSize: {}\nUpdated: {}",
            file.source.clone().unwrap_or_else(|| "unknown".to_string()),
            file.chunk_count,
            file.size
                .map(|size| format!("{} bytes", size))
                .unwrap_or_else(|| "unknown".to_string()),
            file.mtime_ms
                .map(format_timestamp_ms)
                .unwrap_or_else(|| "unknown".to_string())
        ));
        widgets.chunk_view.buffer().set_text(
            &store
                .chunks
                .iter()
                .filter(|chunk| chunk.path == file.path)
                .map(|chunk| preview_text(&chunk.text, 160))
                .take(3)
                .collect::<Vec<_>>()
                .join("\n\n"),
        );
    } else {
        widgets.selection_label.set_text("Nothing selected.");
        widgets.meta_label.set_text("");
        widgets.chunk_view.buffer().set_text("");
    }

    let table_summary = store
        .tables
        .iter()
        .take(10)
        .map(|table| {
            format!(
                "{} ({}) [{}]",
                table.name,
                table
                    .row_count
                    .map(|count| count.to_string())
                    .unwrap_or_else(|| "unreadable".to_string()),
                table.columns.join(", ")
            )
        })
        .collect::<Vec<_>>()
        .join("\n");
    widgets
        .tables_label
        .set_text(&format!("Tables:\n{}", table_summary));
}

fn filtered_files(state: &AppState) -> Vec<FileRecord> {
    let Some(store) = state.store.as_ref() else {
        return Vec::new();
    };
    let filter = state.filter_text.to_ascii_lowercase();
    store
        .files
        .iter()
        .filter(|file| {
            filter.is_empty()
                || file.path.to_ascii_lowercase().contains(&filter)
                || file
                    .source
                    .as_ref()
                    .is_some_and(|source| source.to_ascii_lowercase().contains(&filter))
                || store
                    .chunks
                    .iter()
                    .any(|chunk| chunk.path == file.path && chunk_matches_filter(chunk, &filter))
        })
        .cloned()
        .collect()
}

fn visible_points<'a>(state: &'a AppState) -> Vec<(&'a ChunkRecord, &'a ProjectedPoint)> {
    let Some(store) = state.store.as_ref() else {
        return Vec::new();
    };
    let filter = state.filter_text.to_ascii_lowercase();
    store
        .chunks
        .iter()
        .filter(|chunk| {
            state
                .selected_file
                .as_ref()
                .is_none_or(|selected_file| &chunk.path == selected_file)
                && (filter.is_empty() || chunk_matches_filter(chunk, &filter))
        })
        .filter_map(|chunk| {
            store
                .points
                .iter()
                .find(|point| point.chunk_id == chunk.id)
                .map(|point| (chunk, point))
        })
        .collect()
}
|
||||||
|
|
||||||
|
#[derive(Clone, Copy)]
struct RenderedPoint<'a> {
    chunk: &'a ChunkRecord,
    point: &'a ProjectedPoint,
    canvas_x: f64,
    canvas_y: f64,
    depth: f64,
    radius: f64,
}

fn rendered_points<'a>(state: &'a AppState, width: f64, height: f64) -> Vec<RenderedPoint<'a>> {
    let mut rendered = visible_points(state)
        .into_iter()
        .map(|(chunk, point)| {
            let (view_x, view_y, depth, scale) = rotated_view(point, state);
            let (canvas_x, canvas_y) = point_to_canvas(view_x, view_y, width, height);
            RenderedPoint {
                chunk,
                point,
                canvas_x,
                canvas_y,
                depth,
                radius: 3.5 + (depth + 1.0) * 2.2 * scale,
            }
        })
        .collect::<Vec<_>>();
    rendered.sort_by(|left, right| {
        left.depth
            .partial_cmp(&right.depth)
            .unwrap_or(std::cmp::Ordering::Equal)
    });
    rendered
}

fn draw_plot(context: &Context, width: i32, height: i32, state: &AppState) {
    context.set_source_rgb(0.97, 0.97, 0.95);
    let _ = context.paint();

    if state.store.is_none() {
        context.set_source_rgb(0.20, 0.20, 0.20);
        context.move_to(30.0, 40.0);
        let _ = context.show_text("Open a SQLite vector store to draw its chunk map.");
        return;
    }

    let points = rendered_points(state, width as f64, height as f64);
    let margin = 48.0;
    let plot_width = (width as f64 - margin * 2.0).max(1.0);
    let plot_height = (height as f64 - margin * 2.0).max(1.0);

    context.set_source_rgb(0.88, 0.88, 0.86);
    for step in 0..=4 {
        let ratio = step as f64 / 4.0;
        let x = margin + plot_width * ratio;
        let y = margin + plot_height * ratio;
        context.move_to(x, margin);
        context.line_to(x, margin + plot_height);
        context.move_to(margin, y);
        context.line_to(margin + plot_width, y);
    }
    let _ = context.stroke();

    context.set_source_rgb(0.25, 0.25, 0.25);
    draw_axis_guides(context, width as f64, height as f64, state);

    let selected_id = state.selected_chunk_id.as_deref();
    let semantic = points.iter().any(|point| point.point.from_embeddings);
    context.move_to(margin, 24.0);
    let _ = context.show_text(if semantic {
        "3D semantic view from embeddings"
    } else {
        "3D fallback layout because embeddings were unavailable"
    });

    for (index, rendered) in points.iter().enumerate() {
        let radius = if selected_id == Some(rendered.chunk.id.as_str()) {
            rendered.radius + 3.0
        } else {
            rendered.radius
        };
        let color = color_for_index(index, points.len().max(1), rendered.depth);
        context.set_source_rgba(
            color.0,
            color.1,
            color.2,
            0.55 + ((rendered.depth + 1.0) / 2.0) * 0.4,
        );
        context.arc(
            rendered.canvas_x,
            rendered.canvas_y,
            radius,
            0.0,
            std::f64::consts::TAU,
        );
        let _ = context.fill();
    }
}

fn nearest_chunk_at(
    state: &AppState,
    x: f64,
    y: f64,
    width: f64,
    height: f64,
    max_distance: f64,
) -> Option<String> {
    rendered_points(state, width, height)
        .into_iter()
        .map(|rendered| {
            let dx = rendered.canvas_x - x;
            let dy = rendered.canvas_y - y;
            (rendered.chunk.id.clone(), (dx * dx + dy * dy).sqrt())
        })
        .filter(|(_, distance)| *distance <= max_distance)
        .min_by(|left, right| {
            left.1
                .partial_cmp(&right.1)
                .unwrap_or(std::cmp::Ordering::Equal)
        })
        .map(|(chunk_id, _)| chunk_id)
}

fn point_to_canvas(view_x: f64, view_y: f64, width: f64, height: f64) -> (f64, f64) {
    let margin = 48.0;
    let plot_width = (width - margin * 2.0).max(1.0);
    let plot_height = (height - margin * 2.0).max(1.0);
    let x = margin + ((view_x + 1.0) / 2.0) * plot_width;
    let y = margin + (1.0 - (view_y + 1.0) / 2.0) * plot_height;
    (x, y)
}

fn rotated_view(point: &ProjectedPoint, state: &AppState) -> (f64, f64, f64, f64) {
    let yaw = state.yaw_deg.to_radians();
    let pitch = state.pitch_deg.to_radians();

    let x = point.x as f64;
    let y = point.y as f64;
    let z = point.z as f64;

    // Rotate around the vertical axis (yaw), then around the horizontal axis (pitch).
    let yaw_x = x * yaw.cos() + z * yaw.sin();
    let yaw_z = -x * yaw.sin() + z * yaw.cos();
    let pitch_y = y * pitch.cos() - yaw_z * pitch.sin();
    let pitch_z = y * pitch.sin() + yaw_z * pitch.cos();

    // Simple perspective divide: points nearer the camera are scaled up slightly.
    let camera_distance = 3.2;
    let perspective = camera_distance / (camera_distance - pitch_z * 0.9);
    let view_x = (yaw_x * perspective).clamp(-1.2, 1.2);
    let view_y = (pitch_y * perspective).clamp(-1.2, 1.2);
    (view_x, view_y, pitch_z.clamp(-1.0, 1.0), perspective)
}

fn draw_axis_guides(context: &Context, width: f64, height: f64, state: &AppState) {
    let axes = [
        ("X", 1.0_f32, 0.0_f32, 0.0_f32),
        ("Y", 0.0_f32, 1.0_f32, 0.0_f32),
        ("Z", 0.0_f32, 0.0_f32, 1.0_f32),
    ];
    let origin = ProjectedPoint {
        chunk_id: String::new(),
        x: 0.0,
        y: 0.0,
        z: 0.0,
        from_embeddings: true,
    };
    let (origin_x, origin_y, _, _) = rotated_view(&origin, state);
    let (origin_x, origin_y) = point_to_canvas(origin_x, origin_y, width, height);

    context.set_source_rgba(0.20, 0.20, 0.20, 0.7);
    for (label, x, y, z) in axes {
        let axis_point = ProjectedPoint {
            chunk_id: String::new(),
            x,
            y,
            z,
            from_embeddings: true,
        };
        let (axis_x, axis_y, _, _) = rotated_view(&axis_point, state);
        let (axis_x, axis_y) = point_to_canvas(axis_x, axis_y, width, height);
        context.move_to(origin_x, origin_y);
        context.line_to(axis_x, axis_y);
        let _ = context.stroke();
        context.move_to(axis_x + 4.0, axis_y - 4.0);
        let _ = context.show_text(label);
    }
}

fn selected_chunk<'a>(state: &'a AppState) -> Option<&'a ChunkRecord> {
    let store = state.store.as_ref()?;
    let selected_id = state.selected_chunk_id.as_ref()?;
    store.chunks.iter().find(|chunk| &chunk.id == selected_id)
}

fn selected_file<'a>(state: &'a AppState) -> Option<&'a FileRecord> {
    let store = state.store.as_ref()?;
    let selected_path = state.selected_file.as_ref()?;
    store.files.iter().find(|file| &file.path == selected_path)
}

fn chunk_matches_filter(chunk: &ChunkRecord, filter: &str) -> bool {
    chunk.path.to_ascii_lowercase().contains(filter)
        || chunk.text.to_ascii_lowercase().contains(filter)
        || chunk
            .model
            .as_ref()
            .is_some_and(|model| model.to_ascii_lowercase().contains(filter))
        || chunk
            .source
            .as_ref()
            .is_some_and(|source| source.to_ascii_lowercase().contains(filter))
}

fn clear_list_box(list_box: &ListBox) {
    while let Some(child) = list_box.first_child() {
        list_box.remove(&child);
    }
}

fn preview_text(text: &str, max_chars: usize) -> String {
    let compact = text.replace("\r\n", "\n");
    let trimmed = compact.trim();
    if trimmed.chars().count() <= max_chars {
        trimmed.to_string()
    } else {
        format!("{}...", trimmed.chars().take(max_chars).collect::<String>())
    }
}

fn format_line_range(chunk: &ChunkRecord) -> String {
    match (chunk.start_line, chunk.end_line) {
        (Some(start), Some(end)) if start == end => start.to_string(),
        (Some(start), Some(end)) => format!("{start}-{end}"),
        _ => "unknown".to_string(),
    }
}

fn format_timestamp_ms(value: i64) -> String {
    DateTime::<Utc>::from_timestamp_millis(value)
        .map(|time| time.format("%Y-%m-%d %H:%M:%S UTC").to_string())
        .unwrap_or_else(|| value.to_string())
}

fn join_or_unknown(values: &[String]) -> String {
    if values.is_empty() {
        "unknown".to_string()
    } else {
        values.join(", ")
    }
}

fn color_for_index(index: usize, total: usize, depth: f64) -> (f64, f64, f64) {
    let hue = (index as f64 / total as f64) * 0.78;
    let value = 0.55 + ((depth + 1.0) / 2.0) * 0.30;
    hsv_to_rgb(hue, 0.70, value)
}

fn hsv_to_rgb(hue: f64, saturation: f64, value: f64) -> (f64, f64, f64) {
    let section = (hue * 6.0).floor();
    let fraction = hue * 6.0 - section;
    let p = value * (1.0 - saturation);
    let q = value * (1.0 - fraction * saturation);
    let t = value * (1.0 - (1.0 - fraction) * saturation);
    match section as i32 % 6 {
        0 => (value, t, p),
        1 => (q, value, p),
        2 => (p, value, t),
        3 => (p, q, value),
        4 => (t, p, value),
        _ => (value, p, q),
    }
}

fn startup_path() -> Option<PathBuf> {
    let default = current_directory().join("main.sqlite");
    default.exists().then_some(default)
}

fn current_directory() -> PathBuf {
    std::env::current_dir().unwrap_or_else(|_| PathBuf::from("."))
}
223
src/projection.rs
Normal file
@@ -0,0 +1,223 @@
use crate::store::ChunkRecord;

#[derive(Clone, Debug)]
pub struct ProjectedPoint {
    pub chunk_id: String,
    pub x: f32,
    pub y: f32,
    pub z: f32,
    pub from_embeddings: bool,
}

/// Project chunks into 3D: a PCA-style projection when enough same-sized
/// embeddings are present, otherwise a deterministic grid fallback.
pub fn project_chunks(chunks: &[ChunkRecord]) -> Vec<ProjectedPoint> {
    let dims = dominant_embedding_dimension(chunks);
    if let Some(dims) = dims {
        let embedding_points = chunks
            .iter()
            .filter_map(|chunk| {
                chunk.embedding.as_ref().and_then(|embedding| {
                    (embedding.len() == dims).then(|| (chunk.id.clone(), embedding.clone()))
                })
            })
            .collect::<Vec<_>>();

        if embedding_points.len() >= 3 {
            let embeddings = embedding_points
                .iter()
                .map(|(_, embedding)| embedding.clone())
                .collect::<Vec<_>>();
            let (component_x, component_y, component_z, mean) = principal_components(&embeddings);
            let mut scores = embedding_points
                .iter()
                .map(|(chunk_id, embedding)| {
                    let centered = embedding
                        .iter()
                        .zip(mean.iter())
                        .map(|(value, avg)| *value - *avg)
                        .collect::<Vec<_>>();
                    let x = dot(&centered, &component_x);
                    let y = dot(&centered, &component_y);
                    let z = dot(&centered, &component_z);
                    (chunk_id.clone(), x, y, z)
                })
                .collect::<Vec<_>>();
            normalize_scores(&mut scores);

            let mut projected = scores
                .into_iter()
                .map(|(chunk_id, x, y, z)| ProjectedPoint {
                    chunk_id,
                    x,
                    y,
                    z,
                    from_embeddings: true,
                })
                .collect::<Vec<_>>();
            let existing_ids = projected
                .iter()
                .map(|point| point.chunk_id.clone())
                .collect::<std::collections::BTreeSet<_>>();
            projected.extend(
                chunks
                    .iter()
                    .filter(|chunk| !existing_ids.contains(&chunk.id))
                    .enumerate()
                    .map(|(index, chunk)| fallback_point(chunk, index, chunks.len())),
            );
            return projected;
        }
    }

    chunks
        .iter()
        .enumerate()
        .map(|(index, chunk)| fallback_point(chunk, index, chunks.len()))
        .collect()
}

fn fallback_point(chunk: &ChunkRecord, index: usize, total: usize) -> ProjectedPoint {
    let columns = (total as f32).sqrt().ceil().max(1.0) as usize;
    let row = index / columns;
    let column = index % columns;
    let width = columns.max(1) as f32;
    let height = ((total + columns - 1) / columns).max(1) as f32;
    let x = if width <= 1.0 {
        0.0
    } else {
        (column as f32 / (width - 1.0)) * 2.0 - 1.0
    };
    let y = if height <= 1.0 {
        0.0
    } else {
        (row as f32 / (height - 1.0)) * 2.0 - 1.0
    };
    let z = if total <= 1 {
        0.0
    } else {
        (index as f32 / (total as f32 - 1.0)) * 2.0 - 1.0
    };
    ProjectedPoint {
        chunk_id: chunk.id.clone(),
        x,
        y,
        z,
        from_embeddings: false,
    }
}

fn dominant_embedding_dimension(chunks: &[ChunkRecord]) -> Option<usize> {
    let mut dims = std::collections::BTreeMap::<usize, usize>::new();
    for embedding in chunks.iter().filter_map(|chunk| chunk.embedding.as_ref()) {
        *dims.entry(embedding.len()).or_default() += 1;
    }
    dims.into_iter()
        .max_by_key(|(_, count)| *count)
        .map(|(dims, _)| dims)
}

fn principal_components(rows: &[Vec<f32>]) -> (Vec<f32>, Vec<f32>, Vec<f32>, Vec<f32>) {
    let dims = rows.first().map(|row| row.len()).unwrap_or(0);
    let mean = mean_vector(rows, dims);
    let first = power_iteration(rows, &mean, &[]);
    let second = power_iteration(rows, &mean, &[first.as_slice()]);
    let third = power_iteration(rows, &mean, &[first.as_slice(), second.as_slice()]);
    (first, second, third, mean)
}

fn mean_vector(rows: &[Vec<f32>], dims: usize) -> Vec<f32> {
    let mut mean = vec![0.0; dims];
    for row in rows {
        for (index, value) in row.iter().enumerate() {
            mean[index] += *value;
        }
    }
    if !rows.is_empty() {
        let scale = 1.0 / rows.len() as f32;
        for value in &mut mean {
            *value *= scale;
        }
    }
    mean
}

/// Power iteration against the (implicit) covariance matrix, deflated against
/// any previously found components so each call yields the next orthogonal direction.
fn power_iteration(rows: &[Vec<f32>], mean: &[f32], orthogonal_to: &[&[f32]]) -> Vec<f32> {
    let dims = mean.len();
    let mut vector = vec![0.0; dims];
    if dims > 0 {
        vector[0] = 1.0;
    }

    for _ in 0..24 {
        let mut next = covariance_mul(rows, mean, &vector);
        for orthogonal in orthogonal_to {
            orthogonalize(&mut next, orthogonal);
        }
        let norm = l2_norm(&next);
        if norm <= f32::EPSILON {
            break;
        }
        for value in &mut next {
            *value /= norm;
        }
        vector = next;
    }

    if l2_norm(&vector) <= f32::EPSILON && dims > 1 {
        vector[1] = 1.0;
    }
    vector
}

fn covariance_mul(rows: &[Vec<f32>], mean: &[f32], vector: &[f32]) -> Vec<f32> {
    let mut accum = vec![0.0; mean.len()];
    for row in rows {
        let centered = row
            .iter()
            .zip(mean.iter())
            .map(|(value, avg)| *value - *avg)
            .collect::<Vec<_>>();
        let score = dot(&centered, vector);
        for (index, value) in centered.iter().enumerate() {
            accum[index] += *value * score;
        }
    }
    accum
}

fn normalize_scores(scores: &mut [(String, f32, f32, f32)]) {
    let max_x = scores
        .iter()
        .map(|(_, x, _, _)| x.abs())
        .fold(0.0_f32, f32::max)
        .max(1.0);
    let max_y = scores
        .iter()
        .map(|(_, _, y, _)| y.abs())
        .fold(0.0_f32, f32::max)
        .max(1.0);
    let max_z = scores
        .iter()
        .map(|(_, _, _, z)| z.abs())
        .fold(0.0_f32, f32::max)
        .max(1.0);
    for (_, x, y, z) in scores {
        *x /= max_x;
        *y /= max_y;
        *z /= max_z;
    }
}

fn orthogonalize(vector: &mut [f32], basis: &[f32]) {
    let projection = dot(vector, basis);
    for (value, basis_value) in vector.iter_mut().zip(basis.iter()) {
        *value -= basis_value * projection;
    }
}

fn dot(left: &[f32], right: &[f32]) -> f32 {
    left.iter().zip(right.iter()).map(|(a, b)| a * b).sum()
}

fn l2_norm(values: &[f32]) -> f32 {
    dot(values, values).sqrt()
}
764
src/store.rs
Normal file
@@ -0,0 +1,764 @@
#![allow(dead_code)]

use std::collections::{BTreeMap, BTreeSet, HashMap};
use std::path::{Path, PathBuf};

use anyhow::{Context, Result, anyhow};
use rusqlite::types::ValueRef;
use rusqlite::{Connection, Row};
use serde::Deserialize;

use crate::projection::{ProjectedPoint, project_chunks};

#[derive(Clone, Debug)]
pub struct LoadedStore {
    pub db_path: PathBuf,
    pub adapter_name: String,
    pub metrics: OverviewMetrics,
    pub tables: Vec<TableSummary>,
    pub files: Vec<FileRecord>,
    pub chunks: Vec<ChunkRecord>,
    pub points: Vec<ProjectedPoint>,
    pub notes: Vec<String>,
}

#[derive(Clone, Debug, Default)]
pub struct OverviewMetrics {
    pub total_files: usize,
    pub total_chunks: usize,
    pub embedding_rows: usize,
    pub embedding_dims: Option<usize>,
    pub vector_backend: Option<String>,
    pub models: Vec<String>,
    pub sources: Vec<String>,
}

#[derive(Clone, Debug)]
pub struct TableSummary {
    pub name: String,
    pub columns: Vec<String>,
    pub create_sql: Option<String>,
    pub row_count: Option<i64>,
}

#[derive(Clone, Debug)]
pub struct FileRecord {
    pub path: String,
    pub source: Option<String>,
    pub size: Option<i64>,
    pub mtime_ms: Option<i64>,
    pub chunk_count: usize,
}

#[derive(Clone, Debug)]
pub struct ChunkRecord {
    pub id: String,
    pub path: String,
    pub source: Option<String>,
    pub model: Option<String>,
    pub start_line: Option<i64>,
    pub end_line: Option<i64>,
    pub updated_at_ms: Option<i64>,
    pub text: String,
    pub embedding: Option<Vec<f32>>,
}

pub fn detect_and_load(db_path: &Path) -> Result<LoadedStore> {
    let connection = Connection::open(db_path)
        .with_context(|| format!("failed to open SQLite database at {}", db_path.display()))?;
    let schema = SchemaSnapshot::read(&connection)?;
    let adapters: Vec<Box<dyn VectorStoreAdapter>> =
        vec![Box::new(OpenClawAdapter), Box::new(GenericSqliteAdapter)];

    let adapter = adapters
        .into_iter()
        .find(|adapter| adapter.detect(&schema))
        .ok_or_else(|| anyhow!("no compatible adapter found"))?;

    adapter.load(&connection, db_path, &schema)
}

trait VectorStoreAdapter {
    fn name(&self) -> &'static str;
    fn detect(&self, schema: &SchemaSnapshot) -> bool;
    fn load(
        &self,
        connection: &Connection,
        db_path: &Path,
        schema: &SchemaSnapshot,
    ) -> Result<LoadedStore>;
}

struct OpenClawAdapter;

impl VectorStoreAdapter for OpenClawAdapter {
    fn name(&self) -> &'static str {
        "OpenClaw Memory"
    }

    fn detect(&self, schema: &SchemaSnapshot) -> bool {
        schema.has_table_with_columns(
            "chunks",
            &[
                "id",
                "path",
                "source",
                "start_line",
                "end_line",
                "model",
                "text",
                "embedding",
            ],
        ) && schema.has_table_with_columns("files", &["path", "source", "size", "mtime"])
    }

    fn load(
        &self,
        connection: &Connection,
        db_path: &Path,
        schema: &SchemaSnapshot,
    ) -> Result<LoadedStore> {
        let meta = load_openclaw_meta(connection)?;
        let mut files = load_openclaw_files(connection)?;
        let chunks = load_openclaw_chunks(connection)?;
        files.sort_by(|left, right| {
            right
                .chunk_count
                .cmp(&left.chunk_count)
                .then_with(|| left.path.cmp(&right.path))
        });

        let metrics = build_metrics(
            &files,
            &chunks,
            meta.as_ref().and_then(|meta| meta.vector_dims),
            schema,
        );
        let notes = build_openclaw_notes(schema, meta.as_ref(), &metrics);
        let tables = schema.to_summaries(connection);
        let points = project_chunks(&chunks);

        Ok(LoadedStore {
            db_path: db_path.to_path_buf(),
            adapter_name: self.name().to_string(),
            metrics,
            tables,
            files,
            chunks,
            points,
            notes,
        })
    }
}

struct GenericSqliteAdapter;

impl VectorStoreAdapter for GenericSqliteAdapter {
    fn name(&self) -> &'static str {
        "Generic SQLite Vector Store"
    }

    fn detect(&self, schema: &SchemaSnapshot) -> bool {
        choose_content_mapping(schema).is_some()
    }

    fn load(
        &self,
        connection: &Connection,
        db_path: &Path,
        schema: &SchemaSnapshot,
    ) -> Result<LoadedStore> {
        let mapping = choose_content_mapping(schema)
            .ok_or_else(|| anyhow!("unable to find a chunk/content table"))?;
        let chunks = load_generic_chunks(connection, &mapping)?;
        let files = load_generic_files(connection, schema, &mapping, &chunks)?;
        let metrics = build_metrics(&files, &chunks, None, schema);
        let mut notes = vec![format!(
            "Detected chunk-like table `{}` using heuristic column matching.",
            mapping.table_name
        )];
        if let Some(vector_backend) = metrics.vector_backend.as_ref() {
            notes.push(format!(
                "Vector backend inferred from schema artifacts: {}.",
                vector_backend
            ));
        }
        let tables = schema.to_summaries(connection);
        let points = project_chunks(&chunks);

        Ok(LoadedStore {
            db_path: db_path.to_path_buf(),
            adapter_name: self.name().to_string(),
            metrics,
            tables,
            files,
            chunks,
            points,
            notes,
        })
    }
}

#[derive(Clone, Debug)]
struct SchemaSnapshot {
    tables: Vec<TableSchema>,
}

#[derive(Clone, Debug)]
struct TableSchema {
    name: String,
    columns: Vec<String>,
    create_sql: Option<String>,
}

impl SchemaSnapshot {
    fn read(connection: &Connection) -> Result<Self> {
        let mut statement = connection.prepare(
            "SELECT name, sql FROM sqlite_master WHERE type IN ('table', 'view') ORDER BY name",
        )?;
        let raw_tables = statement.query_map([], |row| {
            Ok((row.get::<_, String>(0)?, row.get::<_, Option<String>>(1)?))
        })?;

        let mut tables = Vec::new();
        for table in raw_tables {
            let (name, create_sql) = table?;
            let columns = read_columns(connection, &name).unwrap_or_default();
            tables.push(TableSchema {
                name,
                columns,
                create_sql,
            });
        }
        Ok(Self { tables })
    }

    fn has_table_with_columns(&self, table_name: &str, required: &[&str]) -> bool {
        self.table(table_name).is_some_and(|table| {
            required
                .iter()
                .all(|required_name| table.has_column(required_name))
        })
    }

    fn table(&self, table_name: &str) -> Option<&TableSchema> {
        self.tables.iter().find(|table| table.name == table_name)
    }

    fn to_summaries(&self, connection: &Connection) -> Vec<TableSummary> {
        self.tables
            .iter()
            .map(|table| TableSummary {
                name: table.name.clone(),
                columns: table.columns.clone(),
                create_sql: table.create_sql.clone(),
                row_count: count_rows(connection, &table.name).ok(),
            })
            .collect()
    }
}

impl TableSchema {
    fn has_column(&self, column_name: &str) -> bool {
        self.columns.iter().any(|column| column == column_name)
    }
}

#[derive(Debug, Deserialize)]
struct OpenClawMeta {
    #[serde(default)]
    model: Option<String>,
    #[serde(default)]
    provider: Option<String>,
    #[serde(rename = "vectorDims", default)]
    vector_dims: Option<usize>,
}

fn load_openclaw_meta(connection: &Connection) -> Result<Option<OpenClawMeta>> {
    let raw_value = connection
        .query_row(
            "SELECT value FROM meta WHERE key = 'memory_index_meta_v1'",
            [],
            |row| row.get::<_, String>(0),
        )
        .ok();

    raw_value
        .map(|value| serde_json::from_str::<OpenClawMeta>(&value).context("failed to parse meta"))
        .transpose()
}

fn load_openclaw_files(connection: &Connection) -> Result<Vec<FileRecord>> {
    let mut statement = connection.prepare(
        "SELECT
            f.path,
            f.source,
            f.size,
            CAST(f.mtime AS INTEGER) AS mtime_ms,
            COUNT(c.id) AS chunk_count
        FROM files f
        LEFT JOIN chunks c ON c.path = f.path
        GROUP BY f.path, f.source, f.size, f.mtime
        ORDER BY f.path",
    )?;
    let rows = statement.query_map([], |row| {
        Ok(FileRecord {
            path: row.get(0)?,
            source: row.get(1)?,
            size: row.get(2)?,
            mtime_ms: row.get(3)?,
            chunk_count: row.get::<_, i64>(4).unwrap_or_default().max(0) as usize,
        })
    })?;
    rows.collect::<rusqlite::Result<Vec<_>>>()
        .map_err(Into::into)
}

fn load_openclaw_chunks(connection: &Connection) -> Result<Vec<ChunkRecord>> {
    let mut statement = connection.prepare(
        "SELECT
            id,
            path,
            source,
            model,
            start_line,
            end_line,
            updated_at,
            text,
            embedding
        FROM chunks
        ORDER BY path, start_line, end_line, id",
    )?;
    let rows = statement.query_map([], decode_chunk_row)?;
    rows.collect::<rusqlite::Result<Vec<_>>>()
        .map_err(Into::into)
}

fn read_columns(connection: &Connection, table_name: &str) -> Result<Vec<String>> {
    let pragma = format!("PRAGMA table_info({})", quote_ident(table_name));
    let mut statement = connection.prepare(&pragma)?;
    let rows = statement.query_map([], |row| row.get::<_, String>(1))?;
    rows.collect::<rusqlite::Result<Vec<_>>>()
        .map_err(Into::into)
}

fn count_rows(connection: &Connection, table_name: &str) -> Result<i64> {
    let sql = format!("SELECT COUNT(*) FROM {}", quote_ident(table_name));
    connection
        .query_row(&sql, [], |row| row.get::<_, i64>(0))
        .map_err(Into::into)
}

fn build_openclaw_notes(
|
||||||
|
schema: &SchemaSnapshot,
|
||||||
|
meta: Option<&OpenClawMeta>,
|
||||||
|
metrics: &OverviewMetrics,
|
||||||
|
) -> Vec<String> {
|
||||||
|
let mut notes = Vec::new();
|
||||||
|
if let Some(meta) = meta {
|
||||||
|
if let (Some(provider), Some(model)) = (&meta.provider, &meta.model) {
|
||||||
|
notes.push(format!(
|
||||||
|
"Memory index metadata reports provider `{}` with embedding model `{}`.",
|
||||||
|
provider, model
|
||||||
|
));
|
||||||
|
}
|
||||||
|
}
|
||||||
|
if schema.table("chunks_fts").is_some() {
|
||||||
|
notes.push(
|
||||||
|
"FTS5 is present, so the store can support lexical search alongside vectors."
|
||||||
|
.to_string(),
|
||||||
|
);
|
||||||
|
}
|
||||||
|
if schema.table("chunks_vec").is_some() {
|
||||||
|
notes.push("`sqlite-vec` artifacts are present, which aligns with OpenClaw's hybrid search layout.".to_string());
|
||||||
|
}
|
||||||
|
notes.push(format!(
|
||||||
|
"Loaded {} files and {} chunks from the memory index.",
|
||||||
|
metrics.total_files, metrics.total_chunks
|
||||||
|
));
|
||||||
|
notes
|
||||||
|
}

/// Aggregates models, sources, and embedding dimensions across all rows.
fn build_metrics(
    files: &[FileRecord],
    chunks: &[ChunkRecord],
    meta_dims: Option<usize>,
    schema: &SchemaSnapshot,
) -> OverviewMetrics {
    let mut models = BTreeSet::new();
    let mut sources = BTreeSet::new();
    let mut dims = BTreeMap::<usize, usize>::new();

    for chunk in chunks {
        if let Some(model) = chunk.model.as_ref() {
            models.insert(model.clone());
        }
        if let Some(source) = chunk.source.as_ref() {
            sources.insert(source.clone());
        }
        if let Some(embedding) = chunk.embedding.as_ref() {
            *dims.entry(embedding.len()).or_default() += 1;
        }
    }
    for file in files {
        if let Some(source) = file.source.as_ref() {
            sources.insert(source.clone());
        }
    }

    let vector_backend = detect_vector_backend(schema);
    // Prefer the dimension recorded in metadata; otherwise use the most
    // common embedding length observed in the data.
    let embedding_dims = meta_dims.or_else(|| {
        dims.into_iter()
            .max_by_key(|(_, count)| *count)
            .map(|(dims, _)| dims)
    });

    OverviewMetrics {
        total_files: files.len(),
        total_chunks: chunks.len(),
        embedding_rows: chunks
            .iter()
            .filter(|chunk| chunk.embedding.is_some())
            .count(),
        embedding_dims,
        vector_backend,
        models: models.into_iter().collect(),
        sources: sources.into_iter().collect(),
    }
}

/// Infers search backends from table DDL: `vec0`/`vss0` virtual tables mark
/// vector extensions, while an `fts`-named table marks FTS5.
fn detect_vector_backend(schema: &SchemaSnapshot) -> Option<String> {
    let mut backends = Vec::new();
    for table in &schema.tables {
        let sql = table
            .create_sql
            .as_deref()
            .unwrap_or_default()
            .to_ascii_lowercase();
        if sql.contains("using vec0") {
            backends.push("sqlite-vec".to_string());
        } else if sql.contains("using vss0") {
            backends.push("sqlite-vss".to_string());
        } else if table.name.contains("fts") {
            backends.push("fts5".to_string());
        }
    }
    backends.sort();
    backends.dedup();
    (!backends.is_empty()).then(|| backends.join(" + "))
}

/// Column mapping for a generic chunk-like table discovered by heuristics.
#[derive(Clone, Debug)]
struct GenericContentMapping {
    table_name: String,
    id_column: Option<String>,
    path_column: Option<String>,
    text_column: String,
    source_column: Option<String>,
    model_column: Option<String>,
    start_line_column: Option<String>,
    end_line_column: Option<String>,
    updated_at_column: Option<String>,
    embedding_column: Option<String>,
}

/// Scores every table for chunk-like columns and returns the best candidate.
/// A text column is required; an embedding column is weighted most heavily.
fn choose_content_mapping(schema: &SchemaSnapshot) -> Option<GenericContentMapping> {
    let mut best_mapping = None;
    let mut best_score = i32::MIN;

    for table in &schema.tables {
        let lower = table
            .columns
            .iter()
            .map(|column| column.to_ascii_lowercase())
            .collect::<Vec<_>>();
        let text_column = find_column(
            &lower,
            &["text", "content", "chunk_text", "body", "document", "payload"],
        );
        let embedding_column = find_column(
            &lower,
            &["embedding", "vector", "embedding_json", "embedding_blob"],
        );
        let id_column = find_column(&lower, &["id", "chunk_id", "uuid"]);
        let path_column = find_column(
            &lower,
            &[
                "path",
                "file_path",
                "document_path",
                "source_path",
                "uri",
                "doc_id",
                "document_id",
            ],
        );

        if let Some(text_index) = text_column {
            let mut score = 15;
            if embedding_column.is_some() {
                score += 20;
            }
            if id_column.is_some() {
                score += 4;
            }
            if path_column.is_some() {
                score += 4;
            }
            if table.name.contains("chunk") || table.name.contains("embedding") {
                score += 3;
            }

            if score > best_score {
                best_score = score;
                best_mapping = Some(GenericContentMapping {
                    table_name: table.name.clone(),
                    id_column: id_column.map(|index| table.columns[index].clone()),
                    path_column: path_column.map(|index| table.columns[index].clone()),
                    text_column: table.columns[text_index].clone(),
                    source_column: find_column(&lower, &["source", "provider", "namespace"])
                        .map(|index| table.columns[index].clone()),
                    model_column: find_column(&lower, &["model", "embedding_model"])
                        .map(|index| table.columns[index].clone()),
                    start_line_column: find_column(&lower, &["start_line", "line_start"])
                        .map(|index| table.columns[index].clone()),
                    end_line_column: find_column(&lower, &["end_line", "line_end"])
                        .map(|index| table.columns[index].clone()),
                    updated_at_column: find_column(&lower, &["updated_at", "mtime", "created_at"])
                        .map(|index| table.columns[index].clone()),
                    embedding_column: embedding_column
                        .map(|index| table.columns[index].clone()),
                });
            }
        }
    }

    best_mapping
}
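The scoring heuristic above can be illustrated in isolation. This is a minimal standalone sketch using hypothetical stand-in tables (plain name/column tuples, not the real `SchemaSnapshot` type) to show why a table holding both a text and an embedding column beats a text-only table:

```rust
fn main() {
    // Hypothetical stand-ins for schema tables, for illustration only.
    let tables: &[(&str, &[&str])] = &[
        ("notes", &["id", "body"]),                       // text only
        ("chunks", &["id", "path", "text", "embedding"]), // text + embedding
    ];

    let mut best: Option<(&str, i32)> = None;
    for (name, columns) in tables {
        let has = |candidates: &[&str]| candidates.iter().any(|c| columns.contains(c));
        if !has(&["text", "content", "chunk_text", "body", "document", "payload"]) {
            continue; // a text column is mandatory
        }
        // Same weights as choose_content_mapping.
        let mut score = 15;
        if has(&["embedding", "vector"]) { score += 20; }
        if has(&["id", "chunk_id", "uuid"]) { score += 4; }
        if has(&["path", "file_path", "uri"]) { score += 4; }
        if name.contains("chunk") || name.contains("embedding") { score += 3; }
        if best.map_or(true, |(_, b)| score > b) {
            best = Some((name, score));
        }
    }
    // "chunks" scores 15 + 20 + 4 + 4 + 3 = 46 vs. 15 + 4 = 19 for "notes".
    assert_eq!(best, Some(("chunks", 46)));
}
```

Note that ties keep the earlier table, since a later table must strictly exceed the best score.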

/// Returns the index of the first candidate name present in `columns`,
/// honoring the candidates' priority order.
fn find_column(columns: &[String], candidates: &[&str]) -> Option<usize> {
    candidates
        .iter()
        .find_map(|candidate| columns.iter().position(|column| column == candidate))
}

/// Loads chunk rows from a generically mapped table, substituting NULL for
/// any column the mapping could not identify.
fn load_generic_chunks(
    connection: &Connection,
    mapping: &GenericContentMapping,
) -> Result<Vec<ChunkRecord>> {
    let sql = format!(
        "SELECT
            {id_expr} AS item_id,
            {path_expr} AS item_path,
            {source_expr} AS item_source,
            {model_expr} AS item_model,
            {start_expr} AS item_start_line,
            {end_expr} AS item_end_line,
            {updated_expr} AS item_updated_at,
            {text_expr} AS item_text,
            {embedding_expr} AS item_embedding
        FROM {table_name}",
        id_expr = mapping
            .id_column
            .as_ref()
            .map(|column| quote_ident(column))
            .unwrap_or_else(|| "CAST(rowid AS TEXT)".to_string()),
        path_expr = nullable_ident(mapping.path_column.as_ref()),
        source_expr = nullable_ident(mapping.source_column.as_ref()),
        model_expr = nullable_ident(mapping.model_column.as_ref()),
        start_expr = nullable_ident(mapping.start_line_column.as_ref()),
        end_expr = nullable_ident(mapping.end_line_column.as_ref()),
        updated_expr = nullable_ident(mapping.updated_at_column.as_ref()),
        text_expr = quote_ident(&mapping.text_column),
        embedding_expr = nullable_ident(mapping.embedding_column.as_ref()),
        table_name = quote_ident(&mapping.table_name),
    );

    let mut statement = connection.prepare(&sql)?;
    let rows = statement.query_map([], |row| {
        let id = row.get::<_, String>(0)?;
        // Fall back to a synthetic "table#id" path when no path column exists.
        let path = row
            .get::<_, Option<String>>(1)?
            .unwrap_or_else(|| format!("{}#{}", mapping.table_name, id));
        let source = row.get(2)?;
        let model = row.get(3)?;
        let start_line = row.get(4)?;
        let end_line = row.get(5)?;
        let updated_at_ms = row.get(6)?;
        let text = row.get::<_, String>(7)?;
        let embedding = decode_embedding_value(row, 8)?;
        Ok(ChunkRecord {
            id,
            path,
            source,
            model,
            start_line,
            end_line,
            updated_at_ms,
            text,
            embedding,
        })
    })?;
    rows.collect::<rusqlite::Result<Vec<_>>>()
        .map_err(Into::into)
}

/// Loads file rows from a detected file table, or synthesizes one file record
/// per distinct chunk path when no such table exists.
fn load_generic_files(
    connection: &Connection,
    schema: &SchemaSnapshot,
    _mapping: &GenericContentMapping,
    chunks: &[ChunkRecord],
) -> Result<Vec<FileRecord>> {
    if let Some(file_table) = choose_file_table(schema) {
        let sql = format!(
            "SELECT
                {path_expr} AS file_path,
                {source_expr} AS file_source,
                {size_expr} AS file_size,
                {mtime_expr} AS file_mtime
            FROM {table_name}",
            path_expr = quote_ident(&file_table.path_column),
            source_expr = nullable_ident(file_table.source_column.as_ref()),
            size_expr = nullable_ident(file_table.size_column.as_ref()),
            mtime_expr = nullable_ident(file_table.mtime_column.as_ref()),
            table_name = quote_ident(&file_table.table_name),
        );
        let chunk_counts = group_chunk_counts(chunks);
        let mut statement = connection.prepare(&sql)?;
        let rows = statement.query_map([], |row| {
            let path = row.get::<_, String>(0)?;
            Ok(FileRecord {
                chunk_count: *chunk_counts.get(&path).unwrap_or(&0),
                path,
                source: row.get(1)?,
                size: row.get(2)?,
                mtime_ms: row.get(3)?,
            })
        })?;
        let mut files = rows.collect::<rusqlite::Result<Vec<_>>>()?;
        files.sort_by(|left, right| left.path.cmp(&right.path));
        return Ok(files);
    }

    let chunk_counts = group_chunk_counts(chunks);
    let mut files = chunk_counts
        .into_iter()
        .map(|(path, chunk_count)| FileRecord {
            path,
            source: None,
            size: None,
            mtime_ms: None,
            chunk_count,
        })
        .collect::<Vec<_>>();
    files.sort_by(|left, right| left.path.cmp(&right.path));
    Ok(files)
}

/// Column mapping for a file-level table discovered by heuristics.
#[derive(Clone, Debug)]
struct FileTableMapping {
    table_name: String,
    path_column: String,
    source_column: Option<String>,
    size_column: Option<String>,
    mtime_column: Option<String>,
}

/// Picks the first table that has a path-like column and either a file-ish
/// name or size/mtime columns.
fn choose_file_table(schema: &SchemaSnapshot) -> Option<FileTableMapping> {
    schema.tables.iter().find_map(|table| {
        let lower = table
            .columns
            .iter()
            .map(|column| column.to_ascii_lowercase())
            .collect::<Vec<_>>();
        let path_column = find_column(
            &lower,
            &["path", "file_path", "document_path", "source_path", "uri"],
        )?;
        let has_size = find_column(&lower, &["size", "byte_size"]);
        let has_mtime = find_column(&lower, &["mtime", "updated_at", "modified_at"]);
        if table.name.contains("file") || has_size.is_some() || has_mtime.is_some() {
            Some(FileTableMapping {
                table_name: table.name.clone(),
                path_column: table.columns[path_column].clone(),
                source_column: find_column(&lower, &["source", "provider", "namespace"])
                    .map(|index| table.columns[index].clone()),
                size_column: has_size.map(|index| table.columns[index].clone()),
                mtime_column: has_mtime.map(|index| table.columns[index].clone()),
            })
        } else {
            None
        }
    })
}

/// Counts chunks per file path.
fn group_chunk_counts(chunks: &[ChunkRecord]) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for chunk in chunks {
        *counts.entry(chunk.path.clone()).or_default() += 1;
    }
    counts
}

/// Decodes one row of the OpenClaw `chunks` query into a `ChunkRecord`.
fn decode_chunk_row(row: &Row<'_>) -> rusqlite::Result<ChunkRecord> {
    Ok(ChunkRecord {
        id: row.get(0)?,
        path: row.get(1)?,
        source: row.get(2)?,
        model: row.get(3)?,
        start_line: row.get(4)?,
        end_line: row.get(5)?,
        updated_at_ms: row.get(6)?,
        text: row.get(7)?,
        embedding: decode_embedding_value(row, 8)?,
    })
}

/// Decodes an embedding column that may be NULL, JSON text, or a packed
/// little-endian f32 blob. Scalar values are treated as "no embedding".
fn decode_embedding_value(
    row: &Row<'_>,
    column_index: usize,
) -> rusqlite::Result<Option<Vec<f32>>> {
    match row.get_ref(column_index)? {
        ValueRef::Null => Ok(None),
        ValueRef::Text(bytes) => Ok(parse_text_embedding(bytes)),
        ValueRef::Blob(bytes) => Ok(parse_blob_embedding(bytes)),
        ValueRef::Integer(_) | ValueRef::Real(_) => Ok(None),
    }
}

/// Parses a JSON array of floats, returning `None` for empty or invalid text.
fn parse_text_embedding(bytes: &[u8]) -> Option<Vec<f32>> {
    let text = std::str::from_utf8(bytes).ok()?;
    if text.trim().is_empty() {
        return None;
    }
    serde_json::from_str::<Vec<f32>>(text)
        .ok()
        .filter(|values| !values.is_empty())
}

/// Interprets a blob as packed little-endian f32 values; the length must be
/// a multiple of four bytes.
fn parse_blob_embedding(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    let mut values = Vec::with_capacity(bytes.len() / 4);
    for chunk in bytes.chunks_exact(4) {
        values.push(f32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]));
    }
    (!values.is_empty()).then_some(values)
}
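The blob format this decoder expects is a raw concatenation of little-endian `f32` values, four bytes each. A self-contained sketch of the round trip (assuming the store packs embeddings this way, which is a common convention for SQLite vector blobs):

```rust
fn main() {
    // 0.25, -1.0, and 3.5 are exactly representable as f32.
    let embedding = [0.25_f32, -1.0, 3.5];

    // Pack: each f32 becomes four little-endian bytes.
    let blob: Vec<u8> = embedding
        .iter()
        .flat_map(|value| value.to_le_bytes())
        .collect();
    assert_eq!(blob.len(), embedding.len() * 4);

    // Unpack: mirrors what parse_blob_embedding does with chunks_exact(4).
    let decoded: Vec<f32> = blob
        .chunks_exact(4)
        .map(|chunk| f32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
        .collect();
    assert_eq!(decoded, embedding);
}
```

A blob whose length is not a multiple of four cannot be such an array, which is why the decoder rejects it up front instead of silently dropping trailing bytes.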

/// Quotes a SQL identifier, doubling any embedded double quotes.
fn quote_ident(identifier: &str) -> String {
    format!("\"{}\"", identifier.replace('"', "\"\""))
}

/// Quotes an optional identifier, or emits the literal `NULL` when absent.
fn nullable_ident(identifier: Option<&String>) -> String {
    identifier
        .map(|identifier| quote_ident(identifier))
        .unwrap_or_else(|| "NULL".to_string())
}
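These two helpers follow SQLite's standard identifier quoting: wrap the name in double quotes and double any embedded quote. A standalone sketch (the re-declared `quote_ident` copies the helper above purely for illustration):

```rust
// Copy of the quoting helper, standalone for illustration.
fn quote_ident(identifier: &str) -> String {
    format!("\"{}\"", identifier.replace('"', "\"\""))
}

fn main() {
    // Plain names are simply wrapped.
    assert_eq!(quote_ident("chunks"), "\"chunks\"");
    // An embedded double quote is escaped by doubling it.
    assert_eq!(quote_ident("weird\"name"), "\"weird\"\"name\"");
}
```

Quoting every identifier this way keeps the dynamically built SELECT statements safe against table or column names that collide with SQL keywords or contain unusual characters.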