Compare commits

10 Commits

Author SHA1 Message Date
21b9704e21 fix(indexer): use upsert_chunks return value for chunk count
Previously total_chunks counted from process_file return (num_chunks)
which could differ from actual stored count if upsert silently failed.
Now using stored count returned by upsert_chunks.

Also fixes cli._index to skip progress yields when building result.
2026-04-12 02:16:19 -04:00
4ab504e87c fix(indexer): write sync-result.json in reindex() not just sync()
reindex() was consuming the full_index() generator but never calling
_write_sync_result(), leaving sync-result.json stale while CLI output
showed correct indexed_files/total_chunks.
2026-04-12 01:49:43 -04:00
9d919dc237 fix(config): use 'obsidian-rag' not '.obsidian-rag' for dev config path 2026-04-12 01:03:00 -04:00
fe428511d1 fix(bridge): dev data dir is 'obsidian-rag' not '.obsidian-rag'
Python's _resolve_data_dir() uses 'obsidian-rag' (no dot).
TypeScript was using '.obsidian-rag' (with dot) — mismatch caused
sync-result.json to never be found by the agent plugin.
2026-04-12 00:56:10 -04:00
a12e27b83a fix(bridge): resolve data dir same as Python for sync-result.json
Both readSyncResult() and probeAll() now mirror Python's
_resolve_data_dir() logic: dev detection (cwd/.obsidian-rag
or cwd/KnowledgeVault) then home dir fallback.

Previously readSyncResult always used cwd/.obsidian-rag (wrong for
server deployments) and probeAll resolved sync-result.json relative
to db path (wrong for absolute paths like /home/san/.obsidian-rag/).
2026-04-12 00:10:38 -04:00
34f3ce97f7 feat(indexer): hierarchical chunking for large sections
- Section-split first for structured notes
- Large sections (>max_section_chars) broken via sliding-window
- Small sections stay intact with heading preserved
- Adds max_section_chars config (default 4000)
- 2 new TDD tests for hierarchical chunking
2026-04-11 23:58:05 -04:00
a744c0c566 docs: add AGENTS.md with repo-specific guidance 2026-04-11 23:16:47 -04:00
d946cf34e1 fix(indexer): truncate chunks exceeding Ollama context window 2026-04-11 23:12:13 -04:00
928a027cec merge: resolve conflict, keeping our extensions array fix 2026-04-11 22:55:18 -04:00
fabdd48877 docs: add troubleshooting guide for misleading openclaw.hooks error 2026-04-11 22:50:04 -04:00
13 changed files with 310 additions and 67 deletions

AGENTS.md (new file)

@@ -0,0 +1,76 @@
# AGENTS.md
## Stack
Two independent packages in one repo:
| Directory | Role | Entry | Build |
|-----------|------|-------|-------|
| `src/` | TypeScript OpenClaw plugin | `src/index.ts` | esbuild → `dist/index.js` |
| `python/` | Python CLI indexer | `obsidian_rag/cli.py` | pip install -e |
## Commands
**TypeScript (OpenClaw plugin):**
```bash
npm run build # esbuild → dist/index.js
npm run typecheck # tsc --noEmit
npm run test # vitest run
```
**Python (RAG indexer):**
```bash
pip install -e python/ # editable install
obsidian-rag index|sync|reindex|status # CLI
pytest python/ # tests
ruff check python/ # lint
```
## OpenClaw Plugin Install
Plugin `package.json` MUST have:
```json
"openclaw": {
"extensions": ["./dist/index.js"],
"hook": []
}
```
- `extensions` is an array of string paths
- the key is `hook` (singular), not `hooks`
## Config
User config at `~/.obsidian-rag/config.json` or `./obsidian-rag/` dev config.
Key indexing fields:
- `indexing.chunk_size` — sliding window chunk size (default 500)
- `indexing.chunk_overlap` — overlap between chunks (default 100)
- `indexing.max_section_chars` — max chars per section before hierarchical split (default 4000)
Key security fields:
- `security.require_confirmation_for` — list of categories (e.g. `["health", "financial_debt"]`). Empty list disables guard.
- `security.auto_approve_sensitive` — `true` bypasses sensitive content prompts.
- `security.local_only` — `true` blocks non-localhost Ollama.
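Putting the key fields together, a minimal `config.json` sketch (values are illustrative, not necessarily the defaults):

```json
{
  "indexing": {
    "chunk_size": 500,
    "chunk_overlap": 100,
    "max_section_chars": 4000
  },
  "security": {
    "require_confirmation_for": ["health", "financial_debt"],
    "auto_approve_sensitive": false,
    "local_only": true
  }
}
```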
## Ollama Context Length
`python/obsidian_rag/embedder.py` truncates chunks at `MAX_CHUNK_CHARS = 8000` before embedding. If Ollama still returns a 500 error, reduce `max_section_chars` (to shrink section sizes) or reduce `chunk_size` in config.
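The safeguard is just a character clamp before the embedding request; a minimal sketch (the constant mirrors `embedder.py`, the function name is illustrative):

```python
MAX_CHUNK_CHARS = 8000  # mirrors python/obsidian_rag/embedder.py

def truncate_for_embedding(text: str) -> str:
    # Ollama returns a 500 when the prompt exceeds the model's context
    # window, so clamp each chunk before sending it to /api/embeddings.
    return text[:MAX_CHUNK_CHARS]

print(len(truncate_for_embedding("x" * 10_000)))  # 8000
```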
## Hierarchical Chunking
Structured notes (date-named files) use section-split first, then sliding-window within sections that exceed `max_section_chars`. Small sections stay intact; large sections are broken into sub-chunks with the parent section heading preserved.
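A minimal sketch of that decision logic (standalone helpers, not the real chunker API):

```python
def sliding_window(text: str, size: int, overlap: int) -> list[str]:
    # Overlapping fixed-size windows over the raw section text.
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def chunk_section(heading: str, body: str, max_section_chars: int = 4000,
                  chunk_size: int = 500, overlap: int = 100) -> list[tuple[str, str]]:
    # Small sections stay intact as a single chunk; large sections are
    # broken into sub-chunks that all keep the parent section heading.
    if len(body) <= max_section_chars:
        return [(heading, body)]
    return [(heading, sub) for sub in sliding_window(body, chunk_size, overlap)]
```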
## Sensitive Content Guard
Triggered by categories in `require_confirmation_for`. Raises `SensitiveContentError` from `obsidian_rag/indexer.py`.
To disable: set `require_confirmation_for: []` or `auto_approve_sensitive: true` in config.
## Architecture
```
User query → OpenClaw (TypeScript plugin src/index.ts)
→ obsidian_rag_* tools (python/obsidian_rag/)
→ Ollama embeddings (http://localhost:11434)
→ LanceDB vector store
```


@@ -0,0 +1,56 @@
# Troubleshooting: "missing openclaw.hooks" Error
## Symptoms
When installing a plugin using `openclaw plugins install --link <path>`, the following error appears:
```
package.json missing openclaw.hooks; update the plugin package to include openclaw.extensions (for example ["./dist/index.js"]). See https://docs.openclaw.ai/help/troubleshooting#plugin-install-fails-with-missing-openclaw-extensions
Also not a valid hook pack: Error: package.json missing openclaw.hooks
```
## Root Cause
This error message is **a misleading fallback cascade** that occurs when the plugin installation fails for a different reason. The error message suggests the problem is a missing `openclaw.hooks` field, but this is actually a secondary error that appears because the primary plugin installation failed.
### How the Error Cascade Works
1. When `--link` is used with a local plugin path, OpenClaw first attempts to install the plugin via `installPluginFromPath()`
2. The installation flow calls `installPluginFromDir()` → `installPluginFromSourceDir()` → `detectNativePackageInstallSource()`
3. If `detectNativePackageInstallSource()` returns `false` (e.g., due to a dangerous code scan failure), it falls through to `installPluginFromPackageDir()`
4. When that also fails (e.g., due to `child_process` usage being flagged), the code falls back to `tryInstallHookPackFromLocalPath()`
5. The hook pack installer calls `ensureOpenClawHooks()`, which expects a `hooks` array in the manifest
6. Since your plugin has no `hooks` field, it throws "missing openclaw.hooks"
### Primary vs Secondary Errors
The **primary error** blocking installation is:
```
Plugin "obsidian-rag" installation blocked: dangerous code patterns detected: Shell command execution detected (child_process)
```
The **secondary error** ("missing openclaw.hooks") is a fallback diagnostic that appears because:
- The plugin installation path failed
- OpenClaw tried to interpret the path as a hook pack as a last resort
- Hook packs require a `hooks` field, which normal plugins don't have
## Common Primary Errors
1. **Dangerous code patterns detected**: Plugins using `child_process`, `eval()`, file system operations, or network requests may be blocked
2. **Plugin ID mismatch**: The `id` in `openclaw.plugin.json` doesn't match expected values
3. **Missing `openclaw.extensions`**: The extensions array is missing or malformed
## Solution
The "missing openclaw.hooks" error can be **safely ignored** once the primary installation error is resolved. For the dangerous code blocking issue, use the `--dangerously-force-unsafe-install` flag:
```bash
openclaw plugins install --link --dangerously-force-unsafe-install /path/to/plugin/
```
## Why This Matters for AI Analysis
When analyzing this error, focus on the **first error shown**, not the secondary hook-related message. The OpenClaw installer attempts multiple installation strategies in sequence, and the final "missing openclaw.hooks" error is a diagnostic artifact from the fallback path, not the actual problem.
If the primary error is about dangerous code patterns, the plugin is functioning correctly from a technical standpoint—it simply uses APIs (like `child_process`) that OpenClaw's security scanning flags as potentially unsafe. Whether this is acceptable depends on the plugin's use case and trust level.


@@ -65,6 +65,11 @@
"type": "integer", "type": "integer",
"minimum": 0 "minimum": 0
}, },
"max_section_chars": {
"type": "integer",
"minimum": 1,
"description": "Max chars per section before splitting into sub-chunks. Default 4000."
},
"file_patterns": { "file_patterns": {
"type": "array", "type": "array",
"items": { "items": {


@@ -5,8 +5,7 @@
   "main": "dist/index.js",
   "type": "module",
   "openclaw": {
-    "extensions": ["./dist/index.js"],
-    "hook": []
+    "extensions": ["./dist/index.js"]
   },
   "scripts": {
     "build": "esbuild src/index.ts --bundle --platform=node --target=node18 --outfile=dist/index.js --format=esm --external:@lancedb/lancedb --external:@lancedb/lancedb-darwin-arm64 --external:fsevents --external:chokidar",


@@ -3,7 +3,6 @@
 from __future__ import annotations

 import re
-import unicodedata
 import hashlib
 from dataclasses import dataclass, field
 from pathlib import Path
@@ -181,9 +180,7 @@ def chunk_file(
     Uses section-split for structured notes (journal entries with date filenames),
     sliding window for everything else.
     """
-    import uuid
-    vault_path = Path(config.vault_path)

     rel_path = filepath if filepath.is_absolute() else filepath
     source_file = str(rel_path)
     source_directory = rel_path.parts[0] if rel_path.parts else ""
@@ -201,7 +198,6 @@ def chunk_file(
     chunks: list[Chunk] = []

     if is_structured_note(filepath):
-        # Section-split for journal/daily notes
         sections = split_by_sections(body, metadata)
         total = len(sections)
@@ -211,13 +207,31 @@ def chunk_file(
             section_tags = extract_tags(section_text)
             combined_tags = list(dict.fromkeys([*tags, *section_tags]))
-            chunk_text = section_text
-            chunk = Chunk(
-                chunk_id=f"{chunk_id_prefix}{_stable_chunk_id(content_hash, idx)}",
-                text=chunk_text,
-                source_file=source_file,
-                source_directory=source_directory,
-                section=f"#{section}" if section else None,
+            section_heading = f"#{section}" if section else None
+            if len(section_text) > config.indexing.max_section_chars:
+                sub_chunks = sliding_window_chunks(section_text, chunk_size, overlap)
+                sub_total = len(sub_chunks)
+                for sub_idx, sub_text in enumerate(sub_chunks):
+                    chunk = Chunk(
+                        chunk_id=f"{chunk_id_prefix}{_stable_chunk_id(content_hash, idx)}_{sub_idx}",
+                        text=sub_text,
+                        source_file=source_file,
+                        source_directory=source_directory,
+                        section=section_heading,
+                        date=date,
+                        tags=combined_tags,
+                        chunk_index=sub_idx,
+                        total_chunks=sub_total,
+                        modified_at=modified_at,
+                    )
+                    chunks.append(chunk)
+            else:
+                chunk = Chunk(
+                    chunk_id=f"{chunk_id_prefix}{_stable_chunk_id(content_hash, idx)}",
+                    text=section_text,
+                    source_file=source_file,
+                    source_directory=source_directory,
+                    section=section_heading,
                 date=date,
                 tags=combined_tags,
                 chunk_index=idx,


@@ -51,7 +51,10 @@ def _index(config) -> int:
     gen = indexer.full_index()
     result: dict = {"indexed_files": 0, "total_chunks": 0, "errors": []}
     for item in gen:
-        result = item  # progress yields are dicts; final dict from return
+        if item.get("type") == "complete":
+            result = item
+        elif item.get("type") == "progress":
+            pass  # skip progress logs in result
     duration_ms = int((time.monotonic() - t0) * 1000)
     print(
         json.dumps(


@@ -3,7 +3,6 @@
 from __future__ import annotations

 import json
-import os
 from enum import Enum
 from dataclasses import dataclass, field
 from pathlib import Path
@@ -32,6 +31,7 @@ class VectorStoreConfig:
 class IndexingConfig:
     chunk_size: int = 500
     chunk_overlap: int = 100
+    max_section_chars: int = 4000
     file_patterns: list[str] = field(default_factory=lambda: ["*.md"])
     deny_dirs: list[str] = field(
         default_factory=lambda: [


@@ -12,6 +12,7 @@ if TYPE_CHECKING:
     from obsidian_rag.config import ObsidianRagConfig

 DEFAULT_TIMEOUT = 120.0  # seconds
+MAX_CHUNK_CHARS = 8000  # safe default for most Ollama models


 class EmbeddingError(Exception):
@@ -44,7 +45,7 @@ class OllamaEmbedder:
             return
         parsed = urllib.parse.urlparse(self.base_url)
-        if parsed.hostname not in ['localhost', '127.0.0.1', '::1']:
+        if parsed.hostname not in ["localhost", "127.0.0.1", "::1"]:
             raise SecurityError(
                 f"Remote embedding service not allowed when local_only=True: {self.base_url}"
             )
@@ -84,23 +85,31 @@ class OllamaEmbedder:
         # For batch, call /api/embeddings multiple times sequentially
         if len(batch) == 1:
             endpoint = f"{self.base_url}/api/embeddings"
-            payload = {"model": self.model, "prompt": batch[0]}
+            prompt = batch[0][:MAX_CHUNK_CHARS]
+            payload = {"model": self.model, "prompt": prompt}
         else:
             # For batch, use /api/embeddings with "input" (multiple calls)
             results = []
             for text in batch:
+                truncated = text[:MAX_CHUNK_CHARS]
                 try:
                     resp = self._client.post(
                         f"{self.base_url}/api/embeddings",
-                        json={"model": self.model, "prompt": text},
+                        json={"model": self.model, "prompt": truncated},
                         timeout=DEFAULT_TIMEOUT,
                     )
                 except httpx.ConnectError as e:
-                    raise OllamaUnavailableError(f"Cannot connect to Ollama at {self.base_url}") from e
+                    raise OllamaUnavailableError(
+                        f"Cannot connect to Ollama at {self.base_url}"
+                    ) from e
                 except httpx.TimeoutException as e:
-                    raise EmbeddingError(f"Embedding request timed out after {DEFAULT_TIMEOUT}s") from e
+                    raise EmbeddingError(
+                        f"Embedding request timed out after {DEFAULT_TIMEOUT}s"
+                    ) from e
                 if resp.status_code != 200:
-                    raise EmbeddingError(f"Ollama returned {resp.status_code}: {resp.text}")
+                    raise EmbeddingError(
+                        f"Ollama returned {resp.status_code}: {resp.text}"
+                    )
                 data = resp.json()
                 embedding = data.get("embedding", [])
                 if not embedding:
@@ -111,9 +120,13 @@ class OllamaEmbedder:
         try:
             resp = self._client.post(endpoint, json=payload, timeout=DEFAULT_TIMEOUT)
         except httpx.ConnectError as e:
-            raise OllamaUnavailableError(f"Cannot connect to Ollama at {self.base_url}") from e
+            raise OllamaUnavailableError(
+                f"Cannot connect to Ollama at {self.base_url}"
+            ) from e
         except httpx.TimeoutException as e:
-            raise EmbeddingError(f"Embedding request timed out after {DEFAULT_TIMEOUT}s") from e
+            raise EmbeddingError(
+                f"Embedding request timed out after {DEFAULT_TIMEOUT}s"
+            ) from e
         if resp.status_code != 200:
             raise EmbeddingError(f"Ollama returned {resp.status_code}: {resp.text}")


@@ -4,8 +4,6 @@ from __future__ import annotations

 import json
 import os
-import time
-import uuid
 from datetime import datetime, timezone
 from pathlib import Path
 from typing import TYPE_CHECKING, Any, Generator, Iterator
@@ -16,10 +14,13 @@ if TYPE_CHECKING:
     import obsidian_rag.config as config_mod

 from obsidian_rag.config import _resolve_data_dir
 from obsidian_rag.chunker import chunk_file
-from obsidian_rag.embedder import EmbeddingError, OllamaUnavailableError, SecurityError
+from obsidian_rag.embedder import OllamaUnavailableError
 from obsidian_rag.security import should_index_dir, validate_path
-from obsidian_rag.audit_logger import AuditLogger
-from obsidian_rag.vector_store import create_table_if_not_exists, delete_by_source_file, get_db, upsert_chunks
+from obsidian_rag.vector_store import (
+    create_table_if_not_exists,
+    get_db,
+    upsert_chunks,
+)

 # ----------------------------------------------------------------------
 # Pipeline
@@ -43,6 +44,7 @@ class Indexer:
     def embedder(self):
         if self._embedder is None:
             from obsidian_rag.embedder import OllamaEmbedder
+
             self._embedder = OllamaEmbedder(self.config)
         return self._embedder

@@ -50,6 +52,7 @@ class Indexer:
     def audit_logger(self):
         if self._audit_logger is None:
             from obsidian_rag.audit_logger import AuditLogger
+
             log_dir = _resolve_data_dir() / "audit"
             self._audit_logger = AuditLogger(log_dir / "audit.log")
         return self._audit_logger
@@ -64,9 +67,9 @@ class Indexer:
         for chunk in chunks:
             sensitivity = security.detect_sensitive(
-                chunk['chunk_text'],
+                chunk["chunk_text"],
                 self.config.security.sensitive_sections,
-                self.config.memory.patterns
+                self.config.memory.patterns,
             )

             for category in sensitive_categories:
@@ -99,7 +102,11 @@ class Indexer:
         """Index a single file. Returns (num_chunks, enriched_chunks)."""
         from obsidian_rag import security

-        mtime = str(datetime.fromtimestamp(filepath.stat().st_mtime, tz=timezone.utc).isoformat())
+        mtime = str(
+            datetime.fromtimestamp(
+                filepath.stat().st_mtime, tz=timezone.utc
+            ).isoformat()
+        )
         content = filepath.read_text(encoding="utf-8")
         # Sanitize
         content = security.sanitize_text(content)
@@ -151,18 +158,19 @@ class Indexer:
                 # Log sensitive content access
                 for chunk in enriched:
                     from obsidian_rag import security
+
                     sensitivity = security.detect_sensitive(
-                        chunk['chunk_text'],
+                        chunk["chunk_text"],
                         self.config.security.sensitive_sections,
-                        self.config.memory.patterns
+                        self.config.memory.patterns,
                     )
-                    for category in ['health', 'financial', 'relations']:
+                    for category in ["health", "financial", "relations"]:
                         if sensitivity.get(category, False):
                             self.audit_logger.log_sensitive_access(
-                                str(chunk['source_file']),
+                                str(chunk["source_file"]),
                                 category,
-                                'index',
-                                {'chunk_id': chunk['chunk_id']}
+                                "index",
+                                {"chunk_id": chunk["chunk_id"]},
                             )

                 # Embed chunks
@@ -176,8 +184,8 @@ class Indexer:
                 for e, v in zip(enriched, vectors):
                     e["vector"] = v

                 # Store
-                upsert_chunks(table, enriched)
-                total_chunks += num_chunks
+                stored = upsert_chunks(table, enriched)
+                total_chunks += stored
                 indexed_files += 1
             except Exception as exc:
                 errors.append({"file": str(filepath), "error": str(exc)})
@@ -249,9 +257,16 @@ class Indexer:
         db = get_db(self.config)
         if "obsidian_chunks" in db.list_tables():
             db.drop_table("obsidian_chunks")
+        # full_index is a generator — materialize it to get the final dict
         results = list(self.full_index())
-        return results[-1] if results else {"indexed_files": 0, "total_chunks": 0, "errors": []}
+        final = (
+            results[-1]
+            if results
+            else {"indexed_files": 0, "total_chunks": 0, "errors": []}
+        )
+        self._write_sync_result(
+            final["indexed_files"], final["total_chunks"], final["errors"]
+        )
+        return final

     def _sync_result_path(self) -> Path:
         # Use the same dev-data-dir convention as config.py
@@ -313,18 +328,19 @@ class Indexer:
                 # Log sensitive content access
                 for chunk in enriched:
                     from obsidian_rag import security
+
                     sensitivity = security.detect_sensitive(
-                        chunk['chunk_text'],
+                        chunk["chunk_text"],
                         self.config.security.sensitive_sections,
-                        self.config.memory.patterns
+                        self.config.memory.patterns,
                     )
-                    for category in ['health', 'financial', 'relations']:
+                    for category in ["health", "financial", "relations"]:
                         if sensitivity.get(category, False):
                             self.audit_logger.log_sensitive_access(
-                                str(chunk['source_file']),
+                                str(chunk["source_file"]),
                                 category,
-                                'index',
-                                {'chunk_id': chunk['chunk_id']}
+                                "index",
+                                {"chunk_id": chunk["chunk_id"]},
                             )

                 # Embed chunks
@@ -353,4 +369,3 @@ class Indexer:
         if not data_dir.exists() and not (project_root / "KnowledgeVault").exists():
             data_dir = Path(os.path.expanduser("~/.obsidian-rag"))
         return data_dir / "sync-result.json"
-


@@ -206,6 +206,7 @@ def _mock_config(tmp_path: Path) -> MagicMock:
     cfg.vault_path = str(tmp_path)
     cfg.indexing.chunk_size = 500
     cfg.indexing.chunk_overlap = 100
+    cfg.indexing.max_section_chars = 4000
     cfg.indexing.file_patterns = ["*.md"]
     cfg.indexing.deny_dirs = [".obsidian", ".trash", "zzz-Archive", ".git"]
     cfg.indexing.allow_dirs = []
@@ -248,3 +249,41 @@ def test_chunk_file_unstructured(tmp_path: Path):
     assert len(chunks) > 1
     assert all(c.section is None for c in chunks)
     assert chunks[0].chunk_index == 0
+
+
+def test_large_section_split_into_sub_chunks(tmp_path: Path):
+    """Large section (exceeding max_section_chars) is split via sliding window."""
+    vault = tmp_path / "Notes"
+    vault.mkdir()
+    fpath = vault / "2024-03-15-Podcast.md"
+    large_content = "word " * 3000  # ~15000 chars, exceeds MAX_SECTION_CHARS
+    fpath.write_text(f"# Episode Notes\n\n{large_content}")
+
+    cfg = _mock_config(tmp_path)
+    cfg.indexing.max_section_chars = 4000
+
+    chunks = chunk_file(fpath, fpath.read_text(), "2024-03-15T10:00:00Z", cfg)
+
+    # Large section should be split into multiple sub-chunks
+    assert len(chunks) > 1
+    # Each sub-chunk should preserve the section heading
+    for chunk in chunks:
+        assert chunk.section == "#Episode Notes", (
+            f"Expected #Episode Notes, got {chunk.section}"
+        )
+
+
+def test_small_section_kept_intact(tmp_path: Path):
+    """Small section (under max_section_chars) remains a single chunk."""
+    vault = tmp_path / "Notes"
+    vault.mkdir()
+    fpath = vault / "2024-03-15-Short.md"
+    fpath.write_text("# Notes\n\nShort content here.")
+
+    cfg = _mock_config(tmp_path)
+    cfg.indexing.max_section_chars = 4000
+
+    chunks = chunk_file(fpath, fpath.read_text(), "2024-03-15T10:00:00Z", cfg)
+
+    # Small section → single chunk
+    assert len(chunks) == 1
+    assert chunks[0].section == "#Notes"
+    assert chunks[0].text.strip().endswith("Short content here.")


@@ -98,7 +98,8 @@ export async function probeAll(config: ObsidianRagConfig): Promise<ProbeResult>
   if (indexExists) {
     try {
-      const syncPath = resolve(dbPath, "..", "sync-result.json");
+      const dataDir = resolveDataDir();
+      const syncPath = resolve(dataDir, "sync-result.json");
       if (existsSync(syncPath)) {
         const data = JSON.parse(readFileSync(syncPath, "utf-8"));
         lastSync = data.timestamp ?? null;
@@ -120,6 +121,17 @@ export async function probeAll(config: ObsidianRagConfig): Promise<ProbeResult>
   };
 }

+function resolveDataDir(): string {
+  const cwd = process.cwd();
+  const devDataDir = resolve(cwd, "obsidian-rag");
+  const devVaultMarker = resolve(cwd, "KnowledgeVault");
+  if (existsSync(devDataDir) || existsSync(devVaultMarker)) {
+    return devDataDir;
+  }
+  const home = process.env.HOME ?? process.env.USERPROFILE ?? "";
+  return resolve(home, ".obsidian-rag");
+}
+
 async function probeOllama(baseUrl: string): Promise<boolean> {
   try {
     const res = await fetch(`${baseUrl}/api/tags`, { signal: AbortSignal.timeout(3000) });


@@ -109,7 +109,7 @@ export function readSyncResult(config: ObsidianRagConfig): {
   total_chunks: number;
   errors: Array<{ file: string; error: string }>;
 } | null {
-  const dataDir = resolve(process.cwd(), ".obsidian-rag");
+  const dataDir = _resolveDataDir();
   const path = resolve(dataDir, "sync-result.json");
   if (!existsSync(path)) return null;
   try {
@@ -118,3 +118,14 @@ export function readSyncResult(config: ObsidianRagConfig): {
     return null;
   }
 }
+
+function _resolveDataDir(): string {
+  const cwd = process.cwd();
+  const devDataDir = resolve(cwd, "obsidian-rag");
+  const devVaultMarker = resolve(cwd, "KnowledgeVault");
+  if (existsSync(devDataDir) || existsSync(devVaultMarker)) {
+    return devDataDir;
+  }
+  const home = process.env.HOME ?? process.env.USERPROFILE ?? "";
+  return resolve(home, ".obsidian-rag");
+}


@@ -88,7 +88,7 @@ function defaults(): ObsidianRagConfig {
 }

 export function loadConfig(configPath?: string): ObsidianRagConfig {
-  const defaultPath = resolve(process.cwd(), ".obsidian-rag", "config.json");
+  const defaultPath = resolve(process.cwd(), "obsidian-rag", "config.json");
   const path = configPath ?? defaultPath;
   try {
     const raw = JSON.parse(readFileSync(path, "utf-8"));