Changelog
All notable changes to this project are documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Unreleased
[1.0.0] - 2026-04-02
Changed
- BREAKING: v4 schema. Output schema version bumped from 2/3 to 4. Every file is now a first-class IndexEntry with its own identity hash, timestamps, and metadata. Sidecar files are no longer embedded into parent entries.
- BREAKING: Output suffix convention. In-place output files now use _idx.json (file-level) and _idxd.json (directory-level) as permanent suffixes, replacing _meta2.json/_meta3.json and directory variants for current output generation.
- BREAKING: Removed CLI flags. --meta-merge and --meta-merge-delete have been removed.
- BREAKING: Removed GUI operations. Meta Merge and Meta Merge Delete operation types were removed from the desktop application. The operation dropdown now contains Index and Rollback.
Added
- Relationship annotations. The indexer now annotates believed associations between files via a relationships[] array on sidecar-like entries. Annotations include target ID, relationship type, rule name, rule source, confidence level, and predicate evaluation detail.
- Sidecar rule engine. A TOML-based rule engine classifies sidecar relationships with built-in rules and user overrides.
- --no-sidecar-detection CLI flag. Disables relationship classification entirely, producing a pure filesystem inventory.
- --cleanup-legacy-sidecars CLI flag. Removes obsolete tool output files when new-format replacements are written.
- Community rule pack support. TOML-based rule packs can be installed into the pack directory for shared rule libraries.
- Simplified rollback. All files roll back as uniform file copies. Legacy v2/v3 rollback remains supported through a backward-compat path.
Removed
- Sidecar content ingest/reconstruct/restore pipeline.
- Sidecar reconstruction codepaths for v4 execution.
- Sidecar-rename tracking and coordination logic.
- MetaMergeDelete execution and safety logic.
- Sidecar-origin fields on MetadataEntry and MetadataAttributes that were tied to ingest/reconstruct behavior.
- metadata_identify regex pattern runtime as the primary classifier, replaced by the v4 rule engine.
[0.2.1] - 2026-03-21
Known Issues
- GUI: MetaMergeDelete output field reports false _directorymeta3.json output — When running MetaMergeDelete with multi-file output mode and the "Write directory metadata" checkbox unchecked, the Output field incorrectly reports upon completion that an output was written to the default _directorymeta3.json location. Multi-file sidecars are written as expected, but no directory summary file is produced when directory metadata output is suppressed (this is the correct behavior). The Output field completion summary must be updated to omit the _directorymeta3.json path when directory metadata writing is disabled.
- Core: Windows directory shortcut (.lnk) files incorrectly associated as sidecars — Testing performed subsequent to the v0.1.2 release discovered that Windows .lnk directory shortcut files are misidentified as sidecars of unrelated sibling files in the same directory. The .lnk binary content is correctly saved and rollback properly restores the file, but the sidecar association logic incorrectly pairs the shortcut with a miscellaneous sibling file that shares no filename similarity or other discernible relationship. The sidecar matching heuristic needs investigation and correction to prevent false .lnk associations.
Fixed
- Core: Rollback engine unable to discover v3 sidecars — discover_meta2_files() used a hardcoded *_meta2.json glob pattern, finding zero sidecars in directories processed by v0.2.0. Replaced with version-agnostic discover_sidecar_files() that discovers both _meta2.json and _meta3.json sidecars. The old function name is retained as a deprecated alias.
- Core: Rollback loader rejecting v3 schema version — load_meta2() contained a hard gate accepting only schema_version == 2, causing all v3 sidecar files to raise IndexerConfigError. Updated to accept versions 2 and 3. Replaced with load_sidecar() as the primary API; the old function name is retained as a deprecated alias.
- Core: In-place sidecar rename using v2 naming convention — rename_inplace_sidecar() constructed paths using _meta2.json suffixes, so the rename phase looked for a non-existent file while the actual _meta3.json sidecar was left orphaned with its original-name base. Updated to use _meta3.json naming.
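
To illustrate the discovery fix above, here is a minimal sketch of version-agnostic sidecar discovery with a deprecated alias. Only the two function names and the two suffixes come from the entries above; the signature, the `recursive` parameter, and the function bodies are assumptions for illustration, not the project's actual implementation.

```python
import warnings
from pathlib import Path
from typing import List

# Suffixes named in the changelog entry; everything else here is illustrative.
SIDECAR_SUFFIXES = ("_meta2.json", "_meta3.json")


def discover_sidecar_files(root: Path, recursive: bool = False) -> List[Path]:
    """Return v2 and v3 sidecar files under root, sorted for stable output."""
    walker = root.rglob("*") if recursive else root.glob("*")
    return sorted(
        p for p in walker
        if p.is_file() and p.name.endswith(SIDECAR_SUFFIXES)
    )


def discover_meta2_files(root: Path, recursive: bool = False) -> List[Path]:
    """Deprecated alias kept for callers that still use the old name."""
    warnings.warn(
        "discover_meta2_files() is deprecated; use discover_sidecar_files()",
        DeprecationWarning,
        stacklevel=2,
    )
    return discover_sidecar_files(root, recursive=recursive)
```

Matching on a tuple of suffixes rather than a single glob pattern means a later schema version would only need one more tuple entry.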
Added
- Tests: Index-then-rollback round-trip integration test — New tests/integration/test_roundtrip.py with three test cases exercising the full index-rename-rollback cycle, mixed v2/v3 sidecar discovery, and v3 inplace sidecar rename verification. These tests detect the class of regression that shipped in v0.2.0.
- Build: PyInstaller chardet hook for mypyc-compiled extensions — New hooks/hook-chardet.py hook that discovers and bundles chardet v7's mypyc runtime module and data files. Resolves ModuleNotFoundError for the mypyc support module in standalone executables built with PyInstaller.
Changed
- Public API: Rollback function renames — discover_meta2_files() renamed to discover_sidecar_files() and load_meta2() renamed to load_sidecar(). The old names are retained as deprecated aliases and remain in __all__.
[0.2.0] - 2026-03-20
Known Issues
- GUI: MetaMergeDelete output field reports false _directorymeta3.json output — When running MetaMergeDelete with multi-file output mode and the "Write directory metadata" checkbox unchecked, the Output field incorrectly reports upon completion that an output was written to the default _directorymeta3.json location. Actual file operations are correct; only the GUI completion summary is affected.
- Core: Windows directory shortcut (.lnk) files incorrectly associated as sidecars — Windows .lnk directory shortcut files are misidentified as sidecars of unrelated sibling files. The .lnk binary content is correctly saved and rollback properly restores the file, but the sidecar association is incorrect.
Fixed
- CLI/Build: PyInstaller binary produced no output — The standalone shruggie-indexer.exe binary silently exited with code 0 for all invocations (no args, --help, -h, --file, etc.) because the PyInstaller spec file pointed at cli/main.py, which defines the Click command group but never calls it. The pip-installed console script and python -m shruggie_indexer paths worked correctly because their wrappers explicitly call main(). Fixed by redirecting the spec entry point from cli/main.py to __main__.py (which has the proper if __name__ == "__main__" guard), adding a defensive if __name__ == "__main__" guard to cli/main.py, and expanding hiddenimports from ["shruggie_indexer"] to all 28 submodules (lazy imports inside command handlers are invisible to PyInstaller's static analysis). Spec: §13.4.
- CLI: -h short help flag not recognized — Click only enables --help by default; the -h shorthand was routed to the index subcommand via DefaultGroup and rejected as an unknown option. Added context_settings={"help_option_names": ["-h", "--help"]} to the Click group. Spec: §8.1.
Changed
- CLI: index_cmd() complexity refactor — Extracted three helper functions (_resolve_log_file_from_config(), _build_cli_overrides(), _post_index_pipeline()) from the 390-line index_cmd() function body to reduce cognitive complexity below Pylance's analysis threshold. No behavioral changes; the same logic executes in the same order.
- Sidecar exclusion patterns updated to match v1, v2, and v3 sidecar filenames.
- Serializer key ordering updated to include encoding field.
- JSON style detection extended to capture indent string alongside compact/pretty classification.
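
As a rough illustration of the JSON style detection entry above, the sketch below classifies JSON text as pretty or compact, captures the indent string, and re-serializes with the same style. The function names `detect_json_style` and `restore_json` are hypothetical, and the heuristic (first indented line wins) is an assumption; the indexer's real detection logic may differ.

```python
import json
from typing import Any, Optional, Tuple


def detect_json_style(text: str) -> Tuple[str, Optional[str]]:
    """Return ("pretty", indent_string) or ("compact", None) for JSON text."""
    for line in text.splitlines():
        stripped = line.lstrip(" \t")
        # The first indented, non-blank line reveals the indent unit.
        if stripped and len(stripped) < len(line):
            return "pretty", line[: len(line) - len(stripped)]
    return "compact", None


def restore_json(data: Any, style: str, indent: Optional[str]) -> str:
    """Serialize data using the recorded style and indent string."""
    if style == "pretty":
        # json.dumps accepts a string indent, so tabs survive the round trip.
        return json.dumps(data, indent=indent if indent is not None else 2)
    return json.dumps(data, separators=(",", ":"))
```

Recording the indent string itself, rather than only a pretty/compact flag, is what lets a 4-space or tab-indented file come back byte-identical instead of being normalized to one fixed indent.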
Added
- v3 output schema. schema_version is now 3. In-place sidecars use _meta3.json and _directorymeta3.json suffixes. v2 and v1 sidecar files are still recognized and excluded during traversal.
- encoding field on IndexEntry. Optional top-level field capturing BOM type, line-ending convention, detected character encoding, and detection confidence. Enables hash-perfect reversal when file content is stored as decoded text by downstream consumers. Populated for files by default; absent for directories, symlinks, and when --no-detect-encoding is specified.
- encoding field on MetadataEntry. Sidecar-only field capturing the same encoding metadata for ingested sidecar files. Enables hash-perfect reversal of sidecar text content.
- timestamps.created_source field. New optional field on TimestampsObject indicating whether the creation timestamp was derived from st_birthtime (true creation time) or st_ctime (inode change time fallback). Resolves the cross-platform ambiguity documented in §15.5.
- attributes.json_indent field. New optional field on MetadataAttributes capturing the precise indentation string used in the original JSON sidecar file (e.g., 2-space, 4-space, tab). Enables hash-perfect JSON restoration during rollback.
- core/encoding.py module. BOM detection, line-ending detection, and chardet integration for character encoding identification.
- Encoding-aware rollback restoration. The rollback engine now consumes encoding metadata to restore BOM, line endings, and source encoding during sidecar restoration. JSON sidecars are restored with the original indent string when json_indent is available. Text-format sidecar restoration is now byte-identical when full encoding metadata is present.
- --no-detect-encoding CLI flag. Disables all encoding detection (BOM, line endings, and chardet), omitting the encoding field from output.
- --no-detect-charset CLI flag. Disables only chardet-based detection; BOM and line-ending detection remain active.
- v3 JSON Schema. Canonical schema at schemas.shruggie.tech/data/shruggie-indexer-v3.schema.json.
- chardet dependency. Added as a standard runtime dependency for character encoding detection.
- Tests: CLI entry-point smoke tests — New tests/integration/test_cli_entrypoint.py with 8 tests covering Click in-process invocation (--help, -h, --version, subcommand help), python -m out-of-process invocation, and direct script execution (simulating the PyInstaller path). Prevents regression of the silent-exit bug.
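
The BOM and line-ending detection described for core/encoding.py can be sketched with the standard library alone. This is an illustrative approximation: the function names, the returned labels, and the "mixed" category are assumptions, and the chardet step is left out.

```python
import codecs
from typing import Optional

# Longer BOMs are checked first so UTF-32 LE is not mistaken for UTF-16 LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes) -> Optional[str]:
    """Return a label for the byte-order mark at the start of data, if any."""
    for bom, label in _BOMS:
        if data.startswith(bom):
            return label
    return None


def detect_line_ending(data: bytes) -> Optional[str]:
    """Classify line endings as "crlf", "lf", "cr", "mixed", or None.

    Counts raw bytes, so it assumes an ASCII-compatible encoding.
    """
    crlf = data.count(b"\r\n")
    counts = {
        "crlf": crlf,
        "lf": data.count(b"\n") - crlf,  # bare LF only
        "cr": data.count(b"\r") - crlf,  # bare CR only
    }
    present = [name for name, n in counts.items() if n]
    if not present:
        return None
    return present[0] if len(present) == 1 else "mixed"
```

Keeping both values alongside the decoded text is what allows a consumer to re-encode the content and prepend the same BOM, which is the "hash-perfect reversal" the entries above refer to.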
[0.1.2] - 2026-03-05
Added
- Rollback: Session-ID validation and content-hash collision detection — plan_rollback() now performs two pre-planning sanitization passes on loaded entries before planning begins. Content-hash collision detection groups entries by composite (md5, sha256) hash and, when multiple non-duplicate entries share the same hash with different file_system.relative values, applies tiebreaking rules (majority session → session over no session → first encountered) to keep only one entry per content file. Discarded entries are logged at WARNING. Entries from the duplicates[] array are excluded from collision detection (they share the canonical's hash intentionally). Spec: §6.11.
- Rollback: Legacy file_system.relative prefix detection — plan_rollback() automatically detects and strips the legacy relative path prefix from _meta2.json files produced by indexer versions prior to the file_system.relative computation fix. When all entries share a common first path component matching the source directory name, the component is stripped and an INFO log is emitted. This ensures correct restoration without extra nesting regardless of sidecar provenance. Spec: §6.11.
- Core: Parameter logging at operation start — Every operation entry point (CLI indexing, CLI rollback, GUI indexing, GUI rollback, and plan_rollback()) now emits a single INFO-level log record summarizing all resolved operational parameters before processing begins. Provides immediate diagnostic visibility into the settings governing the current invocation. Spec: §11.4a.
- Core: Origin-directory annotation for recursive rollback — load_meta2() now annotates every deserialized IndexEntry with the parent directory of the _meta2.json file that contained it, stored in a module-level _origin_dirs mapping. LocalSourceResolver uses this annotation as a Strategy 3 fallback when the content file is not found in search_dir, enabling recursive rollback to resolve files in subdirectories. Spec: §6.11.
- Tests: Recursive rollback unit tests — 7 new tests in tests/unit/test_rollback.py covering origin-dir annotation (recursive, non-recursive, and single-file loading), LocalSourceResolver origin-dir fallback (isolated and full pipeline), plan_rollback() parameter logging via caplog, and _set_windows_creation_time() on Windows and non-Windows platforms. New tests/fixtures/rollback-testbed/recursive-testbed/ fixture with root and subdirectory content files and meta2 sidecars.
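
The legacy prefix detection rule described above (strip the first path component only when every entry shares it and it matches the source directory name) might look roughly like this. The function name `strip_legacy_prefix` and the use of forward-slash relative paths are assumptions for the sake of a self-contained sketch.

```python
from pathlib import PurePosixPath
from typing import List


def strip_legacy_prefix(relative_paths: List[str], source_dir_name: str) -> List[str]:
    """Strip a shared leading component equal to the source directory name.

    Paths are returned unchanged unless every one starts with that component.
    """
    parts = [PurePosixPath(p).parts for p in relative_paths]
    if not parts or any(not p or p[0] != source_dir_name for p in parts):
        return list(relative_paths)  # current-format paths: nothing to strip
    stripped = []
    for p in parts:
        rest = p[1:]
        # The legacy root entry was the bare directory name; it becomes ".".
        stripped.append(str(PurePosixPath(*rest)) if rest else ".")
    return stripped
```

For example, `["photos", "photos/a.jpg"]` with a source directory named `photos` becomes `[".", "a.jpg"]`, while a list that already contains `"."` is left alone because `"."` has no leading component to match.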
The existing indexing logic is now the index subcommand, invoked by default when no explicit subcommand is given. All existing invocations (shruggie-indexer /path, shruggie-indexer /path --rename --inplace, shruggie-indexer --help, shruggie-indexer --version) continue to work identically. New explicit form: shruggie-indexer index /path. The DefaultGroup class handles subcommand injection by introspecting group-level options at parse time, requiring no external dependencies. This sets the architectural pattern for future subcommands (rollback, migrate, etc.).
- Core: Provenance-preserving de-duplication — When Rename is active (--rename flag or GUI Rename checkbox), the indexer now performs session-scoped de-duplication. The first file encountered with a given content hash is designated the canonical copy; all subsequent byte-identical files are absorbed into the canonical entry's new duplicates array, preserving their complete identity metadata (original name, timestamps, filesystem location, and all metadata entries). Duplicate files are deleted from disk after the rename phase. Works for both same-directory and cross-directory duplicates. In dry-run mode, duplicates are identified and reported in the output but no files are deleted.
- Schema: duplicates field on IndexEntry — New optional duplicates field (array of IndexEntry) on the root schema object. Contains complete IndexEntry objects for files that were de-duplicated against this entry during a rename operation. Absent (not null, not empty array) when no duplicates exist. The field appears after metadata and before session_id/indexed_at in serialized output.
- Core: dedup module — New src/shruggie_indexer/core/dedup.py module providing DedupRegistry, DedupResult, DedupStats, DedupAction, scan_tree(), apply_dedup(), and cleanup_duplicate_files(). Designed for standalone importability by downstream projects (specifically shruggie-catalog).
- Schema: De-duplication example — New docs/schema/examples/deduplicated_meta2.json example file demonstrating the duplicates field with a canonical entry and one absorbed duplicate.
- Tests: De-duplication unit tests — 26 unit tests in tests/unit/test_dedup.py covering registry population, duplicate detection, canonical selection, merge behavior, stats calculation, tree scanning, empty-tree and single-file edge cases, to_dict() serialization, and provenance preservation.
- Tests: De-duplication integration tests — 12 integration tests in tests/integration/test_dedup_rename.py covering same-directory dedup, cross-directory dedup, dry-run output, cleanup file deletion, provenance round-trip, and no-duplicates baseline.
- CLI:
--dir-meta/--no-dir-meta flag — New boolean toggle (--dir-meta/--no-dir-meta) controlling whether _directorymeta2.json sidecar files are written during inplace output. Defaults to enabled (preserving existing behavior). When disabled, per-file _meta2.json sidecars are still written normally; only the directory-level summary files are suppressed. Also suppresses auto-generated aggregate _directorymeta2.json output files (explicit --outfile paths are never suppressed). Corresponds to the write_directory_meta configuration key.
- GUI: "Write directory metadata" checkbox — New checkbox in the Output card on the Operations page controlling directory metadata sidecar suppression. Mirrors the CLI --no-dir-meta flag. Automatically disabled when the target type is "File" (not applicable). Automatically forces Multi-file output mode when unchecked with a directory target (Single file mode produces only directory metadata, which would be empty). State persists across sessions.
- Config: write_directory_meta key — New boolean configuration key (default true) supported in user TOML, project TOML, and CLI/API overrides. Controls whether _directorymeta2.json files are emitted during output. Participates in the standard 4-layer config resolution pipeline.
- Tests: Directory metadata suppression — 3 unit tests in tests/unit/test_serializer.py and 6 integration tests in tests/integration/test_output_modes.py covering inplace sidecar suppression, aggregate output gating, stdout passthrough, explicit outfile preservation, and single-file target edge cases.
- Core: Rollback engine — New
src/shruggie_indexer/core/rollback.py module providing the complete rollback engine for reversing rename and de-duplication operations. Reads _meta2.json sidecar files and reconstructs the original filesystem layout. Provides load_meta2() (JSON→IndexEntry deserialization supporting per-file sidecars, aggregate directory sidecars, and directories of sidecars), discover_meta2_files() (recursive/non-recursive sidecar discovery), plan_rollback() (structured or flat restore planning with conflict detection, path traversal safety, duplicate handling, and sidecar restoration), execute_rollback() (4-phase executor: mkdir→restore→duplicate→sidecar with dry-run, cancellation, and progress reporting), and verify_file_hash() (content integrity verification). Uses a SourceResolver protocol with LocalSourceResolver as the default implementation. Follows the same plan-then-execute architecture as core/dedup.py.
- CLI: Rollback subcommand — New shruggie-indexer rollback subcommand providing CLI access to the rollback engine. Accepts a META2_PATH argument (sidecar file, aggregate output, or directory of sidecars) and restores files to their original names and directory structure. Supports --target (optional, defaults to parent of META2_PATH), --source (explicit content file directory), --flat (restore without directory structure), --recursive (search subdirectories for sidecars), --dry-run, --no-verify, --force, --skip-duplicates, and --no-restore-sidecars flags. Follows the same thin-orchestration pattern as the index subcommand.
- Exceptions: RollbackError — New RollbackError(IndexerRuntimeError) exception for rollback-specific failures.
- Tests: Rollback unit tests — 50 unit tests in tests/unit/test_rollback.py covering meta2 loading (per-file sidecar, aggregate tree, directory discovery, v1 rejection, malformed JSON, duplicate extraction, sidecar metadata preservation), source resolution (storage name match, original name match, hash verification), rollback planning (renamed/non-renamed files, source-not-found, path traversal rejection, conflict detection with same/different hashes, force overwrite, duplicate skipping, sidecar restoration, directory actions, flat mode, mixed sessions), rollback execution (dry-run, file copy with timestamp restoration, directory creation, duplicate copy, 4 sidecar format decoders, cancellation, error handling, flat mode), and hash verification.
- Tests: Rollback fixtures — New tests/fixtures/rollback-testbed/ directory with 7 subdirectories providing renamed files, non-renamed files, aggregate directory sidecars, deduplicated entries with sidecar metadata, mixed-session entries, malformed JSON, and v1-format sidecars for comprehensive rollback testing.
- GUI: Rollback operation mode — Added "Rollback" as a fourth option in the operation type dropdown on the Operations page. When selected, the page morphs to present rollback-specific controls: a Source card (meta2 path with file/folder browse buttons, optional source directory, recursive checkbox), an Options card (flat restore, verify hashes, force overwrite, skip duplicates, restore sidecars), and a Target card (output directory with placeholder text showing the default). All indexing-specific cards are hidden during rollback mode. The destructive operation indicator shows green (non-destructive). The START button is disabled when the meta2 path field is empty. The rollback job runs in a background thread with progress reporting and cancellation support, producing a human-readable text summary on completion. All rollback-specific control states persist across application sessions.
- Public API: Rollback exports — Twelve rollback symbols (RollbackPlan, RollbackAction, RollbackStats, RollbackResult, SourceResolver, LocalSourceResolver, discover_meta2_files, execute_rollback, load_meta2, plan_rollback, verify_file_hash) and RollbackError added to __all__ in __init__.py. All symbols are importable via from shruggie_indexer import ....
- Docs: Rollback Guide — New docs/user-guide/rollback.md providing a dedicated rollback guide with plan-then-execute architecture overview, nine scenario walkthroughs (single renamed file, renamed directory, deduplicated files, aggregate output, non-renamed files, mixed sessions, flat mode, default target, vault delivery), sidecar reconstruction, timestamp restoration, hash verification, conflict resolution, and error handling documentation.
- Docs: Rollback Python API — docs/user-guide/python-api.md updated with a full Rollback API section covering load_meta2, discover_meta2_files, plan_rollback, execute_rollback, verify_file_hash, all four data classes, the SourceResolver protocol, LocalSourceResolver, and a vault integration example.
- Spec: §6.11 Rollback Operations — New specification section documenting the rollback engine's inputs, restore modes, target resolution, timestamp restoration, sidecar reconstruction, conflict resolution, mixed-session detection, SourceResolver protocol, error handling, and plan-then-execute architecture. Cross-referenced from §4.1, §6.10, §8.12, and §9.1.
- Sidecar:
.url text cascade — .url sidecar files are now stored as full text content using the standard JSON → text → binary fallback chain, preserving the complete INI-format content rather than extracting only the URL= value. This enables byte-perfect round-trip restoration during rollback. The attributes.type remains "link". Spec: §6.7.
- Sidecar: .lnk dual-storage handler — Windows .lnk shortcut files are now stored with dual representation: Base64-encoded binary for byte-perfect rollback, plus optional structured link_metadata (target path, working directory, arguments, icon location, description, hotkey) when the LnkParse3 library is available. The attributes.type is set to "shortcut" (new type, distinct from "link"). New core/lnk_parser.py module wraps LnkParse3 with graceful fallback when the library is not installed. Spec: §5.10, §6.7.
- Sidecar: JSON formatting style detection — JSON sidecar files now have their formatting style detected at ingest and recorded in attributes.json_style ("pretty" or "compact"). The rollback engine uses this field to restore JSON sidecars with the original whitespace convention, preventing compact JSON files from inflating in size during rollback. Spec: §5.10, §6.11.
- Schema: json_style attribute — New optional json_style field on MetadataAttributes ("pretty" or "compact"). Present only when attributes.format is "json". Spec: §5.10.
- Schema: link_metadata attribute — New optional link_metadata field on MetadataAttributes (dict of string key-value pairs). Present only when attributes.type is "shortcut" and LnkParse3 is available. Spec: §5.10.
- Dependencies: LnkParse3 optional dependency — New lnk optional dependency group (pip install shruggie-indexer[lnk]) providing Windows .lnk shortcut metadata extraction via LnkParse3>=1.4. Added to dev and all dependency groups.
- Tests: Sidecar round-trip fidelity — 12 new unit tests across test_sidecar.py and test_rollback.py covering .url text cascade, .lnk dual-storage, .lnk fallback without parser, JSON style detection (compact/pretty/non-JSON), and round-trip restoration fidelity for all three sub-features.
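
The canonical-selection rule from the provenance-preserving de-duplication entry (first file seen with a given content hash is canonical, later byte-identical files are absorbed into its duplicates array) can be sketched as follows. The `Entry` class and `dedup()` function are hypothetical stand-ins, not the DedupRegistry API from core/dedup.py, and SHA-256 alone is used here as a simplification.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Entry:
    """Simplified stand-in for an IndexEntry."""
    name: str
    content_hash: str
    # Simplification: the real schema omits this field when there are no duplicates.
    duplicates: List["Entry"] = field(default_factory=list)


def dedup(files: List[Tuple[str, bytes]]) -> List[Entry]:
    """Keep the first entry per content hash; absorb later ones as duplicates."""
    canonical: Dict[str, Entry] = {}
    for name, data in files:
        entry = Entry(name, hashlib.sha256(data).hexdigest())
        first = canonical.setdefault(entry.content_hash, entry)
        if first is not entry:
            # The absorbed entry keeps its own name, so provenance is preserved.
            first.duplicates.append(entry)
    return list(canonical.values())
```

Because each absorbed duplicate is stored as a complete entry rather than a bare path, a later rollback has enough information to recreate every original file.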
Fixed
- Core: Stage 6 consumed-sidecar deletion logging hardened — The MetaMergeDelete Stage 6 _drain_delete_queue() function in both CLI and GUI now deduplicates queued paths before unlinking, preventing spurious ERROR-level "file not found" messages when the same sidecar appears in the queue multiple times (which occurs when multiple content files in a directory each discover the same sidecar during pattern matching). Diagnostic DEBUG-level logging of the queue size (total and unique entries) is emitted before draining, providing an audit trail for the delete phase. Tests now exercise the real _drain_delete_queue() implementation from cli/main.py instead of a local copy. Two new test cases validate deduplication behaviour and ERROR-level logging on genuine failures. Spec: §6.10.
- Rollback: Content-hash collision detection scope narrowed — The collision detection key in _deduplicate_by_content_hash() was widened from (md5, sha256) to (md5, sha256, storage_name). The previous hash-only key incorrectly treated two distinct canonical files with the same content hash but different storage names (e.g., slippers.gif and slippers.png as byte-identical files with different extensions) as a cross-session collision, silently discarding one during rollback. This was a data loss bug. The fix ensures entries with different storage_name values are recognized as distinct files and proceed through the normal restore pipeline without warning. Additionally, the collision warning message now accurately describes the collision scope — "found in multiple sessions" for cross-session collisions, "with conflicting paths in session {id}" for intra-session collisions. Spec: §6.11.
- Core: MetaMergeDelete stale metadata cleanup (Stage 7) — MetaMergeDelete now includes a post-processing cleanup phase that removes _meta.json, _meta2.json, _directorymeta.json, and _directorymeta2.json files from prior indexer runs that survive the Layer 1 exclusion mechanism. Previously, these files accumulated silently across repeated runs because Layer 1 excluded them from traversal before sidecar discovery could see them, meaning they were never queued for deletion. Stale _meta2.json files from prior runs caused rollback to produce duplicate or misplaced restores. The cleanup function (cleanup_stale_metadata() in core/entry.py) runs after Stage 6 consumed-sidecar deletion, scans all traversed directories for matching files, protects current-run output sidecars, and deletes stale artifacts with INFO-level logging. Failed deletions log at WARNING and do not abort the operation. Spec: §6.10.
- Core: file_system.relative path computation — The relative path for entries produced by directory indexing operations now correctly uses the target directory as the index root (producing "." for the root entry), instead of the parent directory (which produced the target's own name as a prefix). This corrects a spec violation (§5.6) that caused rollback to create an extra nesting level in restored directory trees. All path reconstruction call sites in _write_inplace_tree(), _rename_tree(), and cleanup_duplicate_files() updated from root_path.parent / relative to root_path / relative. Breaking change: Existing _meta2.json files contain the old prefix-based relative paths. Rollback includes automatic legacy prefix detection to handle both formats (§6.11).
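
The effect of the collision-key change in the rollback fix above can be shown with a small grouping sketch. The dictionary layout of an entry and the `group_entries` helper are assumptions made for illustration; only the key shapes, (md5, sha256) versus (md5, sha256, storage_name), come from the entry itself.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_entries(entries: List[dict], use_storage_name: bool = True) -> Dict[Tuple, List[dict]]:
    """Group entries by collision key; a group with 2+ members is a collision."""
    groups = defaultdict(list)
    for entry in entries:
        key = (entry["md5"], entry["sha256"])
        if use_storage_name:
            key = key + (entry["storage_name"],)
        groups[key].append(entry)
    return dict(groups)


# Two byte-identical files with different extensions (placeholder hash values).
entries = [
    {"md5": "m1", "sha256": "s1", "storage_name": "slippers.gif"},
    {"md5": "m1", "sha256": "s1", "storage_name": "slippers.png"},
]

old_groups = group_entries(entries, use_storage_name=False)  # one group of two
new_groups = group_entries(entries)                          # two groups of one
```

With the hash-only key both files land in one group and tiebreaking would discard one of them; with the wider key each file forms its own group and both proceed to restoration.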
Changed
- Core: Windows ctime restoration uses ctypes — Replaced the pywin32 (win32file.SetFileTime) approach for Windows creation-time restoration during rollback with a zero-dependency ctypes.windll.kernel32 implementation using CreateFileW/SetFileTime/CloseHandle. Eliminates the conditional pywin32 import and the "pywin32 not available" fallback path. FILETIME conversion: int((unix_seconds + 11644473600) * 10_000_000). Spec: §6.11, §15.5.
- GUI: Rollback source-dir defaulting — _background_rollback() now defaults source_dir to the meta2 parent directory when the user leaves the source field empty, matching the CLI's defaulting logic in rollback_cmd(). Previously, an empty source field passed None to the resolver, causing all file resolutions to fail.
- Rollback: JSON sidecar restoration is now style-aware — _decode_sidecar_data() now uses attributes.json_style to select the serialization format when restoring JSON sidecars. "pretty" produces indented output (indent=2), "compact" produces minified output (separators=(",", ":")). When json_style is absent (backward compatibility), the default changed from indent=2 to compact. This fixes a bug where compact JSON sidecars (e.g., minified .info.json files) inflated in size during rollback. Spec: §6.11.
- Sidecar:
.url files stored as full content — .url sidecar files are no longer parsed to extract just the URL= value. The complete file content is preserved via the standard text cascade, enabling byte-perfect round-trip restoration. Spec: §6.7.
- Sidecar: .lnk files use "shortcut" type — .lnk sidecar files are now classified as attributes.type = "shortcut" instead of "link", reflecting their distinct storage strategy (Base64 binary vs. text). Spec: §5.10, §6.7.
- Core: Rename collision logging elevated to ERROR — The collision branch in rename_item() now logs at ERROR level (was WARNING). After the dedup pass removes duplicates from the entry tree, same-directory collisions should never occur; if they do, it indicates a bug in the dedup pipeline. The message explicitly states this is a defensive guard.
- Docs: Ecosystem direction amended for de-duplication — New .archive/20260301-003-ecosystem-direction.md supersedes the original ecosystem direction document with five amendments: added single-session dedup to indexer responsibilities, added cross-session dedup exclusion to indexer non-responsibilities, amended the indexer's filesystem-mutation invariant to reflect rename + dedup behavior gated behind --rename, added cross-session dedup to catalog responsibilities, and added a new "De-Duplication Scope Boundaries" section documenting the two-scope architecture and shared core.dedup module reuse.
- GUI: Exit button — Added an "Exit" button to the bottom of the sidebar, positioned above the version label. The button uses a red accent color (#c0392b/#a93226) to visually distinguish it from navigation buttons. Clicking it triggers the same close/cleanup sequence as the window's close button (WM_DELETE_WINDOW), including the cancellation confirmation dialog when an operation is in progress.
- GUI: Log file path display — The Settings page "Output & Logging" card now includes a read-only entry field showing the computed log file path (platform-specific log directory with <timestamp>.log naming). The field is always visible but displays greyed-out/muted text when logging to file is disabled or log level is "None".
- GUI: "None" log level — The Log Level dropdown on the Settings page now includes a "None" option that suppresses all logging output. When selected, no log messages are routed to the GUI log panel during operations (a static notice is displayed instead), and no log file is written even if "Write log files" is checked. This is a GUI-only feature; the CLI is not affected.
- GUI: Advanced Configuration improvements — Each subsection within the Advanced Configuration area is now independently collapsible via its own disclosure caret (▶/▼), using the same _LabeledGroup pattern as the Operations page cards. Clicking a subsection header toggles only that subsection; the parent "Advanced Configuration" toggle still controls overall visibility. Six subsections are now displayed: Filesystem Exclusions, Metadata Identification, Metadata Exclusion, ExifTool, Extension Groups (new), and Extension Validation. Each subsection includes a brief description in muted text explaining the configuration group's purpose. All read-only textboxes now display the complete, untruncated default values from config/defaults.py — the BCP 47 language-code alternation and other long patterns are no longer truncated with "...". List-type values are rendered one item per line for improved readability. Subsection expanded/collapsed states persist across sessions independently of the parent section. All subsections default to collapsed on first launch.
- Infra: Canonical path resolver module — New app_paths.py module provides get_app_data_dir() and get_log_dir() as the single source of truth for all application data directories on all platforms. Every module that reads or writes application data now imports from this module rather than resolving paths independently. This prevents the namespace and path inconsistencies that arose when separate sprints resolved paths in isolation.
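
A single-source path resolver of the kind described for app_paths.py could be sketched as below, using the per-platform locations that the application data directory consolidation entry lists. Only the name get_app_data_dir() comes from this changelog; the parameters, the fallback when LOCALAPPDATA is unset, and the function body are assumptions rather than the module's actual code.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional

_VENDOR = "shruggie-tech"
_APP = "shruggie-indexer"


def get_app_data_dir(
    platform: str = sys.platform,
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Path:
    """Resolve the per-platform application data directory.

    The platform, env, and home parameters exist only to make the sketch testable.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    if platform.startswith("win"):
        # Assumed fallback if LOCALAPPDATA is missing from the environment.
        base = Path(env.get("LOCALAPPDATA", str(home / "AppData" / "Local")))
    elif platform == "darwin":
        base = home / "Library" / "Application Support"
    else:
        base = home / ".config"
    return base / _VENDOR / _APP
```

Routing every caller through one function is what keeps the lowercase namespace and the base directory consistent across session, config, and log code paths.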
Changed
- GUI: Settings "Custom Config File" label and documentation link — The "Config File" label on the Settings page Configuration card has been renamed to "Custom Config File" for clarity. A clickable hyperlink to the Configuration File Format documentation is now displayed below the entry field, opening the relevant documentation section in the user's default browser.
- GUI: Drag handle repositioned above START button — The drag handle (resizer bar) for the output panel has been moved from below the START button to above it, sitting between the scrollable input area and the fixed progress/action region. This places the resize control at the natural boundary between the configuration surface and the action area. The drag handle is now owned by the OperationsPage frame rather than the main application area, eliminating the need for explicit pack/unpack management during tab switches.
- GUI: Target Type converted from radio buttons to dropdown — The Type selector in the Target section of the Operations page has been changed from a CTkRadioButton group to a CTkOptionMenu dropdown, matching the visual style of other selectors on the page (Operation type, Output mode, ID Algorithm). Dropdown values are "Auto", "File", and "Directory" with identical behavior to the previous radio buttons. Session state persistence is fully preserved.
- Infra: Application data directory consolidation — All application data (session files, configuration, log files) is now written to a single directory per platform: %LOCALAPPDATA%\shruggie-tech\shruggie-indexer\ (Windows), ~/.config/shruggie-tech/shruggie-indexer/ (Linux), ~/Library/Application Support/shruggie-tech/shruggie-indexer/ (macOS). Previously, config/session data used %APPDATA% (Roaming) while logs used %LOCALAPPDATA% with an incorrect ShruggieTech (PascalCase) namespace. The "Open Config Folder" button now opens a directory that contains session, config, and log files in one place. Three-tier read fallback preserves access to files at legacy locations (v0.1.1 Roaming, v0.1.0 flat). Legacy files are never deleted.
- Infra: Namespace casing normalized — The ShruggieTech (PascalCase) directory name used by the logging subsystem has been corrected to shruggie-tech (kebab-case), matching all other ecosystem references. On Linux, log files previously wrote to ~/.local/share/shruggie-indexer/logs/ — they now write to ~/.config/shruggie-tech/shruggie-indexer/logs/, consistent with the canonical base.
- GUI: Remove forced output toggle on completion — The output/log panel no longer force-switches to the Output tab when an indexing operation completes. If the user is viewing the Log tab during an operation, the Log tab remains active after completion; the Output button briefly highlights to signal that new content is available. Respects user agency per §10.9.7.
- GUI: START button width reduced — The START button maximum width is now ~25% of the parent width (capped at 175px, minimum 120px), reduced from the previous ~45%/350px. This brings the button into better visual proportion with the surrounding controls.
- GUI: Cancel button dimensions matched to START — The Cancel (STOP) button now uses the same height (36px), font size (14pt bold), and dynamic width constraint (25%/175px max) as the START button, ensuring identical pixel dimensions across both states.
- GUI: Output mode constraint expansion — "View only" is no longer removed from the output mode dropdown when constrained. Instead, it always appears in the list; selecting it while a constraint is active triggers an immediate snap-back to the appropriate default (Single file for file targets, Multi-file for directory targets) with an explanatory info-label message. Two constraint conditions now apply: (1) Operation is Meta Merge Delete — destructive operations require a persistent output record. (2) Rename is active — rename requires writing files to disk, which is incompatible with view-only mode. When both conditions apply simultaneously, the Meta Merge Delete message takes priority. The info-label clears when no constraint is active.
- GUI: Collapsible operations cards — The Target, Options, and Output cards on the Operations page are now collapsible. Each card header displays a disclosure caret (▶ collapsed / ▼ expanded) and the entire header row is clickable (cursor="hand2"). Collapsed/expanded state for each card persists across application sessions via the session file. All three cards default to expanded on first launch.
- GUI: Drag handle grip indicator — The drag handle between the configuration area and the output/log panel now displays a centered three-dot grip indicator (• • •) in a muted color, visually communicating that it is interactive. Handle height increased from 6px to 8px and background color slightly differentiated from the default frame color for improved visibility as a secondary region boundary cue.
- GUI: Region boundary separators — Added uniform 1px separator lines (gray50) at the boundary edges of anchored (non-scrollable) regions on both the Operations and Settings pages. Separators appear at the bottom of the Operations page title, the top of the Operations page bottom anchored region (progress/action area), and the bottom of the Settings page title. Provides clear visual delineation between fixed header regions and scrollable content areas at all window sizes.
- GUI: About page header separator — Added a 1px gray50 bottom border below the About page title, matching the separator convention already applied to the Operations and Settings page titles. All three page titles now have consistent header-region separators.
- GUI: "Shared Settings" status text styling — Changed the "Shared Settings (not yet available)" label in the Advanced Configuration section from italic gray (gray60) to non-italic red (#cc0000), making the unavailable status visually prominent as a warning indicator rather than a subtle note.
- GUI: Advanced Configuration textbox auto-sizing — Increased read-only textbox heights in the Advanced Configuration subsections to accommodate their full default content without scrolling. Heights are calculated as min(lines × 20 + 10, 300) pixels. Affected textboxes: Excluded names (100→230px), Exclude extensions (60→130px), Base arguments (100→290px), Metadata Identification (250→300px), Exclude patterns (60→70px), Extension Groups (200→300px). Single-line textboxes and already-large textboxes are unchanged.
- GUI: Settings page reorganization — Consolidated the six top-level Settings sections (Indexing Defaults, Output Preferences, Logging, Interface, Configuration, Advanced Configurations) into four card-based sections (Indexing, Output & Logging, Interface, Configuration) plus the existing Advanced Configuration collapsible section. Each section is now presented as a visually bounded card container (_LabeledGroup) with consistent styling matching the Operations page cards. The "Output Preferences" and "Logging" sections are merged into "Output & Logging". Controls within the merged section are reordered: JSON Indentation, Write log files, Log Level, Log file path. The log level control is now a CTkOptionMenu dropdown (replacing radio buttons) with options: None, Normal, Verbose, Debug. The "Reset to Defaults" and "Open Config Folder" buttons are separated from the scrollable card area and anchored to a fixed (non-scrollable) region at the bottom of the Settings page, separated by a visible divider.
- GUI: File logging respects Settings controls — The persistent file handler now respects both the "Write log files" checkbox and the "None" log level setting. When either condition suppresses logging, the file handler is detached and closed. Re-enabling logging creates a new file handler. Session state migration handles the transition from lowercase to capitalized log level values.
- Spec SS10.9.2: Region boundary clarity — Updated "Region boundary clarity" paragraph in the Layout stability standard to include the About page title separator, completing coverage across all three pages. Documented the header-region bottom border convention.
- Spec §10.4: Advanced Configuration styling — Added amendment documenting the red non-italic styling for "Shared Settings (not yet available)" status text and the auto-sizing behavior for read-only textbox heights.
- Spec §3.1.1: Archive naming convention — Codified the .archive/ file naming convention (<YYYYmmdd>-<ZZZ>-<title>.<ext>) in the technical specification as a new §3.1.1 sub-section. Documents the date-scoped increment reset rule, field definitions, batch label convention, and examples. The .archive/ table entry in §3.1 now cross-references the new sub-section.
- Docs: GUI guide audit corrections — Audited docs/user-guide/gui.md against the running application after all Batch 1 GUI changes. Removed false claim that Rename forces Multi-file output mode (Rename constrains View only but does not force Multi-file). Added documentation for tab-preservation behavior on operation completion (active Output/Log tab is preserved; Output button briefly highlights green when new content arrives while Log tab is active). Updated session persistence section to explicitly list card collapsed/expanded states and Advanced Configuration section states. Fixed log level tooltip in app.py to accurately describe the mapping: Normal → warnings and errors (was incorrectly labeled "INFO-level").
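
The two sizing rules in this list (textbox auto-sizing and the START/Cancel button width) are simple clamps, sketched below. The function names and the integer rounding are assumptions for illustration; the line counts used in the examples are inferred from the pixel values quoted above, not taken from the source code.

```python
def textbox_height(line_count, line_px=20, padding_px=10, max_px=300):
    """Height rule quoted above: min(lines x 20 + 10, 300) pixels."""
    return min(line_count * line_px + padding_px, max_px)


def action_button_width(parent_width, fraction=0.25, min_px=120, max_px=175):
    """Roughly 25% of the parent width, clamped to the 120-175px range.

    Truncating to an int is an assumption; the changelog only says "~25%".
    """
    return max(min_px, min(int(parent_width * fraction), max_px))


# 11 lines fits the 230px quoted for "Excluded names"; 40 lines hits the cap.
heights = [textbox_height(n) for n in (3, 6, 11, 14, 40)]
widths = [action_button_width(w) for w in (400, 600, 1000)]
```

Under these assumptions a narrow window pins the button at its 120px floor and a wide one at the 175px cap, with proportional sizing only in between.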
Fixed
- Core: Recursive rollback failing for subdirectory files — LocalSourceResolver.resolve() only searched search_dir (the tree root), but in recursive mode content files reside in subdirectories alongside their _meta2.json sidecars. Added Strategy 3 (origin-directory fallback) to LocalSourceResolver and origin-dir annotations in load_meta2(). Root files and subdirectory files are now resolved correctly during recursive rollback.
- Core: Timestamp restoration failing on Windows — _restore_timestamps() attempted to import pywin32 (not a project dependency) and silently skipped ctime restoration when it was unavailable, producing a misleading "pywin32 not available" DEBUG message. Replaced with ctypes.windll.kernel32 calls that are unconditionally available on all Windows Python installations.
- GUI: Rollback source-dir not defaulted — When the user left the source directory field empty in the GUI rollback panel, source_dir was passed as None to the resolver, causing all file resolutions to fail with "Source file not found". The CLI had defaulting logic (source_dir = meta2.parent) that the GUI lacked. Added matching default logic to _background_rollback().
- GUI: Log file writing non-functional — The persistent file handler was created during __init__() before session restore, then immediately detached by _sync_file_logging() during session restore when the persisted log_to_file value was False. This left 0-byte orphan log files on disk and no active file logging for subsequent operations. Fixed by moving _setup_file_logging() to after _restore_session() so the handler is only created when the user's persisted preference (and default) authorizes it. Changed the default value of log_to_file_var from False to True, aligning with the "always on" design intent documented in §11.1. Root cause confirmed via print-based tracing of the initialization, session restore, and sync code paths.
- GUI: Log file buffering (0-byte on disk) — The FileHandler's underlying stream could buffer write data, causing log files to appear as 0 bytes in file managers until a flush was triggered by a later event (e.g., navigating to the Settings page). Fixed by reconfiguring the handler's stream with line_buffering=True after creation and adding an explicit flush() after the initial startup log records. Log files are now non-empty on disk immediately after application startup.
- Spec §10.4: Incorrect Windows path — The "Open Config Folder" description in §10.4 referenced %APPDATA% instead of %LOCALAPPDATA%. Corrected to reference the canonical application data directory defined in §3.3a.
- GUI: Output panel Save/Copy button state staleness — The Save and Copy buttons on the output panel did not reliably reflect the active view's content state during log streaming. Buttons remained greyed out on the Log tab until the user manually toggled to Output and back. Root causes: (1) set_json() directly set button state based on JSON content, overriding the active-view check when the user was viewing the Log tab. (2) No periodic convergence mechanism existed to correct stale button state. Fixed by adding _poll_button_state() (1-second interval polling of _update_button_state()), and replacing direct button manipulation in set_json() and set_status_message() with delegation to _update_button_state(), which always evaluates the active view's content.
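
The log file buffering fix above can be illustrated with the standard library alone. This is a sketch of the technique the entry names (line-buffered stream plus an explicit flush after startup records), not the project's code; the helper name make_line_buffered_handler, the logger name, and the temporary file location are placeholders.

```python
import logging
import os
import tempfile


def make_line_buffered_handler(path):
    """Hypothetical helper: a FileHandler whose text stream flushes per line."""
    handler = logging.FileHandler(path, encoding="utf-8")
    # FileHandler opens a text stream immediately (delay=False by default);
    # TextIOWrapper.reconfigure() switches that stream to line buffering.
    handler.stream.reconfigure(line_buffering=True)
    return handler


log_path = os.path.join(tempfile.mkdtemp(), "example.log")
logger = logging.getLogger("buffering_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger

handler = make_line_buffered_handler(log_path)
logger.addHandler(handler)

logger.info("application started")
handler.flush()  # explicit flush after the initial startup record

# Measured while the handler is still open, as a file manager would see it.
size_after_startup = os.path.getsize(log_path)

logger.removeHandler(handler)
handler.close()
```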
0.1.1 - 2026-02-25
Added
- Schema: session_id field on IndexEntry — New optional session_id field (UUID4 string) identifies the indexing invocation that produced each entry. All entries within a single CLI, GUI, or API invocation share the same value. Enables downstream consumers (primarily shruggie-catalog) to perform run correlation, staleness detection, provenance linking, and batch integrity verification. The field is optional and absent from the required array; schema_version remains 2. When index_path() is called without an explicit session_id, one is auto-generated. Ref: spec §18.1.14.
- Schema: indexed_at field on IndexEntry — New optional indexed_at field (TimestampPair) records the UTC moment each IndexEntry was constructed by the indexer. Distinct from the file's own filesystem timestamps — records the indexer's observation time. Each entry receives its own unique indexed_at value. Supports ordering, TTL/freshness policies, and debugging for downstream consumers.
- Tests: Catalog readiness conformance tests — Added TestSessionIdAndIndexedAt test class (7 cases) to tests/conformance/test_v2_schema.py validating schema conformance for entries with/without session_id and indexed_at, UUID4 pattern enforcement, real-file entry population, and session_id threading through directory children.
- Tests: Entry builder unit tests — Added TestSessionIdThreading (3 cases) and TestIndexedAtTimestamp (4 cases) to tests/unit/test_entry.py verifying session_id propagation through build_file_entry() and build_directory_entry(), indexed_at population, and default None values for directly constructed entries.
- GUI: Advanced configuration scaffold — The Settings tab now includes a collapsed "Advanced Configuration" section below existing settings. Expanding the section reveals five labeled groups — Filesystem Exclusion, ExifTool, Extension Validation, Sidecar Identification, and Metadata Exclusion — each displaying compiled default values in read-only monospace textboxes. Includes a cosmetic "Shared Settings" / "Indexer-Specific Settings" separator preparing for future cross-tool configuration. Each group has a disabled "Reset to Defaults" button placeholder. Full editing, data binding, and persistence are deferred to a post-v0.1.1 release. The section's expanded/collapsed state is persisted across sessions.
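
The session_id and indexed_at semantics above (one shared UUID4 per invocation, one observation timestamp per entry) can be sketched as follows. build_entries() is a hypothetical stand-in, not the real index_path() or builder functions, and indexed_at is shown as a plain ISO 8601 string because the actual TimestampPair layout is not given in this changelog. The regular expression is one common way to express a UUID4 pattern and may differ from the schema's.

```python
import re
import uuid
from datetime import datetime, timezone

# A typical lowercase UUID4 pattern (version nibble 4, variant 8/9/a/b).
UUID4_PATTERN = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$"
)


def build_entries(names, session_id=None):
    """Hypothetical stand-in for one indexing invocation.

    A session ID is generated once when the caller omits it, then threaded
    into every entry; each entry records its own construction time in UTC.
    """
    if session_id is None:
        session_id = str(uuid.uuid4())
    entries = []
    for name in names:
        entries.append(
            {
                "name": name,
                "session_id": session_id,
                "indexed_at": datetime.now(timezone.utc).isoformat(),
            }
        )
    return entries


entries = build_entries(["a.txt", "b.txt", "c.txt"])
second_run = build_entries(["a.txt"])
pinned = build_entries(["a.txt"], session_id=entries[0]["session_id"])
```

A downstream consumer can then group entries by session_id to correlate a run, while indexed_at orders entries within it.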
Changed
- Infra: Shared configuration namespace — Configuration and session storage paths now use a shared ShruggieTech ecosystem directory (<base>/shruggie-tech/shruggie-indexer/) instead of the tool-specific <base>/shruggie-indexer/ path used in v0.1.0. Both the TOML config loader and GUI SessionManager check the new path first, fall back to the legacy path with an INFO-level migration recommendation, and always write to the new path. Old files are never deleted. Prepares the directory structure for future ecosystem tools (shruggie-catalog, shruggie-vault, shruggie-sync).
- Archive: Normalized naming convention — All files in .archive/ now follow a <YYYYmmdd>-<ZZZ>-<title>.<ext> naming convention. Renamed shruggie-indexer-spec.pdf → 20260224-001-shruggie-indexer-spec.pdf, shruggie-indexer-spec.html → 20260224-001-shruggie-indexer-spec.html, and shruggie-indexer-plan.md → 20260219-002-shruggie-indexer-plan.md. Updated all spec references (§1.5, §3.1) to the new filenames.
- Docs: Spec download link — Documentation site Quick Links now includes both a Markdown (GitHub) link and a raw PDF download link for the technical specification.
- Core: index_path() accepts optional session_id keyword argument — The public API function index_path() now accepts an optional session_id: str | None parameter. When omitted, a UUID4 is auto-generated per call. CLI and GUI entry points generate and pass their own session IDs.
- Core: build_file_entry() and build_directory_entry() accept session_id — Both builder functions now accept a session_id keyword argument and thread it into all recursive child entries. Both functions generate an indexed_at timestamp at construction time.
- Schema: v2 JSON Schema updated — Added session_id (with UUID4 pattern validation) and indexed_at (TimestampPair $ref) property definitions to the root IndexEntry schema. Neither field is added to the required array. additionalProperties remains false.
- Docs: Schema reference updated — docs/schema/index.md updated to document session_id and indexed_at in the top-level fields table with a note on their optional status.
- Docs: Changelog auto-sync — docs/changelog.md is now auto-copied from the repository root CHANGELOG.md during CI build, eliminating manual synchronization drift. The docs CI workflow (docs.yml) copies the root changelog before mkdocs build and triggers on CHANGELOG.md changes. A header comment in docs/changelog.md warns against direct edits.
- Docs: Python API reference updated — Updated docs/user-guide/python-api.md to reflect the new session_id parameter on index_path(), build_file_entry(), and build_directory_entry(), and the new session_id and indexed_at fields on IndexEntry.
- Docs: GUI guide updated — Added Advanced Configuration section documentation to docs/user-guide/gui.md describing the collapsed settings scaffold, its five configuration groups, and the deferred editing note.
- Tests: sidecar-testbed fixture — Created tests/fixtures/sidecar-testbed/ directory tree with 19 files across 3 subdirectories, exercising all sidecar exclusion, merge, delete, and rename scenarios. Includes content files, sidecar variants (json_metadata, description, screenshot, hash, yaml), prior-run indexer output artifacts, and false-positive candidates (standalone_notes.txt).
- Core: rename_inplace_sidecar() function — New public function in core/rename.py that renames an in-place _meta2.json sidecar file from {original_name}_meta2.json to {storage_name}_meta2.json after a content file is renamed. Prevents orphaned sidecars and incorrect sidecar naming when rename is active. Exported via shruggie_indexer.__init__.
- Core: Rename phase diagnostic logging — Added comprehensive DEBUG-level logging to _rename_tree() in both GUI and CLI. Each rename candidate now logs its type and storage_name. Directory entries log item count on descent and skip reason when empty.
- Core: build_directory_entry() Layer 2 sidecar filtering — Moved sidecar-pattern exclusion (Layer 2) from list_children() into build_directory_entry(). The full file list from list_children() is preserved as siblings for sidecar discovery, while a filtered entry_files list (excluding recognized sidecar patterns) is used for child entry construction. This architectural change ensures sidecar discovery can find companion files while preventing them from being indexed as standalone items.
- Core: list_children() now applies Layer 1 filtering — list_children() in traversal.py now applies metadata_exclude_patterns (compiled regexes) against all filenames after the scandir loop, before sorting. This is a new filtering step that was previously absent from the traversal module.
- Core: shutdown_exiftool() API — New public function (shruggie_indexer.shutdown_exiftool()) that explicitly terminates the persistent ExifTool batch-mode process. Registered via atexit for automatic cleanup and called explicitly by the GUI's _on_close() handler.
- Spec: v0.1.1 synchronization — Updated technical specification to reflect all v0.1.1 implementation changes. Rewrote §3.3 (config file locations with shruggie-tech ecosystem namespace and legacy fallback). Updated §3.6 (docs tree with getting-started/, user-guide/, top-level changelog.md, assets/). Added §3.7.3 (changelog synchronization mechanism). Updated §5.3 and §5.4 (session_id and indexed_at field documentation). Updated §6.8 (entry construction signatures and session/timestamp steps). Updated §7.6 (config path table). Updated §9.2 (public API signatures and behavioral contract). Updated §10.1 (session file paths with migration behavior). Updated §10.4 (Open Config Folder paths, added Advanced Configuration scaffold subsection). Updated §11.4 (session lifecycle for JSON output). Marked §18.1.14 as IMPLEMENTED. Updated §18.2.2 (session_id struck through as implemented).
- Docs: GUI screenshot infrastructure — Created docs/assets/images/gui/ directory for storing annotated GUI screenshots. Added a "To Do" section to the GUI documentation page (docs/user-guide/gui.md) listing the required screenshots.
- Config: User-customizable ExifTool key exclusions — Added exiftool.exclude_keys (replace) and exiftool.exclude_keys_append (extend) configuration keys for controlling which metadata keys are filtered from ExifTool output. The compiled default set is unchanged; users can now extend or replace it via TOML configuration files or API overrides without modifying source code. Added exiftool_exclude_keys field to IndexerConfig, DEFAULT_EXIFTOOL_EXCLUDE_KEYS to compiled defaults, and TOML merge logic for both replace and append modes. Includes eight new unit tests covering replace mode, append mode, TOML loading, config loader round-tripping, and end-to-end extraction with custom sets.
- Tests: Non-zero exit metadata recovery — Added TestNonZeroExitMetadataRecovery test class with five cases covering batch recovery, subprocess recovery, empty-stdout fallback, helper reset on unrecoverable errors, and Error key exclusion.
- CLI: --log-file flag — New option to write log output to a persistent file. --log-file (no argument) writes to the default platform-specific app data directory; --log-file <path> writes to a custom location. Log files are named YYYY-MM-DD_HHMMSS.log and include timestamps, session ID, and logger name.
- TOML: [logging] configuration section — Added logging.file_enabled and logging.file_path keys to enable persistent log file output via configuration files.
- GUI: "Write log files" settings toggle — New checkbox in the Settings page Logging section. When enabled, each operation writes a timestamped log file to the app data directory. Toggle state is persisted across sessions.
- GUI: 3-mode output system — Replaced the dry-run checkbox, in-place checkbox, output mode radio buttons (View only/Save to file/Both), and editable output file entry with a single CTkOptionMenu dropdown offering three modes: Single file (one aggregate JSON), Multi-file (per-item in-place sidecars), and View only (display in output panel, nothing written to disk). The output path is now auto-computed and displayed in a read-only entry. Multi-file mode is only available for directory targets; View only is unavailable for Meta Merge Delete. Constraint enforcement and fallback logic are handled by _reconcile_controls().
- GUI: ExifTool process cleanup on exit — The _on_close() handler now explicitly calls shutdown_exiftool() before self.destroy(), ensuring the persistent ExifTool batch-mode process is terminated cleanly on window close. An atexit fallback is also registered for non-GUI exit paths.
- GUI: Output panel tab-aware completion — Operation completion no longer force-switches the output panel to the Output tab. If the user is viewing the Log tab, the Output button flashes green for 3 seconds to signal new content availability; if already on the Output tab, content refreshes in place.
- GUI: Stage 6 post-processing in background job — _background_job() now executes the full CLI-equivalent Stage 6 pipeline: rename loop (rename_item()), in-place sidecar writes (write_inplace()), and sidecar delete queue drain — matching the backend behavior of the CLI entry point.
- Spec SS10.3: Output system documentation — Updated §10.3 output controls table, visual grouping description, context-sensitive options, output file auto-suggest section, and ASCII wireframe to reflect the 3-mode dropdown model replacing radio buttons and editable path fields.
- Spec SS10.6: Post-job display and tab behavior — Updated post-job display behavior table to use Single file/Multi-file/View only modes. Added documentation for set_json() tab-aware behavior (no forced switch, green flash indicator).
- Spec SS10.1: Session JSON — Removed inplace and output_file from the session format example; output_mode now uses single/multi/view keys with single as default.
- Spec SS10.5: Application close handler — Added documentation for the _on_close() lifecycle (running-job confirmation dialog, shutdown_exiftool() call, session save, destroy()).
- Spec SS10.9.5: Output handling clarity — Updated "Output path selector" references to "Output mode dropdown" and "computed path display."
- Spec SS10–SS18: Technical specification tone overhaul (Phases D–E) — Completed the full tone rewrite across Sections 10 through 18. All remaining "The port", "Deviation from original", and "Improvement over original" phrasing is replaced with authoritative present-tense voice and "Historical note (DEV-XX):" callouts. The entire specification now reads as a standalone tool reference with zero porting-era language.
- Spec Phase F: Cross-reference audit and docs-spec sync — Validated all 634 SS X.Y cross-references and 186 section headings. Fixed two broken anchor links (#108-about-tab → #1082-about-tab). Reconciled four behavioral contradictions between the documentation site and the spec: corrected in-place sidecar naming in §8.3, fixed exception module location in §9.4, added exceptions.py to §3.2 layout tree and package files table.
- Spec SS6.6: ExifTool error recovery — Rewrote the error handling section to document metadata recovery from non-zero exits. ExifTool exit code 1 (e.g., "Unknown file type") now triggers JSON inspection and metadata preservation rather than discarding all output. Updated the error handling table, added ExifToolExecuteError recovery documentation, documented process reset policy (only on true process failure), and ExifTool:Error INFO-level logging.
- Spec SS17.5: Batch mode error isolation — Rewrote the error isolation paragraph to distinguish recoverable per-file errors (non-zero exit with valid JSON) from true process failures (crash, broken pipe). Only true failures trigger persistent process reset.
- Spec SS6.6+SS7.4: Configurable key exclusion — Updated the key exclusion paragraph in §6.6 from "not currently configurable" to document the exiftool.exclude_keys (replace) and exiftool.exclude_keys_append (append) configuration keys. Added key exclusion configuration subsection to §7.4, added exclude_keys entries to the §7.7 collection field table, and added commented examples to the §7.6 TOML example.
- Spec SS11.1+SS11.5+SS8.7: Log file support — Documented optional persistent log file output across all three delivery surfaces: CLI --log-file flag, TOML [logging] section, and GUI settings toggle. Added platform directory table, file naming convention, and FileHandler lifecycle documentation.
- Spec SS10.3: Control dependency matrix — Updated the in-place sidecar checkbox states (available for all operations, forced ON for Meta Merge Delete). Added centralized state reconciliation subsection documenting _reconcile_controls().
- Spec SS10.5: Progress area and log forwarding — Added notes on fixed-height progress area per §10.9.7 and progress-to-log forwarding.
- Spec Batch 7 amendments (A–J) — Applied 10 amendments to the technical specification reflecting Batch 6 and Batch 7 behavioral changes. (A) §6.1: Documented two-layer sidecar exclusion mechanism in list_children() with Layer 1 (metadata_exclude_patterns) and Layer 2 (metadata_identify union). (B) §6.2: Corrected build_sidecar_path() documentation from <item_path>/_directorymeta2.json to <item_path>/<dirname>_directorymeta2.json. (C) §6.9: Documented {dirname}_directorymeta2.json in-place naming convention and aggregate vs. in-place distinction. (D) §6.9: Documented in-place sidecar rename coordination (write-then-rename). (E) §6.10: Updated collision detection from RenameError to skip-and-warn with WARNING-level logging. (F) §6.10: Documented sidecar rename during file rename phase. (G) §6.10: Documented MetaMergeDelete execution loop with INFO/ERROR logging and pipeline ordering. (H) §7.5: Clarified metadata_exclude_patterns end-anchored regex matching. (I) §8.4: Documented delete_queue orchestrator wiring. (J) §14.2: Added test specifications for test_sidecar_exclusion.py, test_meta_merge_delete.py, test_integration_mmd_pipeline.py with pytest markers. Updated spec date from 2026-02-24 to 2026-02-25.
- Spec SS10.6: Save/Copy button enablement — Updated enablement rules to per-view semantics (Output tab vs Log tab).
- Spec SS1–SS9: Technical specification tone overhaul (Phases A–C) — Rewrote Sections 1 through 9 of the technical specification to shift the document's voice from a porting diary to an authoritative standalone tool specification. All "Deviation from original" and "Improvement over original" callouts are relabeled to "Historical note" blockquotes. References to "the port" are replaced with "the tool" or "shruggie-indexer"; references to "the original" are demoted into Historical note callouts or rephrased as "the legacy implementation." Body text now leads with what the tool does, not how it differs from the PowerShell predecessor. DEV-XX codes and SS X.Y cross-references are preserved throughout.
- ExifTool: _filter_keys() accepts configurable exclusion set — _filter_keys() now takes an exclude_keys parameter instead of referencing the module-level constant. The exclusion set is resolved from IndexerConfig.exiftool_exclude_keys and threaded through all backend call paths (_extract_batch, _extract_subprocess, _parse_json_output, _recover_metadata_from_error). The module-level EXIFTOOL_EXCLUDED_KEYS constant is retained for backward compatibility and reference.
- GUI: Centralized control reconciliation — Replaced _update_controls() with _reconcile_controls(), a single method implementing the full dependency matrix for all Operations page controls. Manages recursive toggle state across five target scenarios, in-place sidecar forcing for Meta Merge Delete, SHA-512 settings sync, output placeholder text, and destructive indicator updates. All control-change callbacks now route through this method.
- Expanded EXIFTOOL_EXCLUDED_KEYS from 8 to 24 entries. Added SourceFile, redundant timestamp/size keys already captured in IndexEntry fields, OS-specific filesystem attributes (FileAttributes, FileDeviceNumber, FileInodeNumber, etc.), and ExifTool operational keys (Now, ProcessingTime).
- GUI: Consolidated tab layout — Replaced four separate operation tabs (Index, Meta Merge, Meta Merge Delete, Rename) with a single Operations page using an operation-type selector dropdown. Sidebar now contains three tabs: Operations, Settings, and About.
- GUI: Destructive-operation indicator — Added a real-time visual indicator (green/red dot) that reflects whether the selected operation and dry-run state combination is destructive.
- GUI: Labeled control groups — Reorganized controls into bordered, labeled groups (Operation, Target, Options, Output) with contextual descriptions. Controls show or hide dynamically based on the selected operation type.
- GUI: Always-visible controls with enable/disable — All input controls (Target, Options, Output) are now always visible regardless of the selected operation type. Controls that do not apply to the current operation are disabled with a brief explanatory label instead of being hidden. This makes the full option space discoverable at all times.
- GUI: Rename as feature toggle — Rename is no longer a standalone operation type. It is now a "Rename files" checkbox in the Options group that can be combined with any of the three operation types (Index, Meta Merge, Meta Merge Delete). The operation type dropdown has been reduced from four entries to three.
- GUI: Target/Type validation — The application now detects file-vs-directory conflicts between the target path and the selected type. A red inline error message appears below the type selector, and the START button is disabled until the conflict is resolved. The Recursive checkbox is automatically disabled when the target is a single file.
- GUI: SHA-512 settings sync — When "Compute SHA-512 by default" is enabled in Settings, the SHA-512 checkbox on the Operations page is forced on and cannot be unchecked, with a "(Enabled in Settings)" explanation label. Changing the setting syncs immediately.
- GUI: Green START button — The action button now always displays "▶ START" (green) regardless of operation type, centered with a max width of 50% of the window. During execution it changes to "■ Cancel" (red). The label no longer varies by operation.
- GUI: Removed static scrollbar — The Operations page input area now uses a plain frame instead of a scrollable frame, eliminating a persistent scrollbar that appeared even when content fit without scrolling.
- GUI: Dual browse buttons — Target input now shows separate "File…" and "Folder…" browse buttons when the target type is "auto", and a single context-appropriate button otherwise.
- GUI: Persistent output file entry — Output file path field is always visible with auto-suggested paths that update as target and operation change, while preserving manual user edits.
- GUI: Auto-clear output on run — Output panel is automatically cleared when starting a new operation.
- GUI: Output panel clear button — Added a "Clear" toolbar button alongside the existing Save and Copy buttons.
- GUI: Save-mode completion message — When output mode is "save", the completion panel now shows a status message instead of attempting to render an empty JSON result.
- GUI: About tab — New About tab displaying project description, version, Python and ExifTool info, documentation and website links, and attribution.
- GUI: Sidebar version label — Application version displayed at the bottom of the sidebar in a muted font.
- GUI: Library log capture — Core library log messages are now captured via a queue handler attached to the `shruggie_indexer` logger and streamed to both the progress and output panels. Verbosity level is controlled from Settings.
- GUI: Copy button feedback — Copy button briefly changes to "Copied ✓" with a green highlight for 1.5 seconds after clicking.
- GUI: Resizable output panel — Output panel includes a drag handle for vertical resizing (100–600 px range). Panel height is persisted across sessions.
- GUI: Tooltips — Descriptive hover tooltips added to all interactive controls, with a global enable/disable toggle in Settings.
- Documentation: GUI usage guide — Added a dedicated desktop application guide (`docs/user-guide/gui.md`) covering launch, interface navigation, all operation types, output panel usage, keyboard shortcuts, session persistence, and troubleshooting. Includes Windows SmartScreen unblocking instructions. Placed prominently in the MkDocs navigation under User Guide.
- Spec SS3.1: Repository layout sync — Updated the top-level layout tree diagram and table to reflect the current repository state. Added entries for `.archive/`, `CHANGELOG.md`, PyInstaller spec files, generated spec renderings (`.html`/`.pdf`), and VS Code workspace file. Corrected `docs/` subdirectory names (`user/` → `user-guide/`, added `getting-started/`). Broadened `.github/` entry to cover `copilot-instructions.md`.
- Spec SS1.5: Archived implementation plan — Updated the reference documents table to point to the archived location (`.archive/shruggie-indexer-plan.md`) and note that all sprints are complete.
- Spec SS3.7: Documentation site nav — Updated the `nav` YAML example to match the current `mkdocs.yml` structure: added Getting Started section, Desktop Application under User Guide, corrected doc paths, and promoted Changelog to top-level nav item.
- Spec SS10: GUI Application — Comprehensive spec update to reflect the consolidated GUI architecture implemented in Sections 1–5. Rewrote SS10 introduction, SS10.1 session persistence (v2 format), SS10.2 window layout (3-tab sidebar, version label), SS10.3 target selection (consolidated operations page, dual browse, auto-suggest output, context-sensitive controls), SS10.4 configuration panel (removed embedded About, added Interface/tooltips section), SS10.5 action button and job exclusivity (single Operations page model), SS10.6 output display (Clear button, copy feedback, resizable panel, auto-clear, post-job display modes), SS10.7 keyboard shortcuts (Ctrl+1–3 for pages, Ctrl+Shift+C for copy). Added new SS10.8 supplemental components subsection covering destructive indicator, About tab, tooltips, labeled group frames, and debug logging. Updated to reflect rename as feature toggle (not standalone operation), always-visible controls with enable/disable paradigm, green START button, target/type validation, and SHA-512 settings sync.
- Spec SS10.9: GUI Design Standards — Added new subsection to the technical specification codifying GUI design governance. Adopts Jakob Nielsen's 10 Usability Heuristics by reference as the baseline evaluation framework. Defines seven project-specific UI standards: layout stability, state-driven control visibility, control interdependency transparency, output handling clarity, destructive operation safeguards, progress/feedback area allocation, and log/output panel behavior. Includes a directive requiring all GUI implementation work (including AI agent sessions) to comply with these standards.
- Deprecated `shruggie-indexer-plan.md` — Moved the completed implementation plan from the repository root to `.archive/`. Added `.archive/`, `shruggie-indexer-spec.html`, and `shruggie-indexer-spec.pdf` to `.gitignore`.
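
The target/type validation entry in the list above can be sketched roughly as follows. This is a minimal illustration and not the application's code: the function names `validate_target` and `recursive_allowed` and the type values `"file"` and `"directory"` are assumptions made here (only `"auto"` is named in this changelog).

```python
from pathlib import Path
from typing import Optional


def validate_target(target: Path, target_type: str) -> Optional[str]:
    """Return an inline error message on a file-vs-directory conflict, else None.

    target_type is assumed to be "auto", "file", or "directory" (hypothetical values).
    """
    if target_type == "file" and target.is_dir():
        return "Target is a directory, but the selected type is 'file'."
    if target_type == "directory" and target.is_file():
        return "Target is a file, but the selected type is 'directory'."
    # "auto" accepts either kind of target, so it never conflicts.
    return None


def recursive_allowed(target: Path) -> bool:
    """Recursion only makes sense when the target is not a single file."""
    return not target.is_file()
```

A caller would disable the START button while `validate_target()` returns a message and disable the Recursive checkbox while `recursive_allowed()` returns `False`.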
Fixed
- GUI/CLI: Root directory in-place sidecar duplicating aggregate output — `_write_inplace_tree()` wrote an in-place sidecar for the root target directory inside the target (`data/data_directorymeta2.json`), duplicating the aggregate output file written alongside the target (`data_directorymeta2.json`). Both contained the same full entry tree with the same filename, differing only in location. Fixed by skipping the root directory entry during in-place sidecar traversal. Child directory sidecars are unaffected.
- Core: In-place directory sidecar naming — `build_sidecar_path()` in `core/paths.py` constructed directory sidecar paths as `<dir>/_directorymeta2.json` (bare name with no identifying prefix). Every directory in the tree received an identically named file, making them indistinguishable in file managers and inconsistent with both the aggregate output naming (`{dirname}_directorymeta2.json`) and file sidecar naming (`{filename}_meta2.json`). Fixed to produce `<dir>/{dirname}_directorymeta2.json`. The `metadata_exclude_patterns` regex continues to match the corrected filenames because the pattern is end-anchored.
- Core: Sidecar files indexed as standalone items — `list_children()` in `traversal.py` was not applying `metadata_exclude_patterns` during item enumeration. Indexer output artifacts (`_meta.json`, `_meta2.json`, `_directorymeta2.json`) were being treated as regular files — fully indexed, hashed, EXIF-checked, renamed, and given their own sidecar output (creating absurd filenames like `file_meta.json_meta2.json`). Fixed by adding Layer 1 filtering: files matching `metadata_exclude_patterns` are now unconditionally excluded during traversal.
- Core: Sidecar companion files indexed as standalone items when MetaMerge active — When MetaMerge was enabled, files matching `metadata_identify` sidecar patterns (e.g., `.info.json`, `.description`, `_screen.jpg`, `.md5`, `.yaml`) appeared as both standalone index entries AND sidecar metadata sources. Fixed by adding Layer 2 filtering in `build_directory_entry()`: recognized sidecar files are excluded from the entry-building iteration while remaining in the full `siblings` list for sidecar discovery. This ensures sidecars are consumed exclusively through the merge system without breaking sidecar discovery's sibling enumeration.
- GUI/CLI: MetaMergeDelete log levels incorrect — `_drain_delete_queue()` in both GUI and CLI used `logger.debug` for successful deletions and `logger.warning` for failures. Per spec, successful deletions must be logged at `INFO` (`Sidecar deleted: {path}`) and failures at `ERROR` (`Sidecar delete FAILED: {path}: {exception}`). Fixed in both entry points.
- GUI/CLI: In-place sidecar files named using pre-rename filename — When both rename and in-place output were active, `_meta2.json` sidecar files were written using the original filename (e.g., `photo.jpg_meta2.json`) instead of the post-rename storage name (e.g., `yABC123.jpg_meta2.json`). This created orphaned sidecars with no on-disk association to the renamed content file. Fixed by wiring `rename_inplace_sidecar()` into the rename phase of both GUI and CLI, renaming the sidecar file alongside its content file.
- GUI/CLI: Pipeline ordering for rename + in-place output — In-place sidecar writes were happening independently of the rename phase. Swapped ordering so in-place writes occur before rename, and the rename phase handles both the content file and its sidecar atomically. This preserves partial-result survival for non-rename cases while ensuring correct sidecar naming when rename is active.
- Core: Rename collision logging — When multiple files share an identical content hash and the rename target already exists on disk, the collision was previously either raised as a `RenameError` (caught silently by callers) or skipped with no log output. `rename_item()` now logs a `WARNING`-level message (`Rename SKIPPED (collision): {original_name} → {storage_name} (target already exists)`) and returns the original path without raising. Callers in both GUI and CLI now check the return value to skip in-place sidecar rename for collision-skipped files, preserving the original filename as the sidecar base.
- GUI: Session debug logging referenced non-existent attribute — `_save_session()` and `_restore_session()` referenced `SessionManager._path` which does not exist after the namespace migration. Fixed to use `_write_path` and `_read_path` respectively.
- GUI: Rename operation silently failing — The GUI's `_background_job()` was missing the Stage 6 post-processing that the CLI performs: after indexing, each completed entry must be passed through `rename_item()` to execute the rename. The GUI skipped this step entirely, so rename operations appeared to succeed but never moved any files. Fixed by adding a rename loop matching the CLI's Stage 6 implementation.
- GUI: Meta Merge Delete not deleting sidecar files — Two bugs prevented sidecar deletion in the GUI: (1) no `delete_queue` list was created or passed to `index_path()`, so sidecar paths were never collected; (2) no drain call was made to `os.unlink()` the queued paths after indexing. Fixed by creating the queue, threading it through `index_path()`, and draining it at the end of the background job.
- GUI: In-place sidecar writes not firing — When Multi-file (in-place) output mode was selected, the background job never called `write_inplace()` to write per-item sidecar JSON files alongside originals. Fixed by adding `_write_inplace_tree()` to the post-processing sequence.
- ExifTool: WARNING regression for unknown file types — `_recover_metadata_from_error()` failed to recover valid metadata from `ExifToolExecuteError` because pyexiftool ≥ 0.5 provides `stdout` as `bytes`, not `str`. The `isinstance(stdout, str)` guard always failed, causing every non-zero exit to fall through to a WARNING log. Fixed by adding a `bytes` → `str` decode step before JSON parsing.
- GUI: Output panel forced tab switch — `OutputPanel.set_json()` unconditionally switched the active view to the Output tab on every operation completion, interrupting users reviewing the Log tab. Fixed: if the Output tab is already active, content is refreshed in place; if the Log tab is active, the Output button flashes green for 3 seconds to signal new content without switching.
- GUI: Progress bar layout stability — Replaced the swappable progress/output panel arrangement with a fixed-height progress region embedded directly within the Operations page. The region uses `pack_propagate(False)` to maintain a constant 120 px allocation, toggling between idle (START button) and running (progress bar + cancel) sub-frames without reflowing surrounding controls.
- ExifTool: Metadata recovery on non-zero exit — ExifTool invocations that exit with a non-zero status (e.g. unsupported file types producing partial JSON on stdout) now attempt to recover valid metadata from the output before falling back. Previously, any non-zero exit discarded all data. Added `_recover_metadata_from_error()` and `_log_exiftool_error_field()` helpers; "Unknown file type" warnings are now logged at INFO instead of WARNING. Added `"Error"` to `EXIFTOOL_EXCLUDED_KEYS`.
- GUI: Log capture pipeline — Core library log messages and progress event messages now both appear in the output panel's log view with timestamps. Previously, most diagnostic output was silently dropped or consumed by the progress panel without forwarding to the log stream.
- GUI: Log entry timestamps and formatting — Log entries in the GUI log panel now use the `HH:MM:SS LEVEL message` format with color coding: red for ERROR/CRITICAL, amber for WARNING, muted gray for DEBUG, and default text color for INFO.
- GUI: Log auto-scroll behavior — The log panel now auto-scrolls to the bottom when new content arrives, pauses auto-scroll when the user scrolls upward, and resumes when scrolled back to the bottom.
- GUI: Log panel Save and Copy buttons — Save and Copy buttons are now enabled whenever the active view (Output or Log) contains content. Previously they were permanently disabled in the log view. Save in log view opens a save-as dialog for `.log` files.
- ExifTool key filtering now correctly handles group-prefixed keys (e.g. `System:FileName`) by matching on the base key name after the last `:` separator. Previously, the `-G3:1` flag caused all keys to carry group prefixes, which bypassed the exact-match exclusion check and leaked sensitive filesystem details into output.
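
The base-key matching described in the last entry can be illustrated with a short sketch. This is an assumption-laden example, not the project's `_filter_keys()`: the function name `filter_keys`, the sample keys, and the sample exclusion set are hypothetical.

```python
from typing import Any, Dict, Iterable


def filter_keys(metadata: Dict[str, Any], exclude_keys: Iterable[str]) -> Dict[str, Any]:
    """Drop entries whose base key name (text after the last ':') is excluded."""
    excluded = frozenset(exclude_keys)
    return {
        key: value
        for key, value in metadata.items()
        # "System:FileName" -> "FileName"; an unprefixed key is its own base name.
        if key.rsplit(":", 1)[-1] not in excluded
    }


sample = {
    "System:FileName": "photo.jpg",
    "EXIF:Make": "Canon",
    "SourceFile": "/data/photo.jpg",
}
filtered = filter_keys(sample, {"FileName", "SourceFile"})
# filtered == {"EXIF:Make": "Canon"}
```

Splitting on the last separator rather than the first keeps the check correct when a key carries more than one group prefix.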
0.1.0 - 2026-02-20
Added
- Initial `shruggie-indexer` release with CLI, GUI, and Python API delivery surfaces.
- Deterministic v2 schema output with hash-based IDs, timestamp capture, and storage names.
- Filesystem traversal for files/directories with recursive mode and platform-aware behavior.
- Metadata features: sidecar discovery/parse/merge flows and optional ExifTool extraction.
- Output routing modes for stdout, combined outfile, and in-place sidecar JSON writes.
- Rename flow with dry-run support and configurable identity algorithm (`md5` or `sha256`).
- Config loading with layered defaults and TOML overrides.
- Cross-platform build-and-release CI that publishes standalone executables for Windows, Linux, and macOS.
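
As a rough illustration of a configurable identity algorithm (`md5` or `sha256`), the sketch below hashes a file's content with Python's standard `hashlib`. The function name `content_digest` and its parameters are hypothetical, and how the project turns a digest into an ID or storage name is not shown here.

```python
import hashlib
from pathlib import Path


def content_digest(path: Path, algorithm: str, chunk_size: int = 65536) -> str:
    """Return the hex digest of a file's content using the chosen algorithm."""
    if algorithm not in ("md5", "sha256"):
        raise ValueError(f"Unsupported identity algorithm: {algorithm}")
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Because the digest depends only on file content, two files with identical bytes yield the same value, which is the situation the rename collision entry under Fixed addresses.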