diff --git a/HIERARCHY_VISUALIZATION_PLAN.md b/HIERARCHY_VISUALIZATION_PLAN.md new file mode 100644 index 0000000..d9c5890 --- /dev/null +++ b/HIERARCHY_VISUALIZATION_PLAN.md @@ -0,0 +1,163 @@ +# Hierarchy Visualization Integration Plan + +## Current State: NKBIP-01 Tree Processor Complete ✅ + +We have successfully implemented a proper Asciidoctor tree processor extension that: +- Registers as a real Asciidoctor extension using `registry.treeProcessor()` +- Processes documents during AST parsing with full access to `doc.getSections()` +- Implements hierarchical NKBIP-01 structure with 30040/30041 events +- Supports parse levels 2-5 with different event granularities +- Passes comprehensive tests validating the hierarchical structure + +## Next Phase: ZettelEditor Integration + +### Overview +Integrate the new hierarchical parser into ZettelEditor with visual hierarchy hints that show users exactly which sections will become which types of events at different parse levels - like text editor indent guides but for Nostr event structure. 
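The section-to-event mapping described above can be sketched as a small decision function (a minimal illustration only; `eventKindForSection` and its parameters are hypothetical names, not part of the current codebase):

```typescript
type EventKind = 30040 | 30041;

// Hypothetical helper: decide which NKBIP-01 event kind a section maps to.
// A section only gets its own event when it sits between level 2 and the
// selected parse level; it becomes a 30040 index event when it keeps
// children at deeper retained levels, otherwise a 30041 content event.
function eventKindForSection(
  sectionLevel: number,
  parseLevel: number,
  hasRetainedChildren: boolean
): EventKind {
  if (sectionLevel < 2 || sectionLevel > parseLevel) {
    throw new Error("Section does not become its own event at this parse level");
  }
  return hasRetainedChildren && sectionLevel < parseLevel ? 30040 : 30041;
}
```

At parse level 2 every level 2 section is a leaf from the parser's point of view, so even a section with nested subsections yields a 30041 content event; raising the parse level is what promotes it to a 30040 index.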
+ +### Phase 1: Core Integration (Essential) + +#### 1.1 Update ZettelEditor Parser +- **Current**: Uses old `publication_tree_factory.ts` with flattened AST parsing +- **Target**: Switch to new `asciidoc_publication_parser.ts` with tree processor +- **Impact**: Enables real hierarchical 30040/30041 event structure + +```typescript +// Change from: +import { createPublicationTreeFromContent } from "$lib/utils/publication_tree_factory"; + +// To: +import { parseAsciiDocWithTree } from "$lib/utils/asciidoc_publication_parser"; +``` + +#### 1.2 Fix Parse Level Configuration +- Update `MAX_PARSE_LEVEL` from 6 to 5 in ZettelEditor.svelte:43 +- Update parse level options to reflect new hierarchical structure descriptions + +#### 1.3 Update Preview Panel +- Leverage `publicationResult.metadata.eventStructure` for accurate hierarchy display +- Show 30040 vs 30041 event types with different visual indicators +- Display parent-child relationships between index and content events + +### Phase 2: Visual Hierarchy Indicators (High Impact) + +#### 2.1 Editor Gutter Visualization +Add visual hints in the editor showing which sections will become events: + +**Event Type Indicators:** +- 🔵 **Blue circle**: Sections that become 30040 index events +- 🟢 **Green circle**: Sections that become 30041 content events +- 📝 **Text label**: "Index" or "Content" next to each section + +**Parse Level Boundaries:** +- **Colored left border**: Different colors for each hierarchy level +- **Indent guides**: Visual lines showing nested structure +- **Level badges**: Small "L2", "L3", etc. 
indicators + +#### 2.2 Real-time Parse Level Feedback +As user changes parse level dropdown: +- **Highlight changes**: Animate sections that change event type +- **Event count updates**: Show before/after event counts +- **Structure preview**: Mini-tree view showing resulting hierarchy + +#### 2.3 Interactive Section Mapping +- **Hover effects**: Hover over section → highlight corresponding event in preview +- **Click navigation**: Click section title → jump to event preview +- **Relationship lines**: Visual connections between 30040 and their 30041 children + +### Phase 3: Advanced Hierarchy Features (Polish) + +#### 3.1 Smart Parse Level Suggestions +- **Auto-detect optimal level**: Analyze document structure and suggest best parse level +- **Level comparison**: Side-by-side view of different parse levels +- **Performance hints**: Show trade-offs (fewer vs more events) + +#### 3.2 Enhanced Editor Features +- **Section folding**: Collapse/expand based on hierarchy +- **Quick level promotion**: Buttons to promote/demote section levels +- **Hierarchy outline**: Collapsible tree view in sidebar + +#### 3.3 Event Relationship Visualization +- **Tree diagram**: Visual representation of 30040 → 30041 relationships +- **Event flow**: Show how events will be published and linked +- **Validation**: Check for proper NKBIP-01 compliance + +### Phase 4: Advanced Interactions (Future) + +#### 4.1 Drag & Drop Hierarchy Editing +- Drag sections to change hierarchy +- Visual feedback for valid drop targets +- Auto-update AsciiDoc markup + +#### 4.2 Multi-level Preview +- Split preview showing multiple parse levels simultaneously +- Compare different parsing strategies +- Export options for different levels + +## Technical Implementation Notes + +### Key Data Structures +```typescript +// eventStructure provides complete hierarchy information +interface EventStructureNode { + title: string; + level: number; + eventType: "index" | "content"; + eventKind: 30040 | 30041; + dTag: 
string; + children: EventStructureNode[]; +} +``` + +### Integration Points +1. **Parser integration**: `parseAsciiDocWithTree()` in reactive effect +2. **Event structure**: Use `result.metadata.eventStructure` for visualization +3. **Real-time updates**: Svelte reactivity for immediate visual feedback +4. **Preview sync**: Coordinate editor and preview panel highlights + +### CSS Hierarchy Indicators +```css +.section-level-2 { border-left: 4px solid #3b82f6; } /* Blue */ +.section-level-3 { border-left: 4px solid #10b981; } /* Green */ +.section-level-4 { border-left: 4px solid #f59e0b; } /* Amber */ +.section-level-5 { border-left: 4px solid #8b5cf6; } /* Purple */ + +.event-type-index { background: rgba(59, 130, 246, 0.1); } /* Light blue */ +.event-type-content { background: rgba(16, 185, 129, 0.1); } /* Light green */ +``` + +## Success Metrics + +### Phase 1 (Essential) +- [ ] ZettelEditor uses new tree processor +- [ ] All existing functionality preserved +- [ ] Hierarchical events display correctly + +### Phase 2 (High Impact) +- [ ] Visual hierarchy indicators in editor +- [ ] Real-time parse level feedback +- [ ] Clear 30040 vs 30041 distinction + +### Phase 3 (Polish) +- [ ] Smart parse level suggestions +- [ ] Enhanced editor interactions +- [ ] Event relationship visualization + +## Migration Strategy + +1. **Gradual rollout**: Implement phases sequentially +2. **Fallback compatibility**: Keep old factory as backup during transition +3. **User testing**: Validate hierarchy visualization with real users +4. 
**Performance monitoring**: Ensure real-time updates remain smooth + +## Dependencies + +- ✅ **NKBIP-01 tree processor**: Complete and tested +- ✅ **Parse level validation**: Levels 2-5 supported +- ✅ **Event structure metadata**: Available in `eventStructure` field +- ⏳ **ZettelEditor integration**: Next phase +- ⏳ **Visual design system**: Colors, icons, animations + +--- + +**Ready to proceed with Phase 1: Core Integration** +The foundation is solid - we have a working tree processor extension that generates proper hierarchical NKBIP-01 events. Now we need to integrate it into the editor interface and add the visual hierarchy indicators that will make the event structure clear to users. \ No newline at end of file diff --git a/src/lib/utils/asciidoc_publication_parser.ts b/src/lib/utils/asciidoc_publication_parser.ts new file mode 100644 index 0000000..00586ab --- /dev/null +++ b/src/lib/utils/asciidoc_publication_parser.ts @@ -0,0 +1,144 @@ +/** + * Unified AsciiDoc Publication Parser + * + * Single entry point for parsing AsciiDoc content into NKBIP-01 compliant + * publication trees using proper Asciidoctor tree processor extensions. + * + * This implements Michael's vision of using PublicationTree as the primary + * data structure for organizing hierarchical Nostr events. 
+ */ + +import Asciidoctor from "asciidoctor"; +import { registerPublicationTreeProcessor, type ProcessorResult } from "./publication_tree_processor"; +import type NDK from "@nostr-dev-kit/ndk"; + +export type PublicationTreeResult = ProcessorResult; + +/** + * Parse AsciiDoc content into a PublicationTree using tree processor extension + * This is the main entry point for all parsing operations + */ +export async function parseAsciiDocWithTree( + content: string, + ndk: NDK, + parseLevel: number = 2 +): Promise { + console.log(`[Parser] Starting parse at level ${parseLevel}`); + + // Create fresh Asciidoctor instance + const asciidoctor = Asciidoctor(); + const registry = asciidoctor.Extensions.create(); + + // Register our tree processor extension + const processorAccessor = registerPublicationTreeProcessor( + registry, + ndk, + parseLevel, + content + ); + + try { + // Parse the document with our extension + const doc = asciidoctor.load(content, { + extension_registry: registry, + standalone: false, + attributes: { + sectids: false + } + }); + + console.log(`[Parser] Document converted successfully`); + + // Get the result from our processor + const result = processorAccessor.getResult(); + + if (!result) { + throw new Error("Tree processor failed to generate result"); + } + + // Build async relationships in the PublicationTree + await buildTreeRelationships(result); + + console.log(`[Parser] Tree relationships built successfully`); + + return result; + + } catch (error) { + console.error('[Parser] Error during parsing:', error); + throw new Error(`Failed to parse AsciiDoc content: ${error instanceof Error ? 
error.message : 'Unknown error'}`); + } +} + +/** + * Build async relationships in the PublicationTree + * This adds content events to the tree structure as Michael envisioned + */ +async function buildTreeRelationships(result: ProcessorResult): Promise<void> { + const { tree, indexEvent, contentEvents } = result; + + if (!tree) { + throw new Error("No tree available to build relationships"); + } + + try { + // Add content events to the tree + if (indexEvent && contentEvents.length > 0) { + // Article structure: add all content events to index + for (const contentEvent of contentEvents) { + await tree.addEvent(contentEvent, indexEvent); + } + } else if (contentEvents.length > 1) { + // Scattered notes: add remaining events to first event + const rootEvent = contentEvents[0]; + for (let i = 1; i < contentEvents.length; i++) { + await tree.addEvent(contentEvents[i], rootEvent); + } + } + + console.log(`[Parser] Added ${contentEvents.length} events to tree`); + + } catch (error) { + console.error('[Parser] Error building tree relationships:', error); + throw error; + } +} + +/** + * Export events from PublicationTree for publishing workflow compatibility + */ +export function exportEventsFromTree(result: PublicationTreeResult) { + return { + indexEvent: result.indexEvent ? 
eventToPublishableObject(result.indexEvent) : undefined, + contentEvents: result.contentEvents.map(eventToPublishableObject), + tree: result.tree + }; +} + +/** + * Convert NDKEvent to publishable object format + */ +function eventToPublishableObject(event: any) { + return { + kind: event.kind, + content: event.content, + tags: event.tags, + created_at: event.created_at, + pubkey: event.pubkey, + id: event.id, + title: event.tags.find((t: string[]) => t[0] === "title")?.[1] || "Untitled" + }; +} + +/** + * Validate parse level parameter + */ +export function validateParseLevel(level: number): boolean { + return Number.isInteger(level) && level >= 2 && level <= 5; +} + +/** + * Get supported parse levels + */ +export function getSupportedParseLevels(): number[] { + return [2, 3, 4, 5]; +} \ No newline at end of file diff --git a/src/lib/utils/publication_tree_processor.ts b/src/lib/utils/publication_tree_processor.ts new file mode 100644 index 0000000..51c4fc4 --- /dev/null +++ b/src/lib/utils/publication_tree_processor.ts @@ -0,0 +1,829 @@ +/** + * NKBIP-01 Compliant Publication Tree Processor + * + * Implements proper Asciidoctor tree processor extension pattern for building + * PublicationTree structures during document parsing. Supports iterative parsing + * at different hierarchy levels (2-5) as defined in NKBIP-01 specification. 
+ */ + +import type { Document, Registry } from "asciidoctor"; +import { PublicationTree } from "$lib/data_structures/publication_tree"; +import { NDKEvent } from "@nostr-dev-kit/ndk"; +import type NDK from "@nostr-dev-kit/ndk"; +import { getMimeTags } from "$lib/utils/mime"; + +export interface ProcessorResult { + tree: PublicationTree; + indexEvent: NDKEvent | null; + contentEvents: NDKEvent[]; + metadata: { + title: string; + totalSections: number; + contentType: "article" | "scattered-notes" | "none"; + attributes: Record; + parseLevel: number; + eventStructure: EventStructureNode[]; + }; +} + +export interface EventStructureNode { + title: string; + level: number; + eventType: "index" | "content"; + eventKind: 30040 | 30041; + dTag: string; + children: EventStructureNode[]; +} + +interface ContentSegment { + title: string; + content: string; + level: number; + attributes: Record; + startLine: number; + endLine: number; +} + +interface HierarchicalSegment extends ContentSegment { + hasChildren: boolean; + children: ContentSegment[]; +} + +/** + * Register the PublicationTree processor extension with Asciidoctor + * This follows the official extension pattern exactly as provided by the user + */ +export function registerPublicationTreeProcessor( + registry: Registry, + ndk: NDK, + parseLevel: number = 2, + originalContent: string +): { getResult: () => ProcessorResult | null } { + let processorResult: ProcessorResult | null = null; + + registry.treeProcessor(function() { + const self = this; + + self.process(function(doc: Document) { + try { + // Extract document metadata from AST + const title = doc.getTitle() || ''; + const attributes = doc.getAttributes(); + const sections = doc.getSections(); + + console.log(`[TreeProcessor] Document attributes:`, { + tags: attributes.tags, + author: attributes.author, + type: attributes.type + }); + + console.log(`[TreeProcessor] Processing document: "${title}" at parse level ${parseLevel}`); + console.log(`[TreeProcessor] 
Found ${sections.length} top-level sections`); + + // Extract content segments from original text based on parse level + const contentSegments = extractContentSegments(originalContent, sections, parseLevel); + console.log(`[TreeProcessor] Extracted ${contentSegments.length} content segments for level ${parseLevel}`); + + // Determine content type based on structure + const contentType = detectContentType(title, contentSegments); + console.log(`[TreeProcessor] Detected content type: ${contentType}`); + + // Build events and tree structure + const { tree, indexEvent, contentEvents, eventStructure } = buildEventsFromSegments( + contentSegments, + title, + attributes, + contentType, + parseLevel, + ndk + ); + + processorResult = { + tree, + indexEvent, + contentEvents, + metadata: { + title, + totalSections: contentSegments.length, + contentType, + attributes, + parseLevel, + eventStructure + } + }; + + console.log(`[TreeProcessor] Built tree with ${contentEvents.length} content events and ${indexEvent ? 
'1' : '0'} index events`); + + } catch (error) { + console.error('[TreeProcessor] Error processing document:', error); + processorResult = null; + } + + return doc; + }); + }); + + return { + getResult: () => processorResult + }; +} + +/** + * Extract content segments from original text based on parse level + * This is the core iterative function that handles different hierarchy depths + */ +function extractContentSegments( + originalContent: string, + sections: any[], + parseLevel: number +): ContentSegment[] { + const lines = originalContent.split('\n'); + + // Build hierarchy map from AST + const sectionHierarchy = buildSectionHierarchy(sections); + + // Debug: Show hierarchy depths + console.log(`[TreeProcessor] Section hierarchy depth analysis:`); + function showDepth(nodes: SectionNode[], depth = 0) { + for (const node of nodes) { + console.log(`${' '.repeat(depth)}Level ${node.level}: ${node.title}`); + if (node.children.length > 0) { + showDepth(node.children, depth + 1); + } + } + } + showDepth(sectionHierarchy); + + // Extract segments at the target parse level + return extractSegmentsAtLevel(lines, sectionHierarchy, parseLevel); +} + +/** + * Build hierarchical section structure from Asciidoctor AST + */ +function buildSectionHierarchy(sections: any[]): SectionNode[] { + function buildNode(section: any): SectionNode { + return { + title: section.getTitle(), + level: section.getLevel() + 1, // Convert to app level (Asciidoctor uses 0-based) + attributes: section.getAttributes() || {}, + children: (section.getSections() || []).map(buildNode) + }; + } + + return sections.map(buildNode); +} + +interface SectionNode { + title: string; + level: number; + attributes: Record<string, string>; + children: SectionNode[]; +} + +/** + * Extract content segments at the specified parse level + * This implements the iterative parsing logic for different levels + */ +function extractSegmentsAtLevel( + lines: string[], + hierarchy: SectionNode[], + parseLevel: number +): ContentSegment[] 
{ + const segments: ContentSegment[] = []; + + // Collect all sections at the target parse level + const targetSections = collectSectionsAtLevel(hierarchy, parseLevel); + + for (const section of targetSections) { + const segment = extractSegmentContent(lines, section, parseLevel); + if (segment) { + segments.push(segment); + } + } + + return segments; +} + +/** + * Recursively collect sections at or above the specified level + * NKBIP-01: Level N parsing includes sections from level 2 through level N + */ +function collectSectionsAtLevel(hierarchy: SectionNode[], targetLevel: number): SectionNode[] { + const collected: SectionNode[] = []; + + function traverse(nodes: SectionNode[]) { + for (const node of nodes) { + // Include sections from level 2 up to target level + if (node.level >= 2 && node.level <= targetLevel) { + collected.push(node); + } + + // Continue traversing children to find more sections + if (node.children.length > 0) { + traverse(node.children); + } + } + } + + traverse(hierarchy); + return collected; +} + +/** + * Extract content for a specific section from the original text + */ +function extractSegmentContent( + lines: string[], + section: SectionNode, + parseLevel: number +): ContentSegment | null { + // Find the section header in the original content + const sectionPattern = new RegExp(`^${'='.repeat(section.level)}\\s+${escapeRegex(section.title)}`); + let startIdx = -1; + + for (let i = 0; i < lines.length; i++) { + if (sectionPattern.test(lines[i])) { + startIdx = i; + break; + } + } + + if (startIdx === -1) { + console.warn(`[TreeProcessor] Could not find section "${section.title}" at level ${section.level}`); + return null; + } + + // Find the end of this section + let endIdx = lines.length; + for (let i = startIdx + 1; i < lines.length; i++) { + const levelMatch = lines[i].match(/^(=+)\s+/); + if (levelMatch && levelMatch[1].length <= section.level) { + endIdx = i; + break; + } + } + + // Extract section content + const sectionLines = 
lines.slice(startIdx, endIdx); + + // Parse attributes and content + const { attributes, content } = parseSegmentContent(sectionLines, parseLevel); + + return { + title: section.title, + content, + level: section.level, + attributes, + startLine: startIdx, + endLine: endIdx + }; +} + +/** + * Parse attributes and content from section lines + */ +function parseSegmentContent(sectionLines: string[], parseLevel: number): { + attributes: Record<string, string>; + content: string; +} { + const attributes: Record<string, string> = {}; + let contentStartIdx = 1; // Skip the title line + + // Look for attribute lines after the title + for (let i = 1; i < sectionLines.length; i++) { + const line = sectionLines[i].trim(); + if (line.startsWith(':') && line.includes(':')) { + const match = line.match(/^:([^:]+):\s*(.*)$/); + if (match) { + attributes[match[1]] = match[2]; + contentStartIdx = i + 1; + } + } else if (line !== '') { + // Non-empty, non-attribute line - content starts here + break; + } + } + + // Extract content (everything after attributes) + const content = sectionLines.slice(contentStartIdx).join('\n').trim(); + + return { attributes, content }; +} + +/** + * Detect content type based on document structure + */ +function detectContentType( + title: string, + segments: ContentSegment[] +): "article" | "scattered-notes" | "none" { + const hasDocTitle = !!title; + const hasSections = segments.length > 0; + + // Check if the title matches the first section title + const titleMatchesFirstSection = segments.length > 0 && title === segments[0].title; + + if (hasDocTitle && hasSections && !titleMatchesFirstSection) { + return "article"; + } else if (hasSections) { + return "scattered-notes"; + } + + return "none"; +} + +/** + * Build events and tree structure from content segments + * Implements NKBIP-01 hierarchical parsing: + * - Level 2: One 30041 event per level 2 section containing all nested content + * - Level 3+: Hierarchical 30040 events for intermediate sections + 30041 for content-only + 
*/ +function buildEventsFromSegments( + segments: ContentSegment[], + title: string, + attributes: Record<string, string>, + contentType: "article" | "scattered-notes" | "none", + parseLevel: number, + ndk: NDK +): { + tree: PublicationTree; + indexEvent: NDKEvent | null; + contentEvents: NDKEvent[]; + eventStructure: EventStructureNode[]; +} { + if (contentType === "scattered-notes" && segments.length > 0) { + return buildScatteredNotesStructure(segments, ndk); + } + + if (contentType === "article" && title) { + return buildArticleStructure(segments, title, attributes, parseLevel, ndk); + } + + throw new Error("No valid content found to create publication tree"); +} + +/** + * Build structure for scattered notes (flat 30041 events) + */ +function buildScatteredNotesStructure( + segments: ContentSegment[], + ndk: NDK +): { + tree: PublicationTree; + indexEvent: NDKEvent | null; + contentEvents: NDKEvent[]; + eventStructure: EventStructureNode[]; +} { + const contentEvents: NDKEvent[] = []; + const eventStructure: EventStructureNode[] = []; + + const firstSegment = segments[0]; + const rootEvent = createContentEvent(firstSegment, ndk); + const tree = new PublicationTree(rootEvent, ndk); + contentEvents.push(rootEvent); + + eventStructure.push({ + title: firstSegment.title, + level: firstSegment.level, + eventType: "content", + eventKind: 30041, + dTag: generateDTag(firstSegment.title), + children: [] + }); + + // Add remaining segments + for (let i = 1; i < segments.length; i++) { + const contentEvent = createContentEvent(segments[i], ndk); + contentEvents.push(contentEvent); + + eventStructure.push({ + title: segments[i].title, + level: segments[i].level, + eventType: "content", + eventKind: 30041, + dTag: generateDTag(segments[i].title), + children: [] + }); + } + + return { tree, indexEvent: null, contentEvents, eventStructure }; +} + +/** + * Build structure for articles based on parse level + */ +function buildArticleStructure( + segments: ContentSegment[], + title: string, + 
attributes: Record<string, string>, + parseLevel: number, + ndk: NDK +): { + tree: PublicationTree; + indexEvent: NDKEvent | null; + contentEvents: NDKEvent[]; + eventStructure: EventStructureNode[]; +} { + const indexEvent = createIndexEvent(title, attributes, segments, ndk); + const tree = new PublicationTree(indexEvent, ndk); + + if (parseLevel === 2) { + return buildLevel2Structure(segments, title, indexEvent, tree, ndk); + } else { + return buildHierarchicalStructure(segments, title, indexEvent, tree, parseLevel, ndk); + } +} + +/** + * Build Level 2 structure: One 30041 event per level 2 section with all nested content + */ +function buildLevel2Structure( + segments: ContentSegment[], + title: string, + indexEvent: NDKEvent, + tree: PublicationTree, + ndk: NDK +): { + tree: PublicationTree; + indexEvent: NDKEvent | null; + contentEvents: NDKEvent[]; + eventStructure: EventStructureNode[]; +} { + const contentEvents: NDKEvent[] = []; + const eventStructure: EventStructureNode[] = []; + + // Add index to structure + eventStructure.push({ + title, + level: 1, + eventType: "index", + eventKind: 30040, + dTag: generateDTag(title), + children: [] + }); + + // Group segments by level 2 sections + const level2Groups = groupSegmentsByLevel2(segments); + + for (const group of level2Groups) { + const contentEvent = createContentEvent(group, ndk); + contentEvents.push(contentEvent); + + eventStructure[0].children.push({ + title: group.title, + level: group.level, + eventType: "content", + eventKind: 30041, + dTag: generateDTag(group.title), + children: [] + }); + } + + return { tree, indexEvent, contentEvents, eventStructure }; +} + +/** + * Build hierarchical structure for Level 3+: Mix of 30040 and 30041 events + */ +function buildHierarchicalStructure( + segments: ContentSegment[], + title: string, + indexEvent: NDKEvent, + tree: PublicationTree, + parseLevel: number, + ndk: NDK +): { + tree: PublicationTree; + indexEvent: NDKEvent | null; + contentEvents: NDKEvent[]; + 
eventStructure: EventStructureNode[]; +} { + const contentEvents: NDKEvent[] = []; + const eventStructure: EventStructureNode[] = []; + + // Add root index to structure + eventStructure.push({ + title, + level: 1, + eventType: "index", + eventKind: 30040, + dTag: generateDTag(title), + children: [] + }); + + // Build hierarchical structure + const hierarchy = buildSegmentHierarchy(segments); + + for (const level2Section of hierarchy) { + if (level2Section.hasChildren) { + // Create 30040 for level 2 section with children + const level2Index = createIndexEventForSection(level2Section, ndk); + contentEvents.push(level2Index); + + const level2Node: EventStructureNode = { + title: level2Section.title, + level: level2Section.level, + eventType: "index", + eventKind: 30040, + dTag: generateDTag(level2Section.title), + children: [] + }; + + // Add children as 30041 content events + for (const child of level2Section.children) { + const childEvent = createContentEvent(child, ndk); + contentEvents.push(childEvent); + + level2Node.children.push({ + title: child.title, + level: child.level, + eventType: "content", + eventKind: 30041, + dTag: generateDTag(child.title), + children: [] + }); + } + + eventStructure[0].children.push(level2Node); + } else { + // Create 30041 for level 2 section without children + const contentEvent = createContentEvent(level2Section, ndk); + contentEvents.push(contentEvent); + + eventStructure[0].children.push({ + title: level2Section.title, + level: level2Section.level, + eventType: "content", + eventKind: 30041, + dTag: generateDTag(level2Section.title), + children: [] + }); + } + } + + return { tree, indexEvent, contentEvents, eventStructure }; +} + +/** + * Create a 30040 index event from document metadata + */ +function createIndexEvent( + title: string, + attributes: Record<string, string>, + segments: ContentSegment[], + ndk: NDK +): NDKEvent { + const event = new NDKEvent(ndk); + event.kind = 30040; + event.created_at = Math.floor(Date.now() / 1000); + 
event.pubkey = ndk.activeUser?.pubkey || "preview-placeholder-pubkey"; + + const dTag = generateDTag(title); + const [mTag, MTag] = getMimeTags(30040); + + const tags: string[][] = [ + ["d", dTag], + mTag, + MTag, + ["title", title] + ]; + + // Add document attributes as tags + addDocumentAttributesToTags(tags, attributes, event.pubkey); + + // Add a-tags for each content section + segments.forEach(segment => { + const sectionDTag = generateDTag(segment.title); + tags.push(["a", `30041:${event.pubkey}:${sectionDTag}`]); + }); + + event.tags = tags; + console.log(`[TreeProcessor] Index event tags:`, tags.slice(0, 10)); + event.content = generateIndexContent(title, segments); + + return event; +} + +/** + * Create a 30041 content event from segment + */ +function createContentEvent(segment: ContentSegment, ndk: NDK): NDKEvent { + const event = new NDKEvent(ndk); + event.kind = 30041; + event.created_at = Math.floor(Date.now() / 1000); + event.pubkey = ndk.activeUser?.pubkey || "preview-placeholder-pubkey"; + + const dTag = generateDTag(segment.title); + const [mTag, MTag] = getMimeTags(30041); + + const tags: string[][] = [ + ["d", dTag], + mTag, + MTag, + ["title", segment.title] + ]; + + // Add segment attributes as tags + addSectionAttributesToTags(tags, segment.attributes); + + event.tags = tags; + event.content = segment.content; + + return event; +} + +/** + * Generate default index content + */ +function generateIndexContent(title: string, segments: ContentSegment[]): string { + return `# ${title} + +${segments.length} sections available: + +${segments.map((segment, i) => `${i + 1}. 
${segment.title}`).join('\n')}`; +} + +/** + * Escape regex special characters + */ +function escapeRegex(str: string): string { + return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); +} + +/** + * Generate deterministic d-tag from title + */ +function generateDTag(title: string): string { + return title + .toLowerCase() + .replace(/[^\p{L}\p{N}]/gu, "-") + .replace(/-+/g, "-") + .replace(/^-|-$/g, "") || "untitled"; +} + +/** + * Add document attributes as Nostr tags + */ +function addDocumentAttributesToTags( + tags: string[][], + attributes: Record<string, string>, + pubkey: string +) { + // Standard metadata + if (attributes.author) tags.push(["author", attributes.author]); + if (attributes.version) tags.push(["version", attributes.version]); + if (attributes.published) tags.push(["published", attributes.published]); + if (attributes.language) tags.push(["language", attributes.language]); + if (attributes.image) tags.push(["image", attributes.image]); + if (attributes.description) tags.push(["summary", attributes.description]); + if (attributes.type) tags.push(["type", attributes.type]); + + // Tags + if (attributes.tags) { + attributes.tags.split(",").forEach(tag => tags.push(["t", tag.trim()])); + } + + // Add pubkey reference + tags.push(["p", pubkey]); + + // Custom attributes + addCustomAttributes(tags, attributes); +} + +/** + * Add section attributes as tags + */ +function addSectionAttributesToTags( + tags: string[][], + attributes: Record<string, string> +) { + // Section tags + if (attributes.tags) { + attributes.tags.split(",").forEach(tag => tags.push(["t", tag.trim()])); + } + + // Custom attributes + addCustomAttributes(tags, attributes); +} + +/** + * Add custom attributes, filtering out system ones + */ +function addCustomAttributes( + tags: string[][], + attributes: Record<string, string> +) { + const systemAttributes = [ + "attribute-undefined", "attribute-missing", "appendix-caption", + "appendix-refsig", "caution-caption", "chapter-refsig", "example-caption", + "figure-caption", 
"important-caption", "last-update-label", "manname-title", + "note-caption", "part-refsig", "preface-title", "section-refsig", + "table-caption", "tip-caption", "toc-title", "untitled-label", + "version-label", "warning-caption", "asciidoctor", "asciidoctor-version", + "safe-mode-name", "backend", "doctype", "basebackend", "filetype", + "outfilesuffix", "stylesdir", "iconsdir", "localdate", "localyear", + "localtime", "localdatetime", "docdate", "docyear", "doctime", + "docdatetime", "doctitle", "embedded", "notitle", + // Already handled above + "author", "version", "published", "language", "image", "description", + "tags", "title", "type" + ]; + + Object.entries(attributes).forEach(([key, value]) => { + if (!systemAttributes.includes(key) && value && typeof value === "string") { + tags.push([key, value]); + } + }); +} + +/** + * Group segments by level 2 sections for Level 2 parsing + * Combines all nested content into each level 2 section + */ +function groupSegmentsByLevel2(segments: ContentSegment[]): ContentSegment[] { + const level2Groups: ContentSegment[] = []; + + // Find all level 2 segments and include their nested content + for (const segment of segments) { + if (segment.level === 2) { + // Find all content that belongs to this level 2 section + const nestedSegments = segments.filter(s => + s.level > 2 && + s.startLine > segment.startLine && + (segments.find(next => next.level <= 2 && next.startLine > segment.startLine)?.startLine || Infinity) > s.startLine + ); + + // Combine the level 2 content with all nested content + let combinedContent = segment.content; + for (const nested of nestedSegments) { + combinedContent += `\n\n${'='.repeat(nested.level)} ${nested.title}\n${nested.content}`; + } + + level2Groups.push({ + ...segment, + content: combinedContent + }); + } + } + + return level2Groups; +} + +/** + * Build hierarchical segment structure for Level 3+ parsing + */ +function buildSegmentHierarchy(segments: ContentSegment[]): HierarchicalSegment[] 
{
+  const hierarchy: HierarchicalSegment[] = [];
+
+  // Process level 2 sections
+  for (const level2Segment of segments.filter(s => s.level === 2)) {
+    // This section ends where the next section of level <= 2 begins
+    const sectionEnd = segments.find(next => next.level <= 2 && next.startLine > level2Segment.startLine)?.startLine ?? Infinity;
+
+    const children = segments.filter(s =>
+      s.level > 2 &&
+      s.startLine > level2Segment.startLine &&
+      s.startLine < sectionEnd
+    );
+
+    hierarchy.push({
+      ...level2Segment,
+      hasChildren: children.length > 0,
+      children
+    });
+  }
+
+  return hierarchy;
+}
+
+/**
+ * Create a 30040 index event for a section with children
+ */
+function createIndexEventForSection(section: HierarchicalSegment, ndk: NDK): NDKEvent {
+  const event = new NDKEvent(ndk);
+  event.kind = 30040;
+  event.created_at = Math.floor(Date.now() / 1000);
+  event.pubkey = ndk.activeUser?.pubkey || "preview-placeholder-pubkey";
+
+  const dTag = generateDTag(section.title);
+  const [mTag, MTag] = getMimeTags(30040);
+
+  const tags: string[][] = [
+    ["d", dTag],
+    mTag,
+    MTag,
+    ["title", section.title]
+  ];
+
+  // Add section attributes as tags
+  addSectionAttributesToTags(tags, section.attributes);
+
+  // Add a-tags for each child content section
+  section.children.forEach(child => {
+    const childDTag = generateDTag(child.title);
+    tags.push(["a", `30041:${event.pubkey}:${childDTag}`]);
+  });
+
+  event.tags = tags;
+  event.content = `${section.content}\n\n${section.children.length} subsections available.`;
+
+  return event;
+}
\ No newline at end of file
diff --git a/tests/unit/publication_tree_processor.test.ts b/tests/unit/publication_tree_processor.test.ts
new file mode 100644
index 0000000..f319362
--- /dev/null
+++ b/tests/unit/publication_tree_processor.test.ts
@@ -0,0 +1,284 @@
+/**
+ * TDD Tests for NKBIP-01 Publication Tree Processor
+ *
+ * Tests the iterative parsing function at different hierarchy levels
+ * using deep_hierarchy_test.adoc to verify NKBIP-01 compliance.
+ */
+
+import { describe, it, expect } from 'vitest';
+import { readFileSync } from 'fs';
+import { parseAsciiDocWithTree, validateParseLevel, getSupportedParseLevels } from '../../src/lib/utils/asciidoc_publication_parser.js';
+
+// Mock NDK for testing
+const mockNDK = {
+  activeUser: {
+    pubkey: "test-pubkey-12345"
+  }
+} as any;
+
+// Read the test document
+const testDocumentPath = "./test_data/AsciidocFiles/deep_hierarchy_test.adoc";
+let testContent: string;
+
+try {
+  testContent = readFileSync(testDocumentPath, 'utf-8');
+} catch (error) {
+  console.error("Failed to read test document:", error);
+  testContent = `= Deep Hierarchical Document Test
+:tags: testing, hierarchy, structure
+:author: Test Author
+:type: technical
+
+This document tests all 6 levels of AsciiDoc hierarchy to validate our parse level system.
+
+== Level 2: Main Sections
+:tags: level2, main
+
+This is a level 2 section that should appear in all parse levels.
+
+=== Level 3: Subsections
+:tags: level3, subsection
+
+This is a level 3 section that should appear in parse levels 3-5.
+
+==== Level 4: Sub-subsections
+:tags: level4, detailed
+
+This is a level 4 section that should appear in parse levels 4-5.
+
+===== Level 5: Deep Subsections
+:tags: level5, deep
+
+This is a level 5 section that only becomes its own event at parse level 5.
+
+====== Level 6: Deepest Level
+:tags: level6, deepest
+
+This is a level 6 section; since parsing stops at level 5, it is always inlined into its parent section.
+
+Content at the deepest level of our hierarchy.
+ +== Level 2: Second Main Section +:tags: level2, main, second + +A second main section to ensure we have balanced content at the top level.`; +} + +describe("NKBIP-01 Publication Tree Processor", () => { + + it("should validate parse levels correctly", () => { + // Test valid parse levels + expect(validateParseLevel(2)).toBe(true); + expect(validateParseLevel(3)).toBe(true); + expect(validateParseLevel(5)).toBe(true); + + // Test invalid parse levels + expect(validateParseLevel(1)).toBe(false); + expect(validateParseLevel(6)).toBe(false); + expect(validateParseLevel(7)).toBe(false); + expect(validateParseLevel(2.5)).toBe(false); + expect(validateParseLevel(-1)).toBe(false); + + // Test supported levels array + const supportedLevels = getSupportedParseLevels(); + expect(supportedLevels).toEqual([2, 3, 4, 5]); + }); + + it("should parse Level 2 with NKBIP-01 minimal structure", async () => { + const result = await parseAsciiDocWithTree(testContent, mockNDK, 2); + + // Should be detected as article (has title and sections) + expect(result.metadata.contentType).toBe("article"); + expect(result.metadata.parseLevel).toBe(2); + expect(result.metadata.title).toBe("Deep Hierarchical Document Test"); + + // Should have 1 index event (30040) + 2 content events (30041) for level 2 sections + expect(result.indexEvent).toBeDefined(); + expect(result.indexEvent?.kind).toBe(30040); + expect(result.contentEvents.length).toBe(2); + + // All content events should be kind 30041 + result.contentEvents.forEach(event => { + expect(event.kind).toBe(30041); + }); + + // Check titles of level 2 sections + const contentTitles = result.contentEvents.map(e => + e.tags.find((t: string[]) => t[0] === "title")?.[1] + ); + expect(contentTitles).toContain("Level 2: Main Sections"); + expect(contentTitles).toContain("Level 2: Second Main Section"); + + // Content should include all nested subsections as AsciiDoc + const firstSectionContent = result.contentEvents[0].content; + 
expect(firstSectionContent).toBeDefined(); + // Should contain level 3, 4, 5 content as nested AsciiDoc markup + expect(firstSectionContent.includes("=== Level 3: Subsections")).toBe(true); + expect(firstSectionContent.includes("==== Level 4: Sub-subsections")).toBe(true); + expect(firstSectionContent.includes("===== Level 5: Deep Subsections")).toBe(true); + }); + + it("should parse Level 3 with NKBIP-01 intermediate structure", async () => { + const result = await parseAsciiDocWithTree(testContent, mockNDK, 3); + + expect(result.metadata.contentType).toBe("article"); + expect(result.metadata.parseLevel).toBe(3); + + // Should have hierarchical structure + expect(result.indexEvent).toBeDefined(); + expect(result.indexEvent?.kind).toBe(30040); + + // Should have mix of 30040 (for level 2 sections with children) and 30041 (for content) + const kinds = result.contentEvents.map(e => e.kind); + expect(kinds).toContain(30040); // Level 2 sections with children + expect(kinds).toContain(30041); // Level 3 content sections + + // Level 2 sections with children should be 30040 index events + const level2WithChildrenEvents = result.contentEvents.filter(e => + e.kind === 30040 && + e.tags.find((t: string[]) => t[0] === "title")?.[1]?.includes("Level 2:") + ); + expect(level2WithChildrenEvents.length).toBe(2); // Both level 2 sections have children + + // Should have 30041 events for level 3 content + const level3ContentEvents = result.contentEvents.filter(e => + e.kind === 30041 && + e.tags.find((t: string[]) => t[0] === "title")?.[1]?.includes("Level 3:") + ); + expect(level3ContentEvents.length).toBeGreaterThan(0); + }); + + it("should parse Level 4 with NKBIP-01 detailed structure", async () => { + const result = await parseAsciiDocWithTree(testContent, mockNDK, 4); + + expect(result.metadata.contentType).toBe("article"); + expect(result.metadata.parseLevel).toBe(4); + + // Should have hierarchical structure with mix of 30040 and 30041 events + 
expect(result.indexEvent).toBeDefined(); + expect(result.indexEvent?.kind).toBe(30040); + + const kinds = result.contentEvents.map(e => e.kind); + expect(kinds).toContain(30040); // Level 2 sections with children + expect(kinds).toContain(30041); // Content sections + + // Check that we have level 4 content sections + const contentTitles = result.contentEvents.map(e => + e.tags.find((t: string[]) => t[0] === "title")?.[1] + ); + expect(contentTitles).toContain("Level 4: Sub-subsections"); + }); + + it("should parse Level 5 with NKBIP-01 maximum depth", async () => { + const result = await parseAsciiDocWithTree(testContent, mockNDK, 5); + + expect(result.metadata.contentType).toBe("article"); + expect(result.metadata.parseLevel).toBe(5); + + // Should have hierarchical structure + expect(result.indexEvent).toBeDefined(); + expect(result.indexEvent?.kind).toBe(30040); + + // Should include level 5 sections as content events + const contentTitles = result.contentEvents.map(e => + e.tags.find((t: string[]) => t[0] === "title")?.[1] + ); + expect(contentTitles).toContain("Level 5: Deep Subsections"); + }); + + it("should validate event structure correctly", async () => { + const result = await parseAsciiDocWithTree(testContent, mockNDK, 3); + + // Test index event structure + expect(result.indexEvent).toBeDefined(); + expect(result.indexEvent?.kind).toBe(30040); + expect(result.indexEvent?.tags).toBeDefined(); + + // Check required tags + const indexTags = result.indexEvent!.tags; + const dTag = indexTags.find((t: string[]) => t[0] === "d"); + const titleTag = indexTags.find((t: string[]) => t[0] === "title"); + + expect(dTag).toBeDefined(); + expect(titleTag).toBeDefined(); + expect(titleTag![1]).toBe("Deep Hierarchical Document Test"); + + // Test content events structure - mix of 30040 and 30041 + result.contentEvents.forEach(event => { + expect([30040, 30041]).toContain(event.kind); + expect(event.tags).toBeDefined(); + expect(event.content).toBeDefined(); + + const 
eventTitleTag = event.tags.find((t: string[]) => t[0] === "title"); + expect(eventTitleTag).toBeDefined(); + }); + }); + + it("should preserve content as AsciiDoc", async () => { + const result = await parseAsciiDocWithTree(testContent, mockNDK, 2); + + // Content should be preserved as original AsciiDoc, not converted to HTML + const firstEvent = result.contentEvents[0]; + expect(firstEvent.content).toBeDefined(); + + // Should contain AsciiDoc markup, not HTML + expect(firstEvent.content.includes("<")).toBe(false); + expect(firstEvent.content.includes("===")).toBe(true); + }); + + it("should handle attributes correctly", async () => { + const result = await parseAsciiDocWithTree(testContent, mockNDK, 2); + + // Document-level attributes should be in index event + expect(result.indexEvent).toBeDefined(); + const indexTags = result.indexEvent!.tags; + + // Check for document attributes + const authorTag = indexTags.find((t: string[]) => t[0] === "author"); + const typeTag = indexTags.find((t: string[]) => t[0] === "type"); + const tagsTag = indexTags.find((t: string[]) => t[0] === "t"); + + expect(authorTag?.[1]).toBe("Test Author"); + expect(typeTag?.[1]).toBe("technical"); + expect(tagsTag).toBeDefined(); // Should have at least one t-tag + }); + + it("should handle scattered notes mode", async () => { + // Test with content that has no document title (scattered notes) + const scatteredContent = `== First Note +:tags: note1 + +Content of first note. 
+ +== Second Note +:tags: note2 + +Content of second note.`; + + const result = await parseAsciiDocWithTree(scatteredContent, mockNDK, 2); + + expect(result.metadata.contentType).toBe("scattered-notes"); + expect(result.indexEvent).toBeNull(); // No index event for scattered notes + expect(result.contentEvents.length).toBe(2); + + // All events should be 30041 content events + result.contentEvents.forEach(event => { + expect(event.kind).toBe(30041); + }); + }); + + it("should integrate with PublicationTree structure", async () => { + const result = await parseAsciiDocWithTree(testContent, mockNDK, 2); + + // Should have a PublicationTree instance + expect(result.tree).toBeDefined(); + + // Tree should have methods for event management + expect(typeof result.tree.addEvent).toBe("function"); + + // Event structure should be populated + expect(result.metadata.eventStructure).toBeDefined(); + expect(Array.isArray(result.metadata.eventStructure)).toBe(true); + }); + +}); \ No newline at end of file
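The grouping rule that both `groupSegmentsByLevel2` and `buildSegmentHierarchy` rely on — a nested segment belongs to the nearest preceding level 2 section, bounded by the next section of level <= 2 — can also be expressed as a single ordered pass. The `Seg` shape and `groupByLevel2` name below are an illustrative sketch, not part of the codebase:

```typescript
// Minimal sketch of the level-2 grouping rule, assuming segments carry
// only the fields the rule needs. All names here are hypothetical.
interface Seg {
  level: number;     // heading depth (2 = "==", 3 = "===", ...)
  startLine: number; // position in the source document
  title: string;
}

// One pass over segments in document order: any segment of level <= 2
// closes the open group; deeper segments attach to the open level 2 group.
function groupByLevel2(segments: Seg[]): Map<string, Seg[]> {
  const groups = new Map<string, Seg[]>();
  let current: Seg | null = null;
  for (const seg of [...segments].sort((a, b) => a.startLine - b.startLine)) {
    if (seg.level <= 2) {
      current = seg.level === 2 ? seg : null;
      if (current) groups.set(current.title, []);
    } else if (current) {
      groups.get(current.title)!.push(seg);
    }
  }
  return groups;
}
```

This yields the same grouping as the nested `filter`/`find` in the parser, but in a single pass rather than a quadratic scan; the defensive sort covers inputs that are not already in document order.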