Successfully implemented proper Asciidoctor tree processor extension for NKBIP-01 hierarchical parsing with comprehensive test coverage and future integration plan.

Features:
- Real Asciidoctor tree processor extension using registry.treeProcessor()
- NKBIP-01 compliant hierarchical structure (30040 index + 30041 content events)
- Parse levels 2-5 with different event granularities:
  * Level 2: One 30041 per level 2 section (contains all nested content)
  * Level 3+: Mix of 30040 (sections with children) + 30041 (content sections)
- Content preserved as original AsciiDoc markup
- Comprehensive test suite validating all parse levels and event structures

Implementation:
- src/lib/utils/publication_tree_processor.ts: Core tree processor extension
- src/lib/utils/asciidoc_publication_parser.ts: Unified parser interface
- tests/unit/publication_tree_processor.test.ts: Complete test coverage
- HIERARCHY_VISUALIZATION_PLAN.md: Next phase integration plan

Next: Integrate into ZettelEditor with visual hierarchy indicators

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
4 changed files with 1420 additions and 0 deletions
@@ -0,0 +1,163 @@
# Hierarchy Visualization Integration Plan

## Current State: NKBIP-01 Tree Processor Complete ✅

We have successfully implemented a proper Asciidoctor tree processor extension that:

- Registers as a real Asciidoctor extension using `registry.treeProcessor()`
- Processes documents during AST parsing with full access to `doc.getSections()`
- Implements hierarchical NKBIP-01 structure with 30040/30041 events
- Supports parse levels 2-5 with different event granularities
- Passes comprehensive tests validating the hierarchical structure

## Next Phase: ZettelEditor Integration

### Overview

Integrate the new hierarchical parser into ZettelEditor with visual hierarchy hints that show users exactly which sections will become which types of events at different parse levels, like text editor indent guides but for Nostr event structure.
### Phase 1: Core Integration (Essential)

#### 1.1 Update ZettelEditor Parser

- **Current**: Uses old `publication_tree_factory.ts` with flattened AST parsing
- **Target**: Switch to new `asciidoc_publication_parser.ts` with tree processor
- **Impact**: Enables real hierarchical 30040/30041 event structure
```typescript
// Change from:
import { createPublicationTreeFromContent } from "$lib/utils/publication_tree_factory";

// To:
import { parseAsciiDocWithTree } from "$lib/utils/asciidoc_publication_parser";
```
#### 1.2 Fix Parse Level Configuration

- Update `MAX_PARSE_LEVEL` from 6 to 5 in ZettelEditor.svelte:43
- Update parse level options to reflect new hierarchical structure descriptions
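The updated options could be generated from the supported range rather than hard-coded. A minimal sketch, assuming a `buildParseLevelOptions` helper and description wording that are illustrative, not the actual component code:

```typescript
// Hypothetical sketch of the parse level dropdown options after the fix.
// MAX_PARSE_LEVEL and the description strings are assumptions.
const MAX_PARSE_LEVEL = 5;

interface ParseLevelOption {
  level: number;
  description: string;
}

function buildParseLevelOptions(): ParseLevelOption[] {
  const options: ParseLevelOption[] = [];
  for (let level = 2; level <= MAX_PARSE_LEVEL; level++) {
    options.push({
      level,
      description:
        level === 2
          ? "One 30041 content event per level 2 section"
          : `30040 index events for sections with children down to level ${level}, 30041 for content sections`,
    });
  }
  return options;
}
```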
#### 1.3 Update Preview Panel

- Leverage `publicationResult.metadata.eventStructure` for accurate hierarchy display
- Show 30040 vs 30041 event types with different visual indicators
- Display parent-child relationships between index and content events
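One way the preview panel could walk `eventStructure` is a simple recursive render into indented lines. A sketch, assuming the `EventStructureNode` interface from `publication_tree_processor.ts`; `renderStructure` is an illustrative name, not an existing function:

```typescript
// Sketch: flatten metadata.eventStructure into indented preview lines.
// EventStructureNode mirrors the interface in publication_tree_processor.ts.
interface EventStructureNode {
  title: string;
  level: number;
  eventType: "index" | "content";
  eventKind: 30040 | 30041;
  dTag: string;
  children: EventStructureNode[];
}

function renderStructure(nodes: EventStructureNode[], depth = 0): string[] {
  return nodes.flatMap((n) => [
    // One line per event, indented by hierarchy depth
    `${"  ".repeat(depth)}${n.eventKind} ${n.eventType}: ${n.title}`,
    ...renderStructure(n.children, depth + 1),
  ]);
}
```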
### Phase 2: Visual Hierarchy Indicators (High Impact)

#### 2.1 Editor Gutter Visualization

Add visual hints in the editor showing which sections will become events:

**Event Type Indicators:**

- 🔵 **Blue circle**: Sections that become 30040 index events
- 🟢 **Green circle**: Sections that become 30041 content events
- 📝 **Text label**: "Index" or "Content" next to each section

**Parse Level Boundaries:**

- **Colored left border**: Different colors for each hierarchy level
- **Indent guides**: Visual lines showing nested structure
- **Level badges**: Small "L2", "L3", etc. indicators
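The indicator mapping above reduces to a small pure function the gutter renderer can call. A hedged sketch; the function names and return shape are illustrative, not the actual ZettelEditor API:

```typescript
// Map an event kind to its gutter indicator, per the legend above.
type EventKind = 30040 | 30041;

function gutterIndicator(kind: EventKind): { icon: string; label: string } {
  // 30040 index events get the blue circle, 30041 content events the green one
  return kind === 30040
    ? { icon: "🔵", label: "Index" }
    : { icon: "🟢", label: "Content" };
}

function levelBadge(level: number): string {
  return `L${level}`; // small "L2", "L3", … badge text
}
```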
#### 2.2 Real-time Parse Level Feedback

As the user changes the parse level dropdown:

- **Highlight changes**: Animate sections that change event type
- **Event count updates**: Show before/after event counts
- **Structure preview**: Mini-tree view showing the resulting hierarchy

#### 2.3 Interactive Section Mapping

- **Hover effects**: Hover over a section → highlight the corresponding event in the preview
- **Click navigation**: Click a section title → jump to the event preview
- **Relationship lines**: Visual connections between 30040 events and their 30041 children

### Phase 3: Advanced Hierarchy Features (Polish)

#### 3.1 Smart Parse Level Suggestions

- **Auto-detect optimal level**: Analyze document structure and suggest the best parse level
- **Level comparison**: Side-by-side view of different parse levels
- **Performance hints**: Show trade-offs (fewer vs. more events)
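For the auto-detect idea, one plausible heuristic is to suggest the deepest heading level the document actually uses, clamped to the supported 2-5 range. The shipped heuristic is an open design question; this is only a sketch with an assumed `suggestParseLevel` name:

```typescript
// Illustrative auto-detect heuristic: deepest heading level, clamped to 2-5.
function suggestParseLevel(headingLevels: number[]): number {
  const relevant = headingLevels.filter((l) => l >= 2);
  const maxDepth = relevant.length > 0 ? Math.max(...relevant) : 2;
  return Math.min(Math.max(maxDepth, 2), 5); // clamp to supported levels 2-5
}
```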
#### 3.2 Enhanced Editor Features

- **Section folding**: Collapse/expand based on hierarchy
- **Quick level promotion**: Buttons to promote/demote section levels
- **Hierarchy outline**: Collapsible tree view in the sidebar

#### 3.3 Event Relationship Visualization

- **Tree diagram**: Visual representation of 30040 → 30041 relationships
- **Event flow**: Show how events will be published and linked
- **Validation**: Check for proper NKBIP-01 compliance

### Phase 4: Advanced Interactions (Future)

#### 4.1 Drag & Drop Hierarchy Editing

- Drag sections to change hierarchy
- Visual feedback for valid drop targets
- Auto-update AsciiDoc markup

#### 4.2 Multi-level Preview

- Split preview showing multiple parse levels simultaneously
- Compare different parsing strategies
- Export options for different levels

## Technical Implementation Notes

### Key Data Structures
```typescript
// eventStructure provides complete hierarchy information
interface EventStructureNode {
  title: string;
  level: number;
  eventType: "index" | "content";
  eventKind: 30040 | 30041;
  dTag: string;
  children: EventStructureNode[];
}
```
### Integration Points

1. **Parser integration**: `parseAsciiDocWithTree()` in a reactive effect
2. **Event structure**: Use `result.metadata.eventStructure` for visualization
3. **Real-time updates**: Svelte reactivity for immediate visual feedback
4. **Preview sync**: Coordinate editor and preview panel highlights
### CSS Hierarchy Indicators

```css
.section-level-2 { border-left: 4px solid #3b82f6; } /* Blue */
.section-level-3 { border-left: 4px solid #10b981; } /* Green */
.section-level-4 { border-left: 4px solid #f59e0b; } /* Amber */
.section-level-5 { border-left: 4px solid #8b5cf6; } /* Purple */

.event-type-index { background: rgba(59, 130, 246, 0.1); } /* Light blue */
.event-type-content { background: rgba(16, 185, 129, 0.1); } /* Light green */
```
## Success Metrics

### Phase 1 (Essential)

- [ ] ZettelEditor uses new tree processor
- [ ] All existing functionality preserved
- [ ] Hierarchical events display correctly

### Phase 2 (High Impact)

- [ ] Visual hierarchy indicators in editor
- [ ] Real-time parse level feedback
- [ ] Clear 30040 vs 30041 distinction

### Phase 3 (Polish)

- [ ] Smart parse level suggestions
- [ ] Enhanced editor interactions
- [ ] Event relationship visualization
## Migration Strategy

1. **Gradual rollout**: Implement phases sequentially
2. **Fallback compatibility**: Keep the old factory as a backup during the transition
3. **User testing**: Validate hierarchy visualization with real users
4. **Performance monitoring**: Ensure real-time updates remain smooth
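The fallback-compatibility idea in step 2 can be sketched as a try/catch wrapper: use the new tree processor first, and fall back to the old factory if it throws. The function names are placeholders for the real parser entry points:

```typescript
// Sketch: prefer the new parser, fall back to the legacy factory on failure.
function parseWithFallback<T>(
  parseNew: () => T,
  parseOld: () => T
): { result: T; usedFallback: boolean } {
  try {
    return { result: parseNew(), usedFallback: false };
  } catch {
    // New parser failed: keep the editor working via the legacy path
    return { result: parseOld(), usedFallback: true };
  }
}
```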
## Dependencies

- ✅ **NKBIP-01 tree processor**: Complete and tested
- ✅ **Parse level validation**: Levels 2-5 supported
- ✅ **Event structure metadata**: Available in `eventStructure` field
- ⏳ **ZettelEditor integration**: Next phase
- ⏳ **Visual design system**: Colors, icons, animations

---

**Ready to proceed with Phase 1: Core Integration**

The foundation is solid: we have a working tree processor extension that generates proper hierarchical NKBIP-01 events. Now we need to integrate it into the editor interface and add the visual hierarchy indicators that will make the event structure clear to users.
@@ -0,0 +1,144 @@
/**
 * Unified AsciiDoc Publication Parser
 *
 * Single entry point for parsing AsciiDoc content into NKBIP-01 compliant
 * publication trees using proper Asciidoctor tree processor extensions.
 *
 * This implements Michael's vision of using PublicationTree as the primary
 * data structure for organizing hierarchical Nostr events.
 */

import Asciidoctor from "asciidoctor";
import { registerPublicationTreeProcessor, type ProcessorResult } from "./publication_tree_processor";
import type NDK from "@nostr-dev-kit/ndk";

export type PublicationTreeResult = ProcessorResult;

/**
 * Parse AsciiDoc content into a PublicationTree using a tree processor extension.
 * This is the main entry point for all parsing operations.
 */
export async function parseAsciiDocWithTree(
  content: string,
  ndk: NDK,
  parseLevel: number = 2
): Promise<PublicationTreeResult> {
  console.log(`[Parser] Starting parse at level ${parseLevel}`);

  // Create a fresh Asciidoctor instance
  const asciidoctor = Asciidoctor();
  const registry = asciidoctor.Extensions.create();

  // Register our tree processor extension
  const processorAccessor = registerPublicationTreeProcessor(
    registry,
    ndk,
    parseLevel,
    content
  );

  try {
    // Parse the document with our extension
    const doc = asciidoctor.load(content, {
      extension_registry: registry,
      standalone: false,
      attributes: {
        sectids: false
      }
    });

    console.log(`[Parser] Document converted successfully`);

    // Get the result from our processor
    const result = processorAccessor.getResult();

    if (!result) {
      throw new Error("Tree processor failed to generate result");
    }

    // Build async relationships in the PublicationTree
    await buildTreeRelationships(result);

    console.log(`[Parser] Tree relationships built successfully`);

    return result;

  } catch (error) {
    console.error('[Parser] Error during parsing:', error);
    throw new Error(`Failed to parse AsciiDoc content: ${error instanceof Error ? error.message : 'Unknown error'}`);
  }
}
/**
 * Build async relationships in the PublicationTree.
 * This adds content events to the tree structure as Michael envisioned.
 */
async function buildTreeRelationships(result: ProcessorResult): Promise<void> {
  const { tree, indexEvent, contentEvents } = result;

  if (!tree) {
    throw new Error("No tree available to build relationships");
  }

  try {
    // Add content events to the tree
    if (indexEvent && contentEvents.length > 0) {
      // Article structure: add all content events to the index
      for (const contentEvent of contentEvents) {
        await tree.addEvent(contentEvent, indexEvent);
      }
    } else if (contentEvents.length > 1) {
      // Scattered notes: add the remaining events under the first event
      const rootEvent = contentEvents[0];
      for (let i = 1; i < contentEvents.length; i++) {
        await tree.addEvent(contentEvents[i], rootEvent);
      }
    }

    console.log(`[Parser] Added ${contentEvents.length} events to tree`);

  } catch (error) {
    console.error('[Parser] Error building tree relationships:', error);
    throw error;
  }
}
/**
 * Export events from the PublicationTree for publishing workflow compatibility
 */
export function exportEventsFromTree(result: PublicationTreeResult) {
  return {
    indexEvent: result.indexEvent ? eventToPublishableObject(result.indexEvent) : undefined,
    contentEvents: result.contentEvents.map(eventToPublishableObject),
    tree: result.tree
  };
}

/**
 * Convert an NDKEvent to a publishable object format
 */
function eventToPublishableObject(event: any) {
  return {
    kind: event.kind,
    content: event.content,
    tags: event.tags,
    created_at: event.created_at,
    pubkey: event.pubkey,
    id: event.id,
    title: event.tags.find((t: string[]) => t[0] === "title")?.[1] || "Untitled"
  };
}

/**
 * Validate the parse level parameter
 */
export function validateParseLevel(level: number): boolean {
  return Number.isInteger(level) && level >= 2 && level <= 5;
}

/**
 * Get the supported parse levels
 */
export function getSupportedParseLevels(): number[] {
  return [2, 3, 4, 5];
}
@@ -0,0 +1,829 @@
/**
 * NKBIP-01 Compliant Publication Tree Processor
 *
 * Implements the proper Asciidoctor tree processor extension pattern for building
 * PublicationTree structures during document parsing. Supports iterative parsing
 * at different hierarchy levels (2-5) as defined in the NKBIP-01 specification.
 */

import type { Document, Registry } from "asciidoctor";
import { PublicationTree } from "$lib/data_structures/publication_tree";
import { NDKEvent } from "@nostr-dev-kit/ndk";
import type NDK from "@nostr-dev-kit/ndk";
import { getMimeTags } from "$lib/utils/mime";

export interface ProcessorResult {
  tree: PublicationTree;
  indexEvent: NDKEvent | null;
  contentEvents: NDKEvent[];
  metadata: {
    title: string;
    totalSections: number;
    contentType: "article" | "scattered-notes" | "none";
    attributes: Record<string, string>;
    parseLevel: number;
    eventStructure: EventStructureNode[];
  };
}

export interface EventStructureNode {
  title: string;
  level: number;
  eventType: "index" | "content";
  eventKind: 30040 | 30041;
  dTag: string;
  children: EventStructureNode[];
}

interface ContentSegment {
  title: string;
  content: string;
  level: number;
  attributes: Record<string, string>;
  startLine: number;
  endLine: number;
}

interface HierarchicalSegment extends ContentSegment {
  hasChildren: boolean;
  children: ContentSegment[];
}
/**
 * Register the PublicationTree processor extension with Asciidoctor.
 * Follows the official Asciidoctor extension registration pattern.
 */
export function registerPublicationTreeProcessor(
  registry: Registry,
  ndk: NDK,
  parseLevel: number = 2,
  originalContent: string
): { getResult: () => ProcessorResult | null } {
  let processorResult: ProcessorResult | null = null;

  registry.treeProcessor(function() {
    const self = this;

    self.process(function(doc: Document) {
      try {
        // Extract document metadata from the AST
        const title = doc.getTitle() || '';
        const attributes = doc.getAttributes();
        const sections = doc.getSections();

        console.log(`[TreeProcessor] Document attributes:`, {
          tags: attributes.tags,
          author: attributes.author,
          type: attributes.type
        });

        console.log(`[TreeProcessor] Processing document: "${title}" at parse level ${parseLevel}`);
        console.log(`[TreeProcessor] Found ${sections.length} top-level sections`);

        // Extract content segments from the original text based on parse level
        const contentSegments = extractContentSegments(originalContent, sections, parseLevel);
        console.log(`[TreeProcessor] Extracted ${contentSegments.length} content segments for level ${parseLevel}`);

        // Determine the content type based on structure
        const contentType = detectContentType(title, contentSegments);
        console.log(`[TreeProcessor] Detected content type: ${contentType}`);

        // Build events and tree structure
        const { tree, indexEvent, contentEvents, eventStructure } = buildEventsFromSegments(
          contentSegments,
          title,
          attributes,
          contentType,
          parseLevel,
          ndk
        );

        processorResult = {
          tree,
          indexEvent,
          contentEvents,
          metadata: {
            title,
            totalSections: contentSegments.length,
            contentType,
            attributes,
            parseLevel,
            eventStructure
          }
        };

        console.log(`[TreeProcessor] Built tree with ${contentEvents.length} content events and ${indexEvent ? '1' : '0'} index events`);

      } catch (error) {
        console.error('[TreeProcessor] Error processing document:', error);
        processorResult = null;
      }

      return doc;
    });
  });

  return {
    getResult: () => processorResult
  };
}
/**
 * Extract content segments from the original text based on parse level.
 * This is the core iterative function that handles different hierarchy depths.
 */
function extractContentSegments(
  originalContent: string,
  sections: any[],
  parseLevel: number
): ContentSegment[] {
  const lines = originalContent.split('\n');

  // Build a hierarchy map from the AST
  const sectionHierarchy = buildSectionHierarchy(sections);

  // Debug: show hierarchy depths
  console.log(`[TreeProcessor] Section hierarchy depth analysis:`);
  function showDepth(nodes: SectionNode[], depth = 0) {
    for (const node of nodes) {
      console.log(`${' '.repeat(depth)}Level ${node.level}: ${node.title}`);
      if (node.children.length > 0) {
        showDepth(node.children, depth + 1);
      }
    }
  }
  showDepth(sectionHierarchy);

  // Extract segments at the target parse level
  return extractSegmentsAtLevel(lines, sectionHierarchy, parseLevel);
}
/**
 * Build a hierarchical section structure from the Asciidoctor AST
 */
function buildSectionHierarchy(sections: any[]): SectionNode[] {
  function buildNode(section: any): SectionNode {
    return {
      title: section.getTitle(),
      level: section.getLevel() + 1, // Convert to app level (Asciidoctor uses 0-based)
      attributes: section.getAttributes() || {},
      children: (section.getSections() || []).map(buildNode)
    };
  }

  return sections.map(buildNode);
}

interface SectionNode {
  title: string;
  level: number;
  attributes: Record<string, string>;
  children: SectionNode[];
}
/**
 * Extract content segments at the specified parse level.
 * This implements the iterative parsing logic for different levels.
 */
function extractSegmentsAtLevel(
  lines: string[],
  hierarchy: SectionNode[],
  parseLevel: number
): ContentSegment[] {
  const segments: ContentSegment[] = [];

  // Collect all sections at the target parse level
  const targetSections = collectSectionsAtLevel(hierarchy, parseLevel);

  for (const section of targetSections) {
    const segment = extractSegmentContent(lines, section, parseLevel);
    if (segment) {
      segments.push(segment);
    }
  }

  return segments;
}
/**
 * Recursively collect sections at or above the specified level.
 * NKBIP-01: level N parsing includes sections from level 2 through level N.
 */
function collectSectionsAtLevel(hierarchy: SectionNode[], targetLevel: number): SectionNode[] {
  const collected: SectionNode[] = [];

  function traverse(nodes: SectionNode[]) {
    for (const node of nodes) {
      // Include sections from level 2 up to the target level
      if (node.level >= 2 && node.level <= targetLevel) {
        collected.push(node);
      }

      // Continue traversing children to find more sections
      if (node.children.length > 0) {
        traverse(node.children);
      }
    }
  }

  traverse(hierarchy);
  return collected;
}
/**
 * Extract content for a specific section from the original text
 */
function extractSegmentContent(
  lines: string[],
  section: SectionNode,
  parseLevel: number
): ContentSegment | null {
  // Find the section header in the original content
  const sectionPattern = new RegExp(`^${'='.repeat(section.level)}\\s+${escapeRegex(section.title)}`);
  let startIdx = -1;

  for (let i = 0; i < lines.length; i++) {
    if (sectionPattern.test(lines[i])) {
      startIdx = i;
      break;
    }
  }

  if (startIdx === -1) {
    console.warn(`[TreeProcessor] Could not find section "${section.title}" at level ${section.level}`);
    return null;
  }

  // Find the end of this section
  let endIdx = lines.length;
  for (let i = startIdx + 1; i < lines.length; i++) {
    const levelMatch = lines[i].match(/^(=+)\s+/);
    if (levelMatch && levelMatch[1].length <= section.level) {
      endIdx = i;
      break;
    }
  }

  // Extract section content
  const sectionLines = lines.slice(startIdx, endIdx);

  // Parse attributes and content
  const { attributes, content } = parseSegmentContent(sectionLines, parseLevel);

  return {
    title: section.title,
    content,
    level: section.level,
    attributes,
    startLine: startIdx,
    endLine: endIdx
  };
}
/**
 * Parse attributes and content from section lines
 */
function parseSegmentContent(sectionLines: string[], parseLevel: number): {
  attributes: Record<string, string>;
  content: string;
} {
  const attributes: Record<string, string> = {};
  let contentStartIdx = 1; // Skip the title line

  // Look for attribute lines after the title
  for (let i = 1; i < sectionLines.length; i++) {
    const line = sectionLines[i].trim();
    if (line.startsWith(':') && line.includes(':')) {
      const match = line.match(/^:([^:]+):\s*(.*)$/);
      if (match) {
        attributes[match[1]] = match[2];
        contentStartIdx = i + 1;
      }
    } else if (line !== '') {
      // Non-empty, non-attribute line: content starts here
      break;
    }
  }

  // Extract content (everything after the attributes)
  const content = sectionLines.slice(contentStartIdx).join('\n').trim();

  return { attributes, content };
}
/**
 * Detect the content type based on document structure
 */
function detectContentType(
  title: string,
  segments: ContentSegment[]
): "article" | "scattered-notes" | "none" {
  const hasDocTitle = !!title;
  const hasSections = segments.length > 0;

  // Check if the title matches the first section title
  const titleMatchesFirstSection = segments.length > 0 && title === segments[0].title;

  if (hasDocTitle && hasSections && !titleMatchesFirstSection) {
    return "article";
  } else if (hasSections) {
    return "scattered-notes";
  }

  return "none";
}
/**
 * Build events and tree structure from content segments.
 * Implements NKBIP-01 hierarchical parsing:
 * - Level 2: one 30041 event per level 2 section containing all nested content
 * - Level 3+: hierarchical 30040 events for intermediate sections + 30041 for content-only sections
 */
function buildEventsFromSegments(
  segments: ContentSegment[],
  title: string,
  attributes: Record<string, string>,
  contentType: "article" | "scattered-notes" | "none",
  parseLevel: number,
  ndk: NDK
): {
  tree: PublicationTree;
  indexEvent: NDKEvent | null;
  contentEvents: NDKEvent[];
  eventStructure: EventStructureNode[];
} {
  if (contentType === "scattered-notes" && segments.length > 0) {
    return buildScatteredNotesStructure(segments, ndk);
  }

  if (contentType === "article" && title) {
    return buildArticleStructure(segments, title, attributes, parseLevel, ndk);
  }

  throw new Error("No valid content found to create publication tree");
}
/**
 * Build the structure for scattered notes (flat 30041 events)
 */
function buildScatteredNotesStructure(
  segments: ContentSegment[],
  ndk: NDK
): {
  tree: PublicationTree;
  indexEvent: NDKEvent | null;
  contentEvents: NDKEvent[];
  eventStructure: EventStructureNode[];
} {
  const contentEvents: NDKEvent[] = [];
  const eventStructure: EventStructureNode[] = [];

  const firstSegment = segments[0];
  const rootEvent = createContentEvent(firstSegment, ndk);
  const tree = new PublicationTree(rootEvent, ndk);
  contentEvents.push(rootEvent);

  eventStructure.push({
    title: firstSegment.title,
    level: firstSegment.level,
    eventType: "content",
    eventKind: 30041,
    dTag: generateDTag(firstSegment.title),
    children: []
  });

  // Add the remaining segments
  for (let i = 1; i < segments.length; i++) {
    const contentEvent = createContentEvent(segments[i], ndk);
    contentEvents.push(contentEvent);

    eventStructure.push({
      title: segments[i].title,
      level: segments[i].level,
      eventType: "content",
      eventKind: 30041,
      dTag: generateDTag(segments[i].title),
      children: []
    });
  }

  return { tree, indexEvent: null, contentEvents, eventStructure };
}
/**
 * Build the structure for articles based on parse level
 */
function buildArticleStructure(
  segments: ContentSegment[],
  title: string,
  attributes: Record<string, string>,
  parseLevel: number,
  ndk: NDK
): {
  tree: PublicationTree;
  indexEvent: NDKEvent | null;
  contentEvents: NDKEvent[];
  eventStructure: EventStructureNode[];
} {
  const indexEvent = createIndexEvent(title, attributes, segments, ndk);
  const tree = new PublicationTree(indexEvent, ndk);

  if (parseLevel === 2) {
    return buildLevel2Structure(segments, title, indexEvent, tree, ndk);
  } else {
    return buildHierarchicalStructure(segments, title, indexEvent, tree, parseLevel, ndk);
  }
}
/**
 * Build the Level 2 structure: one 30041 event per level 2 section with all nested content
 */
function buildLevel2Structure(
  segments: ContentSegment[],
  title: string,
  indexEvent: NDKEvent,
  tree: PublicationTree,
  ndk: NDK
): {
  tree: PublicationTree;
  indexEvent: NDKEvent | null;
  contentEvents: NDKEvent[];
  eventStructure: EventStructureNode[];
} {
  const contentEvents: NDKEvent[] = [];
  const eventStructure: EventStructureNode[] = [];

  // Add the index to the structure
  eventStructure.push({
    title,
    level: 1,
    eventType: "index",
    eventKind: 30040,
    dTag: generateDTag(title),
    children: []
  });

  // Group segments by level 2 sections
  const level2Groups = groupSegmentsByLevel2(segments);

  for (const group of level2Groups) {
    const contentEvent = createContentEvent(group, ndk);
    contentEvents.push(contentEvent);

    eventStructure[0].children.push({
      title: group.title,
      level: group.level,
      eventType: "content",
      eventKind: 30041,
      dTag: generateDTag(group.title),
      children: []
    });
  }

  return { tree, indexEvent, contentEvents, eventStructure };
}
/**
 * Build hierarchical structure for Level 3+: Mix of 30040 and 30041 events
 */
function buildHierarchicalStructure(
  segments: ContentSegment[],
  title: string,
  indexEvent: NDKEvent,
  tree: PublicationTree,
  parseLevel: number,
  ndk: NDK
): {
  tree: PublicationTree;
  indexEvent: NDKEvent | null;
  contentEvents: NDKEvent[];
  eventStructure: EventStructureNode[];
} {
  const contentEvents: NDKEvent[] = [];
  const eventStructure: EventStructureNode[] = [];

  // Add root index to structure
  eventStructure.push({
    title,
    level: 1,
    eventType: "index",
    eventKind: 30040,
    dTag: generateDTag(title),
    children: []
  });

  // Build hierarchical structure
  const hierarchy = buildSegmentHierarchy(segments);

  for (const level2Section of hierarchy) {
    if (level2Section.hasChildren) {
      // Create 30040 for level 2 section with children
      const level2Index = createIndexEventForSection(level2Section, ndk);
      contentEvents.push(level2Index);

      const level2Node: EventStructureNode = {
        title: level2Section.title,
        level: level2Section.level,
        eventType: "index",
        eventKind: 30040,
        dTag: generateDTag(level2Section.title),
        children: []
      };

      // Add children as 30041 content events
      for (const child of level2Section.children) {
        const childEvent = createContentEvent(child, ndk);
        contentEvents.push(childEvent);

        level2Node.children.push({
          title: child.title,
          level: child.level,
          eventType: "content",
          eventKind: 30041,
          dTag: generateDTag(child.title),
          children: []
        });
      }

      eventStructure[0].children.push(level2Node);
    } else {
      // Create 30041 for level 2 section without children
      const contentEvent = createContentEvent(level2Section, ndk);
      contentEvents.push(contentEvent);

      eventStructure[0].children.push({
        title: level2Section.title,
        level: level2Section.level,
        eventType: "content",
        eventKind: 30041,
        dTag: generateDTag(level2Section.title),
        children: []
      });
    }
  }

  return { tree, indexEvent, contentEvents, eventStructure };
}

/**
 * Create a 30040 index event from document metadata
 */
function createIndexEvent(
  title: string,
  attributes: Record<string, string>,
  segments: ContentSegment[],
  ndk: NDK
): NDKEvent {
  const event = new NDKEvent(ndk);
  event.kind = 30040;
  event.created_at = Math.floor(Date.now() / 1000);
  event.pubkey = ndk.activeUser?.pubkey || "preview-placeholder-pubkey";

  const dTag = generateDTag(title);
  const [mTag, MTag] = getMimeTags(30040);

  const tags: string[][] = [
    ["d", dTag],
    mTag,
    MTag,
    ["title", title]
  ];

  // Add document attributes as tags
  addDocumentAttributesToTags(tags, attributes, event.pubkey);

  // Add a-tags for each content section
  segments.forEach(segment => {
    const sectionDTag = generateDTag(segment.title);
    tags.push(["a", `30041:${event.pubkey}:${sectionDTag}`]);
  });

  event.tags = tags;
  console.log(`[TreeProcessor] Index event tags:`, tags.slice(0, 10));
  event.content = generateIndexContent(title, segments);

  return event;
}

/**
 * Create a 30041 content event from segment
 */
function createContentEvent(segment: ContentSegment, ndk: NDK): NDKEvent {
  const event = new NDKEvent(ndk);
  event.kind = 30041;
  event.created_at = Math.floor(Date.now() / 1000);
  event.pubkey = ndk.activeUser?.pubkey || "preview-placeholder-pubkey";

  const dTag = generateDTag(segment.title);
  const [mTag, MTag] = getMimeTags(30041);

  const tags: string[][] = [
    ["d", dTag],
    mTag,
    MTag,
    ["title", segment.title]
  ];

  // Add segment attributes as tags
  addSectionAttributesToTags(tags, segment.attributes);

  event.tags = tags;
  event.content = segment.content;

  return event;
}

/**
 * Generate default index content
 */
function generateIndexContent(title: string, segments: ContentSegment[]): string {
  return `# ${title}

${segments.length} sections available:

${segments.map((segment, i) => `${i + 1}. ${segment.title}`).join('\n')}`;
}

/**
 * Escape regex special characters
 */
function escapeRegex(str: string): string {
  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

/**
 * Generate deterministic d-tag from title
 */
function generateDTag(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^\p{L}\p{N}]/gu, "-")
    .replace(/-+/g, "-")
    .replace(/^-|-$/g, "") || "untitled";
}

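// Illustrative examples (not part of the original source) of what the
// slugging rules above produce:
//   generateDTag("Level 2: Main Sections")  // "level-2-main-sections"
//   generateDTag("!!!")                     // "untitled" (fallback)
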
/**
 * Add document attributes as Nostr tags
 */
function addDocumentAttributesToTags(
  tags: string[][],
  attributes: Record<string, string>,
  pubkey: string
) {
  // Standard metadata
  if (attributes.author) tags.push(["author", attributes.author]);
  if (attributes.version) tags.push(["version", attributes.version]);
  if (attributes.published) tags.push(["published", attributes.published]);
  if (attributes.language) tags.push(["language", attributes.language]);
  if (attributes.image) tags.push(["image", attributes.image]);
  if (attributes.description) tags.push(["summary", attributes.description]);
  if (attributes.type) tags.push(["type", attributes.type]);

  // Tags
  if (attributes.tags) {
    attributes.tags.split(",").forEach(tag => tags.push(["t", tag.trim()]));
  }

  // Add pubkey reference
  tags.push(["p", pubkey]);

  // Custom attributes
  addCustomAttributes(tags, attributes);
}

/**
 * Add section attributes as tags
 */
function addSectionAttributesToTags(
  tags: string[][],
  attributes: Record<string, string>
) {
  // Section tags
  if (attributes.tags) {
    attributes.tags.split(",").forEach(tag => tags.push(["t", tag.trim()]));
  }

  // Custom attributes
  addCustomAttributes(tags, attributes);
}

/**
 * Add custom attributes, filtering out system ones
 */
function addCustomAttributes(
  tags: string[][],
  attributes: Record<string, string>
) {
  const systemAttributes = [
    "attribute-undefined", "attribute-missing", "appendix-caption",
    "appendix-refsig", "caution-caption", "chapter-refsig", "example-caption",
    "figure-caption", "important-caption", "last-update-label", "manname-title",
    "note-caption", "part-refsig", "preface-title", "section-refsig",
    "table-caption", "tip-caption", "toc-title", "untitled-label",
    "version-label", "warning-caption", "asciidoctor", "asciidoctor-version",
    "safe-mode-name", "backend", "doctype", "basebackend", "filetype",
    "outfilesuffix", "stylesdir", "iconsdir", "localdate", "localyear",
    "localtime", "localdatetime", "docdate", "docyear", "doctime",
    "docdatetime", "doctitle", "embedded", "notitle",
    // Already handled above
    "author", "version", "published", "language", "image", "description",
    "tags", "title", "type"
  ];

  Object.entries(attributes).forEach(([key, value]) => {
    if (!systemAttributes.includes(key) && value && typeof value === "string") {
      tags.push([key, value]);
    }
  });
}

/**
 * Group segments by level 2 sections for Level 2 parsing
 * Combines all nested content into each level 2 section
 */
function groupSegmentsByLevel2(segments: ContentSegment[]): ContentSegment[] {
  const level2Groups: ContentSegment[] = [];

  // Find all level 2 segments and include their nested content
  for (const segment of segments) {
    if (segment.level === 2) {
      // Find all content that belongs to this level 2 section
      const nestedSegments = segments.filter(s =>
        s.level > 2 &&
        s.startLine > segment.startLine &&
        (segments.find(next => next.level <= 2 && next.startLine > segment.startLine)?.startLine || Infinity) > s.startLine
      );

      // Combine the level 2 content with all nested content
      let combinedContent = segment.content;
      for (const nested of nestedSegments) {
        combinedContent += `\n\n${'='.repeat(nested.level)} ${nested.title}\n${nested.content}`;
      }

      level2Groups.push({
        ...segment,
        content: combinedContent
      });
    }
  }

  return level2Groups;
}

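// Illustrative sketch (not part of the original source): for segments at
// levels [2, 3, 4, 2], the first group's combined content becomes
//
//   <level-2 content>
//
//   === <level-3 title>
//   <level-3 content>
//
//   ==== <level-4 title>
//   <level-4 content>
//
// and the trailing level 2 segment forms its own group with no nested content.
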
/**
 * Build hierarchical segment structure for Level 3+ parsing
 */
function buildSegmentHierarchy(segments: ContentSegment[]): HierarchicalSegment[] {
  const hierarchy: HierarchicalSegment[] = [];

  // Process level 2 sections
  for (const level2Segment of segments.filter(s => s.level === 2)) {
    const children = segments.filter(s =>
      s.level > 2 &&
      s.startLine > level2Segment.startLine &&
      (segments.find(next => next.level <= 2 && next.startLine > level2Segment.startLine)?.startLine || Infinity) > s.startLine
    );

    hierarchy.push({
      ...level2Segment,
      hasChildren: children.length > 0,
      children
    });
  }

  return hierarchy;
}

/**
 * Create a 30040 index event for a section with children
 */
function createIndexEventForSection(section: HierarchicalSegment, ndk: NDK): NDKEvent {
  const event = new NDKEvent(ndk);
  event.kind = 30040;
  event.created_at = Math.floor(Date.now() / 1000);
  event.pubkey = ndk.activeUser?.pubkey || "preview-placeholder-pubkey";

  const dTag = generateDTag(section.title);
  const [mTag, MTag] = getMimeTags(30040);

  const tags: string[][] = [
    ["d", dTag],
    mTag,
    MTag,
    ["title", section.title]
  ];

  // Add section attributes as tags
  addSectionAttributesToTags(tags, section.attributes);

  // Add a-tags for each child content section
  section.children.forEach(child => {
    const childDTag = generateDTag(child.title);
    tags.push(["a", `30041:${event.pubkey}:${childDTag}`]);
  });

  event.tags = tags;
  event.content = `${section.content}\n\n${section.children.length} subsections available.`;

  return event;
}
@ -0,0 +1,284 @@
/**
 * TDD Tests for NKBIP-01 Publication Tree Processor
 *
 * Tests the iterative parsing function at different hierarchy levels
 * using deep_hierarchy_test.adoc to verify NKBIP-01 compliance.
 */

import { describe, it, expect, beforeAll } from 'vitest';
import { readFileSync } from 'fs';
import { parseAsciiDocWithTree, validateParseLevel, getSupportedParseLevels } from '../../src/lib/utils/asciidoc_publication_parser.js';

// Mock NDK for testing
const mockNDK = {
  activeUser: {
    pubkey: "test-pubkey-12345"
  }
} as any;

// Read the test document
const testDocumentPath = "./test_data/AsciidocFiles/deep_hierarchy_test.adoc";
let testContent: string;

try {
  testContent = readFileSync(testDocumentPath, 'utf-8');
} catch (error) {
  console.error("Failed to read test document:", error);
  testContent = `= Deep Hierarchical Document Test
:tags: testing, hierarchy, structure
:author: Test Author
:type: technical

This document tests all 6 levels of AsciiDoc hierarchy to validate our parse level system.

== Level 2: Main Sections
:tags: level2, main

This is a level 2 section that should appear in all parse levels.

=== Level 3: Subsections
:tags: level3, subsection

This is a level 3 section that should appear in parse levels 3-6.

==== Level 4: Sub-subsections
:tags: level4, detailed

This is a level 4 section that should appear in parse levels 4-6.

===== Level 5: Deep Subsections
:tags: level5, deep

This is a level 5 section that should only appear in parse levels 5-6.

====== Level 6: Deepest Level
:tags: level6, deepest

This is a level 6 section that should only appear in parse level 6.

Content at the deepest level of our hierarchy.

== Level 2: Second Main Section
:tags: level2, main, second

A second main section to ensure we have balanced content at the top level.`;
}

describe("NKBIP-01 Publication Tree Processor", () => {

  it("should validate parse levels correctly", () => {
    // Test valid parse levels
    expect(validateParseLevel(2)).toBe(true);
    expect(validateParseLevel(3)).toBe(true);
    expect(validateParseLevel(5)).toBe(true);

    // Test invalid parse levels
    expect(validateParseLevel(1)).toBe(false);
    expect(validateParseLevel(6)).toBe(false);
    expect(validateParseLevel(7)).toBe(false);
    expect(validateParseLevel(2.5)).toBe(false);
    expect(validateParseLevel(-1)).toBe(false);

    // Test supported levels array
    const supportedLevels = getSupportedParseLevels();
    expect(supportedLevels).toEqual([2, 3, 4, 5]);
  });

  it("should parse Level 2 with NKBIP-01 minimal structure", async () => {
    const result = await parseAsciiDocWithTree(testContent, mockNDK, 2);

    // Should be detected as article (has title and sections)
    expect(result.metadata.contentType).toBe("article");
    expect(result.metadata.parseLevel).toBe(2);
    expect(result.metadata.title).toBe("Deep Hierarchical Document Test");

    // Should have 1 index event (30040) + 2 content events (30041) for level 2 sections
    expect(result.indexEvent).toBeDefined();
    expect(result.indexEvent?.kind).toBe(30040);
    expect(result.contentEvents.length).toBe(2);

    // All content events should be kind 30041
    result.contentEvents.forEach(event => {
      expect(event.kind).toBe(30041);
    });

    // Check titles of level 2 sections
    const contentTitles = result.contentEvents.map(e =>
      e.tags.find((t: string[]) => t[0] === "title")?.[1]
    );
    expect(contentTitles).toContain("Level 2: Main Sections");
    expect(contentTitles).toContain("Level 2: Second Main Section");

    // Content should include all nested subsections as AsciiDoc
    const firstSectionContent = result.contentEvents[0].content;
    expect(firstSectionContent).toBeDefined();
    // Should contain level 3, 4, 5 content as nested AsciiDoc markup
    expect(firstSectionContent.includes("=== Level 3: Subsections")).toBe(true);
    expect(firstSectionContent.includes("==== Level 4: Sub-subsections")).toBe(true);
    expect(firstSectionContent.includes("===== Level 5: Deep Subsections")).toBe(true);
  });

  it("should parse Level 3 with NKBIP-01 intermediate structure", async () => {
    const result = await parseAsciiDocWithTree(testContent, mockNDK, 3);

    expect(result.metadata.contentType).toBe("article");
    expect(result.metadata.parseLevel).toBe(3);

    // Should have hierarchical structure
    expect(result.indexEvent).toBeDefined();
    expect(result.indexEvent?.kind).toBe(30040);

    // Should have mix of 30040 (for level 2 sections with children) and 30041 (for content)
    const kinds = result.contentEvents.map(e => e.kind);
    expect(kinds).toContain(30040); // Level 2 sections with children
    expect(kinds).toContain(30041); // Level 3 content sections

    // Level 2 sections with children should be 30040 index events
    const level2WithChildrenEvents = result.contentEvents.filter(e =>
      e.kind === 30040 &&
      e.tags.find((t: string[]) => t[0] === "title")?.[1]?.includes("Level 2:")
    );
    expect(level2WithChildrenEvents.length).toBe(2); // Both level 2 sections have children

    // Should have 30041 events for level 3 content
    const level3ContentEvents = result.contentEvents.filter(e =>
      e.kind === 30041 &&
      e.tags.find((t: string[]) => t[0] === "title")?.[1]?.includes("Level 3:")
    );
    expect(level3ContentEvents.length).toBeGreaterThan(0);
  });

  it("should parse Level 4 with NKBIP-01 detailed structure", async () => {
    const result = await parseAsciiDocWithTree(testContent, mockNDK, 4);

    expect(result.metadata.contentType).toBe("article");
    expect(result.metadata.parseLevel).toBe(4);

    // Should have hierarchical structure with mix of 30040 and 30041 events
    expect(result.indexEvent).toBeDefined();
    expect(result.indexEvent?.kind).toBe(30040);

    const kinds = result.contentEvents.map(e => e.kind);
    expect(kinds).toContain(30040); // Level 2 sections with children
    expect(kinds).toContain(30041); // Content sections

    // Check that we have level 4 content sections
    const contentTitles = result.contentEvents.map(e =>
      e.tags.find((t: string[]) => t[0] === "title")?.[1]
    );
    expect(contentTitles).toContain("Level 4: Sub-subsections");
  });

  it("should parse Level 5 with NKBIP-01 maximum depth", async () => {
    const result = await parseAsciiDocWithTree(testContent, mockNDK, 5);

    expect(result.metadata.contentType).toBe("article");
    expect(result.metadata.parseLevel).toBe(5);

    // Should have hierarchical structure
    expect(result.indexEvent).toBeDefined();
    expect(result.indexEvent?.kind).toBe(30040);

    // Should include level 5 sections as content events
    const contentTitles = result.contentEvents.map(e =>
      e.tags.find((t: string[]) => t[0] === "title")?.[1]
    );
    expect(contentTitles).toContain("Level 5: Deep Subsections");
  });

  it("should validate event structure correctly", async () => {
    const result = await parseAsciiDocWithTree(testContent, mockNDK, 3);

    // Test index event structure
    expect(result.indexEvent).toBeDefined();
    expect(result.indexEvent?.kind).toBe(30040);
    expect(result.indexEvent?.tags).toBeDefined();

    // Check required tags
    const indexTags = result.indexEvent!.tags;
    const dTag = indexTags.find((t: string[]) => t[0] === "d");
    const titleTag = indexTags.find((t: string[]) => t[0] === "title");

    expect(dTag).toBeDefined();
    expect(titleTag).toBeDefined();
    expect(titleTag![1]).toBe("Deep Hierarchical Document Test");

    // Test content events structure - mix of 30040 and 30041
    result.contentEvents.forEach(event => {
      expect([30040, 30041]).toContain(event.kind);
      expect(event.tags).toBeDefined();
      expect(event.content).toBeDefined();

      const eventTitleTag = event.tags.find((t: string[]) => t[0] === "title");
      expect(eventTitleTag).toBeDefined();
    });
  });

  it("should preserve content as AsciiDoc", async () => {
    const result = await parseAsciiDocWithTree(testContent, mockNDK, 2);

    // Content should be preserved as original AsciiDoc, not converted to HTML
    const firstEvent = result.contentEvents[0];
    expect(firstEvent.content).toBeDefined();

    // Should contain AsciiDoc markup, not HTML
    expect(firstEvent.content.includes("<")).toBe(false);
    expect(firstEvent.content.includes("===")).toBe(true);
  });

  it("should handle attributes correctly", async () => {
    const result = await parseAsciiDocWithTree(testContent, mockNDK, 2);

    // Document-level attributes should be in index event
    expect(result.indexEvent).toBeDefined();
    const indexTags = result.indexEvent!.tags;

    // Check for document attributes
    const authorTag = indexTags.find((t: string[]) => t[0] === "author");
    const typeTag = indexTags.find((t: string[]) => t[0] === "type");
    const tagsTag = indexTags.find((t: string[]) => t[0] === "t");

    expect(authorTag?.[1]).toBe("Test Author");
    expect(typeTag?.[1]).toBe("technical");
    expect(tagsTag).toBeDefined(); // Should have at least one t-tag
  });

  it("should handle scattered notes mode", async () => {
    // Test with content that has no document title (scattered notes)
    const scatteredContent = `== First Note
:tags: note1

Content of first note.

== Second Note
:tags: note2

Content of second note.`;

    const result = await parseAsciiDocWithTree(scatteredContent, mockNDK, 2);

    expect(result.metadata.contentType).toBe("scattered-notes");
    expect(result.indexEvent).toBeNull(); // No index event for scattered notes
    expect(result.contentEvents.length).toBe(2);

    // All events should be 30041 content events
    result.contentEvents.forEach(event => {
      expect(event.kind).toBe(30041);
    });
  });

  it("should integrate with PublicationTree structure", async () => {
    const result = await parseAsciiDocWithTree(testContent, mockNDK, 2);

    // Should have a PublicationTree instance
    expect(result.tree).toBeDefined();

    // Tree should have methods for event management
    expect(typeof result.tree.addEvent).toBe("function");

    // Event structure should be populated
    expect(result.metadata.eventStructure).toBeDefined();
    expect(Array.isArray(result.metadata.eventStructure)).toBe(true);
  });

});