
feat: Implement NKBIP-01 hierarchical tree processor with Asciidoctor extension

Implemented a proper Asciidoctor tree processor extension for NKBIP-01
hierarchical parsing, with comprehensive test coverage and a plan for the next
integration phase.

Features:
- Real Asciidoctor tree processor extension using registry.treeProcessor()
- NKBIP-01 compliant hierarchical structure (30040 index + 30041 content events)
- Parse levels 2-5 with different event granularities:
  * Level 2: One 30041 per level 2 section (contains all nested content)
  * Level 3+: Mix of 30040 (sections with children) + 30041 (content sections)
- Content preserved as original AsciiDoc markup
- Comprehensive test suite validating all parse levels and event structures

Implementation:
- src/lib/utils/publication_tree_processor.ts: Core tree processor extension
- src/lib/utils/asciidoc_publication_parser.ts: Unified parser interface
- tests/unit/publication_tree_processor.test.ts: Complete test coverage
- HIERARCHY_VISUALIZATION_PLAN.md: Next phase integration plan

Next: Integrate into ZettelEditor with visual hierarchy indicators

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Branch: master
Author: limina1 (7 months ago)
Commit: 6e4cccf660
Changed files (1,420 additions):
  1. HIERARCHY_VISUALIZATION_PLAN.md (+163)
  2. src/lib/utils/asciidoc_publication_parser.ts (+144)
  3. src/lib/utils/publication_tree_processor.ts (+829)
  4. tests/unit/publication_tree_processor.test.ts (+284)

HIERARCHY_VISUALIZATION_PLAN.md (+163)

@@ -0,0 +1,163 @@
# Hierarchy Visualization Integration Plan
## Current State: NKBIP-01 Tree Processor Complete ✅
We have successfully implemented a proper Asciidoctor tree processor extension that:
- Registers as a real Asciidoctor extension using `registry.treeProcessor()`
- Processes documents during AST parsing with full access to `doc.getSections()`
- Implements hierarchical NKBIP-01 structure with 30040/30041 events
- Supports parse levels 2-5 with different event granularities
- Passes comprehensive tests validating the hierarchical structure
## Next Phase: ZettelEditor Integration
### Overview
Integrate the new hierarchical parser into ZettelEditor with visual hierarchy hints that show users exactly which sections will become which types of events at each parse level: like a text editor's indent guides, but for Nostr event structure.
### Phase 1: Core Integration (Essential)
#### 1.1 Update ZettelEditor Parser
- **Current**: Uses old `publication_tree_factory.ts` with flattened AST parsing
- **Target**: Switch to new `asciidoc_publication_parser.ts` with tree processor
- **Impact**: Enables real hierarchical 30040/30041 event structure
```typescript
// Change from:
import { createPublicationTreeFromContent } from "$lib/utils/publication_tree_factory";
// To:
import { parseAsciiDocWithTree } from "$lib/utils/asciidoc_publication_parser";
```
#### 1.2 Fix Parse Level Configuration
- Update `MAX_PARSE_LEVEL` from 6 to 5 in ZettelEditor.svelte:43
- Update parse level options to reflect new hierarchical structure descriptions
#### 1.3 Update Preview Panel
- Leverage `publicationResult.metadata.eventStructure` for accurate hierarchy display
- Show 30040 vs 30041 event types with different visual indicators
- Display parent-child relationships between index and content events
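A minimal sketch of how the preview panel could flatten `eventStructure` into indented rows with the 🔵/🟢 indicators described below; `toPreviewRows` is a hypothetical helper name, and the actual row rendering would live in the Svelte template:

```typescript
// The shape of result.metadata.eventStructure, as defined by the tree processor.
interface EventStructureNode {
  title: string;
  level: number;
  eventType: "index" | "content";
  eventKind: 30040 | 30041;
  dTag: string;
  children: EventStructureNode[];
}

// Flatten the tree into display rows: one row per event, with a depth
// value the preview can translate into indentation.
function toPreviewRows(
  nodes: EventStructureNode[],
  depth = 0
): { label: string; depth: number }[] {
  return nodes.flatMap((node) => [
    {
      label: `${node.eventType === "index" ? "🔵 Index" : "🟢 Content"}: ${node.title}`,
      depth,
    },
    ...toPreviewRows(node.children, depth + 1),
  ]);
}
```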
### Phase 2: Visual Hierarchy Indicators (High Impact)
#### 2.1 Editor Gutter Visualization
Add visual hints in the editor showing which sections will become events:
**Event Type Indicators:**
- 🔵 **Blue circle**: Sections that become 30040 index events
- 🟢 **Green circle**: Sections that become 30041 content events
- 📝 **Text label**: "Index" or "Content" next to each section
**Parse Level Boundaries:**
- **Colored left border**: Different colors for each hierarchy level
- **Indent guides**: Visual lines showing nested structure
- **Level badges**: Small "L2", "L3", etc. indicators
#### 2.2 Real-time Parse Level Feedback
As user changes parse level dropdown:
- **Highlight changes**: Animate sections that change event type
- **Event count updates**: Show before/after event counts
- **Structure preview**: Mini-tree view showing resulting hierarchy
#### 2.3 Interactive Section Mapping
- **Hover effects**: Hover over section → highlight corresponding event in preview
- **Click navigation**: Click section title → jump to event preview
- **Relationship lines**: Visual connections between 30040 and their 30041 children
### Phase 3: Advanced Hierarchy Features (Polish)
#### 3.1 Smart Parse Level Suggestions
- **Auto-detect optimal level**: Analyze document structure and suggest best parse level
- **Level comparison**: Side-by-side view of different parse levels
- **Performance hints**: Show trade-offs (fewer vs more events)
#### 3.2 Enhanced Editor Features
- **Section folding**: Collapse/expand based on hierarchy
- **Quick level promotion**: Buttons to promote/demote section levels
- **Hierarchy outline**: Collapsible tree view in sidebar
#### 3.3 Event Relationship Visualization
- **Tree diagram**: Visual representation of 30040 → 30041 relationships
- **Event flow**: Show how events will be published and linked
- **Validation**: Check for proper NKBIP-01 compliance
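The compliance check could start as a simple tag audit; `validateIndexEvent` and the `EventLike` shape are illustrative sketches, not part of the implemented processor:

```typescript
// Minimal slice of an event that the check needs; real NDKEvent objects carry more.
interface EventLike {
  kind: number;
  tags: string[][];
}

// Returns human-readable problems; an empty array means the event passes.
function validateIndexEvent(event: EventLike): string[] {
  const problems: string[] = [];
  const has = (name: string) => event.tags.some((t) => t[0] === name);
  if (event.kind !== 30040) problems.push(`expected kind 30040, got ${event.kind}`);
  if (!has("d")) problems.push("missing d tag");
  if (!has("title")) problems.push("missing title tag");
  if (!has("a")) problems.push("index references no content events (no a tags)");
  return problems;
}
```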
### Phase 4: Advanced Interactions (Future)
#### 4.1 Drag & Drop Hierarchy Editing
- Drag sections to change hierarchy
- Visual feedback for valid drop targets
- Auto-update AsciiDoc markup
#### 4.2 Multi-level Preview
- Split preview showing multiple parse levels simultaneously
- Compare different parsing strategies
- Export options for different levels
## Technical Implementation Notes
### Key Data Structures
```typescript
// eventStructure provides complete hierarchy information
interface EventStructureNode {
title: string;
level: number;
eventType: "index" | "content";
eventKind: 30040 | 30041;
dTag: string;
children: EventStructureNode[];
}
```
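For example, the before/after event counts mentioned in Phase 2 fall directly out of a walk over this structure (`countEventKinds` is a sketch, not existing code):

```typescript
// Redeclared here so the sketch stands alone; matches the interface above.
interface EventStructureNode {
  title: string;
  level: number;
  eventType: "index" | "content";
  eventKind: 30040 | 30041;
  dTag: string;
  children: EventStructureNode[];
}

// Count 30040 vs 30041 events by walking the hierarchy depth-first.
function countEventKinds(nodes: EventStructureNode[]): { index: number; content: number } {
  const counts = { index: 0, content: 0 };
  const walk = (ns: EventStructureNode[]) => {
    for (const n of ns) {
      counts[n.eventType] += 1;
      walk(n.children);
    }
  };
  walk(nodes);
  return counts;
}
```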
### Integration Points
1. **Parser integration**: `parseAsciiDocWithTree()` in reactive effect
2. **Event structure**: Use `result.metadata.eventStructure` for visualization
3. **Real-time updates**: Svelte reactivity for immediate visual feedback
4. **Preview sync**: Coordinate editor and preview panel highlights
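One concrete detail the real-time updates need: parses triggered by rapid edits can resolve out of order, so only the newest result should reach the UI. A generic sketch of that guard (the name `makeLatestOnly` is ours):

```typescript
// Wrap an async function so that only the most recently started call
// yields a result; earlier in-flight calls resolve to undefined.
function makeLatestOnly<T, A>(fn: (input: A) => Promise<T>) {
  let seq = 0;
  return async (input: A): Promise<T | undefined> => {
    const mySeq = ++seq;
    const result = await fn(input);
    return mySeq === seq ? result : undefined; // stale result: drop it
  };
}
```

In the editor, `parseAsciiDocWithTree` would be the wrapped function, so a slow parse of an old draft can never overwrite the preview of a newer one.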
### CSS Hierarchy Indicators
```css
.section-level-2 { border-left: 4px solid #3b82f6; } /* Blue */
.section-level-3 { border-left: 4px solid #10b981; } /* Green */
.section-level-4 { border-left: 4px solid #f59e0b; } /* Amber */
.section-level-5 { border-left: 4px solid #8b5cf6; } /* Purple */
.event-type-index { background: rgba(59, 130, 246, 0.1); } /* Light blue */
.event-type-content { background: rgba(16, 185, 129, 0.1); } /* Light green */
```
## Success Metrics
### Phase 1 (Essential)
- [ ] ZettelEditor uses new tree processor
- [ ] All existing functionality preserved
- [ ] Hierarchical events display correctly
### Phase 2 (High Impact)
- [ ] Visual hierarchy indicators in editor
- [ ] Real-time parse level feedback
- [ ] Clear 30040 vs 30041 distinction
### Phase 3 (Polish)
- [ ] Smart parse level suggestions
- [ ] Enhanced editor interactions
- [ ] Event relationship visualization
## Migration Strategy
1. **Gradual rollout**: Implement phases sequentially
2. **Fallback compatibility**: Keep old factory as backup during transition
3. **User testing**: Validate hierarchy visualization with real users
4. **Performance monitoring**: Ensure real-time updates remain smooth
## Dependencies
- ✅ **NKBIP-01 tree processor**: Complete and tested
- ✅ **Parse level validation**: Levels 2-5 supported
- ✅ **Event structure metadata**: Available in `eventStructure` field
- ⏳ **ZettelEditor integration**: Next phase
- ⏳ **Visual design system**: Colors, icons, animations
---
**Ready to proceed with Phase 1: Core Integration**
The foundation is solid - we have a working tree processor extension that generates proper hierarchical NKBIP-01 events. Now we need to integrate it into the editor interface and add the visual hierarchy indicators that will make the event structure clear to users.

src/lib/utils/asciidoc_publication_parser.ts (+144)

@@ -0,0 +1,144 @@
/**
* Unified AsciiDoc Publication Parser
*
* Single entry point for parsing AsciiDoc content into NKBIP-01 compliant
* publication trees using proper Asciidoctor tree processor extensions.
*
* This implements Michael's vision of using PublicationTree as the primary
* data structure for organizing hierarchical Nostr events.
*/
import Asciidoctor from "asciidoctor";
import { registerPublicationTreeProcessor, type ProcessorResult } from "./publication_tree_processor";
import type NDK from "@nostr-dev-kit/ndk";
export type PublicationTreeResult = ProcessorResult;
/**
* Parse AsciiDoc content into a PublicationTree using tree processor extension
* This is the main entry point for all parsing operations
*/
export async function parseAsciiDocWithTree(
content: string,
ndk: NDK,
parseLevel: number = 2
): Promise<PublicationTreeResult> {
console.log(`[Parser] Starting parse at level ${parseLevel}`);
// Create fresh Asciidoctor instance
const asciidoctor = Asciidoctor();
const registry = asciidoctor.Extensions.create();
// Register our tree processor extension
const processorAccessor = registerPublicationTreeProcessor(
registry,
ndk,
parseLevel,
content
);
try {
// Parse the document with our extension
const doc = asciidoctor.load(content, {
extension_registry: registry,
standalone: false,
attributes: {
sectids: false
}
});
console.log(`[Parser] Document loaded successfully`);
// Get the result from our processor
const result = processorAccessor.getResult();
if (!result) {
throw new Error("Tree processor failed to generate result");
}
// Build async relationships in the PublicationTree
await buildTreeRelationships(result);
console.log(`[Parser] Tree relationships built successfully`);
return result;
} catch (error) {
console.error('[Parser] Error during parsing:', error);
throw new Error(`Failed to parse AsciiDoc content: ${error instanceof Error ? error.message : 'Unknown error'}`);
}
}
/**
* Build async relationships in the PublicationTree
* This adds content events to the tree structure as Michael envisioned
*/
async function buildTreeRelationships(result: ProcessorResult): Promise<void> {
const { tree, indexEvent, contentEvents } = result;
if (!tree) {
throw new Error("No tree available to build relationships");
}
try {
// Add content events to the tree
if (indexEvent && contentEvents.length > 0) {
// Article structure: add all content events to index
for (const contentEvent of contentEvents) {
await tree.addEvent(contentEvent, indexEvent);
}
} else if (contentEvents.length > 1) {
// Scattered notes: add remaining events to first event
const rootEvent = contentEvents[0];
for (let i = 1; i < contentEvents.length; i++) {
await tree.addEvent(contentEvents[i], rootEvent);
}
}
console.log(`[Parser] Added ${contentEvents.length} events to tree`);
} catch (error) {
console.error('[Parser] Error building tree relationships:', error);
throw error;
}
}
/**
* Export events from PublicationTree for publishing workflow compatibility
*/
export function exportEventsFromTree(result: PublicationTreeResult) {
return {
indexEvent: result.indexEvent ? eventToPublishableObject(result.indexEvent) : undefined,
contentEvents: result.contentEvents.map(eventToPublishableObject),
tree: result.tree
};
}
/**
* Convert NDKEvent to publishable object format
*/
function eventToPublishableObject(event: any) {
return {
kind: event.kind,
content: event.content,
tags: event.tags,
created_at: event.created_at,
pubkey: event.pubkey,
id: event.id,
title: event.tags.find((t: string[]) => t[0] === "title")?.[1] || "Untitled"
};
}
/**
* Validate parse level parameter
*/
export function validateParseLevel(level: number): boolean {
return Number.isInteger(level) && level >= 2 && level <= 5;
}
/**
* Get supported parse levels
*/
export function getSupportedParseLevels(): number[] {
return [2, 3, 4, 5];
}

src/lib/utils/publication_tree_processor.ts (+829)

@@ -0,0 +1,829 @@
/**
* NKBIP-01 Compliant Publication Tree Processor
*
* Implements proper Asciidoctor tree processor extension pattern for building
* PublicationTree structures during document parsing. Supports iterative parsing
* at different hierarchy levels (2-5) as defined in the NKBIP-01 specification.
*/
import type { Document, Registry } from "asciidoctor";
import { PublicationTree } from "$lib/data_structures/publication_tree";
import { NDKEvent } from "@nostr-dev-kit/ndk";
import type NDK from "@nostr-dev-kit/ndk";
import { getMimeTags } from "$lib/utils/mime";
export interface ProcessorResult {
tree: PublicationTree;
indexEvent: NDKEvent | null;
contentEvents: NDKEvent[];
metadata: {
title: string;
totalSections: number;
contentType: "article" | "scattered-notes" | "none";
attributes: Record<string, string>;
parseLevel: number;
eventStructure: EventStructureNode[];
};
}
export interface EventStructureNode {
title: string;
level: number;
eventType: "index" | "content";
eventKind: 30040 | 30041;
dTag: string;
children: EventStructureNode[];
}
interface ContentSegment {
title: string;
content: string;
level: number;
attributes: Record<string, string>;
startLine: number;
endLine: number;
}
interface HierarchicalSegment extends ContentSegment {
hasChildren: boolean;
children: ContentSegment[];
}
/**
* Register the PublicationTree processor extension with Asciidoctor
* This follows Asciidoctor's documented tree processor extension pattern
*/
export function registerPublicationTreeProcessor(
registry: Registry,
ndk: NDK,
parseLevel: number = 2,
originalContent: string
): { getResult: () => ProcessorResult | null } {
let processorResult: ProcessorResult | null = null;
registry.treeProcessor(function() {
const self = this;
self.process(function(doc: Document) {
try {
// Extract document metadata from AST
const title = doc.getTitle() || '';
const attributes = doc.getAttributes();
const sections = doc.getSections();
console.log(`[TreeProcessor] Document attributes:`, {
tags: attributes.tags,
author: attributes.author,
type: attributes.type
});
console.log(`[TreeProcessor] Processing document: "${title}" at parse level ${parseLevel}`);
console.log(`[TreeProcessor] Found ${sections.length} top-level sections`);
// Extract content segments from original text based on parse level
const contentSegments = extractContentSegments(originalContent, sections, parseLevel);
console.log(`[TreeProcessor] Extracted ${contentSegments.length} content segments for level ${parseLevel}`);
// Determine content type based on structure
const contentType = detectContentType(title, contentSegments);
console.log(`[TreeProcessor] Detected content type: ${contentType}`);
// Build events and tree structure
const { tree, indexEvent, contentEvents, eventStructure } = buildEventsFromSegments(
contentSegments,
title,
attributes,
contentType,
parseLevel,
ndk
);
processorResult = {
tree,
indexEvent,
contentEvents,
metadata: {
title,
totalSections: contentSegments.length,
contentType,
attributes,
parseLevel,
eventStructure
}
};
console.log(`[TreeProcessor] Built tree with ${contentEvents.length} content events and ${indexEvent ? '1' : '0'} index events`);
} catch (error) {
console.error('[TreeProcessor] Error processing document:', error);
processorResult = null;
}
return doc;
});
});
return {
getResult: () => processorResult
};
}
/**
* Extract content segments from original text based on parse level
* This is the core iterative function that handles different hierarchy depths
*/
function extractContentSegments(
originalContent: string,
sections: any[],
parseLevel: number
): ContentSegment[] {
const lines = originalContent.split('\n');
// Build hierarchy map from AST
const sectionHierarchy = buildSectionHierarchy(sections);
// Debug: Show hierarchy depths
console.log(`[TreeProcessor] Section hierarchy depth analysis:`);
function showDepth(nodes: SectionNode[], depth = 0) {
for (const node of nodes) {
console.log(`${' '.repeat(depth)}Level ${node.level}: ${node.title}`);
if (node.children.length > 0) {
showDepth(node.children, depth + 1);
}
}
}
showDepth(sectionHierarchy);
// Extract segments at the target parse level
return extractSegmentsAtLevel(lines, sectionHierarchy, parseLevel);
}
/**
* Build hierarchical section structure from Asciidoctor AST
*/
function buildSectionHierarchy(sections: any[]): SectionNode[] {
function buildNode(section: any): SectionNode {
return {
title: section.getTitle(),
level: section.getLevel() + 1, // Convert to app level (Asciidoctor uses 0-based)
attributes: section.getAttributes() || {},
children: (section.getSections() || []).map(buildNode)
};
}
return sections.map(buildNode);
}
interface SectionNode {
title: string;
level: number;
attributes: Record<string, string>;
children: SectionNode[];
}
/**
* Extract content segments at the specified parse level
* This implements the iterative parsing logic for different levels
*/
function extractSegmentsAtLevel(
lines: string[],
hierarchy: SectionNode[],
parseLevel: number
): ContentSegment[] {
const segments: ContentSegment[] = [];
// Collect all sections at the target parse level
const targetSections = collectSectionsAtLevel(hierarchy, parseLevel);
for (const section of targetSections) {
const segment = extractSegmentContent(lines, section, parseLevel);
if (segment) {
segments.push(segment);
}
}
return segments;
}
/**
* Recursively collect sections from level 2 through the target parse level
* NKBIP-01: Level N parsing includes sections from level 2 through level N
*/
function collectSectionsAtLevel(hierarchy: SectionNode[], targetLevel: number): SectionNode[] {
const collected: SectionNode[] = [];
function traverse(nodes: SectionNode[]) {
for (const node of nodes) {
// Include sections from level 2 up to target level
if (node.level >= 2 && node.level <= targetLevel) {
collected.push(node);
}
// Continue traversing children to find more sections
if (node.children.length > 0) {
traverse(node.children);
}
}
}
traverse(hierarchy);
return collected;
}
/**
* Extract content for a specific section from the original text
*/
function extractSegmentContent(
lines: string[],
section: SectionNode,
parseLevel: number
): ContentSegment | null {
// Find the section header in the original content
const sectionPattern = new RegExp(`^${'='.repeat(section.level)}\\s+${escapeRegex(section.title)}`);
let startIdx = -1;
for (let i = 0; i < lines.length; i++) {
if (sectionPattern.test(lines[i])) {
startIdx = i;
break;
}
}
if (startIdx === -1) {
console.warn(`[TreeProcessor] Could not find section "${section.title}" at level ${section.level}`);
return null;
}
// Find the end of this section
let endIdx = lines.length;
for (let i = startIdx + 1; i < lines.length; i++) {
const levelMatch = lines[i].match(/^(=+)\s+/);
if (levelMatch && levelMatch[1].length <= section.level) {
endIdx = i;
break;
}
}
// Extract section content
const sectionLines = lines.slice(startIdx, endIdx);
// Parse attributes and content
const { attributes, content } = parseSegmentContent(sectionLines, parseLevel);
return {
title: section.title,
content,
level: section.level,
attributes,
startLine: startIdx,
endLine: endIdx
};
}
/**
* Parse attributes and content from section lines
*/
function parseSegmentContent(sectionLines: string[], parseLevel: number): {
attributes: Record<string, string>;
content: string;
} {
const attributes: Record<string, string> = {};
let contentStartIdx = 1; // Skip the title line
// Look for attribute lines after the title
for (let i = 1; i < sectionLines.length; i++) {
const line = sectionLines[i].trim();
if (line.startsWith(':') && line.includes(':')) {
const match = line.match(/^:([^:]+):\s*(.*)$/);
if (match) {
attributes[match[1]] = match[2];
contentStartIdx = i + 1;
}
} else if (line !== '') {
// Non-empty, non-attribute line - content starts here
break;
}
}
// Extract content (everything after attributes)
const content = sectionLines.slice(contentStartIdx).join('\n').trim();
return { attributes, content };
}
/**
* Detect content type based on document structure
*/
function detectContentType(
title: string,
segments: ContentSegment[]
): "article" | "scattered-notes" | "none" {
const hasDocTitle = !!title;
const hasSections = segments.length > 0;
// Check if the title matches the first section title
const titleMatchesFirstSection = segments.length > 0 && title === segments[0].title;
if (hasDocTitle && hasSections && !titleMatchesFirstSection) {
return "article";
} else if (hasSections) {
return "scattered-notes";
}
return "none";
}
/**
* Build events and tree structure from content segments
* Implements NKBIP-01 hierarchical parsing:
* - Level 2: One 30041 event per level 2 section containing all nested content
* - Level 3+: Hierarchical 30040 events for intermediate sections + 30041 for content-only
*/
function buildEventsFromSegments(
segments: ContentSegment[],
title: string,
attributes: Record<string, string>,
contentType: "article" | "scattered-notes" | "none",
parseLevel: number,
ndk: NDK
): {
tree: PublicationTree;
indexEvent: NDKEvent | null;
contentEvents: NDKEvent[];
eventStructure: EventStructureNode[];
} {
if (contentType === "scattered-notes" && segments.length > 0) {
return buildScatteredNotesStructure(segments, ndk);
}
if (contentType === "article" && title) {
return buildArticleStructure(segments, title, attributes, parseLevel, ndk);
}
throw new Error("No valid content found to create publication tree");
}
/**
* Build structure for scattered notes (flat 30041 events)
*/
function buildScatteredNotesStructure(
segments: ContentSegment[],
ndk: NDK
): {
tree: PublicationTree;
indexEvent: NDKEvent | null;
contentEvents: NDKEvent[];
eventStructure: EventStructureNode[];
} {
const contentEvents: NDKEvent[] = [];
const eventStructure: EventStructureNode[] = [];
const firstSegment = segments[0];
const rootEvent = createContentEvent(firstSegment, ndk);
const tree = new PublicationTree(rootEvent, ndk);
contentEvents.push(rootEvent);
eventStructure.push({
title: firstSegment.title,
level: firstSegment.level,
eventType: "content",
eventKind: 30041,
dTag: generateDTag(firstSegment.title),
children: []
});
// Add remaining segments
for (let i = 1; i < segments.length; i++) {
const contentEvent = createContentEvent(segments[i], ndk);
contentEvents.push(contentEvent);
eventStructure.push({
title: segments[i].title,
level: segments[i].level,
eventType: "content",
eventKind: 30041,
dTag: generateDTag(segments[i].title),
children: []
});
}
return { tree, indexEvent: null, contentEvents, eventStructure };
}
/**
* Build structure for articles based on parse level
*/
function buildArticleStructure(
segments: ContentSegment[],
title: string,
attributes: Record<string, string>,
parseLevel: number,
ndk: NDK
): {
tree: PublicationTree;
indexEvent: NDKEvent | null;
contentEvents: NDKEvent[];
eventStructure: EventStructureNode[];
} {
const indexEvent = createIndexEvent(title, attributes, segments, ndk);
const tree = new PublicationTree(indexEvent, ndk);
if (parseLevel === 2) {
return buildLevel2Structure(segments, title, indexEvent, tree, ndk);
} else {
return buildHierarchicalStructure(segments, title, indexEvent, tree, parseLevel, ndk);
}
}
/**
* Build Level 2 structure: One 30041 event per level 2 section with all nested content
*/
function buildLevel2Structure(
segments: ContentSegment[],
title: string,
indexEvent: NDKEvent,
tree: PublicationTree,
ndk: NDK
): {
tree: PublicationTree;
indexEvent: NDKEvent | null;
contentEvents: NDKEvent[];
eventStructure: EventStructureNode[];
} {
const contentEvents: NDKEvent[] = [];
const eventStructure: EventStructureNode[] = [];
// Add index to structure
eventStructure.push({
title,
level: 1,
eventType: "index",
eventKind: 30040,
dTag: generateDTag(title),
children: []
});
// Group segments by level 2 sections
const level2Groups = groupSegmentsByLevel2(segments);
for (const group of level2Groups) {
const contentEvent = createContentEvent(group, ndk);
contentEvents.push(contentEvent);
eventStructure[0].children.push({
title: group.title,
level: group.level,
eventType: "content",
eventKind: 30041,
dTag: generateDTag(group.title),
children: []
});
}
return { tree, indexEvent, contentEvents, eventStructure };
}
/**
* Build hierarchical structure for Level 3+: Mix of 30040 and 30041 events
*/
function buildHierarchicalStructure(
segments: ContentSegment[],
title: string,
indexEvent: NDKEvent,
tree: PublicationTree,
parseLevel: number,
ndk: NDK
): {
tree: PublicationTree;
indexEvent: NDKEvent | null;
contentEvents: NDKEvent[];
eventStructure: EventStructureNode[];
} {
const contentEvents: NDKEvent[] = [];
const eventStructure: EventStructureNode[] = [];
// Add root index to structure
eventStructure.push({
title,
level: 1,
eventType: "index",
eventKind: 30040,
dTag: generateDTag(title),
children: []
});
// Build hierarchical structure
const hierarchy = buildSegmentHierarchy(segments);
for (const level2Section of hierarchy) {
if (level2Section.hasChildren) {
// Create 30040 for level 2 section with children
const level2Index = createIndexEventForSection(level2Section, ndk);
contentEvents.push(level2Index);
const level2Node: EventStructureNode = {
title: level2Section.title,
level: level2Section.level,
eventType: "index",
eventKind: 30040,
dTag: generateDTag(level2Section.title),
children: []
};
// Add children as 30041 content events
for (const child of level2Section.children) {
const childEvent = createContentEvent(child, ndk);
contentEvents.push(childEvent);
level2Node.children.push({
title: child.title,
level: child.level,
eventType: "content",
eventKind: 30041,
dTag: generateDTag(child.title),
children: []
});
}
eventStructure[0].children.push(level2Node);
} else {
// Create 30041 for level 2 section without children
const contentEvent = createContentEvent(level2Section, ndk);
contentEvents.push(contentEvent);
eventStructure[0].children.push({
title: level2Section.title,
level: level2Section.level,
eventType: "content",
eventKind: 30041,
dTag: generateDTag(level2Section.title),
children: []
});
}
}
return { tree, indexEvent, contentEvents, eventStructure };
}
/**
* Create a 30040 index event from document metadata
*/
function createIndexEvent(
title: string,
attributes: Record<string, string>,
segments: ContentSegment[],
ndk: NDK
): NDKEvent {
const event = new NDKEvent(ndk);
event.kind = 30040;
event.created_at = Math.floor(Date.now() / 1000);
event.pubkey = ndk.activeUser?.pubkey || "preview-placeholder-pubkey";
const dTag = generateDTag(title);
const [mTag, MTag] = getMimeTags(30040);
const tags: string[][] = [
["d", dTag],
mTag,
MTag,
["title", title]
];
// Add document attributes as tags
addDocumentAttributesToTags(tags, attributes, event.pubkey);
// Add a-tags for each content section
segments.forEach(segment => {
const sectionDTag = generateDTag(segment.title);
tags.push(["a", `30041:${event.pubkey}:${sectionDTag}`]);
});
event.tags = tags;
console.log(`[TreeProcessor] Index event tags:`, tags.slice(0, 10));
event.content = generateIndexContent(title, segments);
return event;
}
/**
* Create a 30041 content event from segment
*/
function createContentEvent(segment: ContentSegment, ndk: NDK): NDKEvent {
const event = new NDKEvent(ndk);
event.kind = 30041;
event.created_at = Math.floor(Date.now() / 1000);
event.pubkey = ndk.activeUser?.pubkey || "preview-placeholder-pubkey";
const dTag = generateDTag(segment.title);
const [mTag, MTag] = getMimeTags(30041);
const tags: string[][] = [
["d", dTag],
mTag,
MTag,
["title", segment.title]
];
// Add segment attributes as tags
addSectionAttributesToTags(tags, segment.attributes);
event.tags = tags;
event.content = segment.content;
return event;
}
/**
* Generate default index content
*/
function generateIndexContent(title: string, segments: ContentSegment[]): string {
return `# ${title}
${segments.length} sections available:
${segments.map((segment, i) => `${i + 1}. ${segment.title}`).join('\n')}`;
}
/**
* Escape regex special characters
*/
function escapeRegex(str: string): string {
return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}
/**
* Generate deterministic d-tag from title
*/
function generateDTag(title: string): string {
return title
.toLowerCase()
.replace(/[^\p{L}\p{N}]/gu, "-")
.replace(/-+/g, "-")
.replace(/^-|-$/g, "") || "untitled";
}
/**
* Add document attributes as Nostr tags
*/
function addDocumentAttributesToTags(
tags: string[][],
attributes: Record<string, string>,
pubkey: string
) {
// Standard metadata
if (attributes.author) tags.push(["author", attributes.author]);
if (attributes.version) tags.push(["version", attributes.version]);
if (attributes.published) tags.push(["published", attributes.published]);
if (attributes.language) tags.push(["language", attributes.language]);
if (attributes.image) tags.push(["image", attributes.image]);
if (attributes.description) tags.push(["summary", attributes.description]);
if (attributes.type) tags.push(["type", attributes.type]);
// Tags
if (attributes.tags) {
attributes.tags.split(",").forEach(tag => tags.push(["t", tag.trim()]));
}
// Add pubkey reference
tags.push(["p", pubkey]);
// Custom attributes
addCustomAttributes(tags, attributes);
}
/**
* Add section attributes as tags
*/
function addSectionAttributesToTags(
tags: string[][],
attributes: Record<string, string>
) {
// Section tags
if (attributes.tags) {
attributes.tags.split(",").forEach(tag => tags.push(["t", tag.trim()]));
}
// Custom attributes
addCustomAttributes(tags, attributes);
}
/**
* Add custom attributes, filtering out system ones
*/
function addCustomAttributes(
tags: string[][],
attributes: Record<string, string>
) {
const systemAttributes = [
"attribute-undefined", "attribute-missing", "appendix-caption",
"appendix-refsig", "caution-caption", "chapter-refsig", "example-caption",
"figure-caption", "important-caption", "last-update-label", "manname-title",
"note-caption", "part-refsig", "preface-title", "section-refsig",
"table-caption", "tip-caption", "toc-title", "untitled-label",
"version-label", "warning-caption", "asciidoctor", "asciidoctor-version",
"safe-mode-name", "backend", "doctype", "basebackend", "filetype",
"outfilesuffix", "stylesdir", "iconsdir", "localdate", "localyear",
"localtime", "localdatetime", "docdate", "docyear", "doctime",
"docdatetime", "doctitle", "embedded", "notitle",
// Already handled above
"author", "version", "published", "language", "image", "description",
"tags", "title", "type"
];
Object.entries(attributes).forEach(([key, value]) => {
if (!systemAttributes.includes(key) && value && typeof value === "string") {
tags.push([key, value]);
}
});
}
/**
* Group segments by level 2 sections for Level 2 parsing
* Combines all nested content into each level 2 section
*/
function groupSegmentsByLevel2(segments: ContentSegment[]): ContentSegment[] {
const level2Groups: ContentSegment[] = [];
// Find all level 2 segments and include their nested content
for (const segment of segments) {
if (segment.level === 2) {
// Find all content that belongs to this level 2 section
const nestedSegments = segments.filter(s =>
s.level > 2 &&
s.startLine > segment.startLine &&
(segments.find(next => next.level <= 2 && next.startLine > segment.startLine)?.startLine || Infinity) > s.startLine
);
// Combine the level 2 content with all nested content
let combinedContent = segment.content;
for (const nested of nestedSegments) {
combinedContent += `\n\n${'='.repeat(nested.level)} ${nested.title}\n${nested.content}`;
}
level2Groups.push({
...segment,
content: combinedContent
});
}
}
return level2Groups;
}
/**
* Build hierarchical segment structure for Level 3+ parsing
*/
function buildSegmentHierarchy(segments: ContentSegment[]): HierarchicalSegment[] {
const hierarchy: HierarchicalSegment[] = [];
// Process level 2 sections
for (const level2Segment of segments.filter(s => s.level === 2)) {
// Determine where the next level <= 2 section begins (Infinity if none)
const nextTopLevelStart = segments.find(
next => next.level <= 2 && next.startLine > level2Segment.startLine
)?.startLine ?? Infinity;
const children = segments.filter(s =>
s.level > 2 &&
s.startLine > level2Segment.startLine &&
s.startLine < nextTopLevelStart
);
hierarchy.push({
...level2Segment,
hasChildren: children.length > 0,
children
});
}
return hierarchy;
}
/**
* Create a 30040 index event for a section with children
*/
function createIndexEventForSection(section: HierarchicalSegment, ndk: NDK): NDKEvent {
const event = new NDKEvent(ndk);
event.kind = 30040;
event.created_at = Math.floor(Date.now() / 1000);
event.pubkey = ndk.activeUser?.pubkey || "preview-placeholder-pubkey";
const dTag = generateDTag(section.title);
const [mTag, MTag] = getMimeTags(30040);
const tags: string[][] = [
["d", dTag],
mTag,
MTag,
["title", section.title]
];
// Add section attributes as tags
addSectionAttributesToTags(tags, section.attributes);
// Add a-tags for each child content section
section.children.forEach(child => {
const childDTag = generateDTag(child.title);
tags.push(["a", `30041:${event.pubkey}:${childDTag}`]);
});
event.tags = tags;
event.content = `${section.content}\n\n${section.children.length} subsections available.`;
return event;
}
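`createIndexEventForSection` above builds child references as a-tags of the form `"30041:<pubkey>:<dTag>"`. A hedged sketch of reading that address format back out (`parseAddress` and `EventAddress` are illustrative helpers, not part of the codebase; note a d-tag may itself contain `:`, so only the first two separators are structural):

```typescript
// Illustrative decoder for the "kind:pubkey:d-tag" a-tag value format.
interface EventAddress {
  kind: number;
  pubkey: string;
  dTag: string;
}

function parseAddress(aTagValue: string): EventAddress | null {
  const [kindStr, pubkey, ...rest] = aTagValue.split(":");
  const kind = Number(kindStr);
  // Reject values without a numeric kind, a pubkey, and a d-tag part.
  if (!Number.isInteger(kind) || !pubkey || rest.length === 0) return null;
  // Rejoin the remainder so d-tags containing ":" survive intact.
  return { kind, pubkey, dTag: rest.join(":") };
}

const addr = parseAddress("30041:test-pubkey-12345:level-3-subsections");
// addr?.kind === 30041, addr?.dTag === "level-3-subsections"
```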

284
tests/unit/publication_tree_processor.test.ts

@@ -0,0 +1,284 @@
/**
* TDD Tests for NKBIP-01 Publication Tree Processor
*
* Tests the iterative parsing function at different hierarchy levels
* using deep_hierarchy_test.adoc to verify NKBIP-01 compliance.
*/
import { describe, it, expect, beforeAll } from 'vitest';
import { readFileSync } from 'fs';
import { parseAsciiDocWithTree, validateParseLevel, getSupportedParseLevels } from '../../src/lib/utils/asciidoc_publication_parser.js';
// Mock NDK for testing
const mockNDK = {
activeUser: {
pubkey: "test-pubkey-12345"
}
} as any;
// Read the test document
const testDocumentPath = "./test_data/AsciidocFiles/deep_hierarchy_test.adoc";
let testContent: string;
try {
testContent = readFileSync(testDocumentPath, 'utf-8');
} catch (error) {
console.error("Failed to read test document:", error);
testContent = `= Deep Hierarchical Document Test
:tags: testing, hierarchy, structure
:author: Test Author
:type: technical
This document tests all 6 levels of AsciiDoc hierarchy to validate our parse level system.
== Level 2: Main Sections
:tags: level2, main
This is a level 2 section that should appear in all parse levels.
=== Level 3: Subsections
:tags: level3, subsection
This is a level 3 section that should appear in parse levels 3-6.
==== Level 4: Sub-subsections
:tags: level4, detailed
This is a level 4 section that should appear in parse levels 4-6.
===== Level 5: Deep Subsections
:tags: level5, deep
This is a level 5 section that should only appear in parse levels 5-6.
====== Level 6: Deepest Level
:tags: level6, deepest
This is a level 6 section that should only appear in parse level 6.
Content at the deepest level of our hierarchy.
== Level 2: Second Main Section
:tags: level2, main, second
A second main section to ensure we have balanced content at the top level.`;
}
describe("NKBIP-01 Publication Tree Processor", () => {
it("should validate parse levels correctly", () => {
// Test valid parse levels
expect(validateParseLevel(2)).toBe(true);
expect(validateParseLevel(3)).toBe(true);
expect(validateParseLevel(5)).toBe(true);
// Test invalid parse levels
expect(validateParseLevel(1)).toBe(false);
expect(validateParseLevel(6)).toBe(false);
expect(validateParseLevel(7)).toBe(false);
expect(validateParseLevel(2.5)).toBe(false);
expect(validateParseLevel(-1)).toBe(false);
// Test supported levels array
const supportedLevels = getSupportedParseLevels();
expect(supportedLevels).toEqual([2, 3, 4, 5]);
});
it("should parse Level 2 with NKBIP-01 minimal structure", async () => {
const result = await parseAsciiDocWithTree(testContent, mockNDK, 2);
// Should be detected as article (has title and sections)
expect(result.metadata.contentType).toBe("article");
expect(result.metadata.parseLevel).toBe(2);
expect(result.metadata.title).toBe("Deep Hierarchical Document Test");
// Should have 1 index event (30040) + 2 content events (30041) for level 2 sections
expect(result.indexEvent).toBeDefined();
expect(result.indexEvent?.kind).toBe(30040);
expect(result.contentEvents.length).toBe(2);
// All content events should be kind 30041
result.contentEvents.forEach(event => {
expect(event.kind).toBe(30041);
});
// Check titles of level 2 sections
const contentTitles = result.contentEvents.map(e =>
e.tags.find((t: string[]) => t[0] === "title")?.[1]
);
expect(contentTitles).toContain("Level 2: Main Sections");
expect(contentTitles).toContain("Level 2: Second Main Section");
// Content should include all nested subsections as AsciiDoc
const firstSectionContent = result.contentEvents[0].content;
expect(firstSectionContent).toBeDefined();
// Should contain level 3, 4, 5 content as nested AsciiDoc markup
expect(firstSectionContent.includes("=== Level 3: Subsections")).toBe(true);
expect(firstSectionContent.includes("==== Level 4: Sub-subsections")).toBe(true);
expect(firstSectionContent.includes("===== Level 5: Deep Subsections")).toBe(true);
});
it("should parse Level 3 with NKBIP-01 intermediate structure", async () => {
const result = await parseAsciiDocWithTree(testContent, mockNDK, 3);
expect(result.metadata.contentType).toBe("article");
expect(result.metadata.parseLevel).toBe(3);
// Should have hierarchical structure
expect(result.indexEvent).toBeDefined();
expect(result.indexEvent?.kind).toBe(30040);
// Should have mix of 30040 (for level 2 sections with children) and 30041 (for content)
const kinds = result.contentEvents.map(e => e.kind);
expect(kinds).toContain(30040); // Level 2 sections with children
expect(kinds).toContain(30041); // Level 3 content sections
// Level 2 sections with children should be 30040 index events
const level2WithChildrenEvents = result.contentEvents.filter(e =>
e.kind === 30040 &&
e.tags.find((t: string[]) => t[0] === "title")?.[1]?.includes("Level 2:")
);
expect(level2WithChildrenEvents.length).toBe(2); // Both level 2 sections have children
// Should have 30041 events for level 3 content
const level3ContentEvents = result.contentEvents.filter(e =>
e.kind === 30041 &&
e.tags.find((t: string[]) => t[0] === "title")?.[1]?.includes("Level 3:")
);
expect(level3ContentEvents.length).toBeGreaterThan(0);
});
it("should parse Level 4 with NKBIP-01 detailed structure", async () => {
const result = await parseAsciiDocWithTree(testContent, mockNDK, 4);
expect(result.metadata.contentType).toBe("article");
expect(result.metadata.parseLevel).toBe(4);
// Should have hierarchical structure with mix of 30040 and 30041 events
expect(result.indexEvent).toBeDefined();
expect(result.indexEvent?.kind).toBe(30040);
const kinds = result.contentEvents.map(e => e.kind);
expect(kinds).toContain(30040); // Level 2 sections with children
expect(kinds).toContain(30041); // Content sections
// Check that we have level 4 content sections
const contentTitles = result.contentEvents.map(e =>
e.tags.find((t: string[]) => t[0] === "title")?.[1]
);
expect(contentTitles).toContain("Level 4: Sub-subsections");
});
it("should parse Level 5 with NKBIP-01 maximum depth", async () => {
const result = await parseAsciiDocWithTree(testContent, mockNDK, 5);
expect(result.metadata.contentType).toBe("article");
expect(result.metadata.parseLevel).toBe(5);
// Should have hierarchical structure
expect(result.indexEvent).toBeDefined();
expect(result.indexEvent?.kind).toBe(30040);
// Should include level 5 sections as content events
const contentTitles = result.contentEvents.map(e =>
e.tags.find((t: string[]) => t[0] === "title")?.[1]
);
expect(contentTitles).toContain("Level 5: Deep Subsections");
});
it("should validate event structure correctly", async () => {
const result = await parseAsciiDocWithTree(testContent, mockNDK, 3);
// Test index event structure
expect(result.indexEvent).toBeDefined();
expect(result.indexEvent?.kind).toBe(30040);
expect(result.indexEvent?.tags).toBeDefined();
// Check required tags
const indexTags = result.indexEvent!.tags;
const dTag = indexTags.find((t: string[]) => t[0] === "d");
const titleTag = indexTags.find((t: string[]) => t[0] === "title");
expect(dTag).toBeDefined();
expect(titleTag).toBeDefined();
expect(titleTag![1]).toBe("Deep Hierarchical Document Test");
// Test content events structure - mix of 30040 and 30041
result.contentEvents.forEach(event => {
expect([30040, 30041]).toContain(event.kind);
expect(event.tags).toBeDefined();
expect(event.content).toBeDefined();
const eventTitleTag = event.tags.find((t: string[]) => t[0] === "title");
expect(eventTitleTag).toBeDefined();
});
});
it("should preserve content as AsciiDoc", async () => {
const result = await parseAsciiDocWithTree(testContent, mockNDK, 2);
// Content should be preserved as original AsciiDoc, not converted to HTML
const firstEvent = result.contentEvents[0];
expect(firstEvent.content).toBeDefined();
// Should contain AsciiDoc markup, not HTML
expect(firstEvent.content.includes("<")).toBe(false);
expect(firstEvent.content.includes("===")).toBe(true);
});
it("should handle attributes correctly", async () => {
const result = await parseAsciiDocWithTree(testContent, mockNDK, 2);
// Document-level attributes should be in index event
expect(result.indexEvent).toBeDefined();
const indexTags = result.indexEvent!.tags;
// Check for document attributes
const authorTag = indexTags.find((t: string[]) => t[0] === "author");
const typeTag = indexTags.find((t: string[]) => t[0] === "type");
const tagsTag = indexTags.find((t: string[]) => t[0] === "t");
expect(authorTag?.[1]).toBe("Test Author");
expect(typeTag?.[1]).toBe("technical");
expect(tagsTag).toBeDefined(); // Should have at least one t-tag
});
it("should handle scattered notes mode", async () => {
// Test with content that has no document title (scattered notes)
const scatteredContent = `== First Note
:tags: note1
Content of first note.
== Second Note
:tags: note2
Content of second note.`;
const result = await parseAsciiDocWithTree(scatteredContent, mockNDK, 2);
expect(result.metadata.contentType).toBe("scattered-notes");
expect(result.indexEvent).toBeNull(); // No index event for scattered notes
expect(result.contentEvents.length).toBe(2);
// All events should be 30041 content events
result.contentEvents.forEach(event => {
expect(event.kind).toBe(30041);
});
});
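The scattered-notes test above implies a detection rule: a document with no level-0 title gets no 30040 index event. A sketch of one plausible heuristic under that assumption (the `detectContentType` helper is illustrative; the actual detection lives in the parser, not shown in this diff):

```typescript
// Assumed heuristic: a source whose first non-empty line is not a document
// title ("= Title") is treated as scattered standalone notes.
function detectContentType(source: string): "article" | "scattered-notes" {
  const firstLine =
    source
      .split("\n")
      .map(line => line.trim())
      .find(line => line.length > 0) ?? "";
  // "= Title" has exactly one "=" before the whitespace; "==" is a section.
  return /^=\s+\S/.test(firstLine) ? "article" : "scattered-notes";
}

// "= Doc Title" -> "article"; "== First Note" -> "scattered-notes"
```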
it("should integrate with PublicationTree structure", async () => {
const result = await parseAsciiDocWithTree(testContent, mockNDK, 2);
// Should have a PublicationTree instance
expect(result.tree).toBeDefined();
// Tree should have methods for event management
expect(typeof result.tree.addEvent).toBe("function");
// Event structure should be populated
expect(result.metadata.eventStructure).toBeDefined();
expect(Array.isArray(result.metadata.eventStructure)).toBe(true);
});
});