11 changed files with 2275 additions and 43 deletions
@ -0,0 +1,187 @@ |
|||||||
|
Reiser4 had *several* ideas that were too radical for Linux in the 2000s, but **would make a lot of sense today in a modern CoW (copy-on-write) filesystem**—especially one designed for immutable or content-addressed data. |
||||||
|
|
||||||
|
Below is a distilled list of the Reiser4 concepts that *could* be successfully revived and integrated into a next-generation CoW filesystem, along with why they now make more sense and how they would fit. |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
# ✅ **1. Item/extent subtypes (structured metadata records)** |
||||||
|
|
||||||
|
Reiser4 had “item types” that stored different structures within B-tree leaves (e.g., stat-data items, directory items, tail items). |
||||||
|
Most filesystems today use coarse-grained extents and metadata blocks—but structured, typed leaf contents provide clear benefits: |
||||||
|
|
||||||
|
### Why it makes sense today: |
||||||
|
|
||||||
|
* CoW filesystems like **APFS**, **Btrfs**, and **ZFS** already have *typed nodes* internally (extent items, dir items). |
||||||
|
* Typed leaf records allow: |
||||||
|
|
||||||
|
* Faster parsing |
||||||
|
* Future expansion of features |
||||||
|
* Better layout for small objects |
||||||
|
* Potential content-addressed leaves |
||||||
|
|
||||||
|
A modern CoW filesystem could revive this idea by allowing different **record kinds** within leaf blocks, with stable, versioned formats. |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
# ✅ **2. Fine-grained small-file optimizations—but integrated with CoW** |
||||||
|
|
||||||
|
Reiser4’s small-file packing was too complicated for mutable trees, but in a CoW filesystem it fits perfectly: |
||||||
|
|
||||||
|
### In CoW: |
||||||
|
|
||||||
|
* Leaves are immutable once written. |
||||||
|
* Small files can be stored **inline** inside a leaf, or as small extents. |
||||||
|
* Deduplication is easier due to immutability. |
||||||
|
* Crash consistency is automatic. |
||||||
|
|
||||||
|
### What makes sense to revive: |
||||||
|
|
||||||
|
* Tail-packing / inline-data for files below a threshold |
||||||
|
* Possibly grouping many tiny files into a single CoW extent tree page |
||||||
|
* Using a “small-files leaf type” with fixed slots |
||||||
|
|
||||||
|
This aligns closely with APFS’s and Btrfs’s inline extents but could go further—safely—because of CoW. |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
# ✅ **3. Semantic plugins *outside the kernel*** |
||||||
|
|
||||||
|
Reiser4’s plugin system failed because it tried to put a framework *inside the kernel*. |
||||||
|
But moving that logic **outside** (as user-space metadata layers or FUSE-like transforms) is realistic today. |
||||||
|
|
||||||
|
### Possible modern implementation: |
||||||
|
|
||||||
|
* A CoW filesystem exposes stable metadata + data primitives. |
||||||
|
* User-space “semantic layers” do: |
||||||
|
|
||||||
|
* per-directory views |
||||||
|
* virtual inodes |
||||||
|
* attribute-driven namespace merges |
||||||
|
* versioned or content-addressed overlays |
||||||
|
|
||||||
|
### Why it makes sense: |
||||||
|
|
||||||
|
* User-space is safer and maintainers accept it. |
||||||
|
* CoW makes such layers more reliable and more composable. |
||||||
|
* Many systems already do this: |
||||||
|
|
||||||
|
* OSTree |
||||||
|
* Git virtual filesystem |
||||||
|
* container overlayfs |
||||||
|
* CephFS metadata layers |
||||||
|
|
||||||
|
The spirit of Reiser4’s semantics CAN live on—just not in-kernel. |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
# ✅ **4. Content-addressable objects + trees (Reiser4-like keys)** |
||||||
|
|
||||||
|
Reiser4 had “keyed items” in a tree, which map closely to modern content-addressable storage strategies. |
||||||
|
|
||||||
|
A modern CoW FS could: |
||||||
|
|
||||||
|
* Store leaf blocks by **hash of contents** |
||||||
|
* Use stable keyed addressing for trees |
||||||
|
* Deduplicate at leaf granularity |
||||||
|
* Provide Git/OSTree-style guarantees natively |
||||||
|
|
||||||
|
This is very powerful for immutable or append-only workloads. |
||||||
|
|
||||||
|
### Why it's feasible now: |
||||||
|
|
||||||
|
* Fast hashing hardware |
||||||
|
* Widespread use of snapshots, clones, dedupe |
||||||
|
* Object-based designs in modern systems (e.g., bcachefs, ZFS) |
||||||
|
|
||||||
|
Reiser4 was ahead of its time here. |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
# ✅ **5. Rich directory structures (hash trees)** |
||||||
|
|
||||||
|
Reiser4’s directory semantics were much more flexible, including: |
||||||
|
|
||||||
|
* Extensible directory entries |
||||||
|
* Small-directory embedding |
||||||
|
* Very fast operations on large directories |
||||||
|
|
||||||
|
Most CoW FSes today use coarse directory structures. |
||||||
|
|
||||||
|
A modern CoW FS could adopt: |
||||||
|
|
||||||
|
* Fixed-format hashed directories for fast lookup |
||||||
|
* Optional richer metadata per entry |
||||||
|
* Inline storage of tiny directories |
||||||
|
|
||||||
|
Essentially, a more flexible but POSIX-compliant version of Reiser4 directories. |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
# ✅ **6. Atomic multi-item updates via “transaction items”** |
||||||
|
|
||||||
|
Reiser4 had advanced concepts for batched updates via a plugin model, which could be simplified into: |
||||||
|
|
||||||
|
* A single CoW commit representing a set of operations |
||||||
|
* Versioned writes to multiple trees |
||||||
|
|
||||||
|
This is similar to what APFS and Btrfs do, but can be made more explicit. |
||||||
|
|
||||||
|
### Why it’s relevant |
||||||
|
|
||||||
|
Modern workloads (containers, datasets, package managers) rely heavily on atomic snapshots—rich commit semantics at the filesystem layer are a big win. |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
# 🧪 **7. Advanced multi-key indexing (“dancing tree” ideas, but simplified)** |
||||||
|
|
||||||
|
Reiser4 used a flexible key scheme for ordering items in the unified tree. |
||||||
|
While we don’t want "dancing trees" again, a **multi-dimensional key tuple** is extremely useful for: |
||||||
|
|
||||||
|
* Querying by filename + offset |
||||||
|
* Efficiently supporting both sparse files and directories |
||||||
|
* Custom ordering schemes without rebalancing everything |
||||||
|
|
||||||
|
A CoW filesystem can use simple B-trees or B+trees with structured keys. |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
# 🚫 Reiser4 ideas that *do NOT* make sense even in a CoW filesystem |
||||||
|
|
||||||
|
To be clear, these should **not** be revived: |
||||||
|
|
||||||
|
* In-kernel plugin architecture |
||||||
|
* Files-as-directories breaking POSIX |
||||||
|
* Dynamic per-file semantic behavior |
||||||
|
* Arbitrary user-defined item types in kernel |
||||||
|
* Complex item-level transformations at write time |
||||||
|
* Dancing-tree hyper-rebalancing policies |
||||||
|
|
||||||
|
These remain too brittle or too unpredictable for a production CoW filesystem. |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
# 🧠 **Summary: What Reiser4 ideas *should* come back in future CoW designs** |
||||||
|
|
||||||
|
### ✔ Good ideas to revive: |
||||||
|
|
||||||
|
1. **Record/item subtypes within leaf nodes** for efficiency |
||||||
|
2. **Advanced small-file packing that works with CoW** |
||||||
|
3. **Semantic “plugins” in user-space only** |
||||||
|
4. **Content-addressable storage for leaves/objects** |
||||||
|
5. **Rich, hashed, fast directories** |
||||||
|
6. **Explicit atomic multi-tree commit semantics** |
||||||
|
7. **Structured multi-field keys inside B-trees** |
||||||
|
|
||||||
|
### ✘ Not worth reviving: |
||||||
|
|
||||||
|
* Kernel plugin system |
||||||
|
* Semantics-heavy namespaces |
||||||
|
* Dancing trees |
||||||
|
* Micro-level tree balancing complexity |
||||||
|
* Deeply nested “item-in-item-in-item” structures |
||||||
|
* Anything that breaks POSIX semantics |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
If you want, I can sketch out a **concept design** for a hypothetical “Reiser5-like CoW filesystem” that incorporates the good parts but avoids the mistakes—something that could realistically compete with APFS, ZFS, and bcachefs today. |
||||||
|
|
||||||
@ -0,0 +1,758 @@ |
|||||||
|
# Reiser4 Optimization Techniques Applied to ORLY |
||||||
|
|
||||||
|
## Executive Summary |
||||||
|
|
||||||
|
This document analyzes how Reiser4's innovative filesystem concepts (as described in `immutable-store-optimizations-gpt5.md`) can be applied to ORLY's two storage systems: |
||||||
|
1. **Badger Event Store** - Immutable Nostr event storage using Badger key-value database |
||||||
|
2. **Blossom Store** - Content-addressed blob storage with filesystem + Badger metadata |
||||||
|
|
||||||
|
ORLY's architecture already embodies several Reiser4 principles due to the immutable nature of Nostr events and content-addressed blobs. This analysis identifies concrete optimization opportunities. |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
## Current Architecture Overview |
||||||
|
|
||||||
|
### Badger Event Store |
||||||
|
|
||||||
|
**Storage Model:** |
||||||
|
- **Primary key**: `evt|<5-byte serial>` → binary event data |
||||||
|
- **Secondary indexes**: Multiple composite keys for queries |
||||||
|
- `eid|<8-byte ID hash>|<5-byte serial>` - ID lookup |
||||||
|
- `kc-|<2-byte kind>|<8-byte timestamp>|<5-byte serial>` - Kind queries |
||||||
|
- `kpc|<2-byte kind>|<8-byte pubkey hash>|<8-byte timestamp>|<5-byte serial>` - Kind+Author |
||||||
|
- `tc-|<1-byte tag key>|<8-byte tag hash>|<8-byte timestamp>|<5-byte serial>` - Tag queries |
||||||
|
- And 7+ more index patterns |
||||||
|
|
||||||
|
**Characteristics:** |
||||||
|
- Events are **immutable** after storage (CoW-friendly) |
||||||
|
- Index keys use **structured, typed prefixes** (3-byte human-readable) |
||||||
|
- Small events (typical: 200-2KB) stored alongside large events |
||||||
|
- Heavy read workload with complex multi-dimensional queries |
||||||
|
- Sequential serial allocation (monotonic counter) |
||||||
|
|
||||||
|
### Blossom Store |
||||||
|
|
||||||
|
**Storage Model:** |
||||||
|
- **Blob data**: Filesystem at `<datadir>/blossom/<sha256hex><extension>` |
||||||
|
- **Metadata**: Badger `blob:meta:<sha256hex>` → JSON metadata |
||||||
|
- **Index**: Badger `blob:index:<pubkeyhex>:<sha256hex>` → marker |
||||||
|
|
||||||
|
**Characteristics:** |
||||||
|
- Content-addressed via SHA256 (inherently deduplicating) |
||||||
|
- Large files (images, videos, PDFs) |
||||||
|
- Simple queries (by hash, by pubkey) |
||||||
|
- Immutable blobs (delete is only operation) |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
## Applicable Reiser4 Concepts |
||||||
|
|
||||||
|
### ✅ 1. Item/Extent Subtypes (Structured Metadata Records) |
||||||
|
|
||||||
|
**Current Implementation:** |
||||||
|
ORLY **already implements** this concept partially: |
||||||
|
- Index keys use 3-byte type prefixes (`evt`, `eid`, `kpc`, etc.) |
||||||
|
- Different key structures for different query patterns |
||||||
|
- Type-safe encoding/decoding via `pkg/database/indexes/types/` |
||||||
|
|
||||||
|
**Enhancement Opportunities:** |
||||||
|
|
||||||
|
#### A. Leaf-Level Event Type Differentiation |
||||||
|
Currently, all events are stored identically regardless of size or kind. Reiser4's approach suggests: |
||||||
|
|
||||||
|
**Small Event Optimization (kinds 0, 1, 3, 7):** |
||||||
|
```go |
||||||
|
// New index type for inline small events |
||||||
|
const SmallEventPrefix = I("sev") // small event, includes data inline |
||||||
|
|
||||||
|
// Structure: prefix|kind|pubkey_hash|timestamp|serial|inline_event_data |
||||||
|
// Avoids second lookup to evt|serial key |
||||||
|
``` |
||||||
|
|
||||||
|
**Benefits:** |
||||||
|
- Single index read retrieves complete event for small posts |
||||||
|
- Reduces total database operations by ~40% for timeline queries |
||||||
|
- Better cache locality |
||||||
|
|
||||||
|
**Trade-offs:** |
||||||
|
- Increased index size (acceptable for Badger's LSM tree) |
||||||
|
- Added complexity in save/query paths |
||||||
|
|
||||||
|
#### B. Event Kind-Specific Storage Layouts |
||||||
|
|
||||||
|
Different event kinds have different access patterns: |
||||||
|
|
||||||
|
```go |
||||||
|
// Metadata events (kind 0, 3): Replaceable, frequent full-scan queries |
||||||
|
type ReplaceableEventLeaf struct { |
||||||
|
Prefix [3]byte // "rev" |
||||||
|
Pubkey [8]byte // hash |
||||||
|
Kind uint16 |
||||||
|
Timestamp uint64 |
||||||
|
Serial uint40 |
||||||
|
EventData []byte // inline for small metadata |
||||||
|
} |
||||||
|
|
||||||
|
// Ephemeral-range events (20000-29999): Should never be stored |
||||||
|
// Already implemented correctly (rejected in save-event.go:116-119) |
||||||
|
|
||||||
|
// Parameterized replaceable (30000-39999): Keyed by 'd' tag |
||||||
|
type AddressableEventLeaf struct { |
||||||
|
Prefix [3]byte // "aev" |
||||||
|
Pubkey [8]byte |
||||||
|
Kind uint16 |
||||||
|
DTagHash [8]byte // hash of 'd' tag value |
||||||
|
Timestamp uint64 |
||||||
|
Serial uint40 |
||||||
|
} |
||||||
|
``` |
||||||
|
|
||||||
|
**Implementation in ORLY:** |
||||||
|
1. Add new index types to `pkg/database/indexes/keys.go` |
||||||
|
2. Modify `save-event.go` to choose storage strategy based on kind |
||||||
|
3. Update query builders to leverage kind-specific indexes |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
### ✅ 2. Fine-Grained Small-File Optimizations |
||||||
|
|
||||||
|
**Current State:** |
||||||
|
- Small events (~200-500 bytes) stored with same overhead as large events |
||||||
|
- Each query requires: index scan → serial extraction → event fetch |
||||||
|
- No tail-packing or inline storage |
||||||
|
|
||||||
|
**Reiser4 Approach:** |
||||||
|
Pack small files into leaf nodes, avoiding separate extent allocation. |
||||||
|
|
||||||
|
**ORLY Application:** |
||||||
|
|
||||||
|
#### A. Inline Event Storage in Indexes |
||||||
|
|
||||||
|
For events < 1KB (majority of Nostr events), inline the event data: |
||||||
|
|
||||||
|
```go |
||||||
|
// Current: FullIdPubkey index (53 bytes) |
||||||
|
// 3 prefix|5 serial|32 ID|8 pubkey hash|8 timestamp |
||||||
|
|
||||||
|
// Enhanced: FullIdPubkeyInline (variable size) |
||||||
|
// 3 prefix|5 serial|32 ID|8 pubkey hash|8 timestamp|2 size|<event_data> |
||||||
|
``` |
||||||
|
|
||||||
|
**Code Location:** `pkg/database/indexes/keys.go:220-239` |
||||||
|
|
||||||
|
**Implementation Strategy:** |
||||||
|
```go |
||||||
|
func (d *D) SaveEvent(c context.Context, ev *event.E) (replaced bool, err error) { |
||||||
|
// ... existing validation ... |
||||||
|
|
||||||
|
// Serialize event once |
||||||
|
eventData := new(bytes.Buffer) |
||||||
|
ev.MarshalBinary(eventData) |
||||||
|
eventBytes := eventData.Bytes() |
||||||
|
|
||||||
|
// Choose storage strategy |
||||||
|
if len(eventBytes) < 1024 { |
||||||
|
// Inline storage path |
||||||
|
idxs = getInlineIndexes(ev, serial, eventBytes) |
||||||
|
} else { |
||||||
|
// Traditional path: separate evt|serial key |
||||||
|
idxs = GetIndexesForEvent(ev, serial) |
||||||
|
// Also save to evt|serial |
||||||
|
} |
||||||
|
} |
||||||
|
``` |
||||||
|
|
||||||
|
**Benefits:** |
||||||
|
- ~60% reduction in read operations for timeline queries |
||||||
|
- Better cache hit rates |
||||||
|
- Reduced Badger LSM compaction overhead |
||||||
|
|
||||||
|
#### B. Batch Small Event Storage |
||||||
|
|
||||||
|
Group multiple tiny events (e.g., reactions, zaps) into consolidated pages: |
||||||
|
|
||||||
|
```go |
||||||
|
// New storage type for reactions (kind 7) |
||||||
|
const ReactionBatchPrefix = I("rbh") // reaction batch |
||||||
|
|
||||||
|
// Structure: prefix|target_event_hash|timestamp_bucket → []reaction_events |
||||||
|
// All reactions to same event stored together |
||||||
|
``` |
||||||
|
|
||||||
|
**Implementation Location:** `pkg/database/save-event.go:106-225` |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
### ✅ 3. Content-Addressable Objects + Trees |
||||||
|
|
||||||
|
**Current State:** |
||||||
|
Blossom store is **already content-addressed** via SHA256: |
||||||
|
```go |
||||||
|
// storage.go:47-51 |
||||||
|
func (s *Storage) getBlobPath(sha256Hex string, ext string) string { |
||||||
|
filename := sha256Hex + ext |
||||||
|
return filepath.Join(s.blobDir, filename) |
||||||
|
} |
||||||
|
``` |
||||||
|
|
||||||
|
**Enhancement Opportunities:** |
||||||
|
|
||||||
|
#### A. Content-Addressable Event Storage |
||||||
|
|
||||||
|
Events are already identified by SHA256(serialized event), but not stored that way: |
||||||
|
|
||||||
|
```go |
||||||
|
// Current: evt|<serial> → event_data |
||||||
|
// Proposed: evt|<sha256_32bytes> → event_data |
||||||
|
|
||||||
|
// Benefits: |
||||||
|
// - Natural deduplication (duplicate events never stored) |
||||||
|
// - Alignment with Nostr event ID semantics |
||||||
|
// - Easier replication/verification |
||||||
|
``` |
||||||
|
|
||||||
|
**Trade-off Analysis:** |
||||||
|
- **Pro**: Perfect deduplication, cryptographic verification |
||||||
|
- **Con**: Lose sequential serial benefits (range scans) |
||||||
|
- **Solution**: Hybrid approach - keep serials for ordering, add content-addressed lookup |
||||||
|
|
||||||
|
```go |
||||||
|
// Keep both: |
||||||
|
// evt|<serial> → event_data (primary, for range scans) |
||||||
|
// evh|<sha256_hash> → serial (secondary, for dedup + verification) |
||||||
|
``` |
||||||
|
|
||||||
|
#### B. Leaf-Level Blob Deduplication |
||||||
|
|
||||||
|
Currently, blob deduplication happens at file level. Reiser4 suggests **sub-file deduplication**: |
||||||
|
|
||||||
|
```go |
||||||
|
// For large blobs, store chunks content-addressed: |
||||||
|
// blob:chunk:<sha256> → chunk_data (16KB-64KB chunks) |
||||||
|
// blob:map:<blob_sha256> → [chunk_sha256, chunk_sha256, ...] |
||||||
|
``` |
||||||
|
|
||||||
|
**Implementation in `pkg/blossom/storage.go`:** |
||||||
|
```go |
||||||
|
func (s *Storage) SaveBlobChunked(sha256Hash []byte, data []byte, ...) error { |
||||||
|
const chunkSize = 64 * 1024 // 64KB chunks |
||||||
|
|
||||||
|
if len(data) > chunkSize*4 { // Only chunk large files |
||||||
|
chunks := splitIntoChunks(data, chunkSize) |
||||||
|
chunkHashes := make([]string, len(chunks)) |
||||||
|
|
||||||
|
for i, chunk := range chunks { |
||||||
|
chunkHash := sha256.Sum256(chunk) |
||||||
|
// Store chunk (naturally deduplicated) |
||||||
|
s.saveChunk(chunkHash[:], chunk) |
||||||
|
chunkHashes[i] = hex.Enc(chunkHash[:]) |
||||||
|
} |
||||||
|
|
||||||
|
// Store chunk map |
||||||
|
s.saveBlobMap(sha256Hash, chunkHashes) |
||||||
|
} else { |
||||||
|
// Small blob, store directly |
||||||
|
s.saveBlobDirect(sha256Hash, data) |
||||||
|
} |
||||||
|
} |
||||||
|
``` |
||||||
|
|
||||||
|
**Benefits:** |
||||||
|
- Deduplication across partial file matches (e.g., video edits) |
||||||
|
- Incremental uploads (resume support) |
||||||
|
- Network-efficient replication |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
### ✅ 4. Rich Directory Structures (Hash Trees) |
||||||
|
|
||||||
|
**Current State:** |
||||||
|
Badger uses LSM tree with prefix iteration: |
||||||
|
```go |
||||||
|
// List blobs by pubkey (storage.go:259-330) |
||||||
|
opts := badger.DefaultIteratorOptions |
||||||
|
opts.Prefix = []byte(prefixBlobIndex + pubkeyHex + ":") |
||||||
|
it := txn.NewIterator(opts) |
||||||
|
``` |
||||||
|
|
||||||
|
**Enhancement: B-tree Directory Indices** |
||||||
|
|
||||||
|
For frequently-queried relationships (author's events, tag lookups), use hash-indexed directories: |
||||||
|
|
||||||
|
```go |
||||||
|
// Current: Linear scan of kpc|<kind>|<pubkey>|... keys |
||||||
|
// Enhanced: Hash directory structure |
||||||
|
|
||||||
|
type AuthorEventDirectory struct { |
||||||
|
PubkeyHash [8]byte |
||||||
|
Buckets [256]*EventBucket // Hash table in single key |
||||||
|
} |
||||||
|
|
||||||
|
type EventBucket struct { |
||||||
|
Count uint16 |
||||||
|
Serials []uint40 // Up to N serials, then overflow |
||||||
|
} |
||||||
|
|
||||||
|
// Single read gets author's recent events |
||||||
|
// Key: aed|<pubkey_hash> → directory structure |
||||||
|
``` |
||||||
|
|
||||||
|
**Implementation Location:** `pkg/database/query-for-authors.go` |
||||||
|
|
||||||
|
**Benefits:** |
||||||
|
- O(1) author lookup instead of O(log N) index scan |
||||||
|
- Efficient "author's latest N events" queries |
||||||
|
- Reduced LSM compaction overhead |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
### ✅ 5. Atomic Multi-Item Updates via Transaction Items |
||||||
|
|
||||||
|
**Current Implementation:** |
||||||
|
Already well-implemented via Badger transactions: |
||||||
|
|
||||||
|
```go |
||||||
|
// save-event.go:181-211 |
||||||
|
err = d.Update(func(txn *badger.Txn) (err error) { |
||||||
|
// Save all indexes + event in single atomic write |
||||||
|
for _, key := range idxs { |
||||||
|
if err = txn.Set(key, nil); chk.E(err) { |
||||||
|
return |
||||||
|
} |
||||||
|
} |
||||||
|
if err = txn.Set(kb, vb); chk.E(err) { |
||||||
|
return |
||||||
|
} |
||||||
|
return |
||||||
|
}) |
||||||
|
``` |
||||||
|
|
||||||
|
**Enhancement: Explicit Commit Metadata** |
||||||
|
|
||||||
|
Add transaction metadata for replication and debugging: |
||||||
|
|
||||||
|
```go |
||||||
|
type TransactionCommit struct { |
||||||
|
TxnID uint64 // Monotonic transaction ID |
||||||
|
Timestamp time.Time |
||||||
|
Operations []Operation |
||||||
|
Checksum [32]byte |
||||||
|
} |
||||||
|
|
||||||
|
type Operation struct { |
||||||
|
Type OpType // SaveEvent, DeleteEvent, SaveBlob |
||||||
|
Keys [][]byte |
||||||
|
Serial uint64 // For events |
||||||
|
} |
||||||
|
|
||||||
|
// Store: txn|<txnid> → commit_metadata |
||||||
|
// Enables: |
||||||
|
// - Transaction log for replication |
||||||
|
// - Snapshot at any transaction ID |
||||||
|
// - Debugging and audit trails |
||||||
|
``` |
||||||
|
|
||||||
|
**Implementation:** New file `pkg/database/transaction-log.go` |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
### ✅ 6. Advanced Multi-Key Indexing |
||||||
|
|
||||||
|
**Current Implementation:** |
||||||
|
ORLY already uses **multi-dimensional composite keys**: |
||||||
|
|
||||||
|
```go |
||||||
|
// TagKindPubkey index (pkg/database/indexes/keys.go:392-417) |
||||||
|
// 3 prefix|1 key letter|8 value hash|2 kind|8 pubkey hash|8 timestamp|5 serial |
||||||
|
``` |
||||||
|
|
||||||
|
This is exactly Reiser4's "multi-key indexing" concept. |
||||||
|
|
||||||
|
**Enhancement: Flexible Key Ordering** |
||||||
|
|
||||||
|
Allow query planner to choose optimal index based on filter selectivity: |
||||||
|
|
||||||
|
```go |
||||||
|
// Current: Fixed key order (kind → pubkey → timestamp) |
||||||
|
// Enhanced: Multiple orderings for same logical index |
||||||
|
|
||||||
|
const ( |
||||||
|
// Order 1: Kind-first (good for rare kinds) |
||||||
|
TagKindPubkeyPrefix = I("tkp") |
||||||
|
|
||||||
|
// Order 2: Pubkey-first (good for author queries) |
||||||
|
TagPubkeyKindPrefix = I("tpk") |
||||||
|
|
||||||
|
// Order 3: Tag-first (good for hashtag queries) |
||||||
|
TagFirstPrefix = I("tfk") |
||||||
|
) |
||||||
|
|
||||||
|
// Query planner selects based on filter: |
||||||
|
func selectBestIndex(f *filter.F) IndexType { |
||||||
|
if f.Kinds != nil && len(*f.Kinds) < 5 { |
||||||
|
return TagKindPubkeyPrefix // Kind is selective |
||||||
|
} |
||||||
|
if f.Authors != nil && len(*f.Authors) < 3 { |
||||||
|
return TagPubkeyKindPrefix // Author is selective |
||||||
|
} |
||||||
|
return TagFirstPrefix // Tag is selective |
||||||
|
} |
||||||
|
``` |
||||||
|
|
||||||
|
**Implementation Location:** `pkg/database/get-indexes-from-filter.go` |
||||||
|
|
||||||
|
**Trade-off:** |
||||||
|
- **Cost**: 2-3x index storage |
||||||
|
- **Benefit**: 10-100x faster selective queries |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
## Reiser4 Concepts NOT Applicable |
||||||
|
|
||||||
|
### ❌ 1. In-Kernel Plugin Architecture |
||||||
|
ORLY is user-space application. Not relevant. |
||||||
|
|
||||||
|
### ❌ 2. Files-as-Directories |
||||||
|
Nostr events are not hierarchical. Not applicable. |
||||||
|
|
||||||
|
### ❌ 3. Dancing Trees / Hyper-Rebalancing |
||||||
|
Badger LSM tree handles balancing. Don't reimplement. |
||||||
|
|
||||||
|
### ❌ 4. Semantic Plugins |
||||||
|
Event validation is policy-driven (see `pkg/policy/`), already well-designed. |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
## Priority Implementation Roadmap |
||||||
|
|
||||||
|
### Phase 1: Quick Wins (Low Risk, High Impact) |
||||||
|
|
||||||
|
**1. Inline Small Event Storage** (2-3 days) |
||||||
|
- **File**: `pkg/database/save-event.go`, `pkg/database/indexes/keys.go` |
||||||
|
- **Impact**: 40% fewer database reads for timeline queries |
||||||
|
- **Risk**: Low - fallback to current path if inline fails |
||||||
|
|
||||||
|
**2. Content-Addressed Deduplication** (1 day) |
||||||
|
- **File**: `pkg/database/save-event.go:122-126` |
||||||
|
- **Change**: Check content hash before serial allocation |
||||||
|
- **Impact**: Prevent duplicate event storage |
||||||
|
- **Risk**: None - pure optimization |
||||||
|
|
||||||
|
**3. Author Event Directory Index** (3-4 days) |
||||||
|
- **File**: New `pkg/database/author-directory.go` |
||||||
|
- **Impact**: 10x faster "author's events" queries |
||||||
|
- **Risk**: Low - supplementary index |
||||||
|
|
||||||
|
### Phase 2: Medium-Term Enhancements (Moderate Risk) |
||||||
|
|
||||||
|
**4. Kind-Specific Storage Layouts** (1-2 weeks) |
||||||
|
- **Files**: Multiple query builders, save-event.go |
||||||
|
- **Impact**: 30% storage reduction, faster kind queries |
||||||
|
- **Risk**: Medium - requires migration path |
||||||
|
|
||||||
|
**5. Blob Chunk Storage** (1 week) |
||||||
|
- **File**: `pkg/blossom/storage.go` |
||||||
|
- **Impact**: Deduplication for large media, resume uploads |
||||||
|
- **Risk**: Medium - backward compatibility needed |
||||||
|
|
||||||
|
### Phase 3: Long-Term Optimizations (High Value, Complex) |
||||||
|
|
||||||
|
**6. Transaction Log System** (2-3 weeks) |
||||||
|
- **Files**: New `pkg/database/transaction-log.go`, replication updates |
||||||
|
- **Impact**: Enables efficient replication, point-in-time recovery |
||||||
|
- **Risk**: High - core architecture change |
||||||
|
|
||||||
|
**7. Multi-Ordered Indexes** (2-3 weeks) |
||||||
|
- **Files**: Query planner, multiple index builders |
||||||
|
- **Impact**: 10-100x faster selective queries |
||||||
|
- **Risk**: High - 2-3x storage increase, complex query planner |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
## Performance Impact Estimates |
||||||
|
|
||||||
|
Based on typical ORLY workload (personal relay, ~100K events, ~50GB blobs): |
||||||
|
|
||||||
|
| Optimization | Read Latency | Write Latency | Storage | Complexity | |
||||||
|
|-------------|--------------|---------------|---------|------------| |
||||||
|
| Inline Small Events | -40% | +5% | +15% | Low | |
||||||
|
| Content-Addressed Dedup | No change | -2% | -10% | Low | |
||||||
|
| Author Directories | -90% (author queries) | +3% | +5% | Low | |
||||||
|
| Kind-Specific Layouts | -30% | +10% | -25% | Medium | |
||||||
|
| Blob Chunking | -50% (partial matches) | +15% | -20% | Medium | |
||||||
|
| Transaction Log | +5% | +10% | +8% | High | |
||||||
|
| Multi-Ordered Indexes | -80% (selective) | +20% | +150% | High | |
||||||
|
|
||||||
|
**Recommended First Steps:** |
||||||
|
1. Inline small events (biggest win/effort ratio) |
||||||
|
2. Content-addressed dedup (zero-risk improvement) |
||||||
|
3. Author directories (solves common query pattern) |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
## Code Examples |
||||||
|
|
||||||
|
### Example 1: Inline Small Event Storage |
||||||
|
|
||||||
|
**File**: `pkg/database/indexes/keys.go` (add after line 239) |
||||||
|
|
||||||
|
```go |
||||||
|
// FullIdPubkeyInline stores small events inline to avoid second lookup |
||||||
|
// |
||||||
|
// 3 prefix|5 serial|32 ID|8 pubkey hash|8 timestamp|2 size|<event_data> |
||||||
|
var FullIdPubkeyInline = next() |
||||||
|
|
||||||
|
func FullIdPubkeyInlineVars() ( |
||||||
|
ser *types.Uint40, fid *types.Id, p *types.PubHash, ca *types.Uint64, |
||||||
|
size *types.Uint16, data []byte, |
||||||
|
) { |
||||||
|
return new(types.Uint40), new(types.Id), new(types.PubHash), |
||||||
|
new(types.Uint64), new(types.Uint16), nil |
||||||
|
} |
||||||
|
|
||||||
|
func FullIdPubkeyInlineEnc( |
||||||
|
ser *types.Uint40, fid *types.Id, p *types.PubHash, ca *types.Uint64, |
||||||
|
size *types.Uint16, data []byte, |
||||||
|
) (enc *T) { |
||||||
|
// Custom encoder that appends data after size |
||||||
|
encoders := []codec.I{ |
||||||
|
NewPrefix(FullIdPubkeyInline), ser, fid, p, ca, size, |
||||||
|
} |
||||||
|
return &T{ |
||||||
|
Encs: encoders, |
||||||
|
Data: data, // Raw bytes appended after structured fields |
||||||
|
} |
||||||
|
} |
||||||
|
``` |
||||||
|
|
||||||
|
**File**: `pkg/database/save-event.go` (modify SaveEvent function) |
||||||
|
|
||||||
|
```go |
||||||
|
// Around line 175, before transaction |
||||||
|
eventData := new(bytes.Buffer) |
||||||
|
ev.MarshalBinary(eventData) |
||||||
|
eventBytes := eventData.Bytes() |
||||||
|
|
||||||
|
const inlineThreshold = 1024 // 1KB |
||||||
|
|
||||||
|
var idxs [][]byte |
||||||
|
if len(eventBytes) < inlineThreshold { |
||||||
|
// Use inline storage |
||||||
|
idxs, err = GetInlineIndexesForEvent(ev, serial, eventBytes) |
||||||
|
} else { |
||||||
|
// Traditional separate storage |
||||||
|
idxs, err = GetIndexesForEvent(ev, serial) |
||||||
|
} |
||||||
|
|
||||||
|
// ... rest of transaction |
||||||
|
``` |
||||||
|
|
||||||
|
### Example 2: Blob Chunking |
||||||
|
|
||||||
|
**File**: `pkg/blossom/chunked-storage.go` (new file) |
||||||
|
|
||||||
|
```go |
||||||
|
package blossom |
||||||
|
|
||||||
|
import ( |
||||||
|
"encoding/json" |
||||||
|
"github.com/minio/sha256-simd" |
||||||
|
"next.orly.dev/pkg/encoders/hex" |
||||||
|
) |
||||||
|
|
||||||
|
const ( |
||||||
|
chunkSize = 64 * 1024 // 64KB |
||||||
|
chunkThreshold = 256 * 1024 // Only chunk files > 256KB |
||||||
|
|
||||||
|
prefixChunk = "blob:chunk:" // chunk_hash → chunk_data |
||||||
|
prefixChunkMap = "blob:map:" // blob_hash → chunk_list |
||||||
|
) |
||||||
|
|
||||||
|
type ChunkMap struct { |
||||||
|
ChunkHashes []string `json:"chunks"` |
||||||
|
TotalSize int64 `json:"size"` |
||||||
|
} |
||||||
|
|
||||||
|
func (s *Storage) SaveBlobChunked( |
||||||
|
sha256Hash []byte, data []byte, pubkey []byte, |
||||||
|
mimeType string, extension string, |
||||||
|
) error { |
||||||
|
sha256Hex := hex.Enc(sha256Hash) |
||||||
|
|
||||||
|
if len(data) < chunkThreshold { |
||||||
|
// Small file, use direct storage |
||||||
|
return s.SaveBlob(sha256Hash, data, pubkey, mimeType, extension) |
||||||
|
} |
||||||
|
|
||||||
|
// Split into chunks |
||||||
|
chunks := make([][]byte, 0, (len(data)+chunkSize-1)/chunkSize) |
||||||
|
for i := 0; i < len(data); i += chunkSize { |
||||||
|
end := i + chunkSize |
||||||
|
if end > len(data) { |
||||||
|
end = len(data) |
||||||
|
} |
||||||
|
chunks = append(chunks, data[i:end]) |
||||||
|
} |
||||||
|
|
||||||
|
// Store chunks (naturally deduplicated) |
||||||
|
chunkHashes := make([]string, len(chunks)) |
||||||
|
for i, chunk := range chunks { |
||||||
|
chunkHash := sha256.Sum256(chunk) |
||||||
|
chunkHashes[i] = hex.Enc(chunkHash[:]) |
||||||
|
|
||||||
|
// Only write chunk if not already present |
||||||
|
chunkKey := prefixChunk + chunkHashes[i] |
||||||
|
exists, _ := s.hasChunk(chunkKey) |
||||||
|
if !exists { |
||||||
|
s.db.Update(func(txn *badger.Txn) error { |
||||||
|
return txn.Set([]byte(chunkKey), chunk) |
||||||
|
}) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// Store chunk map |
||||||
|
chunkMap := &ChunkMap{ |
||||||
|
ChunkHashes: chunkHashes, |
||||||
|
TotalSize: int64(len(data)), |
||||||
|
} |
||||||
|
mapData, _ := json.Marshal(chunkMap) |
||||||
|
mapKey := prefixChunkMap + sha256Hex |
||||||
|
|
||||||
|
s.db.Update(func(txn *badger.Txn) error { |
||||||
|
return txn.Set([]byte(mapKey), mapData) |
||||||
|
}) |
||||||
|
|
||||||
|
// Store metadata as usual |
||||||
|
metadata := NewBlobMetadata(pubkey, mimeType, int64(len(data))) |
||||||
|
metadata.Extension = extension |
||||||
|
metaData, _ := metadata.Serialize() |
||||||
|
metaKey := prefixBlobMeta + sha256Hex |
||||||
|
|
||||||
|
s.db.Update(func(txn *badger.Txn) error { |
||||||
|
return txn.Set([]byte(metaKey), metaData) |
||||||
|
}) |
||||||
|
|
||||||
|
return nil |
||||||
|
} |
||||||
|
|
||||||
|
func (s *Storage) GetBlobChunked(sha256Hash []byte) ([]byte, error) { |
||||||
|
sha256Hex := hex.Enc(sha256Hash) |
||||||
|
mapKey := prefixChunkMap + sha256Hex |
||||||
|
|
||||||
|
// Check if chunked |
||||||
|
var chunkMap *ChunkMap |
||||||
|
err := s.db.View(func(txn *badger.Txn) error { |
||||||
|
item, err := txn.Get([]byte(mapKey)) |
||||||
|
if err == badger.ErrKeyNotFound { |
||||||
|
return nil // Not chunked, fall back to direct |
||||||
|
} |
||||||
|
if err != nil { |
||||||
|
return err |
||||||
|
} |
||||||
|
return item.Value(func(val []byte) error { |
||||||
|
return json.Unmarshal(val, &chunkMap) |
||||||
|
}) |
||||||
|
}) |
||||||
|
|
||||||
|
if err != nil || chunkMap == nil { |
||||||
|
// Fall back to direct storage |
||||||
|
data, _, err := s.GetBlob(sha256Hash) |
||||||
|
return data, err |
||||||
|
} |
||||||
|
|
||||||
|
// Reassemble from chunks |
||||||
|
result := make([]byte, 0, chunkMap.TotalSize) |
||||||
|
for _, chunkHash := range chunkMap.ChunkHashes { |
||||||
|
chunkKey := prefixChunk + chunkHash |
||||||
|
var chunk []byte |
||||||
|
s.db.View(func(txn *badger.Txn) error { |
||||||
|
item, err := txn.Get([]byte(chunkKey)) |
||||||
|
if err != nil { |
||||||
|
return err |
||||||
|
} |
||||||
|
chunk, err = item.ValueCopy(nil) |
||||||
|
return err |
||||||
|
}) |
||||||
|
result = append(result, chunk...) |
||||||
|
} |
||||||
|
|
||||||
|
return result, nil |
||||||
|
} |
||||||
|
``` |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
## Testing Strategy |
||||||
|
|
||||||
|
### Unit Tests |
||||||
|
Each optimization should include: |
||||||
|
1. **Correctness tests**: Verify identical behavior to current implementation |
||||||
|
2. **Performance benchmarks**: Measure read/write latency improvements |
||||||
|
3. **Storage tests**: Verify space savings |
||||||
|
|
||||||
|
### Integration Tests |
||||||
|
1. **Migration tests**: Ensure backward compatibility |
||||||
|
2. **Load tests**: Simulate relay workload |
||||||
|
3. **Replication tests**: Verify transaction log correctness |
||||||
|
|
||||||
|
### Example Benchmark (for inline storage): |
||||||
|
|
||||||
|
```go |
||||||
|
// pkg/database/save-event_test.go |
||||||
|
|
||||||
|
func BenchmarkSaveEventInline(b *testing.B) { |
||||||
|
// Small event (typical note) |
||||||
|
ev := &event.E{ |
||||||
|
Kind: 1, |
||||||
|
CreatedAt: uint64(time.Now().Unix()), |
||||||
|
Content: "Hello Nostr world!", |
||||||
|
// ... rest of event |
||||||
|
} |
||||||
|
|
||||||
|
b.ResetTimer() |
||||||
|
for i := 0; i < b.N; i++ { |
||||||
|
db.SaveEvent(ctx, ev) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
func BenchmarkQueryEventsInline(b *testing.B) { |
||||||
|
// Populate with 10K small events |
||||||
|
// ... |
||||||
|
|
||||||
|
f := &filter.F{ |
||||||
|
Authors: tag.NewFromBytesSlice(testPubkey), |
||||||
|
Limit: ptrInt(20), |
||||||
|
} |
||||||
|
|
||||||
|
b.ResetTimer() |
||||||
|
for i := 0; i < b.N; i++ { |
||||||
|
events, _ := db.QueryEvents(ctx, f) |
||||||
|
if len(events) != 20 { |
||||||
|
b.Fatal("wrong count") |
||||||
|
} |
||||||
|
} |
||||||
|
} |
||||||
|
``` |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
## Conclusion |
||||||
|
|
||||||
|
ORLY's immutable event architecture makes it an **ideal candidate** for Reiser4-inspired optimizations. The top recommendations are: |
||||||
|
|
||||||
|
1. **Inline small event storage** - Largest performance gain for minimal complexity |
||||||
|
2. **Content-addressed deduplication** - Zero-risk storage savings |
||||||
|
3. **Author event directories** - Solves common query bottleneck |
||||||
|
|
||||||
|
These optimizations align with Nostr's content-addressed, immutable semantics and can be implemented incrementally without breaking existing functionality. |
||||||
|
|
||||||
|
The analysis shows that ORLY is already philosophically aligned with Reiser4's best ideas (typed metadata, multi-dimensional indexing, atomic transactions) while avoiding its failed experiments (kernel plugins, semantic namespaces). Enhancing the existing architecture with fine-grained storage optimizations and content-addressing will yield significant performance and efficiency improvements. |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
## References |
||||||
|
|
||||||
|
- Original document: `docs/immutable-store-optimizations-gpt5.md` |
||||||
|
- ORLY codebase: `pkg/database/`, `pkg/blossom/` |
||||||
|
- Badger documentation: https://dgraph.io/docs/badger/ |
||||||
|
- Nostr protocol: https://github.com/nostr-protocol/nips |
||||||
@ -0,0 +1,279 @@ |
|||||||
|
package database |
||||||
|
|
||||||
|
import ( |
||||||
|
"context" |
||||||
|
"os" |
||||||
|
"testing" |
||||||
|
|
||||||
|
"github.com/stretchr/testify/assert" |
||||||
|
"github.com/stretchr/testify/require" |
||||||
|
"next.orly.dev/pkg/encoders/event" |
||||||
|
"next.orly.dev/pkg/encoders/kind" |
||||||
|
"next.orly.dev/pkg/encoders/tag" |
||||||
|
"next.orly.dev/pkg/encoders/timestamp" |
||||||
|
"next.orly.dev/pkg/interfaces/signer/p8k" |
||||||
|
) |
||||||
|
|
||||||
|
func TestDualStorageForReplaceableEvents(t *testing.T) { |
||||||
|
// Create a temporary directory for the database
|
||||||
|
tempDir, err := os.MkdirTemp("", "test-dual-db-*") |
||||||
|
require.NoError(t, err) |
||||||
|
defer os.RemoveAll(tempDir) |
||||||
|
|
||||||
|
// Create a context and cancel function for the database
|
||||||
|
ctx, cancel := context.WithCancel(context.Background()) |
||||||
|
defer cancel() |
||||||
|
|
||||||
|
// Initialize the database
|
||||||
|
db, err := New(ctx, cancel, tempDir, "info") |
||||||
|
require.NoError(t, err) |
||||||
|
defer db.Close() |
||||||
|
|
||||||
|
// Create a signing key
|
||||||
|
sign := p8k.MustNew() |
||||||
|
require.NoError(t, sign.Generate()) |
||||||
|
|
||||||
|
t.Run("SmallReplaceableEvent", func(t *testing.T) { |
||||||
|
// Create a small replaceable event (kind 0 - profile metadata)
|
||||||
|
ev := event.New() |
||||||
|
ev.Pubkey = sign.Pub() |
||||||
|
ev.CreatedAt = timestamp.Now().V |
||||||
|
ev.Kind = kind.ProfileMetadata.K |
||||||
|
ev.Tags = tag.NewS() |
||||||
|
ev.Content = []byte(`{"name":"Alice","about":"Test user"}`) |
||||||
|
|
||||||
|
require.NoError(t, ev.Sign(sign)) |
||||||
|
|
||||||
|
// Save the event
|
||||||
|
replaced, err := db.SaveEvent(ctx, ev) |
||||||
|
require.NoError(t, err) |
||||||
|
assert.False(t, replaced) |
||||||
|
|
||||||
|
// Fetch by serial - should work via sev key
|
||||||
|
ser, err := db.GetSerialById(ev.ID) |
||||||
|
require.NoError(t, err) |
||||||
|
require.NotNil(t, ser) |
||||||
|
|
||||||
|
fetched, err := db.FetchEventBySerial(ser) |
||||||
|
require.NoError(t, err) |
||||||
|
require.NotNil(t, fetched) |
||||||
|
|
||||||
|
// Verify event contents
|
||||||
|
assert.Equal(t, ev.ID, fetched.ID) |
||||||
|
assert.Equal(t, ev.Pubkey, fetched.Pubkey) |
||||||
|
assert.Equal(t, ev.Kind, fetched.Kind) |
||||||
|
assert.Equal(t, ev.Content, fetched.Content) |
||||||
|
}) |
||||||
|
|
||||||
|
t.Run("LargeReplaceableEvent", func(t *testing.T) { |
||||||
|
// Create a large replaceable event (> 384 bytes)
|
||||||
|
largeContent := make([]byte, 500) |
||||||
|
for i := range largeContent { |
||||||
|
largeContent[i] = 'x' |
||||||
|
} |
||||||
|
|
||||||
|
ev := event.New() |
||||||
|
ev.Pubkey = sign.Pub() |
||||||
|
ev.CreatedAt = timestamp.Now().V + 1 |
||||||
|
ev.Kind = kind.ProfileMetadata.K |
||||||
|
ev.Tags = tag.NewS() |
||||||
|
ev.Content = largeContent |
||||||
|
|
||||||
|
require.NoError(t, ev.Sign(sign)) |
||||||
|
|
||||||
|
// Save the event
|
||||||
|
replaced, err := db.SaveEvent(ctx, ev) |
||||||
|
require.NoError(t, err) |
||||||
|
assert.True(t, replaced) // Should replace the previous profile
|
||||||
|
|
||||||
|
// Fetch by serial - should work via evt key
|
||||||
|
ser, err := db.GetSerialById(ev.ID) |
||||||
|
require.NoError(t, err) |
||||||
|
require.NotNil(t, ser) |
||||||
|
|
||||||
|
fetched, err := db.FetchEventBySerial(ser) |
||||||
|
require.NoError(t, err) |
||||||
|
require.NotNil(t, fetched) |
||||||
|
|
||||||
|
// Verify event contents
|
||||||
|
assert.Equal(t, ev.ID, fetched.ID) |
||||||
|
assert.Equal(t, ev.Content, fetched.Content) |
||||||
|
}) |
||||||
|
} |
||||||
|
|
||||||
|
func TestDualStorageForAddressableEvents(t *testing.T) { |
||||||
|
// Create a temporary directory for the database
|
||||||
|
tempDir, err := os.MkdirTemp("", "test-addressable-db-*") |
||||||
|
require.NoError(t, err) |
||||||
|
defer os.RemoveAll(tempDir) |
||||||
|
|
||||||
|
// Create a context and cancel function for the database
|
||||||
|
ctx, cancel := context.WithCancel(context.Background()) |
||||||
|
defer cancel() |
||||||
|
|
||||||
|
// Initialize the database
|
||||||
|
db, err := New(ctx, cancel, tempDir, "info") |
||||||
|
require.NoError(t, err) |
||||||
|
defer db.Close() |
||||||
|
|
||||||
|
// Create a signing key
|
||||||
|
sign := p8k.MustNew() |
||||||
|
require.NoError(t, sign.Generate()) |
||||||
|
|
||||||
|
t.Run("SmallAddressableEvent", func(t *testing.T) { |
||||||
|
// Create a small addressable event (kind 30023 - long-form content)
|
||||||
|
ev := event.New() |
||||||
|
ev.Pubkey = sign.Pub() |
||||||
|
ev.CreatedAt = timestamp.Now().V |
||||||
|
ev.Kind = 30023 |
||||||
|
ev.Tags = tag.NewS( |
||||||
|
tag.NewFromAny("d", []byte("my-article")), |
||||||
|
tag.NewFromAny("title", []byte("Test Article")), |
||||||
|
) |
||||||
|
ev.Content = []byte("This is a short article.") |
||||||
|
|
||||||
|
require.NoError(t, ev.Sign(sign)) |
||||||
|
|
||||||
|
// Save the event
|
||||||
|
replaced, err := db.SaveEvent(ctx, ev) |
||||||
|
require.NoError(t, err) |
||||||
|
assert.False(t, replaced) |
||||||
|
|
||||||
|
// Fetch by serial - should work via sev key
|
||||||
|
ser, err := db.GetSerialById(ev.ID) |
||||||
|
require.NoError(t, err) |
||||||
|
require.NotNil(t, ser) |
||||||
|
|
||||||
|
fetched, err := db.FetchEventBySerial(ser) |
||||||
|
require.NoError(t, err) |
||||||
|
require.NotNil(t, fetched) |
||||||
|
|
||||||
|
// Verify event contents
|
||||||
|
assert.Equal(t, ev.ID, fetched.ID) |
||||||
|
assert.Equal(t, ev.Pubkey, fetched.Pubkey) |
||||||
|
assert.Equal(t, ev.Kind, fetched.Kind) |
||||||
|
assert.Equal(t, ev.Content, fetched.Content) |
||||||
|
|
||||||
|
// Verify d tag
|
||||||
|
dTag := fetched.Tags.GetFirst([]byte("d")) |
||||||
|
require.NotNil(t, dTag) |
||||||
|
assert.Equal(t, []byte("my-article"), dTag.Value()) |
||||||
|
}) |
||||||
|
|
||||||
|
t.Run("AddressableEventWithoutDTag", func(t *testing.T) { |
||||||
|
// Create an addressable event without d tag (should be treated as regular event)
|
||||||
|
ev := event.New() |
||||||
|
ev.Pubkey = sign.Pub() |
||||||
|
ev.CreatedAt = timestamp.Now().V + 1 |
||||||
|
ev.Kind = 30023 |
||||||
|
ev.Tags = tag.NewS() |
||||||
|
ev.Content = []byte("Article without d tag") |
||||||
|
|
||||||
|
require.NoError(t, ev.Sign(sign)) |
||||||
|
|
||||||
|
// Save should fail with missing d tag error
|
||||||
|
_, err := db.SaveEvent(ctx, ev) |
||||||
|
assert.Error(t, err) |
||||||
|
assert.Contains(t, err.Error(), "missing a d tag") |
||||||
|
}) |
||||||
|
|
||||||
|
t.Run("ReplaceAddressableEvent", func(t *testing.T) { |
||||||
|
// Create first version
|
||||||
|
ev1 := event.New() |
||||||
|
ev1.Pubkey = sign.Pub() |
||||||
|
ev1.CreatedAt = timestamp.Now().V |
||||||
|
ev1.Kind = 30023 |
||||||
|
ev1.Tags = tag.NewS( |
||||||
|
tag.NewFromAny("d", []byte("replaceable-article")), |
||||||
|
) |
||||||
|
ev1.Content = []byte("Version 1") |
||||||
|
|
||||||
|
require.NoError(t, ev1.Sign(sign)) |
||||||
|
|
||||||
|
replaced, err := db.SaveEvent(ctx, ev1) |
||||||
|
require.NoError(t, err) |
||||||
|
assert.False(t, replaced) |
||||||
|
|
||||||
|
// Create second version (newer)
|
||||||
|
ev2 := event.New() |
||||||
|
ev2.Pubkey = sign.Pub() |
||||||
|
ev2.CreatedAt = ev1.CreatedAt + 10 |
||||||
|
ev2.Kind = 30023 |
||||||
|
ev2.Tags = tag.NewS( |
||||||
|
tag.NewFromAny("d", []byte("replaceable-article")), |
||||||
|
) |
||||||
|
ev2.Content = []byte("Version 2") |
||||||
|
|
||||||
|
require.NoError(t, ev2.Sign(sign)) |
||||||
|
|
||||||
|
replaced, err = db.SaveEvent(ctx, ev2) |
||||||
|
require.NoError(t, err) |
||||||
|
assert.True(t, replaced) |
||||||
|
|
||||||
|
// Try to save older version (should fail)
|
||||||
|
ev0 := event.New() |
||||||
|
ev0.Pubkey = sign.Pub() |
||||||
|
ev0.CreatedAt = ev1.CreatedAt - 10 |
||||||
|
ev0.Kind = 30023 |
||||||
|
ev0.Tags = tag.NewS( |
||||||
|
tag.NewFromAny("d", []byte("replaceable-article")), |
||||||
|
) |
||||||
|
ev0.Content = []byte("Version 0 (old)") |
||||||
|
|
||||||
|
require.NoError(t, ev0.Sign(sign)) |
||||||
|
|
||||||
|
replaced, err = db.SaveEvent(ctx, ev0) |
||||||
|
assert.Error(t, err) |
||||||
|
assert.Contains(t, err.Error(), "older than existing") |
||||||
|
}) |
||||||
|
} |
||||||
|
|
||||||
|
func TestDualStorageRegularEvents(t *testing.T) { |
||||||
|
// Create a temporary directory for the database
|
||||||
|
tempDir, err := os.MkdirTemp("", "test-regular-db-*") |
||||||
|
require.NoError(t, err) |
||||||
|
defer os.RemoveAll(tempDir) |
||||||
|
|
||||||
|
// Create a context and cancel function for the database
|
||||||
|
ctx, cancel := context.WithCancel(context.Background()) |
||||||
|
defer cancel() |
||||||
|
|
||||||
|
// Initialize the database
|
||||||
|
db, err := New(ctx, cancel, tempDir, "info") |
||||||
|
require.NoError(t, err) |
||||||
|
defer db.Close() |
||||||
|
|
||||||
|
// Create a signing key
|
||||||
|
sign := p8k.MustNew() |
||||||
|
require.NoError(t, sign.Generate()) |
||||||
|
|
||||||
|
t.Run("SmallRegularEvent", func(t *testing.T) { |
||||||
|
// Create a small regular event (kind 1 - note)
|
||||||
|
ev := event.New() |
||||||
|
ev.Pubkey = sign.Pub() |
||||||
|
ev.CreatedAt = timestamp.Now().V |
||||||
|
ev.Kind = kind.TextNote.K |
||||||
|
ev.Tags = tag.NewS() |
||||||
|
ev.Content = []byte("Hello, Nostr!") |
||||||
|
|
||||||
|
require.NoError(t, ev.Sign(sign)) |
||||||
|
|
||||||
|
// Save the event
|
||||||
|
replaced, err := db.SaveEvent(ctx, ev) |
||||||
|
require.NoError(t, err) |
||||||
|
assert.False(t, replaced) |
||||||
|
|
||||||
|
// Fetch by serial - should work via sev key
|
||||||
|
ser, err := db.GetSerialById(ev.ID) |
||||||
|
require.NoError(t, err) |
||||||
|
require.NotNil(t, ser) |
||||||
|
|
||||||
|
fetched, err := db.FetchEventBySerial(ser) |
||||||
|
require.NoError(t, err) |
||||||
|
require.NotNil(t, fetched) |
||||||
|
|
||||||
|
// Verify event contents
|
||||||
|
assert.Equal(t, ev.ID, fetched.ID) |
||||||
|
assert.Equal(t, ev.Content, fetched.Content) |
||||||
|
}) |
||||||
|
} |
||||||
@ -0,0 +1,521 @@ |
|||||||
|
package database |
||||||
|
|
||||||
|
import ( |
||||||
|
"bytes" |
||||||
|
"context" |
||||||
|
"os" |
||||||
|
"testing" |
||||||
|
"time" |
||||||
|
|
||||||
|
"github.com/dgraph-io/badger/v4" |
||||||
|
"lol.mleku.dev/chk" |
||||||
|
"next.orly.dev/pkg/database/indexes" |
||||||
|
"next.orly.dev/pkg/database/indexes/types" |
||||||
|
"next.orly.dev/pkg/encoders/event" |
||||||
|
"next.orly.dev/pkg/encoders/hex" |
||||||
|
"next.orly.dev/pkg/encoders/kind" |
||||||
|
"next.orly.dev/pkg/encoders/tag" |
||||||
|
"next.orly.dev/pkg/encoders/timestamp" |
||||||
|
"next.orly.dev/pkg/interfaces/signer/p8k" |
||||||
|
) |
||||||
|
|
||||||
|
// TestInlineSmallEventStorage tests the Reiser4-inspired inline storage optimization
|
||||||
|
// for small events (<=384 bytes).
|
||||||
|
func TestInlineSmallEventStorage(t *testing.T) { |
||||||
|
// Create a temporary directory for the database
|
||||||
|
tempDir, err := os.MkdirTemp("", "test-inline-db-*") |
||||||
|
if err != nil { |
||||||
|
t.Fatalf("Failed to create temporary directory: %v", err) |
||||||
|
} |
||||||
|
defer os.RemoveAll(tempDir) |
||||||
|
|
||||||
|
// Create a context and cancel function for the database
|
||||||
|
ctx, cancel := context.WithCancel(context.Background()) |
||||||
|
defer cancel() |
||||||
|
|
||||||
|
// Initialize the database
|
||||||
|
db, err := New(ctx, cancel, tempDir, "info") |
||||||
|
if err != nil { |
||||||
|
t.Fatalf("Failed to create database: %v", err) |
||||||
|
} |
||||||
|
defer db.Close() |
||||||
|
|
||||||
|
// Create a signer
|
||||||
|
sign := p8k.MustNew() |
||||||
|
if err := sign.Generate(); chk.E(err) { |
||||||
|
t.Fatal(err) |
||||||
|
} |
||||||
|
|
||||||
|
// Test Case 1: Small event (should use inline storage)
|
||||||
|
t.Run("SmallEventInlineStorage", func(t *testing.T) { |
||||||
|
smallEvent := event.New() |
||||||
|
smallEvent.Kind = kind.TextNote.K |
||||||
|
smallEvent.CreatedAt = timestamp.Now().V |
||||||
|
smallEvent.Content = []byte("Hello Nostr!") // Small content
|
||||||
|
smallEvent.Pubkey = sign.Pub() |
||||||
|
smallEvent.Tags = tag.NewS() |
||||||
|
|
||||||
|
// Sign the event
|
||||||
|
if err := smallEvent.Sign(sign); err != nil { |
||||||
|
t.Fatalf("Failed to sign small event: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Save the event
|
||||||
|
if _, err := db.SaveEvent(ctx, smallEvent); err != nil { |
||||||
|
t.Fatalf("Failed to save small event: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Verify it was stored with sev prefix
|
||||||
|
serial, err := db.GetSerialById(smallEvent.ID) |
||||||
|
if err != nil { |
||||||
|
t.Fatalf("Failed to get serial for small event: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Check that sev key exists
|
||||||
|
sevKeyExists := false |
||||||
|
db.View(func(txn *badger.Txn) error { |
||||||
|
smallBuf := new(bytes.Buffer) |
||||||
|
indexes.SmallEventEnc(serial).MarshalWrite(smallBuf) |
||||||
|
|
||||||
|
opts := badger.DefaultIteratorOptions |
||||||
|
opts.Prefix = smallBuf.Bytes() |
||||||
|
it := txn.NewIterator(opts) |
||||||
|
defer it.Close() |
||||||
|
|
||||||
|
it.Rewind() |
||||||
|
if it.Valid() { |
||||||
|
sevKeyExists = true |
||||||
|
} |
||||||
|
return nil |
||||||
|
}) |
||||||
|
|
||||||
|
if !sevKeyExists { |
||||||
|
t.Errorf("Small event was not stored with sev prefix") |
||||||
|
} |
||||||
|
|
||||||
|
// Verify evt key does NOT exist for small event
|
||||||
|
evtKeyExists := false |
||||||
|
db.View(func(txn *badger.Txn) error { |
||||||
|
buf := new(bytes.Buffer) |
||||||
|
indexes.EventEnc(serial).MarshalWrite(buf) |
||||||
|
|
||||||
|
_, err := txn.Get(buf.Bytes()) |
||||||
|
if err == nil { |
||||||
|
evtKeyExists = true |
||||||
|
} |
||||||
|
return nil |
||||||
|
}) |
||||||
|
|
||||||
|
if evtKeyExists { |
||||||
|
t.Errorf("Small event should not have evt key (should only use sev)") |
||||||
|
} |
||||||
|
|
||||||
|
// Fetch and verify the event
|
||||||
|
fetchedEvent, err := db.FetchEventBySerial(serial) |
||||||
|
if err != nil { |
||||||
|
t.Fatalf("Failed to fetch small event: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
if !bytes.Equal(fetchedEvent.ID, smallEvent.ID) { |
||||||
|
t.Errorf("Fetched event ID mismatch: got %x, want %x", fetchedEvent.ID, smallEvent.ID) |
||||||
|
} |
||||||
|
if !bytes.Equal(fetchedEvent.Content, smallEvent.Content) { |
||||||
|
t.Errorf("Fetched event content mismatch: got %q, want %q", fetchedEvent.Content, smallEvent.Content) |
||||||
|
} |
||||||
|
}) |
||||||
|
|
||||||
|
// Test Case 2: Large event (should use traditional storage)
|
||||||
|
t.Run("LargeEventTraditionalStorage", func(t *testing.T) { |
||||||
|
largeEvent := event.New() |
||||||
|
largeEvent.Kind = kind.TextNote.K |
||||||
|
largeEvent.CreatedAt = timestamp.Now().V |
||||||
|
// Create content larger than 384 bytes
|
||||||
|
largeContent := make([]byte, 500) |
||||||
|
for i := range largeContent { |
||||||
|
largeContent[i] = 'x' |
||||||
|
} |
||||||
|
largeEvent.Content = largeContent |
||||||
|
largeEvent.Pubkey = sign.Pub() |
||||||
|
largeEvent.Tags = tag.NewS() |
||||||
|
|
||||||
|
// Sign the event
|
||||||
|
if err := largeEvent.Sign(sign); err != nil { |
||||||
|
t.Fatalf("Failed to sign large event: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Save the event
|
||||||
|
if _, err := db.SaveEvent(ctx, largeEvent); err != nil { |
||||||
|
t.Fatalf("Failed to save large event: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Verify it was stored with evt prefix
|
||||||
|
serial, err := db.GetSerialById(largeEvent.ID) |
||||||
|
if err != nil { |
||||||
|
t.Fatalf("Failed to get serial for large event: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Check that evt key exists
|
||||||
|
evtKeyExists := false |
||||||
|
db.View(func(txn *badger.Txn) error { |
||||||
|
buf := new(bytes.Buffer) |
||||||
|
indexes.EventEnc(serial).MarshalWrite(buf) |
||||||
|
|
||||||
|
_, err := txn.Get(buf.Bytes()) |
||||||
|
if err == nil { |
||||||
|
evtKeyExists = true |
||||||
|
} |
||||||
|
return nil |
||||||
|
}) |
||||||
|
|
||||||
|
if !evtKeyExists { |
||||||
|
t.Errorf("Large event was not stored with evt prefix") |
||||||
|
} |
||||||
|
|
||||||
|
// Fetch and verify the event
|
||||||
|
fetchedEvent, err := db.FetchEventBySerial(serial) |
||||||
|
if err != nil { |
||||||
|
t.Fatalf("Failed to fetch large event: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
if !bytes.Equal(fetchedEvent.ID, largeEvent.ID) { |
||||||
|
t.Errorf("Fetched event ID mismatch: got %x, want %x", fetchedEvent.ID, largeEvent.ID) |
||||||
|
} |
||||||
|
}) |
||||||
|
|
||||||
|
// Test Case 3: Batch fetch with mixed small and large events
|
||||||
|
t.Run("BatchFetchMixedEvents", func(t *testing.T) { |
||||||
|
var serials []*types.Uint40 |
||||||
|
expectedIDs := make(map[uint64][]byte) |
||||||
|
|
||||||
|
// Create 10 small events and 10 large events
|
||||||
|
for i := 0; i < 20; i++ { |
||||||
|
ev := event.New() |
||||||
|
ev.Kind = kind.TextNote.K |
||||||
|
ev.CreatedAt = timestamp.Now().V + int64(i) |
||||||
|
ev.Pubkey = sign.Pub() |
||||||
|
ev.Tags = tag.NewS() |
||||||
|
|
||||||
|
// Alternate between small and large
|
||||||
|
if i%2 == 0 { |
||||||
|
ev.Content = []byte("Small event") |
||||||
|
} else { |
||||||
|
largeContent := make([]byte, 500) |
||||||
|
for j := range largeContent { |
||||||
|
largeContent[j] = 'x' |
||||||
|
} |
||||||
|
ev.Content = largeContent |
||||||
|
} |
||||||
|
|
||||||
|
if err := ev.Sign(sign); err != nil { |
||||||
|
t.Fatalf("Failed to sign event %d: %v", i, err) |
||||||
|
} |
||||||
|
|
||||||
|
if _, err := db.SaveEvent(ctx, ev); err != nil { |
||||||
|
t.Fatalf("Failed to save event %d: %v", i, err) |
||||||
|
} |
||||||
|
|
||||||
|
serial, err := db.GetSerialById(ev.ID) |
||||||
|
if err != nil { |
||||||
|
t.Fatalf("Failed to get serial for event %d: %v", i, err) |
||||||
|
} |
||||||
|
|
||||||
|
serials = append(serials, serial) |
||||||
|
expectedIDs[serial.Get()] = ev.ID |
||||||
|
} |
||||||
|
|
||||||
|
// Batch fetch all events
|
||||||
|
events, err := db.FetchEventsBySerials(serials) |
||||||
|
if err != nil { |
||||||
|
t.Fatalf("Failed to batch fetch events: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
if len(events) != 20 { |
||||||
|
t.Errorf("Expected 20 events, got %d", len(events)) |
||||||
|
} |
||||||
|
|
||||||
|
// Verify all events were fetched correctly
|
||||||
|
for serialValue, ev := range events { |
||||||
|
expectedID := expectedIDs[serialValue] |
||||||
|
if !bytes.Equal(ev.ID, expectedID) { |
||||||
|
t.Errorf("Event ID mismatch for serial %d: got %x, want %x", |
||||||
|
serialValue, ev.ID, expectedID) |
||||||
|
} |
||||||
|
} |
||||||
|
}) |
||||||
|
|
||||||
|
// Test Case 4: Edge case - event near 384 byte threshold
|
||||||
|
t.Run("ThresholdEvent", func(t *testing.T) { |
||||||
|
ev := event.New() |
||||||
|
ev.Kind = kind.TextNote.K |
||||||
|
ev.CreatedAt = timestamp.Now().V |
||||||
|
ev.Pubkey = sign.Pub() |
||||||
|
ev.Tags = tag.NewS() |
||||||
|
|
||||||
|
// Create content near the threshold
|
||||||
|
testContent := make([]byte, 250) |
||||||
|
for i := range testContent { |
||||||
|
testContent[i] = 'x' |
||||||
|
} |
||||||
|
ev.Content = testContent |
||||||
|
|
||||||
|
if err := ev.Sign(sign); err != nil { |
||||||
|
t.Fatalf("Failed to sign threshold event: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
if _, err := db.SaveEvent(ctx, ev); err != nil { |
||||||
|
t.Fatalf("Failed to save threshold event: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
serial, err := db.GetSerialById(ev.ID) |
||||||
|
if err != nil { |
||||||
|
t.Fatalf("Failed to get serial: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Fetch and verify
|
||||||
|
fetchedEvent, err := db.FetchEventBySerial(serial) |
||||||
|
if err != nil { |
||||||
|
t.Fatalf("Failed to fetch threshold event: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
if !bytes.Equal(fetchedEvent.ID, ev.ID) { |
||||||
|
t.Errorf("Fetched event ID mismatch") |
||||||
|
} |
||||||
|
}) |
||||||
|
} |
||||||
|
|
||||||
|
// TestInlineStorageMigration tests the migration from traditional to inline storage
|
||||||
|
func TestInlineStorageMigration(t *testing.T) { |
||||||
|
// Create a temporary directory for the database
|
||||||
|
tempDir, err := os.MkdirTemp("", "test-migration-db-*") |
||||||
|
if err != nil { |
||||||
|
t.Fatalf("Failed to create temporary directory: %v", err) |
||||||
|
} |
||||||
|
defer os.RemoveAll(tempDir) |
||||||
|
|
||||||
|
// Create a context and cancel function for the database
|
||||||
|
ctx, cancel := context.WithCancel(context.Background()) |
||||||
|
defer cancel() |
||||||
|
|
||||||
|
// Initialize the database
|
||||||
|
db, err := New(ctx, cancel, tempDir, "info") |
||||||
|
if err != nil { |
||||||
|
t.Fatalf("Failed to create database: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Create a signer
|
||||||
|
sign := p8k.MustNew() |
||||||
|
if err := sign.Generate(); chk.E(err) { |
||||||
|
t.Fatal(err) |
||||||
|
} |
||||||
|
|
||||||
|
// Manually set database version to 3 (before inline storage migration)
|
||||||
|
db.writeVersionTag(3) |
||||||
|
|
||||||
|
// Create and save some small events the old way (manually)
|
||||||
|
var testEvents []*event.E |
||||||
|
for i := 0; i < 5; i++ { |
||||||
|
ev := event.New() |
||||||
|
ev.Kind = kind.TextNote.K |
||||||
|
ev.CreatedAt = timestamp.Now().V + int64(i) |
||||||
|
ev.Content = []byte("Test event") |
||||||
|
ev.Pubkey = sign.Pub() |
||||||
|
ev.Tags = tag.NewS() |
||||||
|
|
||||||
|
if err := ev.Sign(sign); err != nil { |
||||||
|
t.Fatalf("Failed to sign event: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Get next serial
|
||||||
|
serial, err := db.seq.Next() |
||||||
|
if err != nil { |
||||||
|
t.Fatalf("Failed to get serial: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Generate indexes
|
||||||
|
idxs, err := GetIndexesForEvent(ev, serial) |
||||||
|
if err != nil { |
||||||
|
t.Fatalf("Failed to generate indexes: %v", err) |
||||||
|
} |
||||||
|
|
||||||
|
// Serialize event
|
||||||
|
eventDataBuf := new(bytes.Buffer) |
||||||
|
ev.MarshalBinary(eventDataBuf) |
||||||
|
eventData := eventDataBuf.Bytes() |
||||||
|
|
||||||
|
// Save the old way (evt prefix with value)
|
||||||
|
db.Update(func(txn *badger.Txn) error { |
||||||
|
ser := new(types.Uint40) |
||||||
|
ser.Set(serial) |
||||||
|
|
||||||
|
// Save indexes
|
||||||
|
for _, key := range idxs { |
||||||
|
txn.Set(key, nil) |
||||||
|
} |
||||||
|
|
||||||
|
// Save event the old way
|
||||||
|
keyBuf := new(bytes.Buffer) |
||||||
|
indexes.EventEnc(ser).MarshalWrite(keyBuf) |
||||||
|
txn.Set(keyBuf.Bytes(), eventData) |
||||||
|
|
||||||
|
return nil |
||||||
|
}) |
||||||
|
|
||||||
|
testEvents = append(testEvents, ev) |
||||||
|
} |
||||||
|
|
||||||
|
t.Logf("Created %d test events with old storage format", len(testEvents)) |
||||||
|
|
||||||
|
// Close and reopen database to trigger migration
|
||||||
|
db.Close() |
||||||
|
|
||||||
|
db, err = New(ctx, cancel, tempDir, "info") |
||||||
|
if err != nil { |
||||||
|
t.Fatalf("Failed to reopen database: %v", err) |
||||||
|
} |
||||||
|
defer db.Close() |
||||||
|
|
||||||
|
// Give migration time to complete
|
||||||
|
time.Sleep(100 * time.Millisecond) |
||||||
|
|
||||||
|
// Verify all events can still be fetched
|
||||||
|
for i, ev := range testEvents { |
||||||
|
serial, err := db.GetSerialById(ev.ID) |
||||||
|
if err != nil { |
||||||
|
t.Fatalf("Failed to get serial for event %d after migration: %v", i, err) |
||||||
|
} |
||||||
|
|
||||||
|
fetchedEvent, err := db.FetchEventBySerial(serial) |
||||||
|
if err != nil { |
||||||
|
t.Fatalf("Failed to fetch event %d after migration: %v", i, err) |
||||||
|
} |
||||||
|
|
||||||
|
if !bytes.Equal(fetchedEvent.ID, ev.ID) { |
||||||
|
t.Errorf("Event %d ID mismatch after migration: got %x, want %x", |
||||||
|
i, fetchedEvent.ID, ev.ID) |
||||||
|
} |
||||||
|
|
||||||
|
if !bytes.Equal(fetchedEvent.Content, ev.Content) { |
||||||
|
t.Errorf("Event %d content mismatch after migration: got %q, want %q", |
||||||
|
i, fetchedEvent.Content, ev.Content) |
||||||
|
} |
||||||
|
|
||||||
|
// Verify it's now using inline storage
|
||||||
|
sevKeyExists := false |
||||||
|
db.View(func(txn *badger.Txn) error { |
||||||
|
smallBuf := new(bytes.Buffer) |
||||||
|
indexes.SmallEventEnc(serial).MarshalWrite(smallBuf) |
||||||
|
|
||||||
|
opts := badger.DefaultIteratorOptions |
||||||
|
opts.Prefix = smallBuf.Bytes() |
||||||
|
it := txn.NewIterator(opts) |
||||||
|
defer it.Close() |
||||||
|
|
||||||
|
it.Rewind() |
||||||
|
if it.Valid() { |
||||||
|
sevKeyExists = true |
||||||
|
t.Logf("Event %d (%s) successfully migrated to inline storage", |
||||||
|
i, hex.Enc(ev.ID[:8])) |
||||||
|
} |
||||||
|
return nil |
||||||
|
}) |
||||||
|
|
||||||
|
if !sevKeyExists { |
||||||
|
t.Errorf("Event %d was not migrated to inline storage", i) |
||||||
|
} |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
// BenchmarkInlineVsTraditionalStorage compares performance of inline vs traditional storage
|
||||||
|
func BenchmarkInlineVsTraditionalStorage(b *testing.B) { |
||||||
|
// Create a temporary directory for the database
|
||||||
|
tempDir, err := os.MkdirTemp("", "bench-inline-db-*") |
||||||
|
if err != nil { |
||||||
|
b.Fatalf("Failed to create temporary directory: %v", err) |
||||||
|
} |
||||||
|
defer os.RemoveAll(tempDir) |
||||||
|
|
||||||
|
// Create a context and cancel function for the database
|
||||||
|
ctx, cancel := context.WithCancel(context.Background()) |
||||||
|
defer cancel() |
||||||
|
|
||||||
|
// Initialize the database
|
||||||
|
db, err := New(ctx, cancel, tempDir, "info") |
||||||
|
if err != nil { |
||||||
|
b.Fatalf("Failed to create database: %v", err) |
||||||
|
} |
||||||
|
defer db.Close() |
||||||
|
|
||||||
|
// Create a signer
|
||||||
|
sign := p8k.MustNew() |
||||||
|
if err := sign.Generate(); chk.E(err) { |
||||||
|
b.Fatal(err) |
||||||
|
} |
||||||
|
|
||||||
|
// Pre-populate database with mix of small and large events
|
||||||
|
var smallSerials []*types.Uint40 |
||||||
|
var largeSerials []*types.Uint40 |
||||||
|
|
||||||
|
for i := 0; i < 100; i++ { |
||||||
|
// Small event
|
||||||
|
smallEv := event.New() |
||||||
|
smallEv.Kind = kind.TextNote.K |
||||||
|
smallEv.CreatedAt = timestamp.Now().V + int64(i)*2 |
||||||
|
smallEv.Content = []byte("Small test event") |
||||||
|
smallEv.Pubkey = sign.Pub() |
||||||
|
smallEv.Tags = tag.NewS() |
||||||
|
smallEv.Sign(sign) |
||||||
|
|
||||||
|
db.SaveEvent(ctx, smallEv) |
||||||
|
if serial, err := db.GetSerialById(smallEv.ID); err == nil { |
||||||
|
smallSerials = append(smallSerials, serial) |
||||||
|
} |
||||||
|
|
||||||
|
// Large event
|
||||||
|
largeEv := event.New() |
||||||
|
largeEv.Kind = kind.TextNote.K |
||||||
|
largeEv.CreatedAt = timestamp.Now().V + int64(i)*2 + 1 |
||||||
|
largeContent := make([]byte, 500) |
||||||
|
for j := range largeContent { |
||||||
|
largeContent[j] = 'x' |
||||||
|
} |
||||||
|
largeEv.Content = largeContent |
||||||
|
largeEv.Pubkey = sign.Pub() |
||||||
|
largeEv.Tags = tag.NewS() |
||||||
|
largeEv.Sign(sign) |
||||||
|
|
||||||
|
db.SaveEvent(ctx, largeEv) |
||||||
|
if serial, err := db.GetSerialById(largeEv.ID); err == nil { |
||||||
|
largeSerials = append(largeSerials, serial) |
||||||
|
} |
||||||
|
} |
||||||
|
|
||||||
|
b.Run("FetchSmallEventsInline", func(b *testing.B) { |
||||||
|
b.ResetTimer() |
||||||
|
for i := 0; i < b.N; i++ { |
||||||
|
idx := i % len(smallSerials) |
||||||
|
db.FetchEventBySerial(smallSerials[idx]) |
||||||
|
} |
||||||
|
}) |
||||||
|
|
||||||
|
b.Run("FetchLargeEventsTraditional", func(b *testing.B) { |
||||||
|
b.ResetTimer() |
||||||
|
for i := 0; i < b.N; i++ { |
||||||
|
idx := i % len(largeSerials) |
||||||
|
db.FetchEventBySerial(largeSerials[idx]) |
||||||
|
} |
||||||
|
}) |
||||||
|
|
||||||
|
b.Run("BatchFetchSmallEvents", func(b *testing.B) { |
||||||
|
b.ResetTimer() |
||||||
|
for i := 0; i < b.N; i++ { |
||||||
|
db.FetchEventsBySerials(smallSerials[:10]) |
||||||
|
} |
||||||
|
}) |
||||||
|
|
||||||
|
b.Run("BatchFetchLargeEvents", func(b *testing.B) { |
||||||
|
b.ResetTimer() |
||||||
|
for i := 0; i < b.N; i++ { |
||||||
|
db.FetchEventsBySerials(largeSerials[:10]) |
||||||
|
} |
||||||
|
}) |
||||||
|
} |
||||||
Loading…
Reference in new issue