NIP-77 Protocol Fixes:
- Fix role reversal in HandleNegOpen: relay now correctly requires initial message from client and calls Reconcile() instead of Start()
- Fix premature event transmission: haveIDs events now only sent when reconciliation completes (complete=true)
- Add proper error handling for missing initial message in NEG-OPEN

Sync Client Mode (gRPC IPC):
- Add --server flag to 'orly sync' for connecting to running service
- Enable one-shot sync without direct database access
- Support remote sync operations via gRPC
- Add filterToProto helper for gRPC communication

Documentation:
- Add NIP77_ANALYSIS_AND_FIX.md with detailed protocol analysis
- Add SYNC_CLIENT_MODE.md with usage examples and architecture diagrams

Bug Fix:
- Fix nil pointer panic in handle-req.go when subscriptions map is uninitialized
6 changed files with 818 additions and 64 deletions
@@ -0,0 +1,263 @@
# NIP-77 Implementation Analysis and Fix

## Executive Summary

This document analyzes the NIP-77 negentropy implementation in ORLY and documents the fixes made to ensure compatibility with strfry and other NIP-77 compliant implementations.

## Key Issues Fixed

### Issue 1: Role Reversal in NEG-OPEN Handling (CRITICAL)

**Location:** `pkg/sync/negentropy/server/service.go:HandleNegOpen()`

**Problem:** The relay was incorrectly calling `neg.Start()` when no initial message was provided, which violates the NIP-77 protocol specification.

**NIP-77 Specification:**
- The **client (initiator)** creates a Negentropy object, adds all events, seals it, and calls `initiate()` to create the initial message
- The **relay (responder)** creates a Negentropy object, adds all events, seals it, and calls `reconcile()` with the client's initial message

**Incorrect Code:**
```go
if len(req.InitialMessage) > 0 {
	respMsg, complete, err = neg.Reconcile(req.InitialMessage)
} else {
	// WRONG: Relay should never call Start() - only the initiator does this
	respMsg, err = neg.Start()
}
```

**Fixed Code:**
```go
// The client (initiator) MUST provide an initial message in NEG-OPEN.
if len(req.InitialMessage) == 0 {
	return &negentropyv1.NegOpenResponse{
		Error: "blocked: NEG-OPEN requires initialMessage from initiator",
	}, nil
}

// Process the client's initial message
respMsg, complete, err := neg.Reconcile(req.InitialMessage)
```

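The responder-side rule from Issue 1 can be exercised in isolation. The following is a minimal sketch, not ORLY's actual code: the `Reconciler` interface, `respondToNegOpen`, and `echoReconciler` are hypothetical names introduced here, and only the `Reconcile` call shape quoted in the fix is assumed.

```go
package main

import (
	"errors"
	"fmt"
)

// Reconciler is a hypothetical stand-in for the relay's negentropy object;
// only the Reconcile call quoted in the fix is modeled.
type Reconciler interface {
	Reconcile(msg []byte) (resp []byte, complete bool, err error)
}

// ErrMissingInitialMessage mirrors the "blocked:" error text of the fix.
var ErrMissingInitialMessage = errors.New("blocked: NEG-OPEN requires initialMessage from initiator")

// respondToNegOpen enforces the responder role: it never starts a session
// itself and only reconciles against the initiator's first message.
func respondToNegOpen(neg Reconciler, initialMessage []byte) ([]byte, bool, error) {
	if len(initialMessage) == 0 {
		return nil, false, ErrMissingInitialMessage
	}
	return neg.Reconcile(initialMessage)
}

// echoReconciler is a toy Reconciler that finishes in one round and counts
// how often it was invoked.
type echoReconciler struct{ calls int }

func (e *echoReconciler) Reconcile(msg []byte) ([]byte, bool, error) {
	e.calls++
	return msg, true, nil
}

func main() {
	neg := &echoReconciler{}
	_, _, err := respondToNegOpen(neg, nil)
	fmt.Println("empty initial message:", err)
	resp, complete, err := respondToNegOpen(neg, []byte{0x61})
	fmt.Println(len(resp), complete, err)
}
```

Returning a sentinel error keeps the check testable without a live connection; the real handler reports the same condition through the `Error` field of its response message.
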
### Issue 2: Premature Event Transmission

**Location:** `app/handle-negentropy.go:HandleNegOpen()` and `HandleNegMsg()`

**Problem:** Events for `haveIDs` were being sent in every round of reconciliation, instead of only when reconciliation was complete.

**Explanation:**
- The negentropy protocol involves multiple rounds of `NEG-MSG` exchanges
- In intermediate rounds, `haveIDs` and `needIDs` are partial results
- The final sets are only available when `reconcile()` returns `complete=true`
- Sending events prematurely could result in:
  - Incomplete event sets being sent
  - Wasted bandwidth on events that aren't actually needed

**Fixed Code:**
```go
// Only send events when reconciliation is complete
if complete {
	log.D.F("NEG-OPEN: reconciliation complete for %s, sending %d events",
		subscriptionID, len(haveIDs))
	if len(haveIDs) > 0 {
		if err := l.sendEventsForIDs(subscriptionID, haveIDs); err != nil {
			log.E.F("failed to send events for NEG-OPEN: %v", err)
		}
	}
}
```

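One way to picture the "accumulate, then send once" behaviour is a small per-subscription accumulator. This is a sketch under assumptions: the `session` type and `addRound` method are hypothetical and do not reflect ORLY's real session structure, which is not shown in this document.

```go
package main

import "fmt"

// session is a hypothetical per-subscription accumulator.
type session struct {
	seen    map[string]struct{}
	haveIDs []string
}

func newSession() *session {
	return &session{seen: make(map[string]struct{})}
}

// addRound merges one round's partial haveIDs (deduplicated, in arrival
// order) and returns the IDs to transmit: nil for intermediate rounds, the
// full accumulated set once reconciliation is complete.
func (s *session) addRound(partial []string, complete bool) []string {
	for _, id := range partial {
		if _, dup := s.seen[id]; dup {
			continue
		}
		s.seen[id] = struct{}{}
		s.haveIDs = append(s.haveIDs, id)
	}
	if !complete {
		return nil
	}
	return s.haveIDs
}

func main() {
	s := newSession()
	fmt.Println(len(s.addRound([]string{"a", "b"}, false)))
	fmt.Println(len(s.addRound([]string{"b", "c"}, true)))
}
```
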
## How strfry sync Command Works

### Architecture

The strfry `sync` command is a **one-shot CLI tool**, not a persistent daemon:

```bash
# CLI invocation connects, syncs, disconnects
./strfry sync wss://relay.example.com --filter '{"kinds":[0,1]}' --dir both
```

**Key characteristics:**
1. Runs as a separate process, not part of the relay daemon
2. Opens a WebSocket connection, performs sync, closes connection
3. Uses `--dir` flag to control sync direction:
   - `down`: Download events from remote relay (pull, default)
   - `up`: Upload events to remote relay (push)
   - `both`: Bidirectional sync
   - `none`: Only perform reconciliation, no event transfer

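The four `--dir` values map naturally onto two independent switches. A minimal sketch follows; `transferPlan` and `planForDir` are hypothetical names used only for illustration and are not taken from strfry or ORLY.

```go
package main

import "fmt"

// transferPlan says which way events move once reconciliation has finished.
type transferPlan struct {
	Upload   bool // send events the remote relay is missing
	Download bool // fetch events the local side is missing
}

// planForDir translates a --dir value into a transfer plan.
func planForDir(dir string) (transferPlan, error) {
	switch dir {
	case "down":
		return transferPlan{Download: true}, nil
	case "up":
		return transferPlan{Upload: true}, nil
	case "both":
		return transferPlan{Upload: true, Download: true}, nil
	case "none":
		return transferPlan{}, nil // reconcile only, transfer nothing
	default:
		return transferPlan{}, fmt.Errorf("unknown direction %q", dir)
	}
}

func main() {
	for _, dir := range []string{"down", "up", "both", "none"} {
		plan, _ := planForDir(dir)
		fmt.Printf("%-4s upload=%v download=%v\n", dir, plan.Upload, plan.Download)
	}
}
```
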
### Protocol Flow (strfry sync as initiator)

```
┌─────────────┐                  ┌─────────────┐
│   strfry    │   (initiator)    │    relay    │  (responder)
│   (sync)    │                  │   (orly)    │
└──────┬──────┘                  └──────┬──────┘
       │                                │
       │ 1. Build local storage         │
       │ 2. neg.Start() → initialMsg    │
       │                                │
       ├────── NEG-OPEN ───────────────►│
       │  subID, filter, initialMsg     │
       │                                │
       │                                │ 3. Build local storage
       │                                │ 4. neg.Reconcile(initialMsg)
       │                                │    → response, haveIDs, needIDs
       │                                │
       │◄───── NEG-MSG ─────────────────┤
       │  response (hex)                │
       │                                │
       │ 5. neg.Reconcile(response)     │
       │    → nextMsg, haveIDs, needIDs │
       │                                │
       │ (repeat NEG-MSG until complete)│
       │                                │
       ├────── NEG-CLOSE ──────────────►│
       │                                │
       │ 6. Exchange events:            │
       │    - Send haveIDs as EVENT     │
       │    - Receive needIDs as EVENT  │
       │    - Send OK responses         │
       │                                │
       └────── disconnect ─────────────►│
```

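Steps 2 to 5 of the flow amount to a loop on the initiator side. The sketch below is illustrative only: the `Initiator` interface, the `roundTrip` callback, and the `maxRounds` guard are assumptions made for this example, and the real negentropy API may return additional values such as the ID sets.

```go
package main

import (
	"errors"
	"fmt"
)

// Initiator is a hypothetical view of the client-side negentropy object.
type Initiator interface {
	Start() ([]byte, error)
	Reconcile(msg []byte) (next []byte, complete bool, err error)
}

// roundTrip sends one message to the relay and returns its NEG-MSG reply.
type roundTrip func(msg []byte) ([]byte, error)

// runInitiator drives the reconciliation loop and returns the number of
// round trips used. maxRounds guards against a peer that never converges.
func runInitiator(neg Initiator, send roundTrip, maxRounds int) (int, error) {
	msg, err := neg.Start()
	if err != nil {
		return 0, err
	}
	for round := 1; round <= maxRounds; round++ {
		reply, err := send(msg) // NEG-OPEN on round 1, NEG-MSG afterwards
		if err != nil {
			return round, err
		}
		next, complete, err := neg.Reconcile(reply)
		if err != nil {
			return round, err
		}
		if complete {
			return round, nil // caller sends NEG-CLOSE, then exchanges events
		}
		msg = next
	}
	return maxRounds, errors.New("reconciliation did not complete")
}

// fakeInitiator completes after a fixed number of Reconcile calls.
type fakeInitiator struct{ left int }

func (f *fakeInitiator) Start() ([]byte, error) { return []byte("init"), nil }

func (f *fakeInitiator) Reconcile(msg []byte) ([]byte, bool, error) {
	f.left--
	return []byte("next"), f.left <= 0, nil
}

func main() {
	echo := func(msg []byte) ([]byte, error) { return msg, nil }
	rounds, err := runInitiator(&fakeInitiator{left: 3}, echo, 10)
	fmt.Println(rounds, err)
}
```
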
### Event Exchange After Reconciliation

After the negentropy reconciliation completes:

**If `--dir up` or `--dir both`:**
```
strfry → EVENT → orly  (for each ID in strfry's haveIDs)
orly → OK → strfry
```

**If `--dir down` or `--dir both`:**
```
strfry → REQ (ids=needIDs) → orly
orly → EVENT → strfry  (for each requested ID)
orly → EOSE → strfry
strfry → CLOSE → orly
```

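For the download direction, the `REQ (ids=needIDs)` message is an ordinary Nostr `REQ` whose filter contains only an `ids` list. A minimal sketch of building it with the standard library is shown below; `buildFetchReq` is a hypothetical helper, the subscription ID and short IDs are placeholders (real event IDs are 64-character hex strings), and a real client may split large ID sets across several requests.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// buildFetchReq encodes ["REQ", subID, {"ids": [...]}] for the events the
// local side is missing after reconciliation.
func buildFetchReq(subID string, needIDs []string) ([]byte, error) {
	if len(needIDs) == 0 {
		return nil, errors.New("nothing to fetch")
	}
	filter := map[string][]string{"ids": needIDs}
	return json.Marshal([]interface{}{"REQ", subID, filter})
}

func main() {
	// Placeholder IDs for illustration only.
	req, err := buildFetchReq("fetch1", []string{"aa", "bb"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(req))
}
```
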
## Protocol Compatibility

### What ORLY Now Does Correctly

1. **As Responder (receiving NEG-OPEN):**
   - ✅ Builds local storage from filter
   - ✅ Creates Negentropy instance (responder mode)
   - ✅ Calls `Reconcile()` with client's initial message
   - ✅ Returns NEG-MSG response
   - ✅ Accumulates haveIDs/needIDs across rounds
   - ✅ Sends events for haveIDs only when complete

2. **As Initiator (when syncing to peers):**
   - ✅ Calls `Start()` to generate initial message
   - ✅ Sends NEG-OPEN with initial message
   - ✅ Processes NEG-MSG responses
   - ✅ Handles received EVENT messages

### Message Formats

**NEG-OPEN (client → relay):**
```json
["NEG-OPEN", "sub123", {"kinds":[0,1]}, "a1b2c3..."]
```

**NEG-MSG (bidirectional):**
```json
["NEG-MSG", "sub123", "d4e5f6..."]
```

**NEG-CLOSE (client → relay):**
```json
["NEG-CLOSE", "sub123"]
```

**NEG-ERR (relay → client):**
```json
["NEG-ERR", "sub123", "blocked: too many events"]
```

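The four envelopes above differ only in label and element count, so a small decoder can validate them uniformly. The following is a sketch using `encoding/json`; `negEnvelope`, `envelopeLen`, and `parseNeg` are hypothetical names, and the sketch does not validate the hex payload or the filter contents.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// negEnvelope is a hypothetical decoded form of a NEG-* message.
type negEnvelope struct {
	Label   string          // NEG-OPEN, NEG-MSG, NEG-CLOSE or NEG-ERR
	SubID   string          // subscription ID
	Filter  json.RawMessage // NEG-OPEN only
	Payload string          // hex message (NEG-OPEN, NEG-MSG) or reason (NEG-ERR)
}

// envelopeLen is the expected array length for each label.
var envelopeLen = map[string]int{"NEG-OPEN": 4, "NEG-MSG": 3, "NEG-CLOSE": 2, "NEG-ERR": 3}

// parseNeg decodes one JSON array into a negEnvelope and checks its shape.
func parseNeg(raw []byte) (negEnvelope, error) {
	var env negEnvelope
	var parts []json.RawMessage
	if err := json.Unmarshal(raw, &parts); err != nil {
		return env, err
	}
	if len(parts) < 2 {
		return env, errors.New("envelope needs a label and a subscription ID")
	}
	if err := json.Unmarshal(parts[0], &env.Label); err != nil {
		return env, err
	}
	if err := json.Unmarshal(parts[1], &env.SubID); err != nil {
		return env, err
	}
	want, ok := envelopeLen[env.Label]
	if !ok {
		return env, fmt.Errorf("unknown label %q", env.Label)
	}
	if len(parts) != want {
		return env, fmt.Errorf("%s expects %d elements, got %d", env.Label, want, len(parts))
	}
	switch env.Label {
	case "NEG-OPEN":
		env.Filter = parts[2]
		return env, json.Unmarshal(parts[3], &env.Payload)
	case "NEG-MSG", "NEG-ERR":
		return env, json.Unmarshal(parts[2], &env.Payload)
	}
	return env, nil // NEG-CLOSE carries no payload
}

func main() {
	env, err := parseNeg([]byte(`["NEG-ERR", "sub123", "blocked: too many events"]`))
	fmt.Println(env.Label, env.SubID, env.Payload, err)
}
```
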
## Testing with strfry

### ORLY as Sync Source (strfry pulls from ORLY)

```bash
# On machine with strfry
./strfry sync wss://your-orly-relay.com --filter '{"kinds": [0, 1]}' --dir down
```

Expected flow:
1. strfry builds local storage
2. strfry sends NEG-OPEN with initial message
3. ORLY responds with NEG-MSG
4. Multiple NEG-MSG rounds until complete
5. strfry sends NEG-CLOSE
6. strfry sends REQ for events it needs
7. ORLY responds with EVENT messages
8. Connection closes

### ORLY as Sync Target (strfry pushes to ORLY)

```bash
# On machine with strfry
./strfry sync wss://your-orly-relay.com --filter '{"kinds": [0, 1]}' --dir up
```

Expected flow:
1. strfry builds local storage
2. strfry sends NEG-OPEN with initial message
3. ORLY responds with NEG-MSG
4. Multiple NEG-MSG rounds until complete
5. strfry sends NEG-CLOSE
6. strfry sends EVENT messages for events ORLY needs
7. ORLY responds with OK messages
8. Connection closes

### Bidirectional Sync

```bash
./strfry sync wss://your-orly-relay.com --filter '{"kinds": [0, 1]}' --dir both
```

This combines both flows: each side receives the events it is missing.

## Files Modified

1. **`pkg/sync/negentropy/server/service.go`**
   - Fixed role reversal in HandleNegOpen
   - Removed incorrect `neg.Start()` call for responder role
   - Added proper error handling for missing initial message

2. **`app/handle-negentropy.go`**
   - Fixed premature event transmission
   - Events for haveIDs now only sent when `complete=true`
   - Improved logging for better debugging

## Verification

To verify the fix works correctly:

```bash
# 1. Start ORLY relay with negentropy enabled
export ORLY_NEGENTROPY_ENABLED=true
./orly

# 2. From another machine with strfry, test sync
./strfry sync wss://your-orly-relay.com --filter '{"kinds": [1]}' --dir both

# 3. Check ORLY logs for:
#    - "NEG-OPEN: built storage with N events"
#    - "NEG-OPEN: reconcile complete=true/false"
#    - "NEG-OPEN: reconciliation complete for X, sending Y events"
```

## References

- [NIP-77 Specification](https://github.com/nostr-protocol/nips/blob/master/77.md)
- [strfry sync command](https://github.com/hoytech/strfry/blob/master/src/apps/mesh/cmd_sync.cpp)
- [strfry negentropy protocol docs](https://github.com/hoytech/strfry/blob/master/docs/negentropy.md)
- [Negentropy library](https://github.com/hoytech/negentropy)
@@ -0,0 +1,247 @@
# ORLY Sync Client Mode

## Overview

ORLY now supports two modes for the `orly sync` command, similar to how strfry's sync command works but with the added flexibility of gRPC-based IPC:

1. **Direct Mode** (like strfry): Opens the database directly
2. **Client Mode** (unique to ORLY): Connects to a running `orly-sync-negentropy` service via gRPC

## Why Client Mode?

### The Problem with Direct Database Access

Both strfry and ORLY's direct mode require:
- Filesystem access to the database files
- The database to support multiple readers (or reader + writer)
- Running on the same machine as the database

### The gRPC IPC Solution

ORLY's IPC architecture allows the sync command to:
- Connect to a running service over the network
- Run without filesystem access to the database
- Sync from a remote machine
- Share the same relay's database with other clients syncing simultaneously

## Architecture Comparison

### strfry sync (Direct Mode Only)
```
┌─────────────────┐         ┌─────────────────┐
│  strfry sync    │────────▶│   LMDB (file)   │
│  (CLI tool)     │  read   │                 │
└────────┬────────┘         └─────────────────┘
         │
         │ WebSocket
         ▼
┌─────────────────┐
│  Remote Relay   │
└─────────────────┘
```

### ORLY sync (Direct Mode - same as strfry)
```
┌─────────────────┐         ┌─────────────────┐
│   orly sync     │────────▶│  Badger/Neo4j   │
│   (CLI tool)    │  read   │   (database)    │
└────────┬────────┘         └─────────────────┘
         │
         │ WebSocket
         ▼
┌─────────────────┐
│  Remote Relay   │
└─────────────────┘
```

### ORLY sync (Client Mode - gRPC IPC)
```
┌─────────────────┐         ┌─────────────────────────────┐
│   orly sync     │────────▶│    orly-sync-negentropy     │
│   (CLI tool)    │  gRPC   │       (gRPC service)        │
│                 │         │                             │
│  No DB access   │         │  • Manages negentropy       │
│  needed!        │         │  • Opens database           │
└─────────────────┘         │  • Handles NIP-77 protocol  │
                            └──────────────┬──────────────┘
                                           │
                            ┌──────────────┴──────────────┐
                            │                             │
                   ┌────────▼────────┐          ┌─────────▼─────────┐
                   │  Badger/Neo4j   │          │   Remote Relay    │
                   │   (database)    │          │   (WebSocket)     │
                   └─────────────────┘          └───────────────────┘
```

## Usage

### Direct Mode (strfry-style)

```bash
# Sync using local database (must have filesystem access)
orly sync wss://relay.example.com --filter '{"kinds": [0, 3, 1984]}' --dir both

# Options:
#   --data-dir=PATH   Path to database (default: ~/.local/share/ORLY)
#   --filter=JSON     Nostr filter to limit sync scope
#   --dir=DIR         Direction: down, up, both (default: down)
```

### Client Mode (gRPC IPC)

```bash
# Sync using a running ORLY service (no filesystem access needed!)
orly sync wss://relay.example.com --server 127.0.0.1:50064 --filter '{"kinds": [1]}'

# Options:
#   --server=ADDR     gRPC address of orly-sync-negentropy service
#   --filter=JSON     Nostr filter to limit sync scope
#   --dir=DIR         Direction: down, up, both (default: down)
```

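The choice between the two modes comes down to whether `--server` was given. The sketch below illustrates that decision with the documented defaults; the `syncOptions` struct and `resolve` function are hypothetical and are not ORLY's actual flag handling.

```go
package main

import (
	"errors"
	"fmt"
)

// syncOptions mirrors the documented flags; the struct itself is hypothetical.
type syncOptions struct {
	PeerURL string // positional relay URL
	Server  string // --server, empty in direct mode
	DataDir string // --data-dir, direct mode only
	Dir     string // --dir
}

// resolve fills in the documented defaults and reports which mode applies:
// "client" when --server is set, otherwise "direct".
func resolve(o syncOptions) (string, syncOptions, error) {
	if o.PeerURL == "" {
		return "", o, errors.New("relay URL is required")
	}
	if o.Dir == "" {
		o.Dir = "down" // documented default
	}
	switch o.Dir {
	case "down", "up", "both":
	default:
		return "", o, fmt.Errorf("invalid --dir %q", o.Dir)
	}
	if o.Server != "" {
		return "client", o, nil
	}
	if o.DataDir == "" {
		o.DataDir = "~/.local/share/ORLY" // documented default
	}
	return "direct", o, nil
}

func main() {
	mode, opts, err := resolve(syncOptions{
		PeerURL: "wss://relay.example.com",
		Server:  "127.0.0.1:50064",
	})
	fmt.Println(mode, opts.Dir, err)
}
```
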
## Setting Up the Server

### Option 1: Using orly-launcher (Recommended)

The launcher manages all services automatically:

```bash
# Enable negentropy service
export ORLY_LAUNCHER_SYNC_NEGENTROPY_ENABLED=true
export ORLY_LAUNCHER_SYNC_NEGENTROPY_LISTEN=127.0.0.1:50064

# Run launcher
./orly-launcher
```

### Option 2: Standalone Service

Run the sync service manually:

```bash
# Start database first
ORLY_DB_LISTEN=127.0.0.1:50051 ./orly-db-badger &

# Start sync service
ORLY_LAUNCHER_SYNC_NEGENTROPY_LISTEN=127.0.0.1:50064 \
ORLY_DB_TYPE=grpc \
ORLY_GRPC_SERVER=127.0.0.1:50051 \
./orly-sync-negentropy
```

## Real-World Scenarios

### Scenario 1: Sync from Your Laptop to Production Relay

**Problem:** Your production ORLY relay runs on a server. You want to sync events from another relay using your laptop, but the database is on the server.

**Solution:**
```bash
# On server: ensure gRPC is accessible (bind to 0.0.0.0 or use SSH tunnel)
export ORLY_LAUNCHER_SYNC_NEGENTROPY_LISTEN=0.0.0.0:50064

# On laptop: sync via gRPC
orly sync wss://remote-relay.com \
  --server your-server.com:50064 \
  --filter '{"kinds": [0, 1, 3]}' \
  --dir both
```

### Scenario 2: Multiple Admins Sharing a Relay

**Problem:** Multiple administrators need to sync content to the same relay, but you don't want to give them all filesystem access.

**Solution:**
- Run `orly-sync-negentropy` as a service
- Each admin uses `orly sync --server` to sync
- No direct database access required

### Scenario 3: Automated Sync Scripts

**Problem:** You want to run sync scripts from a CI/CD pipeline or cron job.

**Solution:**
```bash
# Cron job that syncs daily at 02:00 (a crontab entry must stay on a single line)
0 2 * * * /usr/local/bin/orly sync wss://backup-relay.com --server 127.0.0.1:50064 --filter '{"kinds": [0, 1, 3, 1984]}' --dir both >> /var/log/orly-sync.log 2>&1
```

## Comparison with strfry

| Feature | strfry sync | ORLY sync (Direct) | ORLY sync (Client) |
|---------|-------------|--------------------|--------------------|
| Direct DB access | Yes | Yes | No |
| Network IPC | No | No | Yes (gRPC) |
| Remote sync | No | No | Yes |
| Multiple readers | LMDB | Badger/Neo4j | Yes (via service) |
| Requires relay running | No | No | Yes |

## Technical Details

### gRPC API

The client mode uses the `NegentropyService.SyncWithPeer` RPC:

```protobuf
service NegentropyService {
  // Streams progress updates until the sync completes or fails.
  rpc SyncWithPeer(SyncPeerRequest) returns (stream SyncProgress);
}

message SyncPeerRequest {
  string peer_url = 1;
  Filter filter = 2;
  int64 since = 3;
}

message SyncProgress {
  string peer_url = 1;
  int32 round = 2;
  int64 have_count = 3;
  int64 need_count = 4;
  int64 fetched_count = 5;
  int64 sent_count = 6;
  bool complete = 7;
  string error = 8;
}
```

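A client consuming the `SyncProgress` stream typically reads updates until one reports `complete` or carries an `error`. The sketch below models the stream as a Go channel so that it runs without generated gRPC code; the local `SyncProgress` struct only mirrors the message fields above, and `drain` is a hypothetical helper rather than part of ORLY.

```go
package main

import (
	"errors"
	"fmt"
)

// SyncProgress mirrors the protobuf message fields; it is a local stand-in,
// not the generated gRPC type.
type SyncProgress struct {
	PeerURL      string
	Round        int32
	HaveCount    int64
	NeedCount    int64
	FetchedCount int64
	SentCount    int64
	Complete     bool
	Error        string
}

// drain reads updates until the stream reports completion, reports an error,
// or closes early, and returns the last update seen.
func drain(updates <-chan SyncProgress) (SyncProgress, error) {
	var last SyncProgress
	for p := range updates {
		last = p
		if p.Error != "" {
			return last, errors.New(p.Error)
		}
		if p.Complete {
			return last, nil
		}
	}
	return last, errors.New("stream ended before sync completed")
}

func main() {
	updates := make(chan SyncProgress, 2)
	updates <- SyncProgress{Round: 1, HaveCount: 10, NeedCount: 4}
	updates <- SyncProgress{Round: 2, HaveCount: 12, NeedCount: 5, FetchedCount: 5, SentCount: 12, Complete: true}
	close(updates)

	last, err := drain(updates)
	fmt.Println(last.Round, last.FetchedCount, last.SentCount, err)
}
```
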
### Security Considerations

When exposing the gRPC service over the network:

1. **Bind to localhost only** (default) for single-machine use
2. **Use SSH tunnels** for remote access:
   ```bash
   ssh -L 50064:localhost:50064 user@relay-server
   orly sync wss://... --server 127.0.0.1:50064
   ```
3. **Use TLS/mTLS** for production gRPC (future enhancement)
4. **Firewall rules** to restrict access to trusted IPs

## Troubleshooting

### "Failed to connect to sync service"

- Verify the service is running: `ps aux | grep orly-sync-negentropy`
- Check the address matches: `netstat -tlnp | grep 50064`
- Check firewall rules

### "Sync service not ready"

- The service might still be initializing
- Check that the database service is running
- Look at service logs for errors

### "Permission denied" (Direct Mode)

- Ensure you have read access to the database directory
- Check file ownership: `ls -la ~/.local/share/ORLY/`

## Future Enhancements

1. **TLS/mTLS support** for secure remote connections
2. **Authentication** for client connections
3. **Rate limiting** to prevent abuse
4. **Audit logging** for compliance