# Bitwarden Database Seeder

A class library for generating and inserting properly encrypted test data into Bitwarden databases.

## Domain Taxonomy

### Cipher Encryption States

| Term           | Description                                          | Stored in DB? |
| -------------- | ---------------------------------------------------- | ------------- |
| **CipherView** | Plaintext/decrypted form. Human-readable data.       | Never         |
| **Cipher**     | Encrypted form. All sensitive fields are EncStrings. | Yes           |

The "View" suffix always denotes plaintext. No suffix means encrypted.

### Data Structure Differences

**SDK Structure (nested):**

```json
{
  "name": "2.x...",
  "login": {
    "username": "2.y...",
    "password": "2.z..."
  }
}
```

**Server Structure (flat, stored in Cipher.Data):**

```json
{
  "Name": "2.x...",
  "Username": "2.y...",
  "Password": "2.z..."
}
```

The seeder transforms SDK output to server format before database insertion.

### Project Structure

The Seeder is organized around the following core patterns, each with a specific responsibility:

#### Pipeline

**Purpose:** Composable architecture for fixture-based and generated seeding.

**When to use:** New bulk operations, especially those driven by presets. It is the most flexible option.

**Flow**: Preset JSON → Loader → Builder → Steps → Executor → Context → BulkCommitter

**Why this architecture wins**:

- **Infrastructure as Code**: JSON presets define complete scenarios
- **Mix & Match**: Fixtures + generation in one preset
- **Extensible**: Add entity types via new step implementations

**Phase order (org)**: Org → OrgApiKey → Roster → Owner (conditional) → Generator (conditional) → Users → Groups → Collections → Folders → Ciphers → CipherCollections → CipherFolders → CipherFavorites → PersonalCiphers

**Phase order (individual)**: IndividualUser → NamedFolders → Generator → Folders → Ciphers → FolderAssignments → FavoriteAssignments

**Files**: `Pipeline/` folder

#### Factories

**Purpose:** Create individual domain entities with cryptographically correct encrypted data.

**When to use:** Need to create ONE entity (user, cipher, collection) with proper encryption.

**Key characteristics:**

- Create ONE entity per method call
- Handle encryption/transformation internally
- Stateless (except for SDK service dependency)
- Do NOT interact with database directly

**Naming:** `{Entity}Seeder` with `Create()` methods

**Pipeline cipher path:** Each cipher factory accepts a single `CipherSeed` parameter. `CipherSeed.FromSeedItem()` converts a deserialized `SeedVaultItem` into a `CipherSeed` for the pipeline path.

#### Recipes

**Purpose:** Orchestrate cohesive bulk operations using BulkCopy for performance.

**When to use:** Need to create MANY related entities as one cohesive operation.

**Key characteristics:**

- Orchestrate multiple entity creations as a cohesive operation
- Use BulkCopy for performance optimization
- Interact with database directly
- Compose Factories for individual entity creation
- **SHALL have a `Seed()` method** that executes the complete recipe
- Use method parameters (with defaults) for variations, not separate methods

**Naming:** `{DomainConcept}Recipe` with a `Seed()` method

#### Models

**Purpose:** DTOs that transform plaintext cipher data into encrypted form for database storage.

**When to use:** Need to convert `CipherViewDto` to `EncryptedCipherDto` during the encryption pipeline.

**Key characteristics:**

- Pure data structures (DTOs)
- No business logic
- Handle serialization/deserialization (camelCase ↔ PascalCase)
- Mark encryptable fields with `[EncryptProperty]` attribute

#### Scenes

**Purpose:** Create complete, isolated test scenarios for integration tests.

**When to use:** Need a complete test scenario with proper ID mangling for test isolation.

**Key characteristics:**

- Complete, realistic test scenarios with ID mangling for isolation
- Receive the mangling service via DI; returns a map of original→mangled values for assertions
- CAN modify database state

**Naming:** `{Scenario}Scene` with a `SeedAsync()` method

#### Queries

**Purpose:** Read-only data retrieval for test assertions and verification.

**When to use:** Need to READ existing seeded data for verification or follow-up operations.

**Key characteristics:**

- Read-only (no database modifications)
- Return typed data for test assertions

**Naming:** `{DataToRetrieve}Query` with an `Execute()` method

#### Data

**Purpose:** Reusable, realistic test data collections that provide the foundation for cipher generation.

**When to use:** Need realistic, filterable data for cipher content (company names, passwords, usernames).

**Key characteristics:**

- Static readonly arrays and classes
- Filterable by region, type, category
- Deterministic (seeded randomness for reproducibility)
- Composable across regions
- Enums provide the public API

See `Data/README.md` for Generators and Distributions details.

#### Services

**Purpose:** Injectable services that provide cross-cutting functionality via dependency injection.

The mangling service provides context-aware string mangling for test isolation: it adds unique prefixes to emails and strings so that test data is collision-free. It is enabled via the `--mangle` CLI flag (SeederUtility) or application settings (SeederApi).

## Rust SDK Integration

The seeder uses FFI calls to the Rust SDK for cryptographically correct encryption:

```
CipherViewDto → encrypt_fields (field-level encryption via bitwarden_crypto) → EncryptedCipherDto → Server Format
```

This ensures seeded data can be decrypted and displayed in the actual Bitwarden clients.
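
The last step of this flow, reshaping the SDK's nested output into the flat, PascalCase server format, can be sketched roughly as below. This is a language-agnostic illustration written in Python, not the seeder's actual code. The names `to_server_format` and `pascal_case` are hypothetical. It assumes a single level of nesting (such as `login`) and does not handle key collisions between nested and top-level fields.

```python
import json


def pascal_case(key: str) -> str:
    """Upper-case the first character, e.g. 'username' -> 'Username'."""
    return key[:1].upper() + key[1:]


def to_server_format(sdk_cipher: dict) -> dict:
    """Flatten one level of nesting and PascalCase every key.

    Hypothetical helper for illustration only; not the seeder's API.
    """
    flat = {}
    for key, value in sdk_cipher.items():
        if isinstance(value, dict):
            # Type-specific block such as "login": hoist its fields to the top level.
            for inner_key, inner_value in value.items():
                flat[pascal_case(inner_key)] = inner_value
        else:
            flat[pascal_case(key)] = value
    return flat


# Encrypted values are passed through untouched; only the shape changes.
sdk_output = {
    "name": "2.x...",
    "login": {"username": "2.y...", "password": "2.z..."},
}
print(json.dumps(to_server_format(sdk_output), indent=2))
```

Only the shape of the data changes. The EncString values pass through unmodified, so the flattened result stays decryptable by clients.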