Files
paperclip/packages
Volodymyr Kartavyi 057e3a494c fix: ensure embedded PostgreSQL databases use UTF-8 encoding
On macOS, `initdb` defaults to SQL_ASCII encoding because it infers
locale from the system environment. When `ensurePostgresDatabase()`
creates a database without specifying encoding, the new database
inherits SQL_ASCII from the cluster. This causes string functions like
`left()` to operate on bytes instead of characters, producing invalid
UTF-8 when multi-byte characters are truncated.

Two-part fix:
1. Pass `--encoding=UTF8 --locale=C` via `initdbFlags` to all
   EmbeddedPostgres constructors so the cluster defaults to UTF-8.
2. Explicitly set `encoding 'UTF8'` in the CREATE DATABASE statement
   with `template template0` (required because template1 may already
   have a different encoding) and `C` locale for portability.

Existing databases created with SQL_ASCII are NOT automatically fixed;
users must delete their local `data/db` directory and restart to
re-initialize the cluster.

Relates to #636

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 21:59:02 +01:00
..