Towards Formal Verification of Password Generation Algorithms in Password Managers

1. Introduction

Password managers (PMs) are critical tools recommended by security experts to mitigate vulnerabilities associated with password authentication. They facilitate the use of strong, unique passwords, alleviating the cognitive burden on users. However, widespread user adoption is hampered by a lack of trust in these applications. This paper identifies the random password generation (RPG) feature as a key factor influencing user trust. The authors argue that a formally verified, provably secure RPG implementation is essential to bridge this trust gap and encourage PM adoption. The paper's core contribution is proposing such a reference implementation, with formal proofs of functional correctness and security properties using the EasyCrypt framework.

2. Current Password Generation Algorithms

The paper surveys 15 password managers, focusing on three open-source, widely-used examples: Google Chrome's built-in manager, Bitwarden, and KeePass. The analysis reveals common patterns and critical shortcomings in existing implementations.

2.1 Password Composition Policies

PMs allow users to define policies constraining generated passwords. These policies specify length, allowed character sets (e.g., lowercase, uppercase, numbers, special characters), and minimum/maximum occurrences per set. The paper provides a detailed comparative table (Table 1 in the PDF) showing policy options across the three studied PMs. Key observations include varying maximum lengths (up to 30,000 in KeePass), different definitions of "special characters," and inconsistent handling of "similar characters" (like 'l', '1', 'O', '0') to avoid visual ambiguity. KeePass offers the most granular control, allowing custom inclusion/exclusion sets and duplicate removal.

2.2 Random Password Generation

The surveyed algorithms generally follow a two-phase process: 1) Generate random characters to satisfy the minimum required occurrences for each specified character set. 2) Fill the remaining password length with random characters from any set that hasn't reached its maximum occurrence limit. Finally, the sequence of characters is randomly permuted. The paper implies that while this logic is straightforward, its implementation—particularly the randomness source and the enforcement of complex constraints—is rarely formally specified or verified, leaving room for subtle bugs that could weaken security.

3. Formal Verification Approach

The authors advocate for using formal methods to eliminate implementation errors. They selected EasyCrypt, a framework for constructing and verifying game-based cryptographic proofs. The approach involves:

Specification: Formally defining the RPG's functional requirements (input policy -> output password satisfying policy).
Implementation: Writing the algorithm code within EasyCrypt.
Verification: Proving that the implementation correctly refines the specification (functional correctness).
Security Proof: Modeling the algorithm as a cryptographic game to prove properties like the uniform distribution of outputs from the defined space (security).

This methodology provides mathematical certainty that the code behaves exactly as intended and possesses the desired security properties, assuming the underlying cryptographic primitives (random number generator) are secure.

4. Proposed Reference Implementation

The paper proposes a new, modular RPG design intended to serve as a verified reference. While full code isn't shown in the provided excerpt, the design logically separates:

Policy Parser & Validator: Ensures user-defined policies are internally consistent (e.g., minimums don't exceed maximums, total minimums don't exceed password length).
Constrained Sampler: The core engine that performs the two-phase generation under the policy constraints.
Random Permutation: Applies a final shuffle to the character sequence.

Each module would have a formal specification and verified implementation in EasyCrypt.

5. Technical Details & Mathematical Formulation

The security of an RPG hinges on the entropy and uniform distribution of its output. Let $\mathcal{P}$ be the set of all passwords defined by a policy (length $l$, character sets $C_1, C_2, ..., C_n$ with min/max constraints). The ideal RPG is a function $G$ that samples uniformly from $\mathcal{P}$.

The probability of any specific password $p \in \mathcal{P}$ being generated should be: $$Pr[G() = p] = \frac{1}{|\mathcal{P}|}$$ where $|\mathcal{P}|$ is the size of the password space. A common flaw in naive implementations is introducing bias, making some passwords more likely than others. For example, if the algorithm first picks characters for mandatory sets and then fills the rest, passwords with mandatory characters at the start are over-represented unless a perfect shuffle is applied. The formal verification proves the absence of such bias.

The entropy $H$ of the generated password (in bits) is: $$H = \log_2(|\mathcal{P}|)$$ The verification ensures the implementation does not reduce this entropy below the theoretical maximum for the policy.

6. Experimental Results & Chart Description

While the provided PDF excerpt does not contain traditional experimental results or charts, its core "result" is the formal proof itself. The successful verification in EasyCrypt serves as the ultimate evidence. One could conceptualize a chart comparing the assurance levels:

Hypothetical Chart: Assurance Level vs. Development Method

Traditional Testing: High confidence for tested cases, but unknown for untested edge cases (policy conflicts, boundary values). Coverage is limited.
Code Review: Moderate confidence. Highly dependent on reviewer skill. May miss subtle logical errors or side-channel issues.
Formal Verification (as proposed): Highest possible assurance. Provides mathematical proof of correctness for all possible inputs and explicit security properties. The "chart" would show this as the maximum point on the "Assurance" axis.

The paper's contribution is moving the RPG from the first two categories into the third.

7. Analysis Framework: A Non-Code Case Study

Consider a policy: Length=8, require at least 1 uppercase, 1 number, 1 special char. A flawed algorithm might:

Position 1: Pick random uppercase.
Position 2: Pick random number.
Position 3: Pick random special char.
Positions 4-8: Fill with random characters from all sets.
Shuffle all 8 characters.

Flaw: The initial fixed ordering (U, N, S) before the shuffle creates a bias. While the shuffle randomizes final positions, the process starts from a non-uniform distribution of intermediate states. A formally verified algorithm would construct the entire password through a single, unbiased sampling process from the constrained space $\mathcal{P}$, or would provably demonstrate that its multi-step process is equivalent to such uniform sampling.

8. Core Insight & Analyst's Perspective

Core Insight: The paper correctly identifies a critical, yet often overlooked, attack surface in password managers: the trustworthiness of the password generator itself. It's not enough to have a strong vault; if the source material (passwords) is weak or predictable due to a buggy generator, the entire system fails. The authors' move to apply formal methods—a technique more common in hardware or aviation software—to this consumer security problem is both ambitious and necessary.

Logical Flow: The argument is sound: 1) Trust is a barrier to PM adoption. 2) The RPG is a trust-critical component. 3) Current RPGs are implemented ad-hoc without rigorous assurance. 4) Formal verification provides the highest level of assurance. 5) We provide a blueprint for a verified RPG using EasyCrypt. The logic bridges a user-centric problem (trust) with a deep technical solution (formal methods).

Strengths & Flaws:
Strengths: The focus on open-source PMs is pragmatic. The comparative policy analysis is valuable. Proposing a reference implementation is more useful than just critiquing others; it sets a standard. Using EasyCrypt ties the work to established cryptographic verification practice, akin to the verification of algorithms like those in "The Security of Cryptographic Primitives" (M. Bellare, P. Rogaway).
Flaws: The provided excerpt is a starting point. The real test is the complexity of the full EasyCrypt proofs for real-world policies. The approach assumes a perfect RNG; a vulnerability there bypasses all formal guarantees. It also doesn't address side-channel attacks (timing, memory) in the final compiled binary, a limitation noted in other formal verification projects like seL4 microkernel verification.

Actionable Insights:
1. For PM Developers: Integrate this verified core, or one like it, as a library. The cost of formal verification is high upfront but reduces long-term security review burdens and liability.
2. For Auditors & Researchers: Use this work as a template to analyze other PMs. The policy comparison table is a ready-made checklist for security evaluations.
3. For Standards Bodies (e.g., NIST, FIDO): Consider formal verification as a requirement for certifying password generation modules, similar to how Common Criteria mandates rigorous development processes for high-assurance products.
4. For Users: Demand transparency. Favor PMs that publish their RPG algorithms and, ideally, their verification evidence. This paper provides the vocabulary to ask for it.

In essence, this research is a call to raise the engineering standards for a foundational security component. It shifts the goal from "hopefully secure" to "provably correct," which is the only acceptable benchmark for tools guarding our digital identities.

9. Future Applications & Research Directions

The implications extend beyond password managers:

API & Service Tokens: Cloud services and microservices architectures often require generated tokens. A verified generator ensures these secrets are cryptographically strong.
Cryptographic Key Generation: The principles apply to generating human-readable recovery codes or passphrases (via diceware-like methods) for cryptographic keys, where bias is equally dangerous.
Integration with Hardware Security: Future work could verify the interaction between the RPG software and hardware True Random Number Generators (TRNGs) or Trusted Execution Environments (TEEs).
Automated Policy Analysis: Build tools that formally analyze user-defined policies for weaknesses (e.g., effectively small search spaces despite apparent complexity) before generation even occurs.
Standardization: The biggest future direction is turning this reference implementation into a widely adopted standard, perhaps as a standalone library (like libsodium for crypto) that any application can use for verified secret generation.

10. References

Grilo, M., Ferreira, J. F., & Almeida, J. B. (2021). Towards Formal Verification of Password Generation Algorithms used in Password Managers. arXiv preprint arXiv:2106.03626.
Bellare, M., & Rogaway, P. (2005). Introduction to Modern Cryptography. Chapter on Pseudorandom Functions and Permutations.
Klein, G., et al. (2009). seL4: Formal verification of an OS kernel. Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles.
National Institute of Standards and Technology (NIST). (2017). Digital Identity Guidelines (SP 800-63B).
Common Criteria Recognition Agreement (CCRA). Common Criteria for Information Technology Security Evaluation.
Chiasson, S., van Oorschot, P. C., & Biddle, R. (2006). A second look at the usability of click-based graphical passwords. Proceedings of the 3rd symposium on Usable privacy and security.
EasyCrypt Proof Assistant. Official Documentation and Tutorials. https://easycrypt.info/